- B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town #16155

@jrsykes

While using the boston_housing data set, which is hosted by the scikit-learn package and used to demo house price prediction models, I came across a feature titled 'B'. This struck me as odd because all the other features have descriptive names such as 'AGE' or 'TAX'. It turns out that B = 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town.

Because the data set is hosted by a prestigious package, I naively assumed that this feature was included because it offers significant explanatory value, which would point to a strongly pervasive racist mentality in the population at the time. However, after reading the blog post linked below, it appears that the data in the B feature were manufactured in an attempt to encourage segregation of the races. If true, this would be strong evidence of systemic institutional racism, and by continuing to use this fraudulent data we would be perpetuating the effect desired by its author.

I hope you will agree that we would be doing the scientific literature a service by investigating this issue further and ultimately consigning this data to historical reference archives, rather than encouraging its use in modern research by hosting it.
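For reference, the transform quoted above can be sketched in a few lines of Python. This is only an illustration of the formula as written in the data set description: the function name `b_feature` is hypothetical, and the check that Bk lies in [0, 1] is an assumption based on Bk being described as a proportion.

```python
import math


def b_feature(bk: float) -> float:
    """Compute B = 1000 * (Bk - 0.63)^2 for a proportion Bk.

    Hypothetical helper; the range check assumes Bk is a proportion in [0, 1].
    """
    if not 0.0 <= bk <= 1.0:
        raise ValueError("Bk must be a proportion between 0 and 1")
    return 1000.0 * (bk - 0.63) ** 2


# The transform is a parabola centred on Bk = 0.63, so two different
# proportions at equal distance from 0.63 produce the same B value.
for bk in (0.0, 0.26, 0.63, 1.0):
    print(f"Bk = {bk:.2f} -> B = {b_feature(bk):.1f}")

# Bk = 0.26 and Bk = 1.00 are both 0.37 away from 0.63.
print(math.isclose(b_feature(0.26), b_feature(1.0)))
```

Because the mapping is not one-to-one, a B value alone does not identify a unique Bk: most values correspond to two candidate proportions, one on each side of 0.63.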

I look forward to your response,

Jamie R. Sykes

https://medium.com/@docintangible/racist-data-destruction-113e3eff54a8
