Dataset Description - California Housing #4862

@s2t2

Describe the current behavior
The "sample_data" directory in the Colab filesystem contains a "README.md" file that links to more information about the California housing CSV files in that directory. However, the link is broken:

california_housing_data*.csv is California housing data from the 1990 US
Census; more information is available at: https://developers.google.com/machine-learning/crash-course/california-housing-data-description

Describe the expected behavior
The README file should contain a written description of the dataset, or a working link to where this information can be found.

What web browser you are using
Chrome

Additional context
There is a lot of information about a similar California housing dataset from sklearn and tensorflow. However, that dataset is slightly different: it contains a column about occupancy, and it expresses rooms and bedrooms as averages instead of totals.

Given these differences, it isn't apparent whether the two datasets are meant to be the same or, if they come from the same source, what transformations were applied to the original dataset. Any transformations should be documented.

from sklearn.datasets import fetch_california_housing

dataset = fetch_california_housing()
print(type(dataset))
print(dataset.DESCR)
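
One way to probe whether the two datasets are related is to derive per-household averages from the totals and compare them with the sklearn columns. The following is only a minimal sketch under assumptions not confirmed in this issue: that the Colab CSV has `total_rooms`, `total_bedrooms`, `population`, and `households` columns, and that the sklearn averages (`AveRooms`, `AveBedrms`, `AveOccup`) are per-household ratios. The sample row is made up for illustration and is not real data.

```python
import csv
import io


def per_household_averages(row):
    """Convert one row of totals into per-household averages.

    The output keys mirror the sklearn feature names; the input column
    names are an assumption about the Colab CSV layout.
    """
    households = float(row["households"])
    return {
        "AveRooms": float(row["total_rooms"]) / households,
        "AveBedrms": float(row["total_bedrooms"]) / households,
        "AveOccup": float(row["population"]) / households,
    }


# Tiny made-up sample in the assumed column layout (not real data).
SAMPLE_CSV = """total_rooms,total_bedrooms,population,households
1000,200,500,100
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
averages = [per_household_averages(r) for r in rows]
print(averages[0])
```

In Colab, the in-memory sample could be replaced with an open file handle for one of the CSV files in "sample_data". If the derived values match the sklearn features for the same block groups, that would suggest a shared source, though it would not prove it.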

Alternatively, there is a Kaggle dataset that more closely resembles the Colab dataset. Its description says:

This data was initially featured in the following paper:
Pace, R. Kelley, and Ronald Barry. "Sparse spatial autoregressions." Statistics & Probability Letters 33.3 (1997): 291-297.

and I encountered it in 'Hands-On Machine learning with Scikit-Learn and TensorFlow' by Aurélien Géron.
Aurélien Géron wrote:
This dataset is a modified version of the California Housing dataset available from: Luís Torgo's page (University of Porto)
