This doc compares a sample Corona Data Scraper (CDS) record with a Li record. If you notice any errors in this doc, please open an issue, notify us on Slack, or open a pull request.
- Discontinued reports:
  - `timeseries-tidy.csv`. This report continually crashed during generation. We've replaced it with `timeseries-tidy-small.csv`, which contains the same data but removes much of the duplication.
  - `timeseries.json`. This report was not atomic; it relied on some external resource or resources, and was not clear on its own.
- Renamed reports:
  - `data.json` and `data.csv` have been renamed to `latest.json` and `latest.csv` respectively.
- Common changes: Most reports add `locationID` and `slug` (a url-friendly location representation, e.g. "butte-county-california-us").
- Combining data sources: Sometimes multiple sources cover the same location. For example, JHU, New York Times, and California sources may all submit data for California. These sources are combined in the final reports where possible, and conflicts are resolved by priority. See Combining Data Sources.
- locationID: Every location in Li is identified with a unique `locationID`, comprised of iso1, iso2, and fips codes from https://github.com/hyperknot/country-levels. Examples: `iso1:US` = United States, `iso1:us#iso2:us-al` = State of Alabama, `iso1:us#iso2:us-al#fips:01125` = Tuscaloosa County, Alabama.
- Integration testing samples: samples for automated test verification are in `tests/integration/events/reports/expected-results`.
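
The `locationID` examples in the list above suggest a simple structure: segments joined by `#`, each of the form `level:code`. The following is a minimal sketch of splitting an ID into its parts under that assumption. The function name `parse_location_id` is hypothetical and is not part of Li.

```python
def parse_location_id(location_id: str) -> dict:
    """Split a locationID such as 'iso1:us#iso2:us-al#fips:01125' into its parts."""
    parts = {}
    for segment in location_id.split("#"):
        # Each segment is assumed to be '<level>:<code>'; split on the first colon only.
        level, _, code = segment.partition(":")
        if not level or not code:
            raise ValueError(f"malformed locationID segment: {segment!r}")
        parts[level] = code
    return parts


print(parse_location_id("iso1:us#iso2:us-al#fips:01125"))
# {'iso1': 'us', 'iso2': 'us-al', 'fips': '01125'}
```

The most specific level present (here `fips`) tells you whether the ID refers to a country, a state, or a county.
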
| CDS record | Li record |
|---|---|
|
|
Li record:

    locationID,slug,name,level,city,county,state,country,lat,long,population,aggregate,tz
    iso1:us#iso2:us-ca#fips:06007,butte-county-california-us,"Butte County, California, United States",county,,Butte County,California,United States,39.67,-121.6,219186,,America/Los_Angeles

The source data for this report comes from https://github.com/hyperknot/country-levels. The report is generated and posted to S3 using `./tools/geojsondb`; see the README in that folder.

The report consists of GeoJSON and census data, keyed by `locationID`.

    {
      "iso1:us#iso2:us-al#fips:01001": {
        "geometry": {
          "coordinates": [
            [
              [ -86.918, 32.664 ],
              [ -86.817, 32.66 ],
              ...
            ]
          ],
          "type": "Polygon"
        },
        "properties": {
          "area_m2": 1566509298,
          "census_data": {
            "AFFGEOID": "0500000US01001",
            "ALAND": 1539602123,
            "AWATER": 25706961,
            "COUNTYNS": "00161526",
            "LSAD": "06"
          },
          "center_lat": 32.54,
          "center_lon": -86.64,
          "countrylevel_id": "fips:01001",
          "county_code": 1,
          "fips": "01001",
          "name": "Autauga County",
          "name_long": "Autauga County, AL",
          "population": 55869,
          "state_code_int": 1,
          "state_code_iso": "US-AL",
          "state_code_postal": "AL",
          "timezone": "America/Chicago"
        },
        "type": "Feature"
      },
      ...
    }

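
Because the report is keyed by `locationID`, looking up a location is a plain dictionary access. The sketch below assumes a trimmed copy of the sample record above has already been loaded as a string. The helper `population_density` is hypothetical and only illustrates reading `properties`.

```python
import json

# A trimmed copy of the sample record; the real report holds many more entries.
features_json = """
{
  "iso1:us#iso2:us-al#fips:01001": {
    "properties": {
      "name": "Autauga County",
      "population": 55869,
      "area_m2": 1566509298,
      "timezone": "America/Chicago"
    },
    "type": "Feature"
  }
}
"""

features = json.loads(features_json)


def population_density(features: dict, location_id: str) -> float:
    """Return people per square kilometre for one locationID."""
    props = features[location_id]["properties"]
    return props["population"] / (props["area_m2"] / 1_000_000)


density = population_density(features, "iso1:us#iso2:us-al#fips:01001")
print(round(density, 2))  # 35.66
```
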
Replaces the `data.json` report.
| CDS record | Li record |
|---|---|
|
|
Replaces the `data.csv` report.

CDS record:

    name,level,city,county,state,country,cases,deaths,recovered,tested,active,population,populationDensity,lat,long,url,aggregate,hospitalized_current,rating,tz,featureId,countryId,stateId,countyId
    "Autauga County, Alabama, United States",county,,Autauga County,Alabama,United States,84,4,,,,55869,35.66464625666353,32.507999999999996,-86.66499999999999,https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv,county,,0.6274509803921569,America/Chicago,fips:01001,iso1:US,iso2:US-AL,fips:01001

Li record:

    locationID,slug,name,level,city,county,state,country,lat,long,population,aggregate,tz,cases,deaths,recovered,active,tested,hospitalized,hospitalized_current,discharged,icu,icu_current
    iso1:us#iso2:us-ca#fips:06007,butte-county-california-us,"Butte County, California, United States",county,,Butte County,California,United States,39.67,-121.6,219186,,America/Los_Angeles,21,4,,,200,5,,,2,

| CDS record | Li record |
|---|---|
|
|
- Added `locationID`, `slug`, `sources`, and `dateSources`; a `warnings` element may also be added.
- `tz` is a plain value, not wrapped in an array.
- Removed `rating`, `url`, and `featureId`.

The data fields in a given record can be supplied by many sources: one source may return cases and deaths, and another may return hospitalizations and tests. The `dateSources` field shows where each field comes from.

The date ranges for which each source supplied data are written in shorthand. For example, `"2020-05-21..2020-06-18": "src"` means that `src` supplied the data from 05-21 to 06-18.
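
To make the range shorthand concrete, here is a minimal sketch that expands a `start..end` key and finds which source covers a given day. It assumes only the key/value shape shown in the example; the full structure of `dateSources` is not reproduced here. It also assumes ranges are inclusive at both ends and that a key without `..` denotes a single day. The helper names are hypothetical.

```python
from datetime import date


def parse_range(key: str) -> tuple:
    """Turn '2020-05-21..2020-06-18' (or a single '2020-05-21') into (start, end)."""
    start, sep, end = key.partition("..")
    first = date.fromisoformat(start)
    last = date.fromisoformat(end) if sep else first
    return first, last


def source_on(date_sources: dict, day: str):
    """Return the source covering `day`, or None if no range contains it."""
    target = date.fromisoformat(day)
    for key, src in date_sources.items():
        first, last = parse_range(key)
        if first <= target <= last:  # assumption: both ends are inclusive
            return src
    return None


# Hypothetical fragment using the shape from the example in the text.
date_sources = {"2020-05-21..2020-06-18": "src"}
print(source_on(date_sources, "2020-06-01"))  # src
print(source_on(date_sources, "2020-06-19"))  # None
```
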

If there are conflicts in the data (e.g., multiple sources return cases, but the values are inconsistent), a `warnings` element is added. For example:

    "warnings": {
      "2020-06-19": {
        "cases": "conflict (src1: 3, src2: 2, src3: 1)",
        "deaths": "conflict (src2: 22, src3: 11)"
      },
      ...

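
The sketch below illustrates the general idea of resolving a conflict by priority while recording a warning in the format shown above. It is not Li's actual implementation: the function `combine_field`, the numeric priority map, and the ordering of sources inside the warning string are all assumptions made for illustration.

```python
def combine_field(values_by_source: dict, priorities: dict):
    """Pick one value for a field and describe any conflict.

    values_by_source: e.g. {"src1": 3, "src2": 2}
    priorities: e.g. {"src1": 2, "src2": 1}; higher wins (hypothetical scheme)
    Returns (chosen_value, warning_or_None).
    """
    if not values_by_source:
        return None, None
    # Highest-priority source first; ties are broken by name so the result is deterministic.
    ranked = sorted(values_by_source, key=lambda s: (-priorities.get(s, 0), s))
    chosen = values_by_source[ranked[0]]
    warning = None
    if len(set(values_by_source.values())) > 1:
        detail = ", ".join(f"{s}: {values_by_source[s]}" for s in ranked)
        warning = f"conflict ({detail})"
    return chosen, warning


result = combine_field(
    {"src2": 2, "src1": 3, "src3": 1},
    {"src1": 3, "src2": 2, "src3": 1},
)
print(result)  # (3, 'conflict (src1: 3, src2: 2, src3: 1)')
```

When all sources agree, the same call returns the shared value with no warning.
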

CDS record:

    name,level,city,county,state,country,lat,long,population,url,aggregate,tz,2020-06-02,2020-06-03,...
    "Lower Austria, Austria",state,,,Lower Austria,Austria,48.22100,15.7605,1653419,https:...js,,Europe/Vienna,2867,2868,...

Li record:

    locationID,slug,name,level,city,county,state,country,lat,long,population,aggregate,tz,2020-05-21,2020-05-22
    iso1:us#iso2:us-ca#fips:06007,butte-county-california-us,"Butte County, California, United States",county,,Butte County,California,United States,39.67,-121.6,219186,,America/Los_Angeles,21,22

This has replaced the old timeseries-tidy.csv report. You can pull in the location data using locations.csv.

CDS record:

    name,level,city,county,state,country,population,lat,long,aggregate,tz,date,type,value
    "Lower Austria, Austria",state,,,Lower Austria,Austria,1653419,48.221000000000004,15.7605,,Europe/Vienna,2020-06-02,cases,2867

Li record:

    locationID,date,type,value
    iso1:us#iso2:us-ca#fips:06007,2020-06-28,cases,21
    iso1:us#iso2:us-ca#fips:06007,2020-06-28,deaths,4
    iso1:us#iso2:us-ca#fips:06007,2020-06-28,tested,200
    iso1:us#iso2:us-ca#fips:06007,2020-06-28,hospitalized,5
    iso1:us#iso2:us-ca#fips:06007,2020-06-28,icu,2

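
Since the tidy rows carry only `locationID`, the location details have to be joined back in from `locations.csv`. The following sketch shows one way to do that with the standard library. It uses inline copies of the sample rows in place of downloaded files, and it assumes every `value` in the tidy report is an integer, as in the sample.

```python
import csv
import io

# Inline copies of the sample rows, standing in for the downloaded reports.
locations_csv = """locationID,slug,name,level,city,county,state,country,lat,long,population,aggregate,tz
iso1:us#iso2:us-ca#fips:06007,butte-county-california-us,"Butte County, California, United States",county,,Butte County,California,United States,39.67,-121.6,219186,,America/Los_Angeles
"""
tidy_csv = """locationID,date,type,value
iso1:us#iso2:us-ca#fips:06007,2020-06-28,cases,21
iso1:us#iso2:us-ca#fips:06007,2020-06-28,deaths,4
"""

# Index locations by locationID so each tidy row can be enriched with one lookup.
locations = {
    row["locationID"]: row
    for row in csv.DictReader(io.StringIO(locations_csv))
}

joined = []
for row in csv.DictReader(io.StringIO(tidy_csv)):
    loc = locations.get(row["locationID"], {})  # empty dict if the location is unknown
    joined.append({
        "name": loc.get("name"),
        "tz": loc.get("tz"),
        "date": row["date"],
        "type": row["type"],
        "value": int(row["value"]),  # assumption: values are integers
    })

for item in joined:
    print(item)
```

Keeping the location columns in a separate lookup is what removes the duplication that the old `timeseries-tidy.csv` carried on every row.
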

CDS record:

    name,level,city,county,state,country,population,lat,long,url,aggregate,tz,cases,deaths,recovered,active,tested,hospitalized,hospitalized_current,discharged,icu,icu_current,growthFactor,date
    "Lower Austria, Austria",state,,,Lower Austria,Austria,1653419,48.221000000000004,15.7605,https://info.gesundheitsministerium.at/data/GenesenTodesFaelleBL.js,,Europe/Vienna,2867,97,2678,92,,,,,,,,2020-06-02

Li record:

    locationID,slug,name,level,city,county,state,country,lat,long,population,aggregate,tz,cases,deaths,recovered,active,tested,hospitalized,hospitalized_current,discharged,icu,icu_current,date
    iso1:us#iso2:us-ca#fips:06007,butte-county-california-us,"Butte County, California, United States",county,,Butte County,California,United States,39.67,-121.6,219186,,America/Los_Angeles,21,4,,,210,1,,,10,,2020-05-21