JSON Validator and CKAN Search
The POD (Project Open Data) schema is used for JSON validation: http://project-open-data.github.io/schema/
Version 3 of the data.gov catalog package search API is used for search: http://catalog.data.gov/api/3/action/package_search
The full list of agencies and their data.json URLs is available on the POD Dashboard: http://data.civicagency.org/offices
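As a rough illustration of the search call mentioned above, the following Python sketch builds a package_search URL and reads dataset titles from the response. It is not the code this project uses: the helper names `build_search_url` and `search_titles` are hypothetical, and the sketch assumes only the standard CKAN `q` and `rows` query parameters and the usual `result.results` response layout.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint quoted in this README.
PACKAGE_SEARCH_URL = "http://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, rows=5):
    """Return a package_search URL for a free-text query (hypothetical helper)."""
    return PACKAGE_SEARCH_URL + "?" + urlencode({"q": query, "rows": rows})


def search_titles(query, rows=5):
    """Fetch matching datasets and return their titles (needs network access)."""
    with urlopen(build_search_url(query, rows)) as response:
        payload = json.load(response)
    # CKAN wraps matches as {"success": ..., "result": {"results": [...]}}.
    return [dataset.get("title") for dataset in payload["result"]["results"]]
```

Calling `search_titles("energy consumption")` would return up to five dataset titles, assuming the catalog is reachable from your machine.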
- Download the composer.phar executable or use the installer:
$ curl -sS https://getcomposer.org/installer | php
- Run Composer:
$ php composer.phar install
- Check and update config/agency_json_urls.csv. The format is simple, one agency per row: "AGENCY_TITLE",json_url
"Department of Agriculture",http://www.usda.gov/data.json
"Department of Education",http://www.ed.gov/data.json
"Department of Energy",http://www.energy.gov/data.json
- Run php standalone/download.php to download the latest JSON files. Pass the 'test' parameter (php standalone/download.php test) to skip re-downloading and only run the JSON testing/fixing on existing datasets.
The data/agency_json_download.log file will contain overall statistics about the latest JSON update.
Run php standalone/update-schema.php to get the latest schema from http://project-open-data.github.io/schema/1_0_final/single_entry.json
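To make the config/agency_json_urls.csv format described above concrete, here is a minimal Python sketch that reads rows of the form "AGENCY_TITLE",json_url. It only illustrates the file format and is not how download.php itself parses the file; the function name `read_agency_urls` is hypothetical, and the sample rows are the ones shown earlier in this README.

```python
import csv
import io

# Sample rows copied from the format example in this README.
SAMPLE = '''"Department of Agriculture",http://www.usda.gov/data.json
"Department of Education",http://www.ed.gov/data.json
"Department of Energy",http://www.energy.gov/data.json
'''


def read_agency_urls(lines):
    """Return (agency_title, json_url) pairs from an iterable of CSV lines."""
    pairs = []
    # skipinitialspace tolerates a space after the comma, as in '"TITLE", url'.
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


agencies = read_agency_urls(io.StringIO(SAMPLE))
```

For the real file, pass an open file handle instead of the sample, for example `read_agency_urls(open("config/agency_json_urls.csv", newline=""))`.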
- Put all your JSON datasets in the /data/ folder, or download them using download.php.
Files must be in JSON format and match the *.json naming pattern:
example1.json
department_treasury.json
last_department.json
- Run the script.
For a standalone version, just run php standalone/process.php.
- Grab the results from the /results/ folder.
Result files are named after the data files, with a _results suffix:
example1_results.json
example1_results.csv
department_treasury_results.json
department_treasury_results.csv
last_department_results.json
last_department_results.csv
The processing.log file in the same folder gives overall statistics.
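The naming convention above (each data file yields a .json and a .csv result carrying the _results suffix) can be sketched in Python as follows. This is an illustrative check, not part of the project: both function names are hypothetical, and the sketch assumes that every *.json file in the data folder produces exactly the two result files shown in the list above.

```python
from pathlib import Path


def expected_result_names(data_file):
    """Map a data file name to the result file names this README describes."""
    stem = Path(data_file).stem  # "example1.json" -> "example1"
    return [f"{stem}_results.json", f"{stem}_results.csv"]


def missing_results(data_dir, results_dir):
    """List expected result files that are absent from results_dir."""
    results = Path(results_dir)
    missing = []
    for data_file in sorted(Path(data_dir).glob("*.json")):
        for name in expected_result_names(data_file.name):
            if not (results / name).exists():
                missing.append(name)
    return missing
```

After a processing run, `missing_results("data", "results")` returning an empty list would suggest that every data file produced both of its result files.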
- JSON online editor (http://www.jsoneditoronline.org)
- POD online JSON validator (http://project-open-data.github.io/json-validator/)
- The CKAN API Documentation (http://docs.ckan.org/en/latest/api.html#ckan.logic.action.get.package_search)