(It's Italian for "schools.")
Public Schools 3!
- Intro
- Run outstanding migrations
- Fire up the server
- Integrate new data
- Update Makefile to reflect current year
- Updating district boundaries and campus coordinates
- Updating district and campus entities
- Updating AskTED data
- Updating TAPR data
- Checking the local server
- Updating cohorts data
- Updating the CSS styling and other static assets
- More small changes
- Updating the sitemap
- Deploying on the test servers
- Deploying on production servers
- Quick deploy
- Workspace
- Admin
If this is your first time setting up the schools database, please start with README_SETUP.md. The remainder of this Readme is focused on running updates to data in the Schools Explorer. Every year, we need to update cohorts, TAPR, district boundaries, campus coordinates and the entities files for districts and campuses.
The Schools Explorer has a local, staging, and production setup, which can be mostly standardized through Docker containerization. Production has two servers, which is handy in case you want to test deployment to one while keeping the other intact in case you need to quickly roll back changes.
The local and staging environment both connect to local PostgreSQL databases with the PostGIS extension, whereas production uses an external Postgres database hosted on AWS.
There are two types of data downloads supporting the scuole app:
- Manual download. The scuole-data repo contains Jupyter notebooks that guide you through formatting and organizing data where the app can access it. If data is incorrectly formatted, you may see errors while working through this repo.
- Automatic download. This involves running a command that downloads the latest data directly from a website. We use this process for updating AskTED directory info.
There are also two sections to the schools database:
- Schools Explorer, which has a page for every district and campus (school) with the latest metrics.
- Higher Education Outcomes Explorer, where we publish cohort information for regions and counties.
When running the scripts to update the School Explorer, make sure you follow this order when updating the data:
- District boundaries and campus coordinates
- District and campus entities
- AskTED
- TAPR
Updating the Higher Education cohorts explorer is a separate process which can be run independently from the schools explorer part whenever we get new data.
To get started, make sure your .env file is getting loaded correctly. Run the following in your terminal (in my case the path is ~/Documents/data-projects/schools-explorer/scuole-data):
echo $DATA_FOLDER
If this returns nothing or points to the wrong folder, you can manually specify the location of scuole-data by typing export DATA_FOLDER=~/your/local/path/to/scuole-data/ in your terminal.
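If you'd rather sanity-check the variable from Python, a tiny helper like this works (the helper name and fallback behavior here are my own, not part of the repo):

```python
import os
from pathlib import Path

def check_data_folder(env=None):
    """Return the scuole-data path from DATA_FOLDER, or None if it's
    unset or doesn't point at a real directory."""
    env = os.environ if env is None else env
    raw = env.get("DATA_FOLDER", "")
    if not raw:
        return None  # fix with: export DATA_FOLDER=~/path/to/scuole-data/
    path = Path(raw).expanduser()
    return path if path.is_dir() else None
```

A `None` result means the same thing as an empty `echo $DATA_FOLDER`: export the variable and try again.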
If you or another developer have made changes to data structures (models) in Django, you'll need to run the following to catch up with any outstanding migrations you might have:
python manage.py migrate
Make sure Docker is running and step into a Docker shell.
make compose/local
make docker/shell
If you're actively troubleshooting/debugging newly integrated data, you may need to roll back the database to the last stable instance. To do so, first go to the data/bootstrap-entities target in the Makefile and change the year to the last stable year (e.g. 2021-2022) for both bootstrapdistricts_v2 and bootstrapcampuses_v2.
If you need to start from a clean slate and load in the prior year's data, first run make local/reset-db, then sh bootstrap.sh. This may take ~10 minutes to run.
Next, collect static files and fire up the server. Previous instructions suggested sh docker-entrypoint.sh, but in my experience this breaks the CSS. Try this instead:
python manage.py collectstatic --noinput
python manage.py runserver
Open up the schools database in your local server and make sure that all of the information is there and the pages are working correctly. You can compare it to the live version of the schools database.
All good? Let's go! There are also other commands in scuole's Makefile at your disposal so check them out.
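Rather than clicking through pages by hand, you can spot-check a handful of URLs with a small script like this (a convenience sketch of mine, not part of the repo; the example paths you pass in are up to you):

```python
from urllib.error import URLError
from urllib.request import urlopen

def smoke_test(paths, base="http://localhost:8000"):
    """Return a list of (path, problem) pairs for pages that don't
    respond with HTTP 200; an empty list means everything looks fine."""
    failures = []
    for path in paths:
        try:
            with urlopen(base + path) as resp:
                if resp.status != 200:
                    failures.append((path, resp.status))
        except (URLError, OSError) as exc:
            failures.append((path, str(exc)))
    return failures
```

For example, smoke_test(["/", "/districts/some-district-slug/"]) with a slug you know exists should return an empty list when the server is healthy.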
If your server's running in a terminal tab, you'll need to open a new tab and get into another Docker shell.
make docker/shell
You'll need to update the Makefile in three places to ensure the scripts pull your latest data.
- Go to the data/bootstrap-entities target in the Makefile and change the year to the year you are updating for (e.g. 2021-2022) for both bootstrapdistricts_v2 and bootstrapcampuses_v2:
data/bootstrap-entities:
python manage.py bootstrapdistricts_v2 2021-2022
python manage.py dedupedistrictslugs
python manage.py bootstrapcampuses_v2 2021-2022
python manage.py dedupecampusslugs
- For data/latest-school, change the year to the latest year (e.g. 2022-2023).
- For data/all-schools, add another line to load in the latest year. For example, if you're updating for 2022-2023, add python manage.py loadtaprdata 2022-2023 --bulk. This way, if you reset your database, or if someone who is new to updating the schools database is setting up, they can load the data you are about to add.
If you've already updated the GEOJSONs of the districts and coordinates of the campuses as instructed in the scuole-data repo, you're already done with this step. We will be connecting this new district and campus geographic data by running the script in the following step.
In this explorer, we can see data for the entire state, regions, districts, and campuses. Regions typically don't change from year to year, but districts and campuses can be added or removed. As a result, we have to update the district and campus models every year by deleting all existing district and campus models and using a list provided by TEA to re-add them to the database. This section relies on the district and campus entities .csv files created in scuole-data to create the models.
Start the Python terminal.
python manage.py shell
From the Python terminal, run the following to delete the existing district and campus models (runtime ~1 minute):
from scuole.districts.models import District
district = District.objects.all()
district.delete()
from scuole.campuses.models import Campus
campus = Campus.objects.all()
campus.delete()
exit()
And finally, run the following to re-create the district and campus models with the latest list of districts and campuses. This will also connect the district boundaries and campus coordinates from the previous step to their proper entities (runtime ~2 minutes).
make data/bootstrap-entities
In this explorer, we have a section at the top of the page of every district and campus (under the map of the district or campus location) where we have school addresses and contact information, along with superintendent and principal contact information. We get this data from AskTED, which contains a file called Download School and District File with Site Address.
To update the data, run (runtime ~2 minutes):
make data/update-directories
Troubleshooting notes
If you run into any duplicate key errors during the AskTED update, refer to the troubleshooting readme for instructions on how to clear the models. You'll need to clear the model that is throwing this error, and reload the data.
There may be data formatting errors with some of the data as it's being pulled in. For instance, some of the phone numbers may be invalid. Right now, we have a phoneNumberFormat function in the updatedistrictsuperintendents, updatecampusdirectory and updatecampusprincipals commands. You'll need to edit this function or create new ones if you're running into problems loading the data from AskTED.
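The real phoneNumberFormat helper lives in those commands; as a rough illustration of the kind of normalization involved (this sketch is mine, not the repo's code):

```python
import re

def format_phone_number(raw):
    """Normalize a raw phone value to (XXX) XXX-XXXX, or return None
    when the value can't be salvaged (e.g. blank or too few digits)."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading US country code
    if len(digits) != 10:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
```

Returning None for junk values lets the loader skip a bad number instead of crashing on it, which is the behavior you generally want here.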
However, if you're running into an Operation Timed Out error, it's possible that AskTED has changed the URLs where the script can download the data. You will have to go into constants.py and change them.
As of 2023, the spreadsheet was available through this link so it's simple for the script to directly download all of the data we need.
Before 2023, it involved hitting a download button in order to get the correct spreadsheet. We got around that by using a POST request with variables we set up in constants.py (e.g. ASKTED_DIRECTORY_VIEWSTATE). If they ever change it back to needing to hit a download button, we would need to reset those variables again. To check for the correct values, I look on AskTED's website for the correct download URL, hit the download file button, open the Network tab in the console, look at the request the download button triggered and check the Payload tab.
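If that ever comes back, the POST looked roughly like this (a sketch from memory; the exact field names, variable names, and URL are assumptions — check constants.py and the Network tab's Payload view for the real values):

```python
def build_askted_payload(viewstate, event_validation):
    """ASP.NET pages gate downloads behind hidden form fields, so the
    POST that simulates the download button has to echo them back.
    The actual set of fields AskTED expects may differ from this sketch."""
    return {
        "__VIEWSTATE": viewstate,
        "__EVENTVALIDATION": event_validation,
    }

# Usage sketch (not run here; URL and variable names are hypothetical):
# requests.post(ASKTED_DIRECTORY_URL, data=build_askted_payload(vs, ev))
```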
This is the big one! This dataset contains all school and district performance scores, student and teacher/staff info, graduation rates, attendance, SAT/ACT scores and more. These are the numbers that populate each district and campus page.
Note that in August 2025, we pulled in more-recent A-F accountability data by rewiring the code to look in the 2024-25 folder for just accountability.csv, while pulling all other data from 2023-24. This leverages a variable ACCOUNTABILITY_YEAR_OVERRIDE, which you'll need to update in both loadtaprdata_v3.py and scuole/stats/models/reference.py.
To update the data, run:
make data/latest-school
FYI, the scripts will update data first for the state, then the regions, then the districts, and finally the campuses.
Sometimes TEA likes to change up accountability ratings by adding new ones. For example, for the 2021-2022 year, scores that were D or Fs were labeled Not Rated: SB 1365. When that happens, you might need to go into reference.py and add them as RATING CHOICES. If you do that, you're changing the models, so don't forget to run python manage.py makemigrations and then run python manage.py migrate.
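The shape of that change looks something like this (an illustrative sketch — the actual tuple name and existing entries in reference.py may differ):

```python
# Django model "choices": pairs of (stored value, human-readable label).
RATING_CHOICES = (
    ("A", "A"),
    ("B", "B"),
    ("C", "C"),
    ("D", "D"),
    ("F", "F"),
    # New label TEA introduced for the 2021-2022 year:
    ("Not Rated: SB 1365", "Not Rated: SB 1365"),
)
```

Because a choices change touches the model, it's the kind of edit that requires makemigrations and migrate afterwards.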
Either run docker-entrypoint.sh or python manage.py runserver to fire up the local server. Make sure that the statewide, district and campus pages in the school database on your local server are working. If you see any data missing, it might be because TEA changed the column names for some metrics. You can check if there's a disconnect by comparing the header name in the spreadsheet you have in the scuole-data repository with what's in schema_v2.py. FYI, short_code in the schema file is the first letter of the header that pertains to the dataset it belongs to (if it's district data, it's D; if it's campus data, it's C). You can find a full list by going to mapping.py.
If there's a mismatch, you can do one of the following:
- Change the column header name in the scuole-data repository.
- Or, if you think the change to the column header is permanent, change the column header in schema_v2.py.
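To speed up that comparison, you can diff the spreadsheet's headers against what the schema expects with a few lines of Python (a hypothetical helper — schema_v2.py's real structure may store its columns differently):

```python
def find_missing_headers(csv_headers, schema_columns, short_code):
    """Return schema columns (with the dataset's short_code prefix,
    'D' for district files or 'C' for campus files) that don't appear
    in the spreadsheet's header row."""
    expected = {short_code + column for column in schema_columns}
    return sorted(expected - set(csv_headers))
```

Anything this returns is a header TEA renamed (or dropped) since the schema was last updated.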
This is where we update the Higher Education outcomes cohorts data. Because this isn't directly connected to the schools explorer updates, it can be done either before or after those updates.
First, make sure you have already followed the scuole-data instructions on how to download and format the latest cohorts data.
After you've put the latest cohorts data in scuole-data, you'll need to add a line to data/all-cohorts in the Makefile in scuole with the latest year. For example, if you're updating for 2012, add python manage.py loadallcohorts 2012. Again, this is so that if you reset your database, or if someone who is new to updating the schools database is setting up, they can load the data that you are about to add.
If you're starting in this section, make sure you've fired up the Docker containers with make compose/local.
Then, get inside the Docker shell:
make docker/shell
Load the data by running (latest year should be 2012 in this sunset/last version):
python manage.py loadallcohorts <latest year>
If you get the error "There should be only XX cohorts", you'll need to delete the StateCohorts, RegionCohorts and CountyCohorts data in the database; the error is likely because old cohorts data does not get cleared out when new data is loaded. Follow the instructions in the duplicate key error section to delete the data. Make sure you run make data/all-cohorts afterwards from within the Docker shell so you load in data dating back to 1997; otherwise, the stacked area charts will not show up.
Also, you will need to change the latest_cohort_year variable in all of the functions in the scuole/cohorts/views.py file to reference the latest cohorts school year.
Lastly, make sure the scuole/cohorts/schemas/cohorts/schema.py has the correct years (i.e. you'll need to change the year in 8th Grade (FY 2009) for the reference 'enrolled_8th': '8th Grade (FY 2009)', along with the rest of the references.)
If you make changes to the styles, you'll need to run npm run build again to rebuild the main.css file in the assets/ folder that the templates reference.
Then, run make docker/shell, followed by python manage.py collectstatic --noinput to recollect static files. You'll also need to do a hard refresh in whatever browser you're running the explorer in to fetch the new styles.
- Update the "Last updated" date on the landing page at scuole/templates/landing.html. If you're updating cohorts data, also update the "Last updated" date on the cohorts landing page at scuole/templates/cohorts_landing.html.
- We have several spots in our templates that include metadata about when this explorer was last updated, such as:
  - Template: scuole/templates/base.html, variable: dateModified
  - Template: scuole/templates/cohorts_base.html, variable: dateModified (only modify if you are updating the cohorts data)
  - Template: scuole/templates/includes/meta.html, variable: article:modified_time
You need to change those! They are (probably) important for search.
When we add new URLs, we also need to update the sitemap (sitemap.xml) to include those paths. Fortunately, Django has functions that allow us to generate all of the URLs associated with an object's views.
To see an example, view any of the sitemaps.py files. You'll need to add the sitemap to the config/urls.py file, and view the updated sitemap locally at localhost:8000/sitemap.xml.
After verifying that the sitemap looks OK locally, copy the content starting from the <urlset> tag in sitemap.xml and paste it into scuole/static_src/sitemap.xml before deploying. You can also run python manage.py collectstatic --noinput on the test and production servers to get the updated sitemap.
When everything looks good locally, it's time to update the data in the test servers. But first, we need to make sure we can get in the servers.
First, make sure you have pushed all your changes to Github repos of scuole and scuole-data and merged them into the master branch. These are the versions that are going to be pulled into the test server.
Before you update the data, YOU MUST DEPLOY ALL OF THE CODE CHANGES FIRST. This is important, especially if you made any code changes that pertain to the updating process.
First, we will ssh into the schools-test server.
ssh schools-test
If you haven't been able to get into any ssh server, see the previous section to configure your computer or ask Engineering for help.
The server will have a repository of the scuole project. We will need to pull the code changes you pushed and merged to master into the scuole repo on the test server.
cd scuole
git checkout master
git pull
When you see all of your code changes get pulled in from the GitHub repo, it's time to rebuild images of the application and restart the Docker containers with the new images on the test host machines by running:
make compose/test-deploy
Make sure the Docker containers are running by running docker ps (there should be a container for the web, db and proxy services; see docker-compose.override.yml for more details). Note that when you're done working, you can optionally stop the containers with make compose/local/stop.
Once you run these, make sure your code changes made it through by going to schools-test. Remember that these are only code changes; you haven't updated the data yet, so don't expect to see the latest data on the test server.
Now, we need to get all of our data changes from scuole-data to the test server.
First, get out of the scuole repo and get into the scuole-data repo:
cd ../scuole-data
Next, get the latest data from your master branch:
git checkout master
git pull
You should see the latest data pulled into the test server from GitHub.
Next, get inside the Docker container:
docker exec -i -t scuole_web_1 /bin/ash
This is where we will run all of the scripts needed to update the data.
Let's run all migrations so the database is all set up.
python manage.py migrate
Now, let's update the data in the schools explorer part of the site. Remember that there's an order to this:
- District boundaries and campus coordinates
- District and campus entities
- AskTED
- TAPR
Updating district boundaries and campus coordinates and district and campus entities
Just like in the local server, get into the Python terminal and remove the district and campus models:
python manage.py shell
from scuole.districts.models import District
district = District.objects.all()
district.delete()
from scuole.campuses.models import Campus
campus = Campus.objects.all()
campus.delete()
exit()
Lastly, run make data/bootstrap-entities to create the district and campus models.
Updating AskTED
Run this command to load the latest AskTED data:
make data/update-directories
(8/18/25) Note that if you have any permissions trouble running these make targets, a workaround is to manually paste in the targets' commands and replace python with python3. The cause of this problem is presently unclear.
Updating TAPR
Run this command to load the latest TAPR data:
make data/latest-school
If you were able to run this in your local server, then you shouldn't run into any errors! Check that your data updated correctly by going to the schools-test URL.
Updating Cohorts
Run this command to load the latest cohorts data (latest year should be 2012 in this sunset/last version):
python manage.py loadallcohorts <latest year>
If you were able to run this in your local server, then you shouldn't run into any errors! Check that your data updated correctly by going to the schools-test URL. Congrats! You have just finished updating the test server!
One final task before deploying it into production is to fact check a few schools to see if the data loaded in correctly. Every year, we recruit other data team members to help out with this process.
We set up a Schools Explorer Factchecking Google Doc that we update every year to help out the fact-checking process. You want data team members to check at least a few schools and districts with the TAPR data that TEA makes available.
We set up a similar Google Doc to fact check the Cohorts data.
You should also make sure you fact check the statewide numbers in your database with the numbers TEA has.
In addition, you'll want to reach out and ask copy editors to look over any changes to the text or to the disclaimers.
Hooray! We're ready to update the production servers and deploy it live. Scary! As a safeguard, we've historically had two production servers, schools-prod-1 and schools-prod-2, so that while one is updating, the other is still up. However, schools-prod-2 has been decommissioned due to incompatibility with legacy software.
The other good news is that deployment is similar to how you deployed in the test server so much of it should look familiar.
After checking the test site, you'll need to deploy the code changes on the two production servers: schools-prod-1 and schools-prod-2. You must do both servers — if you don't, the published app will switch between new and old code.
Remember, YOU MUST DEPLOY ALL OF THE CODE CHANGES FIRST.
First, we will ssh into the schools-prod-1 server.
ssh schools-prod-1
When you first connect, you should see a message *** System restart required ***. DO NOT DO THIS!!! Restarting will break the delicate balance of legacy software that we want to hold together 'til the end of 2025.
The server will have a repository of the scuole project. We will need to pull the code changes you pushed and merged to master into the scuole repo on the production server.
cd scuole
git checkout master
git pull
When you see all of your code changes get pulled in from the GitHub repo, it's time to rebuild images of the application and restart the Docker containers with the new images on the production host machines by running:
make compose/production-deploy
This should only take about a minute. Make sure the Docker containers are running by running docker ps (there should be a container for the web and proxy services; no db container needs to run on the production servers). If one or more containers are missing, you can check their status with docker ps -a.
Congrats, your code changes are now live!
We're almost there! Now we just need to update the data in the production server. Deploying the data on the production servers will be similar to loading it in locally and on the test server. Fortunately, you only need to push data changes to one server - schools-prod-1. Let's go back to schools-prod-1 server:
ssh schools-prod-1
Let's get all of our new data from the scuole-data GitHub:
cd scuole-data
git checkout master
git pull
You should see the latest data pulled into the production server from GitHub.
Next, get inside the Docker container:
docker exec -i -t scuole_web_1 /bin/ash
This is where we will run all of the scripts needed to update the data.
Let's run all migrations so the database is all set up.
python manage.py migrate
First, let's update the data in the schools explorer part of the site. Remember that there's an order to this:
- District boundaries and campus coordinates
- District and campus entities
- AskTED
- TAPR
Updating district boundaries and campus coordinates and district and campus entities
Just like in the local and test server, get into the Python terminal and remove the district and campus models:
python manage.py shell
from scuole.districts.models import District
district = District.objects.all()
district.delete()
from scuole.campuses.models import Campus
campus = Campus.objects.all()
campus.delete()
exit()
Lastly, run make data/bootstrap-entities to create the district and campus models.
Updating AskTED
Run this command to load the latest AskTED data:
make data/update-directories
Updating TAPR
Run this command to load the latest TAPR data:
make data/latest-school
Updating Cohorts
Run this command to load the latest cohorts data (latest year should be 2012 in this sunset/last version):
python manage.py loadallcohorts <latest year>
Once that's done, check the live site. Your changes should be there!
One final thing is that you should make sure you bust the cache for the schools explorer metadata on Twitter's card validator. You can do this by adding a query param to the card URL, like this: https://schools.texastribune.org/?hello and previewing the card.
Here are the bare bones instructions for updating data and code on the test and production servers, after you're satisfied with your changes locally. To read more, check out the Updating data and Deploying on the test servers and Deploying on the production servers sections.
These changes need to be made on the schools-test, schools-prod-1 and schools-prod-2 servers.
- Get onto the host machine: ssh schools-test. (schools-test is the host machine; use schools-prod-1 and schools-prod-2 to get onto the production machines.)
- Get into the code repo: cd scuole
- Get any code changes: git pull
- Rebuild and restart Docker services and containers: make compose/test-deploy (make compose/production-deploy on the production servers)
- Make sure Docker containers are running: docker ps (There should be a container for the web, db and proxy services on schools-test, but only web and proxy on schools-prod-1 and schools-prod-2.)
These changes only need to be made on the schools-test and schools-prod-1 servers.
- Get into the data repo: cd scuole-data
- Get the latest data: git pull
- Get into the web container to make data updates: docker exec -i -t scuole_web_1 /bin/ash
- Run migrations so the database is all set up: python manage.py migrate
- If you're doing a new update and need to update the campus and district models, run through the commands here
- After you've created new models, run through the rest of the data update:
  - AskTED: make data/update-directories
  - TAPR: make data/latest-school
  - Cohorts: python manage.py loadallcohorts <latest-year> (latest year should be 2012 in this sunset/last version)
Before publishing, make sure you make these small changes.
The workspace directory is used for incorporating the schools database with other datasets we run across in our reporting. These include:
- A-F scores
For this, we merge the slugs for campuses in our schools app with their A-F scores from TEA. This is done so we can link to their pages in the schools app when showing them in our grade lookup tool. The spreadsheet with A-F scores from TEA gets put into the raw_data directory manually. The other spreadsheet you'll need is one with slugs for each campus in our schools app. It can be generated by running:
python manage.py exportslugs
After those files are in the raw_data directory, run everything inside of analysis.ipynb to spit out a merged spreadsheet in the output directory, which will then be loaded into a Google spreadsheet and used with the lookup tool.
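The heart of that notebook is a merge along these lines (a sketch only — the real analysis.ipynb's column names and join key are likely different):

```python
import pandas as pd

def merge_af_scores(slugs, scores, key="campus_id"):
    """Left-join TEA's A-F scores onto the exported slugs so every
    campus keeps its row even when it has no score yet."""
    return slugs.merge(scores, on=key, how="left", validate="one_to_one")

# Illustrative frames standing in for the two spreadsheets:
slugs = pd.DataFrame({"campus_id": ["1", "2"], "slug": ["a-high", "b-high"]})
scores = pd.DataFrame({"campus_id": ["1"], "grade": ["A"]})
merged = merge_af_scores(slugs, scores)
```

The how="left" keeps unscored campuses in the output (with a blank grade), and validate="one_to_one" makes pandas raise if either file has duplicate campus IDs, which is exactly the kind of data problem you want to catch before the spreadsheet reaches the lookup tool.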
This likely won't have an admin interface, but you are welcome to use it to check out how things are getting loaded. First, you'll need to create a superuser. (If you ever blow away your database, you'll have to do it again!)
python manage.py createsuperuser
Then, after a python manage.py runserver, you can visit http://localhost:8000/admin and use the credentials you set up to get access. Everything will be set to read-only, so there's no risk of borking anything.