This README documents only the repository scripts whose filenames start with `update`:

- `update_migration_status.py`
- `update_nonnull_counts.py`
- `update_transfer_metrics_summary.py`
- `update_data_for_amp_review.py`

These scripts are “ops / reporting” utilities that sync Google Sheets (and, for some tasks, an Ocotillo Postgres database + local log/CSV files).
Python 3 recommended.

At minimum, these scripts rely on:

- `google-api-python-client`
- `google-auth`
- `pg8000` (only for scripts that query Postgres)

Install them with `pip install google-api-python-client google-auth pg8000`.

Several scripts expect a service account JSON file in the repo root:

- `service_account.json` (migration status + non-null counts + transfer summary)
- `transfermetrics_service_account.json` (AMP review scripts)

Make sure:
- The Google Sheets API is enabled for the GCP project.
- The target spreadsheets are shared with the service account email (Editor access).

Purpose (`update_migration_status.py`): Recompute migration tracking columns in the MIGRATION_STATUS Google Sheet from scratch.

Sheet columns:

- `Migration Path`
- `NMAquifer_TableField` (old `Table.Field`, e.g. `Equipment.DateInstalled`)
- `Final Schema Target` (final `table.field`, e.g. `transducer_observation.value`)
  - Accepts aliases like `New Schema Target`, `New Target Schema`, `Final Target Schema`
- `Temp Schema Target`
- `Final Mapping Status`
- `Final Target Status`
- `Temp Mapping Status`
- `Temp Target Status`
- `Transfer Status`

Behavior:

- Derives temp schema targets using the staging naming convention (e.g. `nma_<table>.<field>`), with fallback/normalization logic.
- Checks whether targets exist in the Ocotillo DB schema.
- Computes status values using exact strings like:
  - `defined`, `undefined`, `exists`, `missing`
  - `staging transfer complete`, `final transfer complete`, `incomplete`
  - `not being migrated`

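The header aliases and the staging naming convention described above can be sketched roughly as follows. This is only an illustration, not the script's actual code: the helper names (`find_column`, `derive_temp_target`, `target_status`) are hypothetical, the sketch keeps the original casing of table and field names, and it leaves out the fallback/normalization logic that the real script applies.

```python
# Hypothetical helpers; the real script's names and logic may differ.
FINAL_TARGET_ALIASES = (
    "Final Schema Target",
    "New Schema Target",
    "New Target Schema",
    "Final Target Schema",
)


def find_column(headers, candidates):
    """Return the index of the first header that exactly matches a candidate."""
    for index, header in enumerate(headers):
        if header in candidates:
            return index
    return None


def derive_temp_target(old_table_field):
    """Map 'Table.Field' to 'nma_<table>.<field>' (casing left unchanged here)."""
    table, sep, field = old_table_field.strip().partition(".")
    if not sep or not table or not field:
        return ""  # malformed value: no target can be derived
    return f"nma_{table}.{field}"


def target_status(target, existing_columns):
    """Return 'exists' or 'missing' for a 'table.field' target."""
    return "exists" if target in existing_columns else "missing"
```

In this sketch `existing_columns` is assumed to be a set of `table.field` strings collected from the database schema beforehand.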
Near the top of the script:

- `SERVICE_ACCOUNT_FILE`
- `SPREADSHEET_ID`
- `SHEET_NAME`
- DB connection settings (`DB_HOST`, `DB_PORT`, etc.)
- `UPDATED_CELL` (writes a “last updated” timestamp into a single cell)

Run: `python update_migration_status.py`

Purpose (`update_nonnull_counts.py`): For rows where `Migration Path == "stage then refactor"`, compute non-null counts for old vs temp fields and write the results back to the MIGRATION_STATUS sheet.

- Google Sheet columns:
  - `Migration Path`
  - `NMAquifer_TableField`
  - `Temp Schema Target` ✅ (this is why you typically run `update_migration_status.py` first)
- Local CSV file:
  - `nma_aquifer_nonnull_counts.csv` (default name in the script)

Columns written:

- `NMA NonNull Count`
- `Temp NonNull Count`
- `NonNull Diff` (old − temp)

Notes:

- It does not rewrite other status logic.
- It will only update `Transfer Status` to `staging transfer complete` when:
  - migration path is `stage then refactor`, and
  - `NonNull Diff == 0`
- Otherwise, it preserves the existing `Transfer Status` value.

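The `Transfer Status` rule above is small enough to express as a pure function. The sketch below is a minimal illustration of that rule, assuming the counts are already available as integers; the function and constant names are hypothetical and not taken from the script.

```python
STAGE_THEN_REFACTOR = "stage then refactor"
STAGING_COMPLETE = "staging transfer complete"


def nonnull_diff(old_count, temp_count):
    """NonNull Diff is defined as old minus temp."""
    return old_count - temp_count


def next_transfer_status(migration_path, old_count, temp_count, current_status):
    """Return the new Transfer Status, or the existing one if the rule does not apply."""
    if migration_path == STAGE_THEN_REFACTOR and nonnull_diff(old_count, temp_count) == 0:
        return STAGING_COMPLETE
    return current_status  # preserve whatever is already in the sheet
```

For example, a `stage then refactor` row with 120 non-null values on both sides becomes `staging transfer complete`, while a row with a non-zero diff keeps its current value.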
Run: `export DB_PASS="..."`, then `python update_nonnull_counts.py`

Purpose (`update_transfer_metrics_summary.py`): Parse “transfer metrics” log output (stored as a `.csv` text file with pipe-delimited blocks) and write summary rows to a Google Sheet tab.

- A log file path configured in `TRANSFER_METRICS_PATH` (example points into `transfer_metrics_logs/...`)

The parser expects blocks like:
- First block may contain a header line:
  - `model|input_count|cleaned_count|transferred|issue_percentage`
- Blocks then include rows under:
  - `PointID|Table|Field|Error`

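A parser for this kind of block layout might look like the sketch below. It assumes that each header line appears verbatim, that data rows follow the most recent header, and that blank lines can be ignored. The exact block boundaries in the real log are not documented here, so treat this as one plausible reading. The function name `parse_transfer_metrics` is hypothetical.

```python
SUMMARY_HEADER = "model|input_count|cleaned_count|transferred|issue_percentage"
ERROR_HEADER = "PointID|Table|Field|Error"


def parse_transfer_metrics(text):
    """Split log text into (summary_rows, error_rows), each a list of dicts."""
    summaries, errors = [], []
    columns, bucket = None, None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines carry no data in this sketch
        if line == SUMMARY_HEADER:
            columns, bucket = line.split("|"), summaries
        elif line == ERROR_HEADER:
            columns, bucket = line.split("|"), errors
        elif columns is not None:
            # Limit the split so extra '|' characters stay in the last column.
            values = line.split("|", len(columns) - 1)
            if len(values) == len(columns):
                bucket.append(dict(zip(columns, values)))
    return summaries, errors
```

Limiting the split keeps free-text error messages intact even when they contain a pipe character.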
Writes a summary table with headers:

`model | Table | input_count | cleaned_count | transferred | issue_percentage`

…and writes it to the configured sheet/tab (defaults: spreadsheet `1Ntka...` and sheet name `transfer_metrics`).

Run: `python update_transfer_metrics_summary.py`

Purpose (`update_data_for_amp_review.py`): Append new `[NMAquifer_Table.Field, PointID, Error]` rows into the AMP_review sheet without overwriting existing content.

- Reads the transfer metrics log file (configured by `TRANSFER_METRICS_PATH`)
- Extracts rows from blocks under:
  - `PointID|Table|Field|Error`
- Appends to `AMP_review` columns A:C using the Sheets append API.
- Skips duplicates already present in the sheet.

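The duplicate check and the append call could be combined along these lines. This is a sketch under assumptions: `service` is a Sheets v4 client built with `google-api-python-client`, rows are compared on all three columns, and short rows returned by the API are padded with empty strings before comparison. The helper names and the `RAW` input option are illustrative choices, not taken from the script.

```python
def new_rows_only(existing_rows, candidate_rows, width=3):
    """Return candidate rows not already present, padded to `width` columns."""
    def key(row):
        # The Sheets API omits trailing empty cells, so pad before comparing.
        return tuple((list(row) + [""] * width)[:width])

    seen = {key(row) for row in existing_rows}
    fresh = []
    for row in candidate_rows:
        row_key = key(row)
        if row_key not in seen:
            seen.add(row_key)  # also removes duplicates within the candidates
            fresh.append(list(row_key))
    return fresh


def append_rows(service, spreadsheet_id, rows):
    """Append rows to AMP_review!A:C via spreadsheets.values.append."""
    if not rows:
        return None  # nothing new to write
    request = service.spreadsheets().values().append(
        spreadsheetId=spreadsheet_id,
        range="AMP_review!A:C",
        valueInputOption="RAW",
        insertDataOption="INSERT_ROWS",
        body={"values": rows},
    )
    return request.execute()
```

Because the append endpoint adds rows after the existing data, content already in the sheet is left untouched.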
Cleans error text to reduce noise, including:

- removing `row.id=...`
- normalizing certain `sensor_type` and `organization` errors into shorter forms
- removing `value error` prefixes
- stripping quotes and trailing separators

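Some of these cleaning steps can be approximated with a few regular expressions. The sketch below is illustrative only: the patterns are guesses at what the log text looks like, and the `sensor_type` / `organization` normalization is omitted because the shortened forms are not documented here. The function name `clean_error` is hypothetical.

```python
import re


def clean_error(text):
    """Reduce noise in an error message (assumed patterns, not the script's own)."""
    cleaned = re.sub(r"\brow\.id=\S+", "", text)  # drop row.id=... tokens
    cleaned = re.sub(r"\bvalue error[,:]?\s*", "", cleaned, flags=re.IGNORECASE)
    cleaned = cleaned.replace('"', "").replace("'", "")  # strip quotes
    cleaned = re.sub(r"\s+", " ", cleaned)  # collapse runs of whitespace
    return cleaned.strip(" ,;|:")  # trim leading/trailing separators
```

Cleaning the text before the duplicate check means that two messages differing only in noise collapse into a single review row.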
Run: `python update_data_for_amp_review.py`

A common order that matches dependencies:

- Update migration tracking + temp targets
  - `export DB_PASS="..."`
  - `python update_migration_status.py`
- (Stage/refactor only) Update non-null comparisons
  - `python update_nonnull_counts.py`
- Parse transfer metrics into summary tables
  - `python update_transfer_metrics_summary.py`
- Feed new transfer errors into AMP review
  - `python update_data_for_amp_review.py`


Troubleshooting:

- Sheets permission errors: share the spreadsheet with the service account email from the JSON credentials file.
- DB connection/auth errors: confirm `DB_PASS` is set, and that `ocotillo-staging` is reachable at `127.0.0.1:5432`.
- Missing columns in sheets: these scripts expect exact column headers. If your sheet differs, either rename columns or update the script’s header lookups.
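
The DB-related checks above can be run as a quick preflight before starting a script. This is a minimal sketch using only the standard library; the function names are hypothetical, and the default host and port simply mirror the `127.0.0.1:5432` address mentioned above. It only confirms that the port accepts TCP connections, not that the credentials are valid.

```python
import os
import socket


def db_pass_is_set(environ=None):
    """Return True if DB_PASS is present and non-empty."""
    env = os.environ if environ is None else environ
    return bool(env.get("DB_PASS"))


def port_is_open(host="127.0.0.1", port=5432, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("DB_PASS set:", db_pass_is_set())
    print("Postgres port reachable:", port_is_open())
```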