@huangh huangh commented Aug 1, 2025

Prototype import spare to dev
Import all spare to dev/prod

Related Devops/Terraform:
https://github.com/mbta/terraform_modules/pull/287
https://github.com/mbta/terraform_modules/pull/289
https://github.com/mbta/terraform_modules/pull/295

https://github.com/mbta/devops/pull/3042

Why do we need changes?

Spare/Paratransit tech is reusing the existing Tableau architecture developed by LAMP to upload data to Tableau for analysts. All Spare data is in the /spare partition of the same dataplatform buckets and requires specific permissions to access (as implemented in the devops tickets above).

What changes does this PR propose?

The core work here is to generically convert and flatten the files into a shape that the Tableau Hyper API accepts. A simple way to do this is to convert everything to strings, but here we added a utility to flatten, explode, and convert all fields into a fully flat structure. Based on review comments, this PR implements just string flattening as a reasonable default. The utility is generic and reusable for other Tableau upload tasks.
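To illustrate the general idea of string flattening, here is a minimal sketch in plain Python. It is not the code from this PR: the function name `flatten_record`, the dotted column naming, and the choice to JSON-encode lists are all assumptions made for the example.

```python
from __future__ import annotations

import json
from typing import Any


def flatten_record(record: dict[str, Any], parent: str = "", sep: str = ".") -> dict[str, str | None]:
    """Flatten nested dicts into one level and render every leaf as a string.

    Hypothetical helper for illustration only; naming and list handling are assumptions.
    """
    flat: dict[str, str | None] = {}
    for key, value in record.items():
        # Build a column name such as "rider.name" for nested fields.
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into struct-like values so no nesting remains.
            flat.update(flatten_record(value, name, sep))
        elif value is None:
            # Keep nulls as nulls rather than the string "None".
            flat[name] = None
        elif isinstance(value, (list, tuple)):
            # Lists are kept in a single column as a JSON string.
            flat[name] = json.dumps(value, default=str)
        else:
            flat[name] = str(value)
    return flat
```

Calling `flatten_record({"id": 1, "rider": {"name": "A"}})` yields a single-level dict with the keys `id` and `rider.name`, both holding string values.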

We also autogenerated the input/output and processing methods for each of these inputs, so adding a new source only requires adding it to a whitelist; it will then get processed and uploaded.
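The whitelist-driven pattern described above might look roughly like the following sketch. Everything here is hypothetical: the resource names, the `UploadJob` class, the `build_jobs` function, the bucket name, and the path layout are placeholders, not the identifiers used in this PR.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical whitelist; adding a name here is the only step needed for a new source.
SPARE_RESOURCES = ["rides", "vehicles"]


@dataclass(frozen=True)
class UploadJob:
    """Input and output locations derived from a resource name (illustrative only)."""

    resource: str
    input_path: str
    output_path: str


def build_jobs(resources: list[str], bucket: str = "example-bucket") -> list[UploadJob]:
    """Generate one job per whitelisted resource using an assumed path convention."""
    return [
        UploadJob(
            resource=name,
            input_path=f"s3://{bucket}/spare/{name}.parquet",
            output_path=f"s3://{bucket}/tableau/spare/{name}.parquet",
        )
        for name in resources
    ]
```

With this shape, the processing loop never needs per-resource code: it iterates over `build_jobs(SPARE_RESOURCES)` and applies the same conversion to each job.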

How were these changes validated?

Tested locally, tested the main conversion method, and tested in staging (since there is no environment to test in via dev).

What questions should reviewers consider?

Limitations:

We removed the ability to do "custom" conversions for now because they were not necessary. A future improvement may be to add includes/excludes or, as requested, to leave structs as just a "string" if exploding them too deeply causes analysis issues.

Another thing to watch is the runtime: we've added ~46 new resources to upload, and they all run in the Tableau cron job, which runs every hour.
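One way to keep an eye on the runtime concern is to time each upload and compare the total against the hourly window. The sketch below is an assumption about how that could be done, not something this PR implements; `run_with_budget` and its parameters are hypothetical names.

```python
from __future__ import annotations

import time
from typing import Callable


def run_with_budget(
    jobs: dict[str, Callable[[], None]],
    budget_seconds: float = 3600.0,
    clock: Callable[[], float] = time.monotonic,
) -> tuple[dict[str, float], bool]:
    """Run each job in order, record its duration, and report whether the total fits the budget.

    Hypothetical helper; the default budget matches an hourly cron interval.
    """
    durations: dict[str, float] = {}
    for name, job in jobs.items():
        start = clock()
        job()
        durations[name] = clock() - start
    total = sum(durations.values())
    return durations, total <= budget_seconds
```

With roughly 46 resources sharing a 3600-second window, the average per-resource budget is about 78 seconds, so per-job durations from a helper like this would show which uploads dominate the run.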

@huangh huangh force-pushed the hh-250729-feat-spare branch from 126f959 to 82e3697 on August 14, 2025 17:41
@github-actions

LCOV of commit a95c7be during Continuous Integration (Python) #1244

Summary coverage rate:
  lines......: 74.9% (2577 of 3440 lines)
  functions..: 32.9% (191 of 580 functions)
  branches...: no data found

Files changed coverage rate:
                                                                                     |Lines       |Functions  |Branches    
  Filename                                                                           |Rate     Num|Rate    Num|Rate     Num
  =========================================================================================================================
  src/lamp_py/ingestion/config_busloc_trip.py                                        |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/config_rt_trip.py                                            |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/gtfs_rt_detail.py                                            |88.2%     17| 0.0%     8|    -      0
  src/lamp_py/ingestion/utils.py                                                     |47.0%    115|18.2%    22|    -      0
  src/lamp_py/runtime_utils/process_logger.py                                        | 100%     64|50.0%    12|    -      0

huangh added 28 commits August 26, 2025 14:15
…rom s3, extracts the schema, and then renames fields that are invalid for tableau, and returns that schema
…_schema" that operates on schemas and not tables. this might not be used...or be possible. wip
…rform operations like flatten. this is generic, can do anything
…hat dont upload to tableau, but need to address
@huangh huangh force-pushed the hh-250729-feat-spare branch from a95c7be to 25ef982 on August 26, 2025 18:15
@github-actions

LCOV of commit 4731eba during Continuous Integration (Python) #1246

Summary coverage rate:
  lines......: 74.7% (2569 of 3440 lines)
  functions..: 32.9% (191 of 580 functions)
  branches...: no data found

Files changed coverage rate:
                                                                                     |Lines       |Functions  |Branches    
  Filename                                                                           |Rate     Num|Rate    Num|Rate     Num
  =========================================================================================================================
  src/lamp_py/ingestion/config_busloc_trip.py                                        |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/config_rt_trip.py                                            |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/gtfs_rt_detail.py                                            |88.2%     17| 0.0%     8|    -      0
  src/lamp_py/ingestion/utils.py                                                     |47.0%    115|18.2%    22|    -      0
  src/lamp_py/runtime_utils/process_logger.py                                        | 100%     64|50.0%    12|    -      0

@github-actions

LCOV of commit 0b4c262 during Continuous Integration (Python) #1247

Summary coverage rate:
  lines......: 74.7% (2569 of 3440 lines)
  functions..: 32.9% (191 of 580 functions)
  branches...: no data found

Files changed coverage rate:
                                                                                     |Lines       |Functions  |Branches    
  Filename                                                                           |Rate     Num|Rate    Num|Rate     Num
  =========================================================================================================================
  src/lamp_py/ingestion/config_busloc_trip.py                                        |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/config_rt_trip.py                                            |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/gtfs_rt_detail.py                                            |88.2%     17| 0.0%     8|    -      0
  src/lamp_py/ingestion/utils.py                                                     |47.0%    115|18.2%    22|    -      0
  src/lamp_py/runtime_utils/process_logger.py                                        | 100%     64|50.0%    12|    -      0

Collaborator Author


break out lamp jobs from spare jobs to stay organized

@github-actions

LCOV of commit 2d4d31f during Continuous Integration (Python) #1248

Summary coverage rate:
  lines......: 74.7% (2569 of 3440 lines)
  functions..: 32.9% (191 of 580 functions)
  branches...: no data found

Files changed coverage rate:
                                                                                     |Lines       |Functions  |Branches    
  Filename                                                                           |Rate     Num|Rate    Num|Rate     Num
  =========================================================================================================================
  src/lamp_py/ingestion/config_busloc_trip.py                                        |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/config_rt_trip.py                                            |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/gtfs_rt_detail.py                                            |88.2%     17| 0.0%     8|    -      0
  src/lamp_py/ingestion/utils.py                                                     |47.0%    115|18.2%    22|    -      0
  src/lamp_py/runtime_utils/process_logger.py                                        | 100%     64|50.0%    12|    -      0

@github-actions

LCOV of commit 47d290e during Continuous Integration (Python) #1250

Summary coverage rate:
  lines......: 74.6% (2613 of 3504 lines)
  functions..: 32.6% (193 of 592 functions)
  branches...: no data found

Files changed coverage rate:
                                                                                     |Lines       |Functions  |Branches    
  Filename                                                                           |Rate     Num|Rate    Num|Rate     Num
  =========================================================================================================================
  src/lamp_py/ingestion/config_busloc_trip.py                                        |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/config_rt_trip.py                                            |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/gtfs_rt_detail.py                                            |88.2%     17| 0.0%     8|    -      0
  src/lamp_py/ingestion/utils.py                                                     |47.0%    115|18.2%    22|    -      0
  src/lamp_py/runtime_utils/process_logger.py                                        | 100%     64|50.0%    12|    -      0
  src/lamp_py/tableau/__init__.py                                                    |35.0%     20| 0.0%     6|    -      0
  src/lamp_py/tableau/spare/default_converter.py                                     |65.9%     44|33.3%     6|    -      0

@huangh huangh changed the title from "Hh 250729 feat spare" to "feat(spare/tableau): Ingest, flatten, and upload Spare resources to tableau" on Aug 27, 2025
@huangh huangh marked this pull request as ready for review August 27, 2025 16:02
@huangh huangh requested review from ealexa05 and skyqrose August 27, 2025 16:03
@github-actions

LCOV of commit 8daaaee during Continuous Integration (Python) #1251

Summary coverage rate:
  lines......: 74.4% (2615 of 3516 lines)
  functions..: 32.4% (192 of 592 functions)
  branches...: no data found

Files changed coverage rate:
                                                                                     |Lines       |Functions  |Branches    
  Filename                                                                           |Rate     Num|Rate    Num|Rate     Num
  =========================================================================================================================
  src/lamp_py/ingestion/config_busloc_trip.py                                        |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/config_rt_trip.py                                            |82.4%     17|12.5%     8|    -      0
  src/lamp_py/ingestion/gtfs_rt_detail.py                                            |88.2%     17| 0.0%     8|    -      0
  src/lamp_py/ingestion/utils.py                                                     |47.0%    115|18.2%    22|    -      0
  src/lamp_py/runtime_utils/process_logger.py                                        | 100%     64|50.0%    12|    -      0
  src/lamp_py/tableau/__init__.py                                                    |35.0%     20| 0.0%     6|    -      0
  src/lamp_py/tableau/spare/default_converter.py                                     |55.4%     56|16.7%     6|    -      0

@huangh huangh requested a review from skyqrose August 27, 2025 19:40
Member

@skyqrose skyqrose left a comment


I haven't reviewed the code in depth, but based on talking to you out of band and a spot check of the data in Tableau, 👍.

Thanks for taking care of this, I would not have been able to do it myself.

@huangh huangh merged commit 06a8aed into main Aug 28, 2025
6 checks passed
@huangh huangh deleted the hh-250729-feat-spare branch August 28, 2025 15:33

3 participants