- AWS Lambda
- AWS S3
- AWS SQS
- sub_set.csv: data/sub_set.csv and s3://taxi-data-processing-data/sub_set.csv
- sub_set_300.csv: s3://taxi-data-processing-data/sub_set_300.csv
- fares.csv: s3://taxi-data-processing-data/fares.csv
- id - a unique identifier for each trip
- vendor_id - a code indicating the provider associated with the trip record
- pickup_datetime - date and time when the meter was engaged
- dropoff_datetime - date and time when the meter was disengaged
- passenger_count - the number of passengers in the vehicle (driver entered value)
- pickup_longitude - the longitude where the meter was engaged
- pickup_latitude - the latitude where the meter was engaged
- dropoff_longitude - the longitude where the meter was disengaged
- dropoff_latitude - the latitude where the meter was disengaged
- store_and_fwd_flag - whether the trip record was held in vehicle memory before being sent to the vendor because the vehicle had no connection to the server (Y = store and forward; N = not a store-and-forward trip)
- trip_duration - duration of the trip in seconds
- Split the map into quadrants and count how many routes started in each quadrant
- Count routes R where length lR > 1000 m, duration tR > 10 min, and passenger count pR > 2
- Find the quadrant of the biggest route
- Find the longest route
- Number of passengers for each vendor
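
The tasks above can be sketched in plain Python as follows. This is a minimal local illustration, not the project's Lambda/SQS implementation. It rests on several assumptions that the list does not state: route length is taken as the straight-line (haversine) distance between pickup and dropoff, since the data has no distance column; the quadrants are defined around a caller-supplied centre point; and the "biggest" route is read as the longest one. The function names `haversine_m`, `quadrant`, and `analyse` are hypothetical.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def quadrant(lon, lat, center_lon, center_lat):
    """Label a point NE/NW/SE/SW relative to an assumed centre point."""
    ns = "N" if lat >= center_lat else "S"
    ew = "E" if lon >= center_lon else "W"
    return ns + ew


def analyse(rows, center_lon, center_lat):
    """Compute the listed statistics from rows keyed by the column names above.

    Values may be strings, as produced by csv.DictReader.
    """
    starts = Counter()      # routes started per quadrant
    passengers = Counter()  # total passengers per vendor_id
    matching = 0            # routes with lR > 1000 m, tR > 10 min, pR > 2
    longest = None          # (length in metres, trip id, pickup quadrant)

    for row in rows:
        p_lon = float(row["pickup_longitude"])
        p_lat = float(row["pickup_latitude"])
        d_lon = float(row["dropoff_longitude"])
        d_lat = float(row["dropoff_latitude"])
        count = int(row["passenger_count"])

        length = haversine_m(p_lon, p_lat, d_lon, d_lat)
        quad = quadrant(p_lon, p_lat, center_lon, center_lat)

        starts[quad] += 1
        passengers[row["vendor_id"]] += count

        # trip_duration is in seconds, so 10 minutes is 600 s.
        if length > 1000 and float(row["trip_duration"]) > 600 and count > 2:
            matching += 1

        if longest is None or length > longest[0]:
            longest = (length, row["id"], quad)

    return {
        "starts_per_quadrant": dict(starts),
        "matching_routes": matching,
        "longest_route": longest,
        "passengers_per_vendor": dict(passengers),
    }
```

The `rows` argument can be any iterable of dictionaries, for example a `csv.DictReader` opened on `data/sub_set.csv`. Keeping all statistics in a single pass means each chunk of the file can be processed independently and the partial results merged afterwards, which is one plausible way to spread the work across Lambda invocations.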