
Taxicab Oncall Data Data Dictionary

The Green Taxi Trip Data dataset contains records of taxi trips in New York City, provided by the Taxi and Limousine Commission (TLC). Each record includes details such as pick-up and drop-off times, locations, distances, fares, and payment types. The dataset is published as historical data and is updated only as needed for corrections, with a focus on transparency and accessibility for public use.

Uploaded by

nagrajsmanthale

Data Dictionary - Dataset Information

Dataset Name Green Taxi Trip Data

Dataset URL https://data.cityofnewyork.us/browse?Data-Collection_Data-Collection=TLC%20Trip%20Data&sortBy=alpha

Data Provided by Taxi and Limousine Commission (TLC)


The name of the NYC agency providing this data to the public.

Each row is a... Taxi trip record


The unit of analysis/level of aggregation of the dataset

Publishing Frequency Historical data

How often changed data is published to this dataset. For an automatically updated dataset, this is the frequency of that automation.

Data Change Frequency As needed


How often the data underlying this dataset is changed

Frequency Details As this is historical data, the dataset is only published once. The data will only be changed rarely if there are corrections needed.

Additional details about the publishing or data change frequency, if needed.

Dataset Description These records are generated from the trip record submissions made by green taxi Technology Service Providers (TSPs). Each row represents a single trip in a green taxi. The trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off taxi zone locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.

Overview of the information this dataset contains, including overall context and definitions of key terms. This field may include links to supporting datasets, agency websites, or external resources for additional context.

Why is this data collected? In partnership with the New York City Office of Technology and Innovation (OTI), TLC has published millions of trip records from both yellow medallion taxis and green Street Hail Liveries (SHLs). Publicizing trip record data through an open platform permits instant access to records which previously were available only through a formal process (a FOIL request). Internally, TLC uses similar data to guide and evaluate policy decisions.

Purpose behind the collection of this data, including any legal or policy requirements for this data by NYC Executive Order, Local Law, or other policy directive.

How is this data collected? The data used in the attached datasets were collected and provided to the NYC Taxi and Limousine Commission (TLC) by technology providers authorized under the Taxicab & Livery Passenger Enhancement Programs (TPEP/LPEP). The trip data was not created by the TLC, and TLC makes no representations as to the accuracy of these data.

The methods used to create and update this dataset, including what cleaning or processing was involved prior to dataset publication. If data collection includes interpreting physical information, this field includes technical details. If data collection includes fielding applications, requests, or complaints, this field includes details about the forms, applications, and processes used.

How can this data be used? TLC internally uses similar data to provide internal and external public dashboards; please see https://www.nyc.gov/site/tlc/about/data-and-research.page. This data is heavily used as an example dataset for data engineering and data science tasks. For an excellent third-party set of visualizations, see https://toddwschneider.com/dashboards/nyc-taxi-ridehailing-uber-lyft-data/.

Examples of and/or links to projects or agency operations that have used this dataset. Where relevant, includes links to online projects, agency websites, visualizations, maps, or dashboards. What are some questions one might answer using this dataset?

What are the unique characteristics or limitations of this dataset? As the trip data was provided by technology providers to TLC, there may be some noise. This may occur in the form of unexpected categories or numbers out of expected ranges in some columns.

Unique characteristics of this dataset to be aware of, specifically, constraints or limitations to the use of the data.
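
The kind of noise described above (unexpected categories, numbers outside expected ranges) can be screened for before analysis. The following is a minimal sketch, not a TLC-endorsed procedure. The allowed codes are copied from the column table in this data dictionary; the numeric bounds, the function name `find_issues`, and the sample records are hypothetical and chosen only for illustration.

```python
# Allowed codes copied from the column table in this data dictionary.
ALLOWED_CODES = {
    "VendorID": {1, 2},
    "RatecodeID": {1, 2, 3, 4, 5, 6, 99},
    "payment_type": {0, 1, 2, 3, 4, 5, 6},
    "trip_type": {1, 2},
    "store_and_fwd_flag": {"Y", "N"},
}

# Hypothetical sanity bounds for illustration only; the dictionary
# itself does not define numeric ranges for these columns.
NUMERIC_BOUNDS = {
    "trip_distance": (0.0, 200.0),
    "passenger_count": (0, 9),
}


def find_issues(record):
    """Return a list of human-readable problems found in one trip record."""
    issues = []
    for column, allowed in ALLOWED_CODES.items():
        value = record.get(column)
        if value is not None and value not in allowed:
            issues.append(f"{column}: unexpected code {value!r}")
    for column, (low, high) in NUMERIC_BOUNDS.items():
        value = record.get(column)
        if value is not None and not (low <= value <= high):
            issues.append(f"{column}: {value!r} outside [{low}, {high}]")
    return issues


# Made-up sample records, not taken from the dataset.
clean = {"VendorID": 2, "RatecodeID": 1, "payment_type": 1,
         "trip_type": 1, "store_and_fwd_flag": "N",
         "trip_distance": 2.4, "passenger_count": 1}
noisy = dict(clean, RatecodeID=7, trip_distance=-3.0)

print(find_issues(clean))
print(find_issues(noisy))
```

Whether flagged rows should be dropped, corrected, or kept depends on the analysis; the sketch only reports them.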

Additional geospatial information Please see our website under "Taxi Zone Maps and Lookup Tables": https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page. There is a shapefile to associate LocationID zones to geographic coordinates.

For any datasets with geospatial data, specify the coordinate reference system or projection used and other relevant details.
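
Because trips carry only zone IDs (PULocationID, DOLocationID), a lookup table is needed to turn them into readable zone names. The sketch below shows one possible way to do that join with the Python standard library. It assumes a lookup CSV with `LocationID`, `Borough`, and `Zone` columns; those header names, the helper functions, and the sample rows are assumptions made for illustration, and the zone names are placeholders rather than real TLC values.

```python
import csv
import io

# Placeholder stand-in for a zone lookup file. Header names are assumed
# and the rows are invented; substitute the real lookup table from TLC.
LOOKUP_CSV = """LocationID,Borough,Zone
1,Borough A,Zone One
2,Borough B,Zone Two
"""


def load_zone_lookup(text):
    """Parse lookup CSV text into a dict keyed by integer LocationID."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["LocationID"]): row for row in reader}


def zone_name(lookup, location_id):
    """Return the zone name for an ID, or None if the ID is not in the lookup."""
    row = lookup.get(location_id)
    return row["Zone"] if row else None


def annotate_trip(trip, lookup):
    """Return a copy of the trip with pick-up and drop-off zone names added."""
    annotated = dict(trip)
    annotated["PUZone"] = zone_name(lookup, trip["PULocationID"])
    annotated["DOZone"] = zone_name(lookup, trip["DOLocationID"])
    return annotated


lookup = load_zone_lookup(LOOKUP_CSV)
# Hypothetical trip; 999 is deliberately absent from the lookup.
print(annotate_trip({"PULocationID": 1, "DOLocationID": 999}, lookup))
```

Returning None for an unknown ID, rather than raising, keeps noisy records visible so they can be reviewed later.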
Data Dictionary - Column Information

Column Name: Name of the column exactly as it appears in the dataset.

Column Description: A brief, plain-language explanation of what the data in the column means.

Expected/Allowed Values: Specifies if there is an expected range and/or format of possible values. For example, if the data type is Date & Time, this field will note whether the timestamp is MM/DD/YYYY or MM/YYYY. If the Column Name is ice cream, this field might note that values can be Chocolate, Vanilla or Strawberry. If relevant, this field specifies the unit of measurement of the data field, e.g. thousands, millions, $ value, miles, feet, year, etc.

Field Limitations: Describes any unique characteristics or potential analytical limitations presented by this field, including:
- the reasoning for any null, zero, or empty values in the data
- if the data in the column was integrated from another dataset or organization
- if the data covered includes a different time period
- the source of the column and how the data in the column was generated.
For example, information on how the data in this column was generated can include whether the data was self-reported directly by a person, system generated by a database or agency system, derived through analytical manipulation of other fields or records, or obtained from a different agency.

Additional Notes: Provides any additional relevant information about the data in the column, including:
- definitions of acronyms, special terms or codes, or jargon that appears in the field values;
- the meaning of confusing or non-intuitive values in the data;
- how the information in this column relates to information in other columns;
- other unique details about this column.

VendorID
Column Description: A code indicating the TPEP provider that provided the record.
Expected/Allowed Values:
1=Creative Mobile Technologies, LLC
2=VeriFone Inc.

lpep_pickup_datetime
Column Description: The date and time when the meter was engaged.

lpep_dropoff_datetime
Column Description: The date and time when the meter was disengaged.

store_and_fwd_flag
Column Description: This flag indicates whether the trip record was held in vehicle memory before sending to the vendor, aka "store and forward," because the vehicle did not have a connection to the server.
Expected/Allowed Values:
Y=store and forward trip
N=not a store and forward trip

RatecodeID
Column Description: The final rate code in effect at the end of the trip.
Expected/Allowed Values:
1=Standard rate
2=JFK
3=Newark
4=Nassau or Westchester
5=Negotiated fare
6=Group ride
99=Null/unknown

PULocationID
Column Description: TLC Taxi Zone in which the taximeter was engaged.

DOLocationID
Column Description: TLC Taxi Zone in which the taximeter was disengaged.

passenger_count
Column Description: The number of passengers in the vehicle.

trip_distance
Column Description: The elapsed trip distance in miles reported by the taximeter.

fare_amount
Column Description: The time-and-distance fare calculated by the meter. For additional information on the following columns, see https://www.nyc.gov/site/tlc/passengers/taxi-fare.page

extra
Column Description: Miscellaneous extras and surcharges.

mta_tax
Column Description: Tax that is automatically triggered based on the metered rate in use.

tip_amount
Column Description: This field is automatically populated for credit card tips. Cash tips are not included.

tolls_amount
Column Description: Total amount of all tolls paid in trip.

ehail_fee
Column Description: Currently unused.

improvement_surcharge
Column Description: Improvement surcharge assessed on trips at the flag drop. The improvement surcharge began being levied in 2015.

total_amount
Column Description: The total amount charged to passengers. Does not include cash tips.

payment_type
Column Description: A numeric code signifying how the passenger paid for the trip.
Expected/Allowed Values:
0=Flex Fare trip
1=Credit card
2=Cash
3=No charge
4=Dispute
5=Unknown
6=Voided trip

trip_type
Column Description: A code indicating whether the trip was a street-hail or a dispatch that is automatically assigned based on the metered rate in use but can be altered by the driver.
Expected/Allowed Values:
1=Street-hail
2=Dispatch

congestion_surcharge
Column Description: Total amount collected in trip for NYS congestion surcharge.
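
The coded columns and the two timestamp columns above are usually decoded before analysis. The sketch below illustrates one way to map payment_type, RatecodeID, and trip_type to their labels and to derive a trip duration from lpep_pickup_datetime and lpep_dropoff_datetime. The code-to-label mappings come from this dictionary; the ISO-style timestamp strings, the function name `describe_trip`, and the sample record are assumptions, since the dictionary does not state the timestamp format.

```python
from datetime import datetime

# Code-to-label mappings copied from the column table above.
PAYMENT_TYPE = {0: "Flex Fare trip", 1: "Credit card", 2: "Cash",
                3: "No charge", 4: "Dispute", 5: "Unknown", 6: "Voided trip"}
RATE_CODE = {1: "Standard rate", 2: "JFK", 3: "Newark",
             4: "Nassau or Westchester", 5: "Negotiated fare",
             6: "Group ride", 99: "Null/unknown"}
TRIP_TYPE = {1: "Street-hail", 2: "Dispatch"}

UNRECOGNIZED = "unrecognized code"


def describe_trip(record):
    """Decode the coded fields of one record and compute duration in minutes."""
    # Assumes ISO-style timestamp strings; adjust the parsing if the
    # file you download uses another format.
    pickup = datetime.fromisoformat(record["lpep_pickup_datetime"])
    dropoff = datetime.fromisoformat(record["lpep_dropoff_datetime"])
    return {
        "payment": PAYMENT_TYPE.get(record["payment_type"], UNRECOGNIZED),
        "rate": RATE_CODE.get(record["RatecodeID"], UNRECOGNIZED),
        "trip_type": TRIP_TYPE.get(record["trip_type"], UNRECOGNIZED),
        "duration_minutes": (dropoff - pickup).total_seconds() / 60,
    }


# Made-up sample record, not taken from the dataset.
trip = {
    "lpep_pickup_datetime": "2023-01-01 08:00:00",
    "lpep_dropoff_datetime": "2023-01-01 08:12:30",
    "payment_type": 2,
    "RatecodeID": 1,
    "trip_type": 1,
}
print(describe_trip(trip))
```

Codes missing from the mappings are labeled rather than rejected, which fits the dictionary's warning that some columns may contain unexpected categories. Note also that tip_amount excludes cash tips, so tip analyses are typically restricted to records where payment_type is 1.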


Data Dictionary - Revision History
calculation method, or method of collection of the data that have taken place since the initial version. Adding
or updating new data values does not necessitate a new version entry.

Date Change Highlights Comments

12/14/2023 Creation of dataset
