NUS Pyspark Homework


1.3.2 Data Preparation (12 marks)


Step 1 (5 marks): Download and preprocess CitiBike data
Download and unzip the CitiBike data files using Python code.
Read all the CSV files into a Spark DataFrame.
Aggregate the data into daily records, where:
    The trip_count column records the daily trip count (required).
    The datetime column records the date (required).
    The remaining columns hold features that are useful for predicting trip_count. These
    features must not contain trip_count information, directly or indirectly.
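
The aggregation described above can be sketched in plain Python on a few in-memory rows. This is only an illustration of the logic, not the notebook's Spark code: the function name `daily_records` and the feature names `day_of_week`, `month`, and `is_weekend` are assumptions, and the input is assumed to use the `yyyy-MM-dd HH:mm:ss` layout. The calendar features are derived from the date alone, so they carry no trip_count information.

```python
from collections import Counter
from datetime import datetime


def daily_records(start_times):
    """Aggregate trip start timestamps into one record per calendar day."""
    # Count how many trips started on each date.
    counts = Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date() for ts in start_times
    )
    records = []
    for day in sorted(counts):
        records.append({
            "datetime": day.isoformat(),
            "trip_count": counts[day],
            # Calendar features depend only on the date, never on trip_count.
            "day_of_week": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
            "month": day.month,
            "is_weekend": day.isoweekday() >= 6,
        })
    return records


# Hypothetical sample input: three trips spread over two days.
sample = [
    "2014-01-01 00:00:06",
    "2014-01-01 00:00:38",
    "2014-01-02 08:15:00",
]
for record in daily_records(sample):
    print(record)
```

On a Spark DataFrame the same grouping would normally be expressed as a group-by on the date part of the start time followed by a count.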

Step 2 (2 marks): Download and preprocess the weather data


Download the weather data into a CSV file.
Upload it to Google Drive and mount Google Drive in Colab to read it into a Spark DataFrame.
Preprocess the weather data:
    Handle missing values.
    Select relevant weather features.

Step 3: Optional Step (no marks)


Feel free to explore and download other datasets that might be helpful.
If the data file size is large, provide the code for downloading the data.
While no marks are allocated, using extra data that improves your model performance can lead
to higher marks in Section 1.3.3.
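
Download code for a large data file, as requested above and in Step 1, might look like the following standard-library sketch. The helper names `download_file` and `extract_csvs` are hypothetical, and the URL in the usage comment is a placeholder rather than a real data location. Skipping files that already exist avoids repeated downloads when the notebook is re-run.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_file(url, dest_path):
    """Download url to dest_path unless the file is already present."""
    dest_path = Path(dest_path)
    if not dest_path.exists():
        dest_path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest_path))
    return dest_path


def extract_csvs(zip_path, out_dir):
    """Extract only the .csv members of a zip archive and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Ignore non-CSV members such as readme files.
            if name.lower().endswith(".csv"):
                extracted.append(Path(archive.extract(name, out_dir)))
    return extracted


# Usage (placeholder URL, replace with the actual data source):
# zip_path = download_file("https://example.com/tripdata.zip", "data/tripdata.zip")
# csv_paths = extract_csvs(zip_path, "data/csv")
```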

Step 4 (4 marks): Combine all data into a final table


Combine the processed CitiBike data, weather data, and any additional data into a final
DataFrame.
Ensure the final table has:
    The trip_count column as the target.
    Other columns as features for training your model.
Save the aggregated table as a file in Google Drive or the Colab environment for faster loading in
subsequent steps.

Step 5 (1 mark): Split data into training and testing sets


Filter the final DataFrame as follows:
    Training Data: Data from January 2022 to December 2022 (~300+ rows).
    Testing Data: Data from January 2023 to July 2023 (~200+ rows).
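
The date ranges above can be checked with a short plain-Python sketch. It assumes both ranges are inclusive of their first and last calendar day and that each row carries a `datetime` value of type `date`; the names `split_by_date` and `days_inclusive` are hypothetical. With one row per day, the ranges give 365 and 212 rows, consistent with the approximate counts stated in the task.

```python
from datetime import date

# Assumed inclusive boundaries for the two periods named in the task.
TRAIN_START, TRAIN_END = date(2022, 1, 1), date(2022, 12, 31)
TEST_START, TEST_END = date(2023, 1, 1), date(2023, 7, 31)


def split_by_date(rows):
    """Split daily rows (dicts with a `datetime` date) into train and test lists."""
    train = [r for r in rows if TRAIN_START <= r["datetime"] <= TRAIN_END]
    test = [r for r in rows if TEST_START <= r["datetime"] <= TEST_END]
    return train, test


def days_inclusive(start, end):
    """Number of calendar days from start to end, counting both endpoints."""
    return (end - start).days + 1


print(days_inclusive(TRAIN_START, TRAIN_END))  # 365
print(days_inclusive(TEST_START, TEST_END))    # 212
```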

Step 1 (5 marks): Download and preprocess CitiBike data

In [ ]: from pyspark.sql import SparkSession

# Create a SparkSession
spark = SparkSession.builder \
    .appName("Read and Export Multiple CSV Files with PySpark") \
    .getOrCreate()

# List of CSV file names
file_names = [
    '201401_citibike_tripdata_1.csv',
    '201402_citibike_tripdata_1.csv',
    '201403_citibike_tripdata_1.csv',
    '201404_citibike_tripdata_1.csv',
    '201405_citibike_tripdata_1.csv',
    '201406_citibike_tripdata_1.csv',
    '201407_citibike_tripdata_1.csv',
    '201408_citibike_tripdata_1.csv',
    '201409_citibike_tripdata_1.csv',
    '201410_citibike_tripdata_1.csv',
    '201411_citibike_tripdata_1.csv',
    '201412_citibike_tripdata_1.csv'
]

# Path to the CSV files in FileStore
base_path = '/FileStore/'

# Loop over each file name and process them
for idx, file_name in enumerate(file_names, start=1):
    # Full path to the CSV file
    csv_path = f'{base_path}{file_name}'

    # Read the CSV file into a PySpark DataFrame
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Dynamically assign DataFrame to df1, df2, ..., df12
    globals()[f'df{idx}'] = df

    # Show the first few rows of the DataFrame for verification
    print(f"Displaying first few rows of {file_name}:")
    df.show()

# Optionally, you can now access the DataFrames as df1, df2, ..., df12
# Example: df1.show(), df2.show(), etc.
Displaying first few rows of 201401_citibike_tripdata_1.csv:
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
|tripduration| starttime| stoptime|start station id| start station name|
start station latitude|start station longitude|end station id| end station name|end stati
on latitude|end station longitude|bikeid| usertype|birth year|gender|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
| 471|2014-01-01 00:00:06|2014-01-01 00:07:57| 2009|Catherine St & Mo...|
40.71117444| -73.99682619| 263|Elizabeth St & He...| 40.71729
| -73.996375| 16379|Subscriber| 1986| 1|
| 1494|2014-01-01 00:00:38|2014-01-01 00:25:32| 536| 1 Ave & E 30 St|
40.74144387| -73.97536082| 259|South St & Whiteh...| 40.70122128
| -74.01234218| 15611|Subscriber| 1963| 1|
| 464|2014-01-01 00:03:59|2014-01-01 00:11:43| 228| E 48 St & 3 Ave|
40.7546011026| -73.971878855| 2022| E 59 St & Sutton Pl| 40.758491
16| -73.95920622| 16613|Subscriber| 1991| 1|
| 373|2014-01-01 00:05:15|2014-01-01 00:11:28| 519| Pershing Square N|
40.75188406| -73.97770164| 526| E 33 St & 5 Ave| 40.74765947
| -73.98490707| 15938|Subscriber| 1989| 1|
| 660|2014-01-01 00:05:18|2014-01-01 00:16:18| 83|Atlantic Ave & Fo...|
40.68382604| -73.97632328| 436|Hancock St & Bedf...| 40.68216564
| -73.95399026| 19830|Subscriber| 1990| 1|
| 330|2014-01-01 00:05:55|2014-01-01 00:11:25| 422| W 59 St & 10 Ave|
40.770513| -73.988038| 526| E 33 St & 5 Ave| 40.74765947|
-73.98490707| 17343|Subscriber| 1987| 1|
| 261|2014-01-01 00:06:04|2014-01-01 00:10:25| 516| E 47 St & 1 Ave|
40.75206862| -73.96784384| 167| E 39 St & 3 Ave| 40.7489006
| -73.97604882| 17880|Subscriber| 1983| 1|
| 337|2014-01-01 00:06:41|2014-01-01 00:12:18| 380| W 4 St & 7 Ave S|
40.73401143| -74.00293877| 435| W 21 St & 6 Ave| 40.74173969
| -73.99415556| 16275|Subscriber| 1963| 1|
| 429|2014-01-01 00:07:33|2014-01-01 00:14:42| 296|Division St & Bowery|
40.71413089| -73.9970468| 306|Cliff St & Fulton St| 40.70823502
| -74.00530063| 17318|Subscriber| 1972| 2|
| 1025|2014-01-01 00:08:27|2014-01-01 00:25:32| 540|Lexington Ave & E...|
40.74147286| -73.98320928| 447| 8 Ave & W 52 St| 40.76370739
| -73.9851615| 15525|Subscriber| 1981| 1|
| 718|2014-01-01 00:09:32|2014-01-01 00:21:30| 263|Elizabeth St & He...|
40.71729| -73.996375| 251| Mott St & Prince St| 40.72317958|
-73.99480012| 15693| Customer| \N| 0|
| 786|2014-01-01 00:10:59|2014-01-01 00:24:05| 153| E 40 St & 5 Ave|
40.752062307| -73.9816324043| 290| 2 Ave & E 58 St| 40.7602025
8| -73.96478473| 15281|Subscriber| 1968| 1|
| 267|2014-01-01 00:11:17|2014-01-01 00:15:44| 151|Cleveland Pl & Sp...|
40.7218158| -73.99720307| 410|Suffolk St & Stan...| 40.72066442|
-73.98517977| 15159|Subscriber| 1983| 1|
| 744|2014-01-01 00:12:23|2014-01-01 00:24:47| 450| W 49 St & 8 Ave|
40.76227205| -73.98788205| 505| 6 Ave & W 33 St| 40.74901271
| -73.98848395| 15157|Subscriber| 1976| 1|
| 704|2014-01-01 00:12:25|2014-01-01 00:24:09| 331| Pike St & Monroe St|
40.71173107| -73.99193043| 195|Liberty St & Broa...| 40.70905623
| -74.01043382| 17080|Subscriber| 1980| 2|
| 1367|2014-01-01 00:12:47|2014-01-01 00:35:34| 519| Pershing Square N|
40.75188406| -73.97770164| 386|Centre St & Worth St| 40.71494807
| -74.00234482| 20731| Customer| \N| 0|
| 327|2014-01-01 00:13:11|2014-01-01 00:18:38| 502| Henry St & Grand St|
40.714215| -73.981346| 411| E 6 St & Avenue D| 40.72228087|
-73.97668709| 15655|Subscriber| 1973| 2|
| 223|2014-01-01 00:15:30|2014-01-01 00:19:13| 528| 2 Ave & E 31 St|
40.74290902| -73.97706058| 518| E 39 St & 2 Ave| 40.74780373
| -73.9734419| 15737|Subscriber| 1982| 1|
| 577|2014-01-01 00:16:04|2014-01-01 00:25:41| 467| Dean St & 4 Ave|
40.68312489| -73.97895137| 270|Adelphi St & Myrt...| 40.69308257
| -73.97178913| 16115|Subscriber| 1983| 1|
| 566|2014-01-01 00:16:13|2014-01-01 00:25:39| 467| Dean St & 4 Ave|
40.68312489| -73.97895137| 270|Adelphi St & Myrt...| 40.69308257
| -73.97178913| 15753|Subscriber| 1983| 2|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
only showing top 20 rows

Displaying first few rows of 201402_citibike_tripdata_1.csv:


+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
|tripduration| starttime| stoptime|start station id| start station name|
start station latitude|start station longitude|end station id| end station name|end stati
on latitude|end station longitude|bikeid| usertype|birth year|gender|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
| 382|2014-02-01 00:00:00|2014-02-01 00:06:22| 294| Washington Square E|
40.73049393| -73.9957214| 265|Stanton St & Chry...| 40.72229346
| -73.99147535| 21101|Subscriber| 1991| 1|
| 372|2014-02-01 00:00:03|2014-02-01 00:06:15| 285| Broadway & E 14 St|
40.73454567| -73.99074142| 439| E 4 St & 2 Ave| 40.7262807
| -73.98978041| 15456|Subscriber| 1979| 2|
| 591|2014-02-01 00:00:09|2014-02-01 00:10:00| 247|Perry St & Bleeck...|
40.73535398| -74.00483091| 251| Mott St & Prince St| 40.72317958
| -73.99480012| 16281|Subscriber| 1948| 2|
| 583|2014-02-01 00:00:32|2014-02-01 00:10:15| 357| E 11 St & Broadway|
40.73261787| -73.99158043| 284|Greenwich Ave & 8...| 40.7390169121
| -74.0026376103| 17400|Subscriber| 1981| 1|
| 223|2014-02-01 00:00:41|2014-02-01 00:04:24| 401|Allen St & Riving...|
40.72019576| -73.98997825| 439| E 4 St & 2 Ave| 40.7262807
| -73.98978041| 19341|Subscriber| 1990| 1|
| 541|2014-02-01 00:00:46|2014-02-01 00:09:47| 152|Warren St & Churc...|
40.71473993| -74.00910627| 331| Pike St & Monroe St| 40.71173107
| -73.99193043| 18674|Subscriber| 1990| 1|
| 354|2014-02-01 00:01:01|2014-02-01 00:06:55| 325| E 19 St & 3 Ave|
40.73624527| -73.98473765| 439| E 4 St & 2 Ave| 40.7262807
| -73.98978041| 16975|Subscriber| 1991| 1|
| 916|2014-02-01 00:01:11|2014-02-01 00:16:27| 354|Emerson Pl & Myrt...|
40.69363137| -73.96223558| 395|Bond St & Scherme...| 40.68807003
| -73.98410637| 16020|Subscriber| 1978| 1|
| 277|2014-02-01 00:01:33|2014-02-01 00:06:10| 375|Mercer St & Bleec...|
40.72679454| -73.99695094| 369|Washington Pl & 6...| 40.73224119
| -74.00026394| 18891|Subscriber| 1944| 1|
| 439|2014-02-01 00:02:14|2014-02-01 00:09:33| 285| Broadway & E 14 St|
40.73454567| -73.99074142| 247|Perry St & Bleeck...| 40.73535398
| -74.00483091| 20875|Subscriber| 1983| 2|
| 959|2014-02-01 00:02:17|2014-02-01 00:18:16| 518| E 39 St & 2 Ave|
40.74780373| -73.9734419| 439| E 4 St & 2 Ave| 40.7262807
| -73.98978041| 15263|Subscriber| 1969| 1|
| 359|2014-02-01 00:02:35|2014-02-01 00:08:34| 501| FDR Drive & E 35 St|
40.744219| -73.97121214| 487| E 20 St & FDR Drive| 40.73314259|
-73.97573881| 19377|Subscriber| 1986| 1|
| 1040|2014-02-01 00:02:37|2014-02-01 00:19:57| 388| W 26 St & 10 Ave|
40.749717753| -74.002950346| 336|Sullivan St & Was...| 40.7304774
7| -73.99906065| 17271|Subscriber| 1981| 1|
| 477|2014-02-01 00:02:42|2014-02-01 00:10:39| 518| E 39 St & 2 Ave|
40.74780373| -73.9734419| 528| 2 Ave & E 31 St| 40.74290902
| -73.97706058| 19366|Subscriber| 1990| 1|
| 707|2014-02-01 00:02:50|2014-02-01 00:14:37| 257|Lispenard St & Br...|
40.71939226| -74.00247214| 345| W 13 St & 6 Ave| 40.73649403
| -73.99704374| 17757|Subscriber| 1962| 1|
| 343|2014-02-01 00:03:16|2014-02-01 00:08:59| 477| W 41 St & 8 Ave|
40.75640548| -73.9900262| 493| W 45 St & 6 Ave| 40.7568001
| -73.98291153| 19734|Subscriber| 1965| 1|
| 813|2014-02-01 00:03:18|2014-02-01 00:16:51| 317| E 6 St & Avenue B|
40.72453734| -73.98185424| 223| W 13 St & 7 Ave| 40.73781509
| -73.99994661| 18003|Subscriber| 1942| 1|
| 1491|2014-02-01 00:03:18|2014-02-01 00:28:09| 527| E 33 St & 1 Ave|
40.74315566| -73.97434726| 412|Forsyth St & Cana...| 40.7158155
| -73.99422366| 17630|Subscriber| 1986| 1|
| 292|2014-02-01 00:04:02|2014-02-01 00:08:54| 504| 1 Ave & E 15 St|
40.73221853| -73.98165557| 487| E 20 St & FDR Drive| 40.73314259
| -73.97573881| 16115|Subscriber| 1989| 2|
| 259|2014-02-01 00:05:21|2014-02-01 00:09:40| 316|Fulton St & Willi...|
40.70955958| -74.00653609| 415|Pearl St & Hanove...| 40.7047177
| -74.00926027| 20152|Subscriber| 1980| 2|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
only showing top 20 rows

Displaying first few rows of 201403_citibike_tripdata_1.csv:


+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
|tripduration| starttime| stoptime|start station id| start station name|
start station latitude|start station longitude|end station id| end station name|end stati
on latitude|end station longitude|bikeid| usertype|birth year|gender|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
| 949|2014-03-01 00:00:16|2014-03-01 00:16:05| 317| E 6 St & Avenue B|
40.72453734| -73.98185424| 284|Greenwich Ave & 8...| 40.7390169121
| -74.0026376103| 17440|Subscriber| 1942| 1|
| 533|2014-03-01 00:00:57|2014-03-01 00:09:50| 457| Broadway & W 58 St|
40.76695317| -73.98169333| 441| E 52 St & 2 Ave| 40.756014
| -73.967416| 20855|Subscriber| 1960| 1|
| 122|2014-03-01 00:01:06|2014-03-01 00:03:08| 146|Hudson St & Reade St|
40.71625008| -74.0091059| 276|Duane St & Greenw...| 40.71748752
| -74.0104554| 15822|Subscriber| 1984| 1|
| 134|2014-03-01 00:01:14|2014-03-01 00:03:28| 146|Hudson St & Reade St|
40.71625008| -74.0091059| 276|Duane St & Greenw...| 40.71748752
| -74.0104554| 17793|Subscriber| 1985| 1|
| 997|2014-03-01 00:01:18|2014-03-01 00:17:55| 150| E 2 St & Avenue C|
40.7208736| -73.98085795| 461| E 20 St & 2 Ave| 40.73587678|
-73.98205027| 20756|Subscriber| 1977| 1|
| 720|2014-03-01 00:01:27|2014-03-01 00:13:27| 382|University Pl & E...|
40.73492695| -73.99200509| 79|Franklin St & W B...| 40.71911552
| -74.00666661| 19377|Subscriber| 1983| 1|
| 231|2014-03-01 00:02:08|2014-03-01 00:05:59| 384|Fulton St & Waver...|
40.68317813| -73.9659641| 399|Lafayette Ave & S...| 40.68851534
| -73.9647628| 20117|Subscriber| 1982| 1|
| 387|2014-03-01 00:02:24|2014-03-01 00:08:51| 521| 8 Ave & W 31 St|
40.75044999| -73.99481051| 529| W 42 St & 8 Ave| 40.7575699
| -73.99098507| 18856|Subscriber| 1975| 2|
| 115|2014-03-01 00:02:28|2014-03-01 00:04:23| 438| St Marks Pl & 1 Ave|
40.72779126| -73.98564945| 438| St Marks Pl & 1 Ave| 40.72779126
| -73.98564945| 20922|Subscriber| 1994| 2|
| 656|2014-03-01 00:02:49|2014-03-01 00:13:45| 284|Greenwich Ave & 8...|
40.7390169121| -74.0026376103| 504| 1 Ave & E 15 St| 40.732218
53| -73.98165557| 14889|Subscriber| 1987| 1|
| 156|2014-03-01 00:03:05|2014-03-01 00:05:41| 337| Old Slip & Front St|
40.7037992| -74.00838676| 351|Front St & Maiden Ln| 40.70530954|
-74.00612572| 18393|Subscriber| 1948| 1|
| 336|2014-03-01 00:03:23|2014-03-01 00:08:59| 462| W 22 St & 10 Ave|
40.74691959| -74.00451887| 284|Greenwich Ave & 8...| 40.7390169121
| -74.0026376103| 20516|Subscriber| 1986| 1|
| 294|2014-03-01 00:03:33|2014-03-01 00:08:27| 396|Lefferts Pl & Fra...|
40.680342423| -73.9557689392| 364|Lafayette Ave & C...| 40.6890044
3| -73.96023854| 18194|Subscriber| 1980| 1|
| 133|2014-03-01 00:03:41|2014-03-01 00:05:54| 337| Old Slip & Front St|
40.7037992| -74.00838676| 351|Front St & Maiden Ln| 40.70530954|
-74.00612572| 17257|Subscriber| 1977| 1|
| 454|2014-03-01 00:03:45|2014-03-01 00:11:19| 386|Centre St & Worth St|
40.71494807| -74.00234482| 340|Madison St & Clin...| 40.71269042
| -73.98776323| 19943|Subscriber| 1977| 1|
| 336|2014-03-01 00:04:25|2014-03-01 00:10:01| 531|Forsyth St & Broo...|
40.71893904| -73.99266288| 502| Henry St & Grand St| 40.714215
| -73.981346| 21136|Subscriber| 1960| 2|
| 286|2014-03-01 00:04:48|2014-03-01 00:09:34| 399|Lafayette Ave & S...|
40.68851534| -73.9647628| 365|Fulton St & Grand...| 40.68223166
| -73.9614583| 16573|Subscriber| 1964| 2|
| 314|2014-03-01 00:05:17|2014-03-01 00:10:31| 383|Greenwich Ave & C...|
40.735238| -74.000271| 463| 9 Ave & W 16 St| 40.74206539|
-74.00443172| 21431|Subscriber| 1989| 1|
| 223|2014-03-01 00:06:06|2014-03-01 00:09:49| 438| St Marks Pl & 1 Ave|
40.72779126| -73.98564945| 511| E 14 St & Avenue B| 40.72938685
| -73.97772429| 16137|Subscriber| 1991| 1|
| 262|2014-03-01 00:06:29|2014-03-01 00:10:51| 300|Shevchenko Pl & E...|
40.728145| -73.990214| 285| Broadway & E 14 St| 40.73454567|
-73.99074142| 19486|Subscriber| 1992| 1|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
only showing top 20 rows

Displaying first few rows of 201404_citibike_tripdata_1.csv:


+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
|tripduration| starttime| stoptime|start station id| start station name|
start station latitude|start station longitude|end station id| end station name|end stati
on latitude|end station longitude|bikeid| usertype|birth year|gender|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
| 558|2014-04-01 00:00:07|2014-04-01 00:09:25| 82|St James Pl & Pea...|
40.71117416| -74.00016545| 2008|Little West St & ...| 40.70569254
| -74.01677685| 21062|Subscriber| 1982| 1|
| 882|2014-04-01 00:00:20|2014-04-01 00:15:02| 349|Rivington St & Ri...|
40.71850211| -73.98329859| 312|Allen St & E Hous...| 40.722055
| -73.989111| 20229|Subscriber| 1988| 1|
| 587|2014-04-01 00:00:25|2014-04-01 00:10:12| 293|Lafayette St & E ...|
40.73028666| -73.9907647| 334| W 20 St & 7 Ave| 40.74238787
| -73.99726235| 20922|Subscriber| 1959| 1|
| 355|2014-04-01 00:00:44|2014-04-01 00:06:39| 539|Metropolitan Ave ...|
40.71534825| -73.96024116| 282| Kent Ave & S 11 St| 40.70827295
| -73.96834101| 20914|Subscriber| 1981| 1|
| 524|2014-04-01 00:01:29|2014-04-01 00:10:13| 459| W 20 St & 11 Ave|
40.746745| -74.007756| 503| E 20 St & Park Ave| 40.73827428|
-73.98751968| 21051|Subscriber| 1964| 1|
| 301|2014-04-01 00:01:53|2014-04-01 00:06:54| 281|Grand Army Plaza ...|
40.7643971| -73.97371465| 500| Broadway & W 51 St| 40.76228826|
-73.98336183| 17286|Subscriber| 1970| 1|
| 136|2014-04-01 00:02:34|2014-04-01 00:04:50| 386|Centre St & Worth St|
40.71494807| -74.00234482| 387|Centre St & Chamb...| 40.71273266
| -74.0046073| 21429|Subscriber| 1983| 2|
| 151|2014-04-01 00:02:40|2014-04-01 00:05:11| 223| W 13 St & 7 Ave|
40.73781509| -73.99994661| 405|Washington St & G...| 40.739323
| -74.008119| 15572|Subscriber| 1992| 1|
| 434|2014-04-01 00:02:58|2014-04-01 00:10:12| 324|DeKalb Ave & Huds...|
40.689888| -73.981013| 366|Clinton Ave & Myr...| 40.693261|
-73.968896| 17582|Subscriber| 1992| 2|
| 164|2014-04-01 00:02:59|2014-04-01 00:05:43| 539|Metropolitan Ave ...|
40.71534825| -73.96024116| 460| S 4 St & Wythe Ave| 40.71285887
| -73.96590294| 16010|Subscriber| 1983| 1|
| 326|2014-04-01 00:03:03|2014-04-01 00:08:29| 405|Washington St & G...|
40.739323| -74.008119| 254| W 11 St & 6 Ave| 40.73532427|
-73.99800419| 19105|Subscriber| 1981| 2|
| 263|2014-04-01 00:03:55|2014-04-01 00:08:18| 509| 9 Ave & W 22 St|
40.7454973| -74.00197139| 346| Bank St & Hudson St| 40.73652889|
-74.00618026| 21516|Subscriber| 1980| 1|
| 1153|2014-04-01 00:04:49|2014-04-01 00:24:02| 489| 10 Ave & W 28 St|
40.75066386| -74.00176802| 281|Grand Army Plaza ...| 40.7643971
| -73.97371465| 17816|Subscriber| 1964| 1|
| 709|2014-04-01 00:05:07|2014-04-01 00:16:56| 312|Allen St & E Hous...|
40.722055| -73.989111| 476| E 31 St & 3 Ave| 40.74394314|
-73.97966069| 16366|Subscriber| 1988| 1|
| 324|2014-04-01 00:05:44|2014-04-01 00:11:08| 435| W 21 St & 6 Ave|
40.74173969| -73.99415556| 494| W 26 St & 8 Ave| 40.74734825
| -73.99723551| 20096|Subscriber| 1967| 1|
| 562|2014-04-01 00:05:44|2014-04-01 00:15:06| 536| 1 Ave & E 30 St|
40.74144387| -73.97536082| 512| W 29 St & 9 Ave| 40.7500727
| -73.99839279| 20006|Subscriber| 1979| 1|
| 221|2014-04-01 00:05:45|2014-04-01 00:09:26| 335|Washington Pl & B...|
40.72903917| -73.99404649| 382|University Pl & E...| 40.73492695
| -73.99200509| 16582|Subscriber| 1985| 1|
| 198|2014-04-01 00:05:46|2014-04-01 00:09:04| 492| W 33 St & 7 Ave|
40.75019995| -73.99093085| 526| E 33 St & 5 Ave| 40.74765947
| -73.98490707| 18060|Subscriber| 1971| 1|
| 157|2014-04-01 00:06:00|2014-04-01 00:08:37| 387|Centre St & Chamb...|
40.71273266| -74.0046073| 306|Cliff St & Fulton St| 40.70823502
| -74.00530063| 20359|Subscriber| 1986| 1|
| 744|2014-04-01 00:06:35|2014-04-01 00:18:59| 300|Shevchenko Pl & E...|
40.728145| -73.990214| 438| St Marks Pl & 1 Ave| 40.72779126|
-73.98564945| 19465|Subscriber| 1988| 1|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
only showing top 20 rows

Displaying first few rows of 201405_citibike_tripdata_1.csv:


+------------+-------------------+-------------------+----------------+--------------------+
-------------------

*** WARNING: max output size exceeded, skipping output. ***

-74.015756| 337| Old Slip & Front St| 40.7037992| -74.0083867
6| 14703|Subscriber| 1981| 1|
+------------+-------------------+-------------------+----------------+--------------------+
----------------------+-----------------------+--------------+--------------------+---------
-----------+---------------------+------+----------+----------+------+
only showing top 20 rows

Displaying first few rows of 201409_citibike_tripdata_1.csv:


+------------+-----------------+-----------------+----------------+--------------------+----
------------------+-----------------------+--------------+--------------------+-------------
-------+---------------------+------+----------+----------+------+
|tripduration| starttime| stoptime|start station id| start station name|star
t station latitude|start station longitude|end station id| end station name|end station l
atitude|end station longitude|bikeid| usertype|birth year|gender|
+------------+-----------------+-----------------+----------------+--------------------+----
------------------+-----------------------+--------------+--------------------+-------------
-------+---------------------+------+----------+----------+------+
| 2828|9/1/2014 00:00:25|9/1/2014 00:47:33| 386|Centre St & Worth St|
40.71494807| -74.00234482| 450| W 49 St & 8 Ave| 40.76227205
| -73.98788205| 15941|Subscriber| 1980.0| 1|
| 368|9/1/2014 00:00:28|9/1/2014 00:06:36| 387|Centre St & Chamb...|
40.71273266| -74.0046073| 2008|Little West St & ...| 40.70569254
| -74.01677685| 18962|Subscriber| 1982.0| 1|
| 2201|9/1/2014 00:00:40|9/1/2014 00:37:21| 386|Centre St & Worth St|
40.71494807| -74.00234482| 441| E 52 St & 2 Ave| 40.756014
| -73.967416| 15982|Subscriber| 1968.0| 1|
| 322|9/1/2014 00:00:41|9/1/2014 00:06:03| 167| E 39 St & 3 Ave|
40.7489006| -73.97604882| 528| 2 Ave & E 31 St| 40.74290902|
-73.97706058| 19081|Subscriber| 1961.0| 1|
| 1693|9/1/2014 00:00:59|9/1/2014 00:29:12| 223| W 13 St & 7 Ave|
40.73781509| -73.99994661| 83|Atlantic Ave & Fo...| 40.68382604
| -73.97632328| 20836|Subscriber| 1978.0| 1|
| 438|9/1/2014 00:01:18|9/1/2014 00:08:36| 474| 5 Ave & E 29 St|
40.7451677| -73.98683077| 501| FDR Drive & E 35 St| 40.744219|
-73.97121214| 18089|Subscriber| 1985.0| 1|
| 860|9/1/2014 00:01:36|9/1/2014 00:15:56| 386|Centre St & Worth St|
40.71494807| -74.00234482| 2000|Front St & Washin...| 40.70255088
| -73.98940236| 17160|Subscriber| 1990.0| 1|
| 675|9/1/2014 00:01:54|9/1/2014 00:13:09| 151|Cleveland Pl & Sp...|
40.7218158| -73.99720307| 152|Warren St & Churc...| 40.71473993|
-74.00910627| 14539| Customer| NULL| 0|
| 560|9/1/2014 00:01:55|9/1/2014 00:11:15| 386|Centre St & Worth St|
40.71494807| -74.00234482| 2009|Catherine St & Mo...| 40.71117444
| -73.99682619| 20113|Subscriber| 1991.0| 1|
| 2286|9/1/2014 00:02:04|9/1/2014 00:40:10| 479| 9 Ave & W 45 St|
40.76019252| -73.9912551| 143|Clinton St & Jora...| 40.69239502
| -73.99337909| 19796|Subscriber| 1975.0| 1|
| 1153|9/1/2014 00:02:09|9/1/2014 00:21:22| 512| W 29 St & 9 Ave|
40.7500727| -73.99839279| 428| E 3 St & 1 Ave| 40.72467721|
-73.98783413| 17508|Subscriber| 1964.0| 1|
| 709|9/1/2014 00:02:33|9/1/2014 00:14:22| 536| 1 Ave & E 30 St|
40.74144387| -73.97536082| 466| W 25 St & 6 Ave| 40.74395411
| -73.99144871| 16028|Subscriber| 1979.0| 1|
| 248|9/1/2014 00:02:38|9/1/2014 00:06:46| 293|Lafayette St & E ...|
40.73028666| -73.9907647| 439| E 4 St & 2 Ave| 40.7262807
| -73.98978041| 21269|Subscriber| 1982.0| 2|
| 625|9/1/2014 00:02:39|9/1/2014 00:13:04| 151|Cleveland Pl & Sp...|
40.7218158| -73.99720307| 152|Warren St & Churc...| 40.71473993|
-74.00910627| 21191|Subscriber| 1983.0| 2|
| 306|9/1/2014 00:02:45|9/1/2014 00:07:51| 297| E 15 St & 3 Ave|
40.734232| -73.986923| 300|Shevchenko Pl & E...| 40.728145|
-73.990214| 18386| Customer| NULL| 0|
| 228|9/1/2014 00:03:09|9/1/2014 00:06:57| 485| W 37 St & 5 Ave|
40.75038009| -73.98338988| 160|E 37 St & Lexingt...| 40.748238
| -73.978311| 17170|Subscriber| 1975.0| 1|
| 448|9/1/2014 00:03:22|9/1/2014 00:10:50| 330| Reade St & Broadway|
40.71450451| -74.00562789| 307|Canal St & Rutger...| 40.71427487
| -73.98990025| 15411|Subscriber| 1956.0| 1|
| 459|9/1/2014 00:03:22|9/1/2014 00:11:01| 265|Stanton St & Chry...|
40.72229346| -73.99147535| 331| Pike St & Monroe St| 40.71173107
| -73.99193043| 16421|Subscriber| 1988.0| 1|
| 290|9/1/2014 00:03:28|9/1/2014 00:08:18| 380| W 4 St & 7 Ave S|
40.73401143| -74.00293877| 284|Greenwich Ave & 8...| 40.7390169121
| -74.0026376103| 20727|Subscriber| 1984.0| 1|
| 252|9/1/2014 00:03:38|9/1/2014 00:07:50| 151|Cleveland Pl & Sp...|
40.7218158| -73.99720307| 375|Mercer St & Bleec...| 40.72679454|
-73.99695094| 14930|Subscriber| 1969.0| 1|
+------------+-----------------+-----------------+----------------+--------------------+----
------------------+-----------------------+--------------+--------------------+-------------
-------+---------------------+------+----------+----------+------+
only showing top 20 rows
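
The outputs above show that the monthly files are not uniform. The January to April files print starttime as `2014-01-01 00:00:06`, while the September file prints `9/1/2014 00:00:25`. Missing birth years appear as `\N` in January and as `NULL` in September, where the known years also carry a decimal part (`1980.0`). A minimal plain-Python sketch of normalizing both columns before the files are combined follows. The helper names are hypothetical, and the month/day/year reading of the second layout is an assumption based on the file names.

```python
from datetime import datetime

# The two timestamp layouts seen in the outputs (second one assumed month/day/year).
TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S")
# Textual markers for a missing birth year seen in the outputs.
NULL_MARKERS = {"", "\\N", "NULL"}


def parse_start_time(text):
    """Return a datetime for either layout, or None if neither matches."""
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def parse_birth_year(text):
    """Return the birth year as an int, or None for a missing-value marker."""
    if text is None or text.strip() in NULL_MARKERS:
        return None
    return int(float(text))  # handles both "1986" and "1980.0"


print(parse_start_time("2014-01-01 00:00:06"))
print(parse_start_time("9/1/2014 00:00:25"))
print(parse_birth_year("\\N"), parse_birth_year("1980.0"))
```

In Spark, the analogous step would usually try each format with `to_timestamp` and combine the results with `coalesce` from `pyspark.sql.functions`.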

Displaying first few rows of 201410_citibike_tripdata_1.csv:


+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
|tripduration| starttime| stoptime|start station id| start station name|st
art station latitude|start station longitude|end station id| end station name|end station
latitude|end station longitude|bikeid| usertype|birth year|gender|
+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
| 1027|10/1/2014 00:00:27|10/1/2014 00:17:34| 479| 9 Ave & W 45 St|
40.76019252| -73.9912551| 540|Lexington Ave & E...| 40.74147286
| -73.98320928| 21376|Subscriber| 1977.0| 1|
| 534|10/1/2014 00:00:36|10/1/2014 00:09:30| 417|Barclay St & Chur...|
40.71291224| -74.01020234| 417|Barclay St & Chur...| 40.71291224
| -74.01020234| 16086|Subscriber| 1974.0| 2|
| 416|10/1/2014 00:00:42|10/1/2014 00:07:38| 327|Vesey Pl & River ...|
40.7153379| -74.01658354| 415|Pearl St & Hanove...| 40.7047177|
-74.00926027| 16073|Subscriber| 1990.0| 1|
| 428|10/1/2014 00:00:50|10/1/2014 00:07:58| 515| W 43 St & 10 Ave|
40.76009437| -73.99461843| 447| 8 Ave & W 52 St| 40.76370739
| -73.9851615| 18635|Subscriber| 1966.0| 1|
| 281|10/1/2014 00:01:08|10/1/2014 00:05:49| 497| E 17 St & Broadway|
40.73704984| -73.99009296| 537|Lexington Ave & E...| 40.74025878
| -73.98409214| 20203|Subscriber| 1979.0| 1|
| 656|10/1/2014 00:01:29|10/1/2014 00:12:25| 509| 9 Ave & W 22 St|
40.7454973| -74.00197139| 382|University Pl & E...| 40.73492695|
-73.99200509| 15334|Subscriber| 1985.0| 1|
| 429|10/1/2014 00:03:12|10/1/2014 00:10:21| 504| 1 Ave & E 15 St|
40.73221853| -73.98165557| 536| 1 Ave & E 30 St| 40.74144387
| -73.97536082| 20836|Subscriber| 1969.0| 1|
| 840|10/1/2014 00:03:42|10/1/2014 00:17:42| 347|W Houston St & Hu...|
40.72873888| -74.00748842| 432| E 7 St & Avenue A| 40.72621788
| -73.98379855| 17344|Subscriber| 1984.0| 2|
| 883|10/1/2014 00:03:45|10/1/2014 00:18:28| 268|Howard St & Centr...|
40.71910537| -73.99973337| 119|Park Ave & St Edw...| 40.69608941
| -73.97803415| 19209|Subscriber| 1980.0| 1|
| 2470|10/1/2014 00:04:42|10/1/2014 00:45:52| 503| E 20 St & Park Ave|
40.73827428| -73.98751968| 507| E 25 St & 2 Ave| 40.73912601
| -73.97973776| 19950|Subscriber| 1968.0| 1|
| 318|10/1/2014 00:04:47|10/1/2014 00:10:05| 545| E 23 St & 1 Ave|
40.736502| -73.97809472| 325| E 19 St & 3 Ave| 40.73624527|
-73.98473765| 19967|Subscriber| 1954.0| 2|
| 474|10/1/2014 00:05:01|10/1/2014 00:12:55| 537|Lexington Ave & E...|
40.74025878| -73.98409214| 236| St Marks Pl & 2 Ave| 40.7284186
| -73.98713956| 17860|Subscriber| 1957.0| 1|
| 765|10/1/2014 00:05:19|10/1/2014 00:18:04| 448| W 37 St & 10 Ave|
40.75660359| -73.9979009| 345| W 13 St & 6 Ave| 40.73649403
| -73.99704374| 16936|Subscriber| 1986.0| 1|
| 957|10/1/2014 00:05:43|10/1/2014 00:21:40| 537|Lexington Ave & E...|
40.74025878| -73.98409214| 412|Forsyth St & Cana...| 40.7158155
| -73.99422366| 18273|Subscriber| 1982.0| 1|
| 480|10/1/2014 00:06:44|10/1/2014 00:14:44| 434| 9 Ave & W 18 St|
40.74317449| -74.00366443| 483| E 12 St & 3 Ave| 40.73223272
| -73.98889957| 18458|Subscriber| 1987.0| 1|
| 801|10/1/2014 00:06:50|10/1/2014 00:20:11| 441| E 52 St & 2 Ave|
40.756014| -73.967416| 477| W 41 St & 8 Ave| 40.75640548|
-73.9900262| 19432|Subscriber| 1987.0| 1|
| 811|10/1/2014 00:07:15|10/1/2014 00:20:46| 493| W 45 St & 6 Ave|
40.7568001| -73.98291153| 546|E 30 St & Park Ave S| 40.74444921|
-73.98303529| 17497|Subscriber| 1944.0| 1|
| 452|10/1/2014 00:07:19|10/1/2014 00:14:51| 358|Christopher St & ...|
40.73291553| -74.00711384| 383|Greenwich Ave & C...| 40.735238
| -74.000271| 18137|Subscriber| 1979.0| 1|
| 240|10/1/2014 00:07:19|10/1/2014 00:11:19| 358|Christopher St & ...|
40.73291553| -74.00711384| 284|Greenwich Ave & 8...| 40.7390169121
| -74.0026376103| 19541|Subscriber| 1980.0| 1|
| 1278|10/1/2014 00:07:22|10/1/2014 00:28:40| 419|Carlton Ave & Par...|
40.69580705| -73.97355569| 324|DeKalb Ave & Huds...| 40.689888
| -73.981013| 17432|Subscriber| 1988.0| 2|
+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
only showing top 20 rows

Displaying first few rows of 201411_citibike_tripdata_1.csv:


+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
|tripduration| starttime| stoptime|start station id| start station name|st
art station latitude|start station longitude|end station id| end station name|end station
latitude|end station longitude|bikeid| usertype|birth year|gender|
+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
| 97|11/1/2014 00:00:11|11/1/2014 00:01:48| 344|Monroe St & Bedfo...|
40.6851443| -73.95380904| 344|Monroe St & Bedfo...| 40.6851443|
-73.95380904| 15742|Subscriber| 1983.0| 1|
| 604|11/1/2014 00:00:11|11/1/2014 00:10:15| 477| W 41 St & 8 Ave|
40.75640548| -73.9900262| 480| W 53 St & 10 Ave| 40.76669671
| -73.99061728| 17274|Subscriber| 1980.0| 1|
| 629|11/1/2014 00:00:38|11/1/2014 00:11:07| 546|E 30 St & Park Ave S|
40.74444921| -73.98303529| 284|Greenwich Ave & 8...| 40.7390169121
| -74.0026376103| 19628|Subscriber| 1981.0| 1|
| 939|11/1/2014 00:00:49|11/1/2014 00:16:28| 284|Greenwich Ave & 8...|
40.7390169121| -74.0026376103| 511| E 14 St & Avenue B| 40.729386
85| -73.97772429| 15664|Subscriber| 1982.0| 1|
| 825|11/1/2014 00:00:56|11/1/2014 00:14:41| 280| E 10 St & 5 Ave|
40.73331967| -73.99510132| 483| E 12 St & 3 Ave| 40.73223272
| -73.98889957| 20704|Subscriber| 1979.0| 1|
| 727|11/1/2014 00:01:16|11/1/2014 00:13:23| 483| E 12 St & 3 Ave|
40.73223272| -73.98889957| 410|Suffolk St & Stan...| 40.72066442
| -73.98517977| 14806|Subscriber| 1961.0| 2|
| 527|11/1/2014 00:01:25|11/1/2014 00:10:12| 483| E 12 St & 3 Ave|
40.73223272| -73.98889957| 223| W 13 St & 7 Ave| 40.73781509
| -73.99994661| 16116|Subscriber| 1984.0| 1|
| 243|11/1/2014 00:03:27|11/1/2014 00:07:30| 259|South St & Whiteh...|
40.70122128| -74.01234218| 264|Maiden Ln & Pearl St| 40.70706456
| -74.00731853| 15128|Subscriber| 1972.0| 2|
| 377|11/1/2014 00:03:50|11/1/2014 00:10:07| 403| E 2 St & 2 Ave|
40.72502876| -73.99069656| 236| St Marks Pl & 2 Ave| 40.7284186
| -73.98713956| 21638|Subscriber| 1990.0| 1|
| 1338|11/1/2014 00:04:43|11/1/2014 00:27:01| 507| E 25 St & 2 Ave|
40.73912601| -73.97973776| 360|William St & Pine St| 40.70717936
| -74.00887308| 18394|Subscriber| 1988.0| 1|
| 698|11/1/2014 00:04:47|11/1/2014 00:16:25| 486| Broadway & W 29 St|
40.7462009| -73.98855723| 284|Greenwich Ave & 8...| 40.7390169121|
-74.0026376103| 17723|Subscriber| 1983.0| 1|
| 703|11/1/2014 00:04:49|11/1/2014 00:16:32| 486| Broadway & W 29 St|
40.7462009| -73.98855723| 284|Greenwich Ave & 8...| 40.7390169121|
-74.0026376103| 15591|Subscriber| 1987.0| 2|
| 559|11/1/2014 00:05:05|11/1/2014 00:14:24| 116| W 17 St & 8 Ave|
40.74177603| -74.00149746| 297| E 15 St & 3 Ave| 40.734232
| -73.986923| 18601|Subscriber| 1989.0| 1|
| 745|11/1/2014 00:05:07|11/1/2014 00:17:32| 336|Sullivan St & Was...|
40.73047747| -73.99906065| 435| W 21 St & 6 Ave| 40.74173969
| -73.99415556| 17750|Subscriber| 1976.0| 1|
| 64|11/1/2014 00:05:15|11/1/2014 00:06:19| 531|Forsyth St & Broo...|
40.71893904| -73.99266288| 531|Forsyth St & Broo...| 40.71893904
| -73.99266288| 18126|Subscriber| 1989.0| 2|
| 664|11/1/2014 00:05:32|11/1/2014 00:16:36| 250|Lafayette St & Je...|
40.72456089| -73.99565293| 433| E 13 St & Avenue A| 40.72955361
| -73.98057249| 14822|Subscriber| 1981.0| 1|
| 939|11/1/2014 00:05:45|11/1/2014 00:21:24| 531|Forsyth St & Broo...|
40.71893904| -73.99266288| 195|Liberty St & Broa...| 40.70905623
| -74.01043382| 15264|Subscriber| 1995.0| 1|
| 516|11/1/2014 00:07:42|11/1/2014 00:16:18| 394| E 9 St & Avenue C|
40.72521311| -73.97768752| 331| Pike St & Monroe St| 40.71173107
| -73.99193043| 19279|Subscriber| 1990.0| 1|
| 728|11/1/2014 00:08:01|11/1/2014 00:20:09| 476| E 31 St & 3 Ave|
40.74394314| -73.97966069| 454| E 51 St & 1 Ave| 40.75455731
| -73.96592976| 18270|Subscriber| 1989.0| 2|
| 343|11/1/2014 00:08:17|11/1/2014 00:14:00| 236| St Marks Pl & 2 Ave|
40.7284186| -73.98713956| 250|Lafayette St & Je...| 40.72456089|
-73.99565293| 17027|Subscriber| 1977.0| 1|
+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
only showing top 20 rows

Displaying first few rows of 201412_citibike_tripdata_1.csv:


+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
|tripduration| starttime| stoptime|start station id| start station name|st
art station latitude|start station longitude|end station id| end station name|end station
latitude|end station longitude|bikeid| usertype|birth year|gender|
+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
| 1257|12/1/2014 00:00:28|12/1/2014 00:21:25| 475| E 16 St & Irving Pl|
40.73524276| -73.98758561| 521| 8 Ave & W 31 St| 40.75044999
| -73.99481051| 16047| Customer| NULL| 0|
| 275|12/1/2014 00:00:43|12/1/2014 00:05:18| 498| Broadway & W 32 St|
40.74854862| -73.98808416| 546|E 30 St & Park Ave S| 40.74444921
| -73.98303529| 18472|Subscriber| 1988.0| 2|
| 450|12/1/2014 00:01:22|12/1/2014 00:08:52| 444| Broadway & W 24 St|
40.7423543| -73.98915076| 434| 9 Ave & W 18 St| 40.74317449|
-74.00366443| 19589|Subscriber| 1983.0| 1|
| 1126|12/1/2014 00:02:17|12/1/2014 00:21:03| 475| E 16 St & Irving Pl|
40.73524276| -73.98758561| 521| 8 Ave & W 31 St| 40.75044999
| -73.99481051| 21142| Customer| NULL| 0|
| 331|12/1/2014 00:02:21|12/1/2014 00:07:52| 519|Pershing Square N...|
40.751873| -73.977706| 527| E 33 St & 2 Ave| 40.744023|
-73.976056| 18679|Subscriber| 1986.0| 2|
| 162|12/1/2014 00:02:37|12/1/2014 00:05:19| 229| Great Jones St|
40.72743423| -73.99379025| 336|Sullivan St & Was...| 40.73047747
| -73.99906065| 21668|Subscriber| 1973.0| 1|
| 155|12/1/2014 00:02:43|12/1/2014 00:05:18| 229| Great Jones St|
40.72743423| -73.99379025| 336|Sullivan St & Was...| 40.73047747
| -73.99906065| 16205|Subscriber| 1986.0| 1|
| 493|12/1/2014 00:03:12|12/1/2014 00:11:25| 305| E 58 St & 3 Ave|
40.76095756| -73.96724467| 160|E 37 St & Lexingt...| 40.748238
| -73.978311| 18234|Subscriber| 1990.0| 1|
| 521|12/1/2014 00:03:16|12/1/2014 00:11:57| 82|St James Pl & Pea...|
40.71117416| -74.00016545| 2008|Little West St & ...| 40.70569254
| -74.01677685| 17045|Subscriber| 1982.0| 1|
| 349|12/1/2014 00:03:37|12/1/2014 00:09:26| 470| W 20 St & 8 Ave|
40.74345335| -74.00004031| 491|E 24 St & Park Ave S| 40.74096374
| -73.98602213| 18805|Subscriber| 1983.0| 1|
| 8761|12/1/2014 00:04:48|12/1/2014 02:30:49| 295|Pike St & E Broadway|
40.71406667| -73.99293911| 412|Forsyth St & Cana...| 40.7158155
| -73.99422366| 21397| Customer| NULL| 0|
| 428|12/1/2014 00:05:27|12/1/2014 00:12:35| 465| Broadway & W 41 St|
40.75513557| -73.98658032| 447| 8 Ave & W 52 St| 40.76370739
| -73.9851615| 21249|Subscriber| 1983.0| 1|
| 1102|12/1/2014 00:05:33|12/1/2014 00:23:55| 369|Washington Pl & 6...|
40.73224119| -74.00026394| 352| W 56 St & 6 Ave| 40.76340613
| -73.97722479| 17785|Subscriber| 1981.0| 1|
| 336|12/1/2014 00:05:50|12/1/2014 00:11:26| 386|Centre St & Worth St|
40.71494807| -74.00234482| 147|Greenwich St & Wa...| 40.71542197
| -74.01121978| 21392|Subscriber| 1970.0| 1|
| 1051|12/1/2014 00:06:18|12/1/2014 00:23:49| 369|Washington Pl & 6...|
40.73224119| -74.00026394| 352| W 56 St & 6 Ave| 40.76340613
| -73.97722479| 16102|Subscriber| 1983.0| 1|
| 354|12/1/2014 00:06:23|12/1/2014 00:12:17| 487| E 20 St & FDR Drive|
40.73314259| -73.97573881| 317| E 6 St & Avenue B| 40.72453734
| -73.98185424| 21447|Subscriber| 1983.0| 1|
| 670|12/1/2014 00:07:04|12/1/2014 00:18:14| 536| 1 Ave & E 30 St|
40.74144387| -73.97536082| 512| W 29 St & 9 Ave| 40.7500727
| -73.99839279| 15871|Subscriber| 1979.0| 1|
| 420|12/1/2014 00:07:04|12/1/2014 00:14:04| 300|Shevchenko Pl & E...|
40.728145| -73.990214| 250|Lafayette St & Je...| 40.72456089|
-73.99565293| 16470|Subscriber| 1991.0| 2|
| 2284|12/1/2014 00:07:14|12/1/2014 00:45:18| 448| W 37 St & 10 Ave|
40.75660359| -73.9979009| 514| 12 Ave & W 40 St| 40.76087502
| -74.00277668| 17698|Subscriber| 1989.0| 2|
| 509|12/1/2014 00:07:26|12/1/2014 00:15:55| 237| E 11 St & 2 Ave|
40.73047309| -73.98672378| 302| Avenue D & E 3 St| 40.72082834
| -73.97793172| 15404|Subscriber| 1991.0| 1|
+------------+------------------+------------------+----------------+--------------------+--
--------------------+-----------------------+--------------+--------------------+-----------
---------+---------------------+------+----------+----------+------+
only showing top 20 rows

In [ ]: from pyspark.sql import functions as F

# Aggregate one monthly DataFrame of trips into daily rows
def aggregate_daily_data(df, file_name):
    # Derive a 'date' column from 'starttime'.
    # NOTE: to_date() without an explicit format cannot parse strings such as
    # '10/1/2014 00:04:47'; in Spark's default (non-ANSI) mode they become NULL.
    df = df.withColumn("date", F.to_date(df["starttime"]))

    # Select relevant columns only
    relevant_columns = [
        "tripduration",
        "starttime",
        "stoptime",
        "start station name",
        "end station name",
        "usertype",
        "birth year",
        "gender",
        "date"
    ]
    df = df.select(relevant_columns)

    # Keep riders born after 1940. Rows with a NULL 'birth year' (such as the
    # Customer rows shown above) fail this comparison and are dropped as well.
    df = df.filter(F.col("birth year") > 1940)

    # Aggregate by date: trip count, average duration, and counts by gender and user type
    daily_df = df.groupBy("date").agg(
        F.count("*").alias("trip_count"),  # Trips for the day
        F.avg("tripduration").alias("avg_tripduration"),  # Average trip duration
        F.count(F.when(F.col("gender") == 1, 1)).alias("male_count"),  # Male users
        F.count(F.when(F.col("gender") == 2, 1)).alias("female_count"),  # Female users
        F.count(F.when(F.col("usertype") == "Subscriber", 1)).alias("subscriber_count"),  # Subscribers
        F.count(F.when(F.col("usertype") == "Customer", 1)).alias("customer_count")  # Customers
    )

    return daily_df

# Accumulates the union of all daily data
final_aggregated_df = None

# Process each monthly DataFrame (df1, df2, ..., df12)
for idx in range(1, 13):
    df = globals()[f'df{idx}']
    # Zero-pad the month so that months 10 to 12 give 201410 to 201412
    aggregated_df = aggregate_daily_data(df, f"2014{idx:02d}_citibike_tripdata_1.csv")

    if final_aggregated_df is None:
        # First iteration: initialize the final DataFrame
        final_aggregated_df = aggregated_df
    else:
        # Union the current aggregated data with the final DataFrame
        final_aggregated_df = final_aggregated_df.union(aggregated_df)

# Show the combined result (this will include data from all 12 files)
final_aggregated_df.show()
+----------+----------+------------------+----------+------------+----------------+--------------+
|      date|trip_count|  avg_tripduration|male_count|female_count|subscriber_count|customer_count|
+----------+----------+------------------+----------+------------+----------------+--------------+
|2014-01-08|      9170| 610.8408942202835|      7571|        1599|            9170|             0|
|2014-01-05|      2639| 816.3520272830617|      2194|         445|            2639|             0|
|2014-01-06|      9384|  733.009484228474|      7750|        1633|            9384|             0|
|2014-01-03|      1123| 799.9937666963491|       986|         137|            1123|             0|
|2014-01-02|      8401| 767.5840971312939|      6889|        1512|            8401|             0|
|2014-01-01|      5393| 698.7481921008715|      4204|        1189|            5393|             0|
|2014-01-04|      2257|1217.7673903411608|      1814|         443|            2257|             0|
|2014-01-07|      6230|   725.10658105939|      5279|         951|            6230|             0|
|2014-01-12|     11706| 673.9484025286177|      8736|        2969|           11706|             0|
|2014-01-11|      7479| 716.7067789811472|      5841|        1638|            7479|             0|
|2014-01-10|      9699| 666.9246314052995|      8019|        1679|            9699|             0|
|2014-01-09|     13169| 641.7592072290986|     10743|        2426|           13169|             0|
|2014-01-15|     21010| 685.1461684911947|     16690|        4319|           21010|             0|
|2014-01-14|      9828| 648.2232397232398|      8056|        1772|            9828|             0|
|2014-01-13|     20010| 681.0005997001499|     15930|        4080|           20010|             0|
|2014-01-16|     19459| 667.2081813042807|     15592|        3863|           19459|             0|
|2014-01-17|     19490| 690.6688558234993|     15470|        4019|           19490|             0|
|2014-01-18|      8938| 793.3137167151488|      6934|        2004|            8938|             0|
|2014-01-19|      8554| 735.6098901098901|      6372|        2182|            8554|             0|
|2014-01-20|     12873| 749.4723063776897|      9951|        2922|           12873|             0|
+----------+----------+------------------+----------+------------+----------------+--------------+
only showing top 20 rows

In [ ]: final_aggregated_df.orderBy("date", ascending=False).limit(20).show()
+----------+----------+-----------------+----------+------------+----------------+--------------+
|      date|trip_count| avg_tripduration|male_count|female_count|subscriber_count|customer_count|
+----------+----------+-----------------+----------+------------+----------------+--------------+
|2014-08-31|     12890|801.0823894491854|      9389|        3497|           12890|             0|
|2014-08-30|     16272| 818.163470993117|     11910|        4356|           16272|             0|
|2014-08-29|     27452|768.1563820486667|     21309|        6130|           27452|             0|
|2014-08-28|     31034|765.4483469742862|     24290|        6728|           31034|             0|
|2014-08-27|     31071|768.8513082939076|     24071|        6998|           31071|             0|
|2014-08-26|     31437|750.5371377675987|     24335|        7099|           31432|             5|
|2014-08-25|     29565| 754.389514628784|     22787|        6767|           29561|             4|
|2014-08-24|     20483|856.1377239662158|     14765|        5714|           20483|             0|
|2014-08-23|     20138|812.0242824510875|     14632|        5506|           20138|             0|
|2014-08-22|     27529|749.7640306585782|     21355|        6154|           27529|             0|
|2014-08-21|     30599|768.9101931435667|     23778|        6819|           30599|             0|
|2014-08-20|     32010|773.5048734770385|     24759|        7244|           32010|             0|
|2014-08-19|     32330| 776.447633776678|     24894|        7430|           32330|             0|
|2014-08-18|     30436|760.1569522933369|     23387|        7046|           30436|             0|
|2014-08-17|     20795|820.6033181053137|     15133|        5659|           20795|             0|
|2014-08-16|     21981|895.3084027114326|     15701|        6279|           21981|             0|
|2014-08-15|     28926|759.9964737606306|     22323|        6586|           28926|             0|
|2014-08-14|     31801|797.2494890097796|     24599|        7180|           31801|             0|
|2014-08-13|     25836|739.6614026939154|     20078|        5744|           25836|             0|
|2014-08-12|     24158|666.0317493169964|     19100|        5055|           24158|             0|
+----------+----------+-----------------+----------+------------+----------------+--------------+
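
The latest date in the output above is 2014-08-31, although twelve monthly DataFrames were processed. One plausible cause, offered as an assumption rather than a confirmed diagnosis, is that the `M/d/yyyy` timestamps shown for the October to December files do not parse with a format-free `to_date()` call. The sketch below tries two patterns in turn. `parse_trip_date`, `add_trip_date`, and both pattern lists are hypothetical names; the ISO pattern for the earlier files is assumed, since their raw rows are not shown here. It also assumes Spark's non-ANSI mode, where a failed parse yields NULL instead of an error.

```python
from datetime import datetime

# Assumed timestamp styles: the M/d/yyyy style appears in the output above;
# the ISO style is an assumption about the files that already parsed.
PY_FORMATS = ["%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S"]
SPARK_FORMATS = ["yyyy-MM-dd HH:mm:ss", "M/d/yyyy HH:mm:ss"]


def parse_trip_date(text):
    """Return the calendar date of a starttime string, or None if no format matches."""
    for fmt in PY_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def add_trip_date(df, column="starttime"):
    """Spark version: try each pattern and keep the first one that parses."""
    from pyspark.sql import functions as F  # lazy import so the helper above runs without Spark

    text = F.col(column).cast("string")
    candidates = [F.to_date(F.to_timestamp(text, fmt)) for fmt in SPARK_FORMATS]
    return df.withColumn("date", F.coalesce(*candidates))


# Spot-check both styles on plain strings before running the Spark job
print(parse_trip_date("10/1/2014 00:04:47"))
print(parse_trip_date("2014-01-01 00:00:17"))
```

Under these assumptions, `add_trip_date(df)` could stand in for the `withColumn("date", ...)` line in `aggregate_daily_data`; counting rows where `date` is NULL afterwards would confirm whether any style is still unhandled.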

Step 2 (2 marks): Download and preprocess the weather data

In [ ]: # Specify the path to the CSV file
file_path = '/FileStore/new_york_weather.csv'

# Load the CSV file into a DataFrame
weather_df = spark.read.csv(file_path, header=True, inferSchema=True)

# Show the first few rows of the DataFrame
weather_df.show(5)
+--------+----------+-------+-------+-----+------------+------------+---------+-----+-------
-+------+----------+-----------+----------+----+---------+--------+---------+-------+-------
---------+----------+----------+--------------+-----------+-------+----------+--------------
-----+-------------------+---------+--------------------+--------------------+--------------
---+--------------------+
| name| datetime|tempmax|tempmin| temp|feelslikemax|feelslikemin|feelslike| dew|humidit
y|precip|precipprob|precipcover|preciptype|snow|snowdepth|windgust|windspeed|winddir|sealeve
lpressure|cloudcover|visibility|solarradiation|solarenergy|uvindex|severerisk| su
nrise| sunset|moonphase| conditions| description| i
con| stations|
+--------+----------+-------+-------+-----+------------+------------+---------+-----+-------
-+------+----------+-----------+----------+----+---------+--------+---------+-------+-------
---------+----------+----------+--------------+-----------+-------+----------+--------------
-----+-------------------+---------+--------------------+--------------------+--------------
---+--------------------+
|new york|2014-01-01| 0.8| -4.4| -1.7| 0.8| -10.0| -5.2|-10.9| 49.
8| 0.0| 0| 0.0| NULL| 0.0| 0.0| 44.3| 18.8| 295.0|
1027.7| 45.1| 16.0| 107.9| 9.1| 5| NULL|2014-01-01 07:20:
11|2014-01-01 16:39:19| 0.0| Partially cloudy|Becoming cloudy i...|partly-cloudy-day
|72505394728,KEWR,...|
|new york|2014-01-02| 0.6| -7.1| -2.9| -2.9| -15.4| -9.3| -6.8| 75.
1| 2.716| 100| 20.83| snow| 2.9| 0.9| 50.0| 27.7| 48.3|
1015.5| 100.0| 12.2| 20.8| 1.7| 1| NULL|2014-01-02 07:20:
16|2014-01-02 16:40:10| 0.04| Snow, Overcast|Cloudy skies thro...| snow
|72505394728,KEWR,...|
|new york|2014-01-03| -7.8| -12.2|-10.3| -14.8| -21.3| -18.3|-16.3| 62.
9| 3.198| 100| 33.33| snow| 1.7| 4.2| 50.0| 28.5| 338.6|
1017.9| 53.5| 9.4| 74.6| 6.5| 4| NULL|2014-01-03 07:20:
19|2014-01-03 16:41:03| 0.08|Snow, Partially c...|Clearing in the a...| snow
|72505394728,KEWR,...|
|new york|2014-01-04| -2.5| -13.1| -7.7| -5.0| -20.0| -10.9|-16.9| 48.
7| 0.0| 0| 0.0| snow| 0.9| 4.8| 30.4| 15.0| 256.1|
1030.6| 16.6| 16.0| 116.5| 10.0| 5| NULL|2014-01-04 07:20:
20|2014-01-04 16:41:57| 0.11| Clear|Clear conditions ...| clear-day
|72505394728,KEWR,...|
|new york|2014-01-05| 3.4| -3.7| -0.5| 1.9| -6.4| -2.4| -3.5| 80.
8| 1.425| 100| 25.0| rain,snow| 0.0| 3.1| 27.7| 11.4| 30.4|
1022.5| 75.4| 7.4| 36.2| 3.0| 2| NULL|2014-01-05 07:20:
19|2014-01-05 16:42:53| 0.14|Snow, Rain, Parti...|Partly cloudy thr...| rain
|72505394728,KEWR,...|
+--------+----------+-------+-------+-----+------------+------------+---------+-----+-------
-+------+----------+-----------+----------+----+---------+--------+---------+-------+-------
---------+----------+----------+--------------+-----------+-------+----------+--------------
-----+-------------------+---------+--------------------+--------------------+--------------
---+--------------------+
only showing top 5 rows
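
The task for this step also asks for missing values to be handled, and the preview above shows NULLs in at least `preciptype` and `severerisk`. A minimal sketch of one way to count and drop missing values follows. `summarize_and_clean` and its default column list are hypothetical, and dropping rows (rather than imputing them) is only one of several reasonable choices.

```python
def summarize_and_clean(weather_df, feature_cols=("datetime", "icon", "temp", "humidity")):
    """Count NULLs per selected column, then drop rows that miss any of them."""
    from pyspark.sql import functions as F  # lazy import: defining the helper needs no Spark

    # A single-row DataFrame holding the NULL count of each selected column
    null_counts = weather_df.select(
        [F.sum(F.col(c).isNull().cast("int")).alias(c) for c in feature_cols]
    )
    # Keep only rows where every selected column is present
    cleaned = weather_df.na.drop(subset=list(feature_cols))
    return null_counts, cleaned


# Intended use inside the notebook (assumes weather_df from the cell above):
# null_counts, weather_clean = summarize_and_clean(weather_df)
# null_counts.show()
```

If the counts come back as zero for the selected columns, the drop is a no-op and simply documents that the check was made.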

In [ ]: # Select the relevant columns and rename 'icon' to 'weather'
weather_selected_df = weather_df.select("datetime", "icon", "temp", "humidity") \
    .withColumnRenamed("icon", "weather")

# Show the first few rows of the selected data
weather_selected_df.show(5)

+----------+-----------------+-----+--------+
| datetime| weather| temp|humidity|
+----------+-----------------+-----+--------+
|2014-01-01|partly-cloudy-day| -1.7| 49.8|
|2014-01-02| snow| -2.9| 75.1|
|2014-01-03| snow|-10.3| 62.9|
|2014-01-04| clear-day| -7.7| 48.7|
|2014-01-05| rain| -0.5| 80.8|
+----------+-----------------+-----+--------+
only showing top 5 rows

Step 3: Optional Step (Add Air Quality Data)

In [ ]: from pyspark.sql.functions import col

# Load the air quality dataset
file_path = "/FileStore/ad_viz_plotval_data.csv"
air_quality_df = spark.read.csv(file_path, header=True, inferSchema=True)

# Arrange data by date in ascending order
sorted_air_quality_df = air_quality_df.orderBy(col("Date").asc())

# Show the data sorted by date
sorted_air_quality_df.show(5)

+----------+------+---------+---+------------------------------+--------+---------------+---
-----------------+---------------+----------------+------------------+----------------------
---+-----------+--------------------+---------+--------------------+---------------+--------
+----------------+------+----------------+-----------------+
| Date|Source| Site ID|POC|Daily Mean PM2.5 Concentration| Units|Daily AQI Value|
Local Site Name|Daily Obs Count|Percent Complete|AQS Parameter Code|AQS Parameter Descriptio
n|Method Code| Method Description|CBSA Code| CBSA Name|State FIPS Code| State|C
ounty FIPS Code|County| Site Latitude| Site Longitude|
+----------+------+---------+---+------------------------------+--------+---------------+---
-----------------+---------------+----------------+------------------+----------------------
---+-----------+--------------------+---------+--------------------+---------------+--------
+----------------+------+----------------+-----------------+
|2014-01-01| AQS|360290005| 3| 8.8|ug/m3 LC| 49|
BUFFALO| 1| 100.0| 88502| Acceptable PM2.5 ...|
702|PM2.5 SCC w/Corre...| 15380|Buffalo-Cheektowa...| 36|New York|
29| Erie|42.8769066671345|-78.8095260117327|
|2014-01-01| AQS|360470118| 3| 12.1|ug/m3 LC| 57|
PS 274| 1| 100.0| 88502| Acceptable PM2.5 ...|
702|PM2.5 SCC w/Corre...| 35620|New York-Newark-J...| 36|New York|
47| Kings| 40.69454| -73.92769|
|2014-01-01| AQS|360291013| 3| 8.9|ug/m3 LC| 49|GRA
ND ISLE BOULDVARD| 1| 100.0| 88502| Acceptable PM2.5
...| 702|PM2.5 SCC w/Corre...| 15380|Buffalo-Cheektowa...| 36|New York
| 29| Erie| 42.98844| -78.91859|
|2014-01-01| AQS|360010005| 3| 10.8|ug/m3 LC| 54|ALB
ANY COUNTY HEA...| 1| 100.0| 88502| Acceptable PM2.5
...| 702|PM2.5 SCC w/Corre...| 10580|Albany-Schenectad...| 36|New York
| 1|Albany| 42.64225| -73.75464|
|2014-01-01| AQS|360291014| 3| 7.6|ug/m3 LC| 42|
BROOKSIDE TERRACE| 1| 100.0| 88502| Acceptable PM2.5
...| 702|PM2.5 SCC w/Corre...| 15380|Buffalo-Cheektowa...| 36|New York
| 29| Erie| 42.99813| -78.89926|
+----------+------+---------+---+------------------------------+--------+---------------+---
-----------------+---------------+----------------+------------------+----------------------
---+-----------+--------------------+---------+--------------------+---------------+--------
+----------------+------+----------------+-----------------+
only showing top 5 rows

In [ ]: from pyspark.sql import functions as F
from pyspark.sql.functions import col

# Average the readings per date. F.avg and F.round are called through the module
# alias so that Python's built-in round() is not shadowed.
# Backticks keep the dot in "PM2.5" from being read as a nested-field accessor.
average_air_quality_df = air_quality_df.groupBy("Date").agg(
    F.round(F.avg("`Daily Mean PM2.5 Concentration`"), 2).alias("Average PM2.5"),
    F.round(F.avg("`Daily AQI Value`"), 2).alias("Average AQI")
)

# Arrange the result by date
average_air_quality_df = average_air_quality_df.orderBy(col("Date").asc())

# Show the result
average_air_quality_df.show(5)

+----------+-------------+-----------+
| Date|Average PM2.5|Average AQI|
+----------+-------------+-----------+
|2014-01-01| 12.13| 54.96|
|2014-01-02| 9.38| 45.87|
|2014-01-03| 10.22| 51.45|
|2014-01-04| 11.52| 54.26|
|2014-01-05| 16.22| 62.94|
+----------+-------------+-----------+
only showing top 5 rows

In [ ]: # Get the total number of rows
row_count = average_air_quality_df.count()

# Print the row count
print(f"Total number of rows: {row_count}")

Total number of rows: 365

Step 4 (4 marks): Combine all data into a final table

In [ ]: from pyspark.sql.functions import col, to_date

# Ensure all date columns are of the same format and type
final_aggregated_df = final_aggregated_df.withColumn("date", to_date(col("date")))
weather_selected_df = weather_selected_df.withColumn("datetime", to_date(col("datetime")))
average_air_quality_df = average_air_quality_df.withColumn("Date", to_date(col("Date")))

# Join final_aggregated_df with weather_selected_df on date columns.
# An inner join keeps only days present on both sides.
combined_df_1 = final_aggregated_df.join(
    weather_selected_df,
    final_aggregated_df["date"] == weather_selected_df["datetime"],
    "inner"
)

# Join the result with average_air_quality_df on date columns
final_combined_df = combined_df_1.join(
    average_air_quality_df,
    combined_df_1["date"] == average_air_quality_df["Date"],
    "inner"
)

# Drop the date columns from the joins.
# Note: Spark resolves column names case-insensitively by default, so "Date"
# also matches "date" and no date column remains in the result shown below.
final_combined_df = final_combined_df.drop("datetime", "Date")

# Show the combined DataFrame
final_combined_df.show(5)
+----------+-----------------+----------+------------+----------------+--------------+-----------------+-----+--------+-------------+-----------+
|trip_count| avg_tripduration|male_count|female_count|subscriber_count|customer_count|          weather| temp|humidity|Average PM2.5|Average AQI|
+----------+-----------------+----------+------------+----------------+--------------+-----------------+-----+--------+-------------+-----------+
|      9170|610.8408942202835|      7571|        1599|            9170|             0|partly-cloudy-day| -9.5|    43.2|         12.2|      56.26|
|      2639|816.3520272830617|      2194|         445|            2639|             0|             rain| -0.5|    80.8|        16.22|      62.94|
|      9384| 733.009484228474|      7750|        1633|            9384|             0|             rain|  6.8|    73.9|         7.55|       41.0|
|      1123|799.9937666963491|       986|         137|            1123|             0|             snow|-10.3|    62.9|        10.22|      51.45|
|      8401|767.5840971312939|      6889|        1512|            8401|             0|             snow| -2.9|    75.1|         9.38|      45.87|
+----------+-----------------+----------+------------+----------------+--------------+-----------------+-----+--------+-------------+-----------+
only showing top 5 rows
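
The combined table shown above has no date column, so calendar signals such as the weekday are not available to the model. A minimal sketch of deriving such features before the date is dropped follows. `calendar_features` and `add_calendar_features` are hypothetical helpers, and the choice of weekday, weekend flag, and month is an illustrative assumption, not part of the original notebook.

```python
from datetime import date


def calendar_features(day):
    """Pure-Python mirror of the Spark columns below, handy for spot checks."""
    day_of_week = day.isoweekday() % 7 + 1  # 1 = Sunday ... 7 = Saturday, as in Spark's dayofweek
    return {
        "day_of_week": day_of_week,
        "is_weekend": int(day_of_week in (1, 7)),
        "month": day.month,
    }


def add_calendar_features(df, date_col="date"):
    """Spark version: add weekday, weekend flag, and month columns."""
    from pyspark.sql import functions as F  # lazy import so the helper above runs without Spark

    dow = F.dayofweek(F.col(date_col))
    return (
        df.withColumn("day_of_week", dow)
        .withColumn("is_weekend", dow.isin(1, 7).cast("int"))
        .withColumn("month", F.month(F.col(date_col)))
    )


print(calendar_features(date(2014, 1, 1)))  # 2014-01-01 was a Wednesday
```

These features depend only on the calendar, so they carry no information derived from `trip_count` itself.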

In [ ]: from pyspark.ml.feature import StringIndexer

# Index the 'weather' column to convert it from a string to a numeric category code
indexer = StringIndexer(inputCol="weather", outputCol="weather_indexed")

# Apply the transformation to the data
final_combined_df = indexer.fit(final_combined_df).transform(final_combined_df)

# Now overwrite the original 'weather' column with the indexed values
final_combined_df = final_combined_df.drop("weather").withColumnRenamed("weather_indexed", "weather")

# Show the result
final_combined_df.show(5)

+----------+-----------------+----------+------------+----------------+--------------+-----+--------+------------+-----------+-------+
|trip_count| avg_tripduration|male_count|female_count|subscriber_count|customer_count| temp|humidity|average_pm25|average_aqi|weather|
+----------+-----------------+----------+------------+----------------+--------------+-----+--------+------------+-----------+-------+
|      9170|610.8408942202835|      7571|        1599|            9170|             0| -9.5|    43.2|        12.2|      56.26|    0.0|
|      2639|816.3520272830617|      2194|         445|            2639|             0| -0.5|    80.8|       16.22|      62.94|    1.0|
|      9384| 733.009484228474|      7750|        1633|            9384|             0|  6.8|    73.9|        7.55|       41.0|    1.0|
|      1123|799.9937666963491|       986|         137|            1123|             0|-10.3|    62.9|       10.22|      51.45|    3.0|
|      8401|767.5840971312939|      6889|        1512|            8401|             0| -2.9|    75.1|        9.38|      45.87|    3.0|
+----------+-----------------+----------+------------+----------------+--------------+-----+--------+------------+-----------+-------+
only showing top 5 rows
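
By default, `StringIndexer` orders labels by descending frequency (`stringOrderType="frequencyDesc"`), so the most common weather label receives index 0.0. The sketch below reproduces that ordering in plain Python on made-up labels. `frequency_index` is a hypothetical helper, and the alphabetical tie-break reflects the behavior documented for Spark 3.

```python
from collections import Counter


def frequency_index(labels):
    """Map each label to a float index: most frequent first, ties broken alphabetically."""
    counts = Counter(labels)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: float(position) for position, label in enumerate(ordered)}


# Made-up labels for illustration only (not taken from the weather file)
sample = ["rain", "snow", "rain", "clear-day"]
print(frequency_index(sample))
```

The resulting numbers are category codes rather than a meaningful scale. Tree-based models generally tolerate this, while linear models usually call for one-hot encoding instead.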

Step 5 (1 mark): Split data into training and testing sets

In [ ]: # Rename columns for better compatibility
final_combined_df = final_combined_df.withColumnRenamed("Average PM2.5", "average_pm25") \
    .withColumnRenamed("Average AQI", "average_aqi")

# Check the schema after renaming
final_combined_df.printSchema()
root
|-- trip_count: long (nullable = false)
|-- avg_tripduration: double (nullable = true)
|-- male_count: long (nullable = false)
|-- female_count: long (nullable = false)
|-- subscriber_count: long (nullable = false)
|-- customer_count: long (nullable = false)
|-- weather: string (nullable = true)
|-- temp: double (nullable = true)
|-- humidity: double (nullable = true)
|-- average_pm25: double (nullable = true)
|-- average_aqi: double (nullable = true)

In [ ]: from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import RandomForestRegressor

feature_columns = [
    "avg_tripduration", "male_count", "female_count", "subscriber_count",
    "customer_count", "weather", "temp", "humidity", "average_pm25", "average_aqi"
]

# Prepare the data by selecting features and target variable
final_data = final_combined_df.select("trip_count", *feature_columns)

# Split the data into train and test sets (80% train, 20% test)
train_data, test_data = final_data.randomSplit([0.8, 0.2], seed=42)

# Show the count of training and testing data
print(f"Training Data Count: {train_data.count()}")
print(f"Testing Data Count: {test_data.count()}")

Training Data Count: 192
Testing Data Count: 51
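
The task statement requires that features not contain `trip_count` information directly or indirectly. A quick check on five daily rows copied from the aggregated output above suggests that some count features nearly reconstruct the target. `rows` and `feature_gaps` are illustrative names, and five rows are of course only a spot check.

```python
# (date, trip_count, male_count, female_count, subscriber_count), copied from the output above
rows = [
    ("2014-01-08", 9170, 7571, 1599, 9170),
    ("2014-01-05", 2639, 2194, 445, 2639),
    ("2014-01-06", 9384, 7750, 1633, 9384),
    ("2014-01-03", 1123, 986, 137, 1123),
    ("2014-01-02", 8401, 6889, 1512, 8401),
]


def feature_gaps(row):
    """Return how far the count features are from reproducing trip_count."""
    _, trips, male, female, subscribers = row
    return {
        "gender_gap": trips - (male + female),
        "subscriber_gap": trips - subscribers,
    }


for row in rows:
    print(row[0], feature_gaps(row))
```

If the gaps stay this close to zero across the full table, a model can score well simply by adding these columns together. Replacing them with features known before the day starts (for example, previous-day values) is one possible alternative.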

1.3.3 Evaluate ML Pipeline using Testing Data (8 marks)


To evaluate the trained ML pipeline, we run it on the testing data and compute two performance
metrics: Mean Absolute Error (MAE) and the coefficient of determination (R²). Together they
indicate how well the model generalizes to unseen data.

Steps:
1. Model Prediction on Testing Data: Use the trained model to make predictions on the testing
dataset.

2. Performance Metrics: Calculate the MAE and R² values:

   MAE: The average magnitude of the prediction errors, ignoring their direction. It is
   expressed in the same unit as trip_count (trips per day).
   R²: The proportion of variance in the target variable that is explained by the model.
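
The two metrics can be written out directly, which makes their definitions concrete. The sketch below uses plain Python and made-up numbers; `mean_absolute_error` and `r_squared` are hypothetical helper names, and the notebook itself relies on Spark's `RegressionEvaluator` instead.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all observations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only
actual = [100, 200, 300]
predicted = [110, 190, 330]
print(mean_absolute_error(actual, predicted))
print(r_squared(actual, predicted))
```

An R² of 1 means the predictions match the targets exactly, while 0 means the model does no better than always predicting the mean of the targets.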

In [ ]: from pyspark.ml import Pipeline
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import RandomForestRegressor

# Initialize the model (RandomForestRegressor in this case)
rf = RandomForestRegressor(labelCol="trip_count", featuresCol="features")

# Create the feature assembler
assembler = VectorAssembler(inputCols=feature_columns, outputCol="features")

# Create the pipeline
pipeline = Pipeline(stages=[assembler, rf])

# Train the model using the training data
model = pipeline.fit(train_data)

# Make predictions on the test data
predictions = model.transform(test_data)

# Show the predictions
predictions.select("trip_count", "prediction").show(5)

# Initialize the evaluators for MAE and R squared
evaluator_mae = RegressionEvaluator(
    metricName="mae", labelCol="trip_count", predictionCol="prediction"
)
evaluator_r2 = RegressionEvaluator(
    metricName="r2", labelCol="trip_count", predictionCol="prediction"
)

# Calculate MAE and R squared
mae = evaluator_mae.evaluate(predictions)
r2 = evaluator_r2.evaluate(predictions)

# Print the evaluation metrics
print(f"Mean Absolute Error (MAE): {mae}")
print(f"R Squared (R²): {r2}")

+----------+------------------+
|trip_count| prediction|
+----------+------------------+
| 2421|3184.6350694444445|
| 4879| 4187.76128968254|
| 5393| 6395.522843137255|
| 8554| 9188.944260210681|
| 9949|10063.855525724275|
+----------+------------------+
only showing top 5 rows

Mean Absolute Error (MAE): 371.7361597086873
R Squared (R²): 0.9974625580065704
