Overview of GPS Clustering Code

The document outlines a GPS clustering code that involves loading and cleaning data from a CSV file, calculating speed and distance between GPS coordinates, and applying KMeans clustering to group the data into five clusters. It also details the creation of an interactive Folium map to visualize the clustered GPS points, including the addition of markers with relevant information. The final output is an HTML file named 'optimized_gps_map.html' for viewing the map.

Uploaded by

adhithanr

Overview of GPS Clustering Code

1. Loading and Cleaning Data


You first load a CSV file named `Druidtest_sufan.csv` using Pandas and handle cases where the
file might not be found:

import pandas as pd

data_file = 'Druidtest_sufan.csv'  # Path to the data file

try:
    data = pd.read_csv(data_file)
    print("Data loaded successfully!")
except FileNotFoundError:
    print(f"File '{data_file}' not found. Please check the file path.")
    raise SystemExit(1)  # Stop here: nothing else can run without the data

This attempts to load the data and, if the file doesn't exist, it prints an error message and exits the
program.

After the data is loaded, you inspect its structure:

print("Data Info:")
data.info()  # Prints column names, dtypes, and non-null counts (returns None)
print("\nFirst few rows:")
print(data.head())  # First few rows for a quick overview

You then clean the data by removing rows with NaN values in the critical columns (`location*lat`
and `location*long`) since these are needed to calculate distances and map locations:

data = data.dropna(subset=['location*lat', 'location*long'])


print("\nCleaned Data Info:")
print(data[['location*lat', 'location*long']].describe()) # Summary stats
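
Dropping NaN values removes missing coordinates, but not coordinates that are present and impossible. As a minimal sketch of a stricter check, the helper below also rejects latitudes outside ±90 and longitudes outside ±180. The function name `is_valid_fix`, the `lat`/`lon` keys, and the sample values are hypothetical and only serve the illustration; the original code does not perform this range check.

```python
import math

def is_valid_fix(lat, lon):
    """True when both coordinates are present, not NaN, and within valid ranges."""
    if lat is None or lon is None:
        return False
    if math.isnan(lat) or math.isnan(lon):
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

# Made-up sample rows, for illustration only.
rows = [
    {"lat": 12.97, "lon": 77.59},
    {"lat": float("nan"), "lon": 77.60},  # missing latitude
    {"lat": 95.0, "lon": 77.61},          # latitude out of range
]
clean = [r for r in rows if is_valid_fix(r["lat"], r["lon"])]
```

Only the first sample row survives the filter; the same predicate could be applied to a DataFrame after `dropna` if out-of-range fixes turn out to be a problem in the data.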

2. Calculating Speed and Distance


To calculate speed and distance between consecutive GPS coordinates, you:

* Convert the `timestamp` column to datetime format to ensure proper calculations:


data['timestamp'] = pd.to_datetime(data['timestamp'])

* Sort the data by timestamp to ensure the calculations are in chronological order:

data = data.sort_values(by='timestamp')

You then iterate over each consecutive pair of GPS points to calculate:
* Distance: Using Geopy's `geodesic` function.
* Speed: Using the formula: `speed = distance / time`.

Here's the code:

from geopy.distance import geodesic

distances = []
speeds = []
# Stop one row early: each step compares row i with row i + 1
for i in range(len(data) - 1):
    loc1 = (data.iloc[i]['location*lat'], data.iloc[i]['location*long'])
    loc2 = (data.iloc[i + 1]['location*lat'], data.iloc[i + 1]['location*long'])
    distance = geodesic(loc1, loc2).meters
    time_diff = (data.iloc[i + 1]['timestamp'] - data.iloc[i]['timestamp']).total_seconds()
    distances.append(distance)
    # Guard against duplicate timestamps to avoid dividing by zero
    speeds.append(distance / time_diff if time_diff > 0 else 0)

# Final row has no following point, so pad with zeros
distances.append(0)
speeds.append(0)

data['distance_m'] = distances
data['speed_m_s'] = speeds
data['cumulative_distance_m'] = data['distance_m'].cumsum()
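
The same sort, distance, speed, and running-total logic can be sketched without Pandas or Geopy, which makes it easy to check by hand. Note the swap: this version uses the haversine formula on a spherical Earth, not Geopy's ellipsoidal `geodesic`, so results will differ slightly. The names `haversine_m`, `track_metrics`, and `EARTH_RADIUS_M` are assumptions made for this illustration.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

def haversine_m(p1, p2):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def track_metrics(fixes):
    """Take (timestamp, lat, lon) tuples and return one
    (distance_m, speed_m_s, cumulative_m) tuple per fix, in time order."""
    fixes = sorted(fixes, key=lambda f: f[0])  # chronological order first
    rows, total = [], 0.0
    for current, nxt in zip(fixes, fixes[1:]):
        distance = haversine_m(current[1:], nxt[1:])
        seconds = (nxt[0] - current[0]).total_seconds()
        speed = distance / seconds if seconds > 0 else 0.0  # avoid division by zero
        total += distance
        rows.append((distance, speed, total))
    if fixes:
        rows.append((0.0, 0.0, total))  # the last fix has no successor
    return rows

# Two fixes one degree of longitude apart on the equator, given out of order.
rows = track_metrics([
    (datetime(2024, 1, 1, 0, 0, 10), 0.0, 1.0),
    (datetime(2024, 1, 1, 0, 0, 0), 0.0, 0.0),
])
```

Sorting inside the function matters: with the fixes left out of order, the time difference would be negative and the speed would silently fall back to 0.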

3. Clustering GPS Points Using KMeans


You apply KMeans clustering to group the GPS data into 5 clusters based on latitude and
longitude:

from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=5, random_state=42)
data['cluster'] = kmeans.fit_predict(data[['location*lat', 'location*long']])

* `n_clusters=5` specifies you want 5 clusters.
* `random_state=42` fixes the random initialization so the labels are reproducible between runs.
* `fit_predict()` computes the clusters and assigns each data point a cluster label.
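
To make the two steps behind `fit_predict()` concrete, here is a minimal sketch of the basic k-means loop (Lloyd's algorithm) in plain Python. It is not scikit-learn's implementation: the starting centroids are passed in by hand instead of being chosen by the library, and the function name `kmeans_sketch` is hypothetical. Like clustering on raw latitude and longitude columns, it treats degrees as flat Euclidean coordinates.

```python
import math

def kmeans_sketch(points, centroids, iterations=10):
    """Basic Lloyd's algorithm on (lat, lon) pairs with caller-supplied centroids."""
    centroids = list(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point takes the label of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centroids

# Two obvious groups of made-up points, with one starting centroid in each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans_sketch(points, [(0.0, 0.0), (10.0, 10.0)])
```

Each returned label plays the same role as a value in the `cluster` column: an integer identifying which centroid the point ended up closest to.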

4. Creating a Folium Map with Clustered Markers


You create a Folium map to visualize the GPS data points. First, you calculate the map center:

map_center = [data['location*lat'].mean(), data['location*long'].mean()]

Then, you create a Folium map with MarkerCluster:

import folium
from folium.plugins import MarkerCluster

gps_map = folium.Map(location=map_center, zoom_start=15, tiles='CartoDB positron')
marker_cluster = MarkerCluster().add_to(gps_map)

You loop through each row of data and add a Marker with a popup:

for _, row in data.iterrows():
    popup_text = (
        f"Event ID: {row['event*id']}<br>"
        f"Timestamp: {row['timestamp']}<br>"
        f"Speed: {row['speed_m_s']:.2f} m/s<br>"
        f"Distance: {row['distance_m']:.2f} m<br>"
        f"Cumulative Distance: {row['cumulative_distance_m']:.2f} m<br>"
        f"Cluster: {row['cluster']}"
    )
    folium.Marker(
        location=[row['location*lat'], row['location*long']],
        popup=popup_text,
    ).add_to(marker_cluster)
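
In the loop above, the KMeans label appears only in the popup text; `MarkerCluster` groups markers by how close they are on screen, not by that label. One possible way to make the KMeans clusters visible at a glance is to color each marker by its label. The helper below is a sketch under that assumption; `CLUSTER_COLORS` and `cluster_color` are hypothetical names, and coloring is not part of the original code.

```python
# Color names accepted by folium.Icon; one per cluster label 0-4.
CLUSTER_COLORS = ["red", "blue", "green", "purple", "orange"]

def cluster_color(label):
    """Map an integer cluster label to a marker color, cycling if labels exceed the list."""
    return CLUSTER_COLORS[int(label) % len(CLUSTER_COLORS)]
```

Inside the loop, the result could be passed to the marker as `icon=folium.Icon(color=cluster_color(row['cluster']))`.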

The map is saved as an HTML file:

gps_map.save('optimized_gps_map.html')
print("Optimized map generated successfully!")

Summary of Key Steps

1. Data Cleaning: Removed rows with missing latitude or longitude.
2. Speed and Distance Calculation: Calculated speed and distance between consecutive points.
3. Clustering: Grouped data using KMeans.
4. Map Generation: Created an interactive map with clustered markers.

Open the generated `optimized_gps_map.html` to view the interactive map.

