This project demonstrates the use of Pandas, a powerful Python library for data manipulation and analysis. Pandas provides data structures like DataFrame and Series to handle structured data efficiently.
- Reading and writing data from various formats like CSV, Excel, SQL, JSON, etc.
- Handling missing data with ease.
- Data filtering, selection, and slicing.
- Grouping and aggregation.
- Merging and joining datasets.
- Time series functionality.
- Data visualization integration.
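Several of the features listed above can be sketched in a few lines. This is a minimal illustration, not part of the project's own code: the `sales` and `managers` tables, their column names, and all values are hypothetical and invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "amount": [100.0, np.nan, 250.0, 50.0],
})
managers = pd.DataFrame({
    "region": ["East", "West"],
    "manager": ["Dana", "Eli"],
})

# Handling missing data: replace the missing amount with 0.
sales_filled = sales.fillna({"amount": 0.0})

# Filtering and selection: keep rows with an amount above 75.
large_sales = sales_filled[sales_filled["amount"] > 75]

# Grouping and aggregation: total amount per region.
totals = sales_filled.groupby("region")["amount"].sum()

# Merging and joining: attach each region's manager with a left join.
merged = sales_filled.merge(managers, on="region", how="left")

# Time series: daily values summed into two-day bins.
daily = pd.Series(
    [1, 2, 3, 4],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)
two_day = daily.resample("2D").sum()

print(totals)
print(merged)
print(two_day)
```

Filling missing values with 0 is only one option; depending on the data, dropping the rows with `dropna()` or filling with a column statistic may be more appropriate.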
pip install pandas

import pandas as pd
# Creating a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
# Display DataFrame
print(df)
# Read data from CSV (assumes a file named data.csv exists in the working directory)
df_csv = pd.read_csv('data.csv')
# Basic operations
print(df['Age'].mean()) # Calculate average age
print(df[df['Age'] > 28]) # Filter rows where age > 28

- Efficient handling of large datasets.
- Intuitive syntax for data analysis.
- Integrates well with other Python libraries like NumPy, Matplotlib, and Scikit-learn.
- Supports both numerical and categorical data.
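The last two points can be illustrated with a short sketch. The `items` table and its values are hypothetical, chosen only to show an ordered categorical column alongside a numeric one that is passed to NumPy.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for illustration only.
items = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],
    "weight": [1.0, 4.0, 9.0, 16.0],
})

# Categorical data: declare an ordered category type for the size column.
size_type = pd.CategoricalDtype(["small", "medium", "large"], ordered=True)
items["size"] = items["size"].astype(size_type)

# Ordered categories sort by their declared order, not alphabetically.
sorted_sizes = items.sort_values("size")["size"].tolist()

# NumPy integration: a NumPy ufunc applied to a Series returns a Series.
roots = np.sqrt(items["weight"])

# A column can also be converted to a plain NumPy array.
weights_array = items["weight"].to_numpy()

print(sorted_sizes)
print(roots)
print(weights_array)
```

Storing repeated labels as a categorical type can also reduce memory use, although the benefit depends on how many distinct values the column holds.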
Contributions, issues, and feature requests are welcome. Feel free to check the issues page.
This project is licensed under the MIT License.