This project analyzes and visualizes superhero data. The dataset describes each superhero's powers, appearances, and other attributes, and the analysis explores trends, distributions, and relationships within it.
The dataset was obtained from the SuperHero API, which provides information about superheroes from various comic books and publishers.
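A minimal sketch of how one record might be fetched and flattened for tabular analysis. The URL layout, the `base_url` parameter, the nested `powerstats` key, and the helper names `build_url`, `flatten_record`, and `fetch_hero` are all assumptions for illustration, not the API's documented interface or the notebook's actual code.

```python
from urllib.parse import quote


def build_url(base_url: str, hero_id: int) -> str:
    """Join a base URL and a numeric id (hypothetical URL layout)."""
    return f"{base_url.rstrip('/')}/{quote(str(hero_id))}"


def flatten_record(record: dict) -> dict:
    """Flatten one level of nested dicts into 'parent_child' keys."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                flat[f"{key}_{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


def fetch_hero(base_url: str, hero_id: int) -> dict:
    """Fetch one record over HTTP and return it flattened."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(build_url(base_url, hero_id), timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return flatten_record(response.json())
```

Flattening nested fields up front means each record maps to a single row, so a list of such dicts can be passed straight to `pandas.DataFrame`.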
- Python: Programming language used for data analysis and manipulation.
- Pandas: Library used for data manipulation and analysis with DataFrame structures.
- Matplotlib: Library used for creating plots and visualizations.
- Seaborn: Library used for creating attractive and informative statistical graphics.
- NumPy: Library used for numerical computations and data manipulation with arrays.
- Requests: Library used for sending HTTP requests and fetching data from websites.
- Regular Expressions (re): Used for pattern matching and text processing.
- MySQL: Relational database used to store and query the data with SQL.
- notebooks: Jupyter Notebook containing the main analysis code.
- README.md: This file, which provides an overview of the project.
- data: Directory containing the dataset used for analysis.
- slides: Project presentation.
- erd: Entity-relationship diagram.
- sql_files: All SQL files used.
- Data Loading and Preprocessing: The dataset is loaded into a DataFrame and cleaned as needed.
- Exploratory Data Analysis (EDA): Basic statistics and visualizations are used to understand the dataset's structure and distributions.
- Feature Engineering: Additional features are created or extracted from existing ones to facilitate analysis.
- Visualization: Various plots and graphs are generated to visualize trends, distributions, and relationships within the data.
- Insights and Interpretation: The findings from the analysis are summarized, and conclusions are drawn based on the observed patterns.
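The loading, cleaning, and feature-engineering steps above could look roughly like the following sketch. The sample rows, the column names (`strength`, `speed`, `intelligence`, `height`), and the derived `mean_stat` feature are assumptions for illustration; the real dataset and notebook may differ.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns may differ.
raw = pd.DataFrame(
    {
        "name": ["Hero A", "Hero B", "Hero C", "Hero A"],
        "strength": ["80", "null", "55", "80"],
        "speed": ["60", "90", None, "60"],
        "intelligence": ["70", "85", "40", "70"],
        "height": ["188 cm", "170 cm", "-", "188 cm"],
    }
)

STAT_COLUMNS = ["strength", "speed", "intelligence"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate heroes, coerce stats to numbers, parse height in cm."""
    clean = df.drop_duplicates(subset="name").copy()
    for column in STAT_COLUMNS:
        # Non-numeric placeholders such as "null" become NaN.
        clean[column] = pd.to_numeric(clean[column], errors="coerce")
    # Pull the number out of strings like "188 cm"; anything else becomes NaN.
    clean["height_cm"] = pd.to_numeric(
        clean["height"].str.extract(r"(\d+(?:\.\d+)?)\s*cm", expand=False),
        errors="coerce",
    )
    return clean.reset_index(drop=True)


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a derived feature: the mean of the available stats per hero."""
    out = df.copy()
    out["mean_stat"] = out[STAT_COLUMNS].mean(axis=1, skipna=True)
    return out


heroes = add_features(preprocess(raw))
print(heroes.describe())  # basic statistics as a first EDA step
```

Coercing with `errors="coerce"` keeps rows with missing stats in the table as NaN rather than dropping them, which leaves the decision about how to handle gaps to the later analysis.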
- Distribution of Superhero Attributes: Histograms and box plots reveal the distribution of different superhero attributes such as strength, speed, and intelligence.
- Top 10 Superheroes: Tables and visualizations highlight the top 10 superheroes based on specific attributes or criteria.
- Distribution of Appearances Across Generations: Bar charts and pie charts illustrate the distribution of superhero appearances across different generations.
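As an illustration of how results like these might be produced, here is a small sketch that ranks heroes by one attribute and draws a histogram and a box plot. The data is made up, and the `strength` column name is an assumption; it is not the project's actual output.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: 12 heroes with made-up strength values.
heroes = pd.DataFrame(
    {
        "name": [f"Hero {i}" for i in range(1, 13)],
        "strength": [10, 95, 40, 70, 85, 20, 60, 100, 55, 30, 75, 45],
    }
)

# Top 10 heroes by a single attribute, strongest first.
top10 = heroes.nlargest(10, "strength")[["name", "strength"]].reset_index(drop=True)

# Histogram and box plot of the same attribute, side by side.
fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
ax_hist.hist(heroes["strength"], bins=5)
ax_hist.set_title("Strength distribution")
ax_hist.set_xlabel("strength")
ax_hist.set_ylabel("number of heroes")
ax_box.boxplot(heroes["strength"])
ax_box.set_title("Strength box plot")
fig.tight_layout()
```

The same pattern applies to any other numeric attribute by swapping the column name.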
This project provides a comprehensive analysis of superhero data, offering insights into the characteristics and powers of various superheroes. The findings can be valuable for fans, researchers, and enthusiasts interested in the superhero universe.
- Incorporate additional datasets to enrich the analysis.
- Explore machine learning models for predictive analysis or classification tasks.
- Extend the analysis to include more in-depth character profiles and storylines.
Élio Vieira