I hold a degree in a technical field and specialize in data analysis, with a focus on enabling informed decision-making.
By extracting insights from complex datasets, I help organizations make data-driven decisions that drive business growth.
- Programming Languages: Python, SQL (PostgreSQL, MySQL, ClickHouse), NoSQL (MongoDB).
- Data Analysis & Visualization:
- Libraries: Pandas, NumPy, SciPy, Statsmodels, Pingouin, Plotly, Matplotlib, Seaborn.
- Tools & Frameworks: Dash, Power BI, Tableau, Redash, DataLens, Superset.
- Big Data & Distributed Computing: Apache Spark, Apache Airflow.
- Machine Learning & AI: Scikit-learn, MLlib.
- Natural Language Processing: NLTK, SpaCy, TextBlob.
- Web Scraping: BeautifulSoup, Selenium, Scrapy.
- DevOps: Linux, Git, Docker.
- IDEs: VS Code, Google Colab, Jupyter Notebook, Zeppelin, PyCharm.
- Deep data analysis: preprocessing, cleaning, and identifying patterns using visualization to support decision-making.
- Writing complex SQL queries: nested queries, window functions, CASE expressions, and WITH (CTE) clauses for data extraction and analysis.
- Understanding product strategy: knowledge of product development and improvement principles, including analyzing user needs and formulating recommendations for product growth.
- Product metrics analysis: LTV, retention rate (RR), conversion rate (CR), ARPU, ARPPU, MAU, DAU, and other key performance indicators.
- Conducting A/B testing: analyzing results using statistical methods to evaluate the effectiveness of changes.
- Cohort analysis and RFM segmentation: identifying user behavior patterns to optimize marketing strategies.
- End-to-End Data Pipelines: Building automated ETL processes from databases to dashboards with Airflow orchestration.
- Data visualization and dashboard development: creating interactive reports in Tableau, Redash, Power BI, and other tools for presenting analytics.
- Web scraping: extracting data from websites with BeautifulSoup, Scrapy, and Selenium for information gathering and analysis.
- Working with big data: experience with tools and technologies for processing large volumes of data (e.g., Hadoop, Spark).
- Machine Learning Applications: building and applying machine learning models for forecasting, classification, and clustering to uncover deeper insights and support decision-making.
- Working with APIs: integrating and extracting data from various sources via APIs.
- Process Automation: automating data workflows and routine tasks with Linux scripting, Apache Airflow, and other DevOps tools.
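The SQL skills listed above (CTEs, window functions, CASE) can be sketched with a small, self-contained example run against an in-memory SQLite database; the `orders` table and its columns are illustrative, not from a real project.

```python
# Minimal sketch of a CTE (WITH), a window function, and a CASE expression,
# run via Python's built-in sqlite3 module. Table/column names are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (user_id INTEGER, amount REAL);
INSERT INTO orders VALUES (1, 10.0), (1, 25.0), (2, 40.0), (2, 5.0);
""")

query = """
WITH ranked AS (
    SELECT user_id,
           amount,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY amount DESC) AS rnk
    FROM orders
)
SELECT user_id,
       amount,
       CASE WHEN rnk = 1 THEN 'top order' ELSE 'other' END AS label
FROM ranked
ORDER BY user_id, rnk;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that window functions require SQLite 3.25+, which ships with any recent Python build.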
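As a hedged sketch of the A/B-testing workflow above: Welch's t-test plus a Cohen's d effect size on two synthetic samples (the data here is generated, not from any real experiment).

```python
# A/B test evaluation sketch: Welch's t-test and Cohen's d on synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=10.0, scale=2.0, size=500)   # group A
variant = rng.normal(loc=10.4, scale=2.0, size=500)   # group B

# Welch's t-test: does not assume equal variances between groups.
t_stat, p_value = stats.ttest_ind(control, variant, equal_var=False)

# Cohen's d with a pooled standard deviation, to report effect size
# alongside significance.
pooled_sd = np.sqrt((control.var(ddof=1) + variant.var(ddof=1)) / 2)
cohens_d = (variant.mean() - control.mean()) / pooled_sd

print(f"p-value: {p_value:.4f}, Cohen's d: {cohens_d:.3f}")
```

Reporting the effect size with the p-value helps distinguish a statistically significant change from a practically meaningful one.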
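The RFM segmentation mentioned above can be illustrated with pandas quantile bins; the transactions below are synthetic and the 1–3 scoring scheme is one common convention, not a fixed standard.

```python
# Illustrative RFM scoring: rate each customer 1-3 on Recency, Frequency,
# and Monetary value using quantile bins over synthetic transactions.
import pandas as pd

tx = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3, 4],
    "date": pd.to_datetime([
        "2024-01-05", "2024-03-01", "2024-02-15",
        "2024-03-10", "2024-03-12", "2024-03-20", "2024-01-02",
    ]),
    "amount": [20, 35, 50, 10, 15, 25, 80],
})

snapshot = tx["date"].max() + pd.Timedelta(days=1)
rfm = tx.groupby("user_id").agg(
    recency=("date", lambda d: (snapshot - d.max()).days),
    frequency=("date", "count"),
    monetary=("amount", "sum"),
)

# Lower recency is better, so reverse the labels for that dimension;
# rank(method="first") breaks ties so qcut bin edges stay unique.
rfm["R"] = pd.qcut(rfm["recency"], 3, labels=[3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 3, labels=[1, 2, 3]).astype(int)
rfm["M"] = pd.qcut(rfm["monetary"].rank(method="first"), 3, labels=[1, 2, 3]).astype(int)
rfm["RFM"] = rfm[["R", "F", "M"]].astype(str).agg("".join, axis=1)
print(rfm)
```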
Stack: Python | Pandas | NumPy | SciPy | Plotly | Statsmodels | Scikit-learn | Pingouin | TextBlob | Sphinx
Key Methods: Data Exploration | Statistical Testing | Cohort Analysis | Automated Visualization | Feature Analysis | Machine Learning
Powerful pandas extension that enhances DataFrames with production-ready analytics while maintaining native functionality.
- Seamlessly integrates exploratory analysis, statistical testing and visualization into pandas workflows
- Provides instant insights through automated data profiling and quality checks
- Enables cohort analysis with flexible periodization and metric customization
- Offers built-in statistical methods (bootstrap, effect sizes, group comparisons)
- Generates interactive visualizations with single-command access
- Supports both DataFrame-level and column-specific analysis
- Modular architecture allows extending with domain-specific methods
- Preserves all native pandas functionality for backward compatibility
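The extension pattern described above can be sketched with pandas' public accessor-registration API; the accessor name (`insights`) and its `profile` method here are illustrative placeholders, not the project's actual API.

```python
# Sketch of extending DataFrames without losing native pandas behavior,
# via the public accessor registration API. Names are illustrative.
import pandas as pd

@pd.api.extensions.register_dataframe_accessor("insights")
class InsightsAccessor:
    def __init__(self, df: pd.DataFrame):
        self._df = df

    def profile(self) -> pd.DataFrame:
        """Quick per-column quality check: dtype, missing count, unique count."""
        return pd.DataFrame({
            "dtype": self._df.dtypes.astype(str),
            "missing": self._df.isna().sum(),
            "unique": self._df.nunique(),
        })

df = pd.DataFrame({"a": [1, 2, None], "b": ["x", "x", "y"]})
print(df.insights.profile())   # custom analytics entry point
print(df.describe())           # native pandas functionality unchanged
```

Because the accessor only attaches a namespace, every built-in DataFrame method keeps working, which is how backward compatibility is preserved.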
Stack: Python | Pandas | Plotly | Tableau | StatsModels | SciPy | NLTK | TextBlob | Scikit-learn | Pingouin
Key Methods: Time-Series | Anomaly Detection | Custom Metrics | RFM/Cohorts | NLP | Clustering
Comprehensive analysis of Brazilian e-commerce data, uncovering key insights and actionable business recommendations.
- Time-series analysis of sales dynamics, seasonality, and trend decomposition
- Anomaly detection in orders, payments, and delivery times
- Customer profiling (RFM segmentation, clustering, geo-analysis)
- Cohort analysis to track retention and lifetime value (LTV)
- NLP processing of customer reviews (sentiment analysis)
- Hypothesis validation: statistical tests to verify data-driven assumptions
- Delivered strategic, data-backed recommendations to optimize logistics, enhance customer retention strategies, and drive sales growth.
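The anomaly-detection step above can be sketched with a rolling z-score; the daily order counts below are synthetic, not the project's actual data.

```python
# Time-series anomaly detection sketch: flag days whose order count deviates
# more than 3 standard deviations from the trailing 14-day baseline.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
orders = pd.Series(
    rng.poisson(100, size=60),
    index=pd.date_range("2024-01-01", periods=60, freq="D"),
    name="orders",
)
orders.iloc[45] = 300  # injected spike to detect

# Shift by one day so the current value does not contaminate its own baseline.
baseline = orders.shift(1).rolling(14)
z = (orders - baseline.mean()) / baseline.std()

anomalies = orders[z.abs() > 3]
print(anomalies)
```

The one-day shift is a deliberate choice: including the current observation in its own window would inflate the rolling mean and standard deviation, masking exactly the spikes being searched for.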
Core Skills: Knowledge Management | Critical Thinking | Research | Content Curation | Information Architecture
A curated knowledge hub demonstrating a systematic approach to data analysis, reflecting expertise in structuring complex information and evaluating technical content.
- Systematized 400+ resources into logical learning paths
- Implemented quality control: selected materials based on accuracy and relevance
- Optimized for usability: structured content for quick navigation
- Enhanced user experience: developed a web version for easy access and navigation
- Synthesized fragmented knowledge into unified framework
- Covered full analytics pipeline from fundamentals to deployment