πŸ‘‹ Welcome! I'm Pavel


πŸ§‘β€πŸ’» About me

I have a technical background and specialize in data analysis, with a focus on extracting actionable insights from complex datasets to support strategic decisions and deliver measurable business growth. I truly enjoy using data analysis and A/B testing to improve products.


πŸ› οΈ Languages and Tools

  • Programming Languages: Python, SQL (PostgreSQL, MySQL, ClickHouse), NoSQL (MongoDB).
  • Data Analysis & Visualization:
    • Libraries: Pandas, NumPy, SciPy, Statsmodels, Pingouin, Plotly, Matplotlib, Seaborn.
    • Tools & Frameworks: Dash, Power BI, Tableau, Redash, DataLens, Superset.
  • Big Data & Distributed Computing: Apache Spark, Apache Airflow.
  • Machine Learning & AI: Scikit-learn, MLlib.
  • Time Series Forecasting: Facebook Prophet, Uber Orbit.
  • Natural Language Processing: NLTK, SpaCy, TextBlob.
  • Web Scraping: BeautifulSoup, Selenium, Scrapy.
  • DevOps: Linux, Git, Docker.
  • IDEs: VS Code, Google Colab, Jupyter Notebook, Zeppelin, PyCharm.

🎯 Skills

  • Deep data analysis:
    • Preprocessing, cleaning, and identifying patterns using visualization to support decision-making.
  • Writing complex SQL queries:
    • Working with nested queries, window functions, CASE and WITH statements for data extraction and analysis.
  • Understanding product strategy:
    • Knowledge of product development and improvement principles, including analyzing user needs and formulating recommendations for product growth.
  • Product metrics analysis:
    • LTV, retention rate (RR), conversion rate (CR), ARPU, ARPPU, MAU, DAU, and other key performance indicators.
  • Conducting A/B testing:
    • Analyzing results using statistical methods to evaluate the effectiveness of changes.
  • Cohort analysis and RFM segmentation:
    • Identifying user behavior patterns to optimize marketing strategies.
  • End-to-End Data Pipelines:
    • Building automated ETL processes from databases to dashboards with Airflow orchestration.
  • Data visualization and dashboard development:
    • Creating interactive reports in Tableau, Redash, Power BI, and other tools for presenting analytics.
  • Web scraping:
    • Extracting data from websites with BeautifulSoup, Scrapy, and Selenium for information gathering and analysis.
  • Working with big data:
    • Processing large volumes of data with tools such as Hadoop and Spark.
  • Machine Learning Applications:
    • Building and applying machine learning models for forecasting, classification, and clustering to uncover deeper insights and support decision-making.
  • Business and Metric Forecasting:
    • Building and interpreting time series forecasts for key business metrics with Uber Orbit and Facebook Prophet to support strategic planning and goal-setting.
  • Working with APIs:
    • Integrating and extracting data from various sources via APIs.
  • Process Automation:
    • Automating data workflows and routine tasks with Linux scripting, Apache Airflow, and other DevOps tools.
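
As a minimal sketch of the A/B-testing workflow above — a two-sided z-test for the difference between two conversion rates, on hypothetical experiment counts (the group sizes and conversions below are invented for illustration):

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)            # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * norm.sf(abs(z))                        # two-sided p-value
    return z, p_value

# Hypothetical experiment: control converts 200/4000, variant 260/4000
z, p = two_proportion_ztest(200, 4000, 260, 4000)
print(f"z = {z:.3f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```

In practice the same test is available in statsmodels (`proportions_ztest`), and the decision should also weigh effect size and required sample size, not just the p-value.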

🌟 Featured Projects

Key Methods: Knowledge Management | Critical Thinking | Research | Content Curation | Information Architecture

A curated knowledge hub demonstrating a systematic approach to data analysis, reflecting expertise in structuring complex information and evaluating technical content.

  • Systematized 500+ resources into logical learning paths
  • Implemented quality control, selecting materials for accuracy and relevance
  • Optimized for usability, structuring content for quick navigation
  • Enhanced user experience with a web version for easy access and navigation
  • Synthesized fragmented knowledge into a unified framework
  • Covered the full analytics pipeline from fundamentals to deployment

Stack: Python | ClickHouse | Apache Airflow | Superset | Yandex DataLens | StatsModels | Uber Orbit | Telegram API

Key Methods: A/B Testing | Time Series Forecasting | Anomaly Detection | Cohort Analysis | ETL Pipelines | Dashboard Design

Building an end-to-end analytics process for a startup: infrastructure, dashboards, A/B testing, forecasting, automated reports, and anomaly detection.

  • Built complete data infrastructure from raw events to automated business intelligence
  • Designed interactive dashboards for real-time monitoring of user engagement and retention
  • Implemented rigorous A/B testing pipeline with statistical validation of feature experiments
  • Developed forecasting models for server load prediction and capacity planning
  • Created automated reporting system with daily Telegram delivery to stakeholders
  • Established real-time anomaly detection for proactive issue resolution
  • Enabled data-driven product decisions through comprehensive analytics ecosystem
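
One common way to sketch the real-time anomaly detection described above is a rolling z-score on a metric stream; the snippet below uses synthetic data and an illustrative threshold, not the project's actual pipeline:

```python
import numpy as np
import pandas as pd

def flag_anomalies(series: pd.Series, window: int = 24, z_thresh: float = 3.0) -> pd.Series:
    """Flag points deviating more than z_thresh rolling standard deviations
    from the rolling mean, computed on prior observations only (shift(1)
    keeps the current point out of its own baseline)."""
    rolling = series.shift(1).rolling(window, min_periods=window)
    z = (series - rolling.mean()) / rolling.std()
    return z.abs() > z_thresh

# Synthetic hourly active-users metric with one injected spike
rng = np.random.default_rng(42)
metric = pd.Series(1000 + rng.normal(0, 20, 200))
metric.iloc[150] += 300   # injected anomaly
flags = flag_anomalies(metric)
print(f"anomalies at: {list(flags[flags].index)}")
```

A production setup would typically add seasonality-aware baselines (e.g. comparing against the same hour of previous days) before alerting.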

Stack: Python | Pandas | Plotly | Tableau | StatsModels | SciPy | NLTK | TextBlob | Sklearn | Pingouin

Key Methods: Time-Series | Anomaly Detection | Custom Metrics | RFM/Cohorts | NLP | Clustering

Comprehensive analysis of Brazilian e-commerce data, uncovering key insights and actionable business recommendations.

  • Time-series analysis of sales dynamics, seasonality, and trend decomposition
  • Anomaly detection in orders, payments, and delivery times
  • Customer profiling (RFM segmentation, clustering, geo-analysis)
  • Cohort analysis to track retention and lifetime value (LTV)
  • NLP processing of customer reviews (sentiment analysis)
  • Hypothesis validation through statistical testing of data-driven assumptions
  • Delivered strategic, data-backed recommendations to optimize logistics, enhance customer retention strategies, and drive sales growth.
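
The RFM segmentation step above can be sketched with pandas on a toy order log (the customers, dates, and tercile scoring scheme below are invented for illustration):

```python
import pandas as pd

# Toy order log, loosely mirroring an e-commerce orders table
orders = pd.DataFrame({
    "customer_id": ["a", "a", "b", "c", "c", "c"],
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-03-01", "2023-11-20",
         "2024-02-10", "2024-02-25", "2024-03-05"]),
    "amount": [120.0, 80.0, 40.0, 200.0, 150.0, 60.0],
})
snapshot = orders["order_date"].max() + pd.Timedelta(days=1)

rfm = orders.groupby("customer_id").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)
# Score each dimension into terciles (1 = worst, 3 = best; recency is reversed
# because smaller recency means a more recent, hence better, customer)
rfm["R"] = pd.qcut(rfm["recency"], 3, labels=[3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 3, labels=[1, 2, 3]).astype(int)
rfm["M"] = pd.qcut(rfm["monetary"], 3, labels=[1, 2, 3]).astype(int)
rfm["segment"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)
print(rfm)
```

On real data the quantile count and labels are tuned per business (quintiles are also common), and the resulting segments feed retention and marketing playbooks.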

Stack: Python | PostgreSQL | Airflow | Yandex DataLens | SQLAlchemy | DBLink

Key Methods: ETL Pipelines | Star Schema | Data Warehousing | Dashboard Design | SQL Optimization

End-to-end data pipeline and business intelligence solution for global distributor Wide World Importers.

  • Built automated ETL pipeline transforming OLTP data into optimized star schema data mart
  • Designed and implemented interactive dashboard for sales, logistics, and customer analytics
  • Developed daily automated data updates with Airflow DAG orchestration
  • Enabled data-driven decision making across sales, procurement, and logistics departments
  • Reduced manual reporting time through automated data consolidation and visualization
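
The OLTP-to-star-schema transformation above can be illustrated in miniature with pandas; the table and column names below are hypothetical, not the actual Wide World Importers schema:

```python
import pandas as pd

# Hypothetical denormalized OLTP order lines
oltp = pd.DataFrame({
    "order_id": [1, 1, 2],
    "order_date": ["2024-01-03", "2024-01-03", "2024-01-04"],
    "customer": ["Acme", "Acme", "Globex"],
    "product": ["Widget", "Gadget", "Widget"],
    "qty": [3, 1, 5],
    "price": [10.0, 25.0, 10.0],
})

# Dimension tables with surrogate keys
dim_customer = (oltp[["customer"]].drop_duplicates().reset_index(drop=True)
                .rename_axis("customer_key").reset_index())
dim_product = (oltp[["product"]].drop_duplicates().reset_index(drop=True)
               .rename_axis("product_key").reset_index())

# Fact table referencing the dimensions by surrogate key
fact_sales = (oltp
              .merge(dim_customer, on="customer")
              .merge(dim_product, on="product")
              .assign(revenue=lambda df: df["qty"] * df["price"])
              [["order_id", "order_date", "customer_key", "product_key",
                "qty", "revenue"]])
print(fact_sales)
```

Keeping descriptive attributes in the dimensions and only keys plus measures in the fact table is what makes the downstream dashboard queries simple star-schema joins.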

Stack: Python | Pandas | NumPy | SciPy | Plotly | Statsmodels | Scikit-learn | Pingouin | TextBlob | Sphinx

Key Methods: Data Exploration | Statistical Testing | Cohort Analysis | Automated Visualization | Feature Analysis | Machine Learning

Powerful pandas extension that enhances DataFrames with production-ready analytics while maintaining native functionality.

  • Seamlessly integrates exploratory analysis, statistical testing and visualization into pandas workflows
  • Provides instant insights through automated data profiling and quality checks
  • Enables cohort analysis with flexible periodization and metric customization
  • Offers built-in statistical methods (bootstrap, effect sizes, group comparisons)
  • Generates interactive visualizations with single-command access
  • Supports both DataFrame-level and column-specific analysis
  • Modular architecture allows extending with domain-specific methods
  • Preserves all native pandas functionality for backward compatibility
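
The extension pattern described above — adding methods to DataFrames without subclassing or breaking native behavior — is typically built on pandas' accessor API. A minimal sketch (the `profile` namespace and `missing_report` method are invented for illustration, not this project's actual API):

```python
import pandas as pd

@pd.api.extensions.register_dataframe_accessor("profile")
class ProfileAccessor:
    """Illustrative accessor exposing an analytics method as df.profile.*"""

    def __init__(self, df: pd.DataFrame):
        self._df = df

    def missing_report(self) -> pd.Series:
        """Share of missing values per column, sorted descending."""
        return self._df.isna().mean().sort_values(ascending=False)

df = pd.DataFrame({"a": [1, None, 3], "b": [None, None, 6]})
print(df.profile.missing_report())
```

Because the accessor wraps rather than subclasses the DataFrame, every native pandas method keeps working unchanged, which is how backward compatibility is preserved.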
