Uploaded by Raymond Banag

Data Analyst Workflow (Day-to-Day / Project-Based)

1. Understand the Problem / Objective


• Purpose: Know what the business wants to measure or improve (sales trends,
customer churn, etc.).
• Tools: None specifically; mainly meetings, notes, or documentation.

2. Data Collection / Extraction


• Purpose: Gather relevant data from databases, spreadsheets, APIs, or other
sources.
• Tools & When to Use:
o SQL: When the data is in a relational database (e.g., MySQL, PostgreSQL).
Use it to pull exactly what you need.
o Python / R: For web scraping, APIs, or large datasets.
o Excel: For small datasets or ad-hoc data from CSVs or manual reports.
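
A minimal sketch of this extraction step, using Python's built-in sqlite3 module as a stand-in relational database (the table, columns, and values below are made up for illustration):

```python
import sqlite3

# Hypothetical example: a tiny "orders" table standing in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "North", 120.0), (2, "South", 80.0), (3, "North", 200.0)],
)

# Pull exactly what you need: North-region orders above 100.
rows = conn.execute(
    "SELECT id, amount FROM orders WHERE region = 'North' AND amount > 100"
).fetchall()
print(rows)  # [(1, 120.0), (3, 200.0)]
conn.close()
```

The same SELECT would work against MySQL or PostgreSQL; only the connection setup changes.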

3. Data Cleaning / Preprocessing


• Purpose: Make the data usable by fixing errors, missing values, duplicates, and
standardizing formats.
• Tools & When to Use:
o Excel: Small datasets or simple tasks like removing duplicates, correcting
typos, or quick filters.
o SQL: When cleaning data in a database before exporting (e.g., filtering rows,
joining tables, aggregating).
o Python (pandas) / R (dplyr): For large datasets, complex transformations,
automated cleaning, and reproducibility.
• Tip: If it’s a one-off small dataset, Excel is fine; for repeated, large, or multi-table
cleaning, use SQL or Python.
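
A small pandas cleaning sketch covering the fixes listed above (the column names and values are invented for illustration):

```python
import pandas as pd

# Hypothetical raw data: inconsistent casing, stray whitespace,
# a duplicate row, a missing name, and numbers stored as text.
df = pd.DataFrame({
    "customer": ["Alice", "alice ", "Bob", None],
    "amount": ["100", "100", "250", "75"],
})

# Standardize formats: trim whitespace, unify case, convert types.
df["customer"] = df["customer"].str.strip().str.title()
df["amount"] = pd.to_numeric(df["amount"])

# Handle missing values, then drop exact duplicates.
df = df.dropna(subset=["customer"]).drop_duplicates()
print(df)  # two clean rows remain: Alice (100) and Bob (250)
```

Because the steps are plain code, the same cleaning can be rerun on next month's file unchanged, which is the reproducibility advantage over manual Excel fixes.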

4. Exploratory Data Analysis (EDA)


• Purpose: Understand patterns, distributions, trends, and anomalies in the data.
• Tools:
o Python / R: For plotting histograms, scatterplots, boxplots, correlation
analysis.
o Excel / Power BI / Tableau: Quick visual summaries, pivot tables, basic
charts.
• Tip: Python/R is better for deeper statistical analysis; dashboards are better
for business storytelling.
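
A quick EDA sketch using only Python's standard library, on a hypothetical daily-sales sample (the numbers are made up):

```python
import statistics

# Hypothetical daily sales with one suspicious spike.
sales = [120, 135, 128, 110, 500, 125, 130, 118, 122, 127]

mean = statistics.mean(sales)
median = statistics.median(sales)
stdev = statistics.stdev(sales)

# A mean far above the median hints at skew or an outlier worth inspecting.
print(f"mean={mean:.1f} median={median:.1f} stdev={stdev:.1f}")

# Flag values more than 2 standard deviations from the mean as anomalies.
anomalies = [x for x in sales if abs(x - mean) > 2 * stdev]
print(anomalies)  # the 500 spike stands out
```

In practice you would follow this up with histograms or boxplots (matplotlib/seaborn in Python, ggplot in R) to see the distribution, not just summarize it.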
5. Analysis / Modeling
• Purpose: Derive insights, test hypotheses, and predict trends.
• Tools:
o Python / R: Regression, clustering, forecasting, hypothesis testing.
o Excel: Simple calculations, trendlines, correlation, or basic pivot table
analysis.
o Power BI / Tableau: Visualize insights or KPI metrics, create dashboards.
• Tip: Complex analysis → Python/R; simple analysis → Excel; storytelling →
dashboards.

6. Reporting / Visualization
• Purpose: Present actionable insights to stakeholders in a clear, visual, and
understandable way.
• Tools:
o Power BI / Tableau: Interactive dashboards.
o Excel: Static reports, charts, pivot tables.
o Python / R (matplotlib, seaborn, ggplot): For custom, reproducible charts
for technical reports.
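
A sketch of a custom, reproducible chart with matplotlib (assuming it is installed; the categories and figures are made up). Because the chart is generated by a script, rerunning it on refreshed data yields an identical, up-to-date report figure:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also works headless
import matplotlib.pyplot as plt

# Hypothetical reporting data.
regions = ["North", "South", "East", "West"]
revenue = [420, 310, 275, 390]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(regions, revenue)
ax.set_title("Revenue by Region (sample data)")
ax.set_ylabel("Revenue (kUSD)")
fig.tight_layout()
fig.savefig("revenue_by_region.png")  # the same chart, every run
```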

7. Documentation & Archiving


• Purpose: Ensure your work can be replicated or audited.
• Tools:
o Git / GitHub: Version control for code.
o Excel / CSV / Database: Store cleaned datasets.
o Markdown / Confluence / Notion: Document methodology, assumptions,
transformations.
