This project performs exploratory data analysis (EDA) on the layoffs_staging2 table. The table records company layoffs, including the number of employees laid off, the percentage of the workforce laid off, industry, country, and more. The SQL queries below summarize the layoffs by company, industry, country, stage, and time period.
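For orientation, here is a minimal sketch of what the table might look like. Only the column names come from the queries in this project; every data type is an assumption, and the real table may contain additional columns, so adjust the definition to match your own import.

```sql
-- Hypothetical schema sketch. Column names are taken from the queries in this
-- project; all data types below are assumptions, not the project's definition.
CREATE TABLE IF NOT EXISTS layoffs_staging2 (
    company             VARCHAR(255),
    industry            VARCHAR(255),
    total_laid_off      INT,
    percentage_laid_off DECIMAL(5, 4),  -- assumed to be a fraction, where 1 means 100%
    `date`              DATE,           -- backticks avoid confusion with the DATE keyword
    stage               VARCHAR(255),
    country             VARCHAR(255)
);
```

`IF NOT EXISTS` is used so that the statement leaves an already loaded table untouched.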
The project is structured around the following SQL queries, each designed to extract specific insights from the dataset:
-
Basic Data Exploration
SELECT * FROM layoffs_staging2;
-
Maximum Values Analysis
SELECT MAX(total_laid_off), MAX(percentage_laid_off) FROM layoffs_staging2;
-
High-Impact Layoffs
SELECT * FROM layoffs_staging2 WHERE percentage_laid_off = 1 ORDER BY total_laid_off DESC;
-
Company-wise Layoffs
SELECT company, SUM(total_laid_off) FROM layoffs_staging2 GROUP BY company ORDER BY 2 DESC;
-
Date Range of Layoffs
SELECT MIN(`date`), MAX(`date`) FROM layoffs_staging2;
-
Industry-wise Layoffs
SELECT industry, SUM(total_laid_off) FROM layoffs_staging2 GROUP BY industry ORDER BY 2 DESC;
-
Country-wise Layoffs
SELECT country, SUM(total_laid_off) FROM layoffs_staging2 GROUP BY country ORDER BY 2 DESC;
-
Year-wise Layoffs
SELECT YEAR(`date`), SUM(total_laid_off) FROM layoffs_staging2 GROUP BY YEAR(`date`) ORDER BY 1 DESC;
-
Country and Year-wise Layoffs
SELECT country, YEAR(`date`), SUM(total_laid_off) FROM layoffs_staging2 GROUP BY YEAR(`date`), country ORDER BY SUM(total_laid_off) DESC;
-
Stage-wise Layoffs
SELECT stage, SUM(total_laid_off) FROM layoffs_staging2 GROUP BY stage ORDER BY 2 DESC;
-
Monthly Layoffs Trend
SELECT SUBSTRING(`date`, 1, 7) AS `MONTH`, SUM(total_laid_off) FROM layoffs_staging2 WHERE SUBSTRING(`date`, 1, 7) IS NOT NULL GROUP BY `MONTH` ORDER BY 1 ASC;
-
Rolling Total of Layoffs
WITH rolling_total AS (
SELECT SUBSTRING(`date`, 1, 7) AS `MONTH`, SUM(total_laid_off) AS total_laid_off
FROM layoffs_staging2
WHERE SUBSTRING(`date`, 1, 7) IS NOT NULL
GROUP BY `MONTH`
ORDER BY 1 ASC
)
SELECT `MONTH`, total_laid_off, SUM(total_laid_off) OVER(ORDER BY `MONTH`) AS rolling_total
FROM rolling_total;
-
Company and Year-wise Layoffs
SELECT company, YEAR(`date`) AS years, SUM(total_laid_off) AS total_laid_off
FROM layoffs_staging2
GROUP BY company, YEAR(`date`)
ORDER BY 3 DESC;
-
Company Ranking by Year
WITH company_year (company_name, years, total_laid_off) AS (
SELECT company, YEAR(`date`) AS `date`, SUM(total_laid_off) AS total_laid_off
FROM layoffs_staging2
GROUP BY company, YEAR(`date`)
)
SELECT *, DENSE_RANK() OVER(PARTITION BY years ORDER BY total_laid_off DESC) AS ranks
FROM company_year
WHERE years IS NOT NULL;
-
Top 5 Companies by Layoffs Per Year
WITH company_year (company_name, years, total_laid_off) AS (
SELECT company, YEAR(`date`) AS `date`, SUM(total_laid_off) AS total_laid_off
FROM layoffs_staging2
GROUP BY company, YEAR(`date`)
), company_ranking AS (
SELECT *, DENSE_RANK() OVER(PARTITION BY years ORDER BY total_laid_off DESC) AS ranks
FROM company_year
WHERE years IS NOT NULL
)
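SELECT * FROM company_ranking WHERE ranks <= 5;
To run these queries, load the dataset into a MySQL-compatible environment as a table named layoffs_staging2 and execute the statements above. The queries use MySQL syntax such as backtick-quoted identifiers and YEAR(), and the CTE and window-function queries require MySQL 8.0 or later, so other SQL dialects may need small adjustments. Together the queries show how layoffs are distributed across companies, industries, countries, and stages, and how they change over time.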
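As a quick check before running the analysis, something like the following can confirm that the table is reachable and show how many rows lack a layoff count. This is a sketch only: `world_layoffs` is a hypothetical database name, so substitute the schema that holds your copy of layoffs_staging2.

```sql
-- Hypothetical database name; replace with the schema that holds the table.
USE world_layoffs;

-- Total rows, and rows where the layoff count is missing.
-- COUNT(column) skips NULLs, so the difference is the number of NULL totals.
SELECT
    COUNT(*) AS row_count,
    COUNT(*) - COUNT(total_laid_off) AS rows_missing_total
FROM layoffs_staging2;
```

SUM() also skips NULLs, so any rows counted in `rows_missing_total` contribute nothing to the aggregated totals in the queries.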