ANANT DEV SRIVASTAVA
AMRIT PRIYADARSHAN ROUT
BHAVYA KUMARI
HARSHITA KHANDELWAL
ANUSHKA SAINI
KANISH KALRA
Introduction to the Data
This dataset is designed to support analysis of key aspects of a mid-sized company’s Human Resources,
Sales, and Performance Management. It includes information about employees, their salaries,
departmental structures, sales transactions, and performance reviews. By querying this data, we
can uncover insights related to employee productivity, revenue generation, and managerial
effectiveness.
The dataset consists of four relational tables:
Employees → Contains details like employee ID, name, job role, department, salary, hire
date, and performance rating.
Departments → Lists department names and their corresponding IDs.
Sales → Records sales transactions made by employees, including sales amounts and
product details.
Performance Reviews → Tracks employee performance ratings over time.
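The four tables described above could be laid out roughly as follows. This is a minimal sketch, not the project's actual DDL: it uses SQLite through Python's built-in sqlite3 module so that it runs standalone. Column names that appear in the queries later in this report (employee_id, first_name, last_name, job_role, department, salary, hire_date, performance_rating, manager_id, sales_amount, sales_date, product_id) are reused. The remaining names (department_id, department_name, sale_id, review_id, review_date, rating) and all constraints are assumptions.

```python
import sqlite3

# Hypothetical schema. Names not used in the report's queries
# (department_id, department_name, sale_id, review_id, review_date,
# rating) and every constraint are assumptions for illustration.
SCHEMA = """
CREATE TABLE departments (
    department_id   INTEGER PRIMARY KEY,
    department_name TEXT NOT NULL UNIQUE
);
CREATE TABLE employees (
    employee_id        INTEGER PRIMARY KEY,
    first_name         TEXT NOT NULL,
    last_name          TEXT NOT NULL,
    job_role           TEXT NOT NULL,
    department         TEXT NOT NULL REFERENCES departments(department_name),
    salary             REAL NOT NULL CHECK (salary >= 0),
    hire_date          TEXT NOT NULL,  -- ISO date string, e.g. 2021-03-15
    performance_rating INTEGER CHECK (performance_rating BETWEEN 1 AND 5),
    manager_id         INTEGER REFERENCES employees(employee_id)
);
CREATE TABLE sales (
    sale_id      INTEGER PRIMARY KEY,
    employee_id  INTEGER NOT NULL REFERENCES employees(employee_id),
    product_id   INTEGER NOT NULL,
    sales_amount REAL NOT NULL,
    sales_date   TEXT NOT NULL
);
CREATE TABLE performance_reviews (
    review_id   INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employees(employee_id),
    review_date TEXT NOT NULL,
    rating      INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The self-referencing manager_id column is what would make the managerial hierarchy queryable with a self join.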
Key Features of the Dataset
Employee Demographics & Roles → Helps in understanding workforce distribution
across different job roles.
Sales & Revenue Data → Enables revenue tracking and performance comparison across
departments.
Performance Reviews → Provides insights into how employees are rated based on their
contributions.
Managerial Hierarchies → Helps in assessing leadership effectiveness and workforce
structure.
Background & Context of the Chosen Dataset
In today’s data-driven business environment, companies need to track employee performance,
sales trends, and departmental efficiency to make informed managerial decisions. This dataset
has been designed to reflect a mid-sized company's HR, Sales, and Performance Data, which
helps in analyzing:
Employee Productivity → Understanding the impact of experience, salary, and department on
performance.
Sales Insights → Evaluating which employees and departments contribute the most to revenue.
Managerial Effectiveness → Measuring leadership efficiency through employee
performance metrics.
Departmental Performance → Identifying which departments generate the highest sales
and maintain the best employee retention.
The dataset consists of four key tables:
Employees → Contains details like employee name, job role, department, salary, hire
date, and performance rating.
Departments → Lists department names and their corresponding IDs.
Sales → Stores sales transaction data, including sales amount, employee ID, and product
details.
Performance Reviews → Captures employee ratings and review dates.
This dataset is structured to simulate real-world business scenarios and allows SQL-based data
analysis for decision-making.
Project Objectives
The goal of this SQL assignment is to use structured data querying to extract meaningful
business insights. The objectives include:
Understanding Employee Performance → Analyzing employee performance based on
salary, experience, and department.
Sales & Revenue Analysis → Identifying top-performing employees and departments
that generate the highest revenue.
Departmental & Managerial Efficiency → Evaluating the effectiveness of different
departments and their leadership structures.
HR & Retention Insights → Examining employee retention trends and identifying factors
affecting job satisfaction.
Query Optimization & SQL Functionalities → Utilizing SQL features such as
aggregations, joins, subqueries, and window functions to derive insights efficiently.
Business Decision Support → Providing data-driven recommendations that can improve
employee performance, increase sales, and optimize management decisions.
SQL Features and Functionalities Used
DDL (Data Definition Language) → Creating tables, constraints.
DML (Data Manipulation Language) → Insert, Update, Delete.
Joins → INNER JOIN, LEFT JOIN, RIGHT JOIN, SELF JOIN.
Aggregation Functions → SUM(), COUNT(), AVG(), MIN(), MAX().
Subqueries → Nested queries for filtering.
Common Table Expressions (CTEs) → Recursive and non-recursive.
Window Functions → RANK(), DENSE_RANK(), ROW_NUMBER(), NTILE().
Case Statements → Conditional query execution.
Stored Procedures & Functions → Automating repetitive queries.
Triggers → Implementing automatic updates upon actions.
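As an illustration of how three of the listed features combine, the sketch below uses a non-recursive CTE, the RANK() window function, and a CASE expression to rank employees by total sales within their department. It runs on SQLite via Python's sqlite3 module (window functions need SQLite 3.25 or newer). The rows and the 150 threshold are made up for the example and are not taken from the project dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department TEXT);
CREATE TABLE sales (employee_id INTEGER, sales_amount REAL);
-- Made-up rows for illustration only; not the report's data.
INSERT INTO employees VALUES (1, 'Finance'), (2, 'Finance'), (3, 'IT'), (4, 'IT');
INSERT INTO sales VALUES (1, 100.0), (1, 50.0), (2, 300.0), (3, 80.0), (4, 80.0);
""")

QUERY = """
WITH totals AS (                 -- CTE: total sales per employee
    SELECT e.employee_id, e.department,
           SUM(s.sales_amount) AS total_sales
    FROM employees e
    JOIN sales s ON s.employee_id = e.employee_id
    GROUP BY e.employee_id, e.department
)
SELECT department, employee_id, total_sales,
       RANK() OVER (PARTITION BY department
                    ORDER BY total_sales DESC) AS dept_rank,
       CASE WHEN total_sales >= 150 THEN 'high' ELSE 'low' END AS band
FROM totals
ORDER BY department, dept_rank, employee_id;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With RANK(), the two IT employees tie at rank 1. DENSE_RANK() would behave the same here and would differ only in the rank given to whoever follows a tie.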
SQL Queries & Their Results
Query 1
Total number of employees
SELECT COUNT(*) AS total_employees
FROM employees;
Result- Total number of employees → 50.
Query 2
Total sales made by the company
SELECT SUM(sales_amount) AS total_sales
FROM sales;
Result - Total sales made by the company → ₹2,521,420.15
Query 3
Top 5 employees with highest sales
SELECT employee_id, SUM(sales_amount) AS total_sales FROM sales
GROUP BY employee_id
ORDER BY total_sales DESC LIMIT 5;
Result- Top 5 employees with highest sales:
Employee 5 → ₹151,935.59
Employee 27 → ₹147,304.72
Employee 11 → ₹129,855.97
Employee 24 → ₹103,403.09
Employee 34 → ₹94,221.16
Query 4
Average salary per department
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
Result - Average salary per department:
Finance → ₹74,039.91
HR → ₹80,368.21
IT → ₹87,024.71
Marketing → ₹75,929.76
Sales → ₹84,879.36
Query 5
Employees in the company for more than 5 years
SELECT first_name, last_name, hire_date
FROM employees
WHERE hire_date <= DATE_SUB(CURDATE(), INTERVAL 5 YEAR);
Result - Employees in the company for more than 5 years:
19 employees.
Query 6
Department-wise count of employees
SELECT department,
COUNT(*) AS employee_count
FROM employees
GROUP BY department;
Result- Department-wise count of employees:
Finance → 15
HR → 9
IT → 8
Marketing → 9
Sales → 9
Query 7
Total revenue generated by each department
SELECT e.department,
SUM(s.sales_amount) AS total_revenue
FROM employees e JOIN sales s ON e.employee_id = s.employee_id
GROUP BY e.department;
Result- Total revenue generated by each department:
Finance → ₹692,467.13
HR → ₹551,434.50
IT → ₹257,548.26
Marketing → ₹560,319.86
Sales → ₹459,650.40
Query 8
Highest and lowest salary in each job role
SELECT job_role, MAX(salary) AS highest_salary,
MIN(salary) AS lowest_salary
FROM employees
GROUP BY job_role;
Result- Highest and lowest salary in each job role:
JOB1 → ₹111,680.11 (high), ₹31,626.63 (low)
JOB2 → ₹117,169.86 (high), ₹60,401.55 (low)
JOB3 → ₹90,721.76 (high), ₹44,723.45 (low)
JOB4 → ₹110,635.52 (high), ₹52,188.15 (low)
JOB5 → ₹109,853.37 (high), ₹41,877.29 (low)
Query 9
Employee with the highest salary
SELECT first_name, last_name, salary
FROM employees
ORDER BY salary DESC LIMIT 1;
Result- Employee with the highest salary:
David Coleman (₹117,169.86)
Query 10
Total sales transactions in last 6 months
SELECT COUNT(*) AS sales_transactions
FROM sales
WHERE sales_date >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH);
Result- Total sales transactions in last 6 months: 20
Query 11
Average sales amount per employee
SELECT employee_id,
AVG(sales_amount) AS avg_sales
FROM sales
GROUP BY employee_id;
Result- Average sales amount per employee:
Varies; highest is ₹46,245.34
Query 12
Employees with a performance rating of 5
SELECT first_name, last_name, performance_rating
FROM employees
WHERE performance_rating = 5;
Result- Employees with a performance rating of 5:
9 employees
Query 13
Number of employees under each manager
SELECT manager_id,
COUNT(*) AS employees_under_manager
FROM employees
WHERE manager_id IS NOT NULL
GROUP BY manager_id;
Result - Number of employees under each manager:
Ranges from 3 to 7 employees per manager
Query 14
Total number of job roles in the company
SELECT COUNT(DISTINCT job_role) AS total_job_roles
FROM employees;
Result - Total number of job roles in the company: 5
Query 15
Employees who have not made any sales
SELECT e.first_name, e.last_name
FROM employees e
LEFT JOIN sales s ON e.employee_id = s.employee_id
WHERE s.employee_id IS NULL;
Result- Employees who have not made any sales:
6 employees
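The anti-join pattern behind Query 15 can be tried on a small made-up table. The sketch below runs on SQLite via Python's sqlite3 module, with invented names and rows, and compares the LEFT JOIN ... IS NULL form with an equivalent NOT EXISTS subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,
                        first_name TEXT, last_name TEXT);
CREATE TABLE sales (employee_id INTEGER, sales_amount REAL);
-- Made-up rows for illustration only; not the report's data.
INSERT INTO employees VALUES (1, 'Ana', 'Rao'), (2, 'Bo', 'Lee'), (3, 'Cy', 'Das');
INSERT INTO sales VALUES (1, 10.0), (1, 20.0), (3, 5.0);
""")

# Form 1: keep every employee, then filter to those with no matching sale.
LEFT_JOIN_FORM = """
SELECT e.first_name, e.last_name
FROM employees e
LEFT JOIN sales s ON e.employee_id = s.employee_id
WHERE s.employee_id IS NULL
ORDER BY e.employee_id;
"""

# Form 2: the same question asked with a correlated subquery.
NOT_EXISTS_FORM = """
SELECT e.first_name, e.last_name
FROM employees e
WHERE NOT EXISTS (SELECT 1 FROM sales s WHERE s.employee_id = e.employee_id)
ORDER BY e.employee_id;
"""

no_sales_join = conn.execute(LEFT_JOIN_FORM).fetchall()
no_sales_exists = conn.execute(NOT_EXISTS_FORM).fetchall()
print(no_sales_join, no_sales_exists)
```

Both forms return only the employee with no rows in sales. Which one is easier to read is mostly a matter of preference.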
Query 16
Department with the highest number of employees
SELECT department,
COUNT(*) AS total_employees
FROM employees
GROUP BY department
ORDER BY total_employees DESC LIMIT 1;
Result - Department with the highest number of employees:
Finance (15 employees)
Query 17
Employee with the highest single sale transaction
SELECT e.first_name, e.last_name,
MAX(s.sales_amount) AS highest_single_sale
FROM sales s
JOIN employees e ON s.employee_id = e.employee_id
GROUP BY e.employee_id
ORDER BY highest_single_sale DESC LIMIT 1;
Result - Employee with the highest single sale transaction:
Jeffery Dixon (₹49,641.26)
Query 18
Count of employees earning more than the average salary
SELECT COUNT(*) AS above_avg_salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
Result - Count of employees earning more than the average salary:
28 employees
Query 19
Employees who joined in the last 2 years
SELECT first_name, last_name, hire_date
FROM employees
WHERE hire_date >= DATE_SUB(CURDATE(), INTERVAL 2 YEAR);
Result - Employees who joined in the last 2 years:
8 employees
Query 20
Total number of distinct products sold
SELECT COUNT(DISTINCT product_id)
AS distinct_products_sold
FROM sales;
Result - Total number of distinct products sold: 5
Insights from the SQL Project
Employee Performance & Sales Trends
Employees with higher performance ratings tend to have higher sales contributions.
Some employees have high salaries but contribute less in sales, indicating potential
inefficiencies.
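One way to check the rating-versus-sales relationship is to average each employee's total sales by performance rating, counting employees without sales as zero. The sketch below does this on SQLite via Python's sqlite3 module. The rows are invented for the example, so the output says nothing about the project dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,
                        salary REAL, performance_rating INTEGER);
CREATE TABLE sales (employee_id INTEGER, sales_amount REAL);
-- Made-up rows for illustration only; not the report's data.
INSERT INTO employees VALUES (1, 50000, 5), (2, 90000, 3),
                             (3, 60000, 5), (4, 70000, 3);
INSERT INTO sales VALUES (1, 400.0), (1, 200.0), (3, 200.0), (2, 100.0);
""")

QUERY = """
WITH totals AS (
    -- LEFT JOIN keeps employees with no sales; COALESCE turns NULL into 0.
    SELECT e.employee_id, e.performance_rating,
           COALESCE(SUM(s.sales_amount), 0) AS total_sales
    FROM employees e
    LEFT JOIN sales s ON s.employee_id = e.employee_id
    GROUP BY e.employee_id, e.performance_rating
)
SELECT performance_rating,
       COUNT(*)         AS employees,
       AVG(total_sales) AS avg_total_sales
FROM totals
GROUP BY performance_rating
ORDER BY performance_rating DESC;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Adding AVG(salary) to the same grouping would give a rough view of the second point, namely whether higher pay comes with higher sales.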
Departmental Analysis
Finance (₹692,467.13) and Marketing (₹560,319.86) generate the highest revenue, while IT
(₹257,548.26) contributes the least, likely due to indirect sales involvement. Sales department
employees have variable performance, suggesting a need for more training or incentives.
Managerial Effectiveness
Some managers oversee more employees than others, which may impact management
efficiency. A few managers have low-performing teams, signaling potential leadership or
motivation issues.
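Team size and average team rating per manager can be obtained with a self join of the employees table. The sketch below assumes that manager_id refers to another row's employee_id, which the report does not state explicitly. It runs on SQLite via Python's sqlite3 module, with invented names and ratings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT,
                        manager_id INTEGER, performance_rating INTEGER);
-- Made-up rows for illustration only; not the report's data.
INSERT INTO employees VALUES
    (1, 'Asha', NULL, 4), (2, 'Ben', NULL, 5),
    (3, 'Chen', 1, 2), (4, 'Dia', 1, 3), (5, 'Eli', 1, 4),
    (6, 'Fay', 2, 5), (7, 'Gus', 2, 4);
""")

QUERY = """
SELECT m.employee_id AS manager_id,
       m.first_name  AS manager_name,
       COUNT(*)      AS team_size,
       ROUND(AVG(r.performance_rating), 2) AS avg_team_rating
FROM employees r                                   -- r: the direct reports
JOIN employees m ON r.manager_id = m.employee_id   -- m: their managers
GROUP BY m.employee_id, m.first_name
ORDER BY avg_team_rating ASC;                      -- lowest-rated teams first
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Sorting by the average rating puts the teams that may need attention first. With only a handful of reports per manager, such averages should be read with caution.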
Employee Retention & Experience
19 of the 50 employees have been with the company for over 5 years, indicating
reasonable retention. However, new hires in the last two years have a mix of high and
low performance, suggesting varying training effectiveness.
Sales & Revenue Generation
The top 5 employees contribute significantly to total sales, but a long tail of employees
contributes much less. Some employees have never made a sale, indicating they might be
in support or non-sales roles.
Limitations of the Project
Limited Sales & Product Data
The dataset contains only sales figures and lacks detailed customer behavior insights such as
demographics and purchase frequency.
Simplified Employee Performance Metrics
Employee performance is measured solely based on a rating system (1-5), which might
not capture qualitative aspects of performance.
Static Snapshot
The dataset captures only recent sales data and doesn't provide historical trends over
multiple years, so seasonal or quarterly comparisons are not possible.
No Cost or Profitability Metrics
The dataset focuses on revenue but does not account for costs, expenses, or profitability
of different departments.
What More Could Be Done with This Dataset?
Advanced Analytics & Machine Learning
1) Predictive modeling for employee performance based on historical sales and tenure.
2) Sales forecasting using time-series analysis.
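A first step towards time-series analysis is to aggregate sales by month and smooth the totals. The sketch below computes a three-row moving average with a window frame. It is a smoothing baseline under stated assumptions, not a forecasting model. It runs on SQLite via Python's sqlite3 module, the rows are invented, and sales_date is assumed to hold ISO date strings. In MySQL the month label would be built with DATE_FORMAT(sales_date, '%Y-%m') instead of strftime.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sales_date TEXT, sales_amount REAL);
-- Made-up rows for illustration only; not the report's data.
INSERT INTO sales VALUES
    ('2024-01-05', 100.0), ('2024-01-20', 200.0),
    ('2024-02-10', 600.0),
    ('2024-03-15', 300.0),
    ('2024-04-02', 900.0);
""")

QUERY = """
WITH monthly AS (
    SELECT strftime('%Y-%m', sales_date) AS month,
           SUM(sales_amount)             AS total
    FROM sales
    GROUP BY month
)
SELECT month, total,
       -- Average of the current row and the two rows before it.
       AVG(total) OVER (ORDER BY month
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM monthly
ORDER BY month;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The frame counts rows, not calendar months, so a month with no sales would be skipped without any warning. A calendar table joined to the monthly totals would be needed to handle such gaps.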
Employee Retention & HR Insights
1) Identifying risk factors for employee attrition based on salary, tenure, and performance
data.
2) Analyzing salary distribution fairness across job roles.
Sales & Marketing Optimization
1) Finding correlations between product sales and employee performance.
2) Analyzing which sales strategies yield the highest conversions.
Customer & Product Insights (If Data is Available)
1) Customer segmentation based on purchase history.
2) Identifying the most profitable products and their seasonal trends.