SQL Query to Delete Duplicate Rows

Last Updated : 06 Nov, 2025

Duplicate rows in a database can cause inconsistent results and affect performance. Removing them helps maintain data accuracy and efficiency.

  • Duplicates are usually caused by import errors or missing constraints.
  • They waste storage and slow queries down.
  • They can be found with GROUP BY and COUNT(), and removed with techniques such as MIN() or ROW_NUMBER().

Example: First, we create a demo table, Employees, with the columns EmployeeID, Name, and Department. The duplicate-deletion queries in this article run against it.
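
The demo table itself did not survive in this copy of the article, so here is a minimal sketch of one. It assumes an in-memory SQLite database driven through Python's standard sqlite3 module; the column names come from the queries in this article, while the column types and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used only as a convenient sandbox (an assumption;
# the article does not say which database engine it targets).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,  -- generated automatically on insert
        Name       TEXT NOT NULL,
        Department TEXT NOT NULL
    )
    """
)

# Hypothetical sample rows; two Name/Department pairs appear more than once.
rows = [
    ("Alice", "HR"),
    ("Bob", "IT"),
    ("Alice", "HR"),     # duplicate of the first row
    ("Carol", "Sales"),
    ("Bob", "IT"),       # duplicate of the second row
]
conn.executemany("INSERT INTO Employees (Name, Department) VALUES (?, ?)", rows)
conn.commit()

total = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
distinct_pairs = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT Name, Department FROM Employees)"
).fetchone()[0]
print(total, distinct_pairs)  # 5 rows in total, 3 distinct Name/Department pairs
```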

Query:

DELETE FROM Employees
WHERE EmployeeID NOT IN (
    SELECT MIN(EmployeeID)
    FROM Employees
    GROUP BY Name, Department
);

Identify Duplicate Rows

Use the GROUP BY clause with the COUNT(*) function to find groups of rows that share the same values.

Query:

SELECT Name, Department, COUNT(*) AS Occurrences
FROM Employees
GROUP BY Name, Department
HAVING COUNT(*) > 1;

Explanation:

  • GROUP BY Name, Department: groups rows by employee Name and Department.
  • COUNT(*): counts how many rows are in each group.
  • HAVING COUNT(*) > 1: keeps only the groups that contain more than one row (i.e., duplicates).
  • Result: lists Name and Department combinations that appear multiple times.
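
To see the steps above in action, the following sketch runs the same GROUP BY / HAVING query against a small in-memory SQLite table through Python's sqlite3 module. The sample rows, the Occurrences alias, and the ORDER BY (added only to make the output order predictable) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY, Name TEXT, Department TEXT)"
)
# Hypothetical data: Alice/HR appears twice, Bob/IT three times.
conn.executemany(
    "INSERT INTO Employees (Name, Department) VALUES (?, ?)",
    [
        ("Alice", "HR"),
        ("Bob", "IT"),
        ("Alice", "HR"),
        ("Carol", "Sales"),
        ("Bob", "IT"),
        ("Bob", "IT"),
    ],
)

# One result row per duplicated Name/Department pair, with its row count.
duplicates = conn.execute(
    """
    SELECT Name, Department, COUNT(*) AS Occurrences
    FROM Employees
    GROUP BY Name, Department
    HAVING COUNT(*) > 1
    ORDER BY Name
    """
).fetchall()
print(duplicates)  # [('Alice', 'HR', 2), ('Bob', 'IT', 3)]
```

Carol/Sales does not appear in the result because its group contains a single row.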

Methods to Delete Duplicate Rows in SQL

There are several ways to delete duplicate rows in SQL. Here, we will explain five methods to handle this task effectively.

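1. Using GROUP BY and MIN()

Use the GROUP BY clause along with MIN(EmployeeID) to retain one row for each duplicate group. The subquery returns the lowest EmployeeID for each Name and Department combination, and the DELETE removes every row whose EmployeeID is not in that set. Note that MySQL rejects a DELETE whose subquery reads directly from the table being deleted from (error 1093); in MySQL, wrap the subquery in a derived table.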

Query:

DELETE FROM Employees
WHERE EmployeeID NOT IN (
SELECT MIN(EmployeeID)
FROM Employees
GROUP BY Name, Department
);

SELECT * FROM Employees;
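
As a quick check of what this query keeps, here is a sketch that runs it on a small in-memory SQLite table via Python's sqlite3 module. The sample rows are hypothetical; SQLite is assumed only because it needs no setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY, Name TEXT, Department TEXT)"
)
# Hypothetical data; EmployeeID is assigned 1..5 in insertion order.
conn.executemany(
    "INSERT INTO Employees (Name, Department) VALUES (?, ?)",
    [("Alice", "HR"), ("Bob", "IT"), ("Alice", "HR"), ("Carol", "Sales"), ("Bob", "IT")],
)

# Keep the lowest EmployeeID in each Name/Department group, delete the rest.
cursor = conn.execute(
    """
    DELETE FROM Employees
    WHERE EmployeeID NOT IN (
        SELECT MIN(EmployeeID)
        FROM Employees
        GROUP BY Name, Department
    )
    """
)
deleted = cursor.rowcount
conn.commit()

remaining = conn.execute(
    "SELECT EmployeeID, Name, Department FROM Employees ORDER BY EmployeeID"
).fetchall()
print(deleted)    # 2
print(remaining)  # [(1, 'Alice', 'HR'), (2, 'Bob', 'IT'), (4, 'Carol', 'Sales')]
```

Rows 3 and 5 are removed because rows 1 and 2 already hold the lowest EmployeeID for their groups.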

2. Using ROW_NUMBER()

The ROW_NUMBER() function provides a more elegant and flexible solution. This window function assigns a unique number to each row within a partition (group of duplicates). We can delete rows where the row number is greater than 1.

Query:

WITH CTE AS (
SELECT EmployeeID, Name, Department,
ROW_NUMBER() OVER (PARTITION BY Name, Department ORDER BY EmployeeID) AS RowNum
FROM Employees
)
DELETE FROM Employees
WHERE EmployeeID IN (SELECT EmployeeID FROM CTE WHERE RowNum > 1);
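
The flexibility mentioned above comes from the ORDER BY inside the window: it decides which copy receives row number 1 and therefore survives. The sketch below is a variation on the query, not the query itself: it sorts by EmployeeID DESC so that the most recently inserted copy is kept. It assumes Python's sqlite3 module with an SQLite build that supports window functions (3.25 or later), and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY, Name TEXT, Department TEXT)"
)
# Hypothetical data; EmployeeID is assigned 1..5 in insertion order.
conn.executemany(
    "INSERT INTO Employees (Name, Department) VALUES (?, ?)",
    [("Alice", "HR"), ("Bob", "IT"), ("Alice", "HR"), ("Carol", "Sales"), ("Bob", "IT")],
)

# ORDER BY EmployeeID DESC gives RowNum = 1 to the highest ID in each group,
# so the older copies (RowNum > 1) are the ones deleted.
conn.execute(
    """
    WITH CTE AS (
        SELECT EmployeeID,
               ROW_NUMBER() OVER (
                   PARTITION BY Name, Department
                   ORDER BY EmployeeID DESC
               ) AS RowNum
        FROM Employees
    )
    DELETE FROM Employees
    WHERE EmployeeID IN (SELECT EmployeeID FROM CTE WHERE RowNum > 1)
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT EmployeeID, Name, Department FROM Employees ORDER BY EmployeeID"
).fetchall()
print(remaining)  # [(3, 'Alice', 'HR'), (4, 'Carol', 'Sales'), (5, 'Bob', 'IT')]
```

With the original ORDER BY EmployeeID (ascending), rows 1, 2, and 4 would be kept instead.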

3. Using Common Table Expressions (CTEs)

Using a Common Table Expression (CTE), we can delete duplicates in a more structured way. CTEs provide a cleaner approach by allowing us to define a temporary result set that can be referenced within the DELETE statement. This method can be more readable and maintainable, especially when dealing with complex queries.

Query:

WITH CTE AS (
SELECT EmployeeID,
ROW_NUMBER() OVER (PARTITION BY Name, Department ORDER BY EmployeeID) AS RowNum
FROM Employees
)
DELETE FROM Employees
WHERE EmployeeID IN (
SELECT EmployeeID
FROM CTE
WHERE RowNum > 1
);

4. Using Temporary Tables

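You can create a temporary table to hold unique records and then replace the contents of the original table with the clean data. Because only Name and Department are copied, the original EmployeeID values are not preserved; the query assumes EmployeeID is generated automatically on insert.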

Steps:

  1. Insert unique rows into a temporary table.
  2. Empty the original table (the query below uses DELETE rather than TRUNCATE).
  3. Insert the unique rows back.

Query:

DROP TEMPORARY TABLE IF EXISTS DistinctEmployees;

CREATE TEMPORARY TABLE DistinctEmployees AS
SELECT DISTINCT Name, Department
FROM Employees;

DELETE FROM Employees;

INSERT INTO Employees (Name, Department)
SELECT Name, Department
FROM DistinctEmployees;

DROP TEMPORARY TABLE DistinctEmployees;
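
Because this method empties the original table before refilling it, it is worth running the DELETE and the INSERT in a single transaction so that a failure in between does not lose data. The sketch below shows one way to do that, assuming SQLite through Python's sqlite3 module. SQLite spells the statements CREATE TEMP TABLE and DROP TABLE instead of the MySQL-style TEMPORARY keywords used above, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY, Name TEXT, Department TEXT)"
)
conn.executemany(
    "INSERT INTO Employees (Name, Department) VALUES (?, ?)",
    [("Alice", "HR"), ("Bob", "IT"), ("Alice", "HR"), ("Carol", "Sales"), ("Bob", "IT")],
)
conn.commit()

# "with conn" commits on success and rolls back if any statement raises,
# so the table is never left empty by a failed INSERT.
with conn:
    # Step 1: copy the unique Name/Department pairs into a temporary table.
    conn.execute(
        "CREATE TEMP TABLE DistinctEmployees AS "
        "SELECT DISTINCT Name, Department FROM Employees"
    )
    # Step 2: empty the original table.
    conn.execute("DELETE FROM Employees")
    # Step 3: insert the unique rows back; EmployeeID values are regenerated.
    conn.execute(
        "INSERT INTO Employees (Name, Department) "
        "SELECT Name, Department FROM DistinctEmployees"
    )
    conn.execute("DROP TABLE DistinctEmployees")

pairs = sorted(conn.execute("SELECT Name, Department FROM Employees").fetchall())
ids = [row[0] for row in conn.execute("SELECT EmployeeID FROM Employees")]
print(pairs)  # [('Alice', 'HR'), ('Bob', 'IT'), ('Carol', 'Sales')]
```

The new EmployeeID values need not match the old ones, which matters if other tables reference them.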

5. Using DISTINCT with INSERT INTO

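You can use DISTINCT to select only the unique rows, insert them into a new table, and then swap that table in for the original. The unique rows must be copied before the original data is removed, and a CTE cannot hold them across statements because it exists only for the single statement that defines it. The query below uses MySQL syntax; EmployeesClean is an illustrative table name, and EmployeeID is assumed to be generated automatically.

Query:

-- Create an empty table with the same structure as Employees.
CREATE TABLE EmployeesClean LIKE Employees;

-- Copy one row per unique Name/Department pair.
INSERT INTO EmployeesClean (Name, Department)
SELECT DISTINCT Name, Department
FROM Employees;

-- Replace the original table with the deduplicated one.
DROP TABLE Employees;
RENAME TABLE EmployeesClean TO Employees;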

Why You Should Remove Duplicate Rows

  1. Data Integrity: Duplicates can distort reports and analyses, leading to incorrect insights.
  2. Optimal Performance: Redundant data can slow down queries, especially when dealing with large datasets.
  3. Efficient Storage: Removing duplicates helps optimize storage usage, keeping your database lean.
