Delete Duplicate Emails - Problem
Email Database Cleanup Challenge

You're working as a database administrator for a company that has been collecting user emails for years. Over time, the same email addresses have been registered multiple times with different IDs, creating duplicates in your Person table.

The Problem: Clean up the database by removing duplicate emails, but keep the record with the smallest ID (the oldest registration).

Table Structure:
Person
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| email       | varchar |
+-------------+---------+

Your Task: Write a DELETE statement (not SELECT) that removes all duplicate emails, preserving only the entry with the smallest id for each unique email.

Note: This is a database manipulation problem - you're actually modifying the table structure, not just querying it!

Input & Output

example_1.sql โ€” Basic Duplicate Removal
$ Input: Person table: +----+------------------+ | id | email | +----+------------------+ | 1 | [email protected] | | 2 | [email protected] | | 3 | [email protected] | +----+------------------+
โ€บ Output: Person table after deletion: +----+------------------+ | id | email | +----+------------------+ | 1 | [email protected] | | 2 | [email protected] | +----+------------------+
๐Ÿ’ก Note: The row with id=3 is deleted because it's a duplicate of [email protected] with a higher ID than the existing row (id=1).
example_2.sql โ€” Multiple Duplicates
$ Input: Person table: +----+------------------+ | id | email | +----+------------------+ | 1 | [email protected]| | 2 | [email protected] | | 3 | [email protected]| | 4 | [email protected] | | 5 | [email protected]| +----+------------------+
โ€บ Output: Person table after deletion: +----+------------------+ | id | email | +----+------------------+ | 1 | [email protected]| | 2 | [email protected] | | 4 | [email protected] | +----+------------------+
๐Ÿ’ก Note: Rows with id=3 and id=5 are deleted because they are duplicates of [email protected]. Only the record with the smallest ID (id=1) is kept.
example_3.sql โ€” No Duplicates
$ Input: Person table: +----+------------------+ | id | email | +----+------------------+ | 1 | [email protected] | | 2 | [email protected] | | 3 | [email protected] | +----+------------------+
โ€บ Output: Person table after deletion: +----+------------------+ | id | email | +----+------------------+ | 1 | [email protected] | | 2 | [email protected] | | 3 | [email protected] | +----+------------------+
๐Ÿ’ก Note: No rows are deleted because all email addresses are unique. The table remains unchanged.

Constraints

  • 1 โ‰ค Person table rows โ‰ค 104
  • All emails are in lowercase format
  • id is unique and serves as the primary key
  • Email length: 1 โ‰ค email.length โ‰ค 100

Visualization

Tap to expand
Email Deduplication VisualizationBefore: Duplicate EmailsID: 1[email protected]ID: 2[email protected]ID: 3[email protected]DELETEAfter: Unique Emails OnlyID: 1[email protected]ID: 2[email protected]SQL QueryDELETE p1 FROM Person p1, Person p2WHERE p1.email = p2.emailAND p1.id > p2.id;๐ŸŽฏ Key Insight: Self-join allows efficient comparison and deletion in one operation
Understanding the Visualization
1
Identify Duplicates
Find all email addresses that appear multiple times in the database
2
Compare IDs
For each duplicate email, compare the ID numbers to find the smallest (oldest)
3
Mark for Deletion
Mark all duplicate entries with larger IDs for removal
4
Execute Deletion
Remove the marked records, leaving only unique emails with smallest IDs
Key Takeaway
๐ŸŽฏ Key Insight: Using a self-join with DELETE allows us to compare each row with all other rows having the same email and remove duplicates with higher IDs in a single, efficient database operation.
Asked in
Google 25 Amazon 18 Microsoft 15 Meta 12
28.5K Views
Medium Frequency
~15 min Avg. Time
945 Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen