Active Users - Problem
Active Users Analytics
You're building a user engagement analytics system for a social media platform. Your task is to identify active users - those who demonstrate consistent engagement by logging in for 5 or more consecutive days.
Given two database tables:
Accounts Table: Contains user account information (id and name)
Logins Table: Contains login records (account id and login date)
Your goal is to write an SQL query that finds all users who have logged in for at least 5 consecutive days. Return their
Note: A user may log in multiple times per day, but we only care about the distinct login dates for consecutive day calculations.
You're building a user engagement analytics system for a social media platform. Your task is to identify active users - those who demonstrate consistent engagement by logging in for 5 or more consecutive days.
Given two database tables:
Accounts Table: Contains user account information (id and name)
Logins Table: Contains login records (account id and login date)
Your goal is to write an SQL query that finds all users who have logged in for at least 5 consecutive days. Return their
id and name, ordered by id.Note: A user may log in multiple times per day, but we only care about the distinct login dates for consecutive day calculations.
Input & Output
example_1.sql โ Basic consecutive logins
$
Input:
Accounts: [(1,'Winston'), (7,'Jonathan')]
Logins: [(7,'2020-05-30'), (1,'2020-05-30'), (7,'2020-05-31'), (7,'2020-06-01'), (7,'2020-06-02'), (7,'2020-06-03'), (7,'2020-06-04'), (1,'2020-06-01')]
โบ
Output:
[(7,'Jonathan')]
๐ก Note:
User 7 (Jonathan) logged in for 5 consecutive days from 2020-05-30 to 2020-06-03, making him an active user. User 1 (Winston) only has 2 separate login days, not meeting the 5 consecutive days requirement.
example_2.sql โ Multiple consecutive streaks
$
Input:
Accounts: [(1,'Alice'), (2,'Bob'), (3,'Charlie')]
Logins: [(1,'2023-01-01'), (1,'2023-01-02'), (1,'2023-01-03'), (1,'2023-01-04'), (1,'2023-01-05'), (2,'2023-01-01'), (2,'2023-01-03'), (2,'2023-01-05'), (3,'2023-01-10'), (3,'2023-01-11'), (3,'2023-01-12'), (3,'2023-01-13'), (3,'2023-01-14'), (3,'2023-01-15')]
โบ
Output:
[(1,'Alice'), (3,'Charlie')]
๐ก Note:
Alice has 5 consecutive days (Jan 1-5), Charlie has 6 consecutive days (Jan 10-15). Bob has non-consecutive login days, so he doesn't qualify as an active user.
example_3.sql โ Edge case with duplicates
$
Input:
Accounts: [(1,'John')]
Logins: [(1,'2023-01-01'), (1,'2023-01-01'), (1,'2023-01-02'), (1,'2023-01-03'), (1,'2023-01-04'), (1,'2023-01-05'), (1,'2023-01-05')]
โบ
Output:
[(1,'John')]
๐ก Note:
Even with duplicate login entries for the same day, John still has 5 consecutive unique login days (Jan 1-5), making him an active user. Duplicates are handled by using DISTINCT in the query.
Constraints
- 1 โค Accounts table size โค 500
- 1 โค Logins table size โค 1000
- 1 โค id โค 500
- All login_date values are valid dates
- A user may have multiple login records for the same date
Visualization
Tap to expand
Understanding the Visualization
1
Remove Duplicates
Get unique login dates per user (same person can't visit gym twice in one day)
2
Rank Dates
Apply DENSE_RANK to order login dates chronologically for each user
3
Group Consecutive
Calculate date groups by subtracting rank from date - consecutive dates get same group
4
Count & Filter
Count group sizes and keep users with 5+ consecutive login days
Key Takeaway
๐ฏ Key Insight: DENSE_RANK creates consistent group identifiers for consecutive date sequences, allowing efficient detection of active users in O(n log n) time
๐ก
Explanation
AI Ready
๐ก Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code