Active Users - Problem
Active Users Analytics

You're building a user engagement analytics system for a social media platform. Your task is to identify active users - those who demonstrate consistent engagement by logging in for 5 or more consecutive days.

Given two database tables:

Accounts Table: Contains user account information (id and name)
Logins Table: Contains login records (account id and login date)

Your goal is to write an SQL query that finds all users who have logged in for at least 5 consecutive days. Return their id and name, ordered by id.

Note: A user may log in multiple times per day, but we only care about the distinct login dates for consecutive day calculations.

Input & Output

example_1.sql โ€” Basic consecutive logins
$ Input: Accounts: [(1,'Winston'), (7,'Jonathan')] Logins: [(7,'2020-05-30'), (1,'2020-05-30'), (7,'2020-05-31'), (7,'2020-06-01'), (7,'2020-06-02'), (7,'2020-06-03'), (7,'2020-06-04'), (1,'2020-06-01')]
โ€บ Output: [(7,'Jonathan')]
๐Ÿ’ก Note: User 7 (Jonathan) logged in for 5 consecutive days from 2020-05-30 to 2020-06-03, making him an active user. User 1 (Winston) only has 2 separate login days, not meeting the 5 consecutive days requirement.
example_2.sql โ€” Multiple consecutive streaks
$ Input: Accounts: [(1,'Alice'), (2,'Bob'), (3,'Charlie')] Logins: [(1,'2023-01-01'), (1,'2023-01-02'), (1,'2023-01-03'), (1,'2023-01-04'), (1,'2023-01-05'), (2,'2023-01-01'), (2,'2023-01-03'), (2,'2023-01-05'), (3,'2023-01-10'), (3,'2023-01-11'), (3,'2023-01-12'), (3,'2023-01-13'), (3,'2023-01-14'), (3,'2023-01-15')]
โ€บ Output: [(1,'Alice'), (3,'Charlie')]
๐Ÿ’ก Note: Alice has 5 consecutive days (Jan 1-5), Charlie has 6 consecutive days (Jan 10-15). Bob has non-consecutive login days, so he doesn't qualify as an active user.
example_3.sql โ€” Edge case with duplicates
$ Input: Accounts: [(1,'John')] Logins: [(1,'2023-01-01'), (1,'2023-01-01'), (1,'2023-01-02'), (1,'2023-01-03'), (1,'2023-01-04'), (1,'2023-01-05'), (1,'2023-01-05')]
โ€บ Output: [(1,'John')]
๐Ÿ’ก Note: Even with duplicate login entries for the same day, John still has 5 consecutive unique login days (Jan 1-5), making him an active user. Duplicates are handled by using DISTINCT in the query.

Constraints

  • 1 โ‰ค Accounts table size โ‰ค 500
  • 1 โ‰ค Logins table size โ‰ค 1000
  • 1 โ‰ค id โ‰ค 500
  • All login_date values are valid dates
  • A user may have multiple login records for the same date

Visualization

Tap to expand
Raw Login DataUser 1: 2023-01-01User 1: 2023-01-01 (dup)User 1: 2023-01-02User 1: 2023-01-03User 1: 2023-01-04User 1: 2023-01-05User 1: 2023-01-07User 1: 2023-01-08DeduplicatedUser 1: 2023-01-01User 1: 2023-01-02User 1: 2023-01-03User 1: 2023-01-04User 1: 2023-01-05User 1: 2023-01-07User 1: 2023-01-08With DENSE_RANKRank 1: 2023-01-01Rank 2: 2023-01-02Rank 3: 2023-01-03Rank 4: 2023-01-04Rank 5: 2023-01-05Rank 6: 2023-01-07Rank 7: 2023-01-08Date GroupsGroup A: 2023-01-01Group A: 2023-01-01Group A: 2023-01-01Group A: 2023-01-01Group A: 2023-01-01Group B: 2023-01-02Group B: 2023-01-02๐ŸŽฏ Result AnalysisGroup A: 5 consecutive daysGroup B: 2 consecutive daysโœ… User 1 is ACTIVE (Group A โ‰ฅ 5 days)
Understanding the Visualization
1
Remove Duplicates
Get unique login dates per user (same person can't visit gym twice in one day)
2
Rank Dates
Apply DENSE_RANK to order login dates chronologically for each user
3
Group Consecutive
Calculate date groups by subtracting rank from date - consecutive dates get same group
4
Count & Filter
Count group sizes and keep users with 5+ consecutive login days
Key Takeaway
๐ŸŽฏ Key Insight: DENSE_RANK creates consistent group identifiers for consecutive date sequences, allowing efficient detection of active users in O(n log n) time
Asked in
Meta 45 Amazon 38 Google 32 Microsoft 25
73.8K Views
High Frequency
~15 min Avg. Time
1.8K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen