CREATING THE USERS, EVENTS AND EMAIL_EVENTS TABLES.
# FOR USER TABLE
CREATE DATABASE JOB;
USE JOB;
-- Create the users table with six columns: user_id, created_at, company_id,
-- language, activated_at and state.
CREATE TABLE users (
    user_id INT,
    created_at VARCHAR(100),
    company_id INT,
    language VARCHAR(50),
    activated_at VARCHAR(100),
    state VARCHAR(50)
);
-- Load data from the CSV file located at
-- C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/users.csv into the users table.
LOAD DATA INFILE "C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/users.csv"
INTO TABLE users
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS;
SELECT * FROM users;
-- Add a new column temp_created_at with a DATETIME data type, fill it by
-- converting the created_at string to a datetime with the STR_TO_DATE function,
-- then drop the original string column and rename temp_created_at to created_at.
ALTER TABLE users ADD COLUMN temp_created_at DATETIME;
UPDATE users
SET temp_created_at = STR_TO_DATE(created_at, '%d-%m-%Y %H:%i');
ALTER TABLE users DROP COLUMN created_at;
ALTER TABLE users CHANGE COLUMN temp_created_at created_at DATETIME;
RESULT:- USER TABLE
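STR_TO_DATE yields NULL (or, depending on the SQL mode, an error) for any string that does not match the format, so a check along the following lines could be run after the UPDATE and before the DROP COLUMN step. This is only a sketch: the sample date literal and the aliases parsed_example and unparsed_rows are illustrative assumptions, not values from the dataset.

```sql
-- Confirm that the format string parses a day-month-year value as expected.
-- The literal below is a made-up sample, not a row from users.csv.
SELECT STR_TO_DATE('31-12-2014 23:59', '%d-%m-%Y %H:%i') AS parsed_example;

-- Count rows whose created_at string failed to convert.
-- Run this while both created_at (string) and temp_created_at still exist.
SELECT COUNT(*) AS unparsed_rows
FROM users
WHERE created_at IS NOT NULL
  AND temp_created_at IS NULL;
```

If unparsed_rows is greater than zero, the original column probably should not be dropped until the format string has been checked against the failing rows.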
#FOR EVENTS TABLE
-- Create the events table with columns user_id, occurred_at, event_type,
-- event_name, location, device and user_type.
CREATE TABLE events (
    user_id INT NOT NULL,
    occurred_at VARCHAR(100) NOT NULL,
    event_type VARCHAR(100),
    event_name VARCHAR(100),
    location VARCHAR(100),
    device VARCHAR(100),
    user_type VARCHAR(100)
);
-- Load data from the CSV file located at
-- C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/events.csv into the events table.
LOAD DATA INFILE "C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/events.csv"
INTO TABLE events
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS;
SELECT * FROM events;
ALTER TABLE events ADD COLUMN temp_occurred_at datetime;
UPDATE events
SET temp_occurred_at = STR_TO_DATE(occurred_at, '%d-%m-%Y %H.%i');
ALTER TABLE events DROP COLUMN occurred_at;
ALTER TABLE events CHANGE COLUMN temp_occurred_at occurred_at datetime;
RESULT:- EVENTS TABLE
# FOR EMAIL_EVENTS TABLE
-- Create the email_events table with columns user_id, occurred_at, action
-- and user_type.
CREATE TABLE email_events (
    user_id INT NOT NULL,
    occurred_at VARCHAR(100),
    action VARCHAR(100),
    user_type INT
);
-- Load data from the CSV file located at
-- C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/email_events.csv into the
-- email_events table.
LOAD DATA INFILE "C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/email_events.csv"
INTO TABLE email_events
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS;
SELECT * FROM email_events;
ALTER TABLE email_events ADD COLUMN temp_occurred_at datetime;
UPDATE email_events SET temp_occurred_at = STR_TO_DATE(occurred_at, '%d-%m-%Y %H:%i');
ALTER TABLE email_events
DROP COLUMN occurred_at;
ALTER TABLE email_events CHANGE COLUMN temp_occurred_at occurred_at datetime;
RESULT:- EMAIL_EVENTS TABLE
TASK A:- Calculate the weekly user engagement.
-- DATE_FORMAT is used to extract the week: the '%Y-%u' format returns the
-- year and the week number.
-- The query counts the unique users who were active and the number of events
-- during each week.
-- The results are grouped by week and ordered by week.
SELECT
    DATE_FORMAT(e.occurred_at, '%Y-%u') AS week,
    COUNT(DISTINCT e.user_id) AS active_users,
    COUNT(e.event_type) AS total_events
FROM
    events e
GROUP BY
    week
ORDER BY
    week;
RESULT:- Weekly user engagement.
INSIGHTS:-
After analyzing these results, I can identify trends and patterns in user engagement:
Which weeks are the busiest for users.
Whether user activity changes with the seasons.
How the number of active users compares with the total number of actions.
Large jumps or drops in user activity.
Which weeks had the highest user engagement.
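One detail worth noting about the weekly grouping: in MySQL, '%Y' is the calendar year while '%u' is a Monday-based week number, so days around New Year can be split across two labels. A possible variant, shown here only as a sketch, pairs '%x' (the year that belongs to the week) with '%v' (the matching week number). The alias events_per_user is an illustrative addition and not part of the original query.

```sql
-- Variant of the weekly engagement query that keeps each Monday-based week
-- under a single label, even when it straddles a year boundary.
SELECT
    DATE_FORMAT(e.occurred_at, '%x-%v') AS week,
    COUNT(DISTINCT e.user_id) AS active_users,
    COUNT(e.event_type) AS total_events,
    -- Average number of events per active user in the week (illustrative extra metric).
    ROUND(COUNT(e.event_type) / COUNT(DISTINCT e.user_id), 2) AS events_per_user
FROM
    events e
GROUP BY
    week
ORDER BY
    week;
```

The extra ratio gives a direct way to compare active users with total actions for each week.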
TASK B:- Calculate the user growth for the product.
-- The DATE_FORMAT function is used to extract the year and month using the
-- '%Y-%m' format.
-- The query counts the number of new users for each month.
-- The results are grouped by month and ordered by month.
SELECT
    DATE_FORMAT(created_at, '%Y-%m') AS month,
    COUNT(user_id) AS new_users
FROM
    users
GROUP BY
    month
ORDER BY
    month;
RESULT:- The user growth for the product.
INSIGHTS:-
We can analyze increasing or decreasing trends in new users.
We can check whether the growth rate is steady or fluctuating.
We can identify months with higher or lower numbers of sign-ups.
By counting the number of new users for each month, we can see how many new
users are joining the product over time.
By analyzing the new user count for each month, we can identify any new patterns in
user growth.
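To make the growth rate itself visible, the monthly counts could be extended with window functions, which MySQL 8 supports. The following is a minimal sketch under that assumption; the CTE name monthly and the aliases cumulative_users and growth_pct are hypothetical names introduced for illustration.

```sql
-- Monthly new users, running total, and month-over-month growth in percent.
WITH monthly AS (
    SELECT
        DATE_FORMAT(created_at, '%Y-%m') AS month,
        COUNT(user_id) AS new_users
    FROM
        users
    GROUP BY
        month
)
SELECT
    month,
    new_users,
    -- Running total of users who have signed up so far.
    SUM(new_users) OVER (ORDER BY month) AS cumulative_users,
    -- Change versus the previous month; NULL for the first month.
    ROUND(
        (new_users - LAG(new_users) OVER (ORDER BY month))
        / LAG(new_users) OVER (ORDER BY month) * 100,
        2
    ) AS growth_pct
FROM
    monthly
ORDER BY
    month;
```

A steady growth_pct suggests stable acquisition, while large swings point to months that deserve a closer look.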
TASK C:- Calculate the weekly retention of users based on their sign-up cohort.
-- user_cohorts selects the user_id and the week of created_at from the users
-- table. The WEEK function is used to extract the week number from the
-- created_at date.
WITH
user_cohorts AS (
    SELECT
        user_id,
        WEEK(created_at) AS signup_week
    FROM
        users
),
-- user_active_weeks joins user_cohorts with the events table on the user_id
-- column, then extracts the user_id and the week of activity from the events
-- table and groups by both columns.
user_active_weeks AS (
    SELECT
        ue.user_id,
        WEEK(e.occurred_at) AS active_week
    FROM
        user_cohorts ue
    JOIN
        events e ON ue.user_id = e.user_id
    GROUP BY
        ue.user_id,
        WEEK(e.occurred_at)
)
-- The final query joins the two CTEs on the user_id column, keeping only
-- activity in the week after sign-up, groups the results by signup_week and
-- counts the distinct number of active users for each signup week.
-- The results are ordered by signup_week.
SELECT
    uc.signup_week,
    COUNT(DISTINCT ua.user_id) AS active_users,
    COUNT(DISTINCT uc.user_id) AS total_users,
    ROUND(COUNT(DISTINCT ua.user_id) / COUNT(DISTINCT uc.user_id) * 100, 2) AS retention_rate
FROM
    user_cohorts uc
LEFT JOIN
    user_active_weeks ua ON uc.user_id = ua.user_id
    AND uc.signup_week = ua.active_week - 1
GROUP BY
    uc.signup_week
ORDER BY
    uc.signup_week;
RESULT:- Weekly retention of users based on their sign-up cohort.
FIG.1, FIG.2, FIG.3
INSIGHTS:-
We can analyze which sign-up weeks were more successful, for example because of
marketing campaigns, new feature releases, or seasonal trends that boosted user
engagement.
We can see the week in which each user created their account.
By analyzing the number of active users for each signup week, we can
identify patterns in user retention.
We can identify the weeks with the highest number of active users.
We can compare the number of active users in each week to the total number of
users who signed up in that week.
We can analyze the weeks with the lowest number of active users and identify
areas where we can improve the user experience.
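The retention query above only checks activity in the single week after sign-up, and WEEK() on its own does not distinguish between years. A fuller cohort table could be sketched as follows, counting active users by the number of whole weeks elapsed since each user's sign-up. This is an alternative formulation offered as an assumption-laden sketch, not the method used for the results shown; the alias weeks_since_signup is hypothetical.

```sql
-- Cohort table: for each sign-up week, how many distinct users were active
-- 0, 1, 2, ... weeks after they signed up.
SELECT
    YEARWEEK(u.created_at, 1) AS signup_week,
    -- Whole weeks between sign-up and the event (DATEDIFF returns days).
    FLOOR(DATEDIFF(e.occurred_at, u.created_at) / 7) AS weeks_since_signup,
    COUNT(DISTINCT e.user_id) AS active_users
FROM
    users u
JOIN
    events e ON e.user_id = u.user_id
WHERE
    e.occurred_at >= u.created_at
GROUP BY
    signup_week,
    weeks_since_signup
ORDER BY
    signup_week,
    weeks_since_signup;
```

Dividing each active_users value by the cohort's count at weeks_since_signup = 0 would give a retention percentage per elapsed week.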
TASK D:- Calculate the weekly engagement per device.
-- This uses the WEEK function to extract the week from the occurred_at
-- datetime column. It then groups the results by device and week and counts
-- the number of events for each group, then orders the results by device
-- and week.
SELECT
    device,
    WEEK(occurred_at) AS week,
    COUNT(*) AS engagement
FROM
    events
GROUP BY
    device,
    WEEK(occurred_at)
ORDER BY
    device,
    week;
RESULT:- Weekly engagement per device.
FIG.1 to FIG.12
INSIGHTS:-
We can analyze which device is most popular among the users.
We can see any specific week when interaction is high or low.
We can find devices that show consistent engagement across weeks.
We can find devices that perform poorly in particular weeks.
We can gain insights into how users interact with our platform and identify opportunities
to improve the user experience.
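To answer the question of which device is most popular directly, the per-week rows can be collapsed into one row per device. The sketch below assumes the same events table; the aliases total_events and active_users are illustrative.

```sql
-- Overall popularity of each device: total events and distinct users,
-- most-used device first.
SELECT
    device,
    COUNT(*) AS total_events,
    COUNT(DISTINCT user_id) AS active_users
FROM
    events
GROUP BY
    device
ORDER BY
    total_events DESC;
```

Reading this summary next to the weekly breakdown helps separate devices that are popular overall from devices that are only busy in certain weeks.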
TASK E:- Calculate the email engagement metrics.
-- unique_users shows the number of users who were targeted by the emails.
-- total_emails counts every row in email_events, taken here as the total
-- number of emails sent.
-- opens, clicks, bounces and unsubs show how users interacted with the emails.
-- open_rate, click_rate, bounce_rate and unsub_rate express each of those
-- counts as a percentage of total_emails.
SELECT
    COUNT(DISTINCT user_id) AS unique_users,
    COUNT(*) AS total_emails,
    COUNT(CASE WHEN action = 'open' THEN 1 END) AS opens,
    COUNT(CASE WHEN action = 'click' THEN 1 END) AS clicks,
    COUNT(CASE WHEN action = 'bounce' THEN 1 END) AS bounces,
    COUNT(CASE WHEN action = 'unsubscribe' THEN 1 END) AS unsubs,
    ROUND((COUNT(CASE WHEN action = 'open' THEN 1 END) / COUNT(*)) * 100, 2) AS open_rate,
    ROUND((COUNT(CASE WHEN action = 'click' THEN 1 END) / COUNT(*)) * 100, 2) AS click_rate,
    ROUND((COUNT(CASE WHEN action = 'bounce' THEN 1 END) / COUNT(*)) * 100, 2) AS bounce_rate,
    ROUND((COUNT(CASE WHEN action = 'unsubscribe' THEN 1 END) / COUNT(*)) * 100, 2) AS unsub_rate
FROM
    email_events;
RESULT:- Email engagement metrics.
INSIGHTS:-
We can check whether the emails reach the users or not.
We can check whether emails are bouncing or being marked as spam.
We can check whether users are unsubscribing from the emails.
We can check whether there are any trends or patterns in user engagement over time.
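The query above returns a single summary row, so trends over time are not visible from it alone. One way to look for them, sketched here under the assumption that occurred_at has already been converted to a datetime as described earlier, is to group the same counts by week. The action values 'open' and 'click' are taken from the query above; the alias total_email_events is illustrative.

```sql
-- Weekly breakdown of email activity to look for trends over time.
SELECT
    DATE_FORMAT(occurred_at, '%x-%v') AS week,
    COUNT(*) AS total_email_events,
    COUNT(CASE WHEN action = 'open' THEN 1 END) AS opens,
    COUNT(CASE WHEN action = 'click' THEN 1 END) AS clicks,
    ROUND(COUNT(CASE WHEN action = 'open' THEN 1 END) / COUNT(*) * 100, 2) AS open_rate
FROM
    email_events
GROUP BY
    week
ORDER BY
    week;
```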
Project Description
The goal of this project is to analyze user engagement, growth, and retention for a
product, along with specific email engagement metrics, by running SQL queries on the
given tables. The analysis focuses on weekly user engagement, user growth, weekly
retention, weekly engagement per device, and email engagement.
APPROACH
To carry out the analysis, I used DB Gate, a database management tool, to write and
execute SQL queries on the provided tables users, events, and email_events. I
followed a structured approach to complete every task.
1. Weekly user engagement: I wrote a query to calculate the number of active
users and the total number of events per week.
2. User growth analysis: I created a query to calculate the number of new users
signing up each month.
3. Weekly retention: I developed a query to analyze the retention of users based
on their sign-up cohort.
4. Weekly engagement per device: I wrote a query to calculate the number of events
per device per week.
5. Email engagement: I created a query to analyze email engagement metrics, such as
the number of emails sent, opened, and clicked.
Tech-Stack Used
I used MySQL Community Server version 8.4.0 with DB Gate, which I used for querying
the database and managing data. I prefer DB Gate because I find its interface
user-friendly for creating tables and for writing and executing SQL queries.
Insights
Weekly User Engagement: The engagement data shows active users on a
weekly basis. Patterns can be observed to identify peak activity periods.
User Growth: The growth analysis highlights periods of high user acquisition,
which can correlate with marketing campaigns or product updates.
Retention Rates: Understanding retention helps identify how well the product
maintains its user base over time.
Engagement per device: This reveals which devices users prefer.
Email Engagement: Email metrics help in understanding the effectiveness of
email campaigns.
Result
Through this project I gained a deeper understanding of user behavior and
engagement with the product. The project has contributed to my understanding of the
importance of analyzing user behavior and has equipped me with the skills to
execute such analyses using SQL and database management tools.