Description
SELECT
  x.variation_name as x_variation_name,
  y.variation_name as y_variation_name,
  COUNT(DISTINCT x.test_user_id) as users
FROM
  (SELECT
     test_user_id,
     variation_name,
     experiment_name
   FROM Experiment_Visitors
   WHERE experiment_name='x') x
JOIN
  (SELECT
     test_user_id,
     variation_name,
     experiment_name
   FROM Experiment_Visitors
   WHERE experiment_name='y') y
ON x.test_user_id = y.test_user_id
GROUP BY x.experiment_name, x.variation_name, y.experiment_name, y.variation_name;
Downloaded as CSV and imported into R, the data looks like this:
> result <- read_csv("Downloads/result.csv", col_types = cols(USERS = col_integer()))
> result
# A tibble: 8 × 3
  X_VARIATION_NAME Y_VARIATION_NAME USERS
  <chr>            <chr>            <int>
1 variation_1      variation_1      21944
2 variation_2      variation_4      14825
3 variation_2      variation_3      14883
4 variation_2      variation_1      14815
5 variation_1      variation_2      22108
6 variation_2      variation_2      14888
7 variation_1      variation_3      22247
8 variation_1      variation_4      22210
This is long (narrow) format, but chisq.test expects a wide contingency table, so I need to reshape it with acast (from the reshape2 package).
> t <- acast(result, Y_VARIATION_NAME ~ X_VARIATION_NAME)
Using USERS as value column: use value.var to override.
> t
            variation_1 variation_2
variation_1       21944       14815
variation_2       22108       14888
variation_3       22247       14883
variation_4       22210       14825
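If reshape2 is not installed, base R's xtabs should be able to build the same wide table. This is only a sketch: to keep it self-contained, the data frame below is typed in by hand from the eight rows shown above instead of being read from the CSV, and the name `wide` is just a placeholder.

```r
# Counts copied from the query result above (long format).
result <- data.frame(
  X_VARIATION_NAME = c("variation_1", "variation_2", "variation_2", "variation_2",
                       "variation_1", "variation_2", "variation_1", "variation_1"),
  Y_VARIATION_NAME = c("variation_1", "variation_4", "variation_3", "variation_1",
                       "variation_2", "variation_2", "variation_3", "variation_4"),
  USERS = c(21944L, 14825L, 14883L, 14815L, 22108L, 14888L, 22247L, 22210L)
)

# xtabs sums USERS within each combination of the two factors:
# rows are the variations of experiment y, columns those of experiment x.
wide <- xtabs(USERS ~ Y_VARIATION_NAME + X_VARIATION_NAME, data = result)
wide
```

The result is a table object rather than a plain matrix, which chisq.test also accepts.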
Then just use chisq.test as before.
> chisq.test(t)

	Pearson's Chi-squared test

data:  t
X-squared = 0.76795, df = 3, p-value = 0.8571
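To see where these numbers come from, the statistic can be recomputed by hand. A rough sketch, assuming the wide table is typed in as a matrix (the names `observed`, `expected`, `x_squared`, `df` and `p_value` are illustrative, not part of the original session):

```r
# Wide table from above: rows = experiment y, columns = experiment x.
# matrix() fills column by column, so the first four values form column 1.
observed <- matrix(
  c(21944, 22108, 22247, 22210,
    14815, 14888, 14883, 14825),
  nrow = 4,
  dimnames = list(paste0("variation_", 1:4), paste0("variation_", 1:2))
)

# Expected counts under independence: row total * column total / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic, degrees of freedom, and upper-tail p-value.
x_squared <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)
```

This should agree with the chisq.test output above. With a p-value this large, the test gives no evidence that a user's variation in experiment x depends on their variation in experiment y.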