Example of an artificial experiment with CUPED transformation
[2]:
import numpy as np
import pandas as pd
from tqdm.notebook import tqdm
from ambrosia.designer import Designer
from ambrosia.splitter import Splitter
from ambrosia.tester import Tester
from ambrosia.preprocessing import Cuped
Load data
[3]:
dataframe = pd.read_csv('../tests/test_data/kion_data.csv', sep=';')
dataframe.head()
[3]:
| | profile_id | sum_dur | vod_cnt | ln_vod_cnt | bin_col |
|---|---|---|---|---|---|
| 0 | 99402893794 | 20104282 | 83 | 5.533356 | 1 |
| 1 | 878511937265 | 3986136 | 53 | 4.807294 | 1 |
| 2 | 998929369788 | 2063965 | 22 | 3.187069 | 1 |
| 3 | 265028786131 | 523539 | 14 | 2.679252 | 1 |
| 4 | 995182338752 | 1588224 | 19 | 4.177776 | 1 |
Apply the CUPED transformation to the metric of interest
[4]:
transformer = Cuped()
transformed = transformer.fit_transform(dataframe,
                                        target_column='ln_vod_cnt',
                                        covariate_column='sum_dur')
ambrosia LOGGER: After transformation СUPED for ln_vod_cnt, the variance is 86.0748 % of the original
ambrosia LOGGER: Variance transformation 2.1849 ===> 1.8806
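To make the variance reduction reported in the log less of a black box, here is a minimal sketch of the textbook CUPED adjustment on synthetic data. It assumes the usual form `y - theta * (x - mean(x))` with `theta = cov(x, y) / var(x)`; whether `Cuped` in ambrosia computes it in exactly this way is an assumption, and the function name `cuped_transform` and the synthetic arrays are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Synthetic pre-experiment covariate and a correlated target metric
covariate = rng.normal(size=n)
target = 0.4 * covariate + rng.normal(size=n)


def cuped_transform(y, x):
    """Textbook CUPED adjustment (illustrative, not ambrosia's code)."""
    theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())


adjusted = cuped_transform(target, covariate)

# Share of the original variance that remains after the adjustment
variance_ratio = adjusted.var(ddof=1) / target.var(ddof=1)
# Sample correlation between the covariate and the target
rho = np.corrcoef(covariate, target)[0, 1]

print(f"variance ratio: {variance_ratio:.4f}")
print(f"1 - rho^2:      {1 - rho ** 2:.4f}")
```

Under this form the remaining variance share equals `1 - rho^2` and the mean of the metric is unchanged, so a stronger correlation between the covariate and the target gives a larger reduction.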
Design an abstract experiment for original and transformed metrics
[5]:
designer = Designer(transformed, effects=1.05)
designer_info = designer.run(to_design='size',
                             method='theory',
                             metrics=['ln_vod_cnt', 'ln_vod_cnt_transformed'])
[6]:
designer_info['ln_vod_cnt']
[6]:
| Effect | Errors ($\alpha$, $\beta$) = (0.05; 0.2) |
|---|---|
| 5.0% | 3106 |
[7]:
designer_info['ln_vod_cnt_transformed']
[7]:
| Effect | Errors ($\alpha$, $\beta$) = (0.05; 0.2) |
|---|---|
| 5.0% | 2674 |
After the metric transformation, the same experiment requires 2674 ids instead of 3106, that is, 432 fewer (about 14%).
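The two designed sizes are consistent with the variance reduction logged earlier. As a rough check, assuming the `'theory'` method relies on a formula in which the required size is proportional to the metric variance for a fixed effect and fixed errors (an assumption about ambrosia, not something stated here), the numbers can be compared directly:

```python
# Values taken from the outputs above
original_size = 3106        # designed size for ln_vod_cnt
reported_size = 2674        # designed size for ln_vod_cnt_transformed
variance_share = 0.860748   # variance after CUPED as a share of the original

# If the size scales linearly with the variance, this should be close
# to the size reported for the transformed metric
expected_size = original_size * variance_share

saved = original_size - reported_size
saved_percent = 100 * saved / original_size

print(f"expected size after CUPED: {expected_size:.1f}")
print(f"ids saved: {saved} ({saved_percent:.1f}%)")
```

The expected value lands within one id of the reported 2674, which is what linear scaling with the variance would predict.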
Save the transformation parameters for use after the experiment is completed
[8]:
transformer.store_params('_examples_configs/kion_cuped_params.json')
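The reason for storing the parameters is that the adjustment should be fitted once on pre-experiment data and then applied unchanged to the experiment data. The sketch below illustrates that idea with plain JSON. The keys `theta` and `covariate_mean`, the helper names, and the temporary file are hypothetical and are not the format that `store_params` writes.

```python
import json
import os
import tempfile

import numpy as np


def fit_cuped_params(y, x):
    """Estimate the adjustment on pre-experiment data (illustrative)."""
    theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return {"theta": float(theta), "covariate_mean": float(np.mean(x))}


def apply_cuped_params(y, x, params):
    """Apply previously fitted parameters without refitting."""
    return y - params["theta"] * (x - params["covariate_mean"])


rng = np.random.default_rng(1)

# Fit on synthetic "pre-experiment" data
x_pre = rng.normal(size=5_000)
y_pre = 0.5 * x_pre + rng.normal(size=5_000)
params = fit_cuped_params(y_pre, x_pre)

# Persist the parameters and load them back, as one would after the experiment
path = os.path.join(tempfile.mkdtemp(), "cuped_params.json")
with open(path, "w") as f:
    json.dump(params, f)
with open(path) as f:
    loaded = json.load(f)

# Apply the stored parameters to synthetic "experiment" data
x_exp = rng.normal(size=1_000)
y_exp = 0.5 * x_exp + rng.normal(size=1_000)
y_exp_adjusted = apply_cuped_params(y_exp, x_exp, loaded)
```

Because the same stored coefficient is applied to every group, the adjustment itself cannot introduce a difference between groups.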
Let’s run artificial tests and look at the empirical type I and type II errors
[9]:
tests_amounts: int = 100
group_size = 2700
amount_first_type_errors: int = 0
amount_second_type_errors: int = 0
alpha = 0.05
for exp_num in tqdm(range(tests_amounts)):
    # Check for a type I error (no effect is added)
    splitter = Splitter(dataframe, id_column='profile_id')
    exp_data = splitter.run(method='hash',
                            salt=f'exp {exp_num}',
                            groups_size=group_size)
    transformer = Cuped(verbose=False)
    transformer.load_params('_examples_configs/kion_cuped_params.json')
    transformed = transformer.transform(exp_data)
    tester = Tester(transformed,
                    metrics='ln_vod_cnt_transformed',
                    column_groups='group')
    pvalue = tester.run(method='empiric')['pvalue']
    # Equality of means is rejected although it is true
    amount_first_type_errors += (pvalue < alpha)

    # Check for a type II error (an effect is added to group B)
    splitter = Splitter(dataframe, id_column='profile_id')
    exp_data = splitter.run(method='hash',
                            salt=f'exp {exp_num}',
                            groups_size=group_size)
    mean_b = exp_data[exp_data.group == 'B']['ln_vod_cnt'].mean()
    # Add a 5% effect to group B
    exp_data.loc[exp_data.group == 'B', 'ln_vod_cnt'] += 0.05 * mean_b
    transformer = Cuped(verbose=False)
    transformer.load_params('_examples_configs/kion_cuped_params.json')
    transformed = transformer.transform(exp_data)
    tester = Tester(transformed,
                    metrics='ln_vod_cnt_transformed',
                    column_groups='group')
    pvalue = tester.run(method='empiric')['pvalue']
    # Equality of means is not rejected although it should be
    amount_second_type_errors += (pvalue > alpha)
[11]:
print('Empirical I type error: {}'.format(
    amount_first_type_errors.loc[0] / tests_amounts))
print('Empirical II type error: {}'.format(
    amount_second_type_errors.loc[0] / tests_amounts))
Empirical I type error: 0.05
Empirical II type error: 0.18
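With only 100 simulated experiments these rates are noisy estimates. A rough way to gauge the noise is the normal-approximation standard error of a proportion, sketched below; the helper name `rate_with_se` is illustrative, and the counts 5 and 18 are simply the printed rates multiplied by the 100 runs.

```python
from math import sqrt


def rate_with_se(errors, trials):
    """Return the empirical rate and its normal-approximation standard error."""
    rate = errors / trials
    return rate, sqrt(rate * (1 - rate) / trials)


type_one_rate, type_one_se = rate_with_se(5, 100)
type_two_rate, type_two_se = rate_with_se(18, 100)

print(f"type I:  {type_one_rate:.2f} +/- {type_one_se:.3f}")
print(f"type II: {type_two_rate:.2f} +/- {type_two_se:.3f}")
```

Both estimates sit within one standard error of the designed values (alpha = 0.05 and beta = 0.2), so the simulation does not contradict the design; more runs would be needed to narrow the comparison.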