Example of an artificial experiment with CUPED transformation

[2]:
import numpy as np
import pandas as pd
from tqdm.notebook import tqdm

from ambrosia.designer import Designer
from ambrosia.splitter import Splitter
from ambrosia.tester import Tester

from ambrosia.preprocessing import Cuped

Load data

[3]:
dataframe = pd.read_csv('../tests/test_data/kion_data.csv', sep=';')
dataframe.head()
[3]:
     profile_id   sum_dur  vod_cnt  ln_vod_cnt  bin_col
0   99402893794  20104282       83    5.533356        1
1  878511937265   3986136       53    4.807294        1
2  998929369788   2063965       22    3.187069        1
3  265028786131    523539       14    2.679252        1
4  995182338752   1588224       19    4.177776        1

Apply the CUPED transformation to the metric of interest

[4]:
transformer = Cuped()
transformed = transformer.fit_transform(dataframe,
                                        target_column='ln_vod_cnt',
                                        covariate_column='sum_dur')
ambrosia LOGGER: After transformation СUPED for ln_vod_cnt, the variance is 86.0748 % of the original
ambrosia LOGGER: Variance transformation 2.1849 ===> 1.8806
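
For intuition, here is a minimal NumPy sketch of the textbook CUPED adjustment on synthetic data. It is not ambrosia's code, and the `Cuped` class may differ in details; the helper name `cuped_adjust` and the generated data are assumptions made for illustration only.

```python
import numpy as np


def cuped_adjust(y, x):
    """Return the CUPED-adjusted metric and the fitted coefficient theta (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # theta = Cov(Y, X) / Var(X) minimizes the variance of Y - theta * (X - mean(X))
    theta = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean()), theta


rng = np.random.default_rng(0)
x = rng.normal(size=10_000)            # synthetic covariate
y = 0.4 * x + rng.normal(size=10_000)  # synthetic metric correlated with the covariate

y_cuped, theta = cuped_adjust(y, x)
variance_ratio = np.var(y_cuped, ddof=1) / np.var(y, ddof=1)
rho = np.corrcoef(y, x)[0, 1]

# The remaining share of variance equals 1 - rho^2, and the mean is unchanged
print(f"variance ratio: {variance_ratio:.4f}, 1 - rho^2: {1 - rho ** 2:.4f}")
print(f"mean before: {y.mean():.4f}, mean after: {y_cuped.mean():.4f}")
```

Under this formulation, a remaining variance of 86.07 % would correspond to a correlation of roughly 0.37 in absolute value between the metric and the covariate.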

Design a hypothetical experiment for the original and transformed metrics

[5]:
designer = Designer(transformed, effects=1.05)
designer_info = designer.run(to_design='size',
                             method='theory',
                             metrics=['ln_vod_cnt', 'ln_vod_cnt_transformed'])
[6]:
designer_info['ln_vod_cnt']
[6]:
Errors ($\alpha$, $\beta$)  (0.05; 0.2)
Effect
5.0%                               3106
[7]:
designer_info['ln_vod_cnt_transformed']
[7]:
Errors ($\alpha$, $\beta$)  (0.05; 0.2)
Effect
5.0%                               2674

After the metric transformation, the same experiment needs 2674 ids instead of 3106, i.e. 432 fewer ids (about 14 % less), in line with the variance reduction reported above
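
The link between variance and group size can be sketched with the standard two-sample normal approximation, where the required size is proportional to the metric variance. This is an assumption about the general form of the calculation, not a reproduction of what `Designer` does with `method='theory'`; the function name and the absolute effect of 0.1 are hypothetical, while the two variances come from the log above.

```python
from statistics import NormalDist


def sample_size_per_group(variance, abs_effect, alpha=0.05, beta=0.2):
    """Two-sided, two-sample normal approximation (hypothetical helper)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return 2 * (z_alpha + z_beta) ** 2 * variance / abs_effect ** 2


abs_effect = 0.1  # hypothetical absolute effect; it cancels out in the ratio below
n_original = sample_size_per_group(2.1849, abs_effect)
n_transformed = sample_size_per_group(1.8806, abs_effect)

# The size ratio equals the variance ratio, whatever effect is chosen
size_ratio = n_transformed / n_original
print(f"size ratio: {size_ratio:.4f}")
```

The designed sizes give 2674 / 3106, about 0.861, which agrees with the variance ratio of 86.07 % reported by the transformer.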

Save the transformation parameters so that they can be reused once the experiment is completed

[8]:
transformer.store_params('_examples_configs/kion_cuped_params.json')

Let’s run a series of artificial tests and estimate the empirical type I and type II error rates

[9]:
tests_amounts: int = 100
group_size = 2700
amount_first_type_errors: int = 0
amount_second_type_errors: int = 0
alpha = 0.05

for exp_num in tqdm(range(tests_amounts)):
    # Checking for I type error
    splitter = Splitter(dataframe, id_column='profile_id')
    exp_data = splitter.run(method='hash',
                            salt=f'exp {exp_num}',
                            groups_size=group_size)
    transformer = Cuped(verbose=False)
    transformer.load_params('_examples_configs/kion_cuped_params.json')
    transformed = transformer.transform(exp_data)
    tester = Tester(transformed,
                    metrics='ln_vod_cnt_transformed',
                    column_groups='group')
    pvalue = tester.run(method='empiric')['pvalue']
    # Type I error: equality of means is rejected although it is true
    amount_first_type_errors += (pvalue < alpha)

    # Checking for II type error
    splitter = Splitter(dataframe, id_column='profile_id')
    exp_data = splitter.run(method='hash',
                            salt=f'exp {exp_num}',
                            groups_size=group_size)
    mean_b = exp_data[exp_data.group == 'B']['ln_vod_cnt'].mean()
    # Let's add an effect
    exp_data.loc[exp_data.group == 'B', 'ln_vod_cnt'] += 0.05 * mean_b
    transformer = Cuped(verbose=False)
    transformer.load_params('_examples_configs/kion_cuped_params.json')
    transformed = transformer.transform(exp_data)
    tester = Tester(transformed,
                    metrics='ln_vod_cnt_transformed',
                    column_groups='group')
    pvalue = tester.run(method='empiric')['pvalue']
    # Type II error: equality of means is not rejected although it should be
    amount_second_type_errors += (pvalue > alpha)
[11]:
# The counters are pandas Series here, so take the single value with .loc[0]
print(f'Empirical type I error: {amount_first_type_errors.loc[0] / tests_amounts}')
print(f'Empirical type II error: {amount_second_type_errors.loc[0] / tests_amounts}')
Empirical type I error: 0.05
Empirical type II error: 0.18
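
With only 100 simulated experiments, these rates are noisy estimates. A rough way to judge them is the binomial standard error of a proportion, sketched below; the helper name is hypothetical, and the normal approximation is crude for rates close to 0.

```python
import math


def binomial_standard_error(rate, n_trials):
    """Standard error of an empirical proportion (hypothetical helper)."""
    return math.sqrt(rate * (1 - rate) / n_trials)


n_trials = 100
se_type_1 = binomial_standard_error(0.05, n_trials)
se_type_2 = binomial_standard_error(0.18, n_trials)

print(f"type I error:  0.05 +/- {se_type_1:.3f}")
print(f"type II error: 0.18 +/- {se_type_2:.3f}")
```

Both standard errors are a few percentage points, so the observed 0.18 is well within noise of the designed 0.2. Increasing `tests_amounts` would narrow these ranges.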

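These rates are consistent with the design targets (\(\alpha = 0.05\), \(\beta = 0.2\)). Note that the CUPED transformation requires the covariate to have the same mean in both groups, i.e. \(\mathbb{E}X_{control} = \mathbb{E}X_{test}\). This holds, for example, when the covariate is measured before the experiment starts and therefore cannot be affected by the treatment.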
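
To see why the covariate means matter, here is a small synthetic simulation. It assumes the textbook CUPED adjustment with the coefficient fitted on the control group, which is not necessarily how ambrosia's `Cuped` is implemented; all names and numbers are hypothetical.

```python
import numpy as np


def cuped_effect_estimate(covariate_shift, true_effect=0.5, n=50_000, seed=0):
    """Difference of CUPED-adjusted group means on synthetic data (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    z_control = rng.normal(size=n)  # latent user activity, control group
    z_test = rng.normal(size=n)     # latent user activity, test group
    x_control = z_control
    x_test = z_test + covariate_shift                    # covariate shifted in the test group
    y_control = z_control + rng.normal(size=n)
    y_test = z_test + rng.normal(size=n) + true_effect   # metric with the true effect added

    # Fit theta on the control group and center the covariate on its control mean
    theta = np.cov(y_control, x_control)[0, 1] / np.var(x_control, ddof=1)
    center = x_control.mean()
    adjusted_control = y_control - theta * (x_control - center)
    adjusted_test = y_test - theta * (x_test - center)
    return adjusted_test.mean() - adjusted_control.mean()


balanced = cuped_effect_estimate(covariate_shift=0.0)
shifted = cuped_effect_estimate(covariate_shift=0.2)
print(f"covariate means equal:  estimated effect {balanced:.3f}")
print(f"covariate means differ: estimated effect {shifted:.3f}")
```

When the covariate means are equal, the estimate stays close to the true effect of 0.5. When the covariate in the test group is shifted by 0.2, the estimate is biased by about theta times the shift, here roughly 0.2 downward.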