Example of an artificial experiment with the CUPED transformation

[2]:

import numpy as np
import pandas as pd
from tqdm.notebook import tqdm

from ambrosia.designer import Designer
from ambrosia.splitter import Splitter
from ambrosia.tester import Tester

from ambrosia.preprocessing import Cuped


Load data

[3]:

dataframe = pd.read_csv('../tests/test_data/kion_data.csv', sep=';')
dataframe.head()

[3]:

	profile_id	sum_dur	vod_cnt	ln_vod_cnt	bin_col
0	99402893794	20104282	83	5.533356	1
1	878511937265	3986136	53	4.807294	1
2	998929369788	2063965	22	3.187069	1
3	265028786131	523539	14	2.679252	1
4	995182338752	1588224	19	4.177776	1

Apply the CUPED transformation to the metric of interest

[4]:

transformer = Cuped()
transformed = transformer.fit_transform(dataframe,
                                        target_column='ln_vod_cnt',
                                        covariate_column='sum_dur')

ambrosia LOGGER: After transformation CUPED for ln_vod_cnt, the variance is 86.0748 % of the original
ambrosia LOGGER: Variance transformation 2.1849 ===> 1.8806
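The variance reduction reported above can be illustrated with the textbook form of CUPED, in which the target is adjusted as Y - θ(X - mean(X)) with θ = cov(X, Y) / var(X). The sketch below is an assumption about the general technique, not a copy of ambrosia's internals; the function name `cuped_transform` and the synthetic data are hypothetical.

```python
import numpy as np


def cuped_transform(y, x):
    """Textbook CUPED adjustment: y - theta * (x - mean(x)).

    Hypothetical helper for illustration; ambrosia's Cuped class
    may differ in details.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # theta minimises the variance of the adjusted metric
    theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())


rng = np.random.default_rng(0)
x = rng.normal(size=10_000)             # synthetic covariate
y = 0.4 * x + rng.normal(size=10_000)   # synthetic target correlated with x

y_cuped = cuped_transform(y, x)
variance_ratio = y_cuped.var(ddof=1) / y.var(ddof=1)
rho = np.corrcoef(x, y)[0, 1]

# The mean is preserved and the variance shrinks by the factor 1 - rho**2
print(f"variance ratio: {variance_ratio:.4f}, 1 - rho^2: {1 - rho**2:.4f}")
```

Under this formula the remaining variance equals 1 - ρ², so a ratio of about 86% would correspond to a correlation of roughly 0.37 between the covariate and the target.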

Design an abstract experiment for the original and transformed metrics

[5]:

designer = Designer(transformed, effects=1.05)
designer_info = designer.run(to_design='size',
                             method='theory',
                             metrics=['ln_vod_cnt', 'ln_vod_cnt_transformed'])

[6]:

designer_info['ln_vod_cnt']

[6]:

Errors ($\alpha$, $\beta$)	(0.05; 0.2)
Effect
5.0%	3106

[7]:

designer_info['ln_vod_cnt_transformed']

[7]:

Errors ($\alpha$, $\beta$)	(0.05; 0.2)
Effect
5.0%	2674

After the metric transformation, the same experiment needs 432 fewer ids (2674 instead of 3106), a reduction of about 14%.
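The link between the variance reduction and the smaller sample size can be sketched with the standard two-sample normal approximation, where the required size is proportional to the metric variance. This is an assumption about the general formula and may not be exactly what `method='theory'` computes; the helper `required_size` and the value of `abs_effect` are hypothetical.

```python
from statistics import NormalDist


def required_size(variance, abs_effect, alpha=0.05, beta=0.2):
    """Unrounded size per group for a two-sided two-sample z-test.

    Hypothetical helper: n = 2 * (z_{1-alpha/2} + z_{1-beta})^2 * variance / effect^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return 2 * (z_alpha + z_beta) ** 2 * variance / abs_effect ** 2


abs_effect = 0.1  # hypothetical absolute effect, identical for both metrics

# Variances taken from the ambrosia log above
n_original = required_size(2.1849, abs_effect)
n_transformed = required_size(1.8806, abs_effect)

# The effect and the error levels cancel out, leaving the variance ratio
size_ratio = n_transformed / n_original
print(f"size ratio: {size_ratio:.4f}")
```

The size ratio equals the variance ratio (about 0.86), which is consistent with the designed sizes 2674 and 3106.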

Save the transformation parameters for use after the experiment is completed

[8]:

transformer.store_params('_examples_configs/kion_cuped_params.json')

Let’s run an artificial test and look at the empirical type I and type II error rates

[9]:

tests_amounts: int = 100
group_size = 2700
amount_first_type_errors: int = 0
amount_second_type_errors: int = 0
alpha = 0.05

for exp_num in tqdm(range(tests_amounts)):
    # Check for a type I error (no effect added)
    splitter = Splitter(dataframe, id_column='profile_id')
    exp_data = splitter.run(method='hash',
                            salt=f'exp {exp_num}',
                            groups_size=group_size)
    transformer = Cuped(verbose=False)
    transformer.load_params('_examples_configs/kion_cuped_params.json')
    transformed = transformer.transform(exp_data)
    tester = Tester(transformed,
                    metrics='ln_vod_cnt_transformed',
                    column_groups='group')
    pvalue = tester.run(method='empiric')['pvalue']
    # Type I error: equality of means is rejected although it is true
    amount_first_type_errors += (pvalue < alpha)

    # Check for a type II error (effect added to group B)
    splitter = Splitter(dataframe, id_column='profile_id')
    exp_data = splitter.run(method='hash',
                            salt=f'exp {exp_num}',
                            groups_size=group_size)
    mean_b = exp_data[exp_data.group == 'B']['ln_vod_cnt'].mean()
    # Add a 5% relative effect to group B
    exp_data.loc[exp_data.group == 'B', 'ln_vod_cnt'] += 0.05 * mean_b
    transformer = Cuped(verbose=False)
    transformer.load_params('_examples_configs/kion_cuped_params.json')
    transformed = transformer.transform(exp_data)
    tester = Tester(transformed,
                    metrics='ln_vod_cnt_transformed',
                    column_groups='group')
    pvalue = tester.run(method='empiric')['pvalue']
    # Type II error: equality of means is not rejected although it is false
    amount_second_type_errors += (pvalue > alpha)

[11]:

print('Empirical I type error: {}'.format(amount_first_type_errors.loc[0] /
                                          tests_amounts))
print('Empirical II type error: {}'.format(amount_second_type_errors.loc[0] /
                                           tests_amounts))

Empirical I type error: 0.05
Empirical II type error: 0.18
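With only 100 simulated experiments per error type, these rates are noisy estimates. A rough way to gauge the noise is the binomial standard error sqrt(p(1 - p) / n); the helper `binomial_se` below is hypothetical and only illustrates the arithmetic.

```python
from math import sqrt


def binomial_se(rate, n_runs):
    """Standard error of a proportion estimated from n_runs independent trials."""
    return sqrt(rate * (1 - rate) / n_runs)


n_runs = 100
se_type_1 = binomial_se(0.05, n_runs)  # empirical type I error rate
se_type_2 = binomial_se(0.18, n_runs)  # empirical type II error rate

print(f"type I error:  0.05 +/- {se_type_1:.3f}")   # 0.05 +/- 0.022
print(f"type II error: 0.18 +/- {se_type_2:.3f}")   # 0.18 +/- 0.038
```

Both empirical rates are therefore well within one standard error of the designed levels 0.05 and 0.2.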

These empirical error rates agree with the designed levels (\(\alpha = 0.05\), \(\beta = 0.2\)). Note that the CUPED transformation requires the covariate mean to be the same in both groups, i.e. \(\mathbb{E}X_{control} = \mathbb{E}X_{test}\).
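The condition on the covariate means can be illustrated on synthetic data. The sketch below uses the textbook CUPED adjustment with a pooled θ, which is an assumption and not necessarily ambrosia's exact procedure; `cuped_mean_difference`, the distributions and the shift of 0.4 are hypothetical.

```python
import numpy as np


def cuped_mean_difference(x_control, y_control, x_test, y_test):
    """Difference of CUPED-adjusted means (test minus control), pooled theta."""
    x = np.concatenate([x_control, x_test])
    y = np.concatenate([y_control, y_test])
    theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    adj_control = y_control - theta * (x_control - x.mean())
    adj_test = y_test - theta * (x_test - x.mean())
    return adj_test.mean() - adj_control.mean()


rng = np.random.default_rng(42)
n = 100_000
true_effect = 0.3

# Covariate with the same distribution in both groups
x_control = rng.normal(10.0, 2.0, n)
x_test = rng.normal(10.0, 2.0, n)
y_control = 0.5 * x_control + rng.normal(0.0, 1.0, n)
y_test = 0.5 * x_test + rng.normal(0.0, 1.0, n) + true_effect

# Balanced covariate: the adjusted difference recovers the true effect
balanced = cuped_mean_difference(x_control, y_control, x_test, y_test)

# Covariate mean shifted in the test group only: the estimate is biased
shifted = cuped_mean_difference(x_control, y_control, x_test + 0.4, y_test)

print(f"balanced: {balanced:.3f}, shifted: {shifted:.3f}")
```

When the covariate mean differs between groups, the adjustment subtracts θ times that difference from the estimated effect, so in this sketch the estimate falls from about 0.3 to roughly 0.1. This is why the covariate should be measured before the experiment starts.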