Overview of Ambrosia Designer class Spark data support
This example shows the functionality of the Designer class on Spark DataFrames. Synthetic data on LTV and user retention rate is used.
The functionality of the Designer class on Spark data is currently limited compared to the pandas format. To learn about the full functionality of the Designer, why the design of A/B test parameters is needed, and how it can be done, see the main tutorial on the Designer class.
[2]:
import os
import pandas as pd
import pyspark
from ambrosia.designer import Designer
Build a local Spark session
[3]:
os.environ['SPARK_LOCAL_IP'] = '127.0.0.1'
spark = pyspark.sql.SparkSession.builder.master("local[1]").getOrCreate()
spark.sparkContext.setLogLevel('ERROR')
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/04/21 17:39:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Create Spark DataFrame
[4]:
ltv_and_retention_dataset = pd.read_csv(
"./../tests/test_data/ltv_retention.csv")
sdf = spark.createDataFrame(ltv_and_retention_dataset)
[5]:
sdf.printSchema()
root
|-- LTV: double (nullable = true)
|-- retention: double (nullable = true)
Spark A/B test parameters theoretical design
First, we will use a theoretical approach to find the missing parameters of a hypothetical experiment.
We will obtain theoretical estimates of the group size, the MDE, and the power of the test, in each case given the other known parameters.
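As a rough illustration of what a theoretical design computes, the sketch below derives a per-group sample size from a two-sided normal approximation for comparing two means. It is not Ambrosia's implementation: the function name `theoretical_group_size` and its arguments are hypothetical, equal group sizes and equal variances are assumed, and the library may use a different formula (for example, one based on the t-distribution).

```python
from math import ceil
from statistics import NormalDist


def theoretical_group_size(mean, std, relative_effect, alpha=0.05, beta=0.2):
    """Per-group size for a two-sided test of two means (normal approximation).

    Hypothetical helper for illustration only, not part of Ambrosia.
    relative_effect follows the notebook's convention: 1.05 means a 5% uplift.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # quantile for the type I error
    z_beta = NormalDist().inv_cdf(1 - beta)        # quantile for the type II error
    delta = mean * (relative_effect - 1)           # absolute difference to detect
    # Assumes both groups share the same standard deviation.
    return ceil(2 * (z_alpha + z_beta) ** 2 * std ** 2 / delta ** 2)
```

For example, `theoretical_group_size(mean=100, std=20, relative_effect=1.05)` asks how many observations per group are needed to detect a 5% uplift with the default error rates. Because the required size scales with the inverse square of the effect, a fourfold larger effect needs roughly sixteen times fewer observations.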
Create a class instance and set the grid of parameters; the type I and type II error rates keep their default values.
[6]:
designer = Designer(dataframe=sdf,
effects=[1.05, 1.2],
sizes=[100, 1000],
metrics='LTV')
Design groups size
[7]:
designer.run('size', 'theory')
[7]:
| Errors ($\alpha$, $\beta$) | (0.05; 0.2) |
|---|---|
| Effect | |
| 5.0% | 6206 |
| 20.0% | 389 |
Design minimal detectable effect
[8]:
designer.run('effect', 'theory')
[8]:
| Errors ($\alpha$, $\beta$) | (0.05; 0.2) |
|---|---|
| Group sizes | |
| 100 | 39.6% |
| 1000 | 12.5% |
Design test power
[9]:
designer.run('power', 'theory')
[9]:
| $\alpha$ | Effect | Group size 100 | Group size 1000 |
|---|---|---|---|
| 0.05 | 5.0% | 6.4% | 20.3% |
| 0.05 | 20.0% | 29.4% | 99.4% |
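The MDE and power designs can be read as the same relation between effect, group size, and error rates solved for a different unknown. The following sketch shows this under a two-sided normal approximation with equal group sizes and equal variances. The helpers `theoretical_mde` and `theoretical_power` are hypothetical names, not Ambrosia functions, and the library's own calculations may differ.

```python
from math import sqrt
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution


def theoretical_mde(mean, std, group_size, alpha=0.05, beta=0.2):
    """Relative minimal detectable effect (0.1 means 10%). Illustrative only."""
    z_sum = _NORM.inv_cdf(1 - alpha / 2) + _NORM.inv_cdf(1 - beta)
    # Standard error of the difference of two means with equal group sizes.
    std_error = std * sqrt(2 / group_size)
    return z_sum * std_error / mean


def theoretical_power(mean, std, group_size, relative_effect, alpha=0.05):
    """Power of a two-sided test; relative_effect=1.05 means a 5% uplift."""
    z_crit = _NORM.inv_cdf(1 - alpha / 2)
    std_error = std * sqrt(2 / group_size)
    shift = mean * (relative_effect - 1) / std_error  # standardized effect
    # Probability of landing in either rejection region under the effect.
    return _NORM.cdf(shift - z_crit) + _NORM.cdf(-shift - z_crit)
```

As a consistency check, evaluating `theoretical_power` at an effect equal to the MDE returns approximately `1 - beta`, and with no effect it returns `alpha`.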
Spark A/B test parameters empirical design
Now let’s calculate the parameters by repeatedly sampling groups from the supplied data and modeling a hypothetical effect.
This approach, with a high value of the bootstrap_size parameter (the number of sampled groups per step), gives more accurate parameter estimates but requires much more computation than the theoretical one.
[10]:
designer = Designer(dataframe=sdf,
second_type_errors=0.1,
effects=[1.1, 1.2],
sizes=[500, 2000],
metrics='LTV')
Currently, the criterion parameter, which selects among different statistical criteria in the empirical design for pandas data, is not available for Spark data; the t-test criterion is always used here.
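To make the resampling idea concrete, here is a minimal sketch of empirical power estimation on a plain Python list rather than a Spark DataFrame. Groups are drawn with replacement, a multiplicative effect is applied to one of them, and the share of rejections is counted. The helper `empirical_power` is hypothetical, and a two-sided normal approximation stands in for the t-test so that the example needs only the standard library; Ambrosia's own procedure may differ in its details.

```python
import random
from statistics import NormalDist, mean, variance


def empirical_power(values, group_size, effect, alpha=0.05,
                    bootstrap_size=200, seed=0):
    """Estimate power by resampling groups and injecting a relative effect.

    Hypothetical helper for illustration only, not part of Ambrosia.
    effect follows the notebook's convention: 1.1 means a 10% uplift.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(bootstrap_size):
        # Sample both groups with replacement from the observed metric.
        group_a = rng.choices(values, k=group_size)
        group_b = [x * effect for x in rng.choices(values, k=group_size)]
        # Normal approximation used here in place of the t-test.
        std_error = (variance(group_a) / group_size
                     + variance(group_b) / group_size) ** 0.5
        z_stat = (mean(group_b) - mean(group_a)) / std_error
        if abs(z_stat) > z_crit:
            rejections += 1
    return rejections / bootstrap_size
```

A larger `bootstrap_size` reduces the noise of the estimate at a proportional cost in computation, which is the trade-off described above. Searching for a group size or an effect then amounts to evaluating this estimate over a grid and keeping the smallest value that reaches the target power.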
[11]:
designer.run('size', 'empiric', bootstrap_size=20)
[11]:
| errors | (0.1, 0.05) |
|---|---|
| effect | |
| 20.0% | 247 |
| 10.0% | 1198 |
[12]:
designer.run('effect', 'empiric', bootstrap_size=20)
[12]:
| errors | (0.1, 0.05) |
|---|---|
| group_sizes | |
| 500 | 9.6% |
| 2000 | 5.9% |
Spark design for binary metrics
For binary metrics, the "theory" or "binary" approaches can be used. The first approach uses different approximations for binary data, while the latter calculates experimental parameters based on constructed confidence intervals of various types.
[14]:
designer = Designer(dataframe=sdf,
second_type_errors=0.5,
sizes=150,
effects=1.2,
metrics='retention')
Group size:
[15]:
designer.run('size', 'theory')
[15]:
| Errors ($\alpha$, $\beta$) | (0.05; 0.5) |
|---|---|
| Effect | |
| 20.0% | 289 |
[16]:
designer.run('size', 'binary', interval_type='newcombe', amount=100000)
[16]:
| Errors ($\alpha$, $\beta$) | (0.05; 0.5) |
|---|---|
| Effect | |
| 20.0% | 280 |
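The `interval_type='newcombe'` option presumably refers to Newcombe's hybrid score interval for a difference of two proportions, which combines the Wilson score intervals of the two groups. The sketch below builds that interval with the standard library only. The function names are hypothetical, and how Ambrosia turns such intervals into group sizes or power is not shown here and may differ.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, alpha=0.05):
    """Wilson score interval for a single proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


def newcombe_interval(successes_a, n_a, successes_b, n_b, alpha=0.05):
    """Newcombe hybrid score interval for the difference p_b - p_a."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    low_a, up_a = wilson_interval(successes_a, n_a, alpha)
    low_b, up_b = wilson_interval(successes_b, n_b, alpha)
    diff = p_b - p_a
    # Combine the one-sided distances of each Wilson bound from its estimate.
    lower = diff - sqrt((p_b - low_b) ** 2 + (up_a - p_a) ** 2)
    upper = diff + sqrt((up_b - p_b) ** 2 + (p_a - low_a) ** 2)
    return lower, upper
```

An effect counts as detected at level `alpha` when the interval excludes zero. For instance, `newcombe_interval(60, 150, 72, 150)` compares a 40% rate with a 48% rate (a 20% relative uplift) at 150 observations per group.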
Power:
[17]:
designer.run('power', 'theory')
[17]:
| $\alpha$ | Effect | Group size 150 |
|---|---|---|
| 0.05 | 20.0% | 29.2% |
[18]:
designer.run('power', 'binary', interval_type='newcombe', amount=100000)
[18]:
| $\alpha$ | Effect | Group size 150 |
|---|---|---|
| 0.05 | 20.0% | 30.3% |
[19]:
spark.stop()