Release Notes
Version 0.5.3 (11.06.2026)
New Features:
Sample Ratio Mismatch (SRM) check. A broken assignment, filtering or logging pipeline often shows up as group sizes that deviate from the planned split - and then any test result on such groups cannot be trusted.
Tester.run can now verify the observed group sizes with a chi-square test (strict 0.0005 level) before evaluating the results and warn when a mismatch is detected. The check is opt-in: pass check_srm=True for equal expected sizes, or provide srm_expected_ratios (e.g. {"A": 0.9, "B": 0.1}) for intentionally unequal splits - the ratios enable the check automatically. The same diagnostic is available standalone as ambrosia.tools.srm.check_srm for pandas and Spark dataframes.
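To make the idea behind the SRM check concrete, here is a minimal standard-library sketch of a chi-square goodness-of-fit test on group sizes. It is not the code behind ambrosia.tools.srm.check_srm: the names chi2_sf and srm_check, their signatures and the sample counts are hypothetical and chosen only for illustration. Only the 0.0005 level and the equal-split default are taken from the note above.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution (finite closed-form series)."""
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    term, total = math.sqrt(stat), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= stat / (2 * r - 1)
        total += term
    normal_pdf = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * normal_pdf * total


def srm_check(observed, expected_ratios=None, alpha=0.0005):
    """Return (p_value, mismatch_detected) for observed group sizes (hypothetical helper)."""
    labels = list(observed)
    total = sum(observed.values())
    if expected_ratios is None:
        # Assumption: with no ratios given, an equal split is expected
        expected_ratios = {label: 1.0 for label in labels}
    norm = sum(expected_ratios[label] for label in labels)
    stat = 0.0
    for label in labels:
        expected = total * expected_ratios[label] / norm
        stat += (observed[label] - expected) ** 2 / expected
    p_value = chi2_sf(stat, df=len(labels) - 1)
    return p_value, p_value < alpha


balanced = srm_check({"A": 5010, "B": 4990})   # small deviation from 50/50
skewed = srm_check({"A": 5300, "B": 4700})     # large deviation from 50/50
planned = srm_check({"A": 9000, "B": 1000}, {"A": 0.9, "B": 0.1})
print(balanced, skewed, planned)
```

With these illustrative counts, only the second call crosses the 0.0005 threshold. The 90/10 split is not flagged, because it matches the ratios it was checked against.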
Bug Fixes:
The standalone test function no longer fails when first_type_errors is not provided (the same applies to Tester with an explicit first_type_errors=None): the documented default 0.05 is now applied.
Fixed binary experiment design with Bayesian intervals: the conjugate prior parameters (n_success_conjugate, n_failure_conjugate) for interval_type="bayes_beta" now reach the interval computation in size, effect and power design instead of raising TypeError. Thanks to Artem Vasin for the original fix (#45).
Fixed one-sided alternatives in the binary design power table: confidence bounds are now built with the correct shape for simulation matrices, so alternative="greater"/"less" no longer fails with a shape error.
Keyword arguments that collide with internally derived confidence interval arguments (e.g. confidence_level) now raise a clear ValueError instead of silently overriding the design parameters.
The standalone design_binary_size / design_binary_effect / design_binary_power functions now honor alternative for method="binary" (previously the argument was silently dropped and the design was always two-sided) and warn about the unsupported groups_ratio instead of silently designing equal groups.
Dependencies:
Allowed pandas 3.x: the requirement is now pandas >=1.5.0, <4.0.0. Fixed the two pandas 3 incompatibilities in the library: design tables no longer use the removed DataFrame.applymap (an elementwise helper picks DataFrame.map or applymap depending on the pandas version), and BoxCoxTransformer no longer mutates the read-only arrays that pandas 3 returns. The test suite is verified against the full supported range, from pandas 1.5.3 with numpy 1.26 up to pandas 3.x.
Dev dependencies allow pytest 9 and pytest-cov 7; removed the no-op marks on test fixtures that pytest 9 no longer accepts.
Refreshed locked dependencies to address security alerts: pillow 12.2 (a transitive runtime dependency), Pygments 2.20 and pytest 9.0.3 (Python 3.9 and 3.10 environments keep the latest compatible pillow 11.x and pytest 8.x).
Version 0.5.2 (01.06.2026)
New Features:
The Tester now offers several ways to correct p-values when an experiment evaluates many metrics or many groups at once, not just Bonferroni. Running a lot of comparisons together inflates the chance of a false positive, and a correction keeps that chance under control. The correction_method argument now accepts:
"bonferroni" (the default, unchanged), "holm", "holm-sidak", "sidak", "hommel" and "simes-hochberg" - methods that limit the probability of making any false positive. "holm" and the others are less conservative than Bonferroni, so you keep more statistical power.
"fdr_bh" (Benjamini-Hochberg) and "fdr_by" (Benjamini-Yekutieli) - methods that control the false discovery rate, i.e. the expected share of false positives among the metrics you call significant. A popular choice when a dashboard tracks many metrics.
Existing code keeps working unchanged: Bonferroni stays the default and correction_method=None turns correction off. Friendly aliases such as "benjamini-hochberg" are also accepted. For "bonferroni" and "sidak" the confidence intervals are widened to match; the other methods adjust the p-values only.
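As an illustration of how two of these corrections differ from Bonferroni, the sketch below implements the textbook Bonferroni, Holm and Benjamini-Hochberg adjustments in plain Python. It is not the code in ambrosia/tools/multitest.py. The function names and the sample p-values are hypothetical.

```python
def bonferroni(p_values):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment (controls the family-wise error rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg step-up adjustment (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank among the ascending p-values
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


p_values = [0.01, 0.04, 0.03, 0.005]  # illustrative p-values for four metrics
for name, method in [("bonferroni", bonferroni), ("holm", holm), ("fdr_bh", benjamini_hochberg)]:
    print(name, [round(p, 4) for p in method(p_values)])
```

For any input, the Holm-adjusted values are never larger than the Bonferroni ones, which is the sense in which Holm keeps more power at the same error guarantee.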
Internal:
Added ambrosia/tools/multitest.py with native p-value adjustment procedures, cross-validated against statsmodels and R's p.adjust.
Version 0.5.1 (26.03.2026)
New Features:
Custom metric functions in Tester: new metric_funcs parameter allows passing arbitrary callables instead of column names. Works with theory and empiric methods. Functions passed to run() override those set in the constructor.
LinearizationTransformer for ratio metrics (e.g. revenue/orders). Linearizes metric via linearized_i = numerator_i - ratio * denominator_i, where ratio is estimated on reference data during fit().
Preprocessor.linearize() integrates linearization into the existing chain architecture with full serialization and replay support.
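The linearization formula above can be sketched in a few lines of plain Python. This is not the LinearizationTransformer API: fit_ratio and linearize are hypothetical helpers, the data is made up, and estimating the ratio as sum(numerator) / sum(denominator) is an assumption, since the note only says the ratio is estimated on reference data.

```python
def fit_ratio(numerators, denominators):
    """Estimate the ratio on reference data (assumed here: total numerator / total denominator)."""
    return sum(numerators) / sum(denominators)


def linearize(numerators, denominators, ratio):
    """Apply linearized_i = numerator_i - ratio * denominator_i to each user."""
    return [n - ratio * d for n, d in zip(numerators, denominators)]


# Reference period: per-user revenue and number of orders (illustrative values)
ref_revenue = [100, 50, 150]
ref_orders = [2, 1, 2]
ratio = fit_ratio(ref_revenue, ref_orders)

# The ratio fitted on reference data is reused on the experiment data
exp_revenue = [130, 70]
exp_orders = [2, 1]
print(ratio, linearize(exp_revenue, exp_orders, ratio))
```

The result is one value per user, so ordinary per-user criteria can be applied to it instead of treating the ratio metric directly.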
Bug Fixes:
Pinned setuptools>=65.0.0, <82.0.0 to work around the removal of pkg_resources in setuptools 82, which broke pip install ambrosia through the hyperopt dependency.
Internal:
Updated publish workflow to use PYPI_TOKEN_V2
Added CLAUDE.md to .gitignore
Version 0.5.0 (06.01.2025)
Breaking Changes:
Minimum Python version raised to 3.9 (dropped support for 3.7, 3.8)
Minimum PySpark version raised to 3.4 (dropped support for 3.2, 3.3)
New Features:
Added support for Python 3.11, 3.12, 3.13
Bug Fixes:
Added hnswlib as fallback for nmslib on macOS ARM (fixes segfault in metric split)
Dependencies:
Updated numpy to >=1.24.0, <3.0.0
Updated pandas to >=1.5.0, <3.0.0
Updated scipy to >=1.10.0
Updated scikit-learn to >=1.3.0
Updated nmslib to >=2.1.0
Added hnswlib >=0.7.0 as alternative KNN backend
Updated catboost to >=1.2.0
Updated other dependencies for Python 3.12/3.13 compatibility
Internal:
Replaced deprecated pkg_resources with importlib.metadata
Updated CI/CD to test Python 3.9-3.13
Updated GitHub Actions to v4/v5
Version 0.4.1 (21.04.2023)
Hotfix for pyspark import in spark criteria.
Version 0.4.0 (21.04.2023)
Documentation and usage examples have been substantially reworked and updated.
The Designer class and design methods functionality is updated.
Empirical design now supports the choice of hypothesis alternative and group ratio parameter
The look of the resulting tables with calculated parameters is unified for all design methods
Changed multiprocessing strategy for bootstrap criterion
The Tester class functionality is updated.
Spark data support for the Tester class is added. Independent t-test is now available
Bootstrap criterion can now return deterministic output using a random_seed parameter
Paired bootstrap criterion is now available
MHC is now optional and takes into account the number of passed metrics
first_errors parameter renamed to first_type_errors
The pyspark package is now optional and can be installed using pip extras.
Fixed a set of bugs.
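The random_seed and paired bootstrap items above can be illustrated with a small percentile-bootstrap sketch built on the standard library. It is not the library's bootstrap criterion. The function bootstrap_mean_diff_ci, its parameters other than random_seed, and the data are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_mean_diff_ci(a, b, n_boot=2000, confidence=0.95, random_seed=None, paired=False):
    """Percentile bootstrap interval for mean(b) - mean(a) (illustrative helper)."""
    rng = random.Random(random_seed)  # a fixed seed makes the resampling reproducible
    diffs = []
    if paired:
        if len(a) != len(b):
            raise ValueError("paired bootstrap needs samples of equal length")
        # Paired case: resample the per-unit differences, keeping pairs together
        pair_diffs = [y - x for x, y in zip(a, b)]
        for _ in range(n_boot):
            diffs.append(fmean(rng.choices(pair_diffs, k=len(pair_diffs))))
    else:
        # Independent case: resample each group separately
        for _ in range(n_boot):
            diffs.append(fmean(rng.choices(b, k=len(b))) - fmean(rng.choices(a, k=len(a))))
    diffs.sort()
    tail = (1.0 - confidence) / 2.0
    lower = diffs[int(tail * n_boot)]
    upper = diffs[max(int((1.0 - tail) * n_boot) - 1, 0)]
    return lower, upper


group_a = [10, 12, 11, 13, 12, 11, 14, 10]
group_b = [12, 13, 13, 15, 12, 14, 15, 11]

first = bootstrap_mean_diff_ci(group_a, group_b, random_seed=42)
second = bootstrap_mean_diff_ci(group_a, group_b, random_seed=42)
paired = bootstrap_mean_diff_ci(group_a, group_b, random_seed=42, paired=True)
print(first == second, first, paired)
```

Two runs with the same seed return identical bounds. Leaving random_seed as None draws a fresh resampling sequence on each call.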
Version 0.3.0 (15.02.2023)
The Designer class and design methods functionality is updated.
Theoretical design now supports the choice of hypothesis alternative and group ratio parameter
These calculations now use Statsmodels solvers
Experimental parameters for binary data can now also be theoretically designed using both the asin variance-stabilizing transformation and the normal approximation
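For readers unfamiliar with the two approaches named in the last item, the sketch below computes a per-group sample size for comparing two conversion rates with the standard textbook formulas for equal group sizes. It is not the Designer implementation, which the notes say relies on Statsmodels solvers. The function name, its parameters and the example rates are hypothetical.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def binary_sample_size(p_a, p_b, alpha=0.05, power=0.8, method="asin", alternative="two-sided"):
    """Per-group sample size for detecting p_a vs p_b with equal groups (illustrative)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if alternative == "two-sided" else z(1 - alpha)
    z_beta = z(power)
    if method == "asin":
        # Variance-stabilizing transform: 2 * asin(sqrt(p)) has variance close to 1 / n
        effect = 2 * asin(sqrt(p_b)) - 2 * asin(sqrt(p_a))
        n = 2 * ((z_alpha + z_beta) / effect) ** 2
    elif method == "normal":
        # Normal approximation with unpooled binomial variances
        variance = p_a * (1 - p_a) + p_b * (1 - p_b)
        n = (z_alpha + z_beta) ** 2 * variance / (p_b - p_a) ** 2
    else:
        raise ValueError(f"unknown method: {method}")
    return ceil(n)


n_asin = binary_sample_size(0.10, 0.12, method="asin")
n_normal = binary_sample_size(0.10, 0.12, method="normal")
n_one_sided = binary_sample_size(0.10, 0.12, method="asin", alternative="greater")
print(n_asin, n_normal, n_one_sided)
```

For a small effect such as 10% versus 12%, the two approximations give nearly the same group size, and a one-sided alternative needs fewer observations than the two-sided one.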
All preprocessor classes, except for the Preprocessor, have changed their API and have updated functionality
Preprocessing classes now use fit and transform methods to get transformation parameters and apply transformation on pandas tables
Fitted classes can now be saved and loaded from json files
Table column names used when fitting class instances are now strictly fixed in instance attributes
The Preprocessor class is updated.
Added new transformation methods
The executed transformation pipeline can now be saved and loaded from a json file. This can be used to store and load the entire experimental data processing pipeline
The data handling methods of the class have changed some parameters to match the changes in the classes used
The IQRPreprocessor class is now available in ambrosia.preprocessing.
It can be used to remove outliers based on quartile and interquartile range estimates
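Outlier removal based on quartiles and the interquartile range can be sketched as follows. This is a generic illustration and not the IQRPreprocessor API. The helper names are hypothetical, and the 1.5 multiplier is the common Tukey convention, assumed here because the note does not state which bounds the class uses.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at Q1 - k * IQR and Q3 + k * IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


metric = [10, 12, 11, 13, 12, 11, 95]  # illustrative data with one extreme value
print(iqr_bounds(metric), remove_outliers(metric))
```

The fences are estimated from the data itself, so in a fit and transform design they would be computed once on reference data and then reused.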
The RobustPreprocessor class is updated.
It now supports different types of tails for removal: both, right or left
For each processed column, a separate alpha portion of the distribution can be passed.
The BoxCoxTransformer class is now available in ambrosia.preprocessing
It can be used for data distribution normalization.
The LogTransformer class is now available in ambrosia.preprocessing
It can be used to transform data for variance reduction.
The MLVarianceReducer class is updated.
Now it can store and load the selected ML model from a single specified path
Version 0.2.0 (22.11.2022)
Library name changed back to ambrosia. Naming conflict in PyPI has been resolved.
0.1.x versions are still available in PyPI under ambrozia name.
Version 0.1.2 (16.11.2022)
Hotfix for Ttest stat criterion absolute effect calculation. Url to main image deleted from docs.
Version 0.1.1 (04.10.2022)
Hotfix for library naming.
Library temporarily renamed to ambrozia in PyPI repository due to hidden naming conflict.
Version 0.1.0 (03.10.2022)
First release of Ambrosia package:
Added Designer class for experiment parameters design
Added Splitter class for A/B groups split
Added Tester class for experiment effect measurement
Added various classes for experiment data preprocessing
Added A/B testing tools with wide functionality