Aggregation
- class ambrosia.preprocessing.AggregatePreprocessor(categorial_method='mode', real_method='sum')
Preprocessing class for data aggregation.
Can group data by multiple columns and aggregate it using separate methods for real and categorical features.
- Parameters:
- categorial_method : types.MethodType, default: "mode"
Aggregation method for categorical variables, used as the default behavior.
- real_method : types.MethodType, default: "sum"
Aggregation method for real variables, used as the default behavior.
- Attributes:
- categorial_method : types.MethodType
Default aggregation method for categorical variables.
- real_method : types.MethodType
Default aggregation method for real variables.
- groupby_columns : types.ColumnNamesType
Columns used for grouping in the last aggregation. Set after the class instance is fitted.
- agg_params : Dict
Dictionary with the aggregation rules used in the last aggregation. Set after the class instance is fitted.
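To illustrate the kind of aggregation described above, here is a minimal pandas-only sketch of grouping with "sum" for a real column and "mode" for a categorical one. It does not call AggregatePreprocessor and is not the library's implementation. The column names and the helper most_frequent are hypothetical, and how the library resolves ties in the mode is not stated in this reference.

```python
import pandas as pd


def most_frequent(series: pd.Series):
    """Hypothetical helper: return the most frequent value of a group.

    Series.mode() may return several values on ties; this sketch takes the first.
    """
    return series.mode().iloc[0]


# Hypothetical event-level data: several rows per user.
df = pd.DataFrame(
    {
        "user_id": [1, 1, 1, 2, 2],
        "purchase": [10.0, 5.0, 2.5, 7.0, 3.0],
        "platform": ["ios", "ios", "web", "web", "web"],
    }
)

# One row per user: sum the real column, take the mode of the categorical one.
aggregated = (
    df.groupby("user_id")
    .agg({"purchase": "sum", "platform": most_frequent})
    .reset_index()
)
print(aggregated)
```

The dictionary passed to agg plays a role similar to the aggregation rules stored in agg_params: one rule per column of interest.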
- get_params_dict()
Returns a dictionary with the parameters of the last run() or transform() call.
- fit(dataframe, groupby_columns, agg_params=None, real_cols=None, categorial_cols=None)
Fit the preprocessor with aggregation parameters.
Aggregation is performed using either a dictionary that defines the aggregation for each column of interest, or lists of columns that use the default aggregation behavior of the class.
- Parameters:
- dataframe : pd.DataFrame
Table with selected columns.
- groupby_columns : types.ColumnNamesType
Columns for GROUP BY.
- agg_params : Dict, optional
Dictionary with aggregation parameters.
- real_cols : types.ColumnNamesType, optional
Columns with real metrics. Overridden by the agg_params parameter; pass it when the default aggregation behavior is expected.
- categorial_cols : types.ColumnNamesType, optional
Columns with categorical metrics. Overridden by the agg_params parameter; pass it when the default aggregation behavior is expected.
- Returns:
- self : object
Instance object.
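The precedence between agg_params and the column lists can be sketched in plain Python. This is a hedged illustration, not the library's code: the function resolve_agg_params is hypothetical, and it assumes that a passed agg_params dictionary fully replaces the rules derived from real_cols and categorial_cols, which this reference does not spell out.

```python
from typing import Dict, Iterable, Optional


def resolve_agg_params(
    agg_params: Optional[Dict[str, str]] = None,
    real_cols: Optional[Iterable[str]] = None,
    categorial_cols: Optional[Iterable[str]] = None,
    real_method: str = "sum",
    categorial_method: str = "mode",
) -> Dict[str, str]:
    """Hypothetical helper: build a column -> method mapping.

    Assumption: an explicit agg_params dictionary wins outright;
    otherwise each listed column gets the default method for its kind.
    """
    if agg_params is not None:
        return dict(agg_params)

    rules: Dict[str, str] = {}
    for column in real_cols or []:
        rules[column] = real_method
    for column in categorial_cols or []:
        rules[column] = categorial_method
    return rules


# Defaults applied to the listed columns.
print(resolve_agg_params(real_cols=["purchase"], categorial_cols=["platform"]))

# An explicit dictionary takes precedence over the column lists.
print(resolve_agg_params(agg_params={"purchase": "mean"}, real_cols=["purchase"]))
```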