kinoml.datasets.groups
Splitting strategies for datasets
Module Contents
- class kinoml.datasets.groups.BaseGrouper
Base class to assign groups to measurements in a DatasetProvider
- assign(dataset, overwrite=False, **kwargs)
Given a DatasetProvider, assign a key to the elements of each group, as provided by .indices().
- Parameters
dataset (DatasetProvider) –
overwrite (bool, optional=False) – If a measurement has been assigned a group already, do not overwrite unless this option is set to True.
- Returns
dataset – The same dataset passed in the input, with measurements modified in place.
- Return type
DatasetProvider
- abstract indices(dataset, **kwargs)
Given a dataset, create a dictionary that maps keys or labels to a set of numerical indices. The strategy to follow will depend on the subclass.
- Parameters
dataset (DatasetProvider) –
- Returns
Maps int or str to a list of int.
- Return type
dict
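The relationship between assign() and indices() described above can be sketched without KinoML installed. This is a minimal illustration, not the library's implementation: ToyMeasurement, ToyDataset, ToyBaseGrouper, ParityGrouper, and the group and measurements attributes are all hypothetical stand-ins chosen for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Hashable, List, Optional


@dataclass
class ToyMeasurement:
    """Hypothetical stand-in for a Measurement; only tracks a group label."""
    value: float
    group: Optional[Hashable] = None


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a DatasetProvider."""
    measurements: List[ToyMeasurement] = field(default_factory=list)


class ToyBaseGrouper(ABC):
    """Sketch of the documented contract: indices() decides, assign() labels."""

    def assign(self, dataset, overwrite=False, **kwargs):
        for key, idxs in self.indices(dataset, **kwargs).items():
            for i in idxs:
                measurement = dataset.measurements[i]
                # Keep an existing label unless overwrite=True was requested.
                if measurement.group is None or overwrite:
                    measurement.group = key
        # Measurements are modified in place; the same object is returned.
        return dataset

    @abstractmethod
    def indices(self, dataset, **kwargs) -> Dict[Hashable, List[int]]:
        """Map each group key to a list of integer positions."""


class ParityGrouper(ToyBaseGrouper):
    """Example strategy: split by even or odd position in the dataset."""

    def indices(self, dataset, **kwargs):
        groups = {"even": [], "odd": []}
        for i in range(len(dataset.measurements)):
            groups["even" if i % 2 == 0 else "odd"].append(i)
        return groups


dataset = ToyDataset([ToyMeasurement(v) for v in (1.0, 2.0, 3.0, 4.0)])
dataset.measurements[0].group = "preassigned"
result = ParityGrouper().assign(dataset)
labels = [m.group for m in result.measurements]
```

With the default overwrite=False, the first measurement keeps its earlier label while the others receive the key computed by indices(). Calling assign() again with overwrite=True replaces every label.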
- class kinoml.datasets.groups.RandomGrouper(ratios)
Bases: BaseGrouper
Randomized groups following a split proportional to the provided ratios
- Parameters
ratios (tuple or dict) – Ratios for the different groups, expressed as fractions of 1. They must sum to 1.0. If a dict is provided, the keys are used to label the resulting groups. Otherwise, the groups are enumerated starting at 0.
- indices(dataset, **kwargs)
Given a dataset, create a dictionary that maps keys or labels to a set of numerical indices. The strategy to follow will depend on the subclass.
- Parameters
dataset (DatasetProvider) –
- Returns
Maps int or str to a list of int.
- Return type
dict
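A ratio-based random split of this kind could look roughly like the following. It is a sketch under stated assumptions, not RandomGrouper's actual code: the random_indices function, its seed argument, and the rounding of group boundaries are choices made for this example only.

```python
import random


def random_indices(n_items, ratios, seed=None):
    """Hypothetical helper: shuffle positions and cut them by cumulative ratio."""
    if isinstance(ratios, dict):
        keys, fractions = list(ratios.keys()), list(ratios.values())
    else:
        # Tuples get 0-enumerated group keys, as the parameter docs describe.
        keys, fractions = list(range(len(ratios))), list(ratios)
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1.0")

    shuffled = list(range(n_items))
    random.Random(seed).shuffle(shuffled)

    groups, start, cumulative = {}, 0, 0.0
    for position, (key, fraction) in enumerate(zip(keys, fractions)):
        cumulative += fraction
        # The last group takes whatever remains, so no index is dropped.
        is_last = position == len(keys) - 1
        end = n_items if is_last else round(cumulative * n_items)
        groups[key] = sorted(shuffled[start:end])
        start = end
    return groups


split = random_indices(10, {"train": 0.8, "test": 0.2}, seed=0)
```

Here split["train"] holds 8 indices and split["test"] holds 2, and every position from 0 to 9 appears in exactly one group. Passing a tuple such as (0.5, 0.25, 0.25) yields the keys 0, 1, and 2 instead of string labels.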
- class kinoml.datasets.groups.CallableGrouper(function)
Bases: BaseGrouper
A grouper that applies a user-provided function to each Measurement in the dataset. The value returned by the function is used as the name of the group.
- Parameters
function (callable) – This function must be able to take a Measurement object and return a str or int.
- indices(dataset, progress=True)
Given a dataset, create a dictionary that maps keys or labels to a set of numerical indices. The strategy to follow will depend on the subclass.
- Parameters
dataset (DatasetProvider) –
- Returns
Maps int or str to a list of int.
- Return type
dict
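The idea of grouping by a user-provided function can be illustrated as follows. This is only a sketch: callable_indices is a hypothetical helper, plain floats stand in for Measurement objects, and the progress option is not reproduced here.

```python
from collections import defaultdict


def callable_indices(measurements, function):
    """Hypothetical helper: group positions by the label function returns."""
    groups = defaultdict(list)
    for i, measurement in enumerate(measurements):
        # The function's return value (str or int) becomes the group key.
        groups[function(measurement)].append(i)
    return dict(groups)


# Plain floats stand in for Measurement objects in this example.
values = [4.2, 7.9, 5.5, 9.1]
by_threshold = callable_indices(
    values, lambda v: "active" if v >= 6.0 else "inactive"
)
```

In this example the positions 1 and 3 land in the "active" group and positions 0 and 2 in "inactive". Any callable that returns a hashable str or int label would work the same way.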
- class kinoml.datasets.groups.BaseFilter
Bases: BaseGrouper
Base class to assign groups to measurements in a DatasetProvider