tools

tools.tests

This module contains all tests.

class RegressionTest(model, data, test_definition_version='last', model_version='last', data_version='last', repo_info=<pailab.ml_repo.repo_objects.RepoInfo object>)

Bases: pailab.tools.tests.Test

Regression test.

Note

In general, tests are automatically constructed and run using pailab.ml_repo.repo.run_tests(). As a user, there is nearly no need to construct a test by hand.

A regression test compares a specified measure of a reference model (identified by a label) to the respective measure of the model under test. It fails if the measure of the tested model exceeds the reference measure by more than a given tolerance, i.e. the test fails if

  • measure - measure_ref > tol if an absolute tolerance is defined,
  • measure - measure_ref > tol * measure_ref if a relative tolerance is used.

All the attributes specific to the regression test (i.e. not contained in the base class) are retrieved from the underlying test definition during the run of the test.
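The failure criterion can be sketched in a few lines of plain Python; the function name regression_test_passes is illustrative only and not part of pailab's API:

```python
def regression_test_passes(measure, measure_ref, tol, relative=False):
    """Sketch of the regression-test criterion: the test fails when the
    tested model's measure exceeds the reference measure by more than
    tol (absolute) or tol * measure_ref (relative)."""
    threshold = tol * measure_ref if relative else tol
    return (measure - measure_ref) <= threshold
```

For example, with a relative tolerance of 10%, a model whose measure is 1.05 against a reference of 1.0 passes, while 1.25 fails.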

Parameters:
  • model (str) – Name of model for which the test is applied.
  • data (str) – Name of dataset used in the test.
  • test_definition_version (str, optional) – Defaults to latest version. Version of the test's underlying pailab.tools.tests.TestDefinition that is used as the basis for the test.
  • model_version (str, optional) – Defaults to latest version. Version of the model the test is applied to.
  • data_version (str, optional) – Defaults to latest version. Version of the data used in the test.
test_definition

Name of underlying pailab.tools.tests.TestDefinition.

Type:str
model

Name of model for which the test is applied.

Type:str
data

Name of dataset used in the test.

Type:str
test_definition_version

Version of the test's underlying pailab.tools.tests.TestDefinition that is used as the basis for the test.

Type:str
model_version

Version of the model the test is applied to.

Type:str
data_version

Version of the data used in the test.

Type:str
result

Describes the state of the test.

Type:str, ‘not run’, ‘failed’, ‘succeeded’
details

Contains details when test fails, otherwise empty dict.

Type:dict
get_modifier_versions(ml_repo)

Get the modifier versions

Parameters:ml_repo (MLRepo) – repository used to get and store the data
Returns:tuple of string, dict – the object name and the modifiers
class RegressionTestDefinition(reference='prod', models=None, data=None, labels=None, measures=None, tol=0.001, repo_info=<pailab.ml_repo.repo_objects.RepoInfo object>, relative=False)

Bases: pailab.tools.tests.TestDefinition

Definition of a regression test.

A regression test compares a specified measure of a reference model (identified by a label) to the respective measure of the model under test. It fails if the measure of the tested model exceeds the reference measure by more than a given tolerance, i.e. the test fails if

  • measure - measure_ref > tol if an absolute tolerance is defined,
  • measure - measure_ref > tol * measure_ref if a relative tolerance is used.

Note

The test needs the chosen measure(s) to be computed; therefore you have to make sure that the measure has been added to the repo (using pailab.ml_repo.repo.MLRepo.add_measure()).

Examples

Add a test for the model ‘my_model’ on a dataset named ‘test_data’ that checks whether the maximum error of the model is no more than 10% greater than the error of the reference model defined by the label ‘production_model’:

>>> test_def = RegressionTestDefinition(models=['my_model'], reference='production_model', data=['test_data'], measures=['max'], tol=0.1, relative=True)
>>> ml_repo.add(test_def)

Add a test applied to all models in the repo (the latest versions of the models are always used within the tests):

>>> test_def = RegressionTestDefinition(models=None, reference='production_model', data=['test_data'], measures=['max'], tol=0.1, relative=True)
>>> ml_repo.add(test_def)
Parameters:
  • models (iterable with str items, optional) – Defaults to None. Iterable (e.g. list of str) returning names of the models to be tested.
  • data (iterable with str items, optional) – Defaults to None. Iterable (e.g. list of str) returning names of the data used for testing.
  • labels (iterable with str items, optional) – Defaults to []. Iterable returning labels defining models to be tested.
  • measures (list of str, optional) – Defaults to None. List of measures used in the test.
  • reference (str, optional) – Defaults to ‘prod’. Label defining the reference model to which the measures are compared.
  • tol (float, optional) – Defaults to 1e-3. Tolerance; if relative is False, the test fails if new_value - ref_value > tol, otherwise if new_value - ref_value > tol * ref_value.
  • relative (bool, optional) – Defaults to False. If True, tol is interpreted as a relative tolerance.
  • repo_info (RepoInfo, optional) – Defaults to RepoInfo().
models

List of strings defining the models to be tested.

Type:list of str
labels

List of strings defining the labels to be tested.

Type:list of str
data

List of strings defining the names of the data to be tested

Type:list of str
class Test(model, data, test_definition_version='last', model_version='last', data_version='last', repo_info=<pailab.ml_repo.repo_objects.RepoInfo object>)

Bases: pailab.ml_repo.repo.Job

Base class for all tests.

Note

In general, tests are automatically constructed and run using pailab.ml_repo.repo.run_tests(). As a user, there is nearly no need to construct a test by hand.

Parameters:
  • model (str) – Name of model for which the test is applied.
  • data (str) – Name of dataset used in the test.
  • test_definition_version (str, optional) – Defaults to latest version. Version of the test's underlying pailab.tools.tests.TestDefinition that is used as the basis for the test.
  • model_version (str, optional) – Defaults to latest version. Version of the model the test is applied to.
  • data_version (str, optional) – Defaults to latest version. Version of the data used in the test.
test_definition

Name of underlying pailab.tools.tests.TestDefinition.

Type:str
model

Name of model for which the test is applied.

Type:str
data

Name of dataset used in the test.

Type:str
test_definition_version

Version of the test's underlying pailab.tools.tests.TestDefinition that is used as the basis for the test.

Type:str
model_version

Version of the model the test is applied to.

Type:str
data_version

Version of the data used in the test.

Type:str
result

Describes the state of the test.

Type:str, ‘not run’, ‘failed’, ‘succeeded’
details

Contains details when test fails, otherwise empty dict.

Type:dict
class TestDefinition(models=None, data=None, labels=[], repo_info=<pailab.ml_repo.repo_objects.RepoInfo object>)

Bases: pailab.ml_repo.repo_objects.RepoObject, abc.ABC

Abstract base class for all test definitions.

A test definition defines the framework, such as the models and data, the tests are applied to. It also provides a create method which creates the test cases for specific model and data versions.

Parameters:
  • models (iterable with str items, optional) – Defaults to None. Iterable (e.g. list of str) returning names of the models to be tested.
  • data (iterable with str items, optional) – Defaults to None. Iterable (e.g. list of str) returning names of the data used for testing.
  • labels (iterable with str items, optional) – Defaults to []. Iterable returning labels defining models to be tested.
  • repo_info (RepoInfo, optional) – Defaults to RepoInfo().
models

List of strings defining the models to be tested.

Type:list of str
labels

List of strings defining the labels to be tested.

Type:list of str
data

List of strings defining the names of the data to be tested

Type:list of str
create(ml_repo: pailab.ml_repo.repo.MLRepo)

Create a set of tests for models of the repository.

Parameters:
  • ml_repo (MLRepo) – the repository for which the tests are created
  • models (dict, optional) – Defaults to {}. Dictionary of model names to version numbers to apply tests for. If empty, all latest models are used.
  • data (dict, optional) – Defaults to {}. Dictionary of data the tests are applied to. If empty, all latest test- and train data will be used.
  • labels (list, optional) – Defaults to []. List of labels to which the tests are applied.
Returns:

[description]

Return type:

[type]

tools.tree

This module contains all functions and classes for the MLTree. The MLTree builds a tree-like structure of the objects in a given repository. This allows the user to access objects in a comfortable way, with autocompletion (e.g. in Jupyter notebooks).

To use it one can simply call the pailab.tools.tree.MLTree.add_tree() method to add such a tree to the current repository:

>>> from pailab.tools.tree import MLTree
>>> MLTree.add_tree(ml_repo)

After the tree has been added, one can simply use it. Here, autocompletion makes the basic work with repo objects quite simple. Each tree node provides useful functions that can be applied:

  • load loads the object of the given tree node or the child tree nodes of the current node. After calling load, the respective nodes have a new attribute obj that contains the loaded object. To load all objects belonging to the models subtree, such as parameters, evaluations or measures, one can call:

    >>> ml_repo.tree.models.load()
    
  • history lists the history of all objects of the respective subtree; history accepts certain parameters, such as a range of versions or which repo object information to include. To list the history of all training data, just use:

    >>> ml_repo.tree.training_data.history()
    
  • modifications lists all objects of the respective subtree that have been modified but not yet committed.

There are also node-dependent functions (depending on what object the node represents).

class MLTree(ml_repo)

Bases: object

static add_tree(ml_repo)

Adds an MLTree to a repository.

Parameters:ml_repo (MLRepo) – the repository the tree is added to
modifications()

Return a dictionary of all objects that were modified but not yet committed to the repository.

Returns:dictionary mapping object ids to dictionary of the modified attributes
Return type:dict
reload(**kwargs)

Method to reload the tree after objects have been added or deleted from the repository.

tools.interpretation

This module contains functions for model agnostic interpretation methods.

class ICE_Results

Bases: object

compute_cluster_average(ice_results_2)

Compute the average of values from ice_results_2 over the clusters of this result.

Parameters:ice_results_2 (ICE_Results) – ICE result whose values are averaged over the clusters of this result.
Returns:Matrix containing the average of values from ice_results_2 over the different clusters from this result.
Return type:numpy matrix
compute_ice(ml_repo, x_values, data, model=None, model_label=None, model_version='last', data_version='last', y_coordinate=0, x_coordinate=0, start_index=0, end_index=-1, cache=False, clustering_param=None, scale='')

Compute individual conditional expectation (ICE) curves for a given dataset and model.

Parameters:
  • ml_repo (MLRepo) – MLRepo used to retrieve model and data and be used in caching.
  • x_values (list) – List of x values for the ICE.
  • data (str, DataSet, RawData) – Either name of data or directly the data object which is used as basis for ICE (an ICE is computed at each datapoint of the data).
  • model (str, optional) – Name of model in the MLRepo for which the ICE will be computed. If None, model_label must be specified, defining the model to be used. Defaults to None.
  • model_label (str, optional) – Label defining the model to be used. Defaults to None.
  • model_version (str, optional) – Version of model to be used for ICE. Only needed if model is specified. Defaults to RepoStore.LAST_VERSION.
  • data_version (str, optional) – Version of data used. Defaults to RepoStore.LAST_VERSION.
  • y_coordinate (int or str, optional) – Defines y-coordinate (either by name or coordinate index) for which the ICE is computed. Defaults to 0.
  • x_coordinate (int or str, optional) – Defines x-coordinate (either by name or coordinate index) for which the ICE is computed. Defaults to 0.
  • start_index (int, optional) – Defines the start index of the data to be used in ICE computation (data[start_index:end_index] will be used). Defaults to 0.
  • end_index (int, optional) – Defines the end index of the data to be used in ICE computation (data[start_index:end_index] will be used). Defaults to -1.
  • cache (bool, optional) – If True, results will be cached. Defaults to False.
  • clustering_param (dict or None, optional) – Dictionary of parameters for method functional_clustering that is called if the parameter is not None and applies functional clustering to the ICE curves.
  • scale (str or int, optional) – Defines the scaling applied to the functions before functional clustering. Scaling is performed by dividing the vector of y-values of the ICE by the vector norm defined by scale; scale must be a valid value for the ord parameter of numpy's linalg.norm. If the string is empty, no scaling is applied. Defaults to ‘’.
Returns:

result object containing all relevant data (including functional clustering)

Return type:

ICE_Results
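Stripped of repository handling, caching and clustering, the core ICE computation can be sketched in plain numpy; ice_curves and its signature are illustrative only, and predict is assumed to map an (n, d) input array to an (n, n_outputs) output array:

```python
import numpy as np

def ice_curves(predict, X, x_values, x_coordinate=0, y_coordinate=0):
    """For each datapoint (row of X), sweep coordinate `x_coordinate`
    over `x_values` while all other coordinates stay fixed, and record
    output coordinate `y_coordinate` of the model's prediction."""
    result = np.empty((X.shape[0], len(x_values)))
    for i, row in enumerate(X):
        grid = np.tile(row, (len(x_values), 1))   # copies of the datapoint
        grid[:, x_coordinate] = x_values          # vary one input coordinate
        result[i, :] = np.asarray(predict(grid))[:, y_coordinate]
    return result
```

For a linear model the resulting ICE curves are parallel straight lines, which is a quick sanity check for the sketch.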

functional_clustering(x, scale='', n_clusters=20, random_state=42)

Given a set of 1D functions (represented in matrix form, where each row contains the function values on a given grid), this method clusters those functions and tries to find typical function structures.

Parameters:
  • x (numpy matrix) – Matrix containing in each row the function values at a datapoint.
  • n_clusters (int, optional) – Number of clusters for the functional clustering. Defaults to 20.
  • random_state (int, optional) – Seed for the random number generation used in the clustering. Defaults to 42.
  • scale (str or int, optional) – Defines the scaling applied to the functions before functional clustering. Scaling is performed by dividing the vector of y-values of the ICE by the vector norm defined by scale; scale must be a valid value for the ord parameter of numpy's linalg.norm. If the string is empty, no scaling is applied. Defaults to ‘’.
Returns:

  • numpy vector – Vector where each value defines the cluster of the respective function (x[i] is the cluster the i-th function belongs to).
  • numpy matrix – Contains in each row the distance to each cluster for the respective function.
  • numpy matrix – Contains in each row a cluster centroid.
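A minimal numpy-only sketch of this kind of functional clustering runs a few Lloyd (k-means) iterations directly on the rows of function values; the name functional_clustering_sketch, the n_iter parameter and the initialisation are illustrative assumptions, not pailab's actual implementation:

```python
import numpy as np

def functional_clustering_sketch(x, scale=None, n_clusters=2, random_state=42, n_iter=20):
    """Cluster 1D functions (one per row of x) with plain k-means,
    optionally scaling each row by a vector norm (`scale` is passed as
    the `ord` argument of numpy.linalg.norm) beforehand."""
    x = np.asarray(x, dtype=float)
    if scale is not None:
        x = x / np.linalg.norm(x, ord=scale, axis=1, keepdims=True)
    rng = np.random.default_rng(random_state)
    centroids = x[rng.choice(x.shape[0], n_clusters, replace=False)]
    for _ in range(n_iter):  # Lloyd iterations
        # distance of every function to every centroid
        dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):  # guard against empty clusters
                centroids[k] = x[labels == k].mean(axis=0)
    return labels, dist, centroids
```

The returned triple mirrors the documented one: cluster labels, per-function distances to each cluster, and the cluster centroids.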

generate_prototypes(ml_repo, data, n_prototypes, n_criticisms, data_version='last', use_x=True, data_start_index=0, data_end_index=-1, metric='rbf', witness_penalty=1.0, **kwds)

This method computes prototypes and criticisms for a given test/training dataset and adds them as separate test datasets to the repository.

Prototypes are datapoints from the given set that are typical representatives, while criticisms are datapoints that are not well represented by the prototypes. A simple greedy algorithm based on MMD2 (squared maximum mean discrepancy) is used to compute the prototypes, and a witness function together with a simple penalty is used to compute the criticisms (see e.g. C. Molnar, Interpretable Machine Learning).

Parameters:
  • ml_repo (MLRepo) – The repository used to retrieve data and store prototypes/criticisms.
  • data (str) – Name of data used for computation.
  • n_prototypes (int) – Number of prototypes.
  • n_criticisms (int) – Number of criticisms.
  • data_version (str) – Version of data to be used. Defaults to RepoStore.LAST_VERSION.
  • use_x (bool) – Flag that determines whether prototypes are computed w.r.t. the x or the y coordinates. Defaults to True.
  • data_start_index (int) – Start index of the data used.
  • data_end_index (int) – End index of the data used.
  • metric (str or callable, optional) – The metric used when calculating the kernel between instances in a feature array. If metric is a string, it must be one of the metrics in sklearn.metrics.pairwise.PAIRWISE_KERNEL_FUNCTIONS. If metric is ‘precomputed’, X is assumed to be a kernel matrix. Alternatively, if metric is a callable, it is called on each pair of instances (rows) and the resulting value recorded; the callable should take two arrays from X as input and return a value indicating the distance between them. Currently, sklearn provides the following strings: ‘additive_chi2’, ‘chi2’, ‘linear’, ‘poly’, ‘polynomial’, ‘rbf’, ‘laplacian’, ‘sigmoid’, ‘cosine’.
  • witness_penalty (float) – Penalty parameter used to avoid criticisms that are too close to each other.
  • **kwds – Optional keyword parameters that are passed directly to the kernel function.
Raises:

Exception – If sklearn is not installed

Returns:

  • list of int – List of indices defining the datapoints which are the resulting prototypes.
  • list of int – List of indices defining the datapoints which are the resulting criticisms.
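The greedy MMD2-based prototype step can be sketched in plain numpy; rbf_kernel, greedy_prototypes and the gamma parameter are illustrative names, and the criticism selection via the witness function is omitted for brevity:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def greedy_prototypes(X, n_prototypes, gamma=1.0):
    """Greedily pick datapoints that minimise MMD^2 between the full
    dataset and the prototype set (the constant mean(K_XX) term of
    MMD^2 is dropped since it does not depend on the selection)."""
    K = rbf_kernel(X, X, gamma)
    selected = []
    for _ in range(n_prototypes):
        best, best_obj = None, np.inf
        for j in range(X.shape[0]):
            if j in selected:
                continue
            S = selected + [j]
            # selection-dependent MMD^2 terms: mean(K_SS) - 2 * mean(K_SX)
            obj = K[np.ix_(S, S)].mean() - 2.0 * K[S, :].mean()
            if obj < best_obj:
                best, best_obj = j, obj
        selected.append(best)
    return selected
```

On two well-separated clusters, the first two prototypes land in different clusters, which matches the intuition that prototypes should cover the data distribution.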