mindinsight.lineagemgr

Lineagemgr Module Introduction.

This module provides Python APIs for querying model lineage. Use them to retrieve lineage information such as which hyperparameters were used in model training or which model version has the highest accuracy.

mindinsight.lineagemgr.filter_summary_lineage(summary_base_dir, search_condition=None)

Filter the lineage information under the summary base directory according to the search condition.

Users can filter and sort all lineage information according to the search condition. The supported filter fields include summary_dir, network, etc. The supported filter conditions are eq, lt, gt, le, ge and in. If the value of a filter field is a str, as with summary_dir and lineage_type, only the in and eq conditions can be used. Fields and conditions can be combined in a single search condition. To sort the result by a filter field, specify both sorted_name and sorted_type.
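As a rough illustration of how the filter conditions read, the following sketch maps each condition key to a Python comparison. It is not the library's implementation: the names _OPERATORS and matches are hypothetical, and it assumes that several conditions on one field must all hold at once.

```python
import operator

# Hypothetical mapping from the documented condition keys to Python comparisons.
_OPERATORS = {
    "eq": operator.eq,
    "lt": operator.lt,
    "gt": operator.gt,
    "le": operator.le,
    "ge": operator.ge,
    "in": lambda value, candidates: value in candidates,
}


def matches(value, condition):
    """Return True if value satisfies every operator in condition.

    Assumption: multiple operators on one field are combined with AND.
    """
    return all(_OPERATORS[key](value, expected) for key, expected in condition.items())


print(matches(192, {"ge": 128, "le": 256}))            # True
print(matches(64, {"ge": 128, "le": 256}))             # False
print(matches("model", {"in": ["dataset", "model"]}))  # True
```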

Users can use lineage_type to decide what kind of lineage information to query. If lineage_type is not defined, the query returns all lineage information.

Users can paginate the query result with offset and limit. The offset is the page number, and the limit is the number of results per page.
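The paging arithmetic can be sketched as follows. This is only an illustration on a plain list: the helper paginate is hypothetical, and it assumes that offset is a zero-based page number, which is consistent with the documented range starting at 0 but is not confirmed by the text.

```python
def paginate(records, offset, limit):
    """Return page number `offset` (assumed zero-based) with at most `limit` records."""
    start = offset * limit
    return records[start:start + limit]


records = list(range(7))
print(paginate(records, offset=0, limit=3))  # [0, 1, 2]
print(paginate(records, offset=2, limit=3))  # [6]
```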

Parameters
  • summary_base_dir (str) – The summary base directory. It contains summary directories generated by training.

  • search_condition (dict) –

    The search condition. When filtering and sorting, fields prefixed with metric/ and user_defined/ are supported in addition to the fields listed below. For example, if the metrics parameter has a key named accuracy, the field is metric/accuracy. Fields prefixed with metric/ correspond to the metrics parameter in the training script, and fields prefixed with user_defined/ correspond to the user-defined information in the TrainLineage/EvalLineage callback. Default: None.

    • summary_dir (dict): The filter condition of summary directory.

    • loss_function (dict): The filter condition of loss function.

    • train_dataset_path (dict): The filter condition of train dataset path.

    • train_dataset_count (dict): The filter condition of train dataset count.

    • test_dataset_path (dict): The filter condition of test dataset path.

    • test_dataset_count (dict): The filter condition of test dataset count.

    • network (dict): The filter condition of network.

    • optimizer (dict): The filter condition of optimizer.

    • learning_rate (dict): The filter condition of learning rate.

    • epoch (dict): The filter condition of epoch.

    • batch_size (dict): The filter condition of batch size.

    • device_num (dict): The filter condition of device num.

    • loss (dict): The filter condition of loss.

    • model_size (dict): The filter condition of model size.

    • dataset_mark (dict): The filter condition of dataset mark.

    • lineage_type (dict): The filter condition of lineage type. It decides what kind of lineage information to query. Its value can be dataset or model, e.g., {'in': ['dataset', 'model']}, {'eq': 'model'}, etc. If its values contain dataset, the query result will contain the lineage information related to data augmentation. If its values contain model, the query result will contain model lineage information. If it is not defined or it is a dict like {'in': ['dataset', 'model']}, the query result is all lineage information.

    • offset (int): The page number. The value range is [0, 100000].

    • limit (int): The number of results per page. The value range is [1, 100].

    • sorted_name (str): Specify which field to sort by.

    • sorted_type (str): Specify sort order. It can be ascending or descending.
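The constraints listed above can be checked locally before a query is issued. The sketch below is a hypothetical helper, not part of mindinsight.lineagemgr: it encodes only the operators, ranges and sort orders stated in this section, and it treats summary_dir and lineage_type as the only string-valued fields because those are the two the text names.

```python
# Assumption: only the operators and ranges stated in the text above are encoded here.
ALLOWED_OPERATORS = {"eq", "lt", "gt", "le", "ge", "in"}
STRING_OPERATORS = {"eq", "in"}
# The text names these two as string-valued fields; other string fields may exist.
STRING_FIELDS = {"summary_dir", "lineage_type"}


def check_search_condition(search_condition):
    """Return a list of problems found in search_condition (empty if none)."""
    problems = []
    for field, value in search_condition.items():
        if field == "offset":
            if not isinstance(value, int) or not 0 <= value <= 100000:
                problems.append("offset must be an int in [0, 100000]")
        elif field == "limit":
            if not isinstance(value, int) or not 1 <= value <= 100:
                problems.append("limit must be an int in [1, 100]")
        elif field == "sorted_name":
            if not isinstance(value, str):
                problems.append("sorted_name must be a str")
        elif field == "sorted_type":
            if value not in ("ascending", "descending"):
                problems.append("sorted_type must be ascending or descending")
        elif not isinstance(value, dict):
            problems.append(f"{field} must be a dict of filter conditions")
        else:
            allowed = STRING_OPERATORS if field in STRING_FIELDS else ALLOWED_OPERATORS
            for op in sorted(set(value) - allowed):
                problems.append(f"{field} does not support the '{op}' condition")
    return problems


ok = {"batch_size": {"ge": 128, "le": 256}, "metric/accuracy": {"lt": 0.1}, "limit": 3, "offset": 0}
bad = {"summary_dir": {"gt": "/path"}, "limit": 0}
print(check_search_condition(ok))   # []
print(check_search_condition(bad))
```

A check like this only catches mistakes early; the library remains the authority on what it accepts and raises LineageSearchConditionParamError for invalid conditions.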

Returns

dict, the lineage information under the summary base directory that matches the search condition.

Raises
  • LineageSearchConditionParamError – If search_condition param is invalid.

  • LineageParamSummaryPathError – If summary path is invalid.

  • LineageFileNotFoundError – If the summary log file is not found.

  • LineageQuerySummaryDataError – If querying summary log file data fails.

Examples

>>> import os
>>> from mindinsight.lineagemgr import filter_summary_lineage
>>> summary_base_dir = "/path/to/summary_base"
>>> search_condition = {
...     'summary_dir': {
...         'in': [
...             os.path.join(summary_base_dir, 'summary_1'),
...             os.path.join(summary_base_dir, 'summary_2'),
...             os.path.join(summary_base_dir, 'summary_3')
...         ]
...     },
...     'loss': {
...         'gt': 2.0
...     },
...     'batch_size': {
...         'ge': 128,
...         'le': 256
...     },
...     'metric/accuracy': {
...         'lt': 0.1
...     },
...     'sorted_name': 'summary_dir',
...     'sorted_type': 'descending',
...     'limit': 3,
...     'offset': 0,
...     'lineage_type': {
...         'eq': 'model'
...     }
... }
>>> summary_lineage = filter_summary_lineage(summary_base_dir)
>>> summary_lineage_filter = filter_summary_lineage(summary_base_dir, search_condition)

mindinsight.lineagemgr.get_summary_lineage(summary_dir, keys=None)

Get the lineage information according to the summary directory and keys.

The function queries the lineage information of the single training process that corresponds to the given summary directory. Users can restrict the returned information with keys.

Parameters
  • summary_dir (str) – The summary directory. It contains the summary logs of one training run.

  • keys (list[str]) – The filter keys of lineage information. The acceptable keys are metric, user_defined, hyper_parameters, algorithm, train_dataset, model, valid_dataset and dataset_graph. If it is None, all information will be returned. Default: None.
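Because keys accepts only a fixed set of values, a caller may want to check a list before passing it in. The sketch below is a hypothetical helper, not part of mindinsight.lineagemgr; the tuple ACCEPTABLE_KEYS simply copies the acceptable keys listed for the keys parameter.

```python
# Copied from the acceptable keys listed for the keys parameter.
ACCEPTABLE_KEYS = (
    "metric", "user_defined", "hyper_parameters", "algorithm",
    "train_dataset", "model", "valid_dataset", "dataset_graph",
)


def unknown_keys(keys):
    """Return the entries of keys that are not acceptable; None means 'all keys'."""
    if keys is None:
        return []
    return [key for key in keys if key not in ACCEPTABLE_KEYS]


print(unknown_keys(["hyper_parameters"]))          # []
print(unknown_keys(["hyper_parameters", "loss"]))  # ['loss']
```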

Returns

dict, the lineage information of one training run.

Raises
  • LineageParamSummaryPathError – If summary path is invalid.

  • LineageQuerySummaryDataError – If querying summary data fails.

  • LineageFileNotFoundError – If the summary log file is not found.

Examples

>>> from mindinsight.lineagemgr import get_summary_lineage
>>> summary_dir = "/path/to/summary"
>>> summary_lineage_info = get_summary_lineage(summary_dir)
>>> hyper_parameters = get_summary_lineage(summary_dir, keys=["hyper_parameters"])