mindinsight.lineagemgr

Lineagemgr Module Introduction.

This module provides Python APIs to collect and query the lineage of models. Users can add the TrainLineage or EvalLineage callback to the MindSpore train or eval callback list to collect key parameters and results, such as the names of the network and optimizer, the evaluation metrics, and the evaluation results. The query APIs then return the lineage information of the models, for example which hyperparameters were used in training, or which model version has the highest accuracy.

class mindinsight.lineagemgr.TrainLineage(summary_record, raise_exception=False)[source]

Collect lineage of a training job.

Parameters
  • summary_record (SummaryRecord) – An instance of SummaryRecord, which is used to record summary values, see mindspore.train.summary.SummaryRecord.

  • raise_exception (bool) – Whether to raise an exception when an error occurs in TrainLineage. If True, the exception is raised. If False, the exception is caught and the job continues. Default: False.

Raises
  • MindInsightException – If validating parameter fails.

  • LineageLogError – If recording lineage information fails.
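
The effect of raise_exception can be pictured with a small, MindInsight-independent sketch. It is not the TrainLineage implementation: LineageSketchError, run_guarded and failing_action are hypothetical names introduced only to show the "raise or catch and continue" pattern described for this parameter.

```python
import logging

logger = logging.getLogger("lineage_sketch")


class LineageSketchError(Exception):
    """Hypothetical stand-in for an error raised while recording lineage."""


def run_guarded(action, raise_exception=False):
    """Run action(); re-raise its error only when raise_exception is True.

    Returns True if action() succeeded and False if an error was swallowed.
    """
    try:
        action()
    except LineageSketchError as error:
        if raise_exception:
            raise
        # Default behavior: log the problem and let the job continue.
        logger.warning("Lineage collection failed: %s", error)
        return False
    return True


def failing_action():
    """Hypothetical action that always fails, used for illustration."""
    raise LineageSketchError("cannot write lineage record")


# With the default raise_exception=False the error is logged, not raised.
swallowed = run_guarded(failing_action)
```

Under these assumptions, leaving the flag at False keeps a lineage problem from interrupting a long training job, while True surfaces the problem immediately, which is usually more convenient while debugging a script.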

Examples

>>> from mindinsight.lineagemgr import TrainLineage
>>> from mindspore.train.callback import ModelCheckpoint, SummaryStep
>>> from mindspore.train.model import Model
>>> from mindspore.train.summary import SummaryRecord
>>> # train_network, epoch_num and dataset are assumed to be defined by the training script.
>>> model = Model(train_network)
>>> model_ckpt = ModelCheckpoint(directory='/dir/to/save/model/')
>>> summary_writer = SummaryRecord(log_dir='./')
>>> summary_callback = SummaryStep(summary_writer, flush_step=2)
>>> lineagemgr = TrainLineage(summary_record=summary_writer)
>>> model.train(epoch_num, dataset, callbacks=[model_ckpt, summary_callback, lineagemgr])

begin(run_context)[source]

Initialize the training progress when the training job begins.

Parameters

run_context (RunContext) – It contains all lineage information, see mindspore.train.callback.RunContext.

Raises

MindInsightException – If validating parameter fails.

end(run_context)[source]

Collect lineage information when the training job ends.

Parameters

run_context (RunContext) – It contains all lineage information, see mindspore.train.callback.RunContext.

Raises

LineageLogError – If recording lineage information fails.

class mindinsight.lineagemgr.EvalLineage(summary_record, raise_exception=False)[source]

Collect lineage of an evaluation job.

Parameters
  • summary_record (SummaryRecord) – An instance of SummaryRecord, which is used to record summary values, see mindspore.train.summary.SummaryRecord.

  • raise_exception (bool) – Whether to raise an exception when an error occurs in EvalLineage. If True, the exception is raised. If False, the exception is caught and the job continues. Default: False.

Raises
  • MindInsightException – If validating parameter fails.

  • LineageLogError – If recording lineage information fails.

Examples

>>> from mindinsight.lineagemgr import EvalLineage
>>> from mindspore.train.callback import ModelCheckpoint, SummaryStep
>>> from mindspore.train.model import Model
>>> from mindspore.train.summary import SummaryRecord
>>> # train_network and dataset are assumed to be defined by the evaluation script.
>>> model = Model(train_network)
>>> model_ckpt = ModelCheckpoint(directory='/dir/to/save/model/')
>>> summary_writer = SummaryRecord(log_dir='./')
>>> summary_callback = SummaryStep(summary_writer, flush_step=2)
>>> lineagemgr = EvalLineage(summary_record=summary_writer)
>>> # Model.eval takes the dataset first; it has no epoch argument.
>>> model.eval(dataset, callbacks=[model_ckpt, summary_callback, lineagemgr])

end(run_context)[source]

Collect lineage information when the evaluation job ends.

Parameters

run_context (RunContext) – It contains all lineage information, see mindspore.train.callback.RunContext.

Raises
  • MindInsightException – If validating parameter fails.

  • LineageLogError – If recording lineage information fails.

mindinsight.lineagemgr.get_summary_lineage(summary_dir, keys=None)[source]

Get the lineage information according to summary directory and keys.

The function queries the lineage information of a single training process, identified by the given summary directory. Users can restrict the returned information with keys.

Parameters
  • summary_dir (str) – The summary directory. It contains the summary logs of one training run.

  • keys (list[str]) – The filter keys of lineage information. The acceptable keys are metric, hyper_parameters, algorithm, train_dataset, model, valid_dataset and dataset_graph. If it is None, all information will be returned. Default: None.

Returns

dict, the lineage information of one training run.

Raises
  • LineageParamSummaryPathError – If summary path is invalid.

  • LineageQuerySummaryDataError – If querying summary data fails.

  • LineageFileNotFoundError – If the summary log file is not found.
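
The role of keys can be illustrated without MindInsight. The sketch below is not the library's code: select_lineage_keys is a hypothetical helper, the sample dictionary is invented for illustration, and raising ValueError for an unknown key is an assumption (the exception MindInsight raises for invalid keys is not stated in this section). Only the list of acceptable keys is taken from the parameter description.

```python
# Keys listed as acceptable in the description of the keys parameter.
ACCEPTABLE_KEYS = (
    "metric", "hyper_parameters", "algorithm", "train_dataset",
    "model", "valid_dataset", "dataset_graph",
)


def select_lineage_keys(lineage, keys=None):
    """Return the whole lineage dict, or only the entries named in keys."""
    if keys is None:
        # None means "no filtering": everything is returned.
        return dict(lineage)
    unknown = [key for key in keys if key not in ACCEPTABLE_KEYS]
    if unknown:
        # Assumption: the sketch rejects unknown keys with ValueError.
        raise ValueError(f"Unsupported keys: {unknown}")
    return {key: lineage.get(key) for key in keys}


# Invented sample data, shaped loosely like a lineage result.
sample = {
    "metric": {"accuracy": 0.97},
    "hyper_parameters": {"epoch": 10},
    "model": {"size": 1024},
}
selected = select_lineage_keys(sample, keys=["hyper_parameters"])
```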

Examples

>>> from mindinsight.lineagemgr import get_summary_lineage
>>> summary_dir = "/path/to/summary"
>>> summary_lineage_info = get_summary_lineage(summary_dir)
>>> hyper_parameters = get_summary_lineage(summary_dir, keys=["hyper_parameters"])

mindinsight.lineagemgr.filter_summary_lineage(summary_base_dir, search_condition=None)[source]

Filter the lineage information under summary base directory according to search condition.

Users can filter and sort all lineage information according to the search condition. The supported filter fields include summary_dir, network, etc. The supported filter conditions are eq, lt, gt, le, ge and in. Fields and conditions can be combined. To sort the result, specify both sorted_name and sorted_type.
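
The way fields and conditions combine can be sketched in plain Python. This is an illustration of the semantics described above, not MindInsight's implementation: matches and sort_records are hypothetical helpers, the records are invented, and treating all conditions as a logical AND is an assumption.

```python
import operator

# Map each condition name to a two-argument predicate.
COMPARATORS = {
    "eq": operator.eq,
    "lt": operator.lt,
    "gt": operator.gt,
    "le": operator.le,
    "ge": operator.ge,
    "in": lambda value, candidates: value in candidates,
}


def matches(record, search_condition):
    """Return True if the record satisfies every condition of every field."""
    for field, conditions in search_condition.items():
        if field not in record:
            return False
        for name, expected in conditions.items():
            if not COMPARATORS[name](record[field], expected):
                return False
    return True


def sort_records(items, sorted_name, sorted_type="ascending"):
    """Sort records by one field, in ascending or descending order."""
    return sorted(items, key=lambda item: item[sorted_name],
                  reverse=(sorted_type == "descending"))


# Invented records for illustration only.
records = [
    {"network": "ResNet", "batch_size": 128, "loss": 2.5},
    {"network": "LeNet", "batch_size": 32, "loss": 0.4},
    {"network": "ResNet", "batch_size": 256, "loss": 1.1},
]
condition = {"batch_size": {"ge": 128, "le": 256}, "loss": {"gt": 2.0}}
filtered = [record for record in records if matches(record, condition)]
ordered = sort_records(records, "loss", "descending")
```

Here two conditions on the same field (ge and le on batch_size) express a range, and a second field narrows the result further.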

Users can use lineage_type to decide what kind of lineage information to query. If the lineage_type is dataset, the query result is only the lineage information related to data augmentation. If the lineage_type is model or None, the query result is all lineage information.

Users can paginate the query result with offset and limit. The offset is the page number, and the limit is the number of records on one page.
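
Because offset is a page number rather than a record index, the records on a page start at offset * limit. The sketch below shows that arithmetic under two assumptions: pages are numbered from 0 (suggested by the documented range starting at 0), and paginate with its default limit of 10 is a hypothetical helper, not part of MindInsight.

```python
def paginate(items, offset=0, limit=10):
    """Return the page with number `offset`, holding at most `limit` items."""
    start = offset * limit
    return items[start:start + limit]


# Seven invented summary directory names.
summary_dirs = [f"summary_{index}" for index in range(1, 8)]

first_page = paginate(summary_dirs, offset=0, limit=3)
third_page = paginate(summary_dirs, offset=2, limit=3)  # last, partial page
```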

Parameters
  • summary_base_dir (str) – The summary base directory. It contains summary directories generated by training.

  • search_condition (dict) –

    The search condition. When filtering and sorting, in addition to the following supported fields, fields prefixed with metric_ are also supported. The fields prefixed with metric_ are related to the metrics parameter in the training script. For example, if the key of metrics parameter is accuracy, the field should be metric_accuracy. Default: None.

    • summary_dir (dict): The filter condition of summary directory.

    • loss_function (dict): The filter condition of loss function.

    • train_dataset_path (dict): The filter condition of train dataset path.

    • train_dataset_count (dict): The filter condition of train dataset count.

    • test_dataset_path (dict): The filter condition of test dataset path.

    • test_dataset_count (dict): The filter condition of test dataset count.

    • network (dict): The filter condition of network.

    • optimizer (dict): The filter condition of optimizer.

    • learning_rate (dict): The filter condition of learning rate.

    • epoch (dict): The filter condition of epoch.

    • batch_size (dict): The filter condition of batch size.

    • loss (dict): The filter condition of loss.

    • model_size (dict): The filter condition of model size.

    • dataset_mark (dict): The filter condition of dataset mark.

    • offset (int): The page number. The value range is [0, 100000].

    • limit (int): The number of records on one page. The value range is [1, 100].

    • sorted_name (str): Specify which field to sort by.

    • sorted_type (str): Specify sort order. It can be ascending or descending.

    • lineage_type (str): It decides what kind of lineage information to query. It can be dataset or model. If it is dataset, the query result is only the lineage information related to data augmentation. If it is model or None, the query result is all lineage information.
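
The constraints on the non-filter fields, and the naming rule for metric fields, can be checked before a query is issued. The following is a sketch only: metric_field and check_paging_and_sorting are hypothetical helpers, and the messages they produce are not MindInsight's. The value ranges and allowed strings are the ones listed above.

```python
SORT_ORDERS = ("ascending", "descending")
LINEAGE_TYPES = ("dataset", "model")


def metric_field(metric_key):
    """Build the search field name for a key of the metrics parameter."""
    return f"metric_{metric_key}"


def check_paging_and_sorting(search_condition):
    """Return a list of problems found in the paging and sorting fields."""
    problems = []
    offset = search_condition.get("offset")
    if offset is not None and not (isinstance(offset, int) and 0 <= offset <= 100000):
        problems.append("offset must be an int in [0, 100000]")
    limit = search_condition.get("limit")
    if limit is not None and not (isinstance(limit, int) and 1 <= limit <= 100):
        problems.append("limit must be an int in [1, 100]")
    sorted_type = search_condition.get("sorted_type")
    if sorted_type is not None and sorted_type not in SORT_ORDERS:
        problems.append("sorted_type must be ascending or descending")
    lineage_type = search_condition.get("lineage_type")
    if lineage_type is not None and lineage_type not in LINEAGE_TYPES:
        problems.append("lineage_type must be dataset or model")
    return problems


# A condition that filters on the accuracy metric and asks for page 0.
condition = {
    metric_field("accuracy"): {"lt": 0.1},
    "sorted_name": "summary_dir",
    "sorted_type": "descending",
    "limit": 3,
    "offset": 0,
    "lineage_type": "model",
}
problems = check_paging_and_sorting(condition)
```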

Returns

dict, all lineage information under summary base directory according to search condition.

Raises
  • LineageSearchConditionParamError – If search_condition param is invalid.

  • LineageParamSummaryPathError – If summary path is invalid.

  • LineageFileNotFoundError – If the summary log file is not found.

  • LineageQuerySummaryDataError – If querying summary log file data fails.

Examples

>>> import os
>>> from mindinsight.lineagemgr import filter_summary_lineage
>>> summary_base_dir = "/path/to/summary_base"
>>> search_condition = {
...     'summary_dir': {
...         'in': [
...             os.path.join(summary_base_dir, 'summary_1'),
...             os.path.join(summary_base_dir, 'summary_2'),
...             os.path.join(summary_base_dir, 'summary_3')
...         ]
...     },
...     'loss': {
...         'gt': 2.0
...     },
...     'batch_size': {
...         'ge': 128,
...         'le': 256
...     },
...     'metric_accuracy': {
...         'lt': 0.1
...     },
...     'sorted_name': 'summary_dir',
...     'sorted_type': 'descending',
...     'limit': 3,
...     'offset': 0,
...     'lineage_type': 'model'
... }
>>> summary_lineage = filter_summary_lineage(summary_base_dir)
>>> summary_lineage_filter = filter_summary_lineage(summary_base_dir, search_condition)