mindarmour.detectors
This module includes detector methods for distinguishing adversarial examples from benign examples.
- class mindarmour.detectors.DivergenceBasedDetector(auto_encoder, model, option='jsd', t=1, bounds=(0.0, 1.0))[source]
- This class implements a divergence-based detector. - Parameters
- auto_encoder (Model) – Encoder model. 
- model (Model) – Targeted model. 
- option (str) – Method used to calculate the divergence. Default: “jsd”. 
- t (int) – Temperature used to overcome numerical problems. Default: 1. 
- bounds (tuple) – Lower and upper bounds of the data, in the form (clip_min, clip_max). Default: (0.0, 1.0). 
 
 - Examples
   >>> import numpy as np
   >>> from mindarmour.detectors import DivergenceBasedDetector
   >>> # Net and PredNet are user-defined networks (not shown here).
   >>> np.random.seed(5)
   >>> ori = np.random.rand(4, 4, 4).astype(np.float32)
   >>> np.random.seed(6)
   >>> adv = np.random.rand(4, 4, 4).astype(np.float32)
   >>> encoder = Model(Net())
   >>> model = Model(PredNet())
   >>> detector = DivergenceBasedDetector(encoder, model)
   >>> threshold = detector.fit(ori)
   >>> detector.set_threshold(threshold)
   >>> detected_res = detector.detect(adv)
   >>> adv_trans = detector.transform(adv)
 - detect_diff(inputs)[source]
- Detect the distance between original samples and reconstructed samples. - The distance is calculated using the Jensen-Shannon divergence (JSD). - Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- float, the distance. 
- Raises
- NotImplementedError – If the option parameter is not supported. 
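The divergence computation described above can be sketched in plain NumPy. This is a minimal illustration rather than the library's implementation: it assumes the detector compares the target model's temperature-scaled softmax outputs for the original and the reconstructed inputs, and the helper names (`softmax`, `kl_divergence`, `jsd`) and the logits are hypothetical.

```python
import numpy as np


def softmax(logits, t=1.0):
    """Temperature-scaled softmax over the last axis."""
    scaled = logits / t
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # avoid overflow in exp
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) per row; eps guards against log(0)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)


def jsd(p, q):
    """Jensen-Shannon divergence per row, in nats."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical logits of the target model for original and reconstructed inputs.
logits_ori = np.array([[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]])
logits_rec = np.array([[2.0, 1.0, 0.1], [3.0, 0.5, 0.5]])

# One score per sample: near zero when both predictions agree, larger otherwise.
scores = jsd(softmax(logits_ori, t=1.0), softmax(logits_rec, t=1.0))
```

A larger temperature flattens both distributions before they are compared, which is one way a parameter like t can keep the logarithms numerically well behaved.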
 
 
- class mindarmour.detectors.EnsembleDetector(detectors, policy='vote')[source]
- Ensemble detector. - Parameters
 - detect(inputs)[source]
- Detect adversarial examples from input samples. - Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- list[int], whether each sample is adversarial. If res[i]=1, the input sample with index i is adversarial. 
- Raises
- ValueError – If policy is not supported. 
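As a rough sketch of how per-detector decisions might be combined, the snippet below implements a majority vote over 0/1 results. It is not the library's code: the function name `ensemble_detect`, the tie-breaking rule, and the "all" and "any" policy names are assumptions made for illustration; only the "vote" default appears in the signature above.

```python
import numpy as np


def ensemble_detect(per_detector_results, policy="vote"):
    """Combine 0/1 decisions of shape (n_detectors, n_samples)."""
    results = np.asarray(per_detector_results)
    n_detectors = results.shape[0]
    votes = results.sum(axis=0)
    if policy == "vote":
        # Adversarial when strictly more than half of the detectors say so
        # (tie handling is an assumption of this sketch).
        flagged = votes > n_detectors / 2
    elif policy == "all":  # assumed policy name, for illustration only
        flagged = votes == n_detectors
    elif policy == "any":  # assumed policy name, for illustration only
        flagged = votes > 0
    else:
        raise ValueError(f"Unsupported policy: {policy!r}")
    return flagged.astype(int).tolist()


# Hypothetical outputs of three detectors on four samples.
decisions = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
res = ensemble_detect(decisions)
```

Here res[i] equals 1 for the samples that at least two of the three detectors flagged.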
 
 - detect_diff(inputs)[source]
- This method is not available in this class. - Parameters
- inputs (Union[numpy.ndarray, list, tuple]) – Data used as references to create adversarial examples. 
- Raises
- NotImplementedError – This function is not available in ensemble. 
 
 - fit(inputs, labels=None)[source]
- Fit detector like a machine learning model. This method is not available in this class. - Parameters
- inputs (numpy.ndarray) – Data to calculate the threshold. 
- labels (numpy.ndarray) – Labels of data. 
 
- Raises
- NotImplementedError – This function is not available in ensemble. 
 
 - transform(inputs)[source]
- Filter adversarial noises in input samples. This method is not available in this class. - Raises
- NotImplementedError – This function is not available in ensemble. 
 
 
- class mindarmour.detectors.ErrorBasedDetector(auto_encoder, false_positive_rate=0.01, bounds=(0.0, 1.0))[source]
- The detector reconstructs input samples, measures reconstruction errors and rejects samples with large reconstruction errors. - Parameters
 - Examples
   >>> import numpy as np
   >>> from mindarmour.detectors import ErrorBasedDetector
   >>> # Net is a user-defined auto-encoder network (not shown here).
   >>> np.random.seed(5)
   >>> ori = np.random.rand(4, 4, 4).astype(np.float32)
   >>> np.random.seed(6)
   >>> adv = np.random.rand(4, 4, 4).astype(np.float32)
   >>> model = Model(Net())
   >>> detector = ErrorBasedDetector(model)
   >>> detector.fit(ori)
   >>> detected_res = detector.detect(adv)
   >>> adv_trans = detector.transform(adv)
 - detect(inputs)[source]
- Detect if input samples are adversarial or not. - Parameters
- inputs (numpy.ndarray) – Suspicious samples to be judged. 
- Returns
- list[int], whether each sample is adversarial. If res[i]=1, the input sample with index i is adversarial. 
 
 - detect_diff(inputs)[source]
- Detect the distance between the original samples and reconstructed samples. - Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- float, the distance between reconstructed and original samples. 
 
 - fit(inputs, labels=None)[source]
- Find a threshold for a given dataset to distinguish adversarial examples. - Parameters
- inputs (numpy.ndarray) – Input samples. 
- labels (numpy.ndarray) – Labels of input samples. Default: None. 
 
- Returns
- float, threshold to distinguish adversarial samples from benign ones. 
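One plausible way to derive such a threshold from benign data is to take a high quantile of the reconstruction errors, so that roughly false_positive_rate of benign samples exceed it. The sketch below assumes a mean-squared-error metric and a quantile rule; both are assumptions of this example, and the noisy identity stands in for a real auto-encoder.

```python
import numpy as np


def reconstruction_error(inputs, reconstructed):
    """Mean squared error per sample (assumed error metric)."""
    diff = inputs.astype(np.float64) - reconstructed.astype(np.float64)
    return np.mean(diff.reshape(inputs.shape[0], -1) ** 2, axis=1)


def fit_threshold(errors, false_positive_rate=0.01):
    """Threshold that about `false_positive_rate` of benign errors exceed."""
    return float(np.quantile(errors, 1.0 - false_positive_rate))


def detect(errors, threshold):
    """Return 1 for samples whose error is above the threshold, else 0."""
    return [int(e > threshold) for e in errors]


rng = np.random.default_rng(0)
benign = rng.random((200, 4, 4)).astype(np.float32)
# Stand-in for auto_encoder.predict: a slightly noisy identity (hypothetical).
benign_rec = benign + rng.normal(0.0, 0.01, benign.shape).astype(np.float32)

benign_errors = reconstruction_error(benign, benign_rec)
threshold = fit_threshold(benign_errors, false_positive_rate=0.05)
flags = detect(benign_errors, threshold)
```

With this rule, a sample whose reconstruction differs strongly from the input, as adversarial samples are expected to, lands well above the threshold.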
 
 - set_threshold(threshold)[source]
- Set the threshold parameter. - Parameters
- threshold (float) – Detection threshold. Default: None. 
 
 - transform(inputs)[source]
- Reconstruct input samples. - Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- numpy.ndarray, reconstructed images. 
 
 
- class mindarmour.detectors.RegionBasedDetector(model, number_points=10, initial_radius=0.0, max_radius=1.0, search_step=0.01, degrade_limit=0.0, sparse=False)[source]
- This class implements a region-based detector. - Reference: Mitigating evasion attacks to deep neural networks via region-based classification - Parameters
- model (Model) – Target model. 
- number_points (int) – The number of samples generated from the hypercube around the original sample. Default: 10. 
- initial_radius (float) – Initial radius of the hypercube. Default: 0.0. 
- max_radius (float) – Maximum radius of the hypercube. Default: 1.0. 
- search_step (float) – Increment used during the radius search. Default: 0.01. 
- degrade_limit (float) – Acceptable decrease of classification accuracy. Default: 0.0. 
- sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: False. 
 
 - Examples
   >>> # model, ori, labels and adv are assumed to be defined beforehand.
   >>> detector = RegionBasedDetector(model)
   >>> detector.fit(Tensor(ori), Tensor(labels))
   >>> adv_ids = detector.detect(Tensor(adv))
 - detect(inputs)[source]
- Tell whether input samples are adversarial or not. - Parameters
- inputs (numpy.ndarray) – Suspicious samples to be judged. 
- Returns
- list[int], whether each sample is adversarial. If res[i]=1, the input sample with index i is adversarial. 
 
 - detect_diff(inputs)[source]
- Return raw prediction results and region-based prediction results. - Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- numpy.ndarray, raw prediction results and region-based prediction results of input samples. 
 
 - fit(inputs, labels=None)[source]
- Train detector to decide the best radius. - Parameters
- inputs (numpy.ndarray) – Benign samples. 
- labels (numpy.ndarray) – Ground truth labels of the input samples. Default: None. 
 
- Returns
- float, the best radius. 
 
 - transform(inputs)[source]
- Generate hypercubes for input samples. - Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- numpy.ndarray, the hypercube corresponding to each sample. 
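The idea behind region-based detection can be illustrated with a small NumPy sketch: draw number_points samples from the hypercube around an input, take the majority label, and flag the input when that label disagrees with the raw prediction. Everything here is hypothetical, including `toy_predict` (a stand-in model with a narrow pocket of class 1) and the fixed radius; the library searches for the radius in fit.

```python
import numpy as np


def toy_predict(x):
    """Hypothetical model: class 1 only in a narrow pocket around mean 0.5."""
    mean = x.reshape(x.shape[0], -1).mean(axis=1)
    return (np.abs(mean - 0.5) < 1e-3).astype(int)


def region_predict(predict_fn, sample, radius, number_points=10,
                   bounds=(0.0, 1.0), rng=None):
    """Majority label over points drawn uniformly from the hypercube."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.uniform(-radius, radius, size=(number_points,) + sample.shape)
    points = np.clip(sample[None, ...] + noise, bounds[0], bounds[1])
    labels = predict_fn(points)
    return int(np.bincount(labels).argmax())


def detect(predict_fn, samples, radius, number_points=10, rng=None):
    """Flag a sample when its raw label differs from its region-based label."""
    raw = predict_fn(samples)
    return [
        int(raw[i] != region_predict(predict_fn, s, radius, number_points, rng=rng))
        for i, s in enumerate(samples)
    ]


rng = np.random.default_rng(0)
benign = np.full((4, 4), 0.1)      # far from the pocket: label is stable
suspicious = np.full((4, 4), 0.5)  # inside the pocket: label flips nearby
flags = detect(toy_predict, np.stack([benign, suspicious]), radius=0.2, rng=rng)
```

The second sample is flagged because almost every point in its neighbourhood is classified differently from the sample itself, which is the behaviour the detector relies on.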
 
 
- class mindarmour.detectors.SimilarityDetector(trans_model, max_k_neighbor=1000, chunk_size=1000, max_buffer_size=10000, tuning=False, fpr=0.001)[source]
- The detector measures similarity among adjacent queries and rejects queries which are remarkably similar to previous queries. - Parameters
- trans_model (Model) – A MindSpore model to encode input data into lower dimension vector. 
- max_k_neighbor (int) – The maximum number of the nearest neighbors. Default: 1000. 
- chunk_size (int) – Buffer size. Default: 1000. 
- max_buffer_size (int) – Maximum buffer size. Default: 10000. 
- tuning (bool) – Controls how the average distance to the nearest k neighbors is calculated. If True, k=K. If False, k=1,…,K. Default: False. 
- fpr (float) – False positive ratio on legitimate query sequences. Default: 0.001. 
 
 - Examples
   >>> # model, ori, labels and adv are assumed to be defined beforehand.
   >>> detector = SimilarityDetector(model)
   >>> detector.fit(Tensor(ori), Tensor(labels))
   >>> adv_ids = detector.detect(Tensor(adv))
 - detect(inputs)[source]
- Process queries to detect black-box attack. - Parameters
- inputs (numpy.ndarray) – Query sequence. 
- Raises
- ValueError – If the threshold or num_of_neighbors parameter is not available. 
 
 - detect_diff(inputs)[source]
- Detect adversarial samples from input samples, like the predict_proba function in common machine learning model. - Parameters
- inputs (Union[numpy.ndarray, list, tuple]) – Data used as references to create adversarial examples. 
- Raises
- NotImplementedError – This function is not available in class SimilarityDetector. 
 
 - fit(inputs, labels=None)[source]
- Process input training data to calculate the threshold. A proper threshold should make sure the false positive rate is under a given value. - Parameters
- inputs (numpy.ndarray) – Training data to calculate the threshold. 
- labels (numpy.ndarray) – Labels of training data. 
 
- Returns
- list[int], number of the nearest neighbors. 
- list[float], calculated thresholds for different K. 
 
- Raises
- ValueError – If the number of training samples is less than max_k_neighbor. 
 
 - get_detected_queries()[source]
- Get the indexes of detected queries. - Returns
- list[int], sequence number of detected malicious queries. 
 
 - get_detection_interval()[source]
- Get the interval between adjacent detections. - Returns
- list[int], number of queries between adjacent detections. 
 
 - set_threshold(num_of_neighbors, threshold)[source]
- Set the parameters num_of_neighbors and threshold. 
 - transform(inputs)[source]
- Filter adversarial noises in input samples. - Raises
- NotImplementedError – This function is not available in class SimilarityDetector. 
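A simplified view of similarity-based query detection: each new encoded query is compared with earlier queries, and it is reported when the mean distance to its k nearest predecessors falls below a threshold. The sketch below assumes Euclidean distance on already-encoded vectors and ignores buffering (chunk_size, max_buffer_size); the function names and data are hypothetical.

```python
import numpy as np


def mean_knn_distance(query, history, k):
    """Average Euclidean distance from `query` to its k nearest vectors."""
    dists = np.linalg.norm(history - query, axis=1)
    return float(np.sort(dists)[:k].mean())


def detect_queries(encoded_queries, k, threshold):
    """Return indexes of queries that sit unusually close to earlier queries."""
    detected = []
    for i in range(k, len(encoded_queries)):
        history = encoded_queries[:i]
        if mean_knn_distance(encoded_queries[i], history, k) < threshold:
            detected.append(i)
    return detected


rng = np.random.default_rng(0)
# 50 unrelated benign queries followed by 10 near-duplicate attack queries.
benign = rng.random((50, 8))
attack = 0.5 + rng.normal(0.0, 1e-3, size=(10, 8))
queries = np.concatenate([benign, attack])

detected = detect_queries(queries, k=3, threshold=0.05)
# Number of queries between adjacent detections.
intervals = np.diff(detected).tolist()
```

Detection only starts once k similar queries have accumulated, so the first few attack queries pass unnoticed in this sketch.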
 
 
- class mindarmour.detectors.SpatialSmoothing(model, ksize=3, is_local_smooth=True, metric='l1', false_positive_ratio=0.05)[source]
- Detect method based on spatial smoothing. - Parameters
- model (Model) – Target model. 
- ksize (int) – Smooth window size. Default: 3. 
- is_local_smooth (bool) – If True, apply local smoothing. If False, do not apply local smoothing. Default: True. 
- metric (str) – Distance method. Default: ‘l1’. 
- false_positive_ratio (float) – False positive rate over benign samples. Default: 0.05. 
 
 - Examples
   >>> # model, ori, labels and adv are assumed to be defined beforehand.
   >>> detector = SpatialSmoothing(model)
   >>> detector.fit(Tensor(ori), Tensor(labels))
   >>> adv_ids = detector.detect(Tensor(adv))
 - detect(inputs)[source]
- Detect if an input sample is an adversarial example. - Parameters
- inputs (numpy.ndarray) – Suspicious samples to be judged. 
- Returns
- list[int], whether each sample is adversarial. If res[i]=1, the input sample with index i is adversarial. 
 
 - detect_diff(inputs)[source]
- Return the raw distance value (before applying the threshold) between the input sample and its smoothed counterpart. - Parameters
- inputs (numpy.ndarray) – Suspicious samples to be judged. 
- Returns
- float, distance. 
 
 - fit(inputs, labels=None)[source]
- Train detector to decide the threshold. A proper threshold ensures that the actual false positive rate over benign samples is less than the given value. - Parameters
- inputs (numpy.ndarray) – Benign samples. 
- labels (numpy.ndarray) – Default: None. 
 
- Returns
- float, the threshold. A distance larger than this value is reported as positive, i.e. adversarial.
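To make the smoothing-based check concrete, here is a minimal NumPy sketch. It assumes a median filter as the local smoothing step, the L1 distance between the model's outputs on the raw and the smoothed input, and a quantile-based threshold; `toy_predict` is a hypothetical stand-in for the target model, and none of the helper names come from the library.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_smooth(images, ksize=3):
    """Apply a ksize x ksize median filter to a batch of images (N, H, W)."""
    pad = ksize // 2
    padded = np.pad(images, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    windows = sliding_window_view(padded, (ksize, ksize), axis=(1, 2))
    return np.median(windows, axis=(-2, -1))


def toy_predict(images):
    """Hypothetical model: two-class scores driven by the brightest pixel."""
    peak = images.reshape(images.shape[0], -1).max(axis=1)
    return np.stack([1.0 - peak, peak], axis=1)


def smoothing_distance(predict_fn, images, ksize=3):
    """L1 distance between predictions on raw and smoothed inputs."""
    raw = predict_fn(images)
    smoothed = predict_fn(median_smooth(images, ksize))
    return np.abs(raw - smoothed).sum(axis=1)


# Benign set: flat images, which smoothing leaves unchanged.
levels = np.linspace(0.1, 0.5, 5)
benign = levels[:, None, None] * np.ones((5, 5, 5))
threshold = float(np.quantile(smoothing_distance(toy_predict, benign), 1.0 - 0.05))

# Suspicious image: a single-pixel spike that the median filter removes.
spiked = np.full((5, 5), 0.2)
spiked[2, 2] = 1.0
samples = np.stack([benign[0], spiked])
distances = smoothing_distance(toy_predict, samples)
flags = (distances > threshold).astype(int).tolist()
```

Smoothing barely changes the prediction on the flat image but removes the spike that drives the second prediction, so only the second sample exceeds the threshold.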