mindarmour.adv_robustness.detectors
This module provides detector methods for distinguishing adversarial examples from benign examples.
- class mindarmour.adv_robustness.detectors.DivergenceBasedDetector(auto_encoder, model, option='jsd', t=1, bounds=(0.0, 1.0))[source]
- The divergence-based detector learns to distinguish normal examples from adversarial examples by their Jensen-Shannon (JS) divergence.
- Parameters
- auto_encoder (Model) – Auto-encoder model. 
- model (Model) – Targeted model. 
- option (str) – Method used to calculate the divergence. Default: “jsd”. 
- t (int) – Temperature used to overcome numerical problems. Default: 1. 
- bounds (tuple) – Lower and upper bounds of the data, in the form (clip_min, clip_max). Default: (0.0, 1.0). 
 
 - Examples
   >>> import numpy as np
   >>> import mindspore.ops.operations as P
   >>> from mindspore.nn import Cell
   >>> from mindspore import Model
   >>> from mindarmour.adv_robustness.detectors import DivergenceBasedDetector
   >>> class PredNet(Cell):
   ...     def __init__(self):
   ...         super(PredNet, self).__init__()
   ...         self.shape = P.Shape()
   ...         self.reshape = P.Reshape()
   ...         self._softmax = P.Softmax()
   ...     def construct(self, inputs):
   ...         data = self.reshape(inputs, (self.shape(inputs)[0], -1))
   ...         return self._softmax(data)
   >>> class Net(Cell):
   ...     def __init__(self):
   ...         super(Net, self).__init__()
   ...         self.add = P.Add()
   ...     def construct(self, inputs):
   ...         return self.add(inputs, inputs)
   >>> np.random.seed(5)
   >>> ori = np.random.rand(4, 4, 4).astype(np.float32)
   >>> np.random.seed(6)
   >>> adv = np.random.rand(4, 4, 4).astype(np.float32)
   >>> encoder = Model(Net())
   >>> model = Model(PredNet())
   >>> detector = DivergenceBasedDetector(encoder, model)
   >>> threshold = detector.fit(ori)
   >>> detector.set_threshold(threshold)
   >>> adv_ids = detector.detect(adv)
   >>> adv_trans = detector.transform(adv)
 - detect_diff(inputs)[source]
- Compute the distance between the original samples and the reconstructed samples. The distance is calculated using the Jensen-Shannon divergence (JSD).
- Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- float, the distance. 
- Raises
- NotImplementedError – If the param option is not supported. 
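The Jensen-Shannon divergence mentioned above can be illustrated independently of MindArmour. The following is a minimal sketch in plain Python, not the library's implementation: the helper names (`softmax`, `kl_divergence`, `js_divergence`) and the sample logits are hypothetical, and the way the temperature `t` enters the softmax is an assumption made for illustration.

```python
import math

def softmax(logits, t=1.0):
    """Temperature-scaled softmax; a larger t flattens the distribution."""
    scaled = [x / t for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical model outputs for an input and for its reconstruction.
probs_original = softmax([2.0, 0.5, 0.1], t=1.0)
probs_reconstructed = softmax([0.3, 1.9, 0.2], t=1.0)
distance = js_divergence(probs_original, probs_reconstructed)
```

A larger `distance` means that the model's prediction changes more when the input is replaced by its reconstruction, which is the signal a divergence-based detector thresholds.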
 
 
- class mindarmour.adv_robustness.detectors.EnsembleDetector(detectors, policy='vote')[source]
- The ensemble detector uses a list of detectors to detect the adversarial examples from the input samples. - Parameters
 - Examples
   >>> import numpy as np
   >>> from mindspore.ops.operations import Add
   >>> from mindspore.nn import Cell
   >>> from mindspore import Model
   >>> from mindarmour.adv_robustness.detectors import ErrorBasedDetector
   >>> from mindarmour.adv_robustness.detectors import RegionBasedDetector
   >>> from mindarmour.adv_robustness.detectors import EnsembleDetector
   >>> class Net(Cell):
   ...     def __init__(self):
   ...         super(Net, self).__init__()
   ...         self.add = Add()
   ...     def construct(self, inputs):
   ...         return self.add(inputs, inputs)
   >>> class AutoNet(Cell):
   ...     def __init__(self):
   ...         super(AutoNet, self).__init__()
   ...         self.add = Add()
   ...     def construct(self, inputs):
   ...         return self.add(inputs, inputs)
   >>> np.random.seed(6)
   >>> adv = np.random.rand(4, 4).astype(np.float32)
   >>> model = Model(Net())
   >>> auto_encoder = Model(AutoNet())
   >>> random_label = np.random.randint(10, size=4)
   >>> labels = np.eye(10)[random_label]
   >>> magnet_detector = ErrorBasedDetector(auto_encoder)
   >>> region_detector = RegionBasedDetector(model)
   >>> region_detector.fit(adv, labels)
   >>> detectors = [magnet_detector, region_detector]
   >>> detector = EnsembleDetector(detectors)
   >>> adv_ids = detector.detect(adv)
 - detect(inputs)[source]
- Detect adversarial examples from input samples.
- Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- list[int], whether a sample is adversarial. If res[i]=1, the input sample with index i is adversarial. 
- Raises
- ValueError – If policy is not supported. 
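How the results of several detectors might be combined can be sketched in plain Python. This is an illustration only: the function name `combine_detections` is hypothetical, the strict-majority reading of 'vote' is an assumption, and the 'all' and 'any' policies are included purely as illustrative alternatives that the text above does not confirm.

```python
def combine_detections(per_detector, policy="vote"):
    """Combine 0/1 flags from several detectors into one flag per sample.

    per_detector[d][i] is 1 if detector d flags sample i as adversarial.
    """
    n_detectors = len(per_detector)
    # Number of detectors that flagged each sample.
    counts = [sum(flags) for flags in zip(*per_detector)]
    if policy == "vote":   # assumed semantics: strict majority
        return [int(c > n_detectors / 2) for c in counts]
    if policy == "all":    # hypothetical policy: every detector must agree
        return [int(c == n_detectors) for c in counts]
    if policy == "any":    # hypothetical policy: one detector is enough
        return [int(c > 0) for c in counts]
    raise ValueError(f"Policy {policy!r} is not supported.")

# Flags from three hypothetical detectors over four samples.
flags = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
]
majority = combine_detections(flags, policy="vote")
```

Raising `ValueError` for an unknown policy mirrors the behaviour documented for `detect` above.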
 
 - detect_diff(inputs)[source]
- This method is not available in this class.
- Parameters
- inputs (Union[numpy.ndarray, list, tuple]) – Data used as references to create adversarial examples. 
- Raises
- NotImplementedError – This function is not available in ensemble. 
 
 - fit(inputs, labels=None)[source]
- Fit the detector like a machine learning model. This method is not available in this class.
- Parameters
- inputs (numpy.ndarray) – Data to calculate the threshold. 
- labels (numpy.ndarray) – Labels of data. Default: None. 
 
- Raises
- NotImplementedError – This function is not available in ensemble. 
 
 - transform(inputs)[source]
- Filter adversarial noise in input samples. This method is not available in this class.
- Parameters
- inputs (Union[numpy.ndarray, list, tuple]) – Data used as references to create adversarial examples. 
- Raises
- NotImplementedError – This function is not available in ensemble. 
 
 
- class mindarmour.adv_robustness.detectors.ErrorBasedDetector(auto_encoder, false_positive_rate=0.01, bounds=(0.0, 1.0))[source]
- The detector reconstructs input samples, measures reconstruction errors and rejects samples with large reconstruction errors. - Parameters
 - Examples
   >>> import numpy as np
   >>> from mindspore.ops.operations import Add
   >>> from mindspore.nn import Cell
   >>> from mindspore import Model
   >>> from mindarmour.adv_robustness.detectors import ErrorBasedDetector
   >>> class Net(Cell):
   ...     def __init__(self):
   ...         super(Net, self).__init__()
   ...         self.add = Add()
   ...     def construct(self, inputs):
   ...         return self.add(inputs, inputs)
   >>> np.random.seed(5)
   >>> ori = np.random.rand(4, 4, 4).astype(np.float32)
   >>> np.random.seed(6)
   >>> adv = np.random.rand(4, 4, 4).astype(np.float32)
   >>> model = Model(Net())
   >>> detector = ErrorBasedDetector(model)
   >>> detector.fit(ori)
   >>> adv_ids = detector.detect(adv)
   >>> adv_trans = detector.transform(adv)
 - detect(inputs)[source]
- Detect whether input samples are adversarial.
- Parameters
- inputs (numpy.ndarray) – Suspicious samples to be judged. 
- Returns
- list[int], whether a sample is adversarial. If res[i]=1, the input sample with index i is adversarial. 
 
 - detect_diff(inputs)[source]
- Compute the distance between the original samples and the reconstructed samples.
- Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- float, the distance between reconstructed and original samples. 
 
 - fit(inputs, labels=None)[source]
- Find a threshold for a given dataset to distinguish adversarial examples.
- Parameters
- inputs (numpy.ndarray) – Input samples. 
- labels (numpy.ndarray) – Labels of input samples. Default: None. 
 
- Returns
- float, threshold to distinguish adversarial samples from benign ones. 
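One common way to turn a target false positive rate into a threshold is to take a high quantile of the reconstruction errors measured on benign samples. The sketch below shows that idea in plain Python. It is an assumption about how such a threshold could be chosen, not a description of the library's internals, and the names `fit_threshold`, `detect`, and `benign_errors` are hypothetical.

```python
def fit_threshold(benign_errors, false_positive_rate=0.01):
    """Pick an error value that at most the given fraction of benign samples exceed."""
    ordered = sorted(benign_errors)
    n = len(ordered)
    # Number of benign samples allowed above the threshold
    # (the small epsilon guards against floating-point round-off).
    allowed = min(int(false_positive_rate * n + 1e-9), n - 1)
    return ordered[n - 1 - allowed]

def detect(errors, threshold):
    """Flag samples whose reconstruction error exceeds the threshold."""
    return [int(e > threshold) for e in errors]

# Hypothetical reconstruction errors of 100 benign samples: 0.00, 0.01, ..., 0.99.
benign_errors = [i / 100 for i in range(100)]
threshold = fit_threshold(benign_errors, false_positive_rate=0.05)
flags = detect([0.10, 0.97], threshold)
```

With this choice, exactly five of the 100 benign samples lie above the threshold, so the empirical false positive rate on the fitting data equals the requested 0.05.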
 
 - set_threshold(threshold)[source]
- Set the detection threshold.
- Parameters
- threshold (float) – Detection threshold. 
 
 - transform(inputs)[source]
- Reconstruct input samples.
- Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- numpy.ndarray, reconstructed images. 
 
 
- class mindarmour.adv_robustness.detectors.RegionBasedDetector(model, number_points=10, initial_radius=0.0, max_radius=1.0, search_step=0.01, degrade_limit=0.0, sparse=False)[source]
- The region-based detector exploits the fact that adversarial examples lie close to the classification boundary, and aggregates information from the region around a given example to predict whether it is adversarial.
- Reference: Mitigating evasion attacks to deep neural networks via region-based classification
- Parameters
- model (Model) – Target model. 
- number_points (int) – The number of samples generated from the hypercube around the original sample. Default: 10. 
- initial_radius (float) – Initial radius of the hypercube. Default: 0.0. 
- max_radius (float) – Maximum radius of the hypercube. Default: 1.0. 
- search_step (float) – Increment used during the radius search. Default: 0.01. 
- degrade_limit (float) – Acceptable decrease in classification accuracy. Default: 0.0. 
- sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: False. 
 
 - Examples
   >>> import numpy as np
   >>> from mindspore.ops.operations import Add
   >>> from mindspore.nn import Cell
   >>> from mindspore import Model
   >>> from mindarmour.adv_robustness.detectors import RegionBasedDetector
   >>> class Net(Cell):
   ...     def __init__(self):
   ...         super(Net, self).__init__()
   ...         self.add = Add()
   ...     def construct(self, inputs):
   ...         return self.add(inputs, inputs)
   >>> np.random.seed(5)
   >>> ori = np.random.rand(4, 4).astype(np.float32)
   >>> labels = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 1, 0],
   ...                    [0, 1, 0, 0]]).astype(np.int32)
   >>> np.random.seed(6)
   >>> adv = np.random.rand(4, 4).astype(np.float32)
   >>> model = Model(Net())
   >>> detector = RegionBasedDetector(model)
   >>> radius = detector.fit(ori, labels)
   >>> detector.set_radius(radius)
   >>> adv_ids = detector.detect(adv)
 - detect(inputs)[source]
- Determine whether input samples are adversarial.
- Parameters
- inputs (numpy.ndarray) – Suspicious samples to be judged. 
- Returns
- list[int], whether a sample is adversarial. If res[i]=1, the input sample with index i is adversarial. 
 
 - detect_diff(inputs)[source]
- Return raw prediction results and region-based prediction results.
- Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- numpy.ndarray, raw prediction results and region-based prediction results of input samples. 
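The comparison between a raw prediction and a region-based prediction can be sketched as follows. This is a simplified illustration in plain Python, assuming uniform sampling inside the hypercube and a majority vote over the sampled points. `toy_classifier`, `region_predict`, and `is_adversarial` are hypothetical names, and the classifier is a stand-in for the target model.

```python
import random

def toy_classifier(point):
    """Hypothetical stand-in for the target model: class 1 if x + y > 1."""
    return int(point[0] + point[1] > 1.0)

def region_predict(point, radius, number_points=10, rng=None):
    """Majority label over points sampled uniformly from the hypercube around `point`."""
    rng = rng or random.Random(0)
    votes = {}
    for _ in range(number_points):
        sample = [x + rng.uniform(-radius, radius) for x in point]
        label = toy_classifier(sample)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def is_adversarial(point, radius, number_points=10, rng=None):
    """Flag the point if the raw and the region-based predictions disagree."""
    raw = toy_classifier(point)
    regional = region_predict(point, radius, number_points, rng)
    return int(raw != regional)

# A point far from the decision boundary keeps its label inside the whole hypercube.
far_from_boundary = is_adversarial([0.1, 0.1], radius=0.2)
```

For points far from the boundary, every sampled neighbour receives the same label, so the two predictions agree. For points close to the boundary the outcome depends on the sampled neighbours, which is where a larger `radius` or `number_points` changes the result.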
 
 - fit(inputs, labels=None)[source]
- Train the detector to determine the best radius.
- Parameters
- inputs (numpy.ndarray) – Benign samples. 
- labels (numpy.ndarray) – Ground truth labels of the input samples. Default: None. 
 
- Returns
- float, the best radius. 
 
 - transform(inputs)[source]
- Generate a hypercube for each input sample.
- Parameters
- inputs (numpy.ndarray) – Input samples. 
- Returns
- numpy.ndarray, the hypercube corresponding to each sample. 
 
 
- class mindarmour.adv_robustness.detectors.SimilarityDetector(trans_model, max_k_neighbor=1000, chunk_size=1000, max_buffer_size=10000, tuning=False, fpr=0.001)[source]
- The detector measures similarity among adjacent queries and rejects queries which are remarkably similar to previous queries.
- Parameters
- trans_model (Model) – A MindSpore model that encodes input data into a lower-dimensional vector. 
- max_k_neighbor (int) – The maximum number of nearest neighbors. Default: 1000. 
- chunk_size (int) – Buffer size. Default: 1000. 
- max_buffer_size (int) – Maximum buffer size. Default: 10000. 
- tuning (bool) – Controls how the average distance to the nearest k neighbors is calculated. If True, k=K. If False, k=1,…,K. Default: False. 
- fpr (float) – False positive ratio on legitimate query sequences. Default: 0.001. 
 
 - Examples
   >>> import numpy as np
   >>> from mindspore.ops.operations import Add
   >>> from mindspore.nn import Cell
   >>> from mindspore import Model
   >>> from mindarmour.adv_robustness.detectors import SimilarityDetector
   >>> class EncoderNet(Cell):
   ...     def __init__(self, encode_dim):
   ...         super(EncoderNet, self).__init__()
   ...         self._encode_dim = encode_dim
   ...         self.add = Add()
   ...     def construct(self, inputs):
   ...         return self.add(inputs, inputs)
   ...     def get_encode_dim(self):
   ...         return self._encode_dim
   >>> np.random.seed(5)
   >>> x_train = np.random.rand(10, 32, 32, 3).astype(np.float32)
   >>> perm = np.random.permutation(x_train.shape[0])
   >>> benign_queries = x_train[perm[:10], :, :, :]
   >>> suspicious_queries = x_train[perm[-1], :, :, :] + np.random.normal(0, 0.05, (10,) + x_train.shape[1:])
   >>> suspicious_queries = suspicious_queries.astype(np.float32)
   >>> encoder = Model(EncoderNet(encode_dim=256))
   >>> detector = SimilarityDetector(max_k_neighbor=3, trans_model=encoder)
   >>> num_nearest_neighbors, thresholds = detector.fit(inputs=x_train)
   >>> detector.set_threshold(num_nearest_neighbors[-1], thresholds[-1])
   >>> detector.detect(benign_queries)
   >>> detections = detector.get_detection_interval()
   >>> detected_queries = detector.get_detected_queries()
 - detect(inputs)[source]
- Process queries to detect a black-box attack.
- Parameters
- inputs (numpy.ndarray) – Query sequence. 
- Raises
- ValueError – If threshold or num_of_neighbors has not been set. 
 
 - detect_diff(inputs)[source]
- Detect adversarial samples from input samples, like the predict_proba function in common machine learning models.
- Parameters
- inputs (Union[numpy.ndarray, list, tuple]) – Data used as references to create adversarial examples. 
- Raises
- NotImplementedError – This function is not available in class SimilarityDetector. 
 
 - fit(inputs, labels=None)[source]
- Process input training data to calculate the threshold. A proper threshold should keep the false positive rate below a given value.
- Parameters
- inputs (numpy.ndarray) – Training data to calculate the threshold. 
- labels (numpy.ndarray) – Labels of training data. Default: None. 
 
- Returns
- list[int], number of the nearest neighbors. 
- list[float], calculated thresholds for different K. 
 
- Raises
- ValueError – If the number of training samples is less than max_k_neighbor. 
 
 - get_detected_queries()[source]
- Get the indexes of detected queries.
- Returns
- list[int], sequence number of detected malicious queries. 
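The idea of flagging queries that sit unusually close to earlier ones can be sketched with a k-nearest-neighbour distance check. The code below is a plain-Python illustration under simplifying assumptions: queries are taken to be already encoded as low-dimensional vectors, the history is never cleared, and the names `mean_knn_distance` and `process_queries` are hypothetical rather than part of the library.

```python
import math

def mean_knn_distance(query, history, k):
    """Average Euclidean distance from `query` to its k nearest stored vectors."""
    distances = sorted(math.dist(query, h) for h in history)
    return sum(distances[:k]) / k

def process_queries(queries, k, threshold):
    """Return 1-based sequence numbers of queries that are too similar to earlier ones."""
    history, detected = [], []
    for number, query in enumerate(queries, start=1):
        # A query can only be judged once at least k earlier queries are stored.
        if len(history) >= k and mean_knn_distance(query, history, k) < threshold:
            detected.append(number)
        history.append(query)
    return detected

# Hypothetical encoded queries: three spread-out points, then a tight cluster near the origin.
encoded_queries = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
detected_queries = process_queries(encoded_queries, k=2, threshold=1.0)
```

The fourth query is not flagged because only one earlier query lies nearby, so its mean distance to the two nearest neighbours stays large. The fifth and sixth queries each have two close neighbours and fall below the threshold.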
 
 - get_detection_interval()[source]
- Get the interval between adjacent detections.
- Returns
- list[int], number of queries between adjacent detections. 
 
 - set_threshold(num_of_neighbors, threshold)[source]
- Set the parameters num_of_neighbors and threshold. 
 - transform(inputs)[source]
- Filter adversarial noise in input samples.
- Parameters
- inputs (Union[numpy.ndarray, list, tuple]) – Data used as references to create adversarial examples. 
- Raises
- NotImplementedError – This function is not available in class SimilarityDetector. 
 
 
- class mindarmour.adv_robustness.detectors.SpatialSmoothing(model, ksize=3, is_local_smooth=True, metric='l1', false_positive_ratio=0.05)[source]
- Detection method based on spatial smoothing. Gaussian filtering, median filtering, and mean filtering can be used to blur the original image. If the difference between the model's predictions before and after blurring exceeds the threshold, the sample is judged to be an adversarial example.
- Parameters
- model (Model) – Target model. 
- ksize (int) – Smoothing window size. Default: 3. 
- is_local_smooth (bool) – If True, use local smoothing. If False, use non-local smoothing. Default: True. 
- metric (str) – Distance metric. Default: ‘l1’. 
- false_positive_ratio (float) – False positive rate over benign samples. Default: 0.05. 
 
 - Examples
   >>> import numpy as np
   >>> import mindspore.ops.operations as P
   >>> from mindspore.nn import Cell
   >>> from mindspore import Model
   >>> from mindarmour.adv_robustness.detectors import SpatialSmoothing
   >>> class Net(Cell):
   ...     def __init__(self):
   ...         super(Net, self).__init__()
   ...         self._softmax = P.Softmax()
   ...     def construct(self, inputs):
   ...         return self._softmax(inputs)
   >>> input_shape = (50, 3)
   >>> np.random.seed(1)
   >>> input_np = np.random.randn(*input_shape).astype(np.float32)
   >>> np.random.seed(2)
   >>> adv_np = np.random.randn(*input_shape).astype(np.float32)
   >>> model = Model(Net())
   >>> detector = SpatialSmoothing(model)
   >>> threshold = detector.fit(input_np)
   >>> detector.set_threshold(threshold.item())
   >>> detected_res = np.array(detector.detect(adv_np))
 - detect(inputs)[source]
- Detect whether an input sample is an adversarial example.
- Parameters
- inputs (numpy.ndarray) – Suspicious samples to be judged. 
- Returns
- list[int], whether a sample is adversarial. If res[i]=1, the input sample with index i is adversarial. 
 
 - detect_diff(inputs)[source]
- Return the raw distance value (before applying the threshold) between the input sample and its smoothed counterpart.
- Parameters
- inputs (numpy.ndarray) – Suspicious samples to be judged. 
- Returns
- float, distance. 
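The raw distance described above can be illustrated with a one-dimensional median filter and an L1 distance between predictions. This is a plain-Python sketch under simplifying assumptions, not the library's implementation: `median_smooth`, `toy_model`, and `smoothing_distance` are hypothetical names, the signal is 1-D rather than an image, and the model is a toy two-class softmax.

```python
import math
from statistics import median

def median_smooth(values, ksize=3):
    """1-D median filter with edge replication (a stand-in for 2-D image smoothing)."""
    half = ksize // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + ksize]) for i in range(len(values))]

def toy_model(values):
    """Hypothetical classifier: softmax over a 'peak' logit and a 'flat' logit."""
    logits = [max(values), sum(values) / len(values)]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smoothing_distance(values, ksize=3):
    """L1 distance between predictions on the input and on its smoothed version."""
    before = toy_model(values)
    after = toy_model(median_smooth(values, ksize))
    return sum(abs(b - a) for b, a in zip(before, after))

flat_signal = [0.2, 0.2, 0.2, 0.2, 0.2]
spiky_signal = [0.2, 0.2, 3.0, 0.2, 0.2]  # an isolated spike that the median filter removes
flat_distance = smoothing_distance(flat_signal)
spiky_distance = smoothing_distance(spiky_signal)
```

Smoothing leaves the flat signal unchanged, so its distance is zero, while removing the spike shifts the prediction noticeably. Comparing such a distance with a fitted threshold gives the 0/1 decision returned by `detect`.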
 
 - fit(inputs, labels=None)[source]
- Train the detector to determine the threshold. A proper threshold ensures that the actual false positive rate over benign samples is less than the given value.
- Parameters
- inputs (numpy.ndarray) – Benign samples. 
- labels (numpy.ndarray) – Labels of the input samples. Default: None. 
 
- Returns
- float, the threshold. Samples with a distance larger than this value are reported as positive, i.e. adversarial.