Defense Functions

Available functions:


Adversarial Training

Adversarial training trains the model on both the original examples and adversarial examples generated from them, which makes the model more robust to adversarial attacks.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    x (numpy.ndarray): The input training examples.
    y (numpy.ndarray): The true labels of the training examples.
    epsilon (float): The magnitude of the perturbation (default: 0.01).

Returns:
    defended_model (tensorflow.keras.Model): The adversarially trained model.
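
As a rough illustration of the training loop described above, the sketch below uses a NumPy logistic-regression model instead of a tensorflow.keras.Model and generates adversarial examples with the fast gradient sign method (FGSM). The choice of attack, the helper names (sigmoid, fgsm_examples, adversarial_training_sketch), and the lr and epochs arguments are assumptions made for this example and are not part of the documented function.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_examples(w, b, x, y, epsilon=0.01):
    """Perturb x by epsilon in the direction that increases the loss (FGSM)."""
    p = sigmoid(x @ w + b)
    # Gradient of binary cross-entropy with respect to the inputs.
    grad_x = (p - y)[:, None] * w[None, :]
    return x + epsilon * np.sign(grad_x)

def adversarial_training_sketch(x, y, epsilon=0.01, lr=0.1, epochs=200):
    """Train logistic regression on clean plus FGSM examples (hypothetical helper)."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Regenerate adversarial examples against the current weights.
        x_adv = fgsm_examples(w, b, x, y, epsilon)
        x_all = np.concatenate([x, x_adv])
        y_all = np.concatenate([y, y])
        # One gradient-descent step on the combined clean + adversarial batch.
        err = sigmoid(x_all @ w + b) - y_all
        w -= lr * (x_all.T @ err) / len(y_all)
        b -= lr * err.mean()
    return w, b
```

With a Keras model the same idea would typically compute the input gradient with automatic differentiation rather than the closed form used here.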

Feature Squeezing

Feature squeezing reduces the number of bits used to represent the input features, which can remove certain adversarial perturbations.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    bit_depth (int): The number of bits per feature (default: 4).

Returns:
    defended_model (tensorflow.keras.Model): The model with feature squeezing defense.
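
The bit-depth reduction itself can be sketched in a few lines of NumPy. This assumes inputs scaled to the range [0, 1]; the helper name squeeze_bit_depth is hypothetical and only illustrates the quantization step, not how the defended model is wrapped.

```python
import numpy as np

def squeeze_bit_depth(x, bit_depth=4):
    """Quantize inputs in [0, 1] to 2**bit_depth evenly spaced levels."""
    levels = 2 ** bit_depth - 1
    # Clip first so out-of-range values map to the nearest valid level.
    return np.round(np.clip(x, 0.0, 1.0) * levels) / levels

# Two inputs that differ by a small perturbation collapse to the same value.
x = np.array([0.51, 0.52])
print(squeeze_bit_depth(x, bit_depth=4))
```

Perturbations smaller than half a quantization step are often removed this way, while larger ones can still cross a level boundary.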

Gradient Masking

Gradient masking modifies the gradients during training to make them less informative for adversarial attackers.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    mask_threshold (float): The threshold for masking gradients (default: 0.1).

Returns:
    defended_model (tensorflow.keras.Model): The model with gradient masking defense.
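
The page does not say how mask_threshold is applied. One possible reading, shown purely as an assumption, is that gradient components whose magnitude falls below the threshold are set to zero. The helper name mask_gradients is hypothetical.

```python
import numpy as np

def mask_gradients(gradients, mask_threshold=0.1):
    """Zero gradient components with magnitude below mask_threshold.

    Assumed masking rule for illustration; the library's rule may differ.
    """
    return [np.where(np.abs(g) < mask_threshold, 0.0, g) for g in gradients]

# One gradient array per trainable variable, as a training step would produce.
grads = [np.array([0.05, -0.2, 0.1])]
print(mask_gradients(grads))
```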

Input Transformation

Input transformation applies a transformation to the input data before feeding it to the model, aiming to remove adversarial perturbations.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    transformation_function (callable): The transformation function to apply to the inputs (default: None).

Returns:
    defended_model (tensorflow.keras.Model): The model with input transformation defense.
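
A minimal sketch of the wrapping idea follows. It assumes only that the model exposes a predict method; the TransformedModel and SumModel classes are hypothetical stand-ins, and treating a transformation_function of None as "no transformation" is a guess at the default behavior.

```python
import numpy as np

class TransformedModel:
    """Apply a transformation to the inputs before delegating to the model."""

    def __init__(self, model, transformation_function=None):
        self.model = model
        self.transformation_function = transformation_function

    def predict(self, x):
        # Assumed behavior: None means the inputs are passed through unchanged.
        if self.transformation_function is not None:
            x = self.transformation_function(x)
        return self.model.predict(x)

# Stand-in model for the example; any object with a predict method works here.
class SumModel:
    def predict(self, x):
        return x.sum(axis=1)

defended = TransformedModel(SumModel(), transformation_function=np.round)
print(defended.predict(np.array([[0.2, 0.9]])))
```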

Defensive Distillation

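Defensive distillation trains a student model on the temperature-softened output probabilities (soft labels) of a teacher model, which smooths the student's decision surface and makes it less sensitive to small input perturbations.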

Parameters:
    model (tensorflow.keras.Model): The student model to defend.
    teacher_model (tensorflow.keras.Model): The teacher model.
    temperature (float): The temperature parameter for distillation (default: 2).

Returns:
    defended_model (tensorflow.keras.Model): The distilled student model.
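
To make the role of temperature concrete, the sketch below computes softened probabilities and the cross-entropy between teacher and student outputs with NumPy. The helper names softmax_with_temperature and distillation_loss are hypothetical, and the loss shown is the standard soft-label cross-entropy, which may differ from what this function uses internally.

```python
import numpy as np

def softmax_with_temperature(logits, temperature=2.0):
    """Softmax of logits / temperature; higher temperature gives softer outputs."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's soft labels and the student's outputs."""
    soft_targets = softmax_with_temperature(teacher_logits, temperature)
    student_log_probs = np.log(softmax_with_temperature(student_logits, temperature))
    return float(-np.mean(np.sum(soft_targets * student_log_probs, axis=-1)))

teacher_logits = np.array([[2.0, 1.0, 0.1]])
print(softmax_with_temperature(teacher_logits, temperature=1.0))
print(softmax_with_temperature(teacher_logits, temperature=5.0))
```

Raising the temperature flattens the teacher's distribution, so the student also learns how the teacher ranks the incorrect classes.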

Randomized Smoothing

Randomized smoothing adds Gaussian noise to the input data to make the model's predictions more robust to adversarial attacks.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    noise_level (float): The standard deviation of the Gaussian noise (default: 0.1).

Returns:
    defended_model (tensorflow.keras.Model): The model with randomized smoothing defense.
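
Randomized smoothing is commonly implemented by classifying many noisy copies of the input and returning the majority vote. The sketch below shows that prediction step under this assumption; smoothed_predict, predict_fn, num_samples, and seed are hypothetical names, and whether this function aggregates predictions the same way is not stated on this page.

```python
import numpy as np

def smoothed_predict(predict_fn, x, noise_level=0.1, num_samples=100, seed=0):
    """Majority-vote label over Gaussian-perturbed copies of a single input x."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_level, size=(num_samples,) + x.shape)
    noisy_batch = x[None, ...] + noise
    labels = predict_fn(noisy_batch)  # expected: non-negative integer class labels
    return int(np.bincount(labels).argmax())

# Toy classifier: class 1 when the features sum to a positive value.
toy_classifier = lambda batch: (batch.sum(axis=1) > 0).astype(int)
print(smoothed_predict(toy_classifier, np.array([1.0, 1.0])))
```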

Feature Denoising

Feature denoising applies denoising operations to the input data to remove adversarial perturbations.

Parameters:
    model (tensorflow.keras.Model): The model to defend.

Returns:
    defended_model (tensorflow.keras.Model): The model with feature denoising defense.

Thermometer Encoding

Thermometer encoding discretizes each input feature into bins and represents it as a unary (thermometer) code, making it harder for small adversarial perturbations to affect the model.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    num_bins (int): The number of bins for encoding (default: 10).

Returns:
    defended_model (tensorflow.keras.Model): The model with thermometer encoding defense.
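
The encoding step can be sketched with NumPy as follows. The example assumes inputs scaled to [0, 1] and evenly spaced bin thresholds; thermometer_encode is a hypothetical helper, and other bit-ordering conventions for thermometer codes exist.

```python
import numpy as np

def thermometer_encode(x, num_bins=10):
    """Encode each value in [0, 1] as num_bins bits: bit k is 1 if x >= k / num_bins."""
    x = np.clip(x, 0.0, 1.0)
    thresholds = np.arange(num_bins) / num_bins  # 0.0, 0.1, ..., 0.9 for 10 bins
    # A new trailing axis holds the code, so the output shape is x.shape + (num_bins,).
    return (x[..., None] >= thresholds).astype(np.float32)

print(thermometer_encode(np.array([0.0, 0.35, 1.0]), num_bins=10))
```

Because the code only changes when a value crosses a bin threshold, small perturbations inside a bin leave the encoded input unchanged.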

Adversarial Logit Pairing (ALP)

Adversarial logit pairing encourages the logits of adversarial examples to be similar to those of clean examples.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    paired_model (tensorflow.keras.Model): The paired model for logit pairing.

Returns:
    defended_model (tensorflow.keras.Model): The model with adversarial logit pairing defense.
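
The pairing idea can be expressed as a loss term that penalizes the distance between clean and adversarial logits. The NumPy sketch below adds a mean squared difference to the cross-entropy on the adversarial examples; the helper names and the pairing_weight argument are assumptions for illustration, and how paired_model enters the computation is not described on this page.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for integer class labels."""
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

def alp_loss(clean_logits, adv_logits, labels, pairing_weight=0.5):
    """Adversarial cross-entropy plus a penalty on the clean/adversarial logit gap."""
    pairing = float(np.mean((clean_logits - adv_logits) ** 2))
    return cross_entropy(adv_logits, labels) + pairing_weight * pairing

clean = np.array([[2.0, 0.0], [0.0, 2.0]])
adv = clean + np.array([[0.0, 1.0], [1.0, 0.0]])
print(alp_loss(clean, adv, labels=np.array([0, 1])))
```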

Spatial Smoothing

Spatial smoothing applies a smoothing filter to the input data to remove adversarial perturbations.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    kernel_size (int): The size of the smoothing kernel (default: 3).

Returns:
    defended_model (tensorflow.keras.Model): The model with spatial smoothing defense.
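
The page does not name the filter, so the sketch below assumes a median filter, a common choice for spatial smoothing. It handles a single 2-D grayscale image with an odd kernel_size; median_smooth is a hypothetical helper, and it relies on NumPy's sliding_window_view (NumPy 1.20 or later).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_smooth(image, kernel_size=3):
    """Median-filter a 2-D image, replicating edge pixels to keep the shape."""
    pad = kernel_size // 2
    padded = np.pad(image, pad, mode="edge")
    # windows has shape (H, W, kernel_size, kernel_size) for odd kernel sizes.
    windows = sliding_window_view(padded, (kernel_size, kernel_size))
    return np.median(windows, axis=(-2, -1))

# A single isolated bright pixel is removed by the 3x3 median.
image = np.zeros((5, 5))
image[2, 2] = 1.0
print(median_smooth(image, kernel_size=3))
```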
