Adversarial Training
Adversarial training trains the model on both the original examples and adversarial examples generated from them, making the model more robust to adversarial attacks.
Parameters:
model (tensorflow.keras.Model): The model to defend.
x (numpy.ndarray): The input training examples.
y (numpy.ndarray): The true labels of the training examples.
epsilon (float): The magnitude of the perturbation (default: 0.01).
Returns:
defended_model (tensorflow.keras.Model): The adversarially trained model.
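The core idea can be sketched in plain NumPy, assuming an FGSM-style perturbation (x_adv = x + epsilon * sign(grad_x loss)). The `fgsm_perturb` helper, the toy linear model, and the squared-error loss below are illustrative only, not part of this API:

```python
import numpy as np

def fgsm_perturb(x, grad, epsilon=0.01):
    """Perturb inputs in the direction that increases the loss (FGSM)."""
    return x + epsilon * np.sign(grad)

# Toy setting: linear model y_hat = x @ w with squared-error loss,
# chosen so the input gradient has a closed form.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
y = rng.normal(size=(4,))
w = rng.normal(size=(3,))

# Analytic gradient of 0.5 * (x @ w - y)**2 with respect to x.
grad_x = (x @ w - y)[:, None] * w[None, :]

x_adv = fgsm_perturb(x, grad_x, epsilon=0.01)

# Adversarial training then fits on clean and adversarial examples together.
x_train = np.concatenate([x, x_adv])
y_train = np.concatenate([y, y])
```

In practice the gradient comes from backpropagation through the model rather than a closed form, and fresh adversarial examples are regenerated each epoch as the weights change.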
Feature Squeezing
Feature squeezing reduces the number of bits used to represent the input features, which can remove certain adversarial perturbations.
Parameters:
model (tensorflow.keras.Model): The model to defend.
bit_depth (int): The number of bits per feature (default: 4).
Returns:
defended_model (tensorflow.keras.Model): The model with feature squeezing defense.
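The bit-depth reduction itself is a one-line quantization. A minimal NumPy sketch, assuming inputs scaled to [0, 1] (the `squeeze_bits` helper is illustrative, not part of this API):

```python
import numpy as np

def squeeze_bits(x, bit_depth=4):
    """Round inputs in [0, 1] to the nearest of 2**bit_depth levels."""
    levels = 2 ** bit_depth - 1
    return np.round(x * levels) / levels

x = np.linspace(0.0, 1.0, 1000)
x_sq = squeeze_bits(x, bit_depth=4)   # at most 16 distinct values remain
```

Small perturbations that fall within a single quantization level are erased, at the cost of discarding the same fine-grained detail in clean inputs.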
Gradient Masking
Gradient masking modifies the gradients during training to make them less informative for adversarial attackers.
Parameters:
model (tensorflow.keras.Model): The model to defend.
mask_threshold (float): The threshold for masking gradients (default: 0.1).
Returns:
defended_model (tensorflow.keras.Model): The model with gradient masking defense.
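One simple masking scheme consistent with the `mask_threshold` parameter is to zero out gradient components below the threshold, so an attacker probing the gradient sees less of the loss surface. A NumPy sketch (the `mask_gradients` helper is illustrative, not part of this API):

```python
import numpy as np

def mask_gradients(grad, mask_threshold=0.1):
    """Zero out gradient components whose magnitude is below the threshold."""
    return np.where(np.abs(grad) >= mask_threshold, grad, 0.0)

g = np.array([0.05, -0.2, 0.15, -0.01])
g_masked = mask_gradients(g, mask_threshold=0.1)
# small components are zeroed; large ones pass through unchanged
```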
Input Transformation
Input transformation applies a transformation to the input data before feeding it to the model, aiming to remove adversarial perturbations.
Parameters:
model (tensorflow.keras.Model): The model to defend.
transformation_function (function): The transformation function to apply (default: None).
Returns:
defended_model (tensorflow.keras.Model): The model with input transformation defense.
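The defense wraps inference with a preprocessing step. A NumPy sketch of the wrapper, with clipping as a stand-in transformation (the `transform_inputs` helper and the example function are illustrative, not part of this API):

```python
import numpy as np

def transform_inputs(x, transformation_function=None):
    """Apply an optional transformation to inputs before inference."""
    if transformation_function is None:
        return x
    return transformation_function(x)

# Example transformation: clip values back into the valid input range.
clip_to_range = lambda x: np.clip(x, 0.0, 1.0)
x = np.array([-0.2, 0.4, 1.3])
x_t = transform_inputs(x, clip_to_range)
```

Common choices for the transformation include JPEG compression, bit-depth reduction, and total-variance minimization.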
Defensive Distillation
Defensive distillation trains a student model to mimic the temperature-softened predictions of a teacher model; the softened targets smooth the student's decision surface, making its gradients less useful to attackers.
Parameters:
model (tensorflow.keras.Model): The student model to defend.
teacher_model (tensorflow.keras.Model): The teacher model.
temperature (float): The temperature parameter for distillation (default: 2).
Returns:
defended_model (tensorflow.keras.Model): The distilled student model.
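The role of the `temperature` parameter is to soften the teacher's output distribution before the student is trained on it. A NumPy sketch of temperature-scaled softmax (the helper name is illustrative, not part of this API):

```python
import numpy as np

def softmax_with_temperature(logits, temperature=2.0):
    """Temperature-scaled softmax used to produce distillation targets."""
    z = logits / temperature
    z = z - z.max()            # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

teacher_logits = np.array([4.0, 1.0, 0.0])
hard = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=2.0)
# The higher the temperature, the softer (less peaked) the target distribution.
```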
Randomized Smoothing
Randomized smoothing adds random Gaussian noise to copies of the input and aggregates the model's predictions over them, making the model more robust (and, in some settings, certifiably robust) to adversarial attacks.
Parameters:
model (tensorflow.keras.Model): The model to defend.
noise_level (float): The standard deviation of the Gaussian noise (default: 0.1).
Returns:
defended_model (tensorflow.keras.Model): The model with randomized smoothing defense.
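At inference time the smoothed model classifies by majority vote over noisy copies of the input. A NumPy sketch, using a trivial threshold classifier as a stand-in for the real model (the helper and classifier are illustrative, not part of this API):

```python
import numpy as np

def smoothed_predict(predict_fn, x, noise_level=0.1, n_samples=100, rng=None):
    """Classify by majority vote over Gaussian-perturbed copies of x."""
    rng = rng if rng is not None else np.random.default_rng(0)
    noisy = x + rng.normal(scale=noise_level, size=(n_samples,) + x.shape)
    votes = np.array([predict_fn(xi) for xi in noisy])
    return np.bincount(votes).argmax()

# Toy classifier: class 1 if the mean input exceeds 0.5, else class 0.
predict = lambda xi: int(xi.mean() > 0.5)
x = np.full(4, 0.7)
label = smoothed_predict(predict, x, noise_level=0.1)
```

Larger `noise_level` values trade clean accuracy for robustness to larger perturbations.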
Feature Denoising
Feature denoising applies denoising operations to the input data to remove adversarial perturbations.
Parameters:
model (tensorflow.keras.Model): The model to defend.
Returns:
defended_model (tensorflow.keras.Model): The model with feature denoising defense.
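A simple denoising operation of the kind this defense applies is a local mean filter, which averages out isolated high-frequency perturbations. A 1-D NumPy sketch (the `mean_denoise` helper is illustrative, not part of this API):

```python
import numpy as np

def mean_denoise(x, window=3):
    """1-D mean filter: replace each value with the average of its window."""
    pad = window // 2
    padded = np.pad(x, pad, mode='edge')
    return np.array([padded[i:i + window].mean() for i in range(len(x))])

x = np.array([1.0, 1.0, 5.0, 1.0, 1.0])   # isolated spike, standing in for a perturbation
x_dn = mean_denoise(x, window=3)           # the spike is attenuated
```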
Thermometer Encoding
Thermometer encoding discretizes the input features into bins, making it harder for adversarial perturbations to affect the model.
Parameters:
model (tensorflow.keras.Model): The model to defend.
num_bins (int): The number of bins for encoding (default: 10).
Returns:
defended_model (tensorflow.keras.Model): The model with thermometer encoding defense.
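Thermometer encoding maps each scalar to a cumulative bit vector: bit i is set when the value reaches the i-th bin threshold. A NumPy sketch, assuming inputs in [0, 1] (the helper is illustrative, not part of this API):

```python
import numpy as np

def thermometer_encode(x, num_bins=10):
    """Encode each scalar in [0, 1] as a cumulative ('thermometer') bit vector."""
    thresholds = np.arange(num_bins) / num_bins     # 0.0, 0.1, ..., 0.9
    return (x[..., None] >= thresholds).astype(np.float64)

x = np.array([0.0, 0.25, 0.95])
enc = thermometer_encode(x, num_bins=10)
# e.g. 0.25 -> [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
```

Because the encoding is a step function, small input perturbations that stay within a bin leave the encoding unchanged, and the discretization removes the smooth gradients that many attacks rely on.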
Adversarial Logit Pairing (ALP)
Adversarial logit pairing encourages the logits of adversarial examples to be similar to those of clean examples.
Parameters:
model (tensorflow.keras.Model): The model to defend.
paired_model (tensorflow.keras.Model): The paired model for logit pairing.
Returns:
defended_model (tensorflow.keras.Model): The model with adversarial logit pairing defense.
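The pairing term is an extra penalty added to the task loss during adversarial training, penalizing the distance between logits on a clean example and on its adversarial counterpart. A NumPy sketch of the penalty (the helper name and `weight` parameter are illustrative, not part of this API):

```python
import numpy as np

def logit_pairing_loss(logits_clean, logits_adv, weight=0.5):
    """Penalty encouraging adversarial logits to match clean logits."""
    return weight * np.mean((logits_clean - logits_adv) ** 2)

clean = np.array([2.0, -1.0, 0.5])
adv = np.array([1.0, -0.5, 0.5])
penalty = logit_pairing_loss(clean, adv)
# total_loss = task_loss + penalty during training
```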
Spatial Smoothing
Spatial smoothing applies a smoothing filter to the input data to remove adversarial perturbations.
Parameters:
model (tensorflow.keras.Model): The model to defend.
kernel_size (int): The size of the smoothing kernel (default: 3).
Returns:
defended_model (tensorflow.keras.Model): The model with spatial smoothing defense.
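A common choice of smoothing filter here is a local median, which is particularly effective against sparse, high-magnitude perturbations. A 2-D NumPy sketch (the `median_smooth` helper is illustrative, not part of this API):

```python
import numpy as np

def median_smooth(img, kernel_size=3):
    """2-D median filter over kernel_size x kernel_size windows, edge-padded."""
    pad = kernel_size // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.empty(img.shape, dtype=np.float64)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + kernel_size, j:j + kernel_size])
    return out

img = np.zeros((4, 4))
img[1, 2] = 9.0                           # isolated adversarial-style spike
smoothed = median_smooth(img, kernel_size=3)   # the spike is removed entirely
```

Larger `kernel_size` values remove larger perturbations but blur more legitimate detail.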