Defense Functions

Available functions:

  • adversarial_training(model, x, y, epsilon=0.01): Adversarial Training defense.

  • feature_squeezing(model, bit_depth=4): Feature Squeezing defense.

  • gradient_masking(model, mask_threshold=0.1): Gradient Masking defense.

  • input_transformation(model, transformation_function=None): Input Transformation defense.

  • defensive_distillation(model, teacher_model, temperature=2): Defensive Distillation defense.


Adversarial Training

Adversarial training augments the training set with adversarial examples generated from the original inputs, so the model learns to classify both clean and perturbed data and becomes more robust to adversarial attacks.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    x (numpy.ndarray): The input training examples.
    y (numpy.ndarray): The true labels of the training examples.
    epsilon (float): The magnitude of the perturbation (default: 0.01).

Returns:
    defended_model (tensorflow.keras.Model): The adversarially trained model.
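
A minimal usage sketch. The small model and random arrays below are placeholders standing in for a real classifier and dataset, and the import of adversarial_training itself is omitted because it depends on how the package is installed:

    import numpy as np
    from tensorflow import keras

    # Placeholder classifier; any compiled tensorflow.keras.Model will do.
    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Placeholder data standing in for a real training set.
    x_train = np.random.rand(128, 28, 28).astype("float32")
    y_train = np.random.randint(0, 10, size=(128,))

    # Train on both clean and adversarial examples with perturbation budget epsilon.
    defended_model = adversarial_training(model, x_train, y_train, epsilon=0.01)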

Feature Squeezing

Feature squeezing reduces the number of bits used to represent the input features, which can remove certain adversarial perturbations.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    bit_depth (int): The number of bits per feature (default: 4).

Returns:
    defended_model (tensorflow.keras.Model): The model with feature squeezing defense.
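
A short sketch, again treating model as an already compiled Keras classifier (as in the adversarial training example) and x_test as a placeholder array of inputs:

    # bit_depth=4 quantizes each input feature to 2**4 = 16 discrete levels,
    # which collapses small adversarial perturbations onto the same level.
    defended_model = feature_squeezing(model, bit_depth=4)

    # The defended model is used like any other Keras model at inference time.
    predictions = defended_model.predict(x_test)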

Gradient Masking

Gradient masking modifies the gradients during training to make them less informative for adversarial attackers.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    mask_threshold (float): The threshold for masking gradients (default: 0.1).

Returns:
    defended_model (tensorflow.keras.Model): The model with gradient masking defense.
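
A usage sketch under the same assumptions as above (model is a compiled tensorflow.keras.Model):

    # Gradients with magnitude below mask_threshold are masked during training,
    # giving a gradient-based attacker less useful signal to follow.
    defended_model = gradient_masking(model, mask_threshold=0.1)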

Input Transformation

Input transformation applies a transformation to the input data before feeding it to the model, aiming to remove adversarial perturbations.

Parameters:
    model (tensorflow.keras.Model): The model to defend.
    transformation_function (function): The transformation function to apply (default: None).

Returns:
    defended_model (tensorflow.keras.Model): The model with input transformation defense.
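
A sketch with a hypothetical transformation function; the assumption that the callable receives and returns a NumPy array of inputs is not stated in the signature above:

    import numpy as np

    def round_inputs(x):
        # Hypothetical transformation: round features to one decimal place
        # to wash out small adversarial perturbations before prediction.
        return np.round(x, decimals=1)

    defended_model = input_transformation(model, transformation_function=round_inputs)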

Defensive Distillation

Defensive distillation trains a student model to mimic the temperature-softened predictions of a teacher model; learning from these soft labels tends to produce a smoother, more robust student.

Parameters:
    model (tensorflow.keras.Model): The student model to defend.
    teacher_model (tensorflow.keras.Model): The teacher model.
    temperature (float): The temperature parameter for distillation (default: 2).

Returns:
    defended_model (tensorflow.keras.Model): The distilled student model.
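
A usage sketch assuming teacher_model is an already trained classifier with the same output classes as the student model (both placeholders built as in the adversarial training example):

    # temperature=2 softens the teacher's output distribution so the student
    # learns from class similarities rather than from hard one-hot labels.
    defended_model = defensive_distillation(model, teacher_model, temperature=2)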
