So, how do you prevent your face from being swapped?
Editor’s note: This article comes from the WeChat public account “Heart of the Machine” (ID: almosthuman2014), with contributions from Si, Du Wei, and Egg Sauce; it is republished with permission.
Face-swapping videos are one of the most serious consequences of deep learning abuse: as long as your photos are online, your face can be transplanted onto other backgrounds or into other videos. With this newly open-sourced attack model, however, photos that have been processed before uploading are no longer a problem, and deepfakes can no longer use them directly for face replacement.
Recently, researchers from Boston University presented a new study on deepfakes. Judging from the paper’s title and results, it seems that as long as we run our pictures through their method, deepfake face-swapping models can no longer use them as material for fake video clips.
It looks very promising: just add some noise that is invisible to the human eye, and the face-swapping model can no longer generate a correct face. Isn’t this exactly the idea behind adversarial attacks? Previously, attack models deceived recognition models by “forging realistic images”; now, the noise generated by the attack “arms” the face image and deceives the deepfake, so that it can no longer produce a face swap convincing enough to fool humans.
This Boston University study was released only a short time ago and has already been hotly debated by many researchers, with plenty of discussion on Reddit. After reading the paper, and seeing that the researchers have released a GitHub project, many of us will naturally wonder: “Can we process our photos with this before posting them online, so that deepfakes can no longer use them afterwards?”
But things are certainly not as simple as we think. As Reddit user Other-Top put it: “According to this paper, I need to process my photos with this method first and then upload them; only then will other people’s attempts to swap the faces go wrong.”
That is to say, both our own photos and celebrities’ photos would have to be passed through the attack model before being uploaded to the Internet. Would such photos then be safe?
It sounds rather troublesome, but let’s first look at what the paper actually does; maybe we can then think of a better way. In this work, the researchers add adversarial perturbations, imperceptible to the human eye, to the source image, and this adversarial noise disrupts the generated image.
The goal of this disruption is to degrade the generated image enough that it is either unusable or visibly altered. In other words, invisible noise forces the deepfake to produce a video that is obviously fake.
- Paper address: https://arxiv.org/abs/2003.01279
- Code address: https://github.com/natanielruiz/disrupting-deepfakes
Defending against deepfakes with adversarial attacks
Adversarial attacks are commonly used to deceive various image recognition models. They can also be applied to image generation models, although there they have seemed less meaningful. If, however, they can be used against face-swapping models such as deepfakes, they become very promising.
In this paper, the researchers follow the path of adversarial attacks to sabotage deepfake’s face-swapping operation. Specifically, they are the first to propose and successfully apply the following:
- Class-transferable adversarial attacks that generalize across different condition classes, which means the attacker does not need to know the class of the image;
- Adversarial training for generative adversarial networks (GANs), a first step toward robust image-translation networks;
- In a gray-box scenario, blurring the input image can successfully defend against the disruption; the researchers present an attack method that circumvents this defense.
Figure 1: Flow chart of deepfake disruption. Using the I-FGSM method, an imperceptible perturbation is applied to the image, which then successfully disrupts the output of the facial manipulation system (StarGAN).
Most face-manipulation architectures are trained on an input image together with a target condition class. For example, attributes are used to define the target expression of the generated face (such as adding a smile). If we want to prevent others from adding a smile to the face in an image, we need to explicitly select the “smile” attribute rather than unrelated attributes such as “closed eyes”.
So, to deceive deepfakes with adversarial attacks, we first need to sort out the problem of conditional image translation, so that earlier attack methods can be transferred to face swapping. The researchers also propose two class-transferable disruption variants to improve generalization across different attribute classes.
Blurring the photo is a defense against such disruption. In a white-box scenario, where the disruptor knows exactly what type and magnitude of blur is used for preprocessing, the blur can be dealt with; but in a realistic gray-box scenario, the disruptor may know the architecture used while remaining ignorant of the blur type and magnitude, and a generic attack loses much of its effect there. The researchers therefore propose a new spread-spectrum disruption method that evades different blur defenses in the gray-box scenario.
In general, although deepfake image generation has many unique characteristics, adversarial attacks developed for “traditional image recognition” can, with some modification, effectively deceive deepfake models as well.
How to attack deepfakes
If readers have already learned something about adversarial attacks, the methods described below will be easy to follow. Broadly speaking, the researchers divide their attacks on models such as deepfakes into general image-translation disruption, their newly proposed conditional image-translation disruption, adversarial training techniques for GANs, and spread-spectrum disruption.
Let’s first look at the effect of the attack. The original, unmodified images (with no adversarial noise added) can be used to complete a face swap. But once adversarial noise is added to them, the human eye cannot see any change in the input image, yet the model can no longer complete the face swap from such photos.
This is the same idea as a standard adversarial attack: add some noise that the human eye cannot perceive but to which the machine is very sensitive, and a deepfake that relies on such an image will be disrupted.
The currently popular attack methods are mainly gradient-based and iterative; many other excellent, more advanced attacks build on their main ideas. The core idea of this family of methods is to find a small perturbation that maximizes the change in the loss function: by adding this small perturbation to the original input, the model is led to misclassify it into another category.
Usually, the simple approach is to compute the derivative of the loss function with respect to the input via backpropagation and to maximize the loss function along that derivative; in this way the attacker finds the optimal direction of perturbation and constructs an adversarial example that deceives the deep network.
Take the Fast Gradient Sign Method (FGSM) proposed in the early years as an example. Let x denote the input image, G the generative model that performs the face swap, and L the loss function used to train the neural network. We can then linearly approximate the loss function in a neighborhood of the current parameter values and obtain the noise η that pushes the generated image G(x + η) as far as possible from the original face-swap result r.
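Written out (reconstructed here from the definitions above and the standard FGSM formulation, since the original formula appeared only as an image), the disruption is:

```latex
% Disruption objective: find a max-norm-bounded perturbation \eta that pushes
% the generated image away from the undisrupted reference r = G(x)
\[ \max_{\|\eta\|_{\infty} \le \varepsilon} \; L\bigl(G(x + \eta),\, r\bigr) \]

% One-step FGSM solution, obtained by linearizing L around the current input:
\[ \eta = \varepsilon \cdot \mathrm{sign}\bigl(\nabla_{x}\, L\bigl(G(x),\, r\bigr)\bigr) \]
```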
FGSM can quickly compute the gradient through backpropagation and find the small perturbation η that increases the model’s loss the most. Other methods, such as the Basic Iterative Method (BIM, i.e. I-FGSM), run FGSM for multiple iterations with smaller steps to obtain better adversarial examples. Adding the optimal perturbation η to the original input face x, and then using this perturbed face as deepfake material, causes the generation to go wrong.
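For readers who want something more concrete, below is a minimal PyTorch-style sketch of such an iterative (I-FGSM) disruption loop. It is only an illustration under assumptions: the generator G, the MSE loss to the clean output, and the step sizes are placeholders, not the authors’ released implementation.

```python
import torch
import torch.nn.functional as F

def ifgsm_disrupt(G, x, steps=10, eps=0.05, alpha=0.01):
    """Iterative FGSM disruption: push G(x + eta) away from the clean output G(x).

    G: image-translation generator (placeholder for a face-manipulation network)
    x: input image tensor of shape (1, 3, H, W) with values in [0, 1]
    """
    with torch.no_grad():
        r = G(x)                                   # undisrupted output, used as reference
    x_adv = x.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.mse_loss(G(x_adv), r)             # distance from the clean output
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()       # ascend the loss
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # project into the eps-ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

In practice, the returned perturbed image looks identical to the original to the human eye, but the generator’s output from it is visibly degraded.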
Three attack methods
The above only introduces the core idea of adversarial attacks, which can indeed deceive deepfakes to a certain extent. To achieve a strong effect, however, the researchers propose three more refined attack methods in the paper. Here we only briefly introduce the idea of conditional image disruption; for more details, please refer to the original paper.
The noise added so far has been unconditional, but many face-swapping models take not only a face as input but also a class, and this class is the condition. We therefore add the condition c to the generator, giving G(x, c), and we want to maximize the loss L while modifying the pixels as little as possible through η.
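In symbols (again a reconstruction from the description above rather than the original formula image):

```latex
% Conditional disruption: the bounded perturbation \eta must degrade the output
% of the conditional generator G(x, c) relative to the reference r
\[ \max_{\|\eta\|_{\infty} \le \varepsilon} \; L\bigl(G(x + \eta,\, c),\, r\bigr) \]
```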
To solve this problem, the researchers present a new attack method aimed at conditionally constrained image-translation models. This method strengthens the attack’s ability to transfer across classes: for example, even if the class fed into the attack is “smile”, the resulting perturbed faces still invalidate deepfakes for other classes.
Specifically, the researchers modify the I-FGSM procedure itself; the exact formulation is given in the paper.
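As a hedged illustration, one plausible reading of the modification is that the condition class is changed at every I-FGSM step, so that the perturbation does not overfit to a single attribute. In the sketch below, the class list, the loss, and the hyperparameters are placeholder assumptions, not the paper’s exact recipe:

```python
import torch
import torch.nn.functional as F

def class_transferable_disrupt(G, x, classes, steps=10, eps=0.05, alpha=0.01):
    """I-FGSM variant that cycles the condition class at every step, so the
    perturbation transfers across attribute classes (illustrative sketch)."""
    with torch.no_grad():
        refs = [G(x, c) for c in classes]          # clean output for each class
    x_adv = x.clone()
    for i in range(steps):
        k = i % len(classes)                       # cycle through the classes
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.mse_loss(G(x_adv, classes[k]), refs[k])
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # stay in the eps-ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```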
Experimental results
Experiments show that the image-level FGSM-, I-FGSM-, and PGD-based disruption methods proposed by the researchers successfully interfere with different image-generation architectures such as GANimation, StarGAN, pix2pixHD, and CycleGAN.
To understand the effect of the disruption under the L^2 and L^1 metrics, Figure 3 below shows qualitative examples of disrupted outputs together with their respective distortion values.
Figure 3: Correspondence between L^2 and L^1 distances and the qualitative distortion of disrupted StarGAN images.
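For reference, distortion values like these are typically simple per-pixel distances between the disrupted output and the clean output; the small helper below is an illustrative assumption of how they might be computed, not the paper’s evaluation code:

```python
import torch

def distortion(y_disrupted: torch.Tensor, y_clean: torch.Tensor):
    """Per-pixel L2 and L1 distortion between disrupted and clean generator outputs."""
    diff = y_disrupted - y_clean
    l2 = diff.pow(2).mean().item()   # mean squared per-pixel difference
    l1 = diff.abs().mean().item()    # mean absolute per-pixel difference
    return l2, l1
```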
For the iterative class-transferable disruption and the joint class-transferable disruption proposed in the paper, the researchers give qualitative examples in Figure 4 below. The purpose of these disruptions is to transfer across all of GANimation’s action-unit inputs.
Figure 4: Effects of the attacks proposed by the researchers on the face-manipulation model.
In the figure above, (a) is the original input image and (b) is the GANimation output without added noise. With the class used as a constraint, (c) shows the attack when the correct class is used and (d) the attack when the correct class is not used. (e) and (f) are the iterative class-transferable attack and the joint class-transferable attack proposed by the researchers; these can attack the deepfake generation model across all classes.
In the gray-box setting, the disruptor does not know the type and magnitude of the blur used for preprocessing, so blurring is an effective way to resist adversarial disruption: a low degree of blur can render the disruption ineffective while still preserving the quality of the image-translation output. Figure 5 below shows an example on the StarGAN architecture.
Figure 5: A successful example of the Gaussian blur defense.
If the operator of the face-manipulation system uses blur to neutralize adversarial disruption, the disruptor may not know the type and magnitude of the blur used. Figure 6 below shows the proportion of test images that the spread-spectrum method still disrupts successfully.
Figure 6: Proportion of images successfully disrupted (L^2 ≥ 0.05) by different blur-evasion methods under different blur defenses.
Figure 7: Effect of the spread-spectrum disruption method against a defense that uses Gaussian blur (σ = 1.5). The first row shows the original attack, which does not account for blur; the second row shows the spread-spectrum disruption; and the last row shows the attack under the white-box condition.
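Finally, to make the spread-spectrum idea a bit more concrete: one plausible implementation (our own reading, not the authors’ exact algorithm) samples a random blur magnitude at every disruption step and applies it before computing the gradient, so the perturbation keeps working whichever blur the defender actually uses. The generator G, the loss, and the hyperparameters below are illustrative assumptions.

```python
import random
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

def spread_spectrum_disrupt(G, x, steps=20, eps=0.05, alpha=0.01,
                            sigmas=(0.5, 1.0, 1.5, 2.0)):
    """I-FGSM-style disruption that, at every step, passes the image through a
    randomly sampled Gaussian blur before the generator, so the perturbation
    still works when the defender's blur magnitude is unknown (gray-box)."""
    with torch.no_grad():
        r = G(x)                                    # clean, undisrupted output
    x_adv = x.clone()
    for _ in range(steps):
        sigma = random.choice(sigmas)               # sample a blur magnitude
        x_adv = x_adv.detach().requires_grad_(True)
        x_blur = TF.gaussian_blur(x_adv, kernel_size=5, sigma=sigma)
        loss = F.mse_loss(G(x_blur), r)             # loss after the simulated defense
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```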