How to create a DeepFake video in 5 minutes

Carding

Professional
Messages
2,829
Reputation
17
Reaction score
2,087
Points
113
Imagine that you have a full-body photo. Just a still image. Then all you need is a solo video of your favorite dancer performing a few moves. Not so hard now that TikTok is taking over the world.

Image animation uses a video sequence to control the movement of an object in an image. In this story, we'll see how easy image animation technology has become to use and how you can animate just about anything you can think of. To do this, I turned the source code of the corresponding publication into a simple script, a thin wrapper that anyone can use to create DeepFakes. With a source image and the right driving video, anything is possible.

How does it work?
Under the hood, a neural network is trained to reconstruct a video given a single source frame (the still image) and a latent representation of the motion in the video, which is learned during training. At test time, the model takes as input a new source image and a driving video (a sequence of frames) and predicts how the object in the source image moves according to the motion depicted in those frames.

The model captures everything that matters for animation: head movement, speech, eye gaze, and even body motion. For example, look at the GIF below: President Trump makes the cast of Game of Thrones talk and move like him.

ea54dbbecb688b67781b1.gif


Methodology and approach​

Before creating our own sequences, let's take a closer look at this approach. First, the training data set is a large collection of videos. During training, the authors extract pairs of frames from the same video and pass them to the model. The model tries to reconstruct the video by learning what the key points in each pair are and how to represent the motion between them.

2eb9dbb4d221f31fb6422.png


For this purpose, the framework consists of two models: a motion estimator and a video generator. First, the motion estimator learns a latent representation of the motion in the video. This is encoded as motion-specific key point displacements (where a key point can be, for example, the position of an eye or the mouth) and local affine transformations. This combination can model a larger family of transformations than key point displacements alone. The model has two outputs: a dense motion field and an occlusion mask. The mask determines which parts of the driving video can be reconstructed by warping the source image, and which parts have to be inferred from context because they are not present in the source image (for example, the back of the head). Consider the GIF below: the back of each model is missing from the source image, so the network has to infer it.

df2662f030fd2921dc8bb.gif


The video generator then takes as input the output of the motion estimator and the source image and animates it according to the driving video; it warps the source image in ways that resemble the driving video and inpaints the parts that are occluded. Figure 1 shows the architecture of the framework.

Sample code
The source code for this article is available on GitHub. What I did was create a simple script, a thin wrapper around the source code, that anyone can easily use for quick experimentation.

To use it, you must first install the module. Run pip install deep-animator to install the library in your environment. Then we need four items:
  • The model weights; of course, we don't want to train the model from scratch, so we need the weights of the pre-trained model.
  • A YAML configuration file for our model.
  • A source image; this can be, for example, a portrait.
  • A driving video; it is best to start with a video showing a clearly visible face.
To quickly get results and test the performance of the algorithm, you can use this source image and this driving video. The model weights can be found here. Below is a simple YAML configuration file. Open a text editor, copy and paste the following lines, and save the file as conf.yml.

Code:
model_params:
  common_params:
    num_kp: 10
    num_channels: 3
    estimate_jacobian: True
  kp_detector_params:
    temperature: 0.1
    block_expansion: 32
    max_features: 1024
    scale_factor: 0.25
    num_blocks: 5
  generator_params:
    block_expansion: 64
    max_features: 512
    num_down_blocks: 2
    num_bottleneck_blocks: 6
    estimate_occlusion_map: True
    dense_motion_params:
      block_expansion: 64
      max_features: 1024
      num_blocks: 5
      scale_factor: 0.25
  discriminator_params:
    weights: [1]
    block_expansion: 32
    max_features: 512
    num_blocks: 4

Now we are ready to make a statue imitate Leonardo DiCaprio! To get the result, simply run the following command.

Code:
deep_animate <path_to_source_image> <path_to_driving_video> <path_to_yaml_conf> <path_to_model_weights>

For example, if you have downloaded everything into the same folder, cd into that folder and run:

deep_animate 00.png 00.mp4 conf.yml deep_animator_model.pth.tar

On my CPU, it takes about five minutes to get the generated video. It is saved in the same folder unless you specify otherwise with the --dest option. Alternatively, you can use GPU acceleration with the --device CUDA option. Finally, we are ready to see the result. Pretty cool!

bfbc17b6c4c2333cea754.gif


Conclusion
In the end, we used deep-animator, a thin wrapper around the model, to animate a statue.

While there are some concerns about such technologies, they have a variety of applications, and they also show how easy it currently is to create fake footage, which helps raise awareness about it.
 

Danny1silver

Carder
Messages
42
Reputation
0
Reaction score
26
Points
18
This is so nice man... Good Job
Do you have any program to change your voice to a female voice during phone call?
 

Carding

Professional
Messages
2,829
Reputation
17
Reaction score
2,087
Points
113
This is so nice man... Good Job
Do you have any program to change your voice to a female voice during phone call?
Yes, such programs exist. Most often they are used by scammers when working in dating fraud. They call their clients on behalf of a beautiful girl and ask for money for any purchases on the Internet or ask to pay for an air ticket and a visa to fly to them and meet in real life.

Please check out these topics for a more detailed study and a step-by-step guide on how to use them correctly:

Also, these programs can be used for the following purposes:
1. For calls to the bank in order to change the billing address of the cardholder to pass the confirmation of the ABC address verification system.
2. For calls to payment systems and financial companies for verification.
3. For calls to online stores to confirm the details of ordering expensive goods.
4. With various types of telephone fraud.
 

Jollier

Professional
Messages
1,127
Reputation
6
Reaction score
1,105
Points
113

Creating your own Deepfake​


DeepFaceLab
Obviously, to create a DeepFake video, we will need special software.

About the method
I use a hybrid method for creating deepfakes that relies on cloud computing power. Creating such videos eats up a lot of video memory and takes quite a long time; using the cloud reduces production time and also gives better output video quality. This method is quite popular, works on all platforms and, most importantly, is free.

7a4792e6e4b4704b238be.png


First of all, we need to download DeepFaceLab for Windows from the official source: https://github.com/iperov/DeepFaceLab. The download is a torrent weighing roughly 7 GB. Download it and unpack it.
Link to DeepFaceLab: Windows (magnet link)

XSeg1.jpg


Preparation
Before working in the cloud, we need to prepare the workspace for our video. It may be possible to perform these steps in the cloud itself, but I never quite managed to, so first we prepare everything we need directly on the PC. If you know how to do this in the cloud, please write in the comments.

ba866dfc17dc0236c3b983e56ff4314e.png


Open the DeepFaceLab folder. We see a few more folders and a whole bunch of numbered terminal commands; this is the entire interface of the software. First of all, we need to prepare the two videos between which the face swap will take place. Go to the workspace folder, where we see two videos and three more folders. data_src is the source video from which the neural network will extract the face. data_dst is the video onto which the mask will be applied. By default, we are greeted by Uncle Elon and Iron Man, which of course we are not interested in.

7e97147eb85fac59b0a1756a519ed8f6.JPEG


Go to the commands
Now we need the neural network to identify the faces in the videos and extract them to the workspace; after that, we can go to the cloud with peace of mind. Run each of the commands listed below, one at a time.
Each time, a terminal will open and ask you something; just press Enter and the program will run with the default settings.

Execute commands one at a time:
2) extract images from video data_src.bat
3) extract images from video data_dst FULL FPS.bat
4) data_src faceset extract.bat
5) data_dst faceset extract.bat
After that, hundreds of files with extracted faces should appear in the data_dst and data_src subfolders of the workspace folder. Check that they are there.
Now we can start working with the cloud. Create an archive of the workspace folder and upload it to your Google Drive.

4dad5f6300c1ebe34fdea.png


Google Colab

To create the deepfake itself, we will use the Google Colab platform. This is a free service that lets us run all the computation on powerful video cards. The only caveat is that Colab limits session length (12 hours per session), so if you are planning to make a deepfake, set aside enough time for it and do not get distracted. Everything runs in the browser, so do not close it under any circumstances.

Follow the link: Link

Log in to your Google account and here we are in the cloud version of DeepFaceLab. The functionality is about the same as on a PC; if anything, the interface is even more pleasant. Everything you need to know about Colab: click Play and the command is executed.

24d415444614f4003236a.png


The first step is to start the whole engine. To do this, run the following commands: Check GPU and Clone Github repository and install requirements. You need to do this every time you use the service, otherwise it won't work.

32d4f27cc7177a90f40ff.png


Then import our workspace from Google Drive. Just run the Import command. The site will ask you for an access code; copy and paste it.

9726ea0811ffc4f57dd89.png


After that is done, we simply execute all the commands in turn, skipping each question by pressing Enter.

41ac84f9dce74784ede8b.png


In theory, if you tweak all the settings, the video quality will be better. The most important part is the Train Model step, which will eat up the lion's share of the time: the longer this command runs, the better the video quality. I only got a passable result after about 6 hours of training. Run this command and go relax; just check the browser from time to time so that nothing disconnects. As soon as you think the model is sufficiently trained, execute the remaining two commands, then select Export to Drive and the output video mode.

Conclusion
I hope you enjoyed this guide and will be able to put it to good use.
P.S. If you are an expert at creating deepfakes, please write in the comments. I may have a very interesting offer for you.
 