I decided to go with the face-swap approach, but how can I do that? There are two ways: image warping and artificial neural networks.

“It is easy to computationally replace a face in one image with a different face if you want to do it for giggles, but extremely difficult to do if you want to do it completely automatically at a quality that will fool people consistently.” - Satya Mallick

So, the first way is simple: we detect the facial landmarks, find their convex hull, build a Delaunay triangulation over the landmark points, affine-warp each triangle of the source face onto the matching triangle of the target face, and finally blend the result with the seamless cloning technique.
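To make the "affine warp triangles" step a bit more concrete, here is a minimal sketch in plain Python of the math behind it: given the three corners of a triangle on the source face and the three matching corners on the target face, it solves for the 2x3 affine matrix that maps one onto the other. The function names (`det3`, `affine_from_triangles`, `apply_affine`) are my own and only for illustration. In a real pipeline, OpenCV's `cv2.getAffineTransform` computes this kind of matrix and `cv2.warpAffine` applies it to the pixels.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def affine_from_triangles(src, dst):
    """Return the 2x3 affine matrix that maps triangle src onto triangle dst.

    src and dst are lists of three (x, y) points in matching order.
    The six unknowns are found with Cramer's rule.
    """
    base = [[x, y, 1.0] for x, y in src]
    d = det3(base)
    if abs(d) < 1e-12:
        raise ValueError("source triangle is degenerate (points are collinear)")

    matrix = []
    for k in range(2):  # k = 0 solves for x', k = 1 solves for y'
        target = [p[k] for p in dst]
        coeffs = []
        for col in range(3):
            m = [row[:] for row in base]
            for r in range(3):
                m[r][col] = target[r]  # replace one column with the targets
            coeffs.append(det3(m) / d)
        matrix.append(coeffs)
    return matrix


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return tuple(row[0] * x + row[1] * y + row[2] for row in matrix)


# Hypothetical triangles: one from the source face, one from the target face.
src_triangle = [(0, 0), (1, 0), (0, 1)]
dst_triangle = [(2, 3), (4, 3), (2, 6)]

warp = affine_from_triangles(src_triangle, dst_triangle)
print(apply_affine(warp, (0.5, 0.5)))
```

Every pixel inside the source triangle would be moved with the same matrix, and repeating that for every triangle of the Delaunay triangulation gives the warped face that seamless cloning then blends in.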

The second one is difficult. We need to train an autoencoder, which is a type of neural network whose input is a picture and whose output is the same picture. It has two parts: an encoder and a decoder. The encoder learns a compressed representation of the input, while the decoder transforms it back into the original data. So, we train an autoencoder for each face, then encode a picture of the first person with their encoder, but decode it with the second person's decoder. (For this to work, the two autoencoders usually share the same encoder, so that both faces end up in the same compressed representation.)
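The swap itself is only a matter of routing, which the toy sketch below tries to show. It assumes the common setup where both people share one encoder and each has their own decoder. Nothing here is trained: the encoder and the two decoders are hand-written stand-ins (hypothetical names, made-up behaviour) that take the place of real neural networks, and a "picture" is just a short list of numbers.

```python
def encoder(picture):
    """Stand-in for the shared encoder: halves the input by averaging pairs."""
    return [(picture[i] + picture[i + 1]) / 2 for i in range(0, len(picture), 2)]


def decoder_a(code):
    """Stand-in for person A's decoder: expands the code back to full size."""
    return [value for value in code for _ in range(2)]


def decoder_b(code):
    """Stand-in for person B's decoder: same expansion plus B's 'look' (+10)."""
    return [value + 10 for value in code for _ in range(2)]


def reconstruct_a(picture):
    """Normal autoencoder path: A's picture in, A's picture (roughly) out."""
    return decoder_a(encoder(picture))


def swap_a_to_b(picture):
    """Face-swap path: encode A's picture, but decode with B's decoder."""
    return decoder_b(encoder(picture))


picture_of_a = [1, 3, 5, 7]
print(reconstruct_a(picture_of_a))
print(swap_a_to_b(picture_of_a))
```

With real networks, the shared code would capture things like pose and expression, and each decoder would learn to paint its own person's face on top of them.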

Without going deeper, I would dare to say that the first way limits us to swapping only frontal faces. The second way has a lot of limitations too. For example, we need a lot of images of the two people to train the network, the images need to be representative of the goal (generating profile shots of the face), and the training will be very expensive, since we will need to train a model for each face (a model for the user, a model for each hairstyle, and a model for each hairstyle that the user uploads).

So, I need more than one day of research to experiment and find more limitations, or to find a new way of doing the face swap. After that, I will be able to decide which path I am going to take and start solving problems, but it will most likely be the first way, image warping…