In this project we are going to build a CNN that will help us decide whether two pictures are of the same person. 📸
We will use the Labeled Faces in the Wild (LFW) dataset, enriched with some pictures of ourselves. 🤩
This is a binary classification model: it predicts whether the two pictures show the same person.
For a better understanding we will define two terms:
- Anchor: the picture we compare to.
- Input: the picture that is being compared.
Check out our Medium article.
Our baseline model is a Gaussian Naive Bayes Classifier with an accuracy of 51% (random guess).
The anchor and input are flattened and concatenated into a single vector of size 375,001 (the extra 1 is for the label) and fed into the GNB model.
We do not expect good results from this model. Getting about 51% accuracy (essentially a random guess) reassures us that the pipeline works and gives us a baseline to beat.
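As a rough illustration, the baseline could look like the sketch below. The variable names and the stand-in data are our own assumptions, not the exact code in `baseline_model.py`; in the real project each row would hold the flattened pixels of an anchor/input pair.

```python
# Minimal sketch of the Gaussian Naive Bayes baseline (illustrative, not the exact repo code).
# Each row of X is a flattened anchor image concatenated with a flattened input image;
# y is the label (1 = same person, 0 = different person).
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_pairs, pixels_per_image = 200, 105 * 105        # stand-in sizes for the sketch
X = rng.random((n_pairs, 2 * pixels_per_image))   # flattened anchor + input pixels
y = rng.integers(0, 2, size=n_pairs)              # 1 = same person, 0 = different

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

gnb = GaussianNB().fit(X_train, y_train)
print("baseline accuracy:", accuracy_score(y_test, gnb.predict(X_test)))
```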
- The model will get the two images after preprocessing:
- Anchor (224, 224, 3)
- Input (224, 224, 3)
- The two images go through the same VGG16 embedding stage: the weights are shared, so during training both images are trained on the same embedding.
- The embedding produces two vectors that enter an L1 distance layer, where the distance between the two images is measured.
- A final dense layer with a sigmoid activation gives us the binary classification.
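A minimal Keras sketch of this architecture is shown below. The layer names and the specific VGG16 settings (pooling, frozen vs. trainable layers) are illustrative assumptions, not necessarily the exact code in the repo:

```python
# Minimal sketch of the Siamese architecture described above (details are illustrative).
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

# Shared VGG16 embedding: the same weights process both the anchor and the input image.
embedding = VGG16(include_top=False, input_shape=(224, 224, 3), pooling="avg")

anchor_in = layers.Input(shape=(224, 224, 3), name="anchor")
compare_in = layers.Input(shape=(224, 224, 3), name="input")

anchor_emb = embedding(anchor_in)
compare_emb = embedding(compare_in)

# L1 distance layer: element-wise absolute difference between the two embeddings.
l1_distance = layers.Lambda(lambda t: tf.abs(t[0] - t[1]), name="l1_distance")(
    [anchor_emb, compare_emb]
)

# Final dense layer with sigmoid: probability that both pictures show the same person.
output = layers.Dense(1, activation="sigmoid", name="same_person")(l1_distance)

siamese = Model(inputs=[anchor_in, compare_in], outputs=output, name="siamese_vgg16")
siamese.summary()
```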
We need to make sure that pictures of ourselves and of random people are spread across all of our folders (anchor, positive and negative) in a balanced way.
The dataset consists of 112,868 rows. Each row has 3 columns: the path of the anchor image, the path of the image to be compared (positive or negative), and the classification (1 if they are the same person, 0 otherwise). The rows cover all possible combinations between the anchor set and the positive set, and between the anchor set and the negative set. Memory-wise, it is important to store the path and not the image itself (the pixels); to access the images and train the NN, a generator is used.
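A pairs dataframe like this could be built roughly as in the sketch below. The folder layout (`data/<person>/<image>.jpg`) and the column names are assumptions for illustration; the repo's `creating_DF.py` may do this differently:

```python
# Sketch of building the pairs dataframe from image paths (illustrative, not the exact repo code).
import itertools
from pathlib import Path
import pandas as pd

DATA_DIR = Path("data")  # assumed layout: data/<person_name>/<image>.jpg

rows = []
people = {p.name: sorted(p.glob("*.jpg")) for p in DATA_DIR.iterdir() if p.is_dir()}

for person, images in people.items():
    # Positive pairs: combinations of two different images of the same person.
    for anchor, positive in itertools.combinations(images, 2):
        rows.append({"anchor": str(anchor), "compared": str(positive), "label": 1})
    # Negative pairs: each anchor against images of every other person.
    for other, other_images in people.items():
        if other == person:
            continue
        for anchor in images:
            for negative in other_images:
                rows.append({"anchor": str(anchor), "compared": str(negative), "label": 0})

pairs_df = pd.DataFrame(rows)  # only paths are stored, never raw pixels
print(len(pairs_df), "pairs")
```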
Let’s start! 🚀
- Downloading LFW and adding pictures of ourselves.
- Extracting pictures of people that have more than one picture in their folder to a folder named data (including the new pictures).
- Getting images from the data folder and using them as anchors.
- Positive: taking another image from the directory of the same person as the anchor.
- Negative: taking a random image of a different person from the data folder.
- (During training we use a Keras custom image generator to load the data; see the sketch after this list.)
- Resizing and rescaling the data.
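Here is a minimal sketch of such a custom generator: it reads image paths from the pairs dataframe, resizes to 224x224 and rescales pixel values to [0, 1]. Class and column names are illustrative and the real `img_generator.py` may differ:

```python
# Minimal sketch of a custom Keras image generator (illustrative; the real img_generator.py may differ).
import numpy as np
import pandas as pd
import tensorflow as tf

class PairGenerator(tf.keras.utils.Sequence):
    def __init__(self, pairs_df: pd.DataFrame, batch_size: int = 32):
        super().__init__()
        self.pairs_df = pairs_df.reset_index(drop=True)
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.pairs_df) / self.batch_size))

    def _load(self, path):
        img = tf.keras.utils.load_img(path, target_size=(224, 224))  # resize
        return tf.keras.utils.img_to_array(img) / 255.0              # rescale

    def __getitem__(self, idx):
        batch = self.pairs_df.iloc[idx * self.batch_size : (idx + 1) * self.batch_size]
        anchors = np.stack([self._load(p) for p in batch["anchor"]])
        compared = np.stack([self._load(p) for p in batch["compared"]])
        labels = batch["label"].to_numpy(dtype="float32")
        return (anchors, compared), labels
```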
The model was trained with the Adam optimizer in three stages. Each stage ran for up to 50 epochs, with early stopping when necessary (see the sketch after this list):
- 1st stage: lr = 1e-4
- 2nd stage: lr = 1e-5
- 3rd stage: lr = 1e-6
(lr: learning rate)
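Put together, the three stages could look like the sketch below. It reuses the `siamese` model, `pairs_df` and `PairGenerator` from the earlier sketches; the early-stopping patience, loss and split are assumptions, not the exact settings of `train_NN.py`:

```python
# Sketch of the three training stages (illustrative; reuses earlier sketches).
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

# Assumed 80/20 split of the pairs dataframe into train and validation sets.
train_pairs_df = pairs_df.sample(frac=0.8, random_state=42)
val_pairs_df = pairs_df.drop(train_pairs_df.index)

train_gen = PairGenerator(train_pairs_df)
val_gen = PairGenerator(val_pairs_df)

early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

for lr in (1e-4, 1e-5, 1e-6):  # one stage per learning rate
    siamese.compile(optimizer=Adam(learning_rate=lr),
                    loss="binary_crossentropy",
                    metrics=["accuracy"])
    siamese.fit(train_gen, validation_data=val_gen,
                epochs=50, callbacks=[early_stop])
```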
Our baseline model is a Gaussian Naive Bayes Classifier with an accuracy of 51%.
The accuracy is really bad: it is the same as always saying that the images are the same person, or always saying they are different.
In this step, the dataframe consists of 4,542 rows with pictures of ourselves and of random people. For each pair we have 11,025 columns corresponding to the pixels of the anchor image and another 11,025 columns for the pixels of the picture being compared to the anchor (positive in some cases, negative in others). The label is 1 if the images are of the same person and 0 otherwise. This was done to ensure the model can reach as low a loss as possible.
Taking a set of voter images to use as anchors for the input image.
An image (the input) is compared against each voter image; the average of the outcomes is the model's confidence that the person in the input is the same as the voters.
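A rough sketch of this voting step is shown below. The voter paths, the 0.5 decision threshold and the trained `siamese` model are assumptions carried over from the earlier sketches; the real `voters.py` may differ:

```python
# Sketch of the voting step (illustrative, not the exact repo code).
import numpy as np

def voter_confidence(siamese, input_path, voter_paths, load_image):
    """Compare one input image against every voter image and average the scores."""
    input_img = load_image(input_path)
    inputs = np.stack([input_img] * len(voter_paths))
    voters = np.stack([load_image(p) for p in voter_paths])   # voters act as anchors
    scores = siamese.predict((voters, inputs), verbose=0)     # one score per voter
    return float(scores.mean())                               # average confidence

# Hypothetical usage, reusing the PairGenerator's loader from the generator sketch:
# gen = PairGenerator(pairs_df)
# confidence = voter_confidence(siamese, "data/me/selfie_01.jpg", voter_paths, gen._load)
# print("same person" if confidence > 0.5 else "different person", confidence)
```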
- Install the requirements with conda or pip: `pip install -r requirements.txt`.
- A GPU is required for training: this project trains a CNN, so either work on a local GPU or use a cloud GPU.
- baseline_model.ipynb: where the baseline model is created
- baseline_model.py: where the baseline model functions are created
- config.py: with the constants
- get_samples.ipnyb: where we create the dataset with pictures of ourselves
- preprocessing.py: where the preprocessing functions take place
- requirements.txt
- creating_DF.ipynb: where all the images from the folders are reorganized so that we have a balanced dataset
- creating_DF.py: where the functions needed for creating the DF are stored
- img_generator.py: custom image generator
- train_NN.py: training loop for the CNN
- voters.py: function that takes pictures to set as voters
- model_test_real_time.py: a test function for the trained model
Authors: Noam Cohen, Sahar Garber and Julieta Staryfurman
This repo has some large files. To get them we use git lfs (large file storage).
To use git lfs, download it from the git lfs website and install it on your machine. After that, run the following in the command line inside the git repo:
$ git lfs install
Then, as you do with normal git, pull the repo from GitHub and the large files will be tracked by git lfs and downloaded to your machine.