1) Training a CNN model on the Fashion-MNIST dataset
As per the requirements, we first downloaded the dataset and relabelled its
samples into three classes: clothes, shoes and other.
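The relabelling step can be sketched as below. This is a minimal illustration, not the exact code used: the grouping of the ten original Fashion-MNIST labels into the three classes, the class ids 0/1/2 and the function name relabel are all assumptions, since the report does not list them.

```python
# Original Fashion-MNIST label ids and their names.
FASHION_MNIST_NAMES = {
    0: "T-shirt/top", 1: "Trouser", 2: "Pullover", 3: "Dress", 4: "Coat",
    5: "Sandal", 6: "Shirt", 7: "Sneaker", 8: "Bag", 9: "Ankle boot",
}

# Assumed ids for the three new classes (not specified in the report).
CLOTHES, SHOES, OTHER = 0, 1, 2

# Assumed grouping: footwear -> shoes, bag -> other, everything else -> clothes.
SHOE_LABELS = {5, 7, 9}
OTHER_LABELS = {8}


def relabel(original_label: int) -> int:
    """Map an original 10-class label to one of the three coarse classes."""
    if original_label not in FASHION_MNIST_NAMES:
        raise ValueError(f"unknown Fashion-MNIST label: {original_label}")
    if original_label in SHOE_LABELS:
        return SHOES
    if original_label in OTHER_LABELS:
        return OTHER
    return CLOTHES


# Relabel a small batch of original labels.
new_labels = [relabel(y) for y in [0, 5, 8, 9, 6]]
```

Note that under this grouping the classes are unbalanced: six original labels map to clothes, three to shoes and only one to other.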
After that we defined the model. It contains 4 convolutional layers, 1 dropout
layer (40 percent), 1 fully connected layer and the output layer. At the output
we used the softmax function, since this is a classification problem. For the
same reason we used cross entropy as the loss function, together with the Adam
optimizer with a learning rate of 0.01. We then trained the model on the
training set and evaluated it on the test set. We also used batch normalization
to speed up training.
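To make the softmax output and the cross-entropy loss concrete, here is a small standard-library sketch for a single sample with three classes. The function names and the example logits are made up for illustration; in practice the deep learning framework computes this for a whole batch.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability that softmax assigns to the true class."""
    return -math.log(softmax(logits)[target])


# Hypothetical scores for the classes (clothes, shoes, other).
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
loss_if_clothes = cross_entropy(logits, 0)  # true class has the highest score
loss_if_other = cross_entropy(logits, 2)    # true class has the lowest score
```

The loss is small when the true class receives a high probability and large when it receives a low one, which is what the optimizer minimizes during training.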
Next, we scrambled the pixels of all training and test images with one fixed
permutation and trained the same model on the scrambled images. The test
accuracy decreased a little in this case, but it is still pretty good
(approximately 98%). This suggests that, on this task, the CNN relies very
little on the visual (spatial) structure of the images.
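The scrambling with a fixed permutation can be sketched as follows. This is an illustration under assumptions: the seed value, the function names and the use of flattened 28x28 images (784 pixels) as plain lists are choices made here, not details taken from the report.

```python
import random

NUM_PIXELS = 28 * 28  # a Fashion-MNIST image has 28x28 = 784 pixels


def make_permutation(num_pixels, seed=0):
    """Build one pixel permutation; the same seed always gives the same order."""
    perm = list(range(num_pixels))
    random.Random(seed).shuffle(perm)
    return perm


def scramble(flat_image, perm):
    """Reorder the pixels of a flattened image according to perm."""
    return [flat_image[i] for i in perm]


# The permutation is created once and reused for every training and test image.
perm = make_permutation(NUM_PIXELS, seed=0)

image_a = list(range(NUM_PIXELS))       # dummy image: pixel value = position
image_b = [2 * v for v in image_a]      # second dummy image
scrambled_a = scramble(image_a, perm)
scrambled_b = scramble(image_b, perm)
```

Because the same permutation is applied to every image, each pixel position still carries the same information across the whole dataset; only the neighbourhood relations between pixels are destroyed.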
2) Emotion detection with ResNet-18
First, as usual, we downloaded the data. The images are grayscale, and the
ResNet architecture expects RGB input, so we converted them to three-channel
colour images and resized them to (224, 224). We then trained the network from
scratch on the training set and tested it on the test set. The classification
accuracy was around 61%.
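The grayscale-to-RGB conversion and the resizing mentioned above can be sketched as below. This is only an illustration on nested lists: the function names are hypothetical, and nearest-neighbour interpolation is an assumption, because the report does not say which interpolation was used.

```python
def gray_to_rgb(gray):
    """Copy a 2-D grayscale image (list of rows) into three identical channels."""
    return [[row[:] for row in gray] for _ in range(3)]


def resize_nearest(channel, out_h, out_w):
    """Resize one channel with nearest-neighbour interpolation (assumed here)."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def prepare(gray, size=224):
    """Return a 3 x size x size image built from one grayscale image."""
    return [resize_nearest(c, size, size) for c in gray_to_rgb(gray)]


# Tiny 2x2 dummy image, upscaled to 4x4 to keep the example readable.
tiny = [[0, 255],
        [128, 64]]
rgb = prepare(tiny, size=4)
```

All three channels of the result are identical, so the conversion satisfies the input shape that ResNet expects without adding any colour information.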
After that we used a pre-trained ResNet-18 model. In this case we froze all the
convolutional layers and trained only the last fully connected layer of the
model on the training data. You can see that in this case the accuracy dropped
drastically. This is expected. First, ResNet-18 was pre-trained on real RGB
images, whereas here we are forcing our grayscale images into RGB, which is not
a good representation of colour. Second, ResNet-18 was pre-trained on the
ImageNet dataset, which has a huge number of classes and a data distribution
very different from ours, so the frozen convolutional layers of the pre-trained
ResNet-18 are biased towards different features and do not suit our dataset
well.