Deep Learning

Deep learning notes
Dec 2023 Paper Pattern ~ 'C' SCHEME

Q1. Attempt any four from the following.

1(a) Describe underfitting, overfitting and bias-variance trade-off. - Module No. 2 | (5 Marks)
Ans:
1. Underfitting: Underfitting occurs when a machine learning model is too simple to capture the underlying structure of the data. In other words, the model fails to learn the patterns in the training data and performs poorly not only on the training data but also on unseen data (test data). This often happens when the model is too basic or has not been trained for a sufficient number of iterations.
2. Overfitting: Overfitting happens when a machine learning model captures noise or random fluctuations in the training data rather than the underlying relationships. The model becomes too complex, essentially memorizing the training data instead of learning from it. As a result, it performs very well on the training data but generalizes poorly to new, unseen data.
3. Bias-Variance Trade-off: The bias-variance trade-off is a fundamental concept in machine learning that deals with the balance between a model's bias (systematic error) and variance (random error). A high-bias model is typically simple and makes strong assumptions about the data, which may lead to underfitting. On the other hand, a high-variance model is more complex and sensitive to small fluctuations in the training data, which may lead to overfitting. The goal is to find the right balance between bias and variance to achieve optimal performance on unseen data.

1(b) Explain loss functions. How to choose output and loss function? Illustrate with different cases. - Module No. 2 | (5 Marks)
Ans: Loss functions, also known as cost functions or objective functions, are used in machine learning to quantify the difference between the predicted output of a model and the actual output. The goal of training a machine learning model is to minimize the value of the loss function by adjusting the model's parameters, such as weights and biases. There are different types of loss functions used in machine learning, depending on the type of problem being solved. Some commonly used loss functions include:
1. Mean Squared Error (MSE): This loss function is commonly used in regression problems, where the goal is to predict a continuous value. It measures the average of the squared difference between the predicted and actual values.
2. Binary Cross-Entropy: This loss function is commonly used in binary classification problems, where the goal is to predict a binary output (e.g., yes/no, true/false). It measures the distance between the predicted and actual probabilities.
3. Categorical Cross-Entropy: This loss function is commonly used in multi-class classification problems, where the goal is to predict a categorical output (e.g., a label for a set of input features). It measures the difference between the predicted and actual probability distributions.
4. Hinge loss: This loss function is commonly used in support vector machines (SVMs) and other classification models. It measures the margin between the predicted output and the actual output.
Choosing the appropriate loss function is important for training a machine learning model that performs well on the given task. The loss function should be chosen based on the type of problem being solved and the nature of the data. In some cases, it may be necessary to use a custom loss function that is tailored to the specific requirements of the problem. A minimal sketch of how these losses are computed is shown below.
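As an illustration, here is a minimal NumPy sketch of the losses described above; the arrays and values are invented for the example, not taken from the notes:

```python
# A minimal NumPy sketch of the common loss functions; values are made up.
import numpy as np

y_true = np.array([1.0, 0.0, 1.0, 1.0])   # actual outputs
y_pred = np.array([0.9, 0.2, 0.7, 0.6])   # model predictions

# Mean Squared Error: average squared difference (regression).
mse = np.mean((y_true - y_pred) ** 2)

# Binary Cross-Entropy: distance between predicted and actual
# probabilities (binary classification).
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Categorical Cross-Entropy: compares full probability distributions
# (multi-class classification); rows are one-hot labels vs. softmax outputs.
labels = np.array([[1, 0, 0], [0, 1, 0]])
probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
cce = -np.mean(np.sum(labels * np.log(probs), axis=1))

print(mse, bce, cce)
```

In practice, the output activation and the loss are chosen together: a linear output with MSE for regression, a sigmoid output with binary cross-entropy for binary classification, and a softmax output with categorical cross-entropy for multi-class classification.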
1(c) Explain dropout and early stopping regularization methods in deep learning models. - Module No. 2 | (5 Marks)
Ans: Dropout is a regularization technique that is widely used in deep learning to prevent overfitting. Overfitting occurs when a model becomes too complex and starts to fit the training data too closely, leading to poor generalization on unseen data. Dropout is a simple yet effective way to prevent overfitting by randomly dropping out (setting to zero) some of the neurons in a neural network during training.
The idea behind dropout is to train multiple models in parallel by randomly dropping out some of the neurons in a neural network during each training iteration. By doing so, each model will focus on a different subset of features, leading to a more robust and generalized model. The dropout process can be applied to any layer in a neural network, but it is typically applied to the fully connected layers. During each training iteration, a random subset of neurons is selected to be dropped out, with a probability p. The value of p is usually set to between 0.2 and 0.5.
Dropout is typically applied only during training and not during inference. During inference, all the neurons are used, but their outputs are scaled down by the dropout probability to ensure that the total output from the layer remains the same as during training.
Dropout has several benefits:
1. It reduces overfitting by preventing neurons from co-adapting too much, which can lead to over-reliance on certain features.
2. It improves the generalization of the model by creating an ensemble of models with different subsets of neurons.
3. It reduces the need for extensive hyperparameter tuning, such as reducing the learning rate, increasing the batch size, or adding more layers.
Despite its effectiveness, dropout has some limitations:
1. It may increase the training time per epoch because of the need to randomly drop out neurons during training.
2. It may increase the bias of the model, leading to underfitting, if the dropout probability is too high.
3. It may increase the variance of the model, leading to instability, if the dropout probability is too low.
Early stopping is a regularization technique used in deep learning that halts training when the performance of the model on the validation set starts to deteriorate. The idea behind early stopping is that as the model continues to train, it will start to overfit the training data and its performance on the validation set will start to deteriorate. By stopping the training process before this happens, we can prevent the model from overfitting and improve its generalization.
To implement early stopping, we monitor the performance of the model on the validation set after each epoch of training. If the performance on the validation set does not improve for a certain number of epochs, we stop the training process and use the weights of the model that achieved the best performance on the validation set as the final model.
Early stopping has several benefits:
1. It helps prevent overfitting and improve the generalization of the model.
2. It can save time and computational resources by stopping the training process early, before it reaches a point of diminishing returns.
3. It is a simple and effective technique that can be easily implemented in most deep learning frameworks.
Early stopping is commonly used in many deep learning models, and it is a standard part of most deep learning frameworks. It is a powerful technique that can help improve the performance of models on a wide range of tasks, but it requires careful monitoring of the performance of the model on the validation set to achieve optimal results. A short sketch of both techniques in Keras follows.
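A minimal Keras sketch of both techniques (illustrative, not from the notes): dropout layers inside the model, and an EarlyStopping callback watching validation loss. The data here is random and stands in for a real dataset, and the layer sizes are arbitrary.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

x_train = np.random.rand(200, 20).astype("float32")
y_train = np.random.randint(0, 2, size=(200, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),              # randomly zero 30% of units each step
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Stop if validation loss has not improved for 5 epochs in a row, and keep
# the weights from the best epoch as the final model.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

model.fit(x_train, y_train, validation_split=0.2,
          epochs=100, callbacks=[early_stop], verbose=0)
```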
1(d) State any five applications of deep learning. - Module No. 1 | (5 Marks)
Ans: The main applications of deep learning can be divided into computer vision, natural language processing (NLP), and reinforcement learning.
Computer vision: In computer vision, deep learning models enable machines to identify and understand visual data. Some of the main applications of deep learning in computer vision include:
* Object detection and recognition: Deep learning models can be used to identify and locate objects within images and videos, making it possible for machines to perform tasks such as self-driving, surveillance, and robotics.
* Image classification: Deep learning models can be used to classify images into categories such as animals, plants, and buildings. This is used in applications such as medical imaging, quality control, and image retrieval.
* Image segmentation: Deep learning models can be used to segment images into different regions, making it possible to identify specific features within images.
Natural language processing (NLP): In NLP, deep learning models enable machines to understand and generate human language. Some of the main applications of deep learning in NLP include:
* Automatic text generation: Deep learning models can learn from a corpus of text, and new text such as summaries and essays can be automatically generated using those trained models.
* Language translation: Deep learning models can translate text from one language to another, making it possible to communicate with people from different linguistic backgrounds.
* Sentiment analysis: Deep learning models can analyze the sentiment of a piece of text, making it possible to determine whether the text is positive, negative, or neutral. This is used in applications such as customer service, social media monitoring, and political analysis.
* Speech recognition: Deep learning models can recognize and transcribe spoken words, making it possible to perform tasks such as speech-to-text conversion, voice search, and voice-controlled devices.
Reinforcement learning: In reinforcement learning, deep learning is used to train agents to take actions in an environment so as to maximize a reward. Some of the main applications of deep learning in reinforcement learning include:
* Game playing: Deep reinforcement learning models have been able to beat human experts at games such as Go, Chess, and Atari.
* Robotics: Deep reinforcement learning models can be used to train robots to perform complex tasks such as grasping objects, navigation, and manipulation.
* Control systems: Deep reinforcement learning models can be used to control complex systems such as power grids, traffic management, and supply chain optimization.

1(e) Describe any five activation functions. - Module No. 2 | (5 Marks)
Ans: Activation functions are used in artificial neural networks to introduce non-linearity into the network and enable it to learn complex relationships between input and output data. Here are brief explanations of several commonly used activation functions:
1. Tanh (hyperbolic tangent) function: The tanh function is a popular activation function used in neural networks. It has a sigmoidal shape, ranging from -1 to 1. This function is often used in hidden layers of neural networks because it allows the network to learn non-linear representations.
2. Logistic (sigmoid) function: The logistic function is another commonly used activation function that is similar to the tanh function. It has a sigmoidal shape and outputs values between 0 and 1. The logistic function is often used in binary classification problems because it maps the input to a probability.
3. Linear function: The linear activation function is simply a linear equation that passes through the origin. This function does not introduce non-linearity into the network, and is often used in regression problems where the output is a continuous variable.
4. Softmax function: The softmax function is commonly used as the activation function in the output layer of neural networks for classification problems. It converts the output of the network into a probability distribution over the different classes, where the sum of the probabilities is equal to 1.
5. ReLU (Rectified Linear Unit) function: The ReLU function is a popular activation function used in deep learning. It is a simple piecewise linear function that outputs the input value if it is positive, and 0 otherwise. The ReLU function has been shown to be effective in training deep neural networks because it allows the network to learn sparse representations and speeds up the training process.
6. Leaky ReLU function: The Leaky ReLU function is a modification of the ReLU function that allows small negative values to pass through the function. It has been shown to be more effective than the ReLU function in some cases, particularly for deep networks with a large number of layers.
Overall, the choice of activation function depends on the nature of the problem being solved and the architecture of the neural network. It is important to experiment with different activation functions to find the one that works best for the given task. A small sketch of these functions follows.
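A small NumPy sketch of the six functions above (illustrative, not from the notes; the test values are invented):

```python
import numpy as np

def tanh(x):                      # ranges from -1 to 1
    return np.tanh(x)

def sigmoid(x):                   # ranges from 0 to 1
    return 1.0 / (1.0 + np.exp(-x))

def linear(x):                    # identity; no non-linearity
    return x

def softmax(x):                   # probability distribution summing to 1
    e = np.exp(x - np.max(x))     # shift for numerical stability
    return e / e.sum()

def relu(x):                      # 0 for negative inputs, identity otherwise
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):    # small slope for negative inputs
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(tanh(x), sigmoid(x), softmax(x), relu(x), leaky_relu(x))
```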
Q2. Attempt All the Questions.

2(a) Explain biological neuron. Differentiate between biological and artificial neuron. - Module No. 1 | (10 Marks)
Ans: Deep learning is a subfield of artificial intelligence that is inspired by the structure and function of biological neurons. In deep learning, a neural network is a computational model that is designed to simulate the behavior of biological neurons and their interactions in the brain.
* A biological neuron consists of a cell body, dendrites, and an axon. Similarly, a neural network consists of an input layer, hidden layers, and an output layer. The input layer receives the input data, the hidden layers process the data through a series of mathematical transformations, and the output layer produces the final output.
* In a neural network, the input data is fed into the input layer, where it is processed by a series of neurons in the hidden layers. Each neuron in the hidden layer receives inputs from the neurons in the previous layer and produces an output, which is then passed on to the next layer. This process continues until the output layer produces the final output.
* The neurons in a neural network are connected by weights, which are values that determine the strength of the connection between two neurons. These weights are learned through a process called training, where the network is presented with a large amount of labeled data and adjusts its weights to minimize the difference between the predicted output and the actual output.
* In deep learning, the activation function of a neuron is an important component that determines the output of the neuron based on its inputs. The most commonly used activation function is the rectified linear unit (ReLU), which outputs the input value if it is positive and zero if it is negative. Other activation functions, such as the sigmoid function and the hyperbolic tangent, are also used in neural networks.
In summary, a biological neuron and an artificial neuron in a neural network have similar structures and functions, but they differ in several respects:
1. Artificial neuron: It is a mathematical model mainly inspired by the biological neuron system in the human brain.
   Biological neuron: It is composed of several processing units known as neurons that are linked together via synapses.
2. Artificial neuron: It processes information in a sequential and centralized manner.
   Biological neuron: It processes information in a parallel and distributive manner.
3. Artificial neuron: It is small in size; its control unit keeps track of all computer-related operations.
   Biological neuron: It is large in size; all processing is managed centrally.
4. Artificial neuron: It processes information at a faster speed.
   Biological neuron: It processes information at a slow speed.
5. Artificial neuron: It cannot perform complex pattern recognition.
   Biological neuron: The large quantity and complexity of the connections allow the brain to perform complicated tasks.
6. Artificial neuron: It does not provide any feedback.
   Biological neuron: It provides feedback.
7. Artificial neuron: There is no fault tolerance.
   Biological neuron: It has fault tolerance.
8. Artificial neuron: Its operating environment is well-defined and well-constrained.
   Biological neuron: Its operating environment is poorly defined and unconstrained.
9. Artificial neuron: Its memory is separate from the processor, localized, and non-content-addressable.
   Biological neuron: Its memory is integrated into the processor, distributed, and content-addressable.
10. Artificial neuron: It is very vulnerable.
    Biological neuron: It is robust.
A small numerical sketch of a single artificial neuron follows.
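A tiny numerical sketch (not from the notes; the input, weight, and bias values are invented) of one artificial neuron: a weighted sum of its inputs plus a bias, passed through an activation function.

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])   # inputs from the previous layer
w = np.array([0.4, 0.3, -0.2])   # learned connection weights
b = 0.1                          # learned bias

z = np.dot(w, x) + b             # weighted sum: 0.2 - 0.3 - 0.4 + 0.1 = -0.4
output = max(0.0, z)             # ReLU activation: negative sum -> output 0.0
print(z, output)
```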
2(b) Describe the backpropagation training algorithm. - Module No. 5 | (10 Marks)
Ans: Backpropagation is a widely used algorithm for training feedforward neural networks. It computes the gradient of the loss function with respect to the network weights. It is very efficient compared with naively computing the gradient with respect to each weight directly. This efficiency makes it possible to use gradient methods to train multi-layer networks and update weights to minimize loss; variants such as gradient descent or stochastic gradient descent are often used.
The backpropagation algorithm works by computing the gradient of the loss function with respect to each weight via the chain rule, computing the gradient layer by layer, and iterating backward from the last layer to avoid redundant computation of intermediate terms in the chain rule.
Training Algorithm:
Step 1: Initialize the weights to small random values.
Step 2: While the stopping condition is false, do steps 3 to 10.
Step 3: For each training pair, do steps 4 to 9 (feed-forward).
Step 4: Each input unit receives the input signal xi and transmits it to all units in the hidden layer.
Step 5: Each hidden unit Zj (j = 1 to p) sums its weighted input signals to calculate its net input: z_inj = v0j + Σ xi·vij (i = 1 to n). Applying the activation function gives zj = f(z_inj), and this signal is sent to all units in the next layer, i.e., the output units.
Step 6: Each output unit Yk (k = 1 to m) sums its weighted input signals: y_ink = w0k + Σ zj·wjk (j = 1 to p), and applies its activation function to calculate the output signal: yk = f(y_ink).
The remaining steps backpropagate the error terms and update the weights; a sketch of the full forward and backward pass is given below.
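A hedged NumPy sketch of the algorithm under the notation above (not from the notes): one hidden layer, sigmoid activation f, v as input-to-hidden weights and w as hidden-to-output weights, with biases omitted for brevity and invented data.

```python
import numpy as np

def f(x):                        # sigmoid activation
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x = rng.random(3)                # one training input (n = 3)
t = np.array([1.0])              # target output (m = 1)
v = rng.normal(0, 0.5, (3, 4))   # Step 1: small random weights (p = 4)
w = rng.normal(0, 0.5, (4, 1))
lr = 0.5

for _ in range(100):             # Step 2: loop until stopping condition
    # Steps 4-6: feed-forward
    z_in = x @ v                 # z_inj = sum_i xi * vij
    z = f(z_in)                  # zj = f(z_inj)
    y_in = z @ w                 # y_ink = sum_j zj * wjk
    y = f(y_in)                  # yk = f(y_ink)

    # Backward pass: gradients via the chain rule, layer by layer
    delta_y = (y - t) * y * (1 - y)          # output-layer error term
    delta_z = (delta_y @ w.T) * z * (1 - z)  # propagated to hidden layer

    # Weight updates (gradient descent)
    w -= lr * np.outer(z, delta_y)
    v -= lr * np.outer(x, delta_z)

print(y)   # the prediction approaches the target 1.0
```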
Q3. Attempt All the Questions.

3(a) Explain any two types of autoencoders. - Module No. 3 | (10 Marks)
Ans: There are diverse types of autoencoders; the main types, with the advantages and disadvantages associated with each variation, are analyzed below.
1. Denoising Autoencoder
A denoising autoencoder works on a partially corrupted input and trains to recover the original, undistorted image. This method is an effective way to constrain the network from simply copying the input, so that it learns the underlying structure and important features of the data.
Advantages
1. This type of autoencoder can extract important features and reduce the noise or useless features.
2. Denoising autoencoders can be used as a form of data augmentation: the restored images can be used as augmented data, thus generating additional training samples.
Disadvantages
1. Selecting the right type and level of noise to introduce can be challenging and may require domain knowledge.
2. The denoising process can result in the loss of some information that is needed from the original input. This loss can impact the accuracy of the output.
2. Sparse Autoencoder
This type of autoencoder typically contains more hidden units than the input, but only a few are allowed to be active at once. This property is called the sparsity of the network. The sparsity of the network can be controlled by manually zeroing the required hidden units, tuning the activation functions, or adding a loss term to the cost function.
Advantages
1. The sparsity constraint in sparse autoencoders helps in filtering out noise and irrelevant features during the encoding process.
2. These autoencoders often learn important and meaningful features due to their emphasis on sparse activations.
Disadvantages
1. The choice of hyperparameters plays a significant role in the performance of this autoencoder. Different inputs should result in the activation of different nodes of the network.
2. The application of the sparsity constraint increases computational complexity.
3. Variational Autoencoder
A variational autoencoder makes strong assumptions about the distribution of latent variables and uses the Stochastic Gradient Variational Bayes estimator in the training process. It assumes that the data is generated by a directed graphical model and tries to learn an approximation qφ(z|x) to the conditional probability pθ(z|x), where φ and θ are the parameters of the encoder and the decoder respectively.
Advantages
1. Variational autoencoders are used to generate new data points that resemble the original training data. These samples are learned from the latent space.
2. A variational autoencoder is a probabilistic framework that is used to learn a compressed representation of the data that captures its underlying structure and variations, so it is useful in detecting anomalies and in data exploration.
Disadvantages
1. Variational autoencoders use approximations to estimate the true distribution of the latent variables. This approximation introduces some level of error, which can affect the quality of generated samples.
2. The generated samples may only cover a limited subset of the true data distribution. This can result in a lack of diversity in generated samples.
A minimal sketch of a denoising autoencoder follows.
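A minimal Keras sketch of a denoising autoencoder (illustrative, not from the notes): the model receives noise-corrupted inputs and is trained to reconstruct the clean originals. Random data stands in for real images, and the layer sizes are arbitrary.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

x_clean = np.random.rand(256, 784).astype("float32")
x_noisy = np.clip(x_clean + 0.2 * np.random.randn(256, 784), 0.0, 1.0).astype("float32")

autoencoder = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    layers.Dense(64, activation="relu"),      # encoder
    layers.Dense(32, activation="relu"),      # bottleneck
    layers.Dense(64, activation="relu"),      # decoder
    layers.Dense(784, activation="sigmoid"),  # reconstruction
])
autoencoder.compile(optimizer="adam", loss="mse")

# Corrupted input in, clean target out: the network cannot simply copy its
# input, so it must learn the underlying structure of the data.
autoencoder.fit(x_noisy, x_clean, epochs=5, batch_size=32, verbose=0)
```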
3(b) Explain the McCulloch-Pitts model. Design a simulation of the NOR gate and NAND gate using the McCulloch-Pitts model. - Module No. 1 | (10 Marks)
Ans: NA (a sketch is provided below).
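The notes leave this answer blank. As a gap-filling sketch: a McCulloch-Pitts neuron takes binary inputs, applies fixed weights, and fires (outputs 1) when the weighted sum reaches a threshold. The weights and thresholds below are one common convention using inhibitory (negative) weights, not taken from the notes.

```python
# McCulloch-Pitts neuron: output 1 if the weighted sum of binary inputs
# meets the threshold, else 0.
from itertools import product

def mp_neuron(inputs, weights, threshold):
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

for x1, x2 in product([0, 1], repeat=2):
    nor  = mp_neuron((x1, x2), weights=(-1, -1), threshold=0)   # fires only on (0,0)
    nand = mp_neuron((x1, x2), weights=(-1, -1), threshold=-1)  # fires unless (1,1)
    print(f"x1={x1} x2={x2}  NOR={nor}  NAND={nand}")
```

Checking the truth tables: for NOR, only (0,0) gives a weighted sum of 0, which meets the threshold 0; for NAND, every combination except (1,1) gives a sum of at least -1.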
Q4. Attempt All the Questions.

4(a) Explain input shape, output shape, filter, padding, stride, tensors. - Module No. 4 | (10 Marks)
Ans:
Input Shape: The input shape refers to the dimensions or structure of the input data that is fed into the model. For example, in an image classification task, the input shape could be (width, height, channels), where width and height are the dimensions of the image, and channels refer to the color channels (e.g., RGB has 3 channels).
Output Shape: Similarly, the output shape is the dimensions or structure of the output produced by the model after processing the input data. In a classification task, the output shape could be (number of classes), representing the probabilities or scores for each class.
Filter: In the context of convolutional neural networks (CNNs), a filter (also known as a kernel) is a small matrix used for convolution operations. Filters are applied to the input data to extract features by performing element-wise multiplication and summation operations.
Padding: Padding is a technique used in CNNs to add extra pixels or values around the edges of the input data. Padding can be "valid" (no padding) or "same" (adding padding to maintain the output size). It helps in preserving spatial information and handling edge effects during convolution operations.
Stride: Stride refers to the number of pixels or units the filter/kernel moves during the convolution operation. A stride of 1 means the filter moves one pixel at a time, while a larger stride skips pixels, resulting in a smaller output size.
Tensors: Tensors are multi-dimensional arrays used to represent data in deep learning frameworks like TensorFlow and PyTorch. In the context of neural networks, tensors are used to store and manipulate input data, model parameters (weights and biases), and intermediate outputs during the computation process. Scalars (0D tensors), vectors (1D tensors), matrices (2D tensors), and higher-dimensional arrays are all examples of tensors.
A small sketch relating these terms is shown below.
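A small TensorFlow sketch (illustrative, not from the notes; the image size and filter count are invented) showing how filter size, padding, and stride determine a convolutional layer's output shape:

```python
import tensorflow as tf

x = tf.random.normal((1, 28, 28, 3))   # input tensor: one 28x28 image, 3 channels

conv = tf.keras.layers.Conv2D(filters=16, kernel_size=3, strides=2, padding="valid")
y = conv(x)

# With "valid" (no) padding: output = floor((28 - 3) / 2) + 1 = 13 per
# spatial dimension, with 16 feature maps (one per filter).
print(y.shape)   # (1, 13, 13, 16)
```

With padding="same" and stride 1 the output would stay 28x28; with stride 2 it would be ceil(28 / 2) = 14x14.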
4(b) Illustrate any two applications of GAN. - Module No. 6 | (10 Marks)
Ans: GANs, or Generative Adversarial Networks, have many uses in many different fields. Here are some of the widely recognized uses of GANs:
1. Image Synthesis and Generation: GANs are often used for picture synthesis and generation tasks. They may create fresh, lifelike pictures that mimic training data by learning the distribution that explains the dataset. The development of lifelike avatars, high-resolution photographs, and fresh artwork has been facilitated by these types of generative networks.
2. Image-to-Image Translation: GANs may be used for problems involving image-to-image translation, where the objective is to convert an input picture from one domain to another while maintaining its key features. GANs may be used, for instance, to change pictures from day to night, transform drawings into realistic images, or change the artistic style of an image.
3. Text-to-Image Synthesis: GANs have been used to create visuals from descriptions in text. Given a text input, such as a phrase or a caption, GANs may produce pictures that correspond to the description. This application might have an impact on how realistic visual material is produced using text-based instructions.
4. Data Augmentation: GANs can augment present data and increase the robustness and generalizability of machine-learning models by creating synthetic data samples.
5. Image Super-Resolution: GANs can enhance the resolution and quality of low-resolution images. By training on pairs of low-resolution and high-resolution images, GANs can generate high-resolution images from low-resolution inputs, enabling improved image quality in various applications such as medical imaging, satellite imaging, and video enhancement.

Q5. Attempt All the Questions.

5(a) Explain the architecture and training of GAN. - Module No. 6 | (10 Marks)
Ans: A Generative Adversarial Network (GAN) is composed of two primary parts: the Generator and the Discriminator.
Generator Model
A key element responsible for creating fresh, accurate data in a Generative Adversarial Network (GAN) is the generator model. The generator takes random noise as input and converts it into complex data samples, such as text or images. It is commonly depicted as a deep neural network. The training data's underlying distribution is captured by layers of learnable parameters in its design through training. The generator adjusts its output to produce samples that closely mimic real data as it is being trained, using backpropagation to fine-tune its parameters. The generator's ability to generate high-quality, varied samples that can fool the discriminator is what makes it successful.
Discriminator Model
An artificial neural network called a discriminator model is used in Generative Adversarial Networks (GANs) to differentiate between generated and actual input. By evaluating input samples and assigning a probability of authenticity, the discriminator functions as a binary classifier. Over time, the discriminator learns to differentiate between genuine data from the dataset and artificial samples created by the generator. This allows it to progressively hone its parameters and increase its level of proficiency. Convolutional layers, or structures pertinent to other modalities, are usually used in its architecture when dealing with picture data. Maximizing the discriminator's capacity to accurately identify generated samples as fraudulent and real samples as authentic is the aim of the adversarial training procedure. The discriminator grows increasingly discriminating as a result of the generator and discriminator's interaction, which helps the GAN produce extremely realistic-looking synthetic data overall.
[Figure: GAN architecture. A random latent variable is fed to the generator, which produces generated fake samples; the discriminator receives both real samples and fake samples, decides whether each is real, and its feedback is used to fine-tune training.]
A hedged sketch of one adversarial training step follows.
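A hedged TensorFlow sketch of one GAN training step (illustrative, not from the notes): the discriminator learns to separate real from fake, and the generator learns to fool it. The model sizes, latent dimension, and data are stand-ins.

```python
import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 32
generator = tf.keras.Sequential([
    tf.keras.Input(shape=(latent_dim,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(784, activation="sigmoid"),   # a fake "image"
])
discriminator = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),     # probability of "real"
])
bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images):
    noise = tf.random.normal((tf.shape(real_images)[0], latent_dim))
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise)
        real_out = discriminator(real_images)
        fake_out = discriminator(fake_images)
        # Discriminator: push real -> 1 and fake -> 0.
        d_loss = bce(tf.ones_like(real_out), real_out) + \
                 bce(tf.zeros_like(fake_out), fake_out)
        # Generator: wants the discriminator to output 1 on fakes.
        g_loss = bce(tf.ones_like(fake_out), fake_out)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))

train_step(tf.random.uniform((16, 784)))   # a stand-in batch of "real" data
```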
Q6. Attempt All the Questions.

6(a) Explain the architecture of CNN.
Ans: CNN (Convolutional Neural Network) architecture is a class of deep neural networks used in computer vision tasks.
The basic idea behind CNNs is to use convolutional layers to extract local features from the input image, and pooling layers to reduce the dimensionality of the feature maps.
CNN architecture typically consists of several layers, including:
1. Convolutional layers: These layers apply a set of learnable filters to the input image, producing a set of feature maps that highlight local patterns and structures in the image.
2. Activation functions: These functions introduce nonlinearity into the model, allowing it to learn complex, nonlinear relationships between the input and output.
3. Pooling layers: These layers reduce the dimensionality of the feature maps by downsampling them using a pooling operation, such as max pooling or average pooling.
4. Fully connected layers: These layers take the output of the convolutional and pooling layers and use them to make predictions about the input image.
CNN architecture can be customized based on the specific task at hand. For example, if the task is object detection, the output of the network may be fed into separate classification and regression branches to predict the object class and bounding box coordinates, respectively. If the task is semantic segmentation, the network may use skip connections to fuse features at multiple scales to produce a dense pixel-wise prediction.
CNNs have achieved state-of-the-art results on a wide range of computer vision tasks and are widely used in industry and academia. However, designing an optimal CNN architecture can be challenging and often requires a combination of domain expertise, experimentation, and automated search methods such as neural architecture search.

6(b) What is the difference between feed-forward neural network and RNN? - Module No. 5 | (5 Marks)
Ans:
Comparison Attribute | Feed-forward Neural Networks | Recurrent Neural Networks
Signal flow direction | Forward only | Bidirectional
Delay introduced | No | Yes
Complexity | Low | High
Neuron independence in the same layer | Yes | No
Speed | High | Slow
Commonly used for | Pattern recognition, speech recognition, and character recognition | Language translation, speech-to-text conversion, and robotic control

6(c) Explain the process of convolution as backpropagation. - Module No. 5 | (5 Marks)
Ans: The process of convolution in neural networks involves applying a filter (also known as a kernel) to an input image or feature map to produce an output feature map. Each element in the output feature map is computed by taking the dot product between the filter and a corresponding patch of the input image. This operation is repeated across the entire input image, sliding the filter over the input spatially.
When it comes to backpropagation in convolutional neural networks (CNNs), it is essentially the process of computing gradients of the loss function with respect to the parameters (i.e., filter weights and biases) of the convolutional layers. Here is how it works:
1. Forward Pass: During the forward pass, the input image is convolved with the filters in each convolutional layer to produce output feature maps. These feature maps are then passed through activation functions (e.g., ReLU) and possibly other operations (e.g., pooling).
2. Backward Pass: In the backward pass, the gradients of the loss function with respect to the output feature maps are computed first. These gradients are then propagated backward through the network to compute the gradients with respect to the parameters of the convolutional layers (i.e., filter weights and biases).
3. Gradient Calculation: To compute the gradients with respect to the filter weights, the gradients of the loss function with respect to each element in the output feature maps are convolved with the corresponding patches of the input feature maps. This is essentially the cross-correlation operation, but with flipped filters (which is equivalent to convolution).
4. Updating Parameters: Once the gradients with respect to the filter weights are computed, the parameters (filter weights and biases) of the convolutional layers are updated using an optimization algorithm such as gradient descent.
5. Propagation to Previous Layers: The gradients computed for the input feature maps are then backpropagated to the previous layers in the network, allowing for the computation of gradients with respect to their parameters as well.
A small sketch of the filter-weight gradient computation follows.
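A small NumPy sketch of step 3 above (illustrative, not from the notes; the sizes and upstream gradient are invented): the gradient of a valid 2D convolution with respect to its filter weights is the cross-correlation of the input patches with the output-feature-map gradients.

```python
import numpy as np

x = np.random.rand(5, 5)          # input feature map
k = 3                             # filter size
out = 5 - k + 1                   # output size for a valid convolution
d_out = np.random.rand(out, out)  # upstream gradient dL/d(output), given

d_w = np.zeros((k, k))            # gradient dL/d(filter weights)
for i in range(out):
    for j in range(out):
        # Each output element saw the input patch at (i, j); its gradient
        # contributes that patch scaled by dL/d(output[i, j]).
        d_w += d_out[i, j] * x[i:i + k, j:j + k]

print(d_w)
```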
6(d) Describe the different types of GANs. - Module No. 6 | (5 Marks)
Ans: Types of GANs:
1. Vanilla GAN: This is the simplest type of GAN. Here, the Generator and the Discriminator are simple, basic multi-layer perceptrons. In a vanilla GAN, the algorithm is really simple: it tries to optimize the mathematical equation using stochastic gradient descent.
2. Conditional GAN (CGAN): CGAN can be described as a deep learning method in which some conditional parameters are put into place. In CGAN, an additional parameter 'y' is added to the Generator for generating the corresponding data. Labels are also put into the input to the Discriminator in order for the Discriminator to help distinguish the real data from the fake generated data.
3. Deep Convolutional GAN (DCGAN): DCGAN is one of the most popular and also the most successful implementations of GAN. It is composed of ConvNets in place of multi-layer perceptrons. The ConvNets are implemented without max pooling, which is in fact replaced by convolutional stride. Also, the layers are not fully connected.
4. Laplacian Pyramid GAN (LAPGAN): The Laplacian pyramid is a linear invertible image representation consisting of a set of band-pass images, spaced an octave apart, plus a low-frequency residual. This approach uses multiple Generator and Discriminator networks and different levels of the Laplacian pyramid. It is mainly used because it produces very high-quality images. The image is down-sampled at first at each layer of the pyramid, and then it is again up-scaled at each layer in a backward pass, where the image acquires some noise from the Conditional GAN at these layers until it reaches its original size.
5. Super Resolution GAN (SRGAN): SRGAN, as the name suggests, is a way of designing a GAN in which a deep neural network is used along with an adversarial network in order to produce higher-resolution images. This type of GAN is particularly useful in optimally up-scaling native low-resolution images to enhance their details, minimizing errors while doing so.

6(e) Explain the architecture of RNN. - Module No. 5 | (5 Marks)
Ans: A Recurrent Neural Network (RNN) is a type of neural network where the output from the previous step is fed as input to the current step. In traditional neural networks, all the inputs and outputs are independent of each other. Still, in cases when it is required to predict the next word of a sentence, the previous words are required, and hence there is a need to remember the previous words. Thus RNN came into existence, which solved this issue with the help of a hidden layer.
The main and most important feature of RNN is its hidden state, which remembers some information about a sequence. The state is also referred to as the memory state, since it remembers the previous input to the network. It uses the same parameters for each input, as it performs the same task on all the inputs or hidden layers to produce the output. This reduces the complexity of parameters, unlike other neural networks.
RNNs have the same input and output architecture as any other deep neural architecture. However, differences arise in the way information flows from input to output. Unlike deep neural networks, where we have a different weight matrix for each dense layer, in an RNN the weights across the network remain the same. The network calculates the hidden state h_t for every input x_t using the following formulas:
h_t = σ(U·x_t + W·h_{t-1} + b)
y_t = σ(V·h_t + c)
Hence y_t = f(x_t, h_{t-1}, W, U, V, b, c).
Here S is the state matrix, which has element s_i as the state of the network at timestep i. The parameters in the network are W, U, V, b, c, which are shared across timesteps.
[Figure: Recurrent Neural Architecture, a chain of RNN cells unrolled over timesteps; each cell takes input x_i and the previous state s_{i-1} and produces state s_i.]
A minimal sketch of this recurrence follows.
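A minimal NumPy sketch of the recurrence above (illustrative, not from the notes): the same U, W, V, b, c are reused at every timestep, and the hidden state carries information forward. The dimensions and random data are invented.

```python
import numpy as np

def sigma(x):
    return np.tanh(x)             # a common choice of activation

rng = np.random.default_rng(1)
U = rng.normal(size=(4, 3))       # input -> hidden
W = rng.normal(size=(4, 4))       # hidden -> hidden (the recurrence)
V = rng.normal(size=(2, 4))       # hidden -> output
b, c = np.zeros(4), np.zeros(2)

h = np.zeros(4)                   # initial hidden state
for x_t in rng.normal(size=(5, 3)):      # a sequence of 5 inputs
    h = sigma(U @ x_t + W @ h + b)       # h_t = sigma(U x_t + W h_{t-1} + b)
    y = sigma(V @ h + c)                 # y_t = sigma(V h_t + c)
    print(y)
```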