Minor Changes in Text #955

Merged: 1 commit, Sep 18, 2018
18 changes: 9 additions & 9 deletions neural_nets.ipynb
@@ -32,13 +32,13 @@
"\n",
"### Overview\n",
"\n",
"Although the Perceptron may seem like a good way to make classifications, it is a linear classifier (which, roughly, means it can only draw straight lines to divide spaces) and therefore it can be stumped by more complex problems. We can extend Perceptron to solve this issue, by employing multiple layers of its functionality. The construct we are left with is called a Neural Network, or a Multi-Layer Perceptron, and it is a non-linear classifier. It achieves that by combining the results of linear functions on each layer of the network.\n",
"Although the Perceptron may seem like a good way to make classifications, it is a linear classifier (which, roughly, means it can only draw straight lines to divide spaces) and therefore it can be stumped by more complex problems. To solve this issue we can extend Perceptron by employing multiple layers of its functionality. The construct we are left with is called a Neural Network, or a Multi-Layer Perceptron, and it is a non-linear classifier. It achieves that by combining the results of linear functions on each layer of the network.\n",
"\n",
"Similar to the Perceptron, this network also has an input and output layer. However it can also have a number of hidden layers. These hidden layers are responsible for the non-linearity of the network. The layers are comprised of nodes. Each node in a layer (excluding the input one), holds some values, called *weights*, and takes as input the output values of the previous layer. The node then calculates the dot product of its inputs and its weights and then activates it with an *activation function* (sometimes a sigmoid). Its output is fed to the nodes of the next layer. Note that sometimes the output layer does not use an activation function, or uses a different one from the rest of the network. The process of passing the outputs down the layer is called *feed-forward*.\n",
"Similar to the Perceptron, this network also has an input and output layer; however, it can also have a number of hidden layers. These hidden layers are responsible for the non-linearity of the network. The layers are comprised of nodes. Each node in a layer (excluding the input one), holds some values, called *weights*, and takes as input the output values of the previous layer. The node then calculates the dot product of its inputs and its weights and then activates it with an *activation function* (e.g. sigmoid activation function). Its output is then fed to the nodes of the next layer. Note that sometimes the output layer does not use an activation function, or uses a different one from the rest of the network. The process of passing the outputs down the layer is called *feed-forward*.\n",
"\n",
"After the input values are fed-forward into the network, the resulting output can be used for classification. The problem at hand now is how to train the network (ie. adjust the weights in the nodes). To accomplish that we utilize the *Backpropagation* algorithm. In short, it does the opposite of what we were doing up to now. Instead of feeding the input forward, it will feed the error backwards. So, after we make a classification, we check whether it is correct or not, and how far off we were. We then take this error and propagate it backwards in the network, adjusting the weights of the nodes accordingly. We will run the algorithm on the given input/dataset for a fixed amount of time, or until we are satisfied with the results. The number of times we will iterate over the dataset is called *epochs*. In a later section we take a detailed look at how this algorithm works.\n",
"After the input values are fed-forward into the network, the resulting output can be used for classification. The problem at hand now is how to train the network (i.e. adjust the weights in the nodes). To accomplish that we utilize the *Backpropagation* algorithm. In short, it does the opposite of what we were doing up to this point. Instead of feeding the input forward, it will track the error backwards. So, after we make a classification, we check whether it is correct or not, and how far off we were. We then take this error and propagate it backwards in the network, adjusting the weights of the nodes accordingly. We will run the algorithm on the given input/dataset for a fixed amount of time, or until we are satisfied with the results. The number of times we will iterate over the dataset is called *epochs*. In a later section we take a detailed look at how this algorithm works.\n",
"\n",
"NOTE: Sometimes we add to the input of each layer another node, called *bias*. This is a constant value that will be fed to the next layer, usually set to 1. The bias generally helps us \"shift\" the computed function to the left or right."
"NOTE: Sometimes we add another node to the input of each layer, called *bias*. This is a constant value that will be fed to the next layer, usually set to 1. The bias generally helps us \"shift\" the computed function to the left or right."
]
},
{
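The overview above describes each node taking the dot product of its inputs and weights and passing the result through an activation function. A minimal NumPy sketch of that feed-forward pass (this is illustrative only, not the repository's NeuralNetLearner; the layer sizes, random weights, and sigmoid choice are assumptions, and the bias node is omitted for brevity):

```python
import numpy as np

def sigmoid(x):
    # Logistic activation: squashes the weighted sum into (0, 1)
    return 1 / (1 + np.exp(-x))

def feed_forward(x, weights):
    """Propagate an input vector through each layer's weight matrix."""
    activation = np.asarray(x, dtype=float)
    for W in weights:
        # Dot product of the previous layer's outputs and this layer's weights,
        # followed by the activation function
        activation = sigmoid(W @ activation)
    return activation

# Illustrative 2-3-2 network with random weights (sizes are hypothetical)
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(2, 3))]
print(feed_forward([0.5, -1.0], weights))
```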
@@ -60,7 +60,7 @@
"\n",
"The NeuralNetLearner returns the `predict` function which, in short, can receive an example and feed-forward it into our network to generate a prediction.\n",
"\n",
"In more detail, the example values are first passed to the input layer and then they are passed through the rest of the layers. Each node calculates the dot product of its inputs and its weights, activates it and pushes it to the next layer. The final prediction is the node with the maximum value from the output layer."
"In more detail, the example values are first passed to the input layer and then they are passed through the rest of the layers. Each node calculates the dot product of its inputs and its weights, activates it and pushes it to the next layer. The final prediction is the node in the output layer with the maximum value."
]
},
{
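As a rough sketch of the prediction step described in this hunk (again illustrative, not the actual predict function returned by NeuralNetLearner): feed the example forward and take the output node with the maximum value.

```python
import numpy as np

def predict(example, weights):
    """Feed an example forward and return the index of the strongest output node."""
    a = np.asarray(example, dtype=float)
    for W in weights:
        a = 1 / (1 + np.exp(-(W @ a)))   # dot product, then sigmoid activation
    return int(np.argmax(a))             # class = output node with the maximum value
```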
@@ -80,7 +80,7 @@
"\n",
"### Overview\n",
"\n",
"In both the Perceptron and the Neural Network, we are using the Backpropagation algorithm to train our weights. Basically it achieves that by propagating the errors from our last layer into our first layer, this is why it is called Backpropagation. In order to use Backpropagation, we need a cost function. This function is responsible for indicating how good our neural network is for a given example. One common cost function is the *Mean Squared Error* (MSE). This cost function has the following format:\n",
"In both the Perceptron and the Neural Network, we are using the Backpropagation algorithm to train our model by updating the weights. This is achieved by propagating the errors from our last layer (output layer) back to our first layer (input layer), this is why it is called Backpropagation. In order to use Backpropagation, we need a cost function. This function is responsible for indicating how good our neural network is for a given example. One common cost function is the *Mean Squared Error* (MSE). This cost function has the following format:\n",
"\n",
"$$MSE=\\frac{1}{n} \\sum_{i=1}^{n}(y - \\hat{y})^{2}$$\n",
"\n",
@@ -169,7 +169,7 @@
"source": [
"### Implementation\n",
"\n",
"First, we feed-forward the examples in our neural network. After that, we calculate the gradient for each layer weights. Once that is complete, we update all the weights using gradient descent. After running these for a given number of epochs, the function returns the trained Neural Network."
"First, we feed-forward the examples in our neural network. After that, we calculate the gradient for each layers' weights by using the chain rule. Once that is complete, we update all the weights using gradient descent. After running these for a given number of epochs, the function returns the trained Neural Network."
]
},
{
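The implementation note above summarises the training loop: feed-forward, chain-rule gradients, gradient-descent updates, repeated for a number of epochs. Below is a compact sketch of that loop for a single hidden layer with sigmoid activations and squared error. It is an illustrative stand-in, not the repository's implementation; the network size, learning rate, epoch count, bias handling, and toy data are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def train(X, Y, hidden=4, epochs=5000, lr=0.5, seed=None):
    """Backpropagation with gradient descent on a single-hidden-layer network.
    Sizes, learning rate, and epoch count are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    X = np.hstack([X, np.ones((len(X), 1))])       # bias node fixed to 1 on the input layer
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    W2 = rng.normal(scale=0.5, size=(hidden, Y.shape[1]))
    for _ in range(epochs):
        # Feed-forward
        h = sigmoid(X @ W1)                        # hidden layer outputs
        y_hat = sigmoid(h @ W2)                    # output layer outputs
        # Backpropagate the error (chain rule through each sigmoid)
        delta_out = (y_hat - Y) * y_hat * (1 - y_hat)
        delta_hidden = (delta_out @ W2.T) * h * (1 - h)
        # Gradient descent update of every layer's weights
        W2 -= lr * h.T @ delta_out
        W1 -= lr * X.T @ delta_hidden
    return W1, W2

# Toy XOR-style dataset (made up); outputs should move toward [0, 1, 1, 0],
# though this is not guaranteed since it depends on the random initial weights
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train(X, Y)
X_b = np.hstack([X, np.ones((len(X), 1))])
print(sigmoid(sigmoid(X_b @ W1) @ W2).round(2))
```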
@@ -206,9 +206,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The output should be 0, which means the item should get classified in the first class, \"setosa\". Note that since the algorithm is non-deterministic (because of the random initial weights) the classification might be wrong. Usually though it should be correct.\n",
"The output should be 0, which means the item should get classified in the first class, \"setosa\". Note that since the algorithm is non-deterministic (because of the random initial weights) the classification might be wrong. Usually though, it should be correct.\n",
"\n",
"To increase accuracy, you can (most of the time) add more layers and nodes. Unfortunately the more layers and nodes you have, the greater the computation cost."
"To increase accuracy, you can (most of the time) add more layers and nodes. Unfortunately, increasing the number of layers or nodes also increases the computation cost and might result in overfitting."
]
}
],