When we were young, we were taught the basics, whether in arithmetic or just a plain understanding of the objects around us. An apple is red, while a banana is yellow. One plus one equals two, and two plus two equals four. To most people, these understandings are deeply learned after a number of examples and trials. Although some (or very few) are quick learners, we all need a set of examples before our brain can finally interpret a given idea. How the nervous system accomplishes this has been a topic of debate in neuroscience since the nineteenth century.
This type of learning was modeled in 1943 by McCulloch and Pitts, particularly how a neuron responds to, collects, and passes information in the central nervous system. This became the first artificial neural network. An illustration of a neuron is shown in Figure 1.
Figure 1. Artificial Model of a Neuron
The basic idea is that a number of inputs (x1, x2, x3) are connected to a neuron via edges, each of which has a corresponding weight (w1, w2, w3). The neuron collects the weighted inputs from the other neurons, sums them up, and passes the sum to the activation function, g. The weights are initialized and a desired output is set; this is the start of the learning process of the network. Based on the deviation of the network's output from the desired output, the weights are adjusted until the deviations are minimal and the network is able to classify the inputs. Determining the weights is thus an iterative process that requires prior knowledge of the desired output. In this activity, we will also try to determine the effect of varying the number of iterations and the learning rate of the network.
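To make the idea concrete, here is a minimal sketch of a single neuron in Python (not the Scilab code used in this activity): the weighted sum of the inputs is passed through a sigmoid activation g, and the weights are nudged in proportion to the deviation from the desired output. The AND-gate data, learning rate, and number of cycles are illustrative choices, not values from the activity.

```python
import numpy as np

def sigmoid(z):
    # activation function g
    return 1.0 / (1.0 + np.exp(-z))

# Training data: two inputs and the desired output of a logical AND gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(size=2)   # weights are initialized randomly
b = 0.0                  # bias term
lr = 1.0                 # learning rate

for cycle in range(5000):                # training cycles (iterations)
    for x, target in zip(X, d):
        y = sigmoid(w @ x + b)           # weighted sum passed to g
        err = target - y                 # deviation from the desired output
        grad = err * y * (1 - y)         # delta rule for a sigmoid neuron
        w += lr * grad * x               # adjust the weights
        b += lr * grad

print(np.round([sigmoid(w @ x + b) for x in X], 2))
```

After enough cycles the neuron's outputs approach the desired 0, 0, 0, 1 pattern, which is the same "learn from repeated examples" behavior described above.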
In the last activity, we worked on pattern recognition, which relies on a set of features that distinguish the different classes to be segregated. That process is an example of linear discriminant analysis. In that activity, I classified five different types of Philippine Peso coins: 5 centavo, 10 centavo, 25 centavo, 1 peso, and 5 peso coins. Each coin has a different area and a different color. However, for some of them the color gradients are relatively similar to one another, especially when other factors such as fading and scratches on the coins are present.
Unlike pattern recognition, where the program performs the same method on all input subjects, a neural network learns the pattern and then uses it to classify the next input. A connection of many neurons makes up a neural network. It typically consists of an input layer, a hidden layer, and an output layer, as shown in Figure 2.
Figure 2. Neural Network
The input layer receives the features, say size and color, and feeds them to the hidden layer, where they are acted upon and passed on to the output layer. In supervised learning, a desired output is set and error back-propagation is performed: the difference between the output of the network and the desired output is computed and, together with the activation functions in each layer, the error derivatives with respect to the weights are computed until an epoch is attained. An epoch is the point at which all the data sets have been entered into the network, and it is then that the weights are modified.
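The back-propagation step described above can be sketched in a few lines of Python. This is an illustrative toy example, not the Scilab toolbox used in the activity: a network with one hidden layer is trained on the XOR problem (a standard demonstration case, since a single neuron cannot solve it), and the output error is propagated backward through each layer to update the weights. The layer sizes, learning rate, and epoch count are assumptions chosen for the demo.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR data: input layer gets two features, output layer has one neuron
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(1)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # input -> hidden weights
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # hidden -> output weights
lr = 1.0

for epoch in range(20000):
    # forward pass through the hidden and output layers
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # backward pass: error derivatives at the output, then at the hidden layer
    delta_out = (d - y) * y * (1 - y)
    delta_hid = (delta_out @ W2.T) * h * (1 - h)
    # weights are modified after each pass over the whole data set (one epoch)
    W2 += lr * h.T @ delta_out; b2 += lr * delta_out.sum(axis=0)
    W1 += lr * X.T @ delta_hid; b1 += lr * delta_hid.sum(axis=0)

print(np.round(y.ravel(), 2))
```

The printed outputs approach the desired 0, 1, 1, 0 pattern, showing how propagating the error derivatives backward through the layers lets the network learn a mapping that a single neuron cannot.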
The ANN (artificial neural network) toolbox in Scilab was used in this activity. The toolbox is straightforward. I had thought about coding the error back-propagation myself, just like what we did in AP 156, but thanks to this toolbox, the activity was made relatively simpler and easier. I think the crucial thing here is to know and understand how a neural network works.
The image below is the plot of the predictions performed by my neural network program for a set of 50 training data points, 10 for each classification.
Figure 3. Result of the algorithm for the test data, ten for each classification. The red points correspond to the desired output and the blue points to the output of the network.
Notice that the result of the program is most accurate in classifying the 5 peso coin.
The ANN toolbox gives the user the luxury of changing parameters such as the number of training cycles and the learning rate. I tried to determine the effect of these two parameters and obtained the following plot.
Figure 4. % Error as a function of increasing training cycle
The y-axis corresponds to the percent deviation of the program's result for a single class; in this study, I only used the 5-centavo coin class. As you increase the number of training cycles, the result of the ANN becomes more accurate, which is intuitive since you feed your neuron with more and more examples. Isn't it very similar to the human brain? :) The more we are exposed to a certain thing, the more acquainted and familiar we become with it.
Now, how about the learning rate? Is it intuitive that the higher the learning rate, the more accurate the result should be? Well, apparently, that is not always the case, as suggested by the following plot.
Figure 5. % error for increasing values of the learning rate [0.5, 10]
There is a certain range where increasing the learning rate decreases the accuracy of the result, and that happens at a learning rate of about 4. Beyond this, increasing the learning rate makes the result more accurate and closer to the correct values. This result is quite interesting! :)
And that's it! We're down to the last activity. Now, we'll proceed to the project. My project is about the topography of a granular collapse, which is my current research topic. I'm glad Dr. Soriano has allowed us to do projects related to our own research. So yeah, lucky me!
For this activity though, I give myself a grade of 11/10 for investigating the effects of different learning rates and training cycles. :)