Machine Learning Made Easy: Introduction To Artificial Neural Networks

Introduction

In recent years the scientific community has worked continuously to build systems that can mimic human behavior. The effort to make computers process data the way the human brain does started in 1943, when McCulloch and Pitts designed the first artificial neural model, which came to be termed the Artificial Neural Network (ANN). An ANN is a computational system inspired by the structure, processing method and learning ability of a biological brain. In this blog post I discuss the paper “Introduction to the Artificial Neural Networks” by Andrej Krenker et al. This paper is well suited for someone who is new to the world of ANN. The authors discuss the structure of the artificial neuron, the similarities between biological and artificial neurons, the details of ANN, the types of ANN, the learning methodologies for ANN and the uses of ANN.

The human brain is distinct because of its capacity for logical thinking. The human body contains billions of neurons, which play an important role in carrying signals from external stimuli to the brain. These signals correspond to the particular actions that the body will perform. Thus the brain and the central nervous system play an important role in human physiology. The inspiration behind ANN is the biological neural network. Biological neurons are the building blocks of the brain. Image 1 shows the structure of a biological neuron. A biological neuron receives impulses through its dendrites, and the soma processes the impulse signals; when a threshold is reached, the signal is sent out via the axon across a synapse. The neurons are interconnected in a complex way, forming a structure called the nervous system. Neurons connect the body's various biological pathways to the brain: for example, some neurons are connected to cells in the sensory organs for smell, hearing and vision. Similarly, some conduct signals to the motor systems and other organs of the body, for example for body movements and for the central nervous system, which passes body signals.

Image 1: Biological Neuron (Image derived from one © John Wiley and Sons Inc. 2000)

Biological neurons operate in milliseconds, which is about six orders of magnitude slower than computers, which operate in nanoseconds. So there is a huge advantage if we can make computers mimic biological neurons, and thereby the human brain.

The Artificial Neuron and its Function:

An artificial neuron is built on the same logic as a biological neuron. Information comes into the artificial neuron via inputs that are weighted with specific weights (this step behaves like the biological dendrites). The artificial neuron then sums the weighted inputs and a bias and passes the sum through a transfer function (this behaves like the soma of a biological neuron). Finally, the artificial neuron passes the processed information on via its outputs (this behaves like the axon of a biological neuron). Below is the schematic representation of an artificial neuron.

Image 2: Artificial Neuron (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).
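To make the description above concrete, here is a minimal sketch of a single artificial neuron in Python. The function name, the example inputs, weights and bias are all hypothetical values chosen for illustration; they do not come from the paper.

```python
def artificial_neuron(inputs, weights, bias, transfer):
    """Weight each input, add the bias, then apply the transfer function."""
    # "Dendrites": every input is multiplied by its own weight.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # "Soma": the transfer function turns the sum into the neuron's output.
    return transfer(weighted_sum)


def identity(z):
    """A trivial linear transfer function, used here only as a placeholder."""
    return z


# Hypothetical example values.
inputs = [1.0, 0.5, -1.0]
weights = [0.4, 0.6, 0.2]
bias = 0.1

# "Axon": the value that is passed on to the next neurons.
output = artificial_neuron(inputs, weights, bias, identity)
print(output)
```

Any other transfer function can be passed in place of `identity` without changing the rest of the neuron.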

One should choose a transfer function based on the type of problem to be solved. To be precise, the transfer function is a mathematical function that defines the properties of the artificial neuron. Commonly used transfer functions include the step function, the linear function and the non-linear (sigmoid) function.

Step function: It is a binary function which has only two possible outputs, zero and one. If the input value meets a threshold the function gives one output; otherwise it gives the other. This is also how the threshold of a biological neuron works: a trigger from the outer environment induces an action potential or a biological pathway. An artificial neuron that uses this type of binary or step function is termed a perceptron. Perceptrons are usually used in the last layer of an ANN.

Linear Function: With this type of transfer function the neuron performs a simple linear transformation of the sum of the weighted inputs and bias. This is usually deployed in the input layer of an ANN.

Non-Linear or Sigmoid Function: This is a commonly used function whose derivative is simple to compute. This property is helpful in calculating the weight updates in an ANN.
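The three transfer functions can be sketched as follows. This is only an illustration: the default threshold, slope and intercept are assumptions of mine, and the sigmoid shown is the standard logistic function, whose derivative can be written in terms of its own output.

```python
import math


def step(z, threshold=0.0):
    """Binary output: 1 if the input meets the threshold, otherwise 0."""
    return 1 if z >= threshold else 0


def linear(z, slope=1.0, intercept=0.0):
    """Simple linear transformation of the weighted sum."""
    return slope * z + intercept


def sigmoid(z):
    """Logistic sigmoid: squashes any input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    """The derivative reuses the sigmoid value itself: s * (1 - s)."""
    s = sigmoid(z)
    return s * (1.0 - s)


for z in (-2.0, 0.0, 2.0):
    print(z, step(z), linear(z), sigmoid(z), sigmoid_derivative(z))
```

Because the derivative is so cheap to obtain, the sigmoid is convenient for learning methods that need gradients when updating weights.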

Artificial Neural Networks

A result-oriented, systematic interconnection of two or more artificial neurons forms an artificial neural network. A typical ANN has three layers of interconnected neurons (each layer can have several neurons):

1. Input Layer: In this layer neurons will receive inputs or signals.

2. Hidden Layer: In this layer neurons will perform mathematical calculations like summation, multiplications etc.

3. Output Layer: In this layer neurons will deliver the outputs or results.

A simple schematic ANN is shown below.

Image 3: A Simple Artificial Neural Network (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).

To achieve the desired results from an ANN, we need to connect the neurons in a systematic manner; random interconnections will not yield useful results. The way in which individual neurons are interconnected is called the “topology”. There are many pre-defined topologies which can help us solve problems in an easier, faster and more efficient way. After determining the type of the given problem, we need to decide on the topology of the ANN we are going to use and then fine-tune it by adjusting the weights.

Although we can make numerous interconnections and build many topologies, all topologies fall into two basic classes:

1. Feed-Forward Topology

2. Recurrent Topology.

1. Feed-Forward Topology (Feed-Forward Neural Network): In this type of topology, input information/signals travel in only one direction, i.e., from the input layer to the hidden layer and then to the output layer. This type of topology has no restriction on the number of layers, the type of transfer function used in an individual artificial neuron or the number of connections between individual artificial neurons. Image 4 below shows a simple feed-forward topology.

Image 4: Feed-forward (FNN) topology of an artificial neural network (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).
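A feed-forward pass can be sketched in a few lines of Python. The layer sizes (2 inputs, 3 hidden neurons, 1 output) and all weight values below are hypothetical and chosen only to show the one-directional flow of signals; they are not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases, transfer):
    """Compute one layer; weights[j] holds the incoming weights of neuron j."""
    return [
        transfer(sum(x * w for x, w in zip(inputs, neuron_weights)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def feed_forward(inputs, layers):
    """Pass the signal through each layer in order, never backwards."""
    signal = inputs
    for weights, biases, transfer in layers:
        signal = layer_forward(signal, weights, biases, transfer)
    return signal


# Hypothetical network: 2 inputs -> 3 sigmoid hidden neurons -> 1 linear output.
hidden_layer = ([[0.5, -0.5], [0.3, 0.8], [-0.2, 0.1]], [0.0, 0.1, -0.1], sigmoid)
output_layer = ([[1.0, -1.0, 0.5]], [0.0], lambda z: z)

result = feed_forward([1.0, 2.0], [hidden_layer, output_layer])
print(result)
```

Adding more layers only means appending more entries to the list passed to `feed_forward`, which reflects the point that this topology places no limit on the number of layers.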

2. Recurrent Topology (Recurrent Neural Network): In this type of topology the flow of information is not restricted to one direction, i.e., information can flow between the input, hidden and output layers in any direction, including backwards. This creates an internal state of the network which allows it to exhibit dynamic temporal behavior. Recurrent artificial neural networks can use their internal memory to process any sequence of inputs. Image 5 shows a simple recurrent topology.

Image 5: Recurrent (RNN) topology of an artificial neural network (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).

There are some special types of artificial neural networks, many of them recurrent, such as Hopfield, Elman, Jordan, and bi-directional artificial neural networks.

(a) Hopfield Artificial Neural Networks

This is a recurrent neural network which consists of one or more neurons. The network settles into stable vectors, which act as its memory centers. When we train the model with specific examples, these stable vectors store them as memories, and when test data is introduced the network interprets it in terms of binary units. Each binary unit takes one of two values for its state, determined by whether the unit's input exceeds its threshold. The binary values can be either 1 and -1, or 1 and 0. An important property of this network is that the connections must be symmetric; otherwise it may exhibit chaotic behavior.

Image 6: Hopfield Artificial Neural Networks (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).
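The memory behavior of a Hopfield network can be illustrated with a small sketch. The post does not say how the weights are obtained, so I am assuming the commonly used Hebbian (outer-product) rule here; the stored pattern and the noisy input are hypothetical, and the units take the values 1 and -1.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix with the Hebbian rule (an assumption)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, sweeps=5):
    """Update the units one at a time until the state has had time to settle."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            activation = sum(w * s for w, s in zip(weights[i], state))
            # Binary unit: the state depends on whether the input meets the threshold 0.
            state[i] = 1 if activation >= 0 else -1
    return state


stored = [1, -1, 1, -1, 1, -1]        # hypothetical memory
weights = train_hopfield([stored])

noisy = [1, -1, 1, -1, 1, 1]          # same pattern with the last unit flipped
recovered = recall(weights, noisy)
print(recovered)
```

Note that `weights[i][j]` always equals `weights[j][i]`, which is the symmetry requirement mentioned above.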

(b) Elman and Jordan Artificial Neural Networks

The Elman neural network consists of three layers: input, hidden and output. It has a loop from the hidden layer back to the input layer through a unit called the context unit, so the input layer receives a recurrent connection. This type of ANN is usually designed to learn sequential or time-varying patterns of data. The Elman neural network has sigmoid artificial neurons in the hidden layer and linear artificial neurons in the output layer; this combination increases the accuracy of the model. The Jordan artificial neural network is similar to the Elman neural network but has a loop from the output layer to the input layer through a context unit.

Image 7: Elman and Jordan Artificial Neural Networks (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).
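The only difference between the two networks is what gets copied into the context unit, which the following sketch tries to show. All names, layer sizes (1 input, 2 hidden neurons, 1 output) and weight values are hypothetical, and no training is performed; the sketch only runs the networks forward over a short sequence.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weighted_sums(vector, weight_rows):
    """One weighted sum per row of weights (one row per receiving neuron)."""
    return [sum(v * w for v, w in zip(vector, row)) for row in weight_rows]


def run_network(sequence, w_in, w_context, w_out, context_size, kind):
    """Run an Elman ("elman") or Jordan ("jordan") style network over a sequence."""
    context = [0.0] * context_size  # the context unit starts empty
    outputs = []
    for x in sequence:
        from_input = weighted_sums(x, w_in)
        from_context = weighted_sums(context, w_context)
        # Sigmoid neurons in the hidden layer, linear neurons in the output layer.
        hidden = [sigmoid(a + b) for a, b in zip(from_input, from_context)]
        output = weighted_sums(hidden, w_out)
        # Elman copies the hidden layer into the context unit, Jordan the output layer.
        context = hidden if kind == "elman" else output
        outputs.append(output)
    return outputs


# Hypothetical weights.
w_in = [[0.5], [-0.3]]
w_out = [[1.0, 1.0]]
sequence = [[1.0], [1.0]]  # the same input presented twice

elman_outputs = run_network(sequence, w_in, [[0.2, 0.0], [0.0, 0.2]], w_out, 2, "elman")
jordan_outputs = run_network(sequence, w_in, [[0.2], [0.2]], w_out, 1, "jordan")
print(elman_outputs)
print(jordan_outputs)
```

Even though the same input is presented twice, the second output differs from the first in both networks, because the context unit carries information from the previous step.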

(c) Long Short Term Memory (LSTM)

This is one of the most widely used ANNs because of its long-term memory feature. LSTM can learn from experience to process, classify and predict time series with very long time lags of unknown size between important events. LSTM has three gates: the “Write Gate”, the “Keep Gate” and the “Read Gate”. When the Write Gate is on, information gets into the system. Information stays in the system as long as the Keep Gate is on. The information can be read or retrieved when the Read Gate is on. The working principle of LSTM is shown in Image 8. As per the image, the input layer consists of four neurons. The top neuron in the input layer receives the input signal and passes it on to the subsequent neuron, where the weights are computed. The third neuron in the input layer decides how long to hold the values in memory, and the fourth neuron decides when to release the values to the output layer. Neurons in the first hidden layer perform a simple multiplication of input values and the second hidden layer computes a simple linear function of the input values. The output of the second hidden layer is fed back to the input layer and the first hidden layer, which helps in making decisions. The output layer performs a simple multiplication of input values.

Image 8: Long Short Term Memory (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).
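The write, keep and read gates can be illustrated with a deliberately simplified memory cell. This is not a full LSTM: in this sketch the gate values are supplied by hand as 0 (off) or 1 (on), whereas in a real network they would be computed by neurons. The function name and the example values are hypothetical.

```python
def memory_cell_step(stored, new_value, write_gate, keep_gate, read_gate):
    """One time step of a simplified gated memory cell."""
    # Keep Gate: how much of the old content survives.
    # Write Gate: how much of the new value gets in.
    stored = keep_gate * stored + write_gate * new_value
    # Read Gate: how much of the stored content is released as output.
    output = read_gate * stored
    return stored, output


# Each step: (input value, write gate, keep gate, read gate).
steps = [
    (3.0, 1, 0, 0),  # write 3.0 into the cell
    (9.9, 0, 1, 0),  # ignore the input, keep the stored value
    (9.9, 0, 1, 0),  # keep it for another step
    (0.0, 0, 1, 1),  # keep it and read it out
]

stored = 0.0
outputs = []
for value, write_gate, keep_gate, read_gate in steps:
    stored, output = memory_cell_step(stored, value, write_gate, keep_gate, read_gate)
    outputs.append(output)

print(outputs)
```

The value written in the first step is only released in the last step, which is the kind of time lag between important events that the gates are meant to bridge.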

(d) Bi-directional Artificial Neural Networks (Bi-ANN)

Bi-directional artificial neural networks are capable of predicting both future and past values. This makes them unique among the available ANNs. The schematic representation of Bi-ANN is shown in Image 9. The model consists of two individual artificial neural networks interconnected through two dynamic artificial neurons which are capable of remembering their internal state. The two interconnected neural networks perform direct and inverse transformation functions; this interconnection between future and past values increases the Bi-ANN’s prediction capabilities. This model has a two-phase learning methodology: in the first phase it is taught future values and in the second phase past values.

Image 9: Bi-directional Artificial Neural Networks (Bi-ANN) (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).

(e) Self-Organizing Map (SOM)

The Self-Organizing Map (SOM) is a type of FNN. However, SOM differs in its arrangement from the other ANNs: its neurons are usually arranged in a hexagonal pattern. The topological property of this ANN is determined by the neighborhood function. This type of ANN produces low-dimensional views of high-dimensional data. Such ANNs can detect regularities and correlations in their input signals or values and adapt their future responses accordingly. This model uses an unsupervised learning technique: after initialization, it is trained by adjusting the weights. After the learning phase the model uses a process called mapping, in which only the one neuron whose weight vector lies closest to the input vector is chosen; this neuron is termed the winning neuron.

Image 10: Self Organizing Map (Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos, 2011).
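The mapping step, where the winning neuron is selected, can be sketched as follows. I am assuming squared Euclidean distance as the measure of closeness, and the weight vectors and input vector are hypothetical; the learning phase itself is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def winning_neuron(weight_vectors, input_vector):
    """Return the index of the neuron whose weight vector is closest to the input."""
    return min(
        range(len(weight_vectors)),
        key=lambda i: squared_distance(weight_vectors[i], input_vector),
    )


# Hypothetical weight vectors of three already trained neurons.
weight_vectors = [[0.0, 0.0], [1.0, 1.0], [0.9, 0.2]]

winner = winning_neuron(weight_vectors, [0.8, 0.3])
print(winner)
```

Only the index of the winner is returned, which matches the idea that a high-dimensional input is mapped to a single position on the low-dimensional map.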

(f) Stochastic Artificial Neural Network (Boltzmann machine)

Stochastic artificial neural networks are built either by giving the network's neurons random transfer functions or by giving them random weights. Because of their random fluctuations, these ANNs are useful in solving optimization problems.

(g) Physical Artificial Neural Network

Physical neural networks are a field which is growing slowly. The first physical artificial neural networks were created using memory transistors called memistors. This technology did not last long because it could not be commercialized. However, in recent years many researchers have focused on similar approaches using nanotechnology or phase-change materials.

Learning Methodologies for ANN

Fine-tuning a topology is just a precondition for an ANN. Before we can use an ANN we have to teach it to solve the given type of problem; this is accomplished through a learning process. Just as human behavior comes from continuous learning and social interaction, we can make an ANN learn and behave as we require.

ANN learning can be classified into three types: Supervised Learning, Unsupervised Learning and Reinforcement Learning. Each learning methodology is chosen for the specific type of problem that the ANN has to solve.

1. Supervised learning: This is a type of machine learning technique where we know the input values (X) as well as the results (Y). We train the ANN to learn a function (f), by adjusting its weights, so that it produces the desired results:

Y = f(X)

The purpose is to approximate the mapping function so well that when you have new input data (X’) you can predict the output variables (Y’) for that data. In this type of learning the data is divided into two parts, training data and test data. The training data consists of pairs of input and desired output values that are represented as data vectors. The test data set consists of data that has not been introduced to the ANN during learning. Once the trained ANN reaches an acceptable level of performance on the test data, it can be deployed.
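As an illustration of supervised learning with training and test data, here is a minimal sketch that trains a single perceptron (a neuron with a step transfer function). The update rule used is the classic perceptron learning rule, which is my choice for the example and not something prescribed by the paper; the data set, learning rate and number of epochs are hypothetical.

```python
def predict(weights, bias, x):
    """Perceptron output: step function over the weighted sum."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0


def train_perceptron(training_data, learning_rate=1.0, max_epochs=1000):
    """Adjust weights and bias until every training pair is predicted correctly."""
    weights = [0.0] * len(training_data[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in training_data:
            error = target - predict(weights, bias, x)  # -1, 0 or +1
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:  # nothing left to correct
            break
    return weights, bias


# Training data: pairs of input vector X and desired output Y.
training_data = [
    ([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0),
    ([1.0, 1.0], 1), ([2.0, 2.0], 1), ([2.0, 1.0], 1),
]
# Test data: inputs the perceptron has never seen during learning.
test_data = [([0.3, 0.3], 0), ([1.7, 1.3], 1)]

weights, bias = train_perceptron(training_data)
test_predictions = [predict(weights, bias, x) for x, _ in test_data]
print(test_predictions)
```

Comparing `test_predictions` with the desired outputs in `test_data` gives a simple measure of how well the learned function generalizes to new inputs.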

2. Unsupervised Learning: In this type of learning we know only the input values which are fed into the ANN. The model has to discover the underlying structure of the data on its own in order to produce a suitable output. In this type of learning the ANN is given only unlabeled examples. One common form of unsupervised learning is clustering, where we try to group data into different clusters by their similarity.

3. Reinforcement learning: In this type of learning, data is not given to the ANN but is generated by its interactions with the environment. In reinforcement learning the ANN automatically determines the ideal behavior within a specific context in order to maximize its performance. Reinforcement learning is widely used in robot control, telecommunications, games such as chess, and other sequential decision-making tasks.

Applications of Artificial Neural Networks

Artificial Neural Networks have a wide variety of applications in various industries. The most interesting applications are:

Handwriting recognition: The U.S. Postal Service has deployed handwriting recognition algorithms to sort its mail. Neural networks can learn and interpret handwritten data and are well suited for this type of activity. The image below shows how the algorithms interpret handwritten data correctly.

Image 11: Hand Writing Recognition (Source: Wikipedia)

Information and Communication Technologies (ICT) fraud detection: Bi-directional ANNs can be used in ICT fraud detection. Telecommunication technologies bring not only benefits but also threats. Criminals misuse the technology to capture data such as bank details and personal information, and for money laundering and terrorist activities. This can be countered by deploying a neural network system which monitors the behavior of users and compares it with pre-defined data. In case of suspicion it triggers an alarm, so that ICT companies can handle the situation well before things get out of hand.

Retina Scan, Fingerprint and Facial Recognition: Today retina scans, fingerprints and facial recognition are major security measures, and neural networks can be used to learn their specific patterns and output the details when required.

Gaming Technology and Robotics: ANN is widely applied in the fields of gaming and robotics. The ability of ANN to learn, reproduce, and predict future and past values has made it well suited for gaming and robotic technology.

Financial Risk Management: In the financial field ANN is adopted in credit scoring, market risk estimation and prediction. ANN has been successfully deployed for credit scoring and rating based on the various inputs given to it: it learns the input features and gives a score as output. ANN is also useful in other areas of financial risk management such as market risk management and operational risk management.

Medical Imaging: In recent years a great deal of research has been done in which ANN is set to learn the patterns of medical images, such as cardiovascular images, and made to predict disease.

Natural Language Processing (NLP): ANNs are widely used in NLP, where they are made to learn patterns and tuned to give the desired outputs.

Voice and Image Recognition: ANNs are used to learn and recognize voice and interpret it. After interpretation the system produces the desired results. This is the approach behind Siri, the iPhone's voice recognition technology.

ANNs are also deployed to recognize photos: the ANN is trained on a set of images and is then made to recognize new images. This is how Facebook photo tagging works.

Similarly, ANNs have a wide variety of significant applications in most industries.

References:

1. Andrej Krenker, Janez Bešter and Andrej Kos (2011). Introduction to the Artificial Neural Networks, Artificial Neural Networks - Methodological Advances and Biomedical Applications, Prof. Kenji Suzuki (Ed.), ISBN: 978-953-307-243-2, InTech, Available from: http://www.intechopen.com/books/artificial-neural-networksmethodological-advances-and-biomedical-applications/introduction-to-the-artificial-neural-networks

2. Simon Haykin. Neural Networks – a Comprehensive Foundation. Prentice Hall, New Jersey, 2nd edition, 1999

3. Alan Dorin, An Introduction to Artificial Neural Networks, AI, A-Life and Virtual Environments, Monash University

4. reinforcementlearning.ai-depot.com/

5. Artificial Neural Networks for Beginners, Carlos Gershenson

6. machinelearningmastery.com

7. Wikipedia

Tuesday, 25 October 2016
