Introduction
In recent years the scientific community has put continuous effort into
building systems that mimic human behavior. The effort to make computers process
data the way the human brain does started in 1943, when McCulloch and Pitts
designed the first artificial neuron model, the starting point for what we now
call the Artificial Neural Network (ANN). An ANN is a computational system
inspired by the structure, processing method and learning ability of a
biological brain. In this blog post I discuss the paper “Introduction to the
Artificial Neural Networks” by Andrej Krenker et al. The paper is well suited to
someone who is new to the world of ANNs. The authors discuss the structure of
the artificial neuron, the similarities between biological and artificial
neurons, the details of ANNs, the types of ANN, learning methodologies for ANNs
and the uses of ANNs.
The human brain is distinct because of its capacity for logical thinking. The
human body contains billions of neurons, which play an important role in
carrying signals from external stimuli to the brain. These signals correspond to
the particular action that the body will perform. The brain and central nervous
system therefore play an important role in human physiology. The inspiration
behind the ANN is this biological neural network. Biological neurons are the
building blocks of the brain. Image 1 shows the structure of a biological
neuron. A biological neuron receives impulses through its dendrites, and the
soma processes these impulse signals. When a threshold is reached, the signal is
sent out via the axon and across a synapse. Neurons are interconnected in a
complex way to form the nervous system, and they connect the body's various
biological pathways to the brain. For example, some neurons are connected to
cells in the sensory organs for smell, hearing and vision, while others conduct
signals to the motor systems and other organs of the body, for example to
produce body movements.
Image 1: Biological Neuron (Image
derived from one © John Wiley and Sons Inc. 2000)
Biological neurons operate in milliseconds, which is about six orders of
magnitude slower than computers, which operate in nanoseconds. So there is a
huge potential advantage in making computers mimic biological neurons, and
thereby the human brain.
The Artificial Neuron and its Function:
An artificial neuron is built on the same logic as a biological neuron.
Information enters the artificial neuron via inputs that are weighted with
specific weights (this step behaves like the biological dendrites). The
artificial neuron then sums the weighted inputs and a bias and processes the sum
with a transfer function (this behaves like the soma of a biological neuron).
Finally, the artificial neuron passes the processed information on via its
outputs (this behaves like the axon of a biological neuron). Below is a
schematic representation of an artificial neuron.
Image 2: Artificial Neuron (Introduction to the Artificial Neural Networks -
Andrej Krenker, Janez Bešter and Andrej Kos (2011)).
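The weighted sum described above can be sketched in a few lines of Python. This
is a minimal illustration, not code from the paper: the function name
`neuron_output`, the example weights and the identity transfer function are all
assumptions chosen for readability.

```python
def neuron_output(inputs, weights, bias, transfer):
    """Sum the weighted inputs and the bias, then apply the transfer function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighting the inputs plays the role of the dendrites.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The transfer function plays the role of the soma;
    # the returned value is what leaves the neuron via its "axon".
    return transfer(weighted_sum)


def identity(z):
    """Hypothetical transfer function that passes the sum through unchanged."""
    return z


# Illustrative values only: two inputs, two weights and a bias.
output = neuron_output([1.0, 0.5], [0.4, -0.2], 0.1, identity)
print(output)
```

Swapping `identity` for another function changes the behavior of the neuron
without touching the summation step, which is why the choice of transfer
function matters.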
The transfer function should be chosen based on the type of problem to be
solved. To be precise, the transfer function is a mathematical function that
defines the properties of the artificial neuron. Common transfer functions
include the step function, the linear function and the non-linear (sigmoid)
function.
Step function: This is a binary function with only two outputs, zero and one.
If the input value meets a threshold, the output is one; otherwise it is zero.
This is how the threshold of a biological neuron works as well: a trigger from
the outside environment induces an action potential or a biological pathway. An
artificial neuron that uses this kind of binary or step function is called a
perceptron. Perceptrons are usually used in the last layer of an ANN.
Linear Function: With this type of transfer function the neuron performs a
simple linear transformation over the sum of the weighted inputs and bias. This
is usually deployed in the input layer of an ANN.
Non-Linear or Sigmoid Function: The sigmoid is the most commonly used
non-linear transfer function. Its derivative is easy to calculate, which is
helpful when calculating the weight updates in an ANN.
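The three transfer functions can be written out as a short Python sketch. The
function names, the default threshold of zero and the default slope are
assumptions for illustration; the paper does not prescribe them.

```python
import math


def step(z, threshold=0.0):
    """Binary output: 1 when the input meets the threshold, otherwise 0."""
    return 1 if z >= threshold else 0


def linear(z, slope=1.0, intercept=0.0):
    """Simple linear transformation of the weighted sum."""
    return slope * z + intercept


def sigmoid(z):
    """Logistic sigmoid, written so that large negative inputs do not overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_derivative(z):
    """The derivative reuses the sigmoid itself: s * (1 - s)."""
    s = sigmoid(z)
    return s * (1.0 - s)


for z in (-2.0, 0.0, 2.0):
    print(z, step(z), linear(z), sigmoid(z), sigmoid_derivative(z))
```

The derivative needs only one extra multiplication once the sigmoid value is
known, which is what makes it convenient for weight updates.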
Artificial Neural Networks
A purposeful, systematic interconnection of two or more artificial neurons
forms an artificial neural network. A typical ANN has three layers of
interconnected neurons (each layer can have several neurons):
1. Input Layer: The neurons in this layer receive the inputs or signals.
2. Hidden Layer: The neurons in this layer perform mathematical calculations
such as summation and multiplication.
3. Output Layer: The neurons in this layer deliver the outputs or results.
A simple schematic ANN is shown
below.
Image 3: A Simple Artificial Neural Network (Introduction to the Artificial
Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos (2011)).
To achieve the desired results from an ANN, we need to connect the neurons in a
systematic manner; random interconnections will not yield useful results. The
way in which individual neurons are interconnected is called the “topology”.
Many predefined topologies exist that help us solve problems more easily,
quickly and efficiently. After determining the type of problem, we decide which
ANN topology to use and then fine-tune it by adjusting the weights.
Although we can make numerous interconnections and build many topologies, all
topologies fall into two basic classes:
1. Feed-Forward Topology
2. Recurrent Topology
1. Feed-Forward Topology (Feed-Forward Neural Network): In this type of
topology the input information/signals travel in only one direction, i.e., from
the input layer to the hidden layer and then to the output layer. This topology
places no restriction on the number of layers, the type of transfer function
used in an individual artificial neuron or the number of connections between
individual artificial neurons. Image 4 shows a simple feed-forward topology.
Image 4: Feed-forward (FNN) topology of an artificial neural network.
(Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter
and Andrej Kos (2011)).
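The one-directional flow of a feed-forward network can be sketched as a loop
over layers, where each layer's output becomes the next layer's input. The
helper names `layer_forward` and `feed_forward`, the layer sizes and all weight
values below are hypothetical and chosen only to keep the example small.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    return z


def layer_forward(inputs, weights, biases, transfer):
    """Compute one layer: each row of weights belongs to one neuron."""
    return [
        transfer(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


def feed_forward(inputs, layers):
    """Pass the signal through the layers in order, never backwards."""
    activations = inputs
    for weights, biases, transfer in layers:
        activations = layer_forward(activations, weights, biases, transfer)
    return activations


# Illustrative network: 2 inputs -> 2 sigmoid hidden neurons -> 1 linear output.
hidden_layer = ([[0.5, -0.5], [0.3, 0.8]], [0.0, -0.1], sigmoid)
output_layer = ([[1.0, -1.0]], [0.2], identity)

prediction = feed_forward([1.0, 2.0], [hidden_layer, output_layer])
print(prediction)
```

Because each layer carries its own transfer function, the sketch reflects the
point that a feed-forward topology does not restrict the number of layers or
the transfer function used in each.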
2. Recurrent Topology (Recurrent Neural Network): In this type of topology the
flow of information is not restricted to one direction, i.e., information can
flow between any of the three layers (input, hidden and output) in any
direction. This creates an internal state of the network, which allows it to
exhibit dynamic temporal behavior. Recurrent artificial neural networks can use
their internal memory to process any sequence of inputs. Image 5 shows a simple
recurrent topology.
Image 5: Recurrent (RNN) topology of an artificial neural network.
(Introduction to the Artificial Neural Networks - Andrej Krenker, Janez Bešter
and Andrej Kos (2011)).
There are some special types of recurrent artificial neural networks, such as
Hopfield, Elman, Jordan and bi-directional artificial neural networks.
(a) Hopfield Artificial Neural Networks
This is a recurrent neural network that stores one or more stable target
vectors. These stable vectors act as memories: once the model has been trained
on specific examples, it recalls a stored vector when it is presented with
similar test data, and it expresses the result in binary units. Each binary unit
takes one of two values for its state, determined by whether or not the unit's
input exceeds its threshold. The binary values can be either 1 and -1, or 1 and
0. The important thing about this network is that the connections must be
symmetric; otherwise it may exhibit periodic or chaotic behavior.
Image 6: Hopfield Artificial Neural Networks. (Introduction to the Artificial
Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos (2011)).
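A small sketch can make the memory behavior concrete. It assumes states of 1
and -1, a Hebbian-style storage rule and a threshold of zero; none of these
specifics come from the post, and the function names are hypothetical. The
weights it builds are symmetric with a zero diagonal.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from the patterns to be remembered."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no unit is connected to itself
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, cue, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = list(cue)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            total = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if total >= 0 else -1  # threshold assumed to be zero
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored = [1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])

noisy_cue = [1, -1, 1, -1, 1, 1]  # last unit flipped
recovered = recall(weights, noisy_cue)
print(recovered == stored)
```

The noisy cue differs from the stored vector in one unit, and the update rule
pulls it back to the stored memory.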
(b) Elman and Jordan Artificial Neural Networks
An Elman neural network consists of three layers: input, hidden and output. The
input layer receives a recurrent connection: a loop runs from the hidden layer
back to the input layer through a unit called the context unit. This type of ANN
is usually designed to learn sequential or time-varying patterns of data. An
Elman neural network has sigmoid artificial neurons in the hidden layer and
linear artificial neurons in the output layer; this combination increases the
accuracy of the model. A Jordan artificial neural network is similar to an Elman
neural network but has its loop from the output layer to the input layer through
a context unit.
Image 7: Elman and Jordan Artificial Neural Networks. (Introduction to the
Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos
(2011)).
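The Elman loop can be sketched as a single time step in which the previous
hidden activations are fed back in as context. The function name `elman_step`,
the layer sizes and every weight value are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def elman_step(x, context, w_in, w_context, b_hidden, w_out, b_out):
    """One time step: sigmoid hidden layer, linear output layer."""
    hidden = []
    for k in range(len(b_hidden)):
        total = b_hidden[k]
        total += sum(w * xi for w, xi in zip(w_in[k], x))
        total += sum(w * c for w, c in zip(w_context[k], context))
        hidden.append(sigmoid(total))
    output = [
        sum(w * h for w, h in zip(w_out[m], hidden)) + b_out[m]
        for m in range(len(b_out))
    ]
    # The hidden activations become the context for the next step.
    return output, hidden


# Illustrative parameters: 1 input, 2 hidden neurons, 1 output.
w_in = [[0.5], [-0.5]]
w_context = [[0.1, 0.2], [0.3, -0.4]]
b_hidden = [0.0, 0.0]
w_out = [[1.0, 1.0]]
b_out = [0.0]

context = [0.0, 0.0]
output_1, context = elman_step([1.0], context, w_in, w_context, b_hidden, w_out, b_out)
output_2, context = elman_step([1.0], context, w_in, w_context, b_hidden, w_out, b_out)
print(output_1, output_2)
```

The same input is presented twice, yet the two outputs differ because the
context unit carries information from the first step into the second. A Jordan
network would feed `output` back instead of `hidden`.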
(c) Long Short Term Memory (LSTM)
LSTM is a widely used recurrent ANN because of its long-term memory. An LSTM
can learn from its experience to process, classify and predict time series with
very long time lags of unknown size between important events. LSTM has three
gates: the “Write Gate”, the “Keep Gate” and the “Read Gate”. When the Write
Gate is on, information gets into the system. Information stays in the system
for as long as the Keep Gate is on. The information can be read or retrieved
when the Read Gate is on. The working principle of LSTM is shown in Image 8. As
per the image, the input layer consists of four neurons. The top neuron in the
input layer receives the input signal and passes it on to the subsequent neuron,
where the weights are computed. The third neuron in the input layer decides how
long the values are held in memory, and the fourth neuron decides when the
values are released to the output layer. The neurons in the first hidden layer
perform a simple multiplication of their input values, and the second hidden
layer computes a simple linear function of its input values. The output of the
second hidden layer is fed back to the input layer and the first hidden layer,
which helps in making decisions. The output layer performs a simple
multiplication of its input values.
Image 8: Long Short Term Memory. (Introduction to the Artificial Neural
Networks - Andrej Krenker, Janez Bešter and Andrej Kos (2011)).
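The roles of the three gates can be illustrated with a deliberately simplified
memory cell. This is a sketch under strong assumptions: the gate values are
passed in by hand as numbers between 0 and 1, whereas in a real LSTM they are
computed by neurons from the input, and the function name `memory_cell_step`
is hypothetical.

```python
def memory_cell_step(memory, value, write_gate, keep_gate, read_gate):
    """Update the stored value and return (new_memory, output).

    keep_gate  scales how much of the old memory survives,
    write_gate scales how much of the new value gets in,
    read_gate  scales how much of the memory is released as output.
    """
    new_memory = keep_gate * memory + write_gate * value
    output = read_gate * new_memory
    return new_memory, output


# (value, write_gate, keep_gate, read_gate) for three consecutive steps.
steps = [
    (5.0, 1.0, 0.0, 0.0),  # write 5.0 into an empty cell
    (9.0, 0.0, 1.0, 0.0),  # ignore the new value, keep what is stored
    (0.0, 0.0, 1.0, 1.0),  # keep the stored value and read it out
]

memory = 0.0
outputs = []
for value, write_gate, keep_gate, read_gate in steps:
    memory, output = memory_cell_step(memory, value, write_gate, keep_gate, read_gate)
    outputs.append(output)

print(outputs)
```

The multiplications mirror the gating described above, and the weighted sum
that forms `new_memory` is a simple linear function of its inputs.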
(d) Bi-directional Artificial Neural Networks (Bi-ANN)
Bi-directional artificial neural networks are capable of predicting both future
and past values, which sets them apart from the other ANNs. A schematic
representation of a Bi-ANN is shown in Image 9. The model consists of two
individual artificial neural networks that are interconnected through two
dynamic artificial neurons capable of remembering their internal state. The two
interconnected neural networks perform direct and inverse transformation
functions, and this interconnection between future and past values increases the
Bi-ANN's prediction capabilities. The model uses a two-phase learning
methodology: in the first phase it is taught future values and in the second
phase past values.
Image 9: Bi-directional Artificial Neural Networks (Bi-ANN). (Introduction to
the Artificial Neural Networks - Andrej Krenker, Janez Bešter and Andrej Kos
(2011)).
(e) Self-Organizing Map (SOM)
A Self-Organizing Map (SOM) is a type of FNN. However, a SOM differs from the
other ANNs in its arrangement: its neurons are usually arranged in a hexagonal
grid. The topological property of this ANN is determined by the neighborhood
function. This type of ANN produces low-dimensional views of high-dimensional
data. Such ANNs can detect regularities and correlations in their input signals
or values and adapt their future responses accordingly. The model uses an
unsupervised learning technique: starting from initial values, the weights are
adjusted during training. After the learning phase the model performs a process
called mapping, in which the single neuron whose weight vector lies closest to
the input vector is chosen; this neuron is termed the winning neuron.
Image 10: Self-Organizing Map (Introduction to the Artificial Neural Networks -
Andrej Krenker, Janez Bešter and Andrej Kos (2011)).
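Choosing the winning neuron and nudging weights toward an input can be sketched
as follows. To stay short, the sketch assumes Euclidean distance, neurons
arranged on a one-dimensional line rather than a hexagonal grid, and a simple
"within radius" neighborhood function; the function names are hypothetical.

```python
import math


def find_winner(weight_vectors, x):
    """Return the index of the neuron whose weight vector is closest to x."""
    distances = [math.dist(w, x) for w in weight_vectors]
    return distances.index(min(distances))


def update_weights(weight_vectors, x, winner, learning_rate=0.5, radius=1):
    """Move the winner and its grid neighbors part of the way toward x."""
    updated = []
    for index, w in enumerate(weight_vectors):
        if abs(index - winner) <= radius:  # assumed neighborhood function
            w = [wi + learning_rate * (xi - wi) for wi, xi in zip(w, x)]
        updated.append(list(w))
    return updated


# Illustrative map with three neurons and two-dimensional inputs.
weight_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
x = [0.9, 1.2]

winner = find_winner(weight_vectors, x)
trained = update_weights(weight_vectors, x, winner, learning_rate=0.5, radius=0)
print(winner, trained)
```

During learning, `update_weights` would be called repeatedly over many inputs;
during mapping only `find_winner` is needed.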
(f) Stochastic Artificial Neural Network (Boltzmann machine)
Stochastic artificial neural networks are built either by giving the network's
neurons random (stochastic) transfer functions or by giving them random weights.
Because of these random fluctuations, such ANNs are useful in solving
optimization problems.
(g) Physical Artificial Neural Network
Physical neural networks are a slowly growing field. The first physical
artificial neural networks were created using memory transistors called
memistors. This technology did not last long because it could not be
commercialized. In recent years, however, many researchers have focused on a
similar approach using nanotechnology or phase-change materials.
Learning Methodologies for ANN
Choosing and fine-tuning a topology is just a precondition for using an ANN.
Before we can use an ANN we have to teach it to solve the given type of problem,
which is accomplished through a learning process. Just as human behavior comes
from continuous learning and social interaction, we can make an ANN learn and
behave as we require.
ANN learning can be classified into three types: Supervised Learning,
Unsupervised Learning and Reinforcement Learning. The learning methodology is
chosen according to the specific type of problem that the ANN has to solve.
1. Supervised learning: This is a machine learning technique in which we know
the input values (X) as well as the results (Y). We train the ANN, which
represents the function f, by adjusting its weights until it produces the
desired results.
Y = f(X)
The purpose is to approximate the mapping function so well that, given new
input data (X’), we can predict the output variables (Y’) for that data. In this
type of learning the data is divided into two parts, training data and test
data. The training data consists of pairs of input and desired output values,
represented as data vectors. The test data set consists of data that was not
shown to the ANN during learning. Once the ANN achieves an acceptable level of
performance on the test data, it can be deployed.
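The train-then-test workflow can be sketched with a single step-function
neuron. The perceptron update rule used here is one common way to adjust the
weights and is an assumption of this sketch, as are the tiny data sets, the
learning rate and all function names.

```python
def predict(weights, bias, x):
    """Step-function neuron: output 1 when the weighted sum exceeds zero."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train(training_data, learning_rate=1.0, max_epochs=50):
    """Adjust weights on (input, desired output) pairs until no errors remain."""
    weights = [0.0] * len(training_data[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, y in training_data:
            error = y - predict(weights, bias, x)
            if error != 0:
                errors += 1
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if errors == 0:
            break
    return weights, bias


def accuracy(weights, bias, data):
    correct = sum(1 for x, y in data if predict(weights, bias, x) == y)
    return correct / len(data)


# Training data: pairs of input vector X and desired output Y.
training_data = [((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0), ((1.0, 1.0), 1)]
# Test data: inputs the neuron never saw while learning.
test_data = [((2.0, 2.0), 1), ((0.5, 0.5), 0), ((1.0, 1.5), 1), ((0.0, 0.5), 0)]

weights, bias = train(training_data)
print("training accuracy:", accuracy(weights, bias, training_data))
print("test accuracy:", accuracy(weights, bias, test_data))
```

Measuring accuracy on `test_data` rather than on `training_data` is what tells
us whether the learned mapping carries over to new inputs.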
2. Unsupervised Learning: In this type of learning we know only the input
values that are fed into the ANN. The model has to discover the underlying
structure of the data by itself in order to produce a suitable output. The ANN
is given only unlabeled examples. One common form of unsupervised learning is
clustering, where we try to group data into different clusters by similarity.
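Clustering by similarity can be illustrated with k-means, a standard clustering
method that is not itself a neural network and is not taken from the paper. The
sketch assumes Euclidean distance, uses the first k points as starting
centroids, and runs on made-up points.

```python
import math


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # keep an empty cluster where it was
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def k_means(points, k, max_iterations=100):
    centroids = [tuple(p) for p in points[:k]]  # assumed initialization
    labels = assign(points, centroids)
    for _ in range(max_iterations):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Unlabeled, made-up points that form two loose groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8), (0.8, 1.2), (4.8, 5.2)]
labels, centroids = k_means(points, k=2)
print(labels, centroids)
```

No desired outputs are supplied anywhere; the grouping comes entirely from how
close the inputs are to one another.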
3. Reinforcement learning: In this type of learning the data is not given to
the ANN in advance but is generated through interactions with the environment.
The ANN automatically determines the ideal behavior within a specific context in
order to maximize its performance. Reinforcement learning is widely used in
robot control, telecommunications, games such as chess and other sequential
decision-making tasks.
Applications of Artificial Neural Networks
Artificial neural networks have a wide variety of applications in various
industries. The most interesting applications are:
Handwriting recognition: The U.S. Postal Service has deployed handwriting
recognition algorithms to sort its mail. Neural networks can learn and interpret
handwritten data and are well suited to this type of activity. The image below
shows how the algorithms interpret handwritten data correctly.
Image 11: Hand Writing
Recognition (Source: Wikipedia)
Information and Communication Technologies (ICT) fraud detection: The
bi-directional ANN can be used in ICT fraud detection. Telecommunication
technologies bring not only benefits but also threats. Criminals misuse the
technology to capture data such as bank details and personal information, and
for money laundering and terrorist activities. This can be countered by
deploying a neural network system that monitors user behavior and compares it
with predefined data. When it detects something suspicious it triggers an alarm,
so that ICT companies can handle the situation well before things get out of
hand.
Retina Scan, Finger Print and Facial Recognition: Retina scans, fingerprints
and facial recognition are major security measures today, and a neural network
can be used to learn the specific patterns of each and output the details when
required.
Gaming Technology and Robotics: ANNs are widely applied in the fields of gaming
and robotics. The ability of an ANN to learn, reproduce and predict future and
past values has made it well suited to gaming and robotic technology.
Financial Risk Management: In the financial field ANNs are used in credit
scoring, market risk estimation and prediction. ANNs have been successfully
deployed for credit scoring and rating: the network learns the features of the
inputs it is given and outputs a prediction or score. ANNs are also useful in
other areas of financial risk management, such as market risk management and
operational risk management.
Medical Imaging: In recent years a great deal of research has been done in
which an ANN learns the patterns in medical images, such as cardiovascular
images, and is then used to predict disease.
Natural Language Processing (NLP): ANNs are widely used in NLP, where they are
made to learn the patterns and tuned to give the desired outputs.
Voice and Image Recognition: ANNs are used to learn and recognize voice input
and interpret it. After interpretation the system produces the desired results.
This is how Apple structured Siri, the iPhone's voice recognition technology.
ANNs are also deployed in photo recognition: the ANN is trained on a set of
images and then made to recognize an image. This is how Facebook photo tagging
works.
Similarly, ANNs have a wide variety of significant applications in most
industries.
References:
1. Andrej Krenker, Janez Bešter and Andrej Kos (2011). Introduction to the
Artificial Neural Networks, Artificial Neural Networks - Methodological Advances
and Biomedical Applications, Prof. Kenji Suzuki (Ed.), ISBN: 978-953-307-243-2,
InTech. Available from:
http://www.intechopen.com/books/artificial-neural-networksmethodological-advances-and-biomedical-applications/introduction-to-the-artificial-neural-networks
2. Simon Haykin. Neural Networks: A Comprehensive Foundation. Prentice Hall,
New Jersey, 2nd edition, 1999.
3. Alan Dorin. An Introduction to Artificial Neural Networks, AI, A-Life and
Virtual Environments, Monash University.
4. reinforcementlearning.ai-depot.com/
5. Carlos Gershenson. Artificial Neural Networks for Beginners.
6. machinelearningmastery.com
7. Wikipedia