ARTIFICIAL NEURAL NETWORKS IN A NUTSHELL

2022, ARTIFICIAL NEURAL NETWORKS IN A NUTSHELL

In this text, I aim to briefly discuss the basics of how Artificial Neural Networks function and how they are present in our daily lives. My goal is to do so in a way that someone with only basic knowledge of the topic, or none at all, can follow at least a little.

Clóvis Magno F. Santos

First, what exactly is an Artificial Neural Network? Artificial Neural Networks (or ANNs) are a type of computational tool often used in Machine Learning and Deep Learning. Both belong to the field of Artificial Intelligence (AI) and use certain tools to pursue the same objective: giving a machine the ability to learn and even understand the world in which we live. Of course, this topic generates plenty of philosophical discussion about whether a computer can really learn, but that is not the subject here. ANNs are commonly used in computational processes where the objective is to predict the behavior of some situation, such as the overall revenue of a company, or to recognize images or text. They are also used in natural language processing (NLP), which involves tools for recognizing elements in texts, translating, and more. In technical terms, they can be used whenever values must be predicted for nonlinear problems. To accomplish such tasks, artificial neurons are created to mimic the way biological neurons process information. An artificial neuron is a considerable simplification and does not function as well as a biological neuron. Even so, the knowledge acquired so far has enabled favorable results: when combined into networks, artificial neurons are very useful and can deliver excellent results across a large number of tasks.
Considering the above, how is an ANN able to make predictions about data or even recognize photos? Artificial Neural Networks are composed of multiple nodes, the artificial neurons, joined by connections that imitate the connections between biological neurons (also known as synapses). Through these connections the neurons communicate with each other: each one takes input data and performs mathematical operations on it. These operations rely on what we call weights, numerical values associated with the connections between neurons; they usually start out as small values, for example between 0 and 1. The most commonly used operation within a neuron is the weighted sum: the value arriving on each connection is multiplied by the weight of that same connection, the products are added together, and the result (in practice usually passed through an activation function first) is sent as the neuron's output to other neurons in the network. This process is known as forward propagation, since the value computed in one neuron is propagated forward to other neurons, establishing a chain reaction in which the same step is repeated until the last neuron or layer of neurons in the network is reached. Because of this, the weights associated with the different connections play a crucial role in determining the value an artificial neuron outputs after receiving its input. ANNs learn by measuring how far the result obtained is from the desired result and then readjusting the weight values so that the outputs get closer to the desired ones; this adjustment is part of what we call backpropagation.
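To make the weighted sum, forward propagation, and weight adjustment more concrete, here is a minimal sketch in plain Python. It is only an illustration under several assumptions that are not part of the text above: the sigmoid activation, the tiny 2-2-1 network, all weight values, the learning rate, and the function names are hypothetical. The adjust_weights function shows the update for a single neuron only, which is the simplest special case of the idea behind backpropagation rather than the full algorithm across layers.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs, passed through the sigmoid activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


def forward(inputs, layers):
    """Forward propagation: the outputs of each layer feed the next one.

    `layers` is a list of layers; each layer is a list of weight lists,
    one weight list per neuron.
    """
    values = inputs
    for layer in layers:
        values = [neuron_output(values, weights) for weights in layer]
    return values


def adjust_weights(inputs, weights, target, learning_rate=0.5):
    """One gradient-descent step for a single sigmoid neuron (squared error)."""
    output = neuron_output(inputs, weights)
    error = output - target                   # how far we are from the goal
    delta = error * output * (1.0 - output)   # error scaled by sigmoid slope
    return [w - learning_rate * delta * x for w, x in zip(weights, inputs)]


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    [[0.5, 0.1], [0.3, 0.8]],  # hidden layer: one weight list per neuron
    [[0.7, 0.2]],              # output layer
]
prediction = forward([1.0, 0.0], layers)[0]
print("network output:", prediction)

# A single neuron nudged toward a desired output of 1.0.
old_weights = [0.5, 0.1]
new_weights = adjust_weights([1.0, 0.0], old_weights, target=1.0)
print("before:", neuron_output([1.0, 0.0], old_weights))
print("after: ", neuron_output([1.0, 0.0], new_weights))
```

After the adjustment, the neuron's output for the same input moves closer to the target. Repeating this kind of step over many examples is, roughly speaking, what training amounts to.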
The whole process described above is part of the training of an artificial neural network. In this step, the network learns how to perform a specific task, using the examples given as inputs as its reference. When training ends, a network has been generated: all the weights between the neurons are adjusted to particular values and, in theory, the network is now able to carry out the task for which it was designed. However, after training it is necessary to run some tests to verify the accuracy of the network; this last phase is known as the test step. To test the trained ANN, we usually give the model inputs other than those used during training and compare the results it produces with the real expected results. Unlike in the training step, the weight values do not change during the test step. If the network gets most of the new inputs right using the weights obtained from training, the model is working well and predicting with a high level of accuracy; otherwise, it may be doing little better than guessing. The accuracy required of a network depends on the task. For instance, if the network is used for facial recognition as an unlocking method on smartphones, it should get it right almost 100% of the time. For other goals such high accuracy may not be required, and a model with 85% or 90% accuracy can handle the job. As a rough rule of thumb, models with less than 80% accuracy are often considered unreliable for most tasks.
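The test step can be sketched in a few lines of plain Python. This is an illustration only: the weights, the threshold, the tiny test set, and the function names are all hypothetical and simply stand in for a trained network and for data it has not seen before. Note that the weights are only read, never modified, while accuracy is measured.

```python
def predict(inputs, weights, threshold=0.5):
    """Classify as 1 when the weighted sum exceeds the threshold, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0


def accuracy(weights, examples):
    """Fraction of examples whose prediction matches the expected label."""
    correct = sum(
        1 for inputs, label in examples if predict(inputs, weights) == label
    )
    return correct / len(examples)


# Hypothetical weights, standing in for the result of the training step.
trained_weights = [0.6, 0.6]

# Hypothetical test examples (inputs, expected label) not used in training.
test_set = [
    ([1, 1], 1),
    ([1, 0], 1),
    ([0, 1], 1),
    ([0, 0], 0),
    ([0.5, 0.2], 1),  # the model gets this one wrong
]

score = accuracy(trained_weights, test_set)
print(f"test accuracy: {score:.0%}")  # prints "test accuracy: 80%"
```

Whether 80% is good enough depends on the task, as discussed above: it might be acceptable for a rough recommendation, but not for unlocking a phone.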
Many factors affect whether a network achieves good or bad accuracy: the amount of data, the number of layers or neurons per layer, the way the data was treated before being fed to the network, and many other aspects of the network's architecture and/or the data used in the training stage. Every time you use a dog filter on Instagram, or receive an advertisement suspiciously connected to an earlier search you made on the internet, there is a type of neural network behind it. With the amount of data that governments and enterprises need to process, Deep Learning (DL) and Machine Learning (ML) are increasingly in demand and, in most cases, some variation of an Artificial Neural Network is used to make computers capable of learning from and processing large amounts of data. For deeper reading on this topic, I highly recommend Michael Nielsen’s free online book Neural Networks and Deep Learning. I also write and share content on Medium: https://medium.com/@magno97