Artificial Intelligence: Neural Networks
In a recent course I took, CS50 Artificial Intelligence, I learned the basics of AI and its core algorithms. To most people, this topic seems impossibly complicated, something they could never understand. I, too, felt daunted at the start, but after the first few lectures AI was distilled into several fundamental concepts (which I’ll elaborate on in upcoming posts), and those concepts can, in turn, be combined to build the seemingly impossible programs mentioned above. One fundamental concept is the neural network, the backbone of deep learning. Put simply, a neural network is an algorithm that processes information in a way loosely inspired by the human brain (hence the name), with “neurons” arranged in layers and connected to one another. Neural networks excel at classifying data and learning relationships between data points, and they are effective across a wide range of fields: image classification, speech recognition, financial modeling and risk assessment, and many other areas.
The structure of a neural network can be as simple or complex as its purpose requires. The simplest network contains just two layers, an input layer and an output layer, with as many inputs as the programmer needs. Typically there is a single output, though multiple outputs are possible. More complex networks (the vast majority of them) contain additional layers between the input and output, known as hidden layers, and this is where the real data processing happens. Each connection into a hidden neuron carries a weight, a number that multiplies the incoming value and gives some inputs more influence than others. The neuron sums its weighted inputs, adds a bias term, and passes the result through an activation function, a mathematical function that keeps the output in a manageable range and lets the network model nonlinear relationships. These weights are not arbitrary: they are given initial values and then optimized as the network is trained on training data sets. After the activation function is applied, each hidden neuron’s output is passed on to the neurons in the next hidden layer, and this process continues until a final output is produced. This layer-by-layer flow is called forward propagation; training involves further ideas such as backpropagation and learning rates, but the input, hidden, and output layers are the essential parts of any network.
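The per-neuron computation described above (weighted sum, plus bias, through an activation function, layer by layer) can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the course, and the weights here are made-up numbers rather than trained values:

```python
import numpy as np

def sigmoid(z):
    # A common activation function: squashes any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    # Pass the input through each layer in turn: weighted sum + bias,
    # then the activation function, feeding each result forward.
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# A tiny network: 2 inputs -> 2 hidden neurons -> 1 output.
weights = [np.array([[0.5, -0.6],
                     [0.8,  0.2]]),   # input -> hidden (2x2)
           np.array([[1.0, -1.0]])]   # hidden -> output (1x2)
biases = [np.array([0.1, -0.1]),
          np.array([0.0])]

output = forward(np.array([1.0, 0.0]), weights, biases)
print(output)  # a single value between 0 and 1
```

Each matrix row holds the weights feeding one neuron, so a "layer" is just a matrix multiplication followed by the activation; adding a hidden layer means adding one more weight matrix and bias vector to the lists.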
This image depicts an example of a neural network. As you can see, each connection has a weight, each layer has an activation function, and every neuron in one layer connects to every neuron in the next. There are, of course, deeper and shallower networks (deep neural networks simply have more hidden layers), but this should help you visualize one. Ultimately, a neural network is only as good as its weights and activation functions, and the weights can be improved through training. No matter what, neural networks are a core idea in artificial intelligence, and every programmer or developer interested in AI should be familiar with their basic concepts and algorithms.
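To make "weights improved through training" concrete, here is a minimal sketch of gradient descent on a single neuron learning the logical OR function. This is an illustrative toy, not the course's implementation; the learning rate and iteration count are arbitrary choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Training data for logical OR: four input pairs and their target outputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(size=2)   # weights start as random guesses...
b = 0.0
lr = 1.0                 # learning rate: how big each correction step is

for _ in range(2000):
    pred = sigmoid(X @ w + b)          # forward pass on all examples
    error = pred - y                   # how wrong each prediction is
    w -= lr * (X.T @ error) / len(y)   # nudge weights against the error
    b -= lr * error.mean()             # (gradient descent on log loss)

print(np.round(sigmoid(X @ w + b)))   # -> [0. 1. 1. 1.]
```

Each pass nudges the weights a little in the direction that reduces the error, which is exactly the sense in which training "optimizes" the initial values mentioned earlier; real networks repeat the same idea across many layers via backpropagation.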