Richard Walker

Introduction to Neural Networks. Pt3 - Neurons, Layers and Networks

Updated: Apr 29

Neurons are organised into 'layers'. Each layer sends activations to the next layer in the chain.

In this post we’ll get to grips with Neural Networks. This involves stacking the neurons that we’ve studied in the past two posts into layers. The outputs, or activations, of neurons in one layer serve as the inputs into the next layer. Layers of neurons keep cascading information in this fashion until we reach our output layer.


Layers of neurons, and a comparison between AI and conventional IT techniques

A neural network is simply a large function with lots of tunable parameters.

As we'll see, this means that our neural network is just a maths function that maps a series of inputs into a series of outputs. While it is ‘just a function’ it is fair to say that it is a large and complicated function. This function has a very large number of parameters. These parameters can be tweaked to help map inputs to outputs. This tweaking of parameters happens while a network is ‘learning’ or being trained. Training a neural network will be the subject of the next two posts.


From neurons to networks

For the remainder of this post we are going to look at the architecture of a neural network. We'll show how this architecture differs from conventional IT techniques.


The anatomy of an artificial neuron

We’ve seen in the two previous posts that artificial neurons are pretty simple. Each neuron has an output, or activation. That activation is derived from inputs. We use some straightforward arithmetic. We multiply inputs by weights and add all these products up. Then we add a bias term. Finally we pass this total through an activation function.
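The arithmetic above can be sketched in a few lines of Python. This is an illustrative sketch only; the function names and example numbers are my own, not from the earlier posts.

```python
import math

def neuron(inputs, weights, bias):
    """Multiply inputs by weights, sum the products, add the bias,
    then pass the total through a sigmoid activation function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation

# A neuron with three inputs (arbitrary example values)
activation = neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=0.2)
```

Whatever the inputs, the sigmoid squashes the result into the range (0, 1), which is the neuron's activation.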



An example of a non-linear activation function - The 'Sigmoid'

We’ve said that for the activation function to be useful it needs to be non-linear. We’ve also said that we’d like an activation function’s derivatives to be simple to compute. We’ve justified that statement about the importance of simple derivatives in a vague way. We’ve said that it has something to do with how a network is trained. We’ve promised that we will discuss this training in a future blog post, and we will do that. But first we need to discuss how these neurons are connected together to form networks. That is the purpose of this post. We’ve covered the ‘Neural’ bit of ‘Neural Networks’. In this post we are going to address the ‘Network’ bit.
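The sigmoid illustrates the 'simple derivative' point nicely: its derivative can be written in terms of the function itself, σ'(x) = σ(x)(1 − σ(x)). A minimal sketch:

```python
import math

def sigmoid(x):
    """The sigmoid: a non-linear activation squashing x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # The derivative reuses the sigmoid's own output, so it is
    # very cheap to compute during training.
    s = sigmoid(x)
    return s * (1.0 - s)
```

At x = 0 the sigmoid outputs 0.5 and its slope is at its maximum of 0.25.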


How do neural networks differ from conventional, logic-based systems?

Conventional data processing systems use conditional logic to figure out how to process data given user requests. When imagining computer systems for automating our business we think in standard terms. We think about the inputs into our systems. For an Equities trading system we look at the ticker, price, quantity, buy/sell and many other deal parameters. For an interest rate swap back-office system we’d have many inputs. These include counterparty, currency, notional, maturity, fixed and floating rates, payment frequency etc.


Subject matter experts supply business rules for data processing systems. These rules operate on data input by users. The system will automate processing and workflow.

It is then common to consult with subject matter experts ('SMEs') to figure out the conditional logic. These SMEs help analyse process flow and functionality in a system. This would be represented as a set of requirements, from which designs can be drawn up. The rules that govern how this business logic is executed can be understood. Experts would also understand how to deal with exceptions. All this would then be coded up using logic. We’ve seen huge productivity gains in the past half a century based on the ability to automate businesses using this approach.


Rather than apply rules to inputs, AI systems find mappings between inputs and outputs

AI systems work differently to conventional, logic-based systems. Rather than starting out with a set of inputs and business rules, then applying these business rules to generate outputs, AI starts out in a wholly different way. It starts out with a set of known inputs that match known outputs. AI then figures out the patterns that match the inputs to the outputs.


The AI mapping inputs and outputs approach is thus very different from traditional methods of applying computer systems to solve business problems. Frankly the types of problems that AI is suited to ARE different. Generally AI is not a great solution for processing systems. But it is great at things like pattern matching, regression, classification and estimation.

Why compare AI to conventional systems?

The reason for this discussion comparing conventional systems to Neural Networks is to develop an intuition for what Neural Networks are. An understanding of the problems that they are suited to, and an understanding of how they differ from conventional IT approaches underpins this intuition. Conventional systems can be described with familiar notation – flowcharts, state transition diagrams, database schema – all that stuff. These artifacts are not so useful or common when describing AI systems; neural networks in particular.


In contrast to conventional systems where functionality is contained in blocks, neural networks are made up of loosely connected layers of individual neurons. An input layer takes in data from sensors or observations. This might be a market data feed, company fundamentals from an annual report, or pixel intensities from a camera. An output layer gives the results of the network. This output layer might contain a single neuron – for instance an options pricing system might output the fair price of an option. Or the output layer might contain two neurons for a market making system outputting the optimal bid and offer quotes.


In computer vision systems it is common for a network to have dozens or hundreds of outputs. In these classification systems each output neuron indicates the likelihood of a particular object that it has been trained to see.


Between the input and output layers are ‘hidden layers’. Neurons in these layers receive activations from neurons in preceding layers. They send their output activations to neurons in the next layer.
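The cascade of activations from layer to layer can be sketched as a simple forward pass. This is a hedged illustration, not a production implementation; the layer sizes and weights below are arbitrary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(activations, weights, biases):
    """Compute one layer: weights[j] holds the incoming weights
    for neuron j; each neuron outputs a sigmoid activation."""
    return [sigmoid(sum(a * w for a, w in zip(activations, ws)) + b)
            for ws, b in zip(weights, biases)]

def network_forward(inputs, layers):
    """layers is a list of (weights, biases) pairs, one per layer.
    Each layer's activations become the next layer's inputs."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Tiny example: 2 inputs -> hidden layer of 3 neurons -> 1 output
hidden = ([[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]], [0.0, 0.1, -0.2])
output = ([[0.5, 0.5, 0.5]], [0.0])
result = network_forward([1.0, 0.5], [hidden, output])
```

The output layer here has a single neuron, like the option-pricing example above; adding neurons to the final list would give a multi-output network.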


The Neural Network, viewed as a function

So one literal and useful way of looking at Neural Networks is as a huge (frankly vast) multi-parameter, multi-variable function. We’ve seen how each neuron has inputs, weights, biases, activation functions and outputs. When connected in layers and with several neurons per layer the number of parameters can become very large very fast.



A quadratic equation with one input, one output & three parameters

In high school we were used to seeing equations of the form f(x) = ax^2 + bx + c. Here we have a single input: x, a single output: f(x) and three parameters: a, b & c. We can draw the graph of this function that maps the input x to a single output y.
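The quadratic makes a tiny, concrete parallel: its three parameters determine the input-to-output mapping, just as a network's weights and biases do. A minimal sketch with example values of my own choosing:

```python
def quadratic(x, a, b, c):
    """A one-input, one-output function with three tunable parameters."""
    return a * x**2 + b * x + c

# Tweaking a, b or c changes the mapping from x to f(x),
# just as tweaking weights and biases changes a network's mapping.
y = quadratic(2.0, a=1.0, b=-3.0, c=2.0)  # 1*4 - 3*2 + 2 = 0.0
```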


A neural network with ten inputs, four outputs and three hidden layers

Even for the most basic neural networks there are way more than 3 parameters. A lot of parameters can be a lot to get your head around. But it is these tweakable parameters that give neural networks their power to learn. For the relatively small network shown in the figure above we have four outputs and ten inputs. There are three hidden layers which contain eight, six and five neurons. In total this network has 201 weights and biases. This means that there are 201 ‘tweakable’ parameters. These parameters allow the network to learn the mapping between these ten inputs and four outputs.
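The 201 figure can be checked with a short calculation. Each layer contributes one weight per incoming connection plus one bias per neuron (a sketch; the function name is my own):

```python
def parameter_count(layer_sizes):
    """Total weights and biases for fully connected layers of the
    given sizes, from input layer through to output layer."""
    weights = sum(n_in * n_out
                  for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])  # one bias per non-input neuron
    return weights + biases

# 10 inputs, hidden layers of 8, 6 and 5 neurons, 4 outputs:
# 80 + 48 + 30 + 20 = 178 weights, plus 8 + 6 + 5 + 4 = 23 biases
total = parameter_count([10, 8, 6, 5, 4])  # 201
```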




Key elements of the Black Scholes Option Pricing Formula - An example of a parametric model that makes assumptions. Neural networks are non-parametric, and are more adaptable to changing environments.

Let’s look at a practical example from financial markets. Neural networks have been deployed to address some of the shortcomings in closed-form option pricing models. There is no doubt that closed-form, parametric solutions for options pricing such as Black Scholes have had huge success. They are simple to understand, intuitive, and their analytic formulae mean that they are easy to compute.



Feeding raw time series of prices into a non-parametric neural network model removes assumptions of stationary volatility present in parameterised models

But these models contain assumptions. The outputs depend intimately on the form of an underlying asset’s price dynamics. An asset’s volatility is modelled as a single, fixed parameter in these types of equations. Using a neural network permits pricing via a non-parametric pricing approach. An asset’s price history can be fed into the model as a time-series along with strike price, time to maturity etc. We define the output to be the market price of the option. Through training the network ‘learns’ to map these inputs to the observed derivative price.
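One way to picture this is how a single training example might be assembled. The sketch below is entirely hypothetical; the feature layout and numbers are illustrative, not a real pricing dataset.

```python
def make_training_example(price_history, strike, time_to_maturity,
                          market_price):
    """Bundle the raw price time-series with the contract terms as the
    network's inputs; the observed market price is the target output."""
    features = list(price_history) + [strike, time_to_maturity]
    target = market_price  # the network learns to map features -> target
    return features, target

# Illustrative example: three recent prices, a 100 strike,
# three months to maturity, and an observed option price of 3.1
x, y = make_training_example([101.2, 100.8, 102.5], strike=100.0,
                             time_to_maturity=0.25, market_price=3.1)
```

No volatility parameter appears anywhere: whatever volatility structure exists in the price history is left for the network to discover during training.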


Thus the neural network learns the mapping between the observable inputs to the model and the market price of derivatives. It ‘becomes’ the model. But it is not subject to some of the assumptions in the analytic models, namely the assumptions of prices being lognormally distributed, or of fixed interest rates. It is also adaptive to structural market changes, such as liquidity, in a way that classical closed-form solutions are not.


In summary:

  • Neural networks are made up of layers, with neurons in one layer feeding their activations to neurons in the next layer.

  • There are three types of layers: input layers, hidden layers and output layers.

  • Neural networks differ from conventional IT techniques. There are no formal business rules or processes that operate on input data to generate a workflow. Instead both inputs and outputs are presented to the network. The network tweaks its internal parameters so that it can correctly map inputs to outputs.

  • This means that neural networks are not well suited to many data processing tasks. But they have shown great promise in regression, clustering, classification and non-parametric modelling.

In the next couple of blog posts we’ll start to explore what makes these types of AI solution useful: their ability to ‘learn’.
