
What is Deep Learning and how does it work?

Welcome to the world of machine learning and deep neural networks.


Facebook automatically finds and tags friends in your photos. Google DeepMind’s AlphaGo computer program trounced champions at the ancient game of Go last year. Skype translates spoken conversations in real time – and pretty accurately too.

Behind all this is a type of artificial intelligence called deep learning. But what is deep learning and how does it work?

Deep learning is a subset of machine learning – a field that examines computer algorithms that learn and improve on their own.

Machine learning is by no means a recent phenomenon. It has its roots in the mid-20th century. In the 1950s, British mathematician Alan Turing proposed his artificially intelligent “learning machine”. And, in the decades since, various machine learning techniques have risen and fallen out of favour.

One of these is neural networks – the algorithms that underpin deep learning and play a central part in image recognition and robotic vision.

Inspired by the nerve cells (neurons) that make up the human brain, neural networks comprise artificial neurons arranged in layers, with each neuron connected to neurons in the adjacent layers. The more layers, the “deeper” the network.

A single neuron in the brain receives signals – as many as 100,000 – from other neurons. When those other neurons fire, they exert either an excitatory or an inhibitory effect on the neurons they connect to. And if our first neuron’s inputs add up to a certain threshold voltage, it will fire too.
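As a rough illustration of that threshold behaviour, here is a minimal Python sketch. The function name `neuron_fires`, the signed input values and the threshold of 1.0 are all made up for the example; a real neuron is far more complicated than a simple sum.

```python
def neuron_fires(inputs, threshold):
    """Return True if the summed input reaches the threshold.

    Each input is a signed number: positive for an excitatory signal,
    negative for an inhibitory one. The values are illustrative only.
    """
    return sum(inputs) >= threshold


# Three excitatory signals are enough to push the neuron over the threshold...
excitatory_only = [0.5, 0.4, 0.3]
# ...but one added inhibitory signal holds the total below it.
with_inhibition = [0.5, 0.4, 0.3, -0.6]

print(neuron_fires(excitatory_only, threshold=1.0))
print(neuron_fires(with_inhibition, threshold=1.0))
```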

In an artificial neural network, signals also travel between ‘neurons’. But instead of firing an electrical signal, each neuron passes on a number, and the network assigns a weight to each connection between neurons. A neuron whose connections are weighted more heavily than another’s will exert more of an effect on the next layer of neurons. The final layer puts together these weighted inputs to come up with an answer.
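The flow of weighted signals from layer to layer can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any particular product works: the layer sizes, the weights, the biases and the choice of a sigmoid "squashing" function are all invented for illustration.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one layer: each neuron takes a weighted sum of all its inputs."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs


def network_forward(inputs, layers):
    """Pass the inputs through every layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.8, -0.2, 0.1], [-0.5, 0.9, 0.3]], [0.0, 0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                             # output layer
]

score = network_forward([0.9, 0.1, 0.4], layers)[0]
print(f"output score: {score:.3f}")
```

The larger a weight, the more its input moves the weighted sum, which is the sense in which a heavily weighted connection "exerts more of an effect" on the next layer.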


Let’s say we want a neural network to recognise photos that contain at least one cat. But cats don’t all look exactly alike – consider a shaggy old Maine Coon and a white Siamese kitten. Nor do photos necessarily show them in the same light, at the same angle and at the same size.

So we need to compile a training set of images – thousands of examples of cat faces, which we (humans) label “cat”, and pictures of objects that aren’t cats, labelled (you guessed it) “not cat”.

These images are fed into the neural network. And if this were a sports drama film, the training montage would look something like this: an image is converted into data which moves through the network and various neurons assign weights to different elements. A slightly curved diagonal line could be more heavily weighted than a perfect 90-degree angle, for instance.

At the end, the final output layer puts together all the pieces of information – pointed ears, whiskers, black nose – and spits out an answer: cat.

The neural network compares this answer to the real, human-generated label. If it matches, great! If not – if the image was of a corgi, for instance – the neural network makes note of the error and goes back and adjusts its neurons’ weightings. The neural network then takes another image and repeats the process, thousands of times, adjusting its weightings and improving its cat-recognition skills – all this despite never being explicitly told what “makes” a cat.
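The compare-and-adjust loop can be sketched with a single artificial neuron. Note the simplifications: this uses the classic perceptron update rule rather than the backpropagation used to train deep networks, and the two hand-made features per image (`ear_pointiness`, `snout_length`) and all the numbers are hypothetical. A real network would start from raw pixels.

```python
# Hypothetical features per image: [ear_pointiness, snout_length].
# Label 1 means "cat", 0 means "not cat".
training_set = [
    ([0.9, 0.2], 1),
    ([0.8, 0.1], 1),
    ([0.7, 0.3], 1),
    ([0.9, 0.8], 0),  # a corgi: pointed ears, but a long snout
    ([0.2, 0.9], 0),
    ([0.1, 0.5], 0),
]


def predict(weights, bias, features):
    """Answer 1 ("cat") if the weighted sum is positive, else 0 ("not cat")."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0


def train(examples, learning_rate=0.1, max_epochs=1000):
    """Repeatedly compare each answer to its label and nudge the weights."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)  # -1, 0 or +1
            if error != 0:
                mistakes += 1
                # Shift each weight in the direction that reduces the error.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every training image is now answered correctly
            break
    return weights, bias


weights, bias = train(training_set)
for features, label in training_set:
    print(features, "label:", label, "answer:", predict(weights, bias, features))
```

Nowhere does the code state a rule such as "cats have short snouts". The weights that separate the two groups emerge purely from the labelled examples.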

This training technique is called supervised learning.

Unsupervised learning, on the other hand, uses unlabelled data. Neural networks must recognise patterns in data to teach themselves what parts of any photo might be relevant.
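To give a flavour of learning without labels, the sketch below uses k-means clustering, a classical unsupervised algorithm. It is a stand-in chosen for brevity, not a neural network, and the two-dimensional points and starting centroids are invented. The algorithm is told only how many groups to look for, never which point belongs where.

```python
def k_means(points, centroids, iterations=10):
    """Group points around the nearest centroid, then move each centroid
    to the average of its group, and repeat."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c))
                         for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else c  # keep an empty cluster's centroid where it is
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Unlabelled, made-up data that happens to fall into two clumps.
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
          (0.9, 0.8), (0.8, 0.9), (0.85, 0.95)]

# Starting guesses for the two group centres (an arbitrary choice).
centroids, clusters = k_means(points, [points[0], points[3]])
print(clusters[0])
print(clusters[1])
```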

A self-learning machine sounds terrific. But until recently, neural networks were, by and large, ignored by machine learning researchers. Neural networks were plagued by a number of seemingly insurmountable problems. One was that they were prone to ‘local minima’. This meant they ended up with weightings that incorrectly appeared to give the fewest errors……
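The local-minimum trap can be illustrated with a toy error curve. In this sketch a single number `x` stands in for a network's weights and the made-up function `error(x)` has two dips, one deeper than the other. Gradient descent, which always steps downhill, settles in whichever dip is nearest its starting point.

```python
def error(x):
    """A made-up error curve with two dips; the left one is deeper."""
    return x ** 4 - 3 * x ** 2 + x


def slope(x):
    """Derivative of the error curve."""
    return 4 * x ** 3 - 6 * x + 1


def gradient_descent(x, learning_rate=0.01, steps=1000):
    """Repeatedly take a small step downhill from the starting point x."""
    for _ in range(steps):
        x -= learning_rate * slope(x)
    return x


x_left = gradient_descent(-2.0)   # starts on the left-hand side
x_right = gradient_descent(2.0)   # starts on the right-hand side

print(f"from -2.0: x = {x_left:.3f}, error = {error(x_left):.3f}")
print(f"from +2.0: x = {x_right:.3f}, error = {error(x_right):.3f}")
```

Both runs stop where the slope is zero, so each "believes" it has found the lowest error. Only the run that started on the left has actually reached the deeper dip.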
