Friday, June 29, 2012

Google Brain

The so-called Google Brain has been in the news the last couple of days (see for example here, here and here, and see here for coverage from the ABC, including part of an interview I did with them on the subject).

The news coverage has focussed on how the machine learned to recognise cats, because cats are cute. Reading the original paper gives a more nuanced view of the technology. The researchers constructed a colossal neural network with one billion connections, implemented across 1000 16-processor servers. They then presented it with ten million images taken from YouTube videos, and left it to train, unsupervised, for three days before looking to see what it had learned.
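To give a feel for what "unsupervised training" means here, the following is a toy sketch of the core idea: an autoencoder learns to reconstruct its own input, with no labels involved. The real system was a nine-layer sparse autoencoder; everything below (the sizes, the single linear layer, the random data) is a hypothetical stand-in, vastly smaller than the billion-connection network.

```python
import numpy as np

# Toy unsupervised feature learner: a tiny linear autoencoder trained to
# reconstruct its input. No labels are used; the training signal is purely
# the reconstruction error. All sizes are illustrative stand-ins.
rng = np.random.default_rng(0)
images = rng.random((100, 64))          # 100 fake "images" of 64 pixels

W_enc = rng.normal(0, 0.1, (64, 16))    # encoder: pixels -> 16 features
W_dec = rng.normal(0, 0.1, (16, 64))    # decoder: features -> pixels

def loss():
    recon = images @ W_enc @ W_dec
    return float(np.mean((recon - images) ** 2))

loss_before = loss()
lr = 0.01
for _ in range(50):                     # plain gradient descent on the
    hidden = images @ W_enc             # squared reconstruction error
    err = (hidden @ W_dec) - images
    grad_dec = hidden.T @ err / len(images)
    grad_enc = images.T @ (err @ W_dec.T) / len(images)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
loss_after = loss()                     # smaller than loss_before
```

After training, the encoder weights are the "features" the network has discovered for itself; in the real system those features, stacked over many layers, are what ended up resembling cats and faces.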

The researchers knew that the three most common kinds of image on YouTube were cats, human faces, and human bodies. So they presented images drawn from independent data sets (that is, data sets not involved in training the network) that were known to be of cats, faces or bodies, and recorded which parts of the network activated. By examining those activations, they found individual neurons that responded selectively to each category. That is, they showed that the network had formed its own exemplars, or prototypes, of these objects.
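The probing step above can be sketched as follows. This is not the paper's actual analysis pipeline, just a minimal illustration of the idea: for each unit in a network (here a single random layer standing in for the trained one), compare its mean activation on held-out cat images against non-cat images, and report the unit that separates them best.

```python
import numpy as np

# Probe a (stand-in) network for a "cat neuron": the unit whose mean
# activation on held-out cat images most exceeds its mean activation on
# other images. The network and data here are random placeholders.
rng = np.random.default_rng(1)

def activations(images, W):
    return np.maximum(0, images @ W)   # one ReLU layer as a stand-in network

W = rng.normal(0, 0.1, (64, 32))       # 64 "pixels" -> 32 units

cats = rng.random((50, 64)) + 0.5      # fake held-out cat images (brighter,
others = rng.random((50, 64))          # so some unit will prefer them)

gap = activations(cats, W).mean(axis=0) - activations(others, W).mean(axis=0)
best_unit = int(np.argmax(gap))        # the closest thing to a "cat neuron"
```

The key point the sketch preserves is that the probe images come from a separate, labelled data set, while the network itself was never given any labels.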

There are four main technical innovations in this paper:

1) The size of the network, which had one billion connections between its artificial neurons.

2) The technique they used to reduce the interconnections between the elements of the network (local receptive fields, so each neuron sees only a patch of the layer below), which made it easier to execute in parallel across the 16 000 processors.

3) The number of images (ten million) used to train the network.

4) The size of the images used (200 x 200 pixels, larger than the small thumbnails used in most earlier deep learning work).
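A quick back-of-the-envelope calculation shows why innovation 2 matters. A fully connected layer between two 200 x 200 grids of units needs one weight per pair of units; with local receptive fields each unit sees only a small patch. The 18 x 18 patch size below is for illustration:

```python
# Why restricting connections matters: count the weights in one layer,
# fully connected versus locally connected. Sizes are illustrative.
side = 200
units = side * side                    # 40,000 units per layer

fully_connected = units * units        # every unit sees every input unit
local = units * 18 * 18                # each unit sees only an 18x18 patch

print(fully_connected // local)        # prints 123: ~123x fewer weights
```

Fewer interconnections do not just shrink the model; they also mean each processor needs data from only a small neighbourhood of the layer below, which is what makes splitting the network across thousands of processors practical.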

The network did not learn to recognise cats, faces, or bodies. It still doesn't know what a cat is, or what a face is, or what a human body is. It has no concept of what the images represent. But even so, it still has potential: neural networks have finally reached the age of Big Data.