Jan 1, 2011

Intuition and Neural Networks

I had an interesting discussion with a friend during Christmas. It started because one of my presents, which I chose myself, was Richard Dawkins's The God Delusion. At some point the discussion turned to spirituality. He was arguing in favor of its existence and I was trying to understand what exactly he meant by the word. The details of the conversation are not really important, but at some point he argued that spirituality is related to intuition, and that intuition is something that cannot be logically understood. Of course I disagreed. To me that sounds like a very funny remark because, among all cognitive phenomena, intuition is the one I would say has been most illuminated by the study of artificial neural networks and machine learning in general.

To many, the above statement may seem not only surprising, but highly unbelievable and extremely exaggerated. It's not. To show why, let me start by explaining what I understand by intuition. This is probably also the concept that everyone shares. Most people have been in a situation where they have to make a decision and, although they cannot explain why and it may even sound counterintuitive, something inside tells them what the correct answer is. I will not use intuition in the sense of premonition or anything like that. I will concentrate on this sort of "I know this is the correct answer but I can't explain it" thing.

You may think that the fact that you cannot explain the decision makes it something beyond logic and therefore impossible to understand. Actually, it is the complete opposite. The explanation is in fact the simplest one: the feeling of what the correct decision is comes from our brain's experience with similar situations. Too simplistic, you would say. Okay, but why should it not be so? And this is not just a guess: we can actually reproduce it in a computer. That is exactly how machine learning algorithms work.

Let me start by describing the simplest machine learning model, the perceptron. The perceptron is a mathematical model inspired by a real neuron. It has N inputs, usually taken to be N binary numbers, and computes what is called a boolean function of them, giving another binary number as a result. The simplest rule is this:

\[\sigma(\mathbf{x})=\mbox{sign}\left(\sum_i x_i w_i\right),\]
where $\mathbf{x}=(x_i)_{i=1,...,N}$ are the N boolean inputs (encoded as $\pm 1$, so that the sign function also returns a binary number) and the real numbers $w_i$ are what enables this simple model to do some kind of very basic learning. The trick is that, if we change these numbers, we can change (to some extent, which is already a technical issue) the boolean function that is implemented by $\sigma$. The idea is that we have what is called a dataset of pairs $(\sigma_\mu,\mathbf{x}_\mu)$, with the indices $\mu$ labeling the datapoints. We usually call these datapoints by the suggestive name of examples, as they indicate to the perceptron the pattern it must follow. We then use a computer algorithm to modify the $w_i$ so that the perceptron tries to match the correct answer $\sigma_\mu$ for every corresponding $\mathbf{x}_\mu$. The simplest algorithm that works is the so-called Hebb algorithm, which is based on the work of the psychologist Donald Hebb, and amounts to reinforcing the connections (by which I mean the numbers $w_i$) whose inputs agree with the correct answer and weakening those whose inputs disagree with it.
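To make this concrete, here is a minimal sketch of the rule and of Hebb-style learning in Python. It rests on a few assumptions of mine: the binary numbers are encoded as +1 and -1, sign(0) is taken to be +1, and the Hebb algorithm is used in its common textbook form, where each example simply adds $\sigma_\mu \mathbf{x}_\mu$ to the weights. The function names and the toy dataset are made up for illustration.

```python
def sign(value):
    # Assumed convention: sign(0) is +1, so the output is always +1 or -1.
    return 1 if value >= 0 else -1


def perceptron_output(weights, x):
    """The rule sigma(x) = sign(sum_i x_i * w_i)."""
    return sign(sum(w_i * x_i for w_i, x_i in zip(weights, x)))


def hebb_train(examples, n_inputs):
    """Hebb-style learning: every example adds sigma_mu * x_mu to the weights."""
    weights = [0.0] * n_inputs
    for target, x in examples:
        for i in range(n_inputs):
            weights[i] += target * x[i]
    return weights


# Hypothetical toy dataset of pairs (sigma_mu, x_mu) with N = 3 inputs.
examples = [
    (+1, (+1, +1, -1)),
    (+1, (+1, -1, +1)),
    (-1, (-1, -1, +1)),
    (-1, (-1, +1, -1)),
]

weights = hebb_train(examples, n_inputs=3)

# After learning, the perceptron reproduces the answer of every example.
for target, x in examples:
    print(x, "expected", target, "got", perceptron_output(weights, x))
```

Notice that nothing in the code states what the pattern behind the examples is. Everything the perceptron "knows" ends up stored in the list `weights`.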

As I said, in simple situations this algorithm really works. Of course, there are more complex situations where the perceptron does not work, but for those there are more sophisticated machine learning models and algorithms. I will not discuss these details now, as they are not important to our discussion. The important thing is that, after learning, the perceptron can infer the correct answer to a question based simply on the adjusted numbers $w_i$. Now, notice that the perceptron does not really know the pattern it's learning. It is too simple a model to have any kind of awareness. The perceptron also does not perform any kind of logical thinking to answer the questions. It just knows the correct answer as soon as the question is presented. Even after learning, it never really knows the pattern it's following. Basically, it gives an intuitive answer. But what is even more incredible is that, even if we look at the numbers $w_i$, we also cannot explain what pattern the perceptron learned. They are just a bunch of numbers, and if N is large it becomes even more difficult for us to "understand" them.
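The following small experiment is a sketch of this point, not a definitive implementation. It assumes a hypothetical hidden pattern (another perceptron with random weights, called `hidden_rule` here), inputs encoded as +1 and -1, and the Hebb algorithm in its common form of adding $\sigma_\mu \mathbf{x}_\mu$ for each example. The sizes `N`, `P_TRAIN` and `P_TEST` are arbitrary choices of mine.

```python
import random

random.seed(0)   # fixed seed so the run is repeatable

N = 50           # number of inputs
P_TRAIN = 2000   # number of examples shown during learning
P_TEST = 1000    # number of new questions asked afterwards


def sign(value):
    # Assumed convention: sign(0) is +1.
    return 1 if value >= 0 else -1


def output(weights, x):
    return sign(sum(w * x_i for w, x_i in zip(weights, x)))


def random_input():
    return [random.choice((-1, 1)) for _ in range(N)]


# Hypothetical hidden pattern: the answers come from a rule the learner never sees.
hidden_rule = [random.gauss(0.0, 1.0) for _ in range(N)]

# Learning with the Hebb rule: only examples (answer, input) are used.
weights = [0.0] * N
for _ in range(P_TRAIN):
    x = random_input()
    target = output(hidden_rule, x)
    for i in range(N):
        weights[i] += target * x[i]

# New questions: the answer comes straight from the adjusted weights.
correct = 0
for _ in range(P_TEST):
    x = random_input()
    if output(weights, x) == output(hidden_rule, x):
        correct += 1
accuracy = correct / P_TEST

print(f"fraction of new questions answered correctly: {accuracy:.2f}")
print("first five learned weights:", weights[:5])
```

The perceptron should get most of the new questions right, although not all of them, and the printed weights look like nothing more than a list of numbers. Staring at them does not tell you what the rule was.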

This looks too simplistic, but it is exactly what we called intuition above. In the end, making a decision based on intuition happens when your brain tells you that the question you are faced with follows some kind of pattern that you cannot really explain, but that just seems right. You learned it somehow, although you cannot explain what you've learned. As you can see, intuition is in fact the first thing we were able to understand with machine learning, and the myth that it cannot be understood is just that: a myth.
