Friday, March 13, 2009

Bayesian revolution, I see it coming...

Ideas I would like to express in this post:

1. Well, Bayesian statistics is so simple that my first impression, when I heard about it, was that it was so obvious that there was nothing to learn about it.

2. I remember when I first considered taking a serious look at the concept of Bayesian statistics.

3. The Bayesian approach, citing its greatest scholar

4. The Bayesian nightmare: all the science trapped in a Bayesian network.


So, let’s go.

Like anyone with a mathematically oriented mind, the first thing I did when I heard about Bayesian statistics was to look for the fundamental formula, to be able to build everything from it.

(A digression here: one of the biggest shocks of my scientific life occurred when I was taught how all of geometry is built up from the Euclidean postulates. In Argentina, at the Mariano Acosta school, this was taught in 6th or 7th grade. You might imagine my disappointment when I realized later in life that such enlightenments are rare.)

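The formula is absurdly simple:

Prob(A|B) = Prob(B|A)*Prob(A)/Prob(B)

You see that this formula is nothing more than a rephrasing of the definition of conditional probability,

Prob(A|B) = Prob(A and B)/Prob(B)

because the same definition, read the other way around, gives Prob(A and B) = Prob(B|A)*Prob(A).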
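
To convince myself that the formula really is just the definition written twice, here is a minimal sketch in Python. The three probabilities are invented for illustration (they come from no real experiment), and exact fractions are used so that the two routes can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical numbers, invented for illustration only.
p_a_and_b = Fraction(1, 10)  # Prob(A and B)
p_a = Fraction(1, 4)         # Prob(A)
p_b = Fraction(1, 5)         # Prob(B)

# The definition of conditional probability, applied in both directions.
p_a_given_b = p_a_and_b / p_b  # Prob(A given B)
p_b_given_a = p_a_and_b / p_a  # Prob(B given A)

# The Bayes formula: recover Prob(A given B) from Prob(B given A).
bayes = p_b_given_a * p_a / p_b

print(p_a_given_b, bayes)  # both routes print the same fraction
```

Both routes give the same number, which is the whole content of the formula.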

So, any intelligent person would disregard such a simple thing. (I consider myself an intelligent guy, but I am ashamed of that consideration, because it elevates me over my fellows. Back when I was a not-so-intelligent guy I considered intelligence the most valuable thing, which produced in me bad feelings about my fellows, and that is nasty.) There is not much more to say.

That’s what I thought when I first bumped into the concept of Bayesian analysis.

2. In 2006 I went to a local Argentinian meeting on biophysics (the field I specialized in during my postdoc in the US). And I enjoyed slipping into the big guys’ dinner table. Those moments, although they do not reach the level of discovering Euclidean geometry, are marvelous moments of the profession. There was a guy who gave a gorgeous talk, one of those talks where you learn something new in the directions of science you like. And there was also this old guy, a figure in NMR analysis. They were living overseas (overseas, in Argentina, means in the US or the EU). And the old guy told me that Bayesian analysis was a really powerful tool. So, you know, I went and took a better look at this thing.

3. Well, at that time I was deeply into wiki-learning. (Now I have discovered the secret word: gigapedia.) And the Bayesian fellows are kind of good at selling their product. (I remember one previous thing I bought: Object Oriented Programming, with the marvelous Turbo Pascal, "te lo rompo igual" (roughly, "I’ll break it anyway"), one of the best experiences in computer science: both the kindness of a complete environment and the rigor of a Russian-style mathematical school. When, years later, I had to learn C in a Unix environment, it was like going back to the cave ages, although I like C.) And I bought that. It is a marvelous concept. It is not about the math of statistics; it is about the philosophy of statistics (or the map between the math and the real world).
Bayesian statistics is one of the forces bringing down the wall of objectivity (another force is Antonio Damasio’s ideas): the idea that an experiment stands on its own. No experiment stands by itself; it always stands on the shoulders of previous experiments. Science is a net of experiments connected by Bayesian threads.
Bayesian statistics is subjective: its focus is not on the world but on the observer of the world. The observer has some preconceived ideas (the a priori probabilities), and those ideas give the observer some expectations about what’s going on. And the observer adjusts those preconceptions according to what is observed, ending up with the a posteriori probabilities.

The best explanation I know is the following one:

I was trying to tell you that the search for explicative laws in natural facts proceeds in a tortuous fashion. In the face of some inexplicable facts you must try to imagine many general laws, whose connection with your facts escapes you. Then suddenly, in the unexpected connection of a result, a specific situation, and one of those laws, you perceive a line of reasoning that seems more convincing than the others. You try applying it to all similar cases, to use it for making predictions, and you discover that your intuition was right. But until you reach the end you will never know which predicates to introduce into your reasoning and which to omit. And this is what I am doing now. I line up so many disjointed elements and I venture some hypotheses. I have to venture many, and many of them are so absurd that I would be ashamed to tell them to you. You see, in the case of the horse Brunellus, when I saw the clues I guessed many complementary and contradictory hypotheses: it could be a runaway horse, it could be that the abbot had ridden down the slope on that fine horse, it could be that one horse, Brunellus, had left the tracks in the snow and another horse, Favellus, the day before, the traces of mane in the bush, and the branches could have been broken by some men. I didn’t know which hypothesis was right until I saw the cellarer and the servants anxiously searching. Then I understood that the Brunellus hypothesis was the only right one, and I tried to prove it true, addressing the monks as I did. I won, but I might also have lost. The others believed me wise because I won, but they didn’t know the many instances in which I have been foolish because I lost, and they didn’t know that a few seconds before winning I wasn’t sure I wouldn’t lose. Now, for the events of the abbey I have many fine hypotheses, but there is no evident fact that allows me to say which is best. So, rather than appear foolish afterward, I renounce seeming clever now. 
Let me think no more, until tomorrow at least.”
I understood at that moment my master’s method of reasoning, and it seemed to me quite alien to that of the philosopher, who reasons by first principles, so that his intellect almost assumes the ways of the divine intellect. I understood that, when he didn’t have an answer, William proposed many to himself, very different one from another. I remained puzzled.
“But then ...” I venture to remark, “you are still far from the solution. ...”
“I am very close to one,” William said, “but I don’t know which.”
“Therefore you don’t have a single answer to your questions?”
“Adso, if I did I would teach theology in Paris.”
“In Paris do they always have the true answer?”
“Never,” William said, “but they are very sure of their errors.”
“And you,” I said with childish impertinence, “never commit errors?”
“Often,” he answered. “But instead of conceiving only one, I imagine many, so I become the slave of none.”
I had the impression that William was not at all interested in the truth, which is nothing but the adjustment between the thing and the intellect. On the contrary, he amused himself by imagining how many possibilities were possible. At that moment, I confess, I despaired of my master and caught myself thinking, “Good thing the inquisitor has come.” I was on the side of that thirst for truth that inspired Bernard Gui.
And in this culpable mood, more torn than Judas on the night of Holy Thursday, I went with William into the refectory to eat my supper.

Some of you might recognize the fragment; I couldn’t help but cut and paste almost the whole chapter.
So, Umberto Eco laid out the best explanation I know of the Bayesian statistical philosophy.


4. So, it’s coming.

Some areas of science, like structural biology, have already started.

Others await their destiny.

The Bayesian revolution will be overwhelming.

All human knowledge will be set in an infernal network of a priori and a posteriori probabilities.

Impact factors will be replaced by Bayes factors.

Tasks will be assigned on the basis of the pursuit of knowledge, the only reason for our existence.