The week before break, at the group meeting, we discussed the gene lists Miriam and I created. Because of the lack of overlap on my end, my next job is to modify my program to account for genes that may appear under different names. Anna sent me a giant text file for this, and I will get it done by the Friday meeting.
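To make the plan concrete, here is a minimal sketch of the name-matching step. I am assuming a tab-separated layout (official symbol, then a comma-separated list of aliases); the actual layout of Anna’s file may well differ, and the function names `build_alias_map` and `normalize` are just placeholders for illustration.

```python
def build_alias_map(lines):
    """Map every known name (upper-cased) to one canonical gene symbol.

    ASSUMPTION: each line looks like "SYMBOL<TAB>alias1,alias2,...".
    The real alias file may use a different layout.
    """
    alias_to_symbol = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0].strip():
            continue  # skip blank or malformed lines
        symbol = fields[0].strip()
        # The official symbol always maps to itself.
        alias_to_symbol[symbol.upper()] = symbol
        for alias in fields[1].split(","):
            alias = alias.strip()
            if alias:
                # Keep the first mapping if an alias is claimed twice.
                alias_to_symbol.setdefault(alias.upper(), symbol)
    return alias_to_symbol


def normalize(genes, alias_to_symbol):
    """Return the set of canonical symbols; unknown names pass through."""
    return {alias_to_symbol.get(g.strip().upper(), g.strip()) for g in genes}


# Tiny hand-made example standing in for the real alias file.
alias_lines = ["TP53\tP53,LFS1", "ERBB2\tHER2,NEU"]
alias_map = build_alias_map(alias_lines)

my_list = ["p53", "HER2", "BRCA1"]
other_list = ["TP53", "ERBB2", "MYC"]

overlap = normalize(my_list, alias_map) & normalize(other_list, alias_map)
print(sorted(overlap))
```

Comparing the two lists only after both have been normalized means a gene counts as shared even when each list uses a different name for it.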
I did not spend all of break dormant: I learned more about Bayesian statistics, on top of a brief overview from one of Anna’s post-docs. I think one of the difficulties in understanding how Bayes’ theorem works is that it is so ingrained in our everyday reasoning. Given B, what is the probability of A? It is the probability of B given A, times the probability of A, divided by the probability of B: P(A|B) = P(B|A) P(A) / P(B). The first thing to note is that P(B) is the denominator of the whole expression. It accounts for how the context of the problem warps the probabilities: once we know B happened, only the cases in which B occurs still count. In the numerator, the first factor is the probability of B given A, which represents the relationship we already know. It is multiplied by the probability of A. The numerator is therefore the probability that A and B both happen, which can also be described as the total probability that the specified relationship occurs. Dividing by P(B) to account for the warping gives the actual probability of A given B.
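A small worked example helped me check this reasoning. The numbers below are made up purely for illustration, and the function name `posterior` is my own. P(B) is computed by splitting B into the cases "B with A" and "B without A".

```python
def posterior(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    if not 0 < p_b <= 1:
        raise ValueError("P(B) must be in (0, 1]")
    return p_b_given_a * p_a / p_b


# Hypothetical numbers, chosen only to illustrate the formula.
p_a = 0.01              # P(A): how likely A is before seeing B
p_b_given_a = 0.9       # P(B|A): the relationship we already know
p_b_given_not_a = 0.05  # P(B|not A): B happening without A

# P(B) sums over the two mutually exclusive cases, A and not A.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

p_a_given_b = posterior(p_b_given_a, p_a, p_b)
print(f"P(B) = {p_b:.4f}, P(A|B) = {p_a_given_b:.4f}")
```

Even with a strong relationship (P(B|A) = 0.9), the posterior stays modest because A is rare to begin with and B often happens without it. That is the warping the denominator corrects for.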
Bayesian probability is the core of the functional interaction network and the integrated network we will make. I can already kind of see how gene interaction probabilities could be derived from this given interaction data.
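However, the mutual exclusivity condition might be tricky: expanding the denominator over alternative hypotheses only works if those hypotheses are mutually exclusive. I’ll have to look closely at the supplemental data to see how the builders of the functional interaction network accounted for this.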