Last week we had the task of running our algorithm for 100,000 iterations. We ran into a couple of problems:
- It took a very long time. We stopped the run at about 1,300 iterations (which took 12 hours) because the estimated time kept increasing. To improve the running time, we are going to change a few lines in our code: the variables we use to track changes between iterations will be converted from lists of nodes to integer counts.
- The difference in score we required before stopping was far too big. We let the code run until the difference in scores between iterations was no more than 0.001 (or until it hit 100,000 iterations), and to our surprise it stopped after only about 130 and 180 iterations for the cell motility and schizophrenia positives, respectively. Because the scores range from 0 to 1, a change of that size is bigger than we initially thought. After speeding up the code, we are going to run it until we see changes of less than 10^-9.
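To make the stopping rule concrete, here is a minimal sketch of the kind of loop we have in mind. It is not our actual code: `run_until_converged`, `toy_update`, and the dictionary of node scores are hypothetical stand-ins, and the toy update simply moves every score halfway toward 0.5. The loop stops when the largest per-node change drops below a tolerance (or when it hits the iteration cap), and it tracks how many nodes changed as an integer count rather than a list of nodes.

```python
def run_until_converged(scores, update, tol=1e-9, max_iters=100_000):
    """Apply `update` repeatedly until the largest score change is below `tol`.

    `scores` maps node -> score in [0, 1]; `update` is a hypothetical function
    that takes the current scores and returns a new dict of scores.
    Returns (final_scores, iterations_run, nodes_changed_in_last_iteration).
    """
    changed = 0
    for iteration in range(1, max_iters + 1):
        new_scores = update(scores)
        max_change = 0.0
        changed = 0  # integer count instead of a growing list of nodes
        for node, new_value in new_scores.items():
            delta = abs(new_value - scores[node])
            if delta > 0:
                changed += 1
            if delta > max_change:
                max_change = delta
        scores = new_scores
        if max_change < tol:
            return scores, iteration, changed
    return scores, max_iters, changed


def toy_update(scores):
    """Toy stand-in for the real update: move each score halfway toward 0.5."""
    return {node: s + (0.5 - s) / 2 for node, s in scores.items()}


start = {"a": 0.0, "b": 1.0}
_, iters_loose, _ = run_until_converged(start, toy_update, tol=1e-3)
final_scores, iters_tight, _ = run_until_converged(start, toy_update, tol=1e-9)
print(iters_loose, iters_tight)
```

Even on this toy problem, the looser tolerance stops well before the tighter one, which is the same effect we saw with our real scores.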
We are also going to plot run time against progress (one point per iteration) to track our algorithm and see whether a separate problem causes the code to slow down as the number of iterations increases.
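A rough sketch of how the per-iteration timing could be collected is below. The names `time_iterations`, `slowdown_ratio`, and `toy_step` are hypothetical, and the toy step deliberately does more work each iteration so that there is something to measure. The recorded durations can then be handed to whatever plotting tool is convenient.

```python
import time


def time_iterations(step, state, n_iters):
    """Run `step` n_iters times and record how long each iteration took."""
    durations = []
    for _ in range(n_iters):
        start = time.perf_counter()
        state = step(state)
        durations.append(time.perf_counter() - start)
    return state, durations


def slowdown_ratio(durations, window=10):
    """Mean time of the last `window` iterations divided by the first `window`."""
    head = durations[:window]
    tail = durations[-window:]
    head_mean = sum(head) / len(head)
    tail_mean = sum(tail) / len(tail)
    if head_mean == 0:
        return float("inf")
    return tail_mean / head_mean


def toy_step(history):
    """Toy stand-in: the list grows every iteration, so each pass does more work."""
    history = history + list(range(1000))
    sum(history)  # work proportional to the current length
    return history


_, durations = time_iterations(toy_step, [], 200)
print(f"slowdown ratio (last 10 vs first 10): {slowdown_ratio(durations):.1f}")
```

A ratio well above 1 would suggest that something in the loop grows with the iteration count, which is the kind of problem we want to rule out.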
Our continuing goal for this week is to BLAST candidates to see whether they are conserved in Drosophila.