Summer 2020 Research

“Stay Home, Save Lives” ad campaign by Oregon Gov. Kate Brown.

This summer will not look like any other. Yet, undergraduate research continues! At a time when experimental labs are shut down and figuring out how to reopen, computational biology can provide an opportunity for trainees to learn a new skill while at home.

This summer, there are a bunch of projects in the compbio lab, with a large focus on developing tools for public use.

CS junior Alex Richter will be implementing our recent hypergraph connectivity algorithm for use with the ReactomeFIViz app, which analyzes and visualizes signaling pathways from Reactome.

CS senior Aryeh Stahl will be exploring how to best weight protein-protein interaction networks (such as the HIPPIE interactome among others) and how to integrate high-throughput experimental data in these networks.

Biology junior Frank Zhuang will be picking up Tayla’s project from last summer, working with me and Kara Cerveny to computationally identify retinoic acid response elements (RAREs) in zebrafish.

CS senior Jiarong Li will be returning to the lab, but unlike last summer she will be working alongside Reed’s Computing & Information Services to explore feasible cloud compute platforms for college research across disciplines.

CS sophomore Larry Zeng will be developing a tutorial about network algorithms for molecular systems biology. The tutorial, designed for biologists, will help researchers learn more about networks visualized by GraphSpace, an interactive graph sharing platform.

Finally, three post-baccalaureate researchers will be working on computational biology research.

  • Tobias Rubel Janssen (Fall ’19) develops new algorithms for signaling pathway reconstruction from protein-protein interaction networks.
  • Gabe Preising (Spring ’20) will extend his undergraduate thesis project with Suzy Renn to identify differentially expressed groups of genes related to mouthbrooding in cichlid fish.
  • Maham Zia (Spring ’20) will develop new measures for quantifying microscopy images of cells according to specific phenotypes that are studied in Derek Applewhite’s lab.

This will make for a full and fun summer! Looking forward to establishing a summer research group, even if it’s virtual.

New compbio logo, courtesy of the pandemic.

Summer Research 2019 – here we go!

Reed has finished for the year, but that doesn’t mean that students are done. Last week kicked off a slew of undergraduate researchers doing all kinds of research. In no particular order, here’s a taste of what people will be working on in the compbio lab. Stay tuned for occasional group updates.

Math-CS major Jiarong (Lee) Li ’21 and biology major Tunc Kose ’22 are going to develop algorithms to analyze a cell’s response to external signals (called signaling pathways). They will be working to extend ideas based on the original PathLinker paper and Ibrahim Youssef’s Localized-PathLinker paper.

Recent graduate Amy Rose Lazarte ’19 (alt. bio major with a CS emphasis) will continue to develop a resource and modeling framework for understanding the effect of thermal variation on freshwater phytoplankton. Co-advised by ecologist Sam Fey, she has developed a computational pipeline to analyze longitudinal lake temperature data using simulations of phytoplankton swimming strategies.

Biology major Tayla Isensee ’20 is working on identifying targets of retinoic acid signaling in zebrafish eye development. She has a hand in the wetlab work with developmental biologist Kara Cerveny, and she will be building a zebrafish protein-protein interaction network to find potential regulators to test. First, though, she’s going to hunt for retinoic acid response elements (RAREs) in the zebrafish genome to identify direct targets of retinoic acid.

Another recent graduate, neuroscience major Alex King ’19, will be wrapping up his thesis work to build a network that integrates gene, transcript, and protein relationships in order to identify dysregulated pathways in polygenic diseases based on genome-wide association study (GWAS) data.

Biology major Karl Young ’20 will be reading up on computational modeling in neuroscience, and figuring out the intersection of my world (algorithms for biological networks) and neurobiologist Erik Zornik’s world (neural circuits and how they affect behavior).

Last but not least, CS graduate Ananthan Nambiar ’19 will be getting his thesis ready to present as a poster at ISMB/ECCB in Basel later this summer. He modeled proteins as language with the help of his main advisor, natural language processing (NLP) expert Mark Hopkins in CS.

Bayesian Weighting Schemes

Week 2 Progress Report

Week 2 was spent looking further into Bayesian Weighting Schemes, in particular how they were applied in the paper Top-Down Network Analysis to Drive Bottom-Up Modeling of Physiological Processes by Poirel et al. This was the original LINKER paper that PATHLINKER was built upon. I also spent a good amount of time looking at other interactomes and curating a working list of interactomes and how they weight their interactions. So far I have only done this in detail for the HIPPIE interactome, and more work is still needed. Besides research, I also worked on a document about Bayesian Weighting that aims to clarify the mathematical setup and notation of the paper mentioned above, which suffers from some convoluted notation. Hopefully, this will be useful to people trying to read this paper in the future. Other tasks involved debugging code, presenting learned material, and sorting through papers to find the relevant ones.

Week 1

Aim of Research Project

My research project aims to extend the original PATHLINKER paper. In particular, I hope to investigate different weighting schemes, both from the perspective of utilizing different data sets and data-summary methods and from the perspective of different statistical tools for manipulating the data in meaningful ways. This is motivated by the fact that there appears to be little overlap between interactomes consisting of the same proteins but weighted differently. Therefore, this project will allow us to analyze different schemes and develop a heuristic for weighting interactomes. As a result, we will be able to apply algorithms like PATHLINKER on graphs such that they provide results that have valid biological meaning rather than being affected by superficial factors such as trends in scientific research or biases of the scientific community. Right now, I am focusing on Bayesian Weighting schemes, which leverage Bayes’ theorem to turn experimental evidence into a weight that represents the probability of a given interaction occurring in a given signaling pathway. I will talk more about this in future posts.
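
To make the idea of a Bayesian weight concrete, here is a minimal sketch that assumes a naive Bayes model in which each type of experimental evidence is treated as independent. It is only an illustration, not the formulation used in the LINKER or PATHLINKER papers; the evidence names ("y2h", "coip"), the likelihood values, and the function name are all made up for the example.

```python
from math import prod


def bayesian_edge_weight(evidence, likelihood_pos, likelihood_neg, prior=0.5):
    """Posterior probability that an interaction belongs to a pathway.

    evidence        -- evidence types observed for this interaction
    likelihood_pos  -- P(evidence type | interaction is in the pathway)
    likelihood_neg  -- P(evidence type | interaction is not in the pathway)
    prior           -- P(interaction is in the pathway) before any evidence

    Assumes the evidence types are conditionally independent (naive Bayes).
    """
    p_pos = prior * prod(likelihood_pos[e] for e in evidence)
    p_neg = (1 - prior) * prod(likelihood_neg[e] for e in evidence)
    return p_pos / (p_pos + p_neg)


# Hypothetical likelihoods, chosen only for illustration.
LIKELIHOOD_POS = {"y2h": 0.6, "coip": 0.8}
LIKELIHOOD_NEG = {"y2h": 0.3, "coip": 0.2}

weight_one = bayesian_edge_weight(["y2h"], LIKELIHOOD_POS, LIKELIHOOD_NEG)
weight_two = bayesian_edge_weight(["y2h", "coip"], LIKELIHOOD_POS, LIKELIHOOD_NEG)
weight_none = bayesian_edge_weight([], LIKELIHOOD_POS, LIKELIHOOD_NEG)
```

In this toy setting an interaction with no evidence keeps the prior as its weight, and each additional supporting experiment pushes the weight closer to 1.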

Week 1 Progress

Week 1 was spent mostly acquainting myself with the prerequisite knowledge needed for research in the coming weeks, such as Dijkstra’s algorithm (and its implementation) and Bayesian Weighting Schemes, and reading some relevant papers. To keep track, the papers were:

  1. Pathways on demand: automated reconstruction of human signalling networks
  2. Top-Down Network Analysis to Drive Bottom-Up Modeling of Physiological Processes
  3. Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity.

The first was the original PATHLINKER paper, which our research hopes to extend. The next two provided some information on Bayesian Weighting but not much else, due to some ambiguous mathematical notation. I hope to write a document that annotates the relevant portions of the papers for the sake of clarity and future reference.
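
Since Dijkstra’s algorithm came up this week, here is a minimal sketch of it on a toy weighted network. It assumes edge weights are probabilities between 0 and 1 and converts them to costs with a negative logarithm, so that the cheapest path is the most probable one. The node names and weights are invented for the example, and this is not the actual PATHLINKER code.

```python
import heapq
from math import log


def dijkstra(graph, source):
    """Return the lowest total cost from source to every reachable node.

    graph maps each node to a dict of {neighbor: non-negative edge cost}.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a cheaper route to u was already found
        for v, cost in graph.get(u, {}).items():
            new_dist = d + cost
            if new_dist < dist.get(v, float("inf")):
                dist[v] = new_dist
                heapq.heappush(heap, (new_dist, v))
    return dist


# Toy interactome with hypothetical probability weights on directed edges.
probabilities = {
    "A": {"B": 0.9, "C": 0.5},
    "B": {"C": 0.9},
    "C": {},
}

# Cost = -log(probability): multiplying probabilities becomes adding costs.
costs = {u: {v: -log(p) for v, p in nbrs.items()} for u, nbrs in probabilities.items()}

dist = dijkstra(costs, "A")
```

Here the two-step route A, B, C (probability 0.9 × 0.9 = 0.81) beats the direct A to C edge (probability 0.5), which shows how much the choice of weights can change which paths an algorithm prefers.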

 

Kicking Off the Collaborative REU

As the summer winds down and classes begin at Reed College, we are excited to begin a new project that sits at the intersection of computer science and biology.  With mentoring expertise on both sides of the aisle (Anna is a computer scientist, and Derek is a cell biologist), our interdisciplinary team will apply computer science techniques to predict potential players in disease.

The Biological Question: How is cell migration regulated in patients with schizophrenia?

Schizophrenia is a psychiatric disorder that affects how a person thinks, feels, and behaves, with potentially severe symptoms.  While we know that susceptibility to this disease runs in families, there are many mysteries about which genes, or “instructions” encoded in DNA, drive schizophrenia.  A paper recently demonstrated that cell migration patterns are altered in patients with schizophrenia – the cells become more motile and less “attached” compared to the same type of cells from healthy patients.  Since genes associated with cell migration have also been implicated in other diseases, we want to identify genes that may be involved in both altered cell migration and schizophrenia.

The Computational Approach: Machine learning to predict disease genes

While experiments can test whether a particular gene is associated with cell migration, we can’t simply test all 20,000 possible genes – it would take way too long, be way too expensive, and a vast majority of the experiments would be uninformative.  Instead, we will develop computational approaches to predict a small subset of candidate genes for further experimental testing.  These in silico experiments (which is just a fancy term for computer-simulated experiments) may not be incredibly accurate, but they will sure be fast!

How do we go about developing a computational method to predict candidate cell migration and schizophrenia-associated genes? As we’ll detail in future blog posts, we will search for these genes within large, publicly-available datasets.  We will build a list of the genes that are known to be associated with cell migration or schizophrenia, and then look for other genes that have similar properties to the known genes.  This general technique is called machine learning, where we design instructions for a computer to make predictions.  In our case, we wish to predict whether an unknown gene could be associated with cell migration, schizophrenia, or both.
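
As a rough illustration of “looking for genes with similar properties,” the sketch below ranks unlabeled genes by how closely their feature vectors resemble the average of the known genes. It is a deliberately simple stand-in, not the method our team will use; the gene names, feature values, and function names are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(features, known):
    """Rank genes not in `known` by similarity to the average known gene."""
    columns = zip(*(features[g] for g in known))
    centroid = [sum(col) / len(known) for col in columns]
    scores = {g: cosine(vec, centroid) for g, vec in features.items() if g not in known}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical features per gene (for example, values drawn from public datasets).
features = {
    "geneA": [1.0, 0.0, 1.0],
    "geneB": [0.9, 0.1, 0.8],
    "geneC": [0.0, 1.0, 0.0],
    "geneD": [1.0, 0.2, 0.9],
}
known = {"geneA", "geneB"}  # genes already associated with the process of interest

ranking = rank_candidates(features, known)
```

In this toy data, geneD looks most like the known genes and would be the first candidate to hand off for experimental testing, while geneC would land at the bottom of the list.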

Experimental Validation: Testing the computational predictions

An important aspect of computational biology research is to experimentally test the predictions to see if we discovered new players involved in schizophrenia and cell migration. In Derek’s lab, the team will test the top candidates in two ways.  First, we will see whether each candidate gene affects cell migration in fly cells by “knocking down” the gene product in the cells and observing the change in cell movement.  Next, we will take the top candidates from the first step and observe migration patterns in fly neuroblasts (cells that are destined to become neurons). From these experiments, candidate genes that alter migration patterns in fly neuroblasts may affect neuron cell migration in humans.

There is lots to learn and lots to do!  It will be a fun year – stay tuned.