{"id":420,"date":"2018-07-08T18:24:09","date_gmt":"2018-07-09T01:24:09","guid":{"rendered":"http:\/\/blogs.reed.edu\/compbio\/?p=420"},"modified":"2019-06-14T15:50:36","modified_gmt":"2019-06-14T22:50:36","slug":"summer-week-8-positive-and-negative-sets","status":"publish","type":"post","link":"https:\/\/blogs.reed.edu\/compbio\/2018\/07\/08\/summer-week-8-positive-and-negative-sets\/","title":{"rendered":"Summer Week 8: Positive and Negative Sets"},"content":{"rendered":"<p>I was curious about what our AUC would look like if instead of comparing hidden positives to unlabeled, we compared hidden positives to hidden negatives, wondering if that would get us a more valid representation of the network&#8217;s ability to predict genes associated with schizophrenia. What if the unlabeled genes ranked above the hidden positives were simply actual centers of genetic perturbations in schizophrenia?<\/p>\n<p>Oddly enough, the scores actually <em>decreased<\/em> from 0.65-0.70 to 0.60-0.65. I found that the AUCs for comparing hidden negatives to unlabeled nodes hovered around 0.53, meaning that the hidden negatives were, on average, ranked slightly above the unlabeled nodes rather than below them.<\/p>\n<p>This somewhat makes sense, though. The negative set consists of genes implicated in non-neurological diseases, but that doesn&#8217;t necessarily exclude them from playing a role in a more subtle, polygenic disorder that depends on the additive effects of hundreds of genetic perturbations. 
It also makes sense that genes already implicated in one disease might have a slightly higher probability of being associated with another.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-421\" src=\"https:\/\/blogs.reed.edu\/compbio\/files\/2018\/07\/SZ_4-layer_0.15-sinksource_0.150E_overlap_labels_unlabeled_New_vs_Neg.png\" alt=\"\" width=\"640\" height=\"480\" srcset=\"https:\/\/blogs.reed.edu\/compbio\/files\/2018\/07\/SZ_4-layer_0.15-sinksource_0.150E_overlap_labels_unlabeled_New_vs_Neg.png 640w, https:\/\/blogs.reed.edu\/compbio\/files\/2018\/07\/SZ_4-layer_0.15-sinksource_0.150E_overlap_labels_unlabeled_New_vs_Neg-300x225.png 300w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\n<p>I decided to make a new negative set. I started by gathering every single node in the 0.150 network.\u00a0There is a resource called SZGR 2.0 (https:\/\/bioinfo.uth.edu\/SZGR\/) that collects all sorts of evidence for genes being associated with schizophrenia. Using this resource, I excluded any gene that had any evidence for being associated with schizophrenia. I then took differential gene expression data from autism spectrum disorder, bipolar disorder, and major depressive disorder as well as schizophrenia (http:\/\/science.sciencemag.org\/content\/359\/6376\/693). If a gene was not differentially expressed in any of these diseases (FDR&gt;0.5 for schizophrenia, FDR&gt;0.2 for others) and remained in the SZGR-filtered list of network genes, I added that gene to my list of negatives, which was\u00a01561 genes long. With this list, the AUC for hidden negatives versus unlabeled genes was about 0.44, meaning that the hidden negatives now rank below the unlabeled genes on average; our negatives are finally oriented around schizophrenia rather than non-neurological diseases. 
I also found that the AUC for hidden positives vs hidden negatives spiked up to 0.75, meaning that our program is legitimately good at predicting genes associated with schizophrenia.<\/p>\n<p>I found no significant performance differences from factoring in evidence levels.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I was curious about what our AUC would look like if instead of comparing hidden positives to unlabeled, we compared hidden positives to hidden negatives, wondering if that would get us a more valid representation of the network&#8217;s ability to predict genes associated with schizophrenia. What if the unlabeled genes ranked above the hidden positives &hellip; <a href=\"https:\/\/blogs.reed.edu\/compbio\/2018\/07\/08\/summer-week-8-positive-and-negative-sets\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Summer Week 8: Positive and Negative Sets&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1584,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7,8,11,1],"tags":[],"class_list":["post-420","post","type-post","status-publish","format-standard","hentry","category-creu","category-schizophrenia","category-summer-research-2018","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/posts\/420","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/users\/1584"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/comments?post=420"}],"version-history":[{"count":1,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/posts\/420\/revisions"}],"predecessor-version":[{"id":422,"href":"h
ttps:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/posts\/420\/revisions\/422"}],"wp:attachment":[{"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/media?parent=420"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/categories?post=420"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/tags?post=420"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}