{"id":332,"date":"2018-06-15T16:27:16","date_gmt":"2018-06-15T23:27:16","guid":{"rendered":"http:\/\/blogs.reed.edu\/compbio\/?p=332"},"modified":"2018-06-15T16:27:16","modified_gmt":"2018-06-15T23:27:16","slug":"week-4-obtaining-and-processing-data","status":"publish","type":"post","link":"https:\/\/blogs.reed.edu\/compbio\/2018\/06\/15\/week-4-obtaining-and-processing-data\/","title":{"rendered":"Week 4: Obtaining and Processing Data"},"content":{"rendered":"<p>This week our main goal has been to find a pipeline to obtain TCGA data in a neat form. We discovered UCSC&#8217;s <a href=\"http:\/\/xena.ucsc.edu\/\">Xena Browser,<\/a> which has files from the TCGA and a number of other databases.<\/p>\n<p>Last week, we used the data from FireBrowse to make a graph of the genes that have patients with abnormally high or low levels of expression.<\/p>\n<figure id=\"attachment_333\" aria-describedby=\"caption-attachment-333\" style=\"width: 640px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-333\" src=\"https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/patients_sum_per_genenum.png\" alt=\"\" width=\"640\" height=\"480\" srcset=\"https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/patients_sum_per_genenum.png 640w, https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/patients_sum_per_genenum-300x225.png 300w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><figcaption id=\"caption-attachment-333\" class=\"wp-caption-text\">Number of patients that have abnormally high or low levels of expression<\/figcaption><\/figure>\n<p>This week we changed that graph slightly by showing the difference between the number of patients with high expression and the number of patients with low expression by gene.<\/p>\n<figure id=\"attachment_334\" aria-describedby=\"caption-attachment-334\" style=\"width: 640px\" class=\"wp-caption alignright\"><img 
loading=\"lazy\" decoding=\"async\" class=\"wp-image-334 size-full\" src=\"https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Genes.png\" alt=\"\" width=\"640\" height=\"480\" srcset=\"https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Genes.png 640w, https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Genes-300x225.png 300w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><figcaption id=\"caption-attachment-334\" class=\"wp-caption-text\">Number of patients by gene that have abnormally high or abnormally low levels of gene expression.<\/figcaption><\/figure>\n<p>It is interesting to me that there are generally more patients with severe underexpression than with severe overexpression. I wonder whether this is because these genes play a role in suppressing tumors, so that underexpression is more likely to cause cancer than overexpression.<\/p>\n<p>I also worked on integrating gene expression data from Xena into our graph of the Wnt pathway.<\/p>\n<figure id=\"attachment_341\" aria-describedby=\"caption-attachment-341\" style=\"width: 704px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" class=\"size-large wp-image-341\" src=\"https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Wnt-Pathway-from-Pathlinker-1-704x1024.png\" alt=\"\" width=\"704\" height=\"1024\" srcset=\"https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Wnt-Pathway-from-Pathlinker-1-704x1024.png 704w, https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Wnt-Pathway-from-Pathlinker-1-206x300.png 206w, https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Wnt-Pathway-from-Pathlinker-1-768x1116.png 768w, https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Wnt-Pathway-from-Pathlinker-1-1200x1744.png 1200w, https:\/\/blogs.reed.edu\/compbio\/files\/2018\/06\/Wnt-Pathway-from-Pathlinker-1.png 1230w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, 
(max-width: 1362px) 45vw, 600px\" \/><figcaption id=\"caption-attachment-341\" class=\"wp-caption-text\">Wnt Pathway from PathLinker with Gene Expression Data. Red = high, orange = medium-high, yellow = medium, green = low, blue (not shown) = very low, white (not shown) = no expression. Triangles are transcription factors, squares are receptors, circles are intermediate proteins.<\/figcaption><\/figure>\n<p>Kathy figured out what went wrong the first time we ran PathLinker and re-ran it. I am working on turning the new output into a graph, but the input data is very different because it comes from a different version of NetPath, so I need to change the program to process the new data.<\/p>\n<p>I also noticed while processing the expression data from Xena that gene expression varies widely between patients. I&#8217;m currently working on several things. Instead of just averaging expression for each gene, I&#8217;m comparing expression patient by patient, so I&#8217;m comparing a tumor sample to a normal tissue sample for every patient. I also want to find a way to visualize the variance of expression among patients, because the more variance there is, the less significant the differences in expression between cancerous and normal tissue are. Anna suggested doing this by making the borders thicker on nodes with high variance. I am also going back and checking my math on gene expression to make sure that the results are actually statistically significant and that the analysis is conducted in a way similar to how other researchers have approached this in the past.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This week our main goal has been to find a pipeline to obtain TCGA data in a neat form. We discovered UCSC&#8217;s Xena Browser, which has files from the TCGA and a number of other databases. 
Last week, we used the data from FireBrowse to make a graph of the genes that have patients with &hellip; <a href=\"https:\/\/blogs.reed.edu\/compbio\/2018\/06\/15\/week-4-obtaining-and-processing-data\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Week 4: Obtaining and Processing Data&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1800,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,9,11,1],"tags":[],"class_list":["post-332","post","type-post","status-publish","format-standard","hentry","category-biology","category-cancer","category-summer-research-2018","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/posts\/332","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/users\/1800"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/comments?post=332"}],"version-history":[{"count":8,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/posts\/332\/revisions"}],"predecessor-version":[{"id":345,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/posts\/332\/revisions\/345"}],"wp:attachment":[{"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/media?parent=332"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/categories?post=332"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.reed.edu\/compbio\/wp-json\/wp\/v2\/tags?post=332"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}