Tree Building, part 1

November 21, 2011

For those of you who have actually been reading and are curious, my BLAST finally finished. It only took 9 days, which was nice considering I’ve heard BLASTs of that size can take months. I finally finished running the myriad programs required to extract data from the proteomes on Friday… and now I’m trying to figure out where and how to utilize this data so I can actually get meaningful info from it.
Besides that, I’m currently working on building a phylogenetic tree using the genomes of the same Archaea I just did BLASTs of. This tree building is the first part of my actual project for the lab, and I’m pretty excited about it. We’ve decided to use codons to build the tree, which is very time intensive but incredibly accurate and reliable. But, as happens a lot in this field, the software we’re using to build the tree doesn’t support codon models; more specifically, it converts nucleotide models to codon models, but the way in which it does this is less accurate than just using a codon model. So I’ve been tasked with figuring out how to modify the script the software uses to allow us to use the Goldman-Yang (1994) codon substitution model. So at the moment I find myself staring at XML scripts and Goldman and Yang (1994) to see what needs to be modified, how, and why. Yay…‽
Hopefully more to come soon.


I’ve been running BLAST for ~73-74 hours (Daylight Saving Time kinda messes that up), and so far I have a 1.1 GB table of output… and it’s not done.

On the bright side, I wrote a successful and useful Perl script after having not coded in Perl in 5 months.  And it was my first ever serious experience in Perl.  Oh, and I remember how to use bash now! Google and the O’Reilly Pocket Reference I picked up have helped considerably.  That and the Grad Students who know the small tricks that make life easy.

Things I’ve noticed:
1) Bioinformatics is all about Open Source software and things that people can use without much experience, if any.  But every piece of software I’ve had to use has extremely obfuscated User Guides, Tutorials, and Instructions.
2) There’s no rhyme or reason for anything as far as software or scripting language choice.  I could write everything in Python… except no one in the lab knows Python.  But even our Postdoc, who is a CS/Bioinformatics person exclusively, doesn’t know Perl and uses Google to figure out things he hasn’t seen before.  He’s never actually learned Perl, and neither has anyone else in the lab. Hell, I could use JavaScript (if I knew it), because we don’t use any of the scripting languages to work between software/applications.
3) I prefer this type of work to most things I’ve ever done.  Yeah, I’m being thrown out into the wild without knowing where I’m going, but that’s what I like (I think).  It actually feels like MY project, not someone else’s that I’m doing the grunt work for.  That’s refreshing, and very cool.

As far as the project… The BLAST I’m running right now is not actually going to be a part of that project (for the moment, anyways).  The project I’m working on has switched from Protein Family ID to building a Phylogenetic Tree with our Genomes after comparing them to references.  From there protein families might come back in.  But I’m waiting on that data anyways, so until then…

I’m having a BLAST.  Oh yeah, I went there.

If you’re interested in what I’m doing, you can go to (if it’s working).  It has a pretty general overview of the work, and even though it’s being run by the lab that my lab is collaborating with, it’s kinda cool to see it all visualized and whatnot.


Yeah yeah yeah, I know, they aren’t called “enraged” anymore, they’re called Activated Macrophages.  But I concur with my Microbiology professor that “Enraged Macrophage” sounds better while being more descriptive.  But anyways, that’s not the point.

I was trying to find a catchy title related to Cell Bio, but there isn’t much in that field that lends itself to catchy AND pun-ny titles.  Although “See You In The ER” might have been ok… Alternatively, “I Ran, I Ran-GEF’ed So Far Away” is just bad, and I’ve been trying to figure out ways to use COPII, MAPK, NLS, and a handful of others with no success.  This is way off topic, though, so let me get to the purpose of this.

I need to blog more, which is something I’ve been saying almost ever since I started this damned thing.  I like blogging, and I like blogging about more interesting or relevant things than just the doldrums of my day-to-day.  It’s hard for me to find something to consistently write about, however, so my plans to be more consistent usually fall apart.  I’m hoping to change that by chronicling my Misadventures in Bioinformatics.  Hey… that’s not a bad title.  But titles aside, I think this is actually something I can accomplish.  I have to write a weekly “blog” for my Modeling in Biology class anyways, and that class is also having me put together a model of my own, complete with a Lit Review, a proposal, and, of course, a model that I have to present and defend.  How does a quarter-long class get me to write more consistently?  Well, in addition to the class I’ve (finally) begun working in the lab I’ve been trying to get into since my Freshman year, and I’m working with Metagenomic data of a very large number of Archaea; Halophilic Archaea, to be precise (“Halophile Ever Know” might have been a decent title too.  But now I really need to stop).  And I’ve managed to find connections between the modeling project for my class and my work in the lab.  Both of these give me something to talk about at regular intervals, as I slog through everything.  It may or may not be interesting, but it’s at least an excuse to blog more often, and about things I actually can say something about.

So that brings me to where I am currently: trying to narrow down my modeling project while trying to relearn Unix/bash and “relearn” Perl, along with learning bioinformatics the old-fashioned way — Coffee, Google, bang head against wall, repeat until you have a vague idea of what you’re doing.  I’m not new to bioinformatics, but I’m extremely new to genomic data from multiple genomes, database management, protein family annotation, and constructing phylogenies from phylogenomic data.  And like all good bioinformatics projects, I’ve been sent out into the wild with nothing more than a pat on the back, some vague info, and a dark spot on the horizon that I’m supposed to walk towards.  Oh, and Google.  But Google pre-Google maps, because it can’t tell me what that dark spot on the horizon is.

So what’s there to look forward to? Well, data and info, for one.  And the excitement of finally (and hopefully soon) figuring out how to do everything I need to do efficiently.  I say that because the people in my lab basically created a lot of the methods that we and many other labs are using to do research of this sort, and they pull their hair out constantly.  They still have issues with everything they use, but they’ve at least found ways to make the head-bashing minimal and more efficient.  But what I’m really, really, really, REALLY excited about is the potential for my modeling project and my lab work to overlap and (hopefully) turn into my own actual research project, complete with experiments, data, and maybe even publication.

So yeah, what is my modeling project?  I’m trying to look into and model the interactions between microbes and humans that might cause tumor formation and cancer.  While I might be working on the genomics of haloarchaea, the major research focus of my lab is in microbe-host interactions.  But hey, what’s that?  An article from 2010 describes finding Halophilic Archaea in human intestines?  Some of the species found in the study are some of the species whose genomes we’re studying?  Even if there really isn’t anything there, it’s a start.  It’s something I can keep an eye on.  And it’s something I can blog about.


This is from a blog post I had to write for one of my classes this quarter, BIS 132: Dynamic Modeling in Biology. I hope you enjoy it and find the subject as fascinating as I do!

Instead of writing on one of the provided prompts for this week’s blog, I decided to write about something that is specifically interesting to me and, I hope, will be interesting to you as well. I did this because I felt that the 300 word summary for the homework assignment wasn’t sufficient to fully capture the research and implications. That and the fact that I’m extremely excited about this research now that I’ve read through it and have had a chance to dig into it a little bit. So let’s begin, shall we?

The article I am writing about is called “A Dynamical Systems Model for Combinatorial Cancer Therapy Enhances Oncolytic Adenovirus Efficacy by MEK-Inhibition.” It was published this past February in PLoS Computational Biology, and was authored by four researchers from MIT and UCSF. The article discusses the use of oncolytic adenoviruses for the treatment of metastatic cancers, as metastatic cancers are very dangerous and non-surgical treatments are quite often ineffective. Then again, surgical treatments for these types of cancers are often not very effective either. So why use viruses? Well, viruses, specifically adenoviruses, have been used as vectors to deliver recombinant DNA to targeted cells for a small number of diseases. Adenoviruses are found in vertebrates, are the cause of conjunctivitis (“Pink Eye”) among other viral diseases, and have double-stranded DNA, much like humans and most other animals. Adenoviruses lacking the E1B-55K gene, which can inhibit the tumor-suppressing protein p53, can selectively target cancerous cells: p53 mutations are among the most common mutations in cancerous cells, and when p53 is active it can inhibit the action of the adenovirus. Thus, these viruses typically can only infect cancerous cells where a p53 mutation is the cause of the cancer, and they leave healthy cells alone.

Here’s where it gets interesting, at least to me: these oncolytic adenoviruses, including ONYX-015, which is the particular virus studied in this model, require the protein CAR (Coxsackievirus-Adenovirus Receptor) to be on the surface of the cell for the virus to be able to attach to and infect the cell. The CAR protein, however, is often not expressed in cancerous cells. To induce expression of this protein requires disruption of the Mitogen-Activated Protein Kinase Kinase (MEK/MAP2K) pathway; disrupting this pathway arrests, or freezes, the cell in the G1 phase of the cell cycle, which is the phase preceding the S, or DNA Synthesis, phase. When the cell is frozen in the G1 phase, the virus is unable to replicate, and therefore can’t spread and continue to lyse cancerous cells. So how does one kill a tumor with a virus if the virus can’t attach to the tumor cell, or can’t replicate if it can attach? This is exactly what the authors of the paper wanted to find out and describe using a dynamic model.

To model the most efficient way to combine treatments to increase CAR expression while not freezing the cell in the G1 phase, the authors constructed a dynamic model using a four-state nonlinear Ordinary Differential Equation (ODE), which operates much like a bathtub model. To do this the authors had to use experimental data to quantify CAR expression, tumor cell proliferation, adenovirus infection, cell viability, and viral replication both in the presence and absence of MEK-pathway inhibition. This led them to the four-state ODE, where the four states, which are the state variables, are cell states during the treatment process: 1) uninfected cell density, 2) G1-arrested cell density, 3) untreated and infected cell density, and 4) MEK-inhibited and infected cell density. In addition to this, the total cell population was included as a state variable. The parameters used were the rate of cell proliferation and the rate of infection, where the cells can be either treated or untreated. To simplify the model, delays caused by cell cycle phase transitions were ignored, as were dose and treatment times. Additionally, it was assumed that prolonged treatment would not increase infection and that the cancer cells were uniform.
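
To make the “bathtub” idea a bit more concrete, here is a toy four-compartment sketch in Python. To be clear, this is NOT the authors’ model: the equations, the rate values, and every name in it (Params, derivatives, simulate) are my own assumptions, picked only to show how four cell states can flow into one another over time. Cells move from uninfected to G1-arrested while the inhibitor is present, back again once it is removed, and into one of two infected compartments once the virus is added. I step it forward with plain forward Euler to keep it short.

```python
from dataclasses import dataclass


@dataclass
class Params:
    # All values are made-up illustrative rates (per day), not fitted to any data.
    growth: float = 0.7     # proliferation of uninfected, cycling cells
    capacity: float = 1.0   # carrying capacity (normalised cell density)
    arrest: float = 1.0     # uninfected -> G1-arrested while inhibitor is present
    release: float = 1.0    # G1-arrested -> uninfected once inhibitor is removed
    infect_u: float = 0.2   # infection rate of untreated cells
    infect_g: float = 0.4   # infection rate of pre-treated cells (assumed higher CAR)
    lysis: float = 0.5      # death rate of infected cells


def derivatives(state, p, inhibitor_on, virus_on):
    """Rates of change for (uninfected, arrested, infected untreated, infected treated)."""
    u, g, iu, im = state
    total = u + g + iu + im
    growth = p.growth * u * (1.0 - total / p.capacity)  # logistic growth
    arrest = p.arrest * u if inhibitor_on else 0.0
    release = 0.0 if inhibitor_on else p.release * g
    inf_u = p.infect_u * u if virus_on else 0.0
    inf_g = p.infect_g * g if virus_on else 0.0
    return (
        growth - arrest + release - inf_u,  # uninfected
        arrest - release - inf_g,           # G1-arrested
        inf_u - p.lysis * iu,               # infected, untreated
        inf_g - p.lysis * im,               # infected, MEK-inhibited
    )


def simulate(days, inhibitor_until, virus_from, p=None, dt=0.01,
             start=(0.1, 0.0, 0.0, 0.0)):
    """Forward-Euler integration; returns the final four-state tuple."""
    p = p or Params()
    state = start
    for k in range(int(round(days / dt))):
        t = k * dt
        d = derivatives(state, p, t < inhibitor_until, t >= virus_from)
        state = tuple(x + dt * dx for x, dx in zip(state, d))
    return state


if __name__ == "__main__":
    control = simulate(10, inhibitor_until=0, virus_from=float("inf"))
    treated = simulate(10, inhibitor_until=2, virus_from=2)
    print("control total:", sum(control))
    print("treated total:", sum(treated))
```

In this sketch the total cell population is just the sum of the four compartments, and a schedule like “inhibitor for two days, then virus” is expressed through the inhibitor_until and virus_from arguments. Changing those two numbers is the toy equivalent of comparing pre-treatment, simultaneous, and post-treatment protocols.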

What was predicted by this model is that two-day pretreatment of cancerous cells with an MEK-inhibitor will almost double the expression of the CAR protein. If this inhibition is stopped when the adenovirus is introduced, the cell is allowed to go to the S phase of the cell cycle and the virus is replicated, resulting in cancer cell lysis. It was also predicted that increased cell density at the time of infection will reduce the efficacy of infection. Both of these predictions were shown to be correct during later in vitro experiments. What was even more significant is that the model and these experiments showed that infection during G1 phase arrest is coincident with the greatest amount of virus production and the greatest amount of cancer cell lysis. This is significant for two reasons: one, the more obvious, is that this is when the virus infection should be started to maximize treatment efficacy, and two, which isn’t as obvious initially, is that currently we don’t know much about adenovirus replication in humans, but this points to the G1-S phase transition as being critical to virus replication, which has implications throughout virology, medicine, and cellular biology. The authors also found that CAR expression at the time of infection is not the only determining factor for this therapy. The other factors remain to be seen, but nonetheless this model has led to discoveries about adenoviruses and this treatment that were not expected beforehand.

The authors admit, and I agree, that more could be added to the model to make it more accurate. This added accuracy can only come from further in vitro and in vivo experiments in order to elucidate other factors that may influence this type of treatment. That being said, the model data, when compared to data gathered during experiments, is very similar, with the model data for pre-treatment, simultaneous, and post-treatment simulations being within 19% of experimental data or less (the best was within 8% for simultaneous treatment simulations). This shows that the model has an incredible amount of accurate predictive power as is, and this predictive power can only increase as more data and knowledge are accumulated. This predictive power is seen when the authors acknowledge that the simultaneous and post-treatment protocols involved experimental procedures that were not taken into account during model development; in other words, the model came very close to predicting experimental data before other factors were known. I am extremely excited about the potential and promise of further research in this area, and I hope that I was able to accurately convey that through this blog post. Hopefully when more is published I will be able to comment on that in this blog as well.

Bagheri, N., Shiina, M., Lauffenburger, D. A., & Korn, W. M. (2011). A Dynamical Systems Model for Combinatorial Cancer Therapy Enhances Oncolytic Adenovirus Efficacy by MEK-Inhibition. (C. V. Rao, Ed.) PLoS Computational Biology, 7(2), e1001085. Retrieved from


June 29, 2011

Outwardly, I’m calm.  Mostly.  Collected, for the most part.

Inside, I’m a wreck.  And there aren’t many simple solutions to that.


May 24, 2011

Fact: I have ADHD.

Fact: I didn’t find out until late Fall/early Winter quarter.

Fact: I didn’t get treated at all until about a month ago, and nothing effective until a week ago.  Not because of me, but because of the processes and red tape involved in an adult diagnosis.

It didn’t just “pop up”, either.  I’ve dealt with it all of my life, but I always chalked up my symptoms as personality traits.  Or I assumed that what I was going through was normal.  I wasn’t really having problems in school, and any problems I had, either in school or socially, I didn’t link to anything.  Losing attention in a one-on-one conversation?  Must happen to everyone, no big deal.  Unable to focus in lecture?  It’s just me, or the class is boring.  Serial and problematic procrastination?  Everyone procrastinates.  Can’t plan anything, forget important things, no impulse control, etc? Wait, those are problems?

Well, yeah, they are.  And they killed me this year.  OChem raped me the first quarter, even though I studied as much as anyone else, if not more.  Other intense classes (like physics, calculus, etc) gave me similar problems.  I still had no idea why.  I stressed, I worried, I assumed that I just wasn’t cut out for school, biology, medicine, science… you get the picture.

It’s funny how you don’t put two and two together until you learn that two and two might be related, or that they may have a singular cause.

And there’s not much else to tell, honestly.  It’s been a struggle to even be diagnosed, much less to be taken seriously by most doctors.  And it’s very difficult to communicate the feeling I get when I take my medicine to other people.  It’s both satisfying and literally amazing to actually have the motivation to work for once.  To want to work, and to be able to do work when I want to do work.  And I came here to say that; life isn’t suddenly perfect, but it’s better.  And it’s easier to get through my day-to-day and my work.

Best of all, I have some confidence back.  And I needed that, desperately.

Red Panda

January 27, 2011

It’s that time again.  The doldrums, or whatever you want to call them.  You know, that period of time in which I can’t stop analyzing and over-analyzing and thinking and whatnot.  In a nutshell, I’m not as happy as I could be right now.  But that’s more or less my own doing, and I’ll have to deal with that and see if I learned from last quarter.  Nothing like a good academic crotch punch to really put you in your place.  Anywho.

I was in Bio lecture yesterday, and this guy gets up to talk to the class before the professor starts.  Older gentleman, well-dressed, very professor-y (white hair, thinning, kind face, tweed jacket, elbow pads, etc).  He’s there to tell us about this program over the summer for academic credit.  Students can go to South Africa (he leads with this), Big Sur, Alaska, Australia… and China.  Why China?  Well, there’s this animal in China that’s endangered.  Very endangered.  Maybe you’ve heard of it: The Giant Panda?  Yep.  10-12 weeks in China, working with Giant Pandas, helping with conservation efforts, learning about their ecology and behavior, all of that.  Cool, right?  Well, I said to myself “It’s too bad they aren’t working with Red Pandas; I’d be all over that.” After class I go up to get the information, because it is something to do for the summer, and who knows, maybe there’s something in there that I could do.  I’m not too keen on going to China for Panda research, or to South Africa to work with Elephants, but Big Sur is cool.  So I grab the brochure, or whatever it can be called.  It’s more like an informational magazine on the different programs.  Anyways, I grab it and start walking out of the lecture hall.  As soon as I get outside I open it so that I can peruse it on my sojourn to Starbucks to work.  I open to page 2, and what do I see smiling back at me?  A Red Panda.  A very very CUTE Red Panda, mind you.  Where? Where is this Red Panda? I can work with Red Pandas?  I look to the top of the Page.  China.  Panda Conservation.  10-12 weeks.  Giant Panda AND Red Panda.  WHY DIDN’T THEY MENTION THIS IN CLASS?!?! That’s the selling point! All of this flies through my head.  Wait, maybe I’m mistaken.  They’re just including it because it’s endangered and in the area.  Nope, Red Pandas too.  Red Pandas too.  Since I started here I’ve often thought about and considered taking a quarter in Washington DC to hopefully try to work at the National Zoo.
I would love to work with Red Pandas.  Outside of my field, but that’s beside the point.  They’re Red Pandas!

So what’s the issue?  It’s China.  It’s Summer.  It’s 10-12 weeks.  I have to pay a little over 3 grand to do it.  It’s worth 18 units, sure, but still.  I would love to do it.  It’s a great opportunity.  But I’m really not sure about it right now.  The worst part is that I have to apply soon or not get it at all.  I’m already staring down the possibility that I might not be doing anything this summer yet again, much to my chagrin, and not for lack of trying.  I have bio professors who won’t even email me back about a meeting for a letter of rec, and one letter is one short of what I need for just one summer program.  Jobs aren’t looking good for me at the moment, what with not having enough lab experience or a 3.5.  I need something where I’ll get paid even just a little bit so that I can stay in my apartment over the summer and cover bills and, you know, eat.  I’ve considered applying to be an ER tech at UCDMC, or even looking into an ambulance shift or two, if it comes down to it.  But the point is, I have no idea what I’m doing right now.  I don’t know what I’m doing with myself, I don’t know where I’m headed, I just don’t know.  And most people here would say “But you’re in college, no one knows what they want to do!” No.  I’m almost 23.  I’m a sophomore.  I want to be a doctor.  But I may have screwed myself out of that, at least for the time being.  I mean, there’s still my backup plan, which may just become my plan, assuming I can manage to bring my GPA up from out of the depths of hell (bear in mind, this is pre-med GPA hell, so just a hair below a 3.0).  But really, where am I going?  
More so, if I can’t get into these research programs over the summer because of my GPA, or even, apparently, because of where I go to school (Thank you, Stanford HCOP, for giving preference in a 25 person class to people from Bay Area schools), how am I supposed to get research, which roughly translates into shiny bullet points on Grad/Med School apps that show that even though I may have had a bad quarter, I’m a good student and suited for their school?  And because I love to think ahead, what if I just get burnt out trying?  I feel like I’ve struggled just to get where I am now.  I don’t think I’ve had anything handed to me.  What if halfway through Grad School I just say “Fuck this, I’m tired and I want to be done already”?  I’m not sure if I want to be a professor, much less a researcher, for the rest of my life.  I want to be a doctor.  But can I put up with that much more school, stress, and bullshit?  I think it comes down to whether I can deal with knowing that I’ll never be what I want to be and that I had to settle, versus whether I can put up with stress, school, and bullshit.

At this point I really have no idea. And as this goes on, I’ve slowly come to the realization that I’m in a persistent outgroup.  I don’t really hang out with anyone besides people I’m in a club with, or friends of Lucy or friends of friends.  What the hell happened to me being social and always having close friends?  And why do I insist on being “best friends” with someone who can’t even reciprocate?  I’m not really sure if it was ever smart for me to go down that road.  And so instead of dumping this all on a close friend and being able to talk it out over coffee, I’m left with writing it out as my only means of true outlet.  Because anyone else I tell really doesn’t know what to say.  It’s either “You’ll be ok”, or “Oh, Pshaw”, or silence.  And I think I prefer the silence of my blog on my computer screen to the silence of someone I actually know.

One step closer to just saying Fuck It and becoming an introverted and anti-social English major.  Then at least I’ll have a good excuse to write and read novels, and maybe I won’t care so much about what happens in the future.