Ever since I’ve been a practicing molecular biologist, we’ve used plasmids as vehicles for genetic engineering. Or, more accurately, episomes, encompassing the range of extra-genomic information that can replicate inside of cells. In parallel, viral vectors have been harnessed in both prokaryotic and eukaryotic cells. I sometimes wonder whether this predilection for episomes has driven perceptions of synthetic biology. If you want to engineer a cell, of course you must add information to the cell in a somewhat orthogonal fashion, with its own origin and its own means of being maintained, separate from the chromosome. I think this is one of the reasons that the “Venter shunt” of synthesizing the whole genome has attracted so much attention (other than the obvious, of course: that it’s awesome!).

But given my biases about the difficulty of melding a synthetic system with a natural one, and about the limitations of being orthogonal, I think that we are beginning to see a much more realistic idea coming into vogue: the genome as the unit of synthetic engineering. Short of being Craig Venter (and wouldn’t we all like to be Craig … or maybe Bill Gates. Some days I just can’t decide), what do the hoi polloi do? Well, in the distant past the way was pointed by folks who originally did directed evolution at the unit of the cell. Barry Hall is a pioneer here, with evolved beta-galactosidase (ebg). Pim Stemmer is under-recognized (from my point of view) for his early successes in genome shuffling, including the uber-scary shuffling of HIV genomes (amusingly, synthesis of poliovirus drove folks nuts, while Stemmer’s earlier work on creating new viruses didn’t raise much of an eyebrow in the popular press). More recently George Church has developed MAGE, which allows the large-scale, iterative alteration of the E. coli genome using synthetic oligonucleotides. Other recombinase-based methods are beginning to surge, and we have our own version of genome editing based on the Group II self-splicing intron. Folks have also recognized that naturally competent organisms such as Acinetobacter may provide traction for genome engineering.

Soon there will be tailored genomes that contain augmented genetic codes. And MAGE or its relatives will begin to be used regularly for phenotype evolution. I lived through the use of directed evolution for nucleic acids and proteins, and now we’ll have an age of directed evolution of cellular genomes. The corresponding readouts available from NextGen sequencing will make the analysis of the products of directed evolution relatively straightforward. And the tools available via systems biology will allow us to make sense of the glut of information that will become available. It could actually be argued that directed evolution is perhaps the best means of more fully understanding the interconnections that are being discovered by systems biologists.

I’m now wondering about a question that will be resolved by this age of directed evolution. What does the fitness landscape for a cell look like? Now, that’s a loaded question, given that the fitness landscape for anything is highly dependent upon the environment in which it finds itself. A cell (or protein or nucleic acid) in a laboratory environment has a very different available landscape than a cell (or protein or nucleic acid) in the wild. Still, it’s a legitimate question. One would almost have to believe that cells live on very large neutral plains. The genomes of cells have had to learn to get along over billions of years. If you’re the odd gene out, you get booted pretty quickly. That level of interconnection is probably only possible if the landscape for single-mutation moves is very broad.

Of course, there are many places that one can climb or that one can fall on the landscape, but my guess is that the new optima (or nadirs) are relatively mild. I think that the directed evolution of cells will not produce the same range of phenotypes as the directed evolution of proteins. This would seem paradoxical, given that cellular machinery is of course for the most part made of proteins. Let me refine that musing: phenotypically, you will get more bang for your mutational buck by focusing on a single protein rather than the cell as a whole. You will eat lactose better if you mutate beta-galactosidase, rather than mutating a genome. Or maybe this is a futile attempt to compare apples and oranges, given that there are organismal phenotypes that have no counterpart in a single protein (you don’t eat lactose unless you first take it up, and beta-galactosidase is not particularly good at that).

And this nicely brings me back to the start. On the one hand I think that genomes should be the unit of evolution, since they are already an integrated system. On the other, I think that the integrated system will only grudgingly provide new function. And hence we need to provide that new function orthogonally, either by having facile integration methods for new information (cells have already figured this out of course; see pathogenicity islands for details) or via … plasmids. And this brings into focus the question of how we best cross-adapt the older integrated systems with the newer functional information, a question I do not immediately have an answer to.

- originally posted on Monday, September 27th, 2010