OK, time for some more old people meandering. When I was a youngster, we sequenced by hand (actually, we also did PCR by hand, but that’s really sad). We ran our own gels, called our own bases, the whole 9 yards. The kits we used were a step above doing Maxam-Gilbert sequencing, but compared to today it all still seems pretty weird.

Now, we have Engines of God that can sequence entire bacterial genomes in a day, and human genomes in weeks or less. What is wild to me is that not just within my lifetime but within my scientific lifetime we’ve gone from giving a Nobel Prize for the development of sequencing to having so much sequence data that we quite literally don’t know what to do with it all.

Clearly, there will come a point at which having your sequence information will be an important adjunct to health care, and we’ll have all sorts of privacy battles, yada, yada, yada.

But sequence is more interesting than biology. This, in my opinion, is a key concept. We often think of sequences as coming from biology, but that’s not the case. With the parallel revolution in DNA synthesis we can of course make any sequence we want, including massively unnatural sequences with no known counterpart in biology. And with the parallel-parallel revolution in sequence amplification, we have an unholy (or really cool, depending on your point of view) triad: make the sequence, amplify the sequence, acquire the sequence.

We have the ability to label anything we want with a code whose capacity scales as 4 to the power of its length, and can acquire that code with single-molecule resolution. As I have previously predicted, written about, and been fascinated with for a long time, we have the ability to make taggants that can tell us anything about everything.
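To put rough numbers on that coding capacity: with four bases per position, a tag of length n can take 4^n distinct values. Here is a minimal Python sketch of that arithmetic, plus a toy mapping between integers and A/C/G/T strings. The function names are mine, made up for illustration, and the sketch assumes every possible sequence is usable as a tag. Real taggants would presumably need to avoid things like long single-base runs and matches to natural sequences, and would want some error tolerance, so treat these counts as an upper bound.

```python
BASES = "ACGT"  # one base-4 "digit" per position


def tag_capacity(length: int) -> int:
    """Number of distinct tags of a given length: 4 ** length."""
    return 4 ** length


def min_tag_length(n_items: int) -> int:
    """Shortest tag length whose 4 ** length codes cover n_items labels."""
    length = 0
    while 4 ** length < n_items:
        length += 1
    return length


def encode(index: int, length: int) -> str:
    """Write a non-negative integer as a fixed-length base-4 string of A/C/G/T."""
    if not 0 <= index < 4 ** length:
        raise ValueError("index does not fit in a tag of this length")
    digits = []
    for _ in range(length):
        index, remainder = divmod(index, 4)
        digits.append(BASES[remainder])
    return "".join(reversed(digits))


def decode(tag: str) -> int:
    """Inverse of encode: read an A/C/G/T string back as an integer."""
    value = 0
    for base in tag:
        value = value * 4 + BASES.index(base)
    return value


if __name__ == "__main__":
    print(tag_capacity(10))               # distinct 10-mers
    print(min_tag_length(8_000_000_000))  # bases needed for ~8 billion labels
    print(encode(27, 3), decode(encode(27, 3)))
```

Under those assumptions, a 10-base tag already gives about a million distinct labels, and fewer than 20 bases is enough to give every person on the planet their own.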

I originally got interested in taggants because of the OJ Simpson trial (and subsequent, much more legitimate demonstrations that forensics facilities don’t always do things right). Oh my, could we really have confused one DNA with another? How can we make sure of the chain of evidence? Obviously by tagging it with non-natural DNA, precontaminating it so that wherever the natural DNA goes so goeth the tagged DNA (although this would not guard against purposeful biological mixups by perpetrators, à la the plot twist in Presumed Innocent).

But now let’s expand on that concept. We can tag and follow forensic evidence. What else can we tag? Well, if we let our minds wander the answer is: anything. Any noun: people, places, and things. But more than just nouns, we can tag concepts: the time something occurred, where an event occurred, what interactions have occurred subsequently. Just layer on the DNA, and find it later, again at the single molecule level.

I guess the second time I began to fantasize heavily about taggants was after 9/11. Why didn’t we stop Atta? How could we have known? Now, presumably in this day and age the intelligence or homeland security apparatus would work, but as a technologist I’m of course going to hit the problem nail with a tech hammer. If I were into IT, I’d be pushing image analysis, which is quite good and probably less ridiculous than tagging an individual with DNA. Still … if we had known where the terrorists were meeting, and had some way of pre-introducing DNA, then we could have let them disperse, let them carry the clues with them, and, if we’d had an easy way to pick up and analyze that DNA on the flip side … that would have done it. Not. There are too many slips twixt the DNA and the sequencer in that scenario. Nonetheless, it does set in motion thoughts about when it would be an appropriate tool.

And that was before the Engines of God began churning out their limitless supply of information. Now, now it is not so crazy to imagine that the scene from GATTACA, where they vacuum up DNA and just randomly sequence everything in a room in order to find a ‘borrowed ladder,’ is very possible. But instead of a borrowed ladder you could now be looking for the sequence evidence of what’s been there, where the DNA came from, what interactions have occurred. You’re setting up physical networks akin to the Internet, where you look for distribution from nodes, with the packets being DNA rather than electronic (or else you can just laugh at my view of the Internet as being only modestly superior to that of the now-deceased Senator who thought of it as a series of tubes).

Where will this first come to be? Government? Tagging piles of cocaine to look at distribution networks downstream? Tagging counterfeit money to plumb how it traverses economies? Or industry, where technological innovation usually starts? Many products are already tagged with RFID; will it in fact be cheaper to use DNA? I don’t know, I’ve never been a good futurologist, but I believe this is coming, if only because the network analysis required to understand dynamic distribution will be arcane enough (read: a lot like modeling epidemics) to be counted as a trade secret by someone along the way.

- originally posted on Sunday, August 22nd, 2010