By R. Gary Raham
A biologist-artist’s ruminations about our roles in a science-inspired world
We are generating data today at an alarming pace. VCloudNews estimates that humans worldwide create 2.5 quintillion bytes of data every day. (http://www.vcloudnews.com/every-day-big-data-statistics-2-5-quintillion-bytes-of-data-created-daily/) Nobody knows how many funny cat videos that includes. At that rate, current magnetic or optical data storage systems may reach capacity within a century. Besides, how many of you old-timers out there have Zip drives or floppy disks gathering dust on a basement shelf? Storage technology seems to have the life span of a mayfly, but without the mayfly’s ability to use its DNA (deoxyribonucleic acid) data files to reliably reboot via reproduction, a system that has worked well for hundreds of millions of years. Scientists are busy figuring out how to hack Mother Nature’s masterful and eons-old solution for data storage.
Alex Chen of New England Biolabs has summarized the steps for using DNA for data storage. (https://bitesizebio.com/36177/synthetic-dna-long-term-data-storage/)
- Convert binary code (the 0s and 1s used by computer geeks) into nucleotide code: the four chemical bases (abbreviated A, C, T, and G) that link the two sugar/phosphate chains in DNA.
- Split the code into short pieces with appropriate spacers, and add an address code to each fragment so you can retrieve it again.
- Create your DNA in short fragments called oligonucleotides.
- Store in a freezer.
- When you need to read the archived data, chemically amplify and sequence the code.
- Analyze the sequence data and re-assemble it.
- Convert the ACTG code back into binary code.
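For readers who like to tinker, the steps above can be sketched in a few lines of Python. This is only a toy illustration, not the scheme any lab actually uses: the two-bits-per-base mapping, the 4-base address, and the 16-base payload are all assumptions made up for this example, and the chemistry (synthesis, freezing, amplifying, sequencing) is skipped entirely.

```python
from typing import List

# Assumed mapping for illustration: each base stands for two bits.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

ADDRESS_BASES = 4   # hypothetical address length: 4 bases = 8 bits = 256 fragments
PAYLOAD_BASES = 16  # hypothetical payload length: 16 bases = 4 bytes per fragment


def bytes_to_bases(data: bytes) -> str:
    """Convert binary data into a string of A, C, G, and T."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def bases_to_bytes(bases: str) -> bytes:
    """Convert a string of A, C, G, and T back into binary data."""
    bits = "".join(BASE_TO_BITS[base] for base in bases)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def encode(data: bytes) -> List[str]:
    """Split the data into short fragments, each prefixed with an address."""
    strand = bytes_to_bases(data)
    starts = range(0, len(strand), PAYLOAD_BASES)
    if len(starts) > 256:
        raise ValueError("too much data for a one-byte address in this toy scheme")
    return [
        bytes_to_bases(bytes([index])) + strand[start:start + PAYLOAD_BASES]
        for index, start in enumerate(starts)
    ]


def decode(fragments: List[str]) -> bytes:
    """Sort fragments by address, strip the addresses, and rebuild the data."""
    ordered = sorted(fragments, key=lambda f: bases_to_bytes(f[:ADDRESS_BASES])[0])
    strand = "".join(fragment[ADDRESS_BASES:] for fragment in ordered)
    return bases_to_bytes(strand)


message = b"To be, or not to be"
fragments = encode(message)
# Fragments come out of the freezer in no particular order; the address fixes that.
restored = decode(fragments[::-1])
print(restored == message)
```

The address prefix is what lets the fragments be read back in any order. Real encodings also avoid long runs of the same base and add error correction, which this sketch leaves out.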
Seems straightforward, but so is assembling your kids’ Christmas presents. Right? Here are some pros and cons for DNA data storage.
Pros:
- Fantastically high data density. DNA is literally a million times denser than any current technology. All the world’s storage needs for a year could easily be contained within a cubic meter of DNA.
- DNA stores well under the right conditions. Scientists have recovered readable DNA from 10,000-year-old mammoth remains and 500,000-year-old fossil horses.
- Compact storage. DNA molecules wind up into compact bundles. The nucleic acid of E. coli, the microscopic bacterium that lives in your gut, stores 10^19 bits of data per cubic centimeter. (Trust me. That’s a very big number.)
Cons:
- One needs a bio lab for reagents like the polymerase needed to decipher the DNA.
- Synthesizing DNA is getting cheaper, but it is still time-consuming and expensive.
- One needs a cost-effective and convenient facility for sequencing DNA.
- Techniques must be improved to achieve 100% accuracy. Nature’s small error rates allow living things to evolve, but humans don’t want data with evolving content.
- How should DNA be stored? Scientists are still debating the best method(s).
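The cubic-meter claim is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below takes the E. coli figure as 10^19 bits per cubic centimeter (an assumed reading of the number quoted above), uses the 2.5 quintillion bytes per day quoted earlier, and pretends a whole cubic meter could be packed as densely as a bacterium, which is an idealization rather than anything a lab has demonstrated.

```python
# Figures from the article; the 10**19 reading and the 365-day year are assumptions.
BYTES_PER_DAY = 25 * 10**17      # 2.5 quintillion bytes created per day
BITS_PER_CM3 = 10**19            # E. coli density, bits per cubic centimeter
CM3_PER_M3 = 100**3              # cubic centimeters in one cubic meter

bytes_per_year = BYTES_PER_DAY * 365
bytes_per_cubic_meter = BITS_PER_CM3 * CM3_PER_M3 // 8

# How many years of worldwide data would fit in one idealized cubic meter?
years_per_cubic_meter = bytes_per_cubic_meter / bytes_per_year

print(f"Data created per year: {bytes_per_year:.3e} bytes")
print(f"Idealized capacity of one cubic meter: {bytes_per_cubic_meter:.3e} bytes")
print(f"Years of data per cubic meter: {years_per_cubic_meter:.0f}")
```

Under these assumptions a cubic meter has room to spare by a factor of more than a thousand, so "easily" looks like fair wording even after allowing for addresses, spacers, and redundant copies.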
Scientists have already had amazing successes using DNA to encode 154 of Shakespeare’s sonnets, a 26-second audio clip of Martin Luther King’s famous “I Have a Dream” speech, a copy of James Watson and Francis Crick’s double helix paper (a bit of irony there), and a photo of their research institute, among other things.
Imitating nature is usually a good thing. She has had over 4 billion years to work the kinks out of the process.
To learn more about emerging technologies in 2019, check out the December issue of Scientific American magazine.