Putting genetic snippets together again
The human genome, at three billion base pairs long, is an unwieldy molecule, to say the least. The DNA of some organisms, like bacteria, may be a thousand times shorter, while that of others, like the loblolly pine tree, is quite a bit longer: almost eight times longer, in fact, at 23 billion base pairs in all.
Now, consider that even the best sequencers available today can only sequence snippets of about 100 base pairs at a time. The millions of resulting shreds must be painstakingly reassembled, one by one, to complete any given genome.
These are facts that Steven Salzberg, a professor in the Johns Hopkins Department of Biomedical Engineering, knows well. Salzberg is a computational biologist who has developed a suite of software tools that help put all those millions of snippets back together again into complete genomes. He is a professor in Biomedical Engineering, Computer Science, and Biostatistics and director of the Center for Computational Biology.
“It’s like taking every copy of The Washington Post printed today, chopping them up in random pieces no more than 100 letters each, and then trying to create a complete version of today’s paper from the resulting pile. It takes time, but we have created the tools to do that computationally,” Salzberg says, putting the challenge in perspective.
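To make the newspaper analogy concrete, here is a toy sketch of the idea in Python. It is not Salzberg's software: the function names, the 15-letter example sequence, and the greedy "merge the best-overlapping pair" strategy are all assumptions chosen for illustration. Real assemblers must also cope with sequencing errors, repeated regions, and hundreds of millions of reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` letters exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Toy assembler: repeatedly merge the pair of snippets that overlap most."""
    # Drop exact duplicates, then snippets wholly contained in another snippet.
    reads = list(dict.fromkeys(reads))
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]

    while len(reads) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left overlaps; return the pieces as they are
        # Glue the two snippets together, keeping the shared letters only once.
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical example: three overlapping snippets of a made-up 15-letter sequence.
snippets = ["TACGTTAGC", "ATGGCGTA", "GCGTACGT"]
assembled = greedy_assemble(snippets)
print(assembled)
```

This sketch compares every snippet with every other one on each pass, which is workable for three snippets but far too slow for the millions a real genome produces. That gap is part of why purpose-built assembly tools are needed.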
Last March, Salzberg and his colleagues announced the first assembly of the entire loblolly pine genome, the longest genome ever sequenced. This endeavor was no idle challenge. The loblolly predates dinosaurs and is one of the most significant commercial species of pine in the American Southeast. Knowing its genome, it is hoped, will enable genomic-based breeding programs for wood products and the development of genetic tools to mitigate the effects of the changing climate.
“I’m interested in efficiently reassembling the DNA of many species, including humans, so biologists, geneticists, [and] medical scientists can use that information to solve problems ranging from infectious diseases to cancer,” Salzberg says, adding: “When you speed things up 10, 20, 30 times or more, you change the game. Suddenly, people can do things they never could before.”
— Andrew Myers