Illinois researcher and colleagues build a genomic platform to further understand E. coli
In August 2015, three children in three separate northern Indiana counties were sickened by Escherichia coli O157:H7, and one of them died.
Bacterial outbreaks like these send public health officials into overdrive as they search for the source of the contamination. Was it undercooked meat, unwashed produce, juice that wasn’t fully pasteurized? Did the children have contact with each other?
The O157:H7 version is just one strain of thousands in the E. coli family. But the genome that can make it deadly for children and the elderly differs only slightly from that of its nonpathogenic relatives, said Sergei Maslov, professor in Bioengineering and Bliss Faculty Scholar at the University of Illinois at Urbana-Champaign.
Maslov also is affiliated with the Carl R. Woese Institute for Genomic Biology and the National Center for Supercomputing Applications at Illinois, and he holds a joint appointment in the Biological, Environmental and Climate Sciences Department at Brookhaven National Laboratory, Upton, N.Y.
He and his colleagues, biophysicist William Studier, postdoctoral researcher Purushottam Dixit, and graduate student Tin Yau Pang of Brookhaven National Lab, recently analyzed O157:H7 and 31 other E. coli strains to gain insights into the evolution of bacteria and the development of benign and pathogenic strains. The strains they chose have some of the most complete genome sequences and include representatives from each of E. coli’s six evolutionary groups.
The researchers developed a group of computational methods to analyze single-nucleotide polymorphisms (SNPs) within the E. coli strains’ basic genome. SNPs are single-position variations in the DNA sequence, which is made up of varying combinations of four basic nucleotides: adenine, thymine, cytosine and guanine.
Maslov and his colleagues aligned the 32 strains’ genome sequences and filtered out any mobile elements to identify the most reliable stretches of DNA code. This left them with 496 pairs of basic genomes to compare, one for every possible pairing of the 32 strains.
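To make the pairwise comparison concrete, here is a minimal sketch, not the researchers' actual pipeline. It assumes the sequences are already aligned to equal length, and the strain names and sequences are hypothetical toy data. It also checks the arithmetic: comparing each of 32 strains against every other gives 32 × 31 / 2 = 496 pairs.

```python
from itertools import combinations
from math import comb


def snp_positions(seq_a, seq_b):
    """Return 0-based positions where two aligned sequences differ.

    Positions holding anything other than A, C, G or T (for example
    alignment gaps) are skipped. This filtering rule is an assumption
    made for illustration.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    valid = set("ACGT")
    return [
        i
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a in valid and b in valid and a != b
    ]


# Hypothetical toy alignment, not real E. coli data.
toy_alignment = {
    "strain_1": "ACGTACGTAC",
    "strain_2": "ACGTTCGTAC",
    "strain_3": "ACGAACGTAA",
}

# One SNP list per unordered pair of strains.
pairwise_snps = {
    (x, y): snp_positions(toy_alignment[x], toy_alignment[y])
    for x, y in combinations(sorted(toy_alignment), 2)
}

# Number of unordered pairs among 32 strains.
pairs_for_32_strains = comb(32, 2)

if __name__ == "__main__":
    for pair, positions in pairwise_snps.items():
        print(pair, positions)
    print("pairs among 32 strains:", pairs_for_32_strains)
```

Three toy strains yield three pairs; the same loop over 32 aligned genomes would yield 496.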
The researchers had enough data to identify a basic genome platform for E. coli.
“This is key because, in the event of an outbreak of a new strain of pathogenic E. coli, medical researchers will be able to run a single comparison of the new strain against this basic platform to quickly find what is new, what is lost or replaced by another strain,” Maslov said.
The platform will be especially useful in determining antibiotic resistance. “I think this framework could really simplify a lot of things if people get to using it,” added Studier.
Additionally, the group was able to further the current scientific theory that horizontal gene transfer represents a dominant source of differences between genomes of E. coli strains, as well as a very important method of bacterial evolution. The researchers separated vertically inherited (parent-to-offspring) clonal segments of DNA code from those that were horizontally or recombinantly transferred throughout the entire basic genome.
“We found that five to 10 times as many nucleotide changes were acquired by horizontal transfer as were accumulated by mutations in segments inherited from a common ancestor,” Maslov said. “Over time, more successful or just plain lucky segments are replacing ancestral segments, until you don’t have evidence of a clonal part,” he explained. “It’s like looking for your great, great, great grandfather’s DNA in your own DNA. There isn’t very much of it left.”
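The idea of separating clonal stretches from transferred ones can be illustrated with a toy sketch. It is an assumption-laden simplification rather than the study's statistical method: the genome is cut into fixed windows, and a window is labeled "transferred" when its SNP count exceeds a cutoff. The window size, the cutoff, and the SNP positions are all hypothetical.

```python
def window_snp_counts(snp_positions, genome_length, window):
    """Count SNPs falling in each fixed-size window along the genome."""
    n_windows = (genome_length + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for pos in snp_positions:
        counts[pos // window] += 1
    return counts


def label_windows(counts, threshold):
    """Label a window 'transferred' if its SNP count exceeds the threshold.

    The threshold rule is a stand-in chosen for illustration only.
    """
    return ["transferred" if c > threshold else "clonal" for c in counts]


# Hypothetical SNP positions between two aligned strains.
snps = [3, 120, 121, 125, 130, 133, 140, 410]

counts = window_snp_counts(snps, genome_length=500, window=100)
labels = label_windows(counts, threshold=2)

# Tally SNPs by the label of the window they fall in.
transferred_snps = sum(c for c, lab in zip(counts, labels) if lab == "transferred")
clonal_snps = sum(c for c, lab in zip(counts, labels) if lab == "clonal")

if __name__ == "__main__":
    print(counts)
    print(labels)
    print(transferred_snps, clonal_snps)
```

In this toy case one dense window holds most of the differences, which mirrors the pattern Maslov describes: changes concentrated in transferred segments outnumber those scattered through clonally inherited ones.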
The researchers found that the basic E. coli genome is continually exchanged by recombination with genome fragments acquired from other genomes in the E. coli population. E. coli’s evolutionary groups appear to exchange DNA preferentially within their own groups but also with other groups to varying degrees.
“We believe these genetic transfers are likely the result of co-evolving populations of bacteriophages, which efficiently distribute variability throughout the E. coli population,” Maslov said. Restriction systems in the recipient cells keep the transfers limited to fragments. “Essentially they are being overrun not by enemies but by friends,” he said.
This study builds on Maslov, Studier, and other collaborators’ 2009 study of E. coli strains B and K-12, two of the most commonly used laboratory strains. Maslov hopes to add significantly to the group’s E. coli platform with his next research venture, which will analyze 1,600 strains of E. coli. “Going from two strains to 32 and then to 1,600 is like going from a static photo of E. coli evolution to a movie with only a few frames per second to a movie with many frames per second,” he said. “We will have a much more complete image of evolutionary processes in E. coli.”
Studier also is looking forward to seeing the results of the larger study. “The interesting thing for me will be to see whether there are more subgroups in E. coli that come out in a larger analysis,” he said.
Eventually Maslov hopes to create platforms for other bacterial genomes, including Campylobacter jejuni, a leading cause of food poisoning in the United States; Helicobacter pylori, which can cause ulcers but may be beneficial in preventing asthma, dermatitis, and some gastrointestinal conditions; and the Bacillus family, which includes anthrax.
The paper was published in the Proceedings of the National Academy of Sciences.