Use of 16S rRNA and rpoB Genes as Molecular Markers for Microbial Ecology Studies
Rebecca J. Case, Yan Boucher, Ingela Dahllöf, Carola Holmström, W. Ford Doolittle, and Staffan Kjelleberg
Applied and Environmental Microbiology, 73(1), Jan. 2007, 278-288
Here we have a follow-up study to one discussed earlier in this blog, about the troubles of using 16S rRNA for phylogenetic studies.
16S rRNA is a very slowly evolving molecule. There are a number of reasons for that. One is that the 16S secondary structure means that most mutations in a paired region require a compensating second mutation, or else they interrupt the pairing pattern and ruin the structure. Second, the 16S is part of the huge ribosome complex that is really key to most of life's processes (central dogma stuff). You simply can't live if your ribosomes are mucked up, even to the point that they merely run slowly. The 16S has to interact with all sorts of proteins and other rRNAs. Most of it is critical, one way or another. On top of that, most bacteria have multiple rRNA genes. They need to be able to control their investment in ribosomes and also produce lots and lots of them quickly. Thus the copies are regulated so that some are almost always on, while others turn on only when growth is very rapid. This allows for great flexibility in growth rate. Having multiple copies allows for 'correction' by copying a good copy onto a bad copy. This slows evolution; it may relax purifying selection a little in the copies that are more rarely used, but it also ups the ante on Muller's ratchet. Multiple copies, in numbers that vary, also mean that one copy of the gene doesn't correspond to one cell, or even to a linear relationship between gene counts and cell counts. This is a very serious problem for doing relative counts among different uncultured bacteria.
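To make the counting problem concrete, here is a minimal sketch in Python. It is not from the paper: the taxon names, gene counts, and copy numbers are invented for illustration, and the function names are hypothetical. It also assumes the copy number per cell is known and fixed for each taxon, which is exactly what you usually can't assume for uncultured bacteria.

```python
def raw_fractions(gene_counts):
    """Relative abundance taken directly from 16S gene counts."""
    total = sum(gene_counts.values())
    return {taxon: n / total for taxon, n in gene_counts.items()}


def copy_corrected_fractions(gene_counts, copies_per_cell):
    """Relative abundance after dividing each count by an assumed copy number."""
    cells = {taxon: gene_counts[taxon] / copies_per_cell[taxon]
             for taxon in gene_counts}
    total = sum(cells.values())
    return {taxon: n / total for taxon, n in cells.items()}


# Hypothetical numbers: two taxa present as equal numbers of cells,
# but taxon_A carries 7 rRNA gene copies per cell and taxon_B carries 1.
gene_counts = {"taxon_A": 700, "taxon_B": 100}
copies_per_cell = {"taxon_A": 7, "taxon_B": 1}

raw = raw_fractions(gene_counts)
corrected = copy_corrected_fractions(gene_counts, copies_per_cell)

print(raw)
print(corrected)
```

With these made-up inputs the raw gene counts make taxon_A look seven times more abundant than taxon_B, while the corrected estimate puts them level. The correction is only as good as the copy numbers fed into it.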
Anyway, between having multiple copies in the same cell, which may differ from one another at the few positions where variability is tolerated, and having little variability to start with, this creates a mess for phylogeny. Still, 16S is the gold standard, in part because of the size of the database. Also, it is ubiquitous (good), it has good primers (good), and it is generally accessible to polymerase (good; other genes may have secondary structure in the chromosome that makes copying the gene difficult). Further, multiple copies in a cell mean there are more targets, more to work with per unit DNA (super for FISH, though differing accessibilities of differing copies complicate this picture as well). It is also an RNA, so there are no 'codon bias' or 'wobble' complications. Still, this way of thinking really disguises the complexity of the basis of mutation rate variation across a functional RNA molecule, which is at least as complicated as that for proteins.
Various genes have been suggested to augment or replace 16S rRNA for various taxa. recA for Vibrio is one example. Multilocus sequence analysis (MLSA) is another option (Santos, S. R., and H. Ochman. 2004. Environ. Microbiol. 6:754-759; Thompson, F. L., D. Gevers, C. C. Thompson, P. Dawyndt, S. Naser, B. Hoste, C. B. Munn, and J. Swings. 2005. Appl. Environ. Microbiol. 71:5107-5115), and then there is the nice work by Konstantinos T. Konstantinidis, Alban Ramette, and James M. Tiedje (Applied and Environmental Microbiology, November 2006, p. 7286-7293, Vol. 72, No. 11).
In this paper, the authors suggest rpoB as a single universal gene. Of course, it carries all the baggage of using any protein-coding gene. However, it is easier than trying to use a large set of genes. That approach gets cheaper all the time, but it is still very expensive and very difficult, and it is impossible to do in a 'metagenomic' way by just pulling genes out of mixed template without 'coordination', that is, without getting clones or other identifiable genomic units that let the genes be linked together.
From their evidence, it seems that rpoB is as good a choice as any, but there is no strong evidence that it is uniquely better than another single gene, or than a collection of genes chosen according to an overarching taxonomic framework, with different markers for subgroups and something like 16S to generate the coarsest classifications.
At the end of the paper, the authors discuss the use of functional genes - using a gene as a marker not for a taxon, but for a capability (like phenol degradation, etc.). This is naturally hugely complex, especially as many capabilities can be achieved in multiple ways, using unrelated genes. This notion is not advanced, it seems to me, by using rpoB, unless I'm missing something. My bet is that this methods paper is really a pre-paper to carry the methods for something more interesting coming down the pike. Still, it represents an interesting body of data describing how complex replacing 16S is, and it provides some more data on how this may or may not be accomplished.