A novel site of insertion of IS6110 in the moaB3 gene of a clinical isolate of Mycobacterium tuberculosis

In Mycobacterium tuberculosis, genomic variation is generated mainly by insertions and deletions rather than by point mutations. RvD5 is one such deletion in M. tuberculosis H37Rv. Previous studies from our laboratory showed that the moaA3 gene, which lies in the RvD5 region and is absent from M. tuberculosis H37Rv and H37Ra, is present in a large number of clinical isolates. The present study investigated the RvD5 locus of clinical isolates by detailed PCR analysis. Here we report a new point of insertion of the mobile genetic element IS6110 in the genome of one clinical isolate of M. tuberculosis. The insertion disrupts moaB3, one of the ORFs in the RvD5 region, which is involved in the molybdopterin biosynthetic pathway. This insertion differs from the one in the moaB3 gene of M. tuberculosis H37Rv, where the 4 kb RvD5 region has been lost by homologous recombination and only a truncated form of the gene remains. The finding is relevant because IS6110 is a major determinant of the genome plasticity of M. tuberculosis, and its numerical and positional polymorphism has long been of special interest.


Introduction
It is well known that bacterial strains within a single species exhibit variations in their genetic profile. 1-3 Regions of Difference (RD1-RD16), ranging in size from 2 to 13 kb, have been identified in members of the M. tuberculosis complex by various comparative genome analyses. 4,5 Six regions, i.e., the H37Rv-related deletions (RvD1 to RvD5) and the M. tuberculosis specific deletion 1 (TbD1), are absent from the M. tuberculosis H37Rv genome relative to other members of the M. tuberculosis complex. A major source of genetic polymorphism in M. tuberculosis is the insertion and deletion of these RDs. 4,6 Most RDs, or large sequence polymorphisms (LSPs), are considered unique event polymorphisms and have been used to define the six major lineages and several sublineages of M. tuberculosis. 7 IS6110 mediates a number of genomic rearrangements in RD and RvD regions. Examples include the deletion of the 7 kb locus RvD2 in M. tuberculosis H37Rv, which is present in the closely related avirulent derivative H37Ra, 6 and the loss of the RD2 region encoding the protein antigen MPB64 from some strains of M. bovis BCG. 2 The RvD2 region is highly variable in clinical isolates of M. tuberculosis and seems to represent a hot-spot for IS6110 transposition events. 8
Our laboratory has been interested in studying the regions that are deleted in the laboratory strain M. tuberculosis H37Rv but are present in clinical isolates of M. tuberculosis. The RvD5 region, which comprises several moa genes, is one such deletion in the H37Rv strain. Primers designed to amplify an internal region of RvD5 were expected to give an amplicon of 1254 bp. Most of the clinical isolates screened with these primers gave the expected amplicon; the exception was RGTB39, which gave an amplicon of 2.6 kb. Efforts to characterize this fragment revealed an IS6110 insertion disrupting the moaB3 gene in this region in the clinical isolate RGTB39.

M. tuberculosis strains and DNA isolation
A total of 105 clinical isolates of M. tuberculosis from different parts of Kerala in South India were used for the study. The standard strains included in the study were M. tuberculosis H37Rv, M. tuberculosis H37Ra, M. bovis and M. bovis BCG (Pasteur). The clinical isolates were obtained from processed sputum samples of tuberculosis patients and grown on Lowenstein-Jensen (LJ) medium. They were later subcultured in Middlebrook 7H9 Broth (Difco Laboratories) supplemented with OADC enrichment (Difco Laboratories) and 0.5% glycerol (USB Corporation) at 37°C for four weeks. Biochemical characterization and DNA isolation were carried out as described earlier. 1

Aligning sequences and homology search
DNA sequences identified in this study were compared to sequences in the GenBank and EMBL databases. They were also subjected to BLAST analysis against the M. tuberculosis DNA sequence database (http://genolist.pasteur.fr/TubercuList/) and the M. bovis database (http://genolist.pasteur.fr/BoviList/). For DNA sequence alignment, the program BL2SEQ in the SDSC Biology Workbench was used.
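To illustrate what a pairwise local alignment computes, the following is a minimal Python sketch of Smith-Waterman scoring. It is not the BL2SEQ algorithm (BLAST uses a seeded heuristic), and the function name, scoring values and toy sequences are assumptions chosen only for illustration.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score of a vs b (linear gap penalty).

    Hypothetical helper with illustrative scoring values; BL2SEQ itself
    uses BLAST heuristics and different scoring parameters.
    """
    prev = [0] * (len(b) + 1)          # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)      # current row; column 0 stays 0
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Toy sequences (not real moaB3 or IS6110 sequence) sharing the core "ACGT".
    print(local_alignment_score("TTACGTTT", "GGACGTGG"))
```

A high score over a sub-region, as opposed to the whole read, is the signal that one part of a sequence matches one reference and the remainder matches another.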

IS6110 Fingerprinting
Genomic DNA of the standard strain M. tuberculosis H37Rv and three local isolates was digested with PvuII (New England Biolabs), separated on a 0.8% agarose gel and blotted onto a nylon membrane (Hybond). After prehybridization at 65°C for 3 hours, a 245 bp PCR product of IS6110 from M. tuberculosis H37Rv, labelled with [α-32P]dCTP, was denatured, quick-chilled and allowed to hybridize to the membrane. After overnight hybridization at 65°C, the blot was washed with SSC-SDS at increasing stringency and exposed to an activated Phosphor screen (Kodak). The screen was then scanned using a Personal Molecular Imager FX (BioRad), and the image was visualized using the software Quantity One (BioRad) to obtain the IS6110 RFLP pattern.

PCR
To investigate the RvD5 region and to determine the presence of moaA3 and its adjacent regions in the clinical isolates of M. tuberculosis from Kerala, primers were designed to amplify the flanking sequences. One such primer pair (moaFP1 and moaRP1) was expected to amplify a region encompassing the whole of the moaB3 gene and a part (768 bp) of the moaA3 gene. The expected size of the PCR product was 1254 bp. Our laboratory maintains a repository of clinical isolates of M. tuberculosis collected from tuberculosis patients in Kerala. PCR screening of these isolates using the primers moaFP1 and moaRP1 gave the expected product of 1254 bp in all isolates tested except one: isolate RGTB39 generated a PCR product of 2.6 kb (Figure 1).
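The size shift can be checked with simple arithmetic. The sketch below uses the expected amplicon size (1254 bp) and the 2612 bp length reported for the fully sequenced RGTB39 product. The IS6110 length of about 1355 bp is a commonly cited figure and is an assumption here, as are the helper names and the tolerance.

```python
# Sizes taken from the text: expected moaFP1/moaRP1 amplicon and the
# fully sequenced RGTB39 product.
EXPECTED_BP = 1254
RGTB39_BP = 2612
# Assumption: commonly cited length of IS6110; not a value reported in this paper.
IS6110_APPROX_BP = 1355


def extra_length(observed_bp: int, expected_bp: int = EXPECTED_BP) -> int:
    """Number of base pairs by which the observed amplicon exceeds the expected one."""
    return observed_bp - expected_bp


def classify_amplicon(observed_bp: int, tolerance_bp: int = 50) -> str:
    """Rough interpretation of an amplicon size (hypothetical helper and tolerance)."""
    extra = extra_length(observed_bp)
    if abs(extra) <= tolerance_bp:
        return "expected"
    if abs(extra - IS6110_APPROX_BP) <= tolerance_bp:
        return "consistent with one IS6110-sized insertion"
    return "unexplained size"


if __name__ == "__main__":
    print(extra_length(RGTB39_BP))        # 2612 - 1254 = 1358
    print(classify_amplicon(EXPECTED_BP))
    print(classify_amplicon(RGTB39_BP))
```

The 1358 bp excess is close to the size of a single IS6110 copy, which is why an insertion was a plausible explanation before sequencing confirmed it.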

Cloning and sequencing of 2.6 kb PCR product
The 2.6 kb band was gel eluted and cloned in pGEMT Easy vector. The ends of the insert were sequenced using the moaFP1 and moaRP1 primers. The forward primer gave a sequence of which only 211 bp had homology with moaB3. The latter part of the sequence shared homology with a portion of the IS6110 transposases seen in the genomes of both M. tuberculosis H37Rv and M. tuberculosis CDC1551. As expected, the sequence obtained using the reverse primer had homology with moaA3. To obtain the internal sequences of the 2.6 kb fragment, two sets of internal primers were used: a) moaFP2 and moaRP2, and b) moaFP3 and moaRP3. moaFP2 and moaFP3 gave the remaining part of IS6110. moaRP2 and moaRP3 gave the rest of the moaA3 and moaB3 genes. Details of the primers and the overlapping sequences obtained are shown in Figure 2.

Sequence analysis
Analysis of the sequence data revealed that in RGTB39 the moaB3 gene was disrupted by an IS6110 insertion that breaks the gene into two fragments. A comparison of moaB3 and its neighbouring genes in M. tuberculosis H37Rv, M. tuberculosis CDC1551 and RGTB39 is given in Figure 3. Interestingly, in RGTB39, none of these scenarios are observed. The IS6110 insertion is after 164 nucleotides and the rest of the moaB3 gene is present downstream of the IS6110 sequence. Also, the insertion of IS6110 in moaB3 is in the opposite orientation compared with that in M. tuberculosis H37Rv.
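The two properties compared here, where the element sits and in which orientation, can be sketched in Python. The sequences are short made-up strings, not real moaB3 or IS6110 sequence, and `locate_insertion` is a hypothetical helper that uses exact string matching rather than the BLAST-based comparison used in the study.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase or lowercase DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def locate_insertion(amplicon: str, element: str):
    """Return (offset, orientation) of `element` within `amplicon`, or None.

    offset is the 0-based index of the first element base, which equals the
    number of bases lying upstream of the insertion in the amplicon.
    """
    amplicon = amplicon.upper()
    element = element.upper()
    pos = amplicon.find(element)
    if pos != -1:
        return pos, "forward"
    pos = amplicon.find(reverse_complement(element))
    if pos != -1:
        return pos, "reverse"
    return None


if __name__ == "__main__":
    # Toy data (assumptions): two gene fragments and a non-palindromic element.
    upstream, downstream = "ATGACTGAT", "GGTTCATAA"
    element = "CCGTTAGC"
    forward_case = upstream + element + downstream
    reverse_case = upstream + reverse_complement(element) + downstream
    print(locate_insertion(forward_case, element))
    print(locate_insertion(reverse_case, element))
    print(locate_insertion(upstream + downstream, element))
```

Two strains can then be compared on both values: an insertion at a different offset, or in the opposite orientation, points to a separate insertion event.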

IS6110 RFLP
IS6110 RFLP of RGTB39, RGTB123 and H37Rv was carried out to determine the number of copies of IS6110 in RGTB39. H37Rv served as the positive control (multicopy IS6110), whereas RGTB123, which has no copy of IS6110, served as the negative control. We have previously reported a large number of single-copy and no-copy strains of M. tuberculosis from our region, 9 but RGTB39 turned out to be a strain with more than ten copies of IS6110 (data not shown).

GenBank Submission
The 2612 bp sequence amplified from RGTB39 using primers moaFP1 and moaRP1 was completely sequenced, analysed and submitted to GenBank (GenBank: GU994139).

Discussion
Earlier experiments conducted in our laboratory to screen the different RDs and RvDs showed the absence of moaA3, a gene in the RvD5 region, in some of the clinical isolates from Kerala. 1 Screening a larger number of isolates using different primers from the RvD5 region showed an IS6110 insertion in the RvD5 region of one clinical isolate, RGTB39. Though the insertion initially seemed similar to that in M. tuberculosis H37Rv, it turned out to be different for three reasons. a) The orientation of IS6110 in RGTB39 is the reverse of that in M. tuberculosis H37Rv. b) In M. tuberculosis H37Rv, the moaB3 gene is truncated (Rv3324A), with only the last 137 nucleotides being present. The rest of the gene, along with three other ORFs, is lost in the RvD5 deletion. In RGTB39, however, the whole of the moaB3 gene is present, flanking the IS6110 insertion. c) The point of insertion of IS6110 differs between the two strains: in M. tuberculosis H37Rv only the last 137 nucleotides remain, whereas in RGTB39 the last 211 nucleotides of the gene are present (Figure 3). Thus, this locus can be considered a new site of insertion of IS6110. It is interesting to note that the IS6110 insertion in RGTB39 seems to be an independent event.
In clinical isolates, IS6110 preferentially inserts into certain regions of the M. tuberculosis genome, such as the ipl (IS preferential locus). Six alternative ipl sequence locations have been reported for the integration of IS6110. 10 In addition, SiteMapping, a DNA microarray-based methodology, has been used to identify eight previously unknown regions for insertion of IS6110. 11 Polymorphisms found in the plcD gene in isolates of virulent M. bovis were attributed to IS6110 insertions. 12 A recent study has described a hotspot for the insertion of IS6110 in the area of the region of difference (RD) 724. 13 IS6110 transposition events may disrupt open reading frames 14 or regulatory domains. In addition, IS-mediated deletion events have been suggested to be an important mechanism driving mycobacterial genome variation. 6 Homologous recombination between directly repeated IS6110 elements has been proposed as a likely mechanism for genomic deletions in clinical isolates. 15 The integration of an insertion element can result in either up- or down-regulation of the genes in which it resides. One such case, in which IS6110 acts as a promoter in M. tuberculosis-infected macrophages, has been reported. 16 It is equally possible that an IS6110 insertion may disrupt existing promoter regions, leading to down-regulation of gene expression. An IS6110 insertion site could possibly be responsible for a strain's increased capacity for transmission and replication. More than 60% of characterized IS6110 insertions have been found to disrupt coding regions. However, multiple copies of a gene could compensate for natural knockouts created by IS6110 insertion, as in the case of the PPE (Pro-Pro-Glu) genes of M. tuberculosis. 14 The nature of any alteration will depend upon the function of the products of the genes involved. Gene knock-out studies are needed to confirm the role of such genes in the intracellular survival of the pathogen.
In the present study, the IS6110 insertion was found in moaB3, a gene belonging to the multigene family of the molybdopterin biosynthetic pathway. Molybdopterin is a cofactor required for nitrate reductase and many other enzymes involved in anaerobic metabolism. Genes involved in the molybdopterin cofactor biosynthesis pathway are present in almost all organisms. A study using a promoter trap vector identified two of the M. tuberculosis genes in this pathway, moaX and moeB1, as upregulated in the bacteria present in mouse lungs upon infection. 17 Therefore, moa genes may be important in the intracellular survival of the pathogen. Since the new insertion disrupting moaB3 was found in a clinical isolate obtained from a patient with pulmonary tuberculosis, it could be argued that the moaB3 gene may not be essential for pathogenesis but may play a role in the variation of virulence among clinical isolates. It could also be that other members of the multigene family compensate for the loss of function. Since M. tuberculosis dedicates 21 genes to the biosynthesis of the molybdopterin cofactor, 18 the pathway is likely to be important in the life of the pathogen, but other genes may be expected to complement the loss of activity of moaB3.

Conclusions
Genome variation is the main underlying reason for phenotypic differences observed between organisms of the same species. Genome comparison by multiple sequence alignment of six known genomes of mycobacterial strains (CDC1551, F1, C, Haarlem, H37Rv and H37Ra) revealed that most of the variation results from deletion and transposition events preferentially associated with insertion sequences and genes of the PE/PPE (Pro-Glu/Pro-Pro-Glu) family, but not with genes implicated in virulence. The comparison of multiple genomes demonstrates that the M. tuberculosis genome is currently undergoing an active process of gene decay, analogous to the adaptation process of obligate bacterial symbionts. 19 IS6110-mediated genomic rearrangements therefore possibly have a great influence on the generation of genomic variation and ultimately on the evolution of M. tuberculosis.

Cloning and sequencing
pGEMT Easy vector (Promega) was used to clone the 2.6 kb PCR product. Using the six moa PCR primers, template DNA was sequenced using the fluorescent big dye terminator cycle sequencing ready reaction kit (PE Biosystems) in an automated sequencer (ABI Prism 310). The sequencing reaction

Figure 2. Schematic representation of the genome position of the IS6110 insertion in the moaB3 gene of RGTB39. The different sequences obtained to get the final assembly of the 2.6 kb PCR product are shown as striped arrows. The genome coordinates (M. bovis) of the primer positions and the genes are also shown.

Figure 3. Schematic representation of the comparison of RvD5 and the flanking regions in H37Rv, CDC1551 and RGTB39. In CDC1551 the moaB3 gene is intact. In RGTB39 it is disrupted after 211 bp. In H37Rv, the gene is truncated, with only a 137 bp fragment in the genome. The region spanning the expected PCR product of 1254 bp in CDC1551 and the region spanning the 2.6 kb PCR product in RGTB39 are shown in dotted lines.