Genome-Scale Mapping Reveals Complex Regulatory Activities of RpoN in Yersinia pseudotuberculosis

The alternative sigma factor RpoN (σ54), which is widely distributed in eubacteria, has been implicated in controlling gene expression of importance for numerous functions including virulence. Proper responses to host environments are crucial for bacteria to establish infection, and regulatory mechanisms involved are therefore of high interest for development of future therapeutics. Little is known about the function of RpoN in the intestinal pathogen Y. pseudotuberculosis, and we therefore investigated its regulatory role in this pathogen. This regulator was indeed found to be critical for establishment of infection in mice, likely involving its requirement for motility and biofilm formation. The RpoN regulon involved both activating and suppressive effects on gene expression which could be confirmed with mutagenesis of identified binding sites. This is the first study of its kind of RpoN in Y. pseudotuberculosis, revealing complex regulation of gene expression involving both productive and silent effects of its binding to DNA, providing important information about RpoN regulation in enterobacteria.

responses, alterations of their gene expression that are adaptive to the new environment (1). This type of infection-associated transcriptomic reprogramming was obvious in a previous in vivo transcriptomic study of Yersinia pseudotuberculosis isolated from cecal lymphoid compartments of infected mice (2). In that model, plasmid-carried virulence genes known to be necessary for tissue invasion and resistance toward initial attacks from phagocytes were highly expressed during the early phase of the infection. After about one-and-a-half months of symptomless infection, the expression pattern had changed so that genes encoding proteins involved in adaption and resistance to different types of stresses dominated, while expression of the plasmid-carried virulence genes was considerably reduced. This highlights the importance for bacteria of both adapting to new environments and the regulatory mechanisms involved. Hence, increasing our understanding of bacterial function during infection is of great interest. Mechanisms of bacterial adaptation inform choices of potential targets for new antibiotics, with gene products required to maintain infections considered more promising than those of classical virulence genes.
Transcriptional reprogramming is commonly controlled by various transcriptional regulators that are activated in response to external signals. A major class of transcriptional regulators are sigma factors, which upon activation associate with the core RNA polymerase (RNAP), promoting its binding to specific initiation sites and subsequent open complex formation for transcription of downstream genes (3). There are different types of sigma factors in bacteria, where RpoD or sigma 70 (σ70) is the primary and housekeeping sigma factor active during exponential growth (4-6). Other alternative sigma factors such as RpoE, RpoS, RpoH, and RpoN, which recognize promoter sequences distinct from that of RpoD, regulate transcription under specific conditions, allowing expression of genes required for the bacteria to cope with and adapt to particular situations (5). One of the sigma factors that attracted interest during the analysis of data from our previous in vivo transcriptomic analysis of Y. pseudotuberculosis was RpoN or sigma 54 (σ54), which together with many of its associated proteins, including activators and modulating proteins, was upregulated during the persistent stages of infection, when the expression of genes important for adaptation to the tissue environment dominated (2). RpoN has been reported to control regulation of genes involved in nitrogen metabolism, flagella, and motility, but biofilm formation and quorum sensing can also be affected in rpoN mutant strains (7-10). In some species, RpoN also influences regulation of type III and type VI secretion (8,11-13), and there are many reports implicating RpoN as a regulator of bacterial virulence (9,14,15).
RpoN is structurally and functionally distinct from other sigma factors in that transcription initiation commonly depends on its binding to activating proteins termed bacterial enhancer-binding proteins (EBPs) (6). These EBPs use ATP catalysis to remodel RNAP DNA binding to initiate transcription (16). The affinity of RpoN for the core RNAP is higher than that of most other RpoD-related alternative sigma factors, allowing it to compete efficiently for RNAP binding. Regulation by RpoN can be either direct or indirect via activation of different positive or negative regulators, including other sigma factors (5,17,18). Compared with other sigma factors, direct cross talk whereby different sigma factors regulate the same gene is particularly high for RpoN and is also commonly seen for genes that encode proteins involved in complex processes with different levels of regulation, such as adaption, chemotaxis, adhesion, and protein secretion (19). Adding to the complexity is the variation between different bacteria in the specific signals regulating RpoN and the specific downstream outcomes. One example is biofilm formation, where a deletion of the rpoN gene results in severe effects on the capacity to form biofilms in many bacteria but where the opposite is seen in some other bacteria (20,21).
This study investigated the regulatory role of RpoN in Y. pseudotuberculosis, which had not previously been addressed in detail; nor had the RpoN regulon been defined for this pathogen. The results revealed that RpoN is crucial for Y. pseudotuberculosis to establish infection and is required for biofilm formation and motility. Chromatin immunoprecipitation coupled with next-generation sequencing (ChIP-seq) was used to determine genome-wide binding and revealed more than 100 RpoN binding sites with both inter- and intragenic locations. Transcriptomic data from bacteria lacking rpoN implied a complex regulatory network with direct or indirect effects. Matching the locations of ChIP peaks with transcriptomic data allowed retrieval of more than 130 genes potentially regulated by RpoN, some novel and some known from previous studies. Mutagenesis of selected RpoN binding sites confirmed both activating and suppressive roles of upstream intergenic RpoN binding. This was not seen for sites of intragenic binding to the sense strand. In contrast, mutation of RpoN binding motifs on the antisense strand commonly resulted in increased expression of the gene on the sense strand, implicating a novel regulatory mechanism in which RpoN binding is suppressive.
to function in similar approaches in other bacteria (24). We generated a vector construct to overexpress RpoN-V5 (prpoN::V5) and also a strain expressing RpoN-V5 from its native promoter, by inserting the sequence for the V5 tag in the 3′ end of the rpoN gene (rpoN::V5). Repeating the assays for biofilm formation and motility and including the strains expressing V5-tagged RpoN (ΔrpoN/prpoN::V5 and the rpoN::V5 strains) showed that the V5 tag did not interfere with RpoN function (Fig. 1C and D). It is commonly assumed that RpoN levels and binding to RNAP and to DNA are relatively stable and that the EBPs play the regulatory role. However, our finding of differential expression of rpoN in vivo compared with its expression level in vitro (2), and the possibility of chromosomal structural changes influencing gene expression under certain conditions (2,25,26), prompted us to map the binding of RpoN in bacteria subjected to more than one condition. The conditions used were exponential growth at 26°C, where RpoN-V5 either is overexpressed in trans (prpoN::V5) or is expressed in cis under its native promoter (rpoN::V5), the latter to ensure proper stoichiometry and thereby avoid side effects of competition with other sigma factors. We also included samples subjected to virulence-inducing conditions, with a shift to 37°C and depletion of extracellular Ca2+ for 75 min using the rpoN::V5 strain.
The resulting ChIP-seq data were subjected to a high-stringency bioinformatic analysis with a cutoff of a 2.5-fold difference over genomic noise. The analysis, which was done for all samples individually, identified a total of 119 ChIP-seq peaks representing putative sites for RpoN binding. The number of peaks was in the range previously shown for Escherichia coli, Salmonella enterica serovar Typhimurium, and Vibrio cholerae (8,27,28). Some of the peaks were narrow and distinct, covering 200 to 300 nucleotides, whereas others were relatively broad and covered 300 to 800 nucleotides (see Table S1 in the supplemental material). All 119 predicted peaks were used to identify a common sequence motif in the ±50-bp regions around the peak center. We identified a motif that resembled the RpoN -24/-12 promoter element found in other bacteria (27,29,30) (Fig. 2A). To determine motif strength, defined by level of enrichment of RpoN within a binding site, NN-GG-N9-TGC-NN was used as the base for position-weight matrix calculations. The identified motifs had PSSM (position-specific scoring matrix) scores ranging from 3 to 12, and the motif sequences were found to cover the peak center area (Fig. 2B). The fact that the majority of the predicted motifs are found around the peak center implies that they are genuine RpoN binding sites.
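The motif search around peak centers can be illustrated with a minimal sketch. It assumes only the NN-GG-N9-TGC-NN core pattern and the ±50-bp window given in the text; the function names and the regular-expression approach are illustrative assumptions and not the pipeline used in this study.

```python
import re

# Core pattern from the text: NN-GG-N9-TGC-NN (18 nt in total).
# A lookahead is used so that overlapping candidate sites are also reported.
MOTIF = re.compile(r"(?=([ACGT]{2}GG[ACGT]{9}TGC[ACGT]{2}))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_window(genome, peak_center, flank=50):
    """Find motif candidates on both strands within +/-flank of a peak center.

    Returns a sorted list of (forward_start, strand, site) tuples, where
    forward_start is the 0-based leftmost forward-strand coordinate.
    """
    start = max(0, peak_center - flank)
    end = min(len(genome), peak_center + flank + 1)
    window = genome[start:end].upper()
    hits = []
    for m in MOTIF.finditer(window):
        hits.append((start + m.start(), "+", m.group(1)))
    rc = reverse_complement(window)
    for m in MOTIF.finditer(rc):
        site = m.group(1)
        # Map the offset on the reverse strand back to forward coordinates.
        fwd_start = start + len(window) - (m.start() + len(site))
        hits.append((fwd_start, "-", site))
    return sorted(hits)
```

A pattern match of this kind only nominates candidate sites; ranking them by strength requires a scoring matrix such as the PSSM mentioned in the text.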
We set a cutoff for the PSSM of scores of >7 for high-confidence peaks, which yielded 103 peaks encompassing 112 binding sites (Fig. 2C; Tables 1 and 2). There were relatively few ChIP-seq peaks without motifs (16 out of 119; Table S2) compared with the findings in some other studies (24,27,28). This probably reflects the high-stringency analysis used, which limits the number of false positives commonly found in ChIP data from highly transcribed regions (31,32). The majority of the peak-associated RpoN binding sites were found in bacteria expressing RpoN-V5 from its native promoter during logarithmic growth. The corresponding samples from bacteria overexpressing RpoN-V5 lacked six of those peaks, and even more peaks were missing from the samples from virulence-inducing conditions. The reasons for these differences are not obvious, but in the case of peaks missing in bacteria induced for expression of the virulence plasmid, the availability of exposed sites might have been affected by structural changes in the chromosome per se that can be part of mechanisms suppressing chromosomal gene expression (2,25,26). In bacteria overexpressing RpoN-V5, there might be a saturation effect at high RpoN-V5 concentrations: precipitation of RpoN molecules not associated with any binding site would dilute the samples, yielding relatively less precipitated DNA, so that DNA fragments from low-abundance binding sites are missed. Notably, unlike other sigma factors, RpoN can bind its DNA sequence without RNAP, although the binding is 10-fold less efficient than that of the RpoN:RNAP complex (33). ChIP-seq should therefore also detect RpoN-DNA interactions that are independent of active transcription.
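The scoring and cutoff step can be sketched as follows. This is a generic log-odds PSSM built from aligned sites with pseudocounts and a uniform background, all of which are assumptions made for illustration; the matrix construction and scoring scale used in this study are not specified here, so the cutoff of 7 in the sketch is not directly comparable to the published one.

```python
import math

BASES = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PSSM from equal-length aligned sites.

    Assumptions (not from the study): one pseudocount per base and a
    uniform background frequency of 0.25.
    """
    length = len(sites[0])
    pssm = []
    for i in range(length):
        counts = {b: pseudocount for b in BASES}
        for s in sites:
            counts[s[i]] += 1
        total = sum(counts.values())
        pssm.append(
            {b: math.log2((counts[b] / total) / background) for b in BASES}
        )
    return pssm


def score(pssm, seq):
    """Sum the per-position log-odds values for a candidate sequence."""
    return sum(column[base] for column, base in zip(pssm, seq))


def high_confidence(pssm, candidates, cutoff=7.0):
    """Keep candidates scoring strictly above the cutoff."""
    return [s for s in candidates if score(pssm, s) > cutoff]
```

In practice the matrix would be built from the motif instances found under the ChIP-seq peaks, and each candidate site would be scored over the full motif length.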
The robustness of the analysis was further verified by the identification of RpoN binding sites at intergenic regions upstream of genes previously shown to be regulated by RpoN in other bacteria. Examples here are YPK_1894 (pspA), YPK_3220 (glnK), YPK_3857 (pspG), and YPK_4189 (glnA) (8,27,28). Comparing our high-stringency peaks with the peaks identified by ChIP-seq in E. coli and S. Typhimurium, it was obvious that the location of many of the peaks in relation to coding DNA sequences (CDS) was conserved, and some were also shared with V. cholerae (Fig. 2D and Tables 1, 2, and 3). The conserved locations of RpoN binding sites included locations upstream of pspA and glnA, with orthologs in E. coli, S. Typhimurium, and V. cholerae, and upstream of, for example, YPK_2229, YPK_2908, and YPK_1600, with orthologs in E. coli and S. Typhimurium (Table 3). There were also many novel binding sites identified in Y. pseudotuberculosis, where a majority were intragenic with some also on the noncoding strand (Fig. 2E and F). The presence of sense and antisense intragenic binding sites for RpoN is in accordance with what has been previously found for S. Typhimurium and E. coli (27,28). The function and mechanisms of RpoN intragenic binding are generally unknown, but there are examples where this binding can drive transcription of downstream genes with long 5′ untranslated regions (UTRs) (27,34). The putative RpoN binding sites identified were divided into groups (A to D) based on their position and orientation. Groups A (26 sites) [...] (Table 1). Groups C (49 sites) and D (25 sites) comprise intragenic binding sites; those in group C are oriented in the sense direction and those in group D in the antisense direction (Fig. 2E and F; Table 2).
[Figure legend, partial] Those genes which were found to be conserved based on function and phylogenetic distance ratios among all three bacteria and whose corresponding gene was reported (by a ChIP-seq study) to be regulated by RpoN were included in the Venn diagram.
In general, binding sites in group A were associated with stronger peaks than the sites in groups C and D (Tables 1 and 2). Further, compared with the intergenic binding sites among which a relatively large fraction appears to be conserved among E. coli, S. Typhimurium, and Y. pseudotuberculosis, the fraction of conserved intragenic RpoN binding sites was considerably lower (Tables 1, 2, and 3).
To reveal possible sigma factor cross talk in Y. pseudotuberculosis, we also screened for RpoD, RpoE, RpoS, RpoH, and FliA binding sites close (±200 nt) to the identified RpoN binding sites. This screen showed many potential dual and sometimes triple sigma factor binding regions close to each other in 68% of all A sites (Table 1). Even more multiple binding regions were found associated with intragenic sites, with clusters of 2 to 5 sigma factor sites in more than 90% of all C and D sites (Table 2). Notably, all sigma factor binding regions associated with A sites had a binding site for [...]
Deletion of rpoN has a large impact on the Y. pseudotuberculosis transcriptome, with both direct and indirect effects. Next, we aimed to determine whether the identified binding sites could indeed be coupled to gene expression in Y. pseudotuberculosis. For this we employed transcriptome sequencing (RNA-seq) on WT and ΔrpoN bacteria at 26°C in stationary phase and at 37°C with virulence induction. Analysis of differentially expressed genes revealed a markedly different gene expression pattern in the ΔrpoN strain compared with the isogenic WT strain. More than 500 genes were found to be differentially regulated at 26°C in stationary phase: 294 were downregulated and 213 were upregulated in ΔrpoN (Fig. 3A and B). The effect was even more pronounced at 37°C under virulence-inducing conditions: almost 1,700 genes were affected, with 766 genes downregulated and 929 upregulated. The reason for this discrepancy, with a much higher number of genes affected under virulence-inducing conditions compared with 26°C at stationary phase, might be a higher degree of stress associated with the former. This is a condition known to involve activation of different alternative sigma factors as well as other global regulators.
Functional annotation analysis of the differentially expressed genes showed downregulation of genes involved in nitrogen metabolism, flagellar assembly, chemotaxis, and quorum sensing under both conditions (Fig. 3C and Fig. S2). There was also downregulation of fatty acid biosynthesis and metabolism, but this was not seen in samples of bacteria subjected to virulence-inducing conditions. For these samples, additional pathways were affected by the deletion of rpoN, including, for example, low expression of genes involved in the type III secretion system (T3SS) that normally is highly upregulated under this condition, downregulation of DNA replication and amino acid biosynthesis, and upregulation of genes involved in carbon metabolism, gluconeogenesis, and ribosomal organization, with the latter possibly reflecting the stress of translational reprogramming. Although the expected effects of RpoN deletion, such as reduced expression of genes involved in nitrogen metabolism, flagella, and quorum sensing were obvious, the impact on the Y. pseudotuberculosis transcriptome was massive, clearly reflecting deletion of a global regulator. This accords with previous studies, which found that the absence of RpoN commonly results in a global effect involving both direct and indirect effects on gene expression (9,24,35). The differential gene expression analysis revealed changed expression levels of other sigma factors in the ΔrpoN strain (Table S3). Expression of RpoD, for example, was significantly upregulated, whereas expression of RpoE was downregulated both in stationary phase and under virulence-inducing conditions. RpoS, on the other hand, was upregulated in stationary phase but downregulated under inducing conditions. In addition to other sigma factors, the mRNA levels of other transcriptional regulators were affected, such as those of CpxR, RovA, FlhCD, and others (Table S3). Hence, these effects, together with effects on the transcription of other sigma factors, are expected to contribute extensively to indirect effects on gene expression, adding further complexity to the data set.
[Fig. 3 legend, partial] The criteria for differential expression shown in panels A and B were fold changes of >1.5 with a P value of <0.05. (C) Pathway enrichment (KEGG) of genes differentially expressed both in stationary phase and after virulence induction. Pathways are ranked by the negative log10 of the P value of the enrichment score. The P values were calculated using the Bonferroni correction. (D) Plots showing expression of genes associated with intergenic or intragenic high-confidence peaks (A, B, C, and D sites). The y axis indicates expression values from RNA-seq at stationary phase and virulence induction (log2 fold change of ΔrpoN/WT), and the x axis indicates the strength of the associated ChIP peaks (the RpoN binding peak over control in log2 fold), where higher peak strength corresponds to stronger enrichment of RpoN within a binding site. Significantly upregulated (red circles) and downregulated (green circles) genes, as well as nonsignificant changes (black circles), are indicated for the different binding site groups. The portion of potential RpoN binding sites associated with differential gene expression was 88% for group A, 79% for group C, and 68% for group D.
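The differential-expression criteria stated for Fig. 3A and B (fold change of more than 1.5 with a P value below 0.05) can be written as a small classifier. This is a minimal sketch: the function names are hypothetical, and it assumes the fold-change cutoff applies symmetrically in both directions on the log2(ΔrpoN/WT) scale.

```python
import math
from collections import Counter


def classify(log2_fc, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Label a gene 'up', 'down', or 'ns' in the mutant relative to WT.

    log2_fc is log2(mutant/WT). Assumption: the 1.5-fold cutoff is applied
    symmetrically, i.e. |log2_fc| must exceed log2(1.5).
    """
    if p_value >= p_cutoff:
        return "ns"
    threshold = math.log2(fc_cutoff)
    if log2_fc > threshold:
        return "up"
    if log2_fc < -threshold:
        return "down"
    return "ns"


def summarize(results):
    """Count labels for an iterable of (log2_fc, p_value) pairs."""
    return Counter(classify(fc, p) for fc, p in results)
```

Applied to a full RNA-seq result table, `summarize` would yield the kind of up/down tallies reported for each condition.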
Identified RpoN binding sites mediate both positive and negative regulation of gene expression. Given the complexity, including indirect effects, in the RNA-seq data set, we next aimed to reveal direct effects of RpoN. The RpoN binding sites identified in the ChIP-seq analysis likely include both active promoters driving transcription and suppressive and silent binding of RpoN to the chromosome. The potential RpoN binding sites identified were therefore matched with detected changes in gene expression levels (Fig. 3D). Among the identified binding sites in group A, 88% were associated with differential gene expression, involving both upregulated and downregulated genes (Fig. 3D). The intragenic C and D sites were also associated with differential gene expression, 79% for the C sites and 68% for the D sites. For C sites, the differential gene expression included both upregulation and downregulation of genes containing the motif as well as downstream genes. For the antisense D sites, there was a larger fraction of upregulated than downregulated genes, suggesting potential suppressive effects of the RpoN binding (Fig. 3D). In general, a relatively large portion of the differentially expressed genes associated with RpoN binding showed increased expression in the ΔrpoN strain. Binding by RpoN might suppress transcription by nearby sigma factors and possibly other transcription factors binding to the same region, where the absence of RpoN would then allow transcription to occur (8,27). Also, collision as a consequence of RpoN intragenic binding has been suggested (36,37).
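Matching binding sites to expression changes amounts to a per-group tally. The sketch below assumes each binding site has already been assigned to one associated gene; the data structures, the function name, and the gene identifiers in the usage note are hypothetical and do not come from the study's data.

```python
from collections import defaultdict


def fraction_associated(sites, de_genes):
    """Percentage of binding sites per group whose associated gene is
    differentially expressed.

    sites: iterable of (group, gene_id) pairs, one per binding site.
    de_genes: set of gene IDs called differentially expressed.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, gene in sites:
        totals[group] += 1
        if gene in de_genes:
            hits[group] += 1
    return {g: 100.0 * hits[g] / totals[g] for g in totals}
```

For example, with two toy A sites that both map to differentially expressed genes and four toy D sites of which one does, the function returns 100.0 for group A and 25.0 for group D.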
To explore the potential importance of the RpoN binding sites identified in Y. pseudotuberculosis, we set out to mutate some of them to reveal their effects on gene expression. For this, we chose binding sites with locations indicative of putative positive or negative regulation by RpoN, which was the case for many A and D sites. We could not identify any C-site ChIP-seq peak indicative of transcriptional activation of downstream genes in our data set. The sites were mutated by the exchange of 3 to 6 nucleotides in the conserved TGG and TGC sequences of the RpoN binding motif. In intragenic motifs, the exchanged nucleotides were selected in order to minimize changes in the encoded protein (see Table S5 for details). The selected sites included seven A and five D sites (Fig. 4 and Table S4), and the effects of the mutations on gene expression were verified by qPCR. For all putative activating A sites, those represented by downstream genes downregulated in the absence of RpoN, point mutations in the RpoN binding motifs in the WT strain resulted in reduced transcription. This class of binding sites also showed the highest degree of conservation. The most prominent effect of the disruptive nucleotide exchange was seen for pspA (YPK_1894), a gene known to be activated by RpoN (38,39). There were also indications of inhibitory effects of RpoN binding to A sites, where the expression of downstream genes was increased in both the ΔrpoN strain and the corresponding binding site mutant. Intriguingly, four of the five binding site mutations in D sites resulted in increased expression of the CDS on the opposite strand, suggesting a suppressive effect of RpoN binding. We also mutated some of the C sites, but here no effect on transcription compared with that in the WT strain could be seen. Notably, peaks associated with C sites were commonly flatter and broader than the peaks covering the mutated A and D binding sites ( Fig. 4 and Fig. S3).
Taken together, we have by mutagenesis been able to verify effects of RpoN binding to the binding motifs identified in a ChIP-seq screen of Y. pseudotuberculosis. Activating as well as suppressing effects of intergenic RpoN binding were verified by mutating intergenic A sites. Among these verified productive RpoN binding sites, some were known, such as those in PspA, PspP, and GlnA, whereas RpoN regulation of TppB, UgpB, and DkgA is described for the first time. There were also indications of inhibitory effects of RpoN binding to intragenic D sites. As discussed earlier, how RpoN suppresses transcription is less clear, but for binding to the sense strand it might occur by steric hindrance, either by the RpoN-RNA polymerase complex or by RpoN alone binding to the DNA. How this can affect expression from the opposite strand, as would be the case for the observed inhibitory effect of mutating RpoN binding D sites, is less clear, but it might involve disturbed strand separation. For intragenic RpoN binding, we saw effects on transcription only by mutating binding sites on the antisense strand. Thus, intragenic binding by RpoN to coding regions on the sense strand is likely silent and may be used for storage of RpoN-RNAP. RpoN-mediated suppression by binding to internal antisense sequences of genes carried on the opposite strand has not been shown previously, and its mechanism remains to be elucidated.

MATERIALS AND METHODS
Bacterial strains and growth conditions. Strains and plasmids are listed in Table S5 in the supplemental material. Yersinia pseudotuberculosis strain YPIII was used in this study. Escherichia coli S17-1 λpir was used for cloning and conjugation. Antibiotics were used at the following concentrations: ampicillin (100 μg/ml), kanamycin (50 μg/ml), and chloramphenicol (25 μg/ml). Motility was tested on Luria-Bertani (LB) medium with 0.6% agar. Biofilm assays were carried out as described previously, using LB medium in glass tubes (40). All strains were routinely grown at 26°C in LB medium containing kanamycin (50 μg/ml). For ChIP- and RNA-seq analyses, cultures were grown in LB medium to the desired OD600. Arabinose (0.005%) was used for 30-min induction at 26°C for overexpression of rpoN. To reach the virulence induction condition, overnight bacterial cultures were diluted to an OD600 of 0.05 in LB and grown at 26°C. After 1 h, calcium was depleted by adding 5 mM EGTA and 20 mM MgCl2 and cultures were shifted to 37°C (22).
Strain construction. In-frame gene deletion and insertion of the V5 epitope and binding-site mutations in Y. pseudotuberculosis were performed using an In-Fusion HD cloning kit (Clontech) according to the manufacturer's instructions. Briefly, the flanking regions of the respective gene were amplified by PCR and cloned into the suicide vector pDM4. This construct was used to transform S17-1 and then transferred into recipient strains through conjugation. Conjugants were purified and incubated on 5% sucrose to recombine out the vector together with WT sequence. Deletion or mutation was confirmed by PCR. For trans-complementation, the gene was PCR amplified and cloned into the pBAD24 plasmid. For gene induction, rpoN and a C-terminal 3×V5 epitope tag were PCR amplified and cloned into the pBAD18 plasmid. All constructs were verified by sequencing. Primers used in this study are listed in Table S6.
Ethics statement. Mice were housed and treated in accordance with the Swedish National Board for Laboratory Animals guidelines. All the animal procedures were approved by the Animal Ethics Committee of Umeå University (Dnr A108-10). Mice were allowed 1 week to acclimatize to the new environment before the experiments started.
Mouse infections. Female FVB/N mice (Taconic), 8 weeks old, were deprived of food and water for 16 h before infection. For infection of mice, overnight cultures of the Y. pseudotuberculosis strains were suspended in sterilized tap water supplemented with 150 mM NaCl, reaching an approximate count of 10^6 CFU/ml for low-dose infection and 10^8 CFU/ml for acute infection. Mice were allowed to drink for 6 h. The infection dose was calculated based on viable count and the volume of drinking water supplemented with bacteria that was consumed. Mice were inspected frequently to ensure that no prominent clinical signs were overlooked. Infected mice showing notable clinical signs were euthanized promptly to prevent suffering.
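The dose calculation described above is simple arithmetic and can be sketched as follows. The function names, the plate-count formula, and the option to divide the consumed volume among several mice are illustrative assumptions; the text states only that dose was derived from viable count and consumed volume.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Viable count from a plate count.

    dilution_factor is the fraction of the original suspension plated,
    e.g. 1e-4 for a 10,000-fold dilution (standard plate-count formula).
    """
    return colonies / (dilution_factor * plated_volume_ml)


def infection_dose(viable_count_cfu_per_ml, volume_consumed_ml, n_mice=1):
    """Estimated CFU ingested per mouse.

    Assumption: if several mice share a bottle, the consumed volume is
    split evenly among them.
    """
    return viable_count_cfu_per_ml * volume_consumed_ml / n_mice
```

For instance, a suspension at 10^6 CFU/ml of which a single mouse drinks 4 ml corresponds to an estimated dose of 4 × 10^6 CFU.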
The infections were monitored using the IVIS Spectrum in vivo imaging system (Caliper Life Sciences) routinely every 3rd day until 15 days postinfection (dpi) and later every week up to 28 dpi. The mice were anesthetized using the XGI-8 gas anesthesia system (Caliper Life Sciences) and 2.5% IsoFloVet (Orion Pharma, Abbott Laboratories Ltd., Great Britain) in oxygen for initial anesthesia and 0.5% isoflurane in oxygen during IVIS imaging. After infection, some mice were euthanized and dissected to analyze bacterial localization and presence in various organs, including the intestine, mesenteric lymph nodes, liver, and spleen. The organs were imaged using IVIS. Living Image software, version 3.1 (Caliper Life Sciences, Inc.), was used for image acquisition and data analysis.
Motility and biofilm assays. Determination of swimming motility was performed as described previously (41). A 5-μl aliquot of a diluted overnight culture (OD600 of 1.0) was spotted at the center of a 0.6% LB soft agar plate and incubated at 26°C for 24 h. Bacterial motility was determined by measuring the diameter of the bacterial growth area.
Biofilm formation was determined as previously described (40). Overnight cultures were diluted to an OD600 of 0.05 and grown to an OD600 of 0.5 in a 26°C shaking water bath. A 1-ml aliquot of the bacterial culture was pelleted and resuspended in 2 ml fresh LB. The suspension was transferred to glass tubes and incubated for 48 h at 37°C without shaking. After incubation, the bacterial suspension was discarded and tubes were gently washed 3 times with PBS and stained with 0.1% crystal violet (Sigma-Aldrich) for 15 min, followed by 3 successive washes with PBS. The biofilms on the tube surface were thereafter dissolved with 33% acetic acid for 15 min, and the absorbance at 590 nm was measured with an Ultrospec 2100 Pro spectrophotometer (Amersham Biosciences, Piscataway, NJ).
[...] conditions and was selected as an internal control in order to calculate the relative expression levels of tested genes, using appropriate primers (see Table S6 in the supplemental material).
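Relative expression against an internal control gene is often computed with the 2^-ΔΔCt method. Whether that method was used here is an assumption, since the surviving text states only that an internal control was used; the sketch below also assumes 100% amplification efficiency, and the function name is hypothetical.

```python
def relative_expression(ct_target_sample, ct_control_sample,
                        ct_target_reference, ct_control_reference):
    """Fold change of a target gene in a sample versus a reference strain,
    normalized to an internal control gene (2^-ddCt).

    Assumptions: the ddCt method applies and both amplicons amplify with
    100% efficiency (a doubling per cycle).
    """
    delta_sample = ct_target_sample - ct_control_sample
    delta_reference = ct_target_reference - ct_control_reference
    return 2.0 ** -(delta_sample - delta_reference)
```

With this convention, a value below 1 indicates reduced expression in the sample (for example a binding-site mutant) relative to the reference strain (for example WT), and a value above 1 indicates increased expression.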
Data availability. The RNA-seq and ChIP-seq data files have been deposited in Gene Expression Omnibus (GEO) under accession numbers GSE155606, GSE155607, and GSE155608. All the computer code and pipeline used in these studies are available on request.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.