Evaluation of Reference Genes for Differential Gene Expression Study of Bovine Tuberculosis

Relative quantitative real-time PCR (qPCR) assays serve as important tools for validating differential gene expression data. A reference gene that is stably expressed across sample types and experimental treatments is crucial for accuracy in interpretation of relative qPCR data. Twelve previously validated reference genes were evaluated in this study to identify a most suitable reference gene that can be used for gene expression study of bovine tuberculosis (bTB) infected and bTB test-false positive cattle, in peripheral blood mononuclear cells after a 4 hour or after an overnight stimulation with bovine tuberculin antigen. Stability of the candidate reference genes were evaluated using the BestKeeper, geNorm and NormFinder programs. The SDHA was found to be the most stably expressed reference gene, regardless of infection status and varying length (4 hours or overnight) of antigen stimulation, while expression of many widely used reference genes are not stable under the studied experimental conditions. We also confirmed that the geNorm and NormFinder programs yielded similar findings in determining the stability of reference genes, which differ largely from the BestKeeper program. This finding stresses the importance of validating the reference gene(s) chosen for each experimental study, and the need for using multiple programs for the evaluation.


Introduction
Quantitative real-time PCR (qPCR) has become the method of choice for quantification of gene expression levels. It is a very sensitive and accurate method for quantification of mRNA transcripts, allowing a direct measure of differential transcription of mRNA for genes of interest, and an indirect measure of regulation of gene expression in biological process [1,2]. Relative qPCR is a rapid and robust method for quantification, and is preferred over the absolute qPCR when the absolute copy number of mRNA is not required [1]. In relative qPCR, a reference gene is used to normalize disparities in RNA recovery and cDNA synthesis efficiency. This permits true comparisons of gene regulation between samples from within a group, and between samples among different groups [1,3,4]. The reference gene is subjected to the same experimental conditions as the genes of interest, and thus serves as a normalizer for correction of experimental variability. The underlying assumption is that the reference gene is expressed at a constant level, and the level of expression remains unchanged across sample types and experimental treatments. Thus, the detected level of expression of the reference gene will correlate with experimental error, which can be normalized directly. Use of a reference gene with an expression level that fluctuates randomly can lead to increased non-specific variation, and use of a reference gene with an expression level that changes with the sample type or with experimental treatments can lead to erroneous interpretation of data [1,5,6].
Conventional housekeeping genes such as glyceraldehyde-3-phosphate dehydrogenase (GAPDH), beta actin (ACTB) and 18S rRNA were widely used as reference genes in early gene expression studies because it was assumed that their expression levels remained constant. However, many studies have shown that expression of housekeeping genes can be influenced by sample types and experimental treatments [7][8][9][10]. Use of the18S rRNA as a reference gene was based on the assumption that the rRNA: mRNA ratio would be the same in all samples and would remain unchanged after treatment; however, that assumption is not always valid [11]. Moreover, the abundance of rRNA, as compared with mRNA, in the total RNA sample could introduce technical issues in the qPCR assay. The disproportionate rRNA: mRNA ratio can complicate optimization of the PCR reaction and performance of the qPCR assay when the preferred fixed sample concentration for each reaction is used. The outcome can be a wide range of 18 Evaluation of Reference Genes for Differential Gene Expression Study of Bovine Tuberculosis qPCR amplification plots for rRNA target that affects the baseline subtraction step in qPCR analysis [2]. Proper validation of reference genes under specific experimental conditions and sample types is critical for accurate gene quantification by the relative qPCR method [6].
Several programs such as BestKeeper [12], geNorm [10] and NormFinder [13] have been developed to evaluate the stability of candidate reference genes. These programs employ different algorithms for calculation of stability values, which may result in different estimates for a stability value. Side by side evaluations using 2 or all 3 of these programs have shown that the best agreement is between geNorm and NormFinder for ranking the most and least stable genes [14][15][16]. To date, there is no single best program for ranking of suboptimal reference genes. The ranking of the candidate genes by 2 or all 3 programs may be necessary before deciding on the best choice for a reference gene [17]. Alternatively, a normalization factor based on 2 to 3 reference genes has been proposed for use as the normalizer. Use of multiple reference genes for a normalization factor can improve the accuracy of quantification [10]. However, this method is costly, labor intensive, and not practical for use with a large number of samples or when resources are limited. This is because all of the normalizer genes must be included on every qPCR plate for every sample [10,18].
Bovine tuberculosis (bTB) is a high impact disease for both animal and human health [19]. A microarray hybridization study was used to examine gene expression profiles from small groups of bTB infected and antemortem test-false positive cattle in the state of Michigan in the United States [20][21]. The overall results showed that differential gene expression profiles exist between test-false positive and true bTB infected cattle. Quantitative real-time PCR (qPCR) assays were chosen to validate differential gene expression data derived from the microarray hybridizations, and to test selected molecular markers for differential diagnosis of the disease.
In this study, a search for a suitable reference gene was performed to identify the best candidate gene that can be used for qPCR of gene expression profile of bTB infected and bTB test-false positive cattle, using mRNA harvested from peripheral blood mononuclear cells (PBMC) after a 4 hour or overnight stimulation with bovine tuberculin antigen (bPPD). Before the initiation of the study, a literature search was performed to identify reference genes previously used in gene expression studies employing i) various bovine sample types, ii) leukocytes or peripheral blood mononuclear cells (PBMC) of species other than bovine, or iii) antigen stimulation studies of human tuberculosis patients. Twelve commonly used and previously validated reference genes were chosen for evaluation using the BestKeeper, geNorm and NormFinder programs. The objective of this study was to determine the most suitable reference gene (with minimal variability in expression level) for qPCR assays of differential gene expression of bTB infected and bTB test-false positive cattle after a 4 hour or overnight stimulation with bPPD.

Materials and Methods
A total of 12 commonly used reference genes as listed in Table 1 were selected from the literature to be considered for use in this study. Published primer sequences for those genes were evaluated in bioinformatics software (a). Primers that met the study criteria were used as published, otherwise new primers were designed using available software (a, b). All primers were synthesized commercially (c), concentration of the PCR primers and qPCR conditions were evaluated and optimized prior to testing of the samples (Table 1).

Experimental Animals and bTB Infection Status
Experimental animals in this study consisted of cattle culled from herds because they showed positive reactions in antemortem diagnostic tests for bTB. The cattle were transported to the Diagnostic Center for Population and Animal Health at Michigan State University the day before a regulatory bTB postmortem examination was done.
Three study groups of cattle were used, which included bTB positive cattle (bTBP) that were positive reactors in antemortem diagnostic tests for bTB and confirmed positive for bTB by postmortem examination; double test-false positive reactor cattle (DFP) that were positive reactors on primary and secondary antemortem diagnostic tests and were negative for bTB on postmortem examination; and single test-false positive cattle (SFP) that originated from bTB positive farms. This last group of cattle was positive reactors in primary and negative on secondary antemortem diagnostic tests, and was negative for bTB on postmortem examination.

Blood Collection and Antigen Stimulation
Whole blood (30-45 ml) was collected from 6 animals in each of the bTBP, DFP, and SFP groups (total of 18 animals) into 10 ml heparinized tubes immediately before euthanasia for postmortem examination. Within 3 hours of collection, the blood from each animal was divided into 3 individual sterile conical tubes and subjected to 1 of 3 treatments. Two of the tubes were stimulated with bPPD (purified protein derivative prepared from the filtrate of a heat-killed M. bovis) (d) at 20µg bPPD/ml of blood. One aliquot was stimulated with bPPD for 4 hours and the other one was stimulated with bPPD overnight (18-22 hours) before being processed for buffy coat harvest. The last tube was harvested without stimulation, serving as nil control. The 2 different stimulation time-points were evaluated simultaneously, to identify a reference gene that could be used for both the 4 hour stimulation study [20] and the overnight stimulation study [21].

Isolation of Peripheral Blood Mononuclear Cells, and Purification of RNA
After stimulation, the blood was centrifuged at 1200 x g for 15 minutes at 18°C to form layers of plasma, buffy coat cells, and red blood cells. Buffy coat cells and 2 ml of red blood cells immediately below the buffy coat cell layer were harvested and transferred to new 50 ml conical tubes. Two rounds of hypotonic lysis of red blood cells were performed by addition of ice-cold diethyl pyrocarbonate treated-sterile de-ionized water for 2 minutes, followed by addition of an equal volume of ice-cold diethyl pyrocarbonate-treated sterile 2X saline (1.7 % w/v NaCl). Intact cells were pelleted by centrifugation at 1200 x g for 15 minutes at 18°C after the first round of hypotonic lyses, then at 190 x g for 10 minutes at 4°C after the second round. After the second round of hypotonic lyses, the supernatant was decanted and 1 ml TRIzol Reagent (e) was added to the loose cell pellet for each 9 ml beginning volume of whole blood. This mixture was frozen at -84°C until use. For isolation of RNA, the mixture was thawed on ice, and subjected to 10 passages through a 20 gauge needle. The resulting homogenate was divided into 1 ml aliquots and the remainder of the RNA extraction procedure was performed according to the manufacturer's recommendations. The total cellular RNA from each animal was then pooled into a single tube and treated with RQ1 RNase-Free DNase (f) according to manufacturer's recommendations. The treated RNA was extracted again using equal volumes of phenol-chloroform, followed by purification using MEGAclear Purification Kit (g). The purified RNA was immediately stored at -84°C until use.
Before use, the RNA from each of the study cattle was thawed on ice and the integrity and concentration of the RNA was determined using an Agilent 2100 Bioanalyzer and RNA Nano 6000 Kit (h).

cDNA Synthesis
Synthesis of cDNA was performed with 2 µg of total RNA from each study animal using Superscript™ II Reverse Transcriptase and Oligo (dT) 12-18 Primer (e), according to manufacturer's recommendations. Upon completion of cDNA synthesis, the RNA template in each reaction was removed with 1U of RNase H (e). The cDNA was purified using QuickClean Enzyme Removal Resin (i) according to the manufacturer's recommendations. The concentration of purified cDNA was measured using NanoDrop ND-1000, and diluted to final concentration of 10 ng/µl. All cDNA were stored at -20°C until use in qPCR assays.

qPCR Assay
A constant amount of cDNA was used in duplicate qPCR reactions for each reference gene. The qPCR assays were performed using SYBR Green PCR Master Mix and an ABI 7500 Real-time PCR System (j). Each 20 µl reaction consisted of 1x SYBR Green PCR master mix, 20 ng of cDNA and a pair of primers at pre-determined optimal concentrations ( Table 1). The reaction conditions were 95 °C for 10 minutes, then 40 cycles of 95 °C for 15 seconds and 58°C for 1 minute. Dissociation curve analysis was done for each reaction to verify the specific amplified products were 20 Evaluation of Reference Genes for Differential Gene Expression Study of Bovine Tuberculosis obtained.

qPCR Data Analysis
The raw cycle threshold (Ct) value for each reaction was exported into an Excel spreadsheet; the ΔCt value was calculated as the difference in Ct of a stimulated sample (4 hours/overnight) from the Ct of the non-stimulated sample from a given animal [Ct(stimulated) minus Ct(unstimulated)].
The stability of each reference gene was evaluated and compared in BestKeeper, geNorm and NormFinder programs. The most stable gene, as determined by use of all programs, was selected and used as the reference gene for subsequent qPCR assay.

Results
Initially, RNA samples from 6 cattle (2 from each group) were used to test the 12 selected reference genes (Table 1) using qPCR assay. The qPCR data (Ct value) of the 12 reference genes were generated for each animal using RNA that was subjected to 3 different treatments (samples without stimulation, 4 hours or overnight antigen stimulation). The data were then evaluated using the BestKeeper, geNorm and NormFinder programs. Both the geNorm and the NormFinder programs used the ΔCt value for calculation of the stability value, while the BestKeeper used the raw Ct value. The ΔCt value was calculated as [Ct(4 hours/overnight stimulated sample)] minus [Ct (no antigen stimulation sample)], and the 2^-ΔCt value was used to compute the stability value in the NormFinder or the geNorm programs.
In the geNorm program, the gene expression stability measure (M) for a reference gene was calculated as the average pairwise variation (V) for that gene with all other tested reference genes. Stepwise exclusion of the gene with the highest M value then allowed ranking of the tested genes according to their expression stability. The NormFinder program utilized a mathematical model to estimate the reference gene's stability based on direct estimation of expression variation and the taking into account of sample subgroups within the data (no sample subgroup was defined in the current study). In the BestKeeper program, the raw Ct values of all data points were used to compute the geometric means, arithmetic means and standard deviation for each reference gene. Stability of the reference gene was determined based on repeated pair-wise correlation analysis and standard deviation of geometric means.
The ranking of the stability values for all 12 reference genes by the three programs is listed in Table 2. The stability ranking for the reference genes varied among the 3 programs, especially with the BestKeeper program that was not based on ΔCt values. Significant discrepancies were observed for a few genes, such as SDHA, B2M and ACTB. Some consistency in ranking was observed among the very unstable gene candidates. The H2A was determined as the least stable gene, with GAPDH, TBP and HPRTI also considered as unstable genes. The most stable gene was not clearly identified using data from 6 animals tested.
Based on the initial results, 5 of the most stable genes ranked by all 3 programs were selected for further evaluation; these genes are SDHA, H3F3A, YWHAZ, B2M and UBC. Despite the lower stability ranking, ACTB and GAPDH were also selected for further evaluation because these were the most commonly used reference genes in the published literature. Additionally, 12 blood samples that represented all 3 study groups were added to increase the sample size to 18 cattle. The qPCR assays were performed on RNA of 18 animals (each with 3 treatments, as mentioned above) for the 7 selected reference genes. Data analysis was performed as the initial run.  Based on the data from the 18 animals, Figure 1, Table 3 and Table 4 show the output of analysis by the geNorm, NormFinder and BestKeeper program respectively. Table 5 summarizes the overall stability rankings of the 7 selected genes by the 3 different programs. Discrepancy was again Advances in Zoology and Botany 5(2): 17-24, 2017 21 observed in stability rankings among the 3 programs. The GAPDH gene remained the most unstable among the 7 selected genes. Interestingly, ACTB was ranked the third most stable gene by the BestKeeper program and was ranked as second least stable gene by the other 2 programs. Overall, SDHA was shown to be the most stable gene by all 3 programs. Based on this result, SDHA was determined as the most stable reference gene for this study.

Discussion
It is clear from the literature that expression of many housekeeping genes can be influenced by different experimental conditions, which should prevent their use in qPCR assays when those conditions are encountered [6,7]. Based on this knowledge, 12 candidate reference genes were selected for evaluation under this study, with the goal to find the most stable reference gene that can be used in the qPCR assays for validation of differential gene expression associated with the bovine tuberculosis disease status. The experimental design of this study includes comparison of animals with and without M. bovis infection (cause of bovine tuberculosis), as well as samples with and without exposure to antigen stimulation. Under these experimental conditions, expression levels of H3F3A, YWHAZ, B2M, UBC, HMBS, RPII, ACTB, HPRTI, TBP, GAPDH and H2A were all shown to be unstable. Surprisingly, GAPDH and ACTB, the 2 most widely used reference genes in early gene expression studies by many researchers, were found to be the least stable genes. In a similar gene expression study of human tuberculosis; GAPDH, ACTB, and B2M were found unstable when tuberculin antigen stimulation was used [5]. Furthermore, the use of GAPDH as a reference gene has been reported to result in erroneous interpretation of the IL-4 gene expression in TB patients [18]. The influence of a stimulant on the expression of housekeeping genes in various cell cultures has also been reported [9,23]. Our findings are in agreement with previous findings, suggesting that stimulation of samples with bPPD antigen has impact on the expression level of those commonly used housekeeping genes tested in this study. Wedlock et al. [22] reported increased expression of housekeeping molecules such as gamma-actin, ACTB, and B2M in M. bovis infected macrophages. Our study using comparisons of animals with and without M. bovis infection yields similar results, confirming the influence of infection status on the expression level of many commonly used housekeeping genes.
Expression levels of housekeeping genes have been found to vary among different cell types (normal versus cancerous cells) [24], different physiological states of cells [25] or infectious status [26]. All of these previous findings stressed the importance for validation of reference gene(s), to ensure that expression is stable irrespective of the experimental conditions. In the current study, the data clearly shows the instability of many candidate reference genes under our experimental conditions. The SDHA gene encodes the succinate dehydrogenase complex subunit A protein, which is a major catalytic subunit of succinate-ubiquinone oxidoreductase, a complex of the mitochondrial respiratory chain. The SDHA gene appears stable in its expression level, and therefore has been evaluated for use as a reference gene in many research studies. For the design of this experimental study, SDHA was found to be the most stable reference gene, across the infection status (bTB positive or negative) and across experimental conditions (4 hours or overnight antigen stimulation). Stability of SDHA has been previously validated in gene expression studies involving bovine polymorphonuclear leukocytes [27], the developing bovine embryo [8], and bovine liver and pituitary tissues [28], which is in agreement with our findings.
BestKeeper, geNorm and NormFinder are programs developed to evaluate the stability of candidate reference genes. Due to the use of different algorithms for calculation of stability values, differences in outputs by these programs have been previously reported [6, 14-17, 24, 26, 29]. Side by side evaluations using multiple programs have been widely recommended for determination of the best choice for a reference gene. In this study, the ranking of gene stability by the geNorm and NormFinder programs was similar, and seldom in agreement the BestKeeper program. Overall, the programs agreed on the least stable genes and, to a lesser extent, on the most stable genes. A similar observation was reported by Perez et al. [14] using the programs to evaluate reference genes for the study of gene expression in bovine muscle tissue, as well as Wang et al. [24] for evaluation of reference genes for the study of human laryngeal cancer. Wood et al. [16] reported good agreement when evaluating reference genes with all three programs, and Skovgaard et al. [15] found good agreement between geNorm and NormFinder for ranking of reference genes. Our finding is in agreement with other researchers, confirming the importance of using multiple programs for evaluation of candidate genes in order to find the best choice for a reference gene for relative qPCR assays. To date, besides the 3 programs used in this study, there are also many other freely available programs for researchers to utilize for reference genes evaluation. Without extra cost, by putting in a little extra time and effort in data analysis, researchers can be assured that they have chosen a suitable reference gene for their study and avoid possible erroneous interpretation of expression study using relative qPCR assays.

Conclusions
Results from this study showed that SDHA was found to be the most stable reference gene, across the infection status (bTB positive or negative) and across experimental conditions (4 hours or overnight antigen stimulation), and is thus the reference gene of choice for use under these experimental conditions. Our results also showed that expression levels of many widely used reference genes are not stable under the studied experimental conditions, thus stressing the importance of validation of suitable reference gene(s) for each experimental study. An adequate reference gene must show stable expression irrespective of the experimental conditions. In the current study, we found discrepancies among 3 commonly used programs in determining the stability of reference genes, where geNorm and NormFinder programs yield similar findings, and were seldom in agreement with the BestKeeper program. Our findings confirmed the need for using multiple programs for evaluation of candidate genes before deciding on the best choice for a reference gene for relative qPCR assays.