Brief Communication
1 Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin 53226
2 Biotechnology and Bioengineering Center, Medical College of Wisconsin, Milwaukee, Wisconsin 53226
| ABSTRACT |
|---|
experimental design; Pearson correlation coefficient; outlier concordance; Northern blot; gene expression
| INTRODUCTION |
|---|
The potentially enormous power of the cDNA microarray technique and its inherent complexity have motivated a large number of experiments and analyses studying various aspects of the technique, particularly the preparation of arrays and samples and the analysis of data (14). As cDNA microarrays are being incorporated into more physiologically oriented studies involving multiple factors and naturally existing variability, experimental design also needs to be rigorously addressed (2, 19). Two particularly important issues in experimental design are the use of dye switching and biological replication. Because of the physicochemical differences between the fluorescent dyes Cy3 and Cy5, it is suspected that they might cause systematic bias in the ratios generated. Random variations in the handling of the two samples or the scanning of the two fluorescent channels could also result in ratio bias. In addition to normalization between the two dyes (14, 15), a commonly used approach to correct any residual dye bias is to repeat the hybridization, with Cy3 and Cy5 switched between the two samples being compared. Biological replication, in which several independent individuals are analyzed in a study, is a standard practice in physiological experiments because of the well-known variability between individuals. Although both dye switching and biological replication are intuitively beneficial for cDNA microarray studies, one drawback is that these procedures substantially increase the costs of these already expensive experiments, further limiting the ability of a laboratory to use cDNA microarrays. These procedures also increase the complexity of the experimental design and the data structure, posing even greater challenges for data analysis. Therefore, the practical question becomes to what extent a cDNA microarray experiment can benefit from dye switching and/or biological replication, i.e., whether the benefits are great enough to justify the additional costs and the increased complexity.
In the present analysis, we took advantage of the unique characteristics of a published microarray data set that was generated in a physiologically oriented context (9), and we developed several algorithms to quantitatively assess the importance of dye switching and biological replication. Guidelines for designing cDNA microarray experiments were proposed based on this analysis.
| METHODS |
|---|
A cDNA microarray containing 2,000 genes, representing 80% of all currently known rat genes, was used. Microarray hybridization was carried out with the widely used direct, two-color (Cy3 and Cy5) labeling method. A custom-designed data analysis method was used to screen for reliable data points and to adjust signal intensity, correct background, and calculate and normalize natural log-transformed ratios. Details of these procedures were described previously (9). Renal medullary mRNA expression profiles were compared in four groups of rats: Dahl salt-sensitive rats on a low-salt (SSLS) or high-salt (SSHS) diet, and consomic, salt-insensitive SS.BN13 rats on a low-salt (13LS) or high-salt (13HS) diet. A loop-like, four-way comparison experimental design with a total of 24 microarrays was used. As depicted in Fig. 1A, three pairs of individual rats were compared in each comparison between two groups of rats (i.e., biological replication), and each pair of rats was examined with both forward and reverse labeling (i.e., dye switching). This design enabled the evaluation of the contributions of biological replication and dye switching separately or in combination. Moreover, 20 randomly selected genes were further analyzed with Northern blots, providing one of the largest sets of validation data in the microarray literature, although still limited from a data analysis point of view.
Identification of Outliers Using an Intensity-Dependent, Continuous Curve of Threshold
A criterion of two times the standard deviation of the entire set of ln(ratio) values was used as the threshold to identify differentially expressed genes (i.e., outliers) in the original study (9). This criterion assumed that expressions of the majority of genes remained unchanged under the experimental conditions examined. However, a large dispersion of ln(ratio) values has been noticed at the lower range of signal intensity, which gradually decreases as intensity increases. Similar dispersion patterns were seen when identical samples were hybridized against each other (1), indicating that it was a systematic technical artifact, rather than a biological phenomenon. With data dispersed in this manner, when a constant threshold such as two times the standard deviation of the entire set of ln(ratio) values is applied, genes with lower intensities have a higher probability of being identified as outliers. To avoid this bias, an algorithm was developed to generate an intensity-dependent, continuous threshold curve. Genes were ranked according to their intensities and divided into consecutive groups, each containing 50 genes. The average of normalized ln(ratio) values in each group was confirmed to be close to 0. The standard deviation of ln(ratio) values as well as the average of ln(intensity) values in each group was calculated. An equation was identified to describe the relationship between two times the ln(ratio) standard deviation of each 50-gene group with the corresponding average of ln(intensity). This equation was then used to calculate a ln(ratio) threshold at the ln(intensity) level of any given gene. If the actual ln(ratio) of a gene exceeded the calculated ln(ratio) threshold, then the gene was considered an outlier. This threshold curve was refitted for each subset of arrays as defined below since each data subset might contain a different number of microarrays.
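The binning and fitting steps described above might be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the fitted curve is assumed to be a straight line in ln(intensity) (that is, logarithmic in raw intensity), the comparison is assumed to use the absolute ln(ratio), and genes left over after the last full 50-gene group are simply dropped from the fit.

```python
import numpy as np

def fit_threshold_curve(ln_intensity, ln_ratio, group_size=50):
    """Fit threshold = a + b * ln(intensity) to 2 x SD of each intensity-ranked group.

    The linear-in-ln(intensity) form is an assumption made for this sketch.
    """
    ln_intensity = np.asarray(ln_intensity, dtype=float)
    ln_ratio = np.asarray(ln_ratio, dtype=float)
    order = np.argsort(ln_intensity)           # rank genes by intensity
    n_groups = len(order) // group_size        # leftover genes are ignored
    mean_intensity, two_sd = [], []
    for g in range(n_groups):
        idx = order[g * group_size:(g + 1) * group_size]
        mean_intensity.append(ln_intensity[idx].mean())
        two_sd.append(2.0 * ln_ratio[idx].std(ddof=1))
    # Least-squares line through the (mean ln(intensity), 2 x SD) points.
    b, a = np.polyfit(mean_intensity, two_sd, 1)
    return a, b

def find_outliers(ln_intensity, ln_ratio, a, b):
    """Flag genes whose |ln(ratio)| exceeds the threshold at their own intensity."""
    threshold = a + b * np.asarray(ln_intensity, dtype=float)
    return np.abs(np.asarray(ln_ratio, dtype=float)) > threshold
```

Because the curve is refitted for each data subset, `fit_threshold_curve` would be called once per subset on that subset's averaged ln(ratio) values before `find_outliers` is applied.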
Generation of Data Subsets to Separate the Impact of Dye Switching and Biological Replication
To evaluate the impact of dye switching and/or biological replication, we divided data from each of the four comparisons into several subsets of data in six different combinations as shown in Fig. 1B. The combination of "1 array, 1 pair of rats, 1 way of labeling" (1-1-1) constituted a baseline condition where neither dye switching nor biological replication was utilized. The combination of "2 arrays, 1 pair of rats, both ways of labeling" (2-1-2) utilized dye switching when the second array was added, whereas the combination of "2 arrays, 2 pairs of rats, 1 way of labeling" (2-2-1) utilized biological replication. Any changes in the reliability of microarray results in combinations "2-1-2" and "2-2-1" compared with "1-1-1" would reflect the impact of dye switching and biological replication, respectively, in addition to the impact of adding a second array itself. The combination of "2 arrays, 2 pairs of rats, 2 ways of labeling" (2-2-2) would reflect the impact of simultaneous addition of dye switching and biological replication in the second array. The combination of "4 arrays, 2 pairs of rats, both ways of labeling" (4-2-2) reflected the impact of dye switching and biological replication when they were added sequentially, but also reflected the impact of increasing the number of arrays to four. The combination of "6 arrays, 3 pairs of rats, both ways of labeling" (6-3-2) added to the combination of "4-2-2" another biological replicate with both ways of labeling. The ln(ratio) values were averaged for each gene in each subset and used for subsequent analyses. Note that in some combinations such as "2-2-1" and "2-2-2," a pair of rats had to be used in more than one data subset to take advantage of a more complete coverage of the available data. As a result, not all individual subsets of data in these combinations were completely independent of each other. Accordingly, conventional statistical significance was not tested. 
Similar trends were seen when only independent subsets were examined.
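As a rough illustration of how the six combinations relate to one another, the sketch below enumerates every candidate subset that could be drawn from the six arrays of a single comparison (three pairs of rats, two labeling directions). It is an exhaustive enumeration under assumed labels (`PAIRS`, `LABELINGS`, and `enumerate_subsets` are hypothetical names) and is not meant to reproduce the particular subsets selected in Fig. 1B, which may be fewer.

```python
from itertools import combinations

PAIRS = (1, 2, 3)                      # three pairs of rats per comparison
LABELINGS = ("forward", "reverse")     # two dye orientations

def enumerate_subsets():
    """Return all candidate array subsets, keyed by arrays-pairs-labelings."""
    arrays = [(p, d) for p in PAIRS for d in LABELINGS]
    subsets = {}
    # One array, no dye switching, no biological replication.
    subsets["1-1-1"] = [(a,) for a in arrays]
    # Same pair of rats, both labeling directions (dye switching only).
    subsets["2-1-2"] = [tuple((p, d) for d in LABELINGS) for p in PAIRS]
    # Two different pairs, same labeling direction (biological replication only).
    subsets["2-2-1"] = [((p, d), (q, d))
                        for p, q in combinations(PAIRS, 2) for d in LABELINGS]
    # Two different pairs, opposite labeling directions.
    subsets["2-2-2"] = [((p, "forward"), (q, "reverse"))
                        for p in PAIRS for q in PAIRS if p != q]
    # Two pairs, each with both labeling directions.
    subsets["4-2-2"] = [tuple((r, d) for r in (p, q) for d in LABELINGS)
                        for p, q in combinations(PAIRS, 2)]
    # The entire set of arrays for the comparison.
    subsets["6-3-2"] = [tuple(arrays)]
    return subsets
```

The enumeration makes the overlap noted above visible: every "2-2-1" or "2-2-2" subset shares a pair of rats with at least one other subset of the same combination, so the subsets are not all independent.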
Quantification of the Impact of Dye Switching and/or Biological Replication
Three indices were examined to assess the reliability of results obtained from each combination described in Fig. 1B and, thereby, to quantify the importance of dye switching and/or biological replication.
Index 1: Consistency between observed ln(ratio) values and ln(ratio) values predicted on the basis of the loop-like, four-way comparison design.
With the loop-like four-way comparison design, ln(ratio) values for any given comparison could be predicted based on ln(ratio) values from the other three comparisons using the following formulas
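$$\ln\frac{\text{SSHS}}{\text{SSLS}} = \ln\frac{\text{SSHS}}{\text{13HS}} + \ln\frac{\text{13HS}}{\text{13LS}} - \ln\frac{\text{SSLS}}{\text{13LS}}$$

$$\ln\frac{\text{13HS}}{\text{13LS}} = \ln\frac{\text{SSHS}}{\text{SSLS}} + \ln\frac{\text{SSLS}}{\text{13LS}} - \ln\frac{\text{SSHS}}{\text{13HS}}$$

$$\ln\frac{\text{SSLS}}{\text{13LS}} = \ln\frac{\text{SSHS}}{\text{13HS}} + \ln\frac{\text{13HS}}{\text{13LS}} - \ln\frac{\text{SSHS}}{\text{SSLS}}$$

$$\ln\frac{\text{SSHS}}{\text{13HS}} = \ln\frac{\text{SSHS}}{\text{SSLS}} + \ln\frac{\text{SSLS}}{\text{13LS}} - \ln\frac{\text{13HS}}{\text{13LS}}$$

Here each ratio is written as numerator group over denominator group. The orientations shown are one consistent convention for the four comparisons; reversing the orientation of any comparison only changes the sign of its term.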
For each combination of arrays shown in Fig. 1B, the Pearson correlation coefficient and the concordance of outliers were calculated as measures of the consistency between predicted and observed data. The Pearson correlation coefficient was calculated based on predicted ln(ratio) values and observed ln(ratio) values of all available genes. The outlier concordance, expressed as percentage, was calculated as [2 x M/(A + B)] x 100, in which A and B represented the numbers of outliers identified from two data subsets being compared (the predicted and the observed data in this case), and M represented the number of overlapping outliers. The number of outliers varied from one data subset to another but was generally within the range of 30 to 60. Ideally, predicted ln(ratio) values should be identical to observed ln(ratio) values. However, technical variance exists between any two microarrays. Since the predicted ln(ratio) values were essentially the sum of ln(ratio) values from three microarrays (or three sets of microarrays), the variance between predicted ln(ratio) values and observed ln(ratio) values would be greater than the variance that can be expected between any two sets of microarrays. The ability of dye switching and/or biological replication to reduce this composite variance, therefore, provided a sensitive measure of their benefits.
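The two consistency measures defined above can be sketched in a few lines. This is an illustrative outline only: the function names are hypothetical, the prediction example assumes the three measured comparisons are oriented as SSHS/13HS, 13HS/13LS, and SSLS/13LS (so that the predicted quantity is ln(SSHS/SSLS)), and returning NaN when neither data subset has any outliers is a choice made for this sketch.

```python
import math
import numpy as np

def predict_ln_ratio(ln_sshs_13hs, ln_13hs_13ls, ln_ssls_13ls):
    """Predict ln(SSHS/SSLS) by closing the loop through the other three comparisons.

    The orientation of each input ratio is an assumption of this sketch.
    """
    return (np.asarray(ln_sshs_13hs, dtype=float)
            + np.asarray(ln_13hs_13ls, dtype=float)
            - np.asarray(ln_ssls_13ls, dtype=float))

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    return float(np.corrcoef(x, y)[0, 1])

def outlier_concordance(outliers_a, outliers_b):
    """Concordance in percent: [2 x M / (A + B)] x 100, with M the overlap."""
    a, b = set(outliers_a), set(outliers_b)
    if not a and not b:
        return math.nan  # undefined when neither subset has outliers
    return 2.0 * len(a & b) / (len(a) + len(b)) * 100.0
```

For example, if one subset yields outliers {1, 2, 3} and the other {2, 3, 4, 5}, then M = 2, A = 3, B = 4, and the concordance is 2 x 2 / 7 x 100, or about 57%.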
Index 2: Consistency between results from subsets of microarrays and the entire set of microarrays.
The Pearson correlation coefficient of ln(ratio) values and the concordance of outliers were calculated for each combination of arrays shown in Fig. 1B (except the combination of "6-3-2") compared with the entire set of arrays (i.e., the combination of "6-3-2"). The ability of dye switching and/or biological replication to increase this consistency was used as a measure of their benefits.
Index 3: Consistency between results from microarrays and Northern blots.
The Pearson correlation coefficient between microarray and Northern blot ln(ratio) values of 20 genes was calculated for each combination of arrays (Fig. 1B) as another index of the reliability of microarray results.
| RESULTS |
|---|
The Pearson correlation coefficient (r) between observed ln(ratio) values and predicted ln(ratio) values based on subsets of microarrays, each containing a single microarray (the combination "1-1-1," Fig. 1), was 0.38 ± 0.06 (n = 12, Fig. 3A), and the outlier concordance was 21 ± 3% (n = 12, Fig. 3B). Adding a second array examining the same pair of rats, but with reverse labeling (the combination "2-1-2"), substantially increased the correlation coefficient to 0.62 ± 0.04 (n = 12) and the outlier concordance to 43 ± 4% (n = 12). When a second array was added to examine a different pair of rats with the same way of labeling (the combination "2-2-1"), the correlation coefficient was similarly increased to 0.62 ± 0.03 (n = 12), while the outlier concordance increased to 35 ± 3% (n = 12). Adding a second array examining a different pair of rats with reverse labeling (the combination "2-2-2") did not increase the correlation coefficient (0.38 ± 0.08, n = 12) and only slightly increased the outlier concordance to 26 ± 4% (n = 12). Increasing the number of arrays to four or six to include two or three pairs of rats, each examined with forward and reverse labeling (combinations "4-2-2" or "6-3-2," n = 4 each), resulted in greater increases in the correlation coefficient, which reached 0.69 ± 0.04 or 0.79 ± 0.03, respectively. In addition, the outlier concordance was increased to 52 ± 4% or 56 ± 4%, respectively. An example of the correlation for each combination is shown in Fig. 3, C-H.
| DISCUSSION |
|---|
In the absence of a "gold standard," it is still possible to decompose sources of variation and assess the relative contribution of each source to the overall variation (7, 18). The purpose of the present study, however, was to assess the impact of dye switching and biological replication on the reliability of microarray results. Reliability may or may not be equivalent to reproducibility measured by variation, depending on how these terms are defined. In the setting of biological experiments such as the one analyzed in the present study, reliability can be further defined as precision (i.e., how precisely the data reflect the subjects being measured) and "generalizability" (i.e., how well the conclusions derived from the measurement of a limited number of subjects can be extrapolated to a larger population).
In the present analysis, we took advantage of the unique characteristics of a published data set (9) and used the combination of three indices to assess the impact of dye switching and biological replication on the precision and/or generalizability of microarray results. Each index has advantages and disadvantages. The predictability index was used to assess precision, because if each measurement were a precise representation of the subject, then the measured and the predicted data would be identical. One could argue that this index was in fact reflecting reproducibility in repeated measurements of a subject, which may or may not be equivalent to precision. The measured and the predicted data would be identical so long as the repeated measurements were reproducible, even though they might not be precise. However, in the absence of a "gold standard," reproducibility in repeated measurements does provide a reasonable indication of precision. An advantage of this index is that it is free of any assumptions regarding the benefits of dye switching or biological replication. The disadvantage is that it does not reflect generalizability. The use of the consistency with the entire set of arrays had the disadvantage of assuming qualitative benefits of dye switching and biological replication because both procedures were utilized in the entire set of arrays. However, so long as this assumption was acceptable, the relative ability of dye switching and/or biological replication in each subset of arrays to bring the results closer to the entire set of arrays would provide a straightforward measure of the quantitative benefits of dye switching and/or biological replication in extrapolating the results to the whole population, i.e., the generalizability of the results. The obvious advantage of the comparison with Northern blots was the use of an independent second technique, and it could reflect both precision and generalizability. The disadvantage was that the number of genes for which both microarray and Northern blot data were available was limited, reducing the power of this index. In addition, because of the lack of a "gold standard," one could always question the relative reliability of microarray vs. Northern blot. Therefore, despite the limitations of each index, the three indices appear to complement each other. Consistent trends observed in more than one of them would provide a strong indication of improvements in data reliability.
Relative Benefits of Dye Switching and Biological Replication
One of these consistent trends was the improvement of all three indices when a second array was added using the reverse labeling to examine the same pair of rats (i.e., dye switching). A 63% increase in correlation coefficient and a doubling of outlier concordance between observed and predicted data were obtained. Similar improvements were found when comparing between subsets of arrays and the entire set of arrays. The data set available did not allow quantitative distinction between the effect of dye switching and the effect of simply adding a second array. However, the improvement in consistency that was observed very likely involved the benefits of dye switching, because other combinations containing two arrays did not achieve the same level of improvement. In fact, the improvement achieved by adding a second array labeled in the same way but to examine a different pair of rats (i.e., biological replication) was often less than that obtained by dye switching. These results indicated that both dye switching and biological replication improved the reliability of microarray results, with dye switching likely having even greater benefits.
The ln(ratio) data used in these analyses had been normalized by adjusting the mean ln(ratio) of each array to 0 (9). It therefore appears that normalization alone was not sufficient to remove the influence of the dye difference. This was consistent with the remarkably strong effect of the dye difference on microarray results, such as that reported by Jin et al. (5), supporting the notion that dye switching is required for obtaining reliable microarray results. The exact nature and the mechanism underlying dye biases are not clear at present. Further experiments and a deeper understanding of the physicochemical characteristics of the dyes and their binding kinetics are needed to address these questions.
The importance of replication in microarray experiments has been emphasized (6, 8, 12). The present analysis showed that biological replication, even when applied in the absence of dye switching, also appeared to have substantial benefits. The magnitude of the impact of biological replication depends highly on the level of naturally existing individual variability in each specific experimental setting. To determine exactly how many replicates are needed for a specific experiment, one would have to determine the variability level of each gene of interest, the magnitude of expression differences expected, and the statistical power desired. Several studies have examined the "normal" variability of gene expression levels (3, 4, 10, 13), providing a prototype of this kind of assessment.
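To make the dependence on variability, effect size, and power concrete, the sketch below uses the standard normal-approximation formula for comparing two group means, n = 2[(z_{1-α/2} + z_{power})·σ/δ]² per group. This is a generic textbook approximation offered as an assumption-laden illustration, not a method used in the present analysis; the function name and the default α and power values are hypothetical choices.

```python
from math import ceil
from statistics import NormalDist

def replicates_per_group(sd, delta, alpha=0.05, power=0.80):
    """Approximate replicates per group needed to detect a difference `delta`.

    sd    : between-individual standard deviation of ln(expression) for a gene
    delta : expected difference in mean ln(expression) between groups
    Uses a two-sided, two-group normal approximation (an assumption).
    """
    if sd <= 0 or delta <= 0:
        raise ValueError("sd and delta must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_power) * sd / delta) ** 2)
```

Under these assumptions, a gene whose individual variability equals the expected difference (sd = delta) would call for about 16 replicates per group, whereas a difference twice as large as the variability would call for about 4, which illustrates why the answer has to be worked out gene by gene.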
When dye switching and biological replication were included simultaneously in the second array added (the combination of "2-2-2"), consistency with the entire set of arrays was substantially improved to a level similar to or slightly higher than that achieved by the combination of "2-1-2" (i.e., dye switching without biological replication). However, the predictability was only minimally increased compared with a single array. This was perhaps a result of the different nature of these two indices. The consistency with the entire set of arrays essentially reflects the generalizability of the results, that is, the ability to extrapolate the results to the whole population. The predictability, on the other hand, was an index of precision, that is, the accuracy in the measurements of the samples being examined. Compared with a single array, adding a second array with reverse labeling and examining a different pair of rats enhanced the resemblance of the combination structure with the entire set of arrays. It thereby increased the generalizability of the results. However, the second array was used to examine a second pair of rats and with reverse labeling. This, therefore, did not improve the precision of the measurement of mRNA levels in either pair of rats involved and did not substantially improve the predictability.
The improvement of index 2 by the inclusion of dye switching was observed in a second data set generated using a different cDNA labeling method (21). It is important to keep in mind that this index assumes that dye switching is qualitatively beneficial. Therefore, we cannot use this index alone to draw conclusions regarding the benefit of dye switching. However, the fact that this index performed differently for combinations 2-2-1 (without dye switching) and 2-2-2 (with dye switching) supported the notion that dye-labeling patterns had an effect on the results obtained using this labeling method, which is consistent with the conclusion drawn from the analysis of the data from Liang et al. (9).
Determining Thresholds of Differential Expression
Determining the threshold of differential expression is a major issue in microarray studies. A fixed fold change was widely used in earlier studies, often without a convincing rationale. A standard deviation-based threshold (9), predetermined P value threshold (5, 21), corrected P values (17) and "null distribution"-based approaches (4, 10) have also been applied. The intensity-dependent dispersion of ratios has been noted previously (11). The intensity-dependent, continuous threshold curve utilized in the present analysis was similar to that developed by Mutch et al. (11), except that a logarithmic function, instead of an inverse function, was used in the present analysis. Genes identified as differentially expressed using this equation contained a more consistent representation of genes across the entire range of ln(intensity) as shown in Fig. 2A.
It is important to point out that although dye biases and intensity-dependent effects could be partially related, they are in essence two distinct problems. Furthermore, two types of intensity-dependent effects need to be distinguished. One is the intensity-dependent dispersion of log-transformed ratios (i.e., a wider dispersion of ratios at lower intensity levels), which we observed in our data set and addressed by using the threshold curve. The other is the intensity-dependent deviation of log-transformed ratios from 0, i.e., the "Nike swoop" shape (20), which we did not observe in our data set.
Summary and Recommendations
The present analysis indicated that both dye switching and biological replication improved the reliability of microarray results. Dye switching appears to yield greater benefits. The selection of experimental design is governed by scientific logic but can also be influenced by practical issues such as the availability of materials or resources. The results of this analysis argue against sacrificing dye switching and biological replication for the sake of reducing costs or experimental complexity. Based on these analyses, we propose the following guidelines for designing cDNA microarray experiments when only a small, fixed number of microarrays is available for a particular study. If the main purpose of the experiment is to obtain estimates of the whole population, then each array should be used to examine a different pair of samples, with dyes reversed in half of the pairs. If obtaining accurate measurements for the samples examined is the main concern, then two arrays with dye switching should be used to examine each pair of samples. If both the generalizability and the precision are desired, then the second design is preferred because, compared with the first design, the gain of precision appears quantitatively much greater than the loss of generalizability. It is important to note that these guidelines are developed based on physiologically oriented experiments using cDNA microarray techniques described in the two studies analyzed (9, 21). Caution should be taken when applying these guidelines to experiments with drastically different characteristics or using other types of microarray techniques.
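The two proposed layouts can be written out explicitly for a fixed number of arrays. The helper below is a hypothetical sketch (`plan_arrays` and its `goal` labels are illustrative names, and alternating forward and reverse labeling is just one way to reverse the dyes in half of the pairs).

```python
def plan_arrays(n_arrays, goal):
    """Return a list of (sample_pair, labeling) assignments, one per array.

    goal="generalizability": every array examines a different pair of samples,
        with dyes reversed in (about) half of the pairs.
    goal="precision": every pair of samples is examined on two arrays,
        one forward-labeled and one reverse-labeled (dye switching).
    """
    if goal == "generalizability":
        return [(pair, "forward" if pair % 2 == 1 else "reverse")
                for pair in range(1, n_arrays + 1)]
    if goal == "precision":
        if n_arrays % 2 != 0:
            raise ValueError("dye switching needs an even number of arrays")
        return [(pair, labeling)
                for pair in range(1, n_arrays // 2 + 1)
                for labeling in ("forward", "reverse")]
    raise ValueError("goal must be 'generalizability' or 'precision'")
```

With four arrays, the first layout covers four pairs of samples with two arrays labeled in each direction, while the second covers two pairs, each examined with both labeling directions.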
| DISCLOSURES |
|---|
Editor S. R. Gullans served as the review editor for this manuscript submitted by Editor A. W. Cowley, Jr.
| FOOTNOTES |
|---|
Address for reprint requests and other correspondence: M. Liang, Dept. of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226 (E-mail: mliang@mcw.edu).
10.1152/physiolgenomics.00143.2002.