Call For Papers: Comparative Genomics
α-defensins as revealed by comparative analysis of rodent and primate genes
1 Department of Animal Science, Oklahoma State University, Stillwater, Oklahoma
2 Department of Biological Sciences, University of South Carolina, Columbia, South Carolina
| ABSTRACT |
|---|
|
|
|---|
α-Defensins constitute a family of cysteine-rich, cationic antimicrobial peptides produced by phagocytes and intestinal Paneth cells, playing an important role in innate host defense. Following comprehensive computational searches, here we report the discovery of complete repertoires of the α-defensin gene family in the human, chimpanzee, rat, and mouse, with new genes identified in each species. The human genome was found to encode a cluster of 10 distinct α-defensin genes and pseudogenes spanning 132 kb continuously on chromosome 8p23. Such α-defensin loci are also conserved in the syntenic chromosomal regions of chimpanzee, rat, and mouse. Phylogenetic analyses showed formation of two distinct clusters, with primate α-defensins forming one cluster and rodent enteric α-defensins forming the other. Species-specific clustering of genes is evident in nonprimate species but not in the primates. Phylogenetically distinct subsets of α-defensins also exist in each species, with most subsets containing multiple members. In addition, natural selection appears to have acted to diversify the functionally active mature defensin region but not signal or prosegment sequences. We concluded that mammalian α-defensin genes may have evolved from two separate ancestors that originated from β-defensins. The current repertoires of the α-defensin gene family in each species are primarily a result of repeated gene duplication and positive diversifying selection after divergence of mammalian species from each other, except for the primate genes, which evolved prior to the separation of the primate species. We argue that the presence of multiple, divergent subsets of α-defensins in each species may help animals to better cope with different microbial challenges in the ecological niches that they inhabit.
defensin; antimicrobial peptide; comparative genomics
| INTRODUCTION |
|---|
|
|
|---|
α-, β-, and θ-defensins, each consisting of a unique six-cysteine motif and a disulfide bonding pattern (6, 16, 21, 29). For example, the consensus α-defensin motif is C-X1-C-X3–4-C-X9-C-X6–10-C-C, where C1-C6, C2-C4, and C3-C5 form three intramolecular disulfide bridges. Although β-defensins have been found in most vertebrate species with a much wider tissue expression pattern, α-defensins are specific to mammals and are mainly produced by leukocytes of myeloid origin and Paneth cells of the small intestine (6, 16, 21, 29). On the other hand, θ-defensins are believed to have evolved from α-defensins and have only been discovered in leukocytes of primates (6, 16, 20). Defensins possess potent antimicrobial activity against a broad range of bacteria, fungi, and enveloped viruses (6, 16, 21, 29). In addition, certain defensins are also capable of inducing maturation of dendritic cells and spermatocytes, inducing secretion of chloride and proinflammatory cytokines from epithelial cells, and chemoattracting immune and inflammatory cells (6, 16, 40). The mechanism by which defensins kill microbes involves initial electrostatic interactions of the cationic peptides with negatively charged phospholipids on the microbial membranes, followed by permeabilization of the membrane and cell death (6). In contrast to most conventional antibiotics, which kill microbes by specific biochemical mechanisms, such a physical mechanism of action confers on defensins a broad-spectrum antimicrobial activity with equal efficacy against antibiotic-resistant strains and little risk of developing resistance (8, 43). Consequently, defensins are being explored as a new class of antimicrobials, mainly to control antibiotic-resistant microorganisms.
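The consensus spacing quoted above can be written as a regular expression. The following is a minimal sketch rather than the search procedure used in this study: the pattern name and helper function are hypothetical, and X is assumed to stand for any residue.

```python
import re

# Consensus spacing from the text: C-X1-C-X3-4-C-X9-C-X6-10-C-C.
# Assumption: X may be any residue, so "." is used for each X position.
ALPHA_DEFENSIN_MOTIF = re.compile(r"C.C.{3,4}C.{9}C.{6,10}CC")


def find_alpha_defensin_motif(peptide):
    """Return the (start, end) span of the first motif match, or None."""
    match = ALPHA_DEFENSIN_MOTIF.search(peptide.upper())
    return match.span() if match else None


# Synthetic peptide built only to satisfy the spacing; not a real defensin.
synthetic = "MK" + "CAC" + "AAA" + "C" + "A" * 9 + "C" + "A" * 6 + "CC" + "GG"
print(find_alpha_defensin_motif(synthetic))
```

A pattern of this kind only screens for cysteine spacing. It says nothing about the disulfide connectivity (C1-C6, C2-C4, C3-C5), which depends on the folded structure.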
To date, a number of α-defensins have been discovered in human, rhesus macaque, rabbit, guinea pig, mouse, and rat (6, 16, 21). For example, three myeloid α-defensin genes (DEFA1/2, DEFA3, and DEFA4) and two enteric α-defensin genes (DEFA5/HD5 and DEFA6/HD6) have been found in humans, and a group of 19 highly homologous enteric α-defensins (cryptdins) has also been reported in mice. All α-defensins are synthesized from 80–105 amino acid precursors, each composed of a short NH2-terminal signal sequence (~20 amino acids), an anionic prosegment (40–50 amino acids), and a COOH-terminal cationic mature peptide (30–35 amino acids) (6, 16, 21). The α-defensins are stored as either inactive proforms or mature active forms in the cytoplasmic granules of phagocytes and Paneth cells and released (and processed if necessary) in response to microbial infection or cholinergic stimulation (6, 16, 21). The enzyme responsible for processing and activation of intestinal α-defensins is the metalloproteinase matrilysin (MMP7) in mice (38) and trypsin in humans (7). The contribution of α-defensins to intestinal mucosal immunity has been best demonstrated by the evidence that transgenic mice expressing human DEFA5 (HD5) became fully resistant to oral lethal infection with Salmonella typhimurium (27) and that MMP7 deficiency rendered mice more susceptible to S. typhimurium infection (38).
The genomes of several evolutionarily divergent vertebrate species have been sequenced, including zebrafish, Japanese pufferfish (Fugu rubripes), chicken, dog, rat, mouse, chimpanzee, and human. Availability of such a large amount of sequence information provides a timely opportunity to search for the possible existence of α-defensins in these species. We hypothesized that identification of the entire repertoire of α-defensin genes across a range of animal species would reveal the origin and evolutionary relationships of this important gene family. This also serves as a first step to study the role of α-defensins in host immunity and explore their potential as novel antimicrobials. Following comprehensive genome-wide screening, here we report identification of the complete repertoires of α-defensin genes in the human, chimpanzee, mouse, and rat, with novel sequences being discovered in each species. However, no α-defensins have been found in any species other than glires (mouse, rat, guinea pig, and rabbit) and primates (human, chimpanzee, olive baboon, and rhesus macaque). In addition, we provide strong evidence that rapid duplication and positive diversifying selection of α-defensin genes have occurred following divergence of mammalian species.
| MATERIALS AND METHODS |
|---|
|
|
|---|
α-defensins.
α-defensin peptide sequences were individually queried against expressed sequence tags (EST), nonredundant sequences (NR), unfinished high-throughput genomic sequences (HTGS), and whole genome shotgun sequences (WGS) in GenBank by using the TBLASTN program (1) with the default settings on the National Center for Biotechnology Information (NCBI) web site (http://www.ncbi.nlm.nih.gov/BLAST). The vertebrate species examined included zebrafish, Japanese pufferfish, chicken, dog, rat, mouse, chimpanzee, and human. All potential hits were then examined for the presence of the characteristic α-defensin motif or a highly conserved signal/prosegment sequence. For every novel α-defensin sequence identified, additional iterative BLAST searches were performed as described above until no more novel sequences could be revealed. Because mammalian defensins tend to form clusters (17, 30, 31), all genomic sequences containing α-defensins were also retrieved from GenBank to discover potential novel sequences with distant homology. The nucleotide sequences between two neighboring defensin genes were translated in all six reading frames and individually compared with the two defensin peptide sequences for the presence of the α-defensin motif or signal/prosegment sequence by using BLASTP (1) on the NCBI web site (http://www.ncbi.nlm.nih.gov/blast/bl2seq/) and/or the ClustalW program (version 1.82) (35) on the European Bioinformatics Institute web site (http://www.ebi.ac.uk/clustalw).
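The six-frame translation step described above can be illustrated with a short, dependency-free sketch. It assumes the standard genetic code and is not the tooling used in this study; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to peptide strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


print(six_frame_translation("ATGTGCTAA"))
```

Each translated frame could then be scanned for the six-cysteine spacing or aligned against known signal/prosegment sequences, which is the comparison the text performs with BLASTP and ClustalW.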
Prediction of full-length coding sequences and genomic structures of α-defensins.
All known α-defensin precursors are encoded in two separate exons separated by a short intron of less than 2 kb, with one exon encoding the signal/prosegment sequence and the other exon encoding the mature peptide containing the six-cysteine α-defensin motif (22). If either the signal/prosegment sequence or the α-defensin motif of a novel gene was missing in a genomic sequence within a 2-kb distance, then an ~5-kb sequence flanking the α-defensin motif or signal/prosegment sequence was retrieved to identify the full-length coding sequence and to derive the structural organization of that novel α-defensin gene by using a combination of GenomeScan (41), GENSCAN (4), and/or GeneWise (3). In the case of the rat, all computational predictions of coding sequences, except for the pseudogenes, were also confirmed by cloning and sequencing of their respective RT-PCR products amplified from appropriate tissues.
Identification and characterization of α-defensin gene clusters.
To determine the relative position and orientation of each defensin in the genome, individual defensins were searched against the assembled genomes of human (NCBI build 35), chimpanzee (NCBI build 1, version 1), mouse (NCBI build 33), and rat (BCM version 3.1) released in May 2004, November 2003, May 2004, and June 2003, respectively. The BLAT program (14) was used for gene mapping through the UCSC Genome Browser (http://genome.ucsc.edu). The chromosomal locations of the α-defensin gene clusters of human, mouse, and rat were revealed by using the Map Viewer program (http://www.ncbi.nlm.nih.gov/mapview).
Sequence alignment and molecular evolutionary analysis.
Multiple sequence alignments were carried out by using the ClustalW program (version 1.82) (35) (http://www.ebi.ac.uk/clustalw). The neighbor-joining method (26) was used to construct the phylogenetic tree from the proportion of nucleotide or amino acid differences (p-distance), and the reliability of each branch was tested by 1,000 bootstrap replications. Pairwise comparisons of nucleotide sequences at the codon level were carried out by using the method of Nei and Gojobori (19) to estimate the number of nonsynonymous substitutions per nonsynonymous site (dN) and the number of synonymous substitutions per synonymous site (dS) with the Jukes-Cantor correction for multiple substitutions. Construction of the phylogenetic tree and pairwise comparison of nucleotide or amino acid sequences were carried out by using the MEGA software version 2.1 (15).
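The two distance measures named above can be sketched in a few lines. This is an illustration under stated assumptions, not the MEGA implementation. It assumes pre-aligned sequences of equal length with "-" as the gap character, and it omits the counting of synonymous and nonsynonymous sites that the Nei-Gojobori method performs before the correction is applied. The function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing positions, ignoring columns that contain a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor correction for nucleotides: d = -(3/4) * ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("correction is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


p = p_distance("ACGTACGT", "ACGTACGA")
print(p, jukes_cantor(p))
```

In the Nei-Gojobori framework the same correction is applied separately to the proportions of synonymous and nonsynonymous differences to obtain dS and dN.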
RT-PCR analysis of tissue expression patterns of α-defensins.
A total of 26 different tissue samples were harvested from healthy, 2-mo-old Sprague-Dawley rats. Bone marrow progenitor cells were collected from femur, followed by hypotonic lysis of erythrocytes as described (44, 45). Only lineages of white blood cells were used for RNA isolation. Total RNA was extracted using TRIzol (Invitrogen, Carlsbad, CA). A panel of 12 human tissue RNAs, including bone marrow, small intestine, kidney, trachea, lung, liver, heart, spleen, skeletal muscle, testis, prostate, and uterus, was also purchased from BD Biosciences Clontech (Palo Alto, CA) and used for evaluating the expression pattern of DEFA7. For each human and rat RNA, 4 µg were reverse transcribed with random hexamers and SuperScript II reverse transcriptase by using a first-strand cDNA synthesis kit (Invitrogen) according to the instructions. The subsequent PCR was carried out essentially as described (44, 45). Briefly, 1/40 of the first-strand cDNA from each tissue was used to amplify α-defensins and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) with gene-specific primers. Every pair of primers was designed from less conserved regions located on different exons to aid in specific amplification of target genes and in distinguishing PCR products amplified from cDNA vs. genomic DNA (Table 1). The PCR program used was: 94°C denaturation for 2 min, followed by varying numbers of cycles of 94°C denaturation for 20 s, 55°C annealing for 20 s, and 72°C extension for 40 s, followed by a final extension at 72°C for 5 min. The sequences of primers and numbers of PCR cycles used are shown in Table 1. The PCR products were analyzed by electrophoresis on 1.2% agarose gels containing 0.5 µg/ml ethidium bromide. The specificity of each PCR product was confirmed by cloning of the PCR product into a T/A cloning vector, followed by sequencing of the recombinant plasmid.
α- and θ-defensins described in this manuscript have been submitted to GenBank. Accession numbers are as follows: rat, Defa6–Defa11 (AY623750–AY623755), Defa12-ps–Defa14-ps (AY746422–AY746424), and Defa-rs1 (AY623756); mouse, Defcr20–Defcr25 (AY746425–AY746430), Defcr26 (AY766470), Defa-ps1 (AY746431), CRS1C-2 (AY761183), CRS1C-3 (AY761184), and CRS4C-6 (AY761185); human, DEFA7P–DEFA11P (AY746432–AY746436); chimpanzee, PTAD1–PTAD11-ps (AY746437–AY746445), and PTTD1-ps–PTTD2-ps (AY746446–AY746447); and olive baboon, PAAD1–PAAD3 (AY746448–AY746450).
| RESULTS |
|---|
|
|
|---|
α-defensins in rodents and primates.
α-defensin sequences in the vertebrate species whose genomes have been sequenced, all known α-defensin peptide sequences were individually queried against EST, NR, HTGS, and WGS sequences of zebrafish, Japanese pufferfish, chicken, dog, rat, mouse, chimpanzee, and human in GenBank by using the TBLASTN program (1). For every novel α-defensin sequence identified, additional iterative BLAST searches were performed until no more novel sequences could be revealed. All genomic sequences containing α-defensins were also retrieved from GenBank, translated in all six reading frames, and curated for the presence of the α-defensin motif or signal/prosegment sequence to reveal potential sequences with distant homology. Whenever necessary, a combination of gene prediction programs, including GenomeScan (41), GENSCAN (4), and/or GeneWise (3), was used to predict full-length coding sequences of novel α-defensins from genomic sequences.
As a result, a number of sequences have been discovered in the human, chimpanzee, rat, and mouse. In the case of human, five novel α-defensin genes were found in addition to five previously characterized ones, DEFA1 through DEFA6 (6, 21). Among the new α-defensin genes identified is DEFA7, which consists of a canonical six-cysteine α-defensin motif and an NH2-terminal signal/prosegment sequence highly homologous to known human α-defensins (Fig. 1A).
α-defensins, we sought to confirm whether it is a functional gene. However, RT-PCR analysis revealed no expression of DEFA7 in any of 12 human tissues examined, including small intestine, bone marrow, and kidney, the three most dominant sites of α-defensin production in other mammalian species. Although it may still be transcribed elsewhere, DEFA7 is most likely a pseudogene and therefore is designated as DEFA7P. Such results are also consistent with the fact that there is not a single EST sequence for DEFA7P in GenBank, despite nearly 5.8 million ESTs being deposited as of September 1, 2004.
In addition to DEFA7P, four α-defensin pseudogenes, namely DEFA8P–11P, have also been discovered in the human (Fig. 1A). Without a premature stop codon, each pseudogene would otherwise encode a peptide that is highly similar to a typical α-defensin (Fig. 1A). Identification of five additional α-defensin genes in the extensively studied human genome highlights the power of our computational search strategy.
A total of 11 novel α-defensins, including 4 pseudogenes, have also been identified in the chimpanzee genome. These genes, which were termed Pan troglodytes α-defensins (PTAD) 1–11, are all human α-defensin orthologs (Fig. 1A) and therefore are expected to show a tissue expression pattern similar to human genes. PTAD1 and PTAD4 are predicted to be neutrophil specific, whereas PTAD5 and PTAD6 are expected to be expressed mainly in Paneth cells of the small intestine. Interestingly, PTAD7, the chimpanzee ortholog of DEFA7P, appears to be a functional gene with the start codon at the consensus position (Fig. 1A), but its expression site(s) remain(s) to be studied. Four orthologs of human α-defensin pseudogenes are also present in the chimpanzee, namely PTAD8-ps to PTAD11-ps.
In addition to α-defensins, two pseudogenes for θ-defensin, PTTD1-ps and PTTD2-ps, have also been found in the chimpanzee (Fig. 1A). PTTD1-ps and PTTD2-ps are highly homologous to each other and to the known human θ-defensin pseudogene, DEFT1P (6, 16) (Fig. 1A), but they differ from DEFT1P in the locations of premature stop codons. DEFT1P contains two stop codons at amino acid positions 17 and 77; however, only the former stop codon is preserved in PTTD1-ps and PTTD2-ps.
Three new α-defensins, namely PAAD1–3, were also identified in two bacterial artificial chromosome (BAC) clones (accession no. AC116558 and AC116559) of olive baboon (Papio anubis) (Fig. 1A). PAAD1 and PAAD2 are orthologous to DEFA5, whereas PAAD3 is a DEFA7P homolog. However, unlike DEFA7P, which contains a single nucleotide mutation at the start codon, PAAD3 is composed of a stretch of completely different nucleotides around the start codon (see Supplemental Fig. S1, available online at the Physiological Genomics web site), and therefore it is also predicted to be a pseudogene. This prediction is further supported by the fact that, despite extensive experimental screening efforts (33, 34), no PAAD3-like sequences have been found to be produced in either jejunum or leukocytes of rhesus macaque, which, like olive baboon, is a cercopithecid (Old World monkey). Presumably, such PAAD3-like gene(s) are also inactive in rhesus macaque; however, this remains to be experimentally verified.
In the rat, a total of 14 α-defensin genes have been discovered, including four known ones (RatNP1/2, RatNP3, RatNP4, and RD5) (5, 42). Among 10 novel rat α-defensins (Defa), 6 genes (Defa6–11) encode peptide sequences varying from 87 to 104 amino acid residues in length with a canonical six-cysteine α-defensin motif (Fig. 1B). Also identified were three α-defensin pseudogenes, namely Defa12-ps, Defa13-ps, and Defa14-ps, having a premature stop codon at amino acid positions 34, 29, and 88, respectively (Fig. 1B). In addition, one α-defensin-related sequence (Defa-rs1) was also found in the rat. Despite the conservation in the NH2-terminal signal/prosegment sequence, Defa-rs1 has only five cysteine residues at the COOH terminus with no characteristic α-defensin motif (Fig. 1B). Similarly, two groups of cryptdin-related sequences (CRS1C and CRS4C) with signal/prosegment sequences highly homologous to α-defensins, but with a different number and spacing pattern of cysteines, were reported earlier in mice (13). Such CRS peptides were shown recently to form covalent homo- or heterodimers and are capable of killing bacteria (9). Therefore, it is tempting to speculate that Defa-rs1, the only α-defensin-related sequence in the rat, associates with itself by forming homodimers and participates in innate defense in the gut like other enteric α-defensins.
Among 19 α-defensins (cryptdins) reported in mice, 17 belong to the highly homogeneous cryptdin 1 (Defcr1) subgroup with fewer than 5 amino acid differences in the mature sequence, and many are believed to be derived from allelic variants of the same gene (22). Defcr4 and Defcr5 represent distant cryptdin family members, and their paralogous genes have not been reported in mice. Among eight novel α-defensins that we have identified in the mouse, five sequences (Defcr23–Defcr26 and Defcr-ps1) can be grouped to the Defcr1 subgroup, whereas the other three (Defcr20–Defcr22) are similar to Defcr4 (Fig. 1C). In addition, three novel cryptdin-related sequences were also found in mice (Fig. 1D). Among them, CRS1C-2 and CRS1C-3 are highly homologous to CRS1C-1, which contains a total of 11 cysteines with eight forming C-X-Y triplet repeats. On the other hand, CRS4C-6 is a distant member of the CRS4C subgroup. Instead of consisting of seven C-X-Y repeats, CRS4C-6 has an extra cysteine located close to the NH2-terminal end of the mature sequence. Unlike other CRS4C peptides that form intermolecular covalent dimers (9), CRS4C-6 is likely to function as a monomer or as noncovalently associated dimers/oligomers. Interestingly, no α-defensin sequences have been found in species other than glires and primates. This is perhaps not surprising, given the fact that no α-defensins have been reported in any other vertebrate species despite extensive searching efforts in the past (6, 21).
Tissue expression patterns of rat α-defensins.
Semiquantitative RT-PCR was used to analyze tissue expression patterns of rat α-defensins. As shown in Fig. 2, rat α-defensins exhibited two distinct patterns of expression. One group, which includes RatNP1/2, RatNP3, and Defa12-ps, is preferentially expressed in bone marrow with or without expression in small intestine (Fig. 2). Defa7, Defa10, and Defa11 also belong to this group, with low but specific expression in bone marrow (data not shown). In contrast, the other group, consisting of RD5, Defa6, Defa8, Defa9, and Defa-rs1, is specifically expressed in small intestine (Fig. 2). These genes are highly transcribed in jejunum and ileum with a low level of expression in duodenum. Interestingly, such a grouping is also consistent with the degree of conservation in the signal peptide sequences, which are more homologous within each group (top vs. bottom, Fig. 1B). It is surprising to see the expression of RatNP3 and Defa12-ps in both bone marrow and small intestine, given the fact that all known α-defensins are produced either by myeloid or Paneth cells, but not by both cell lineages (21). The cells that express rat α-defensins in the intestinal tract are presumably Paneth cells, which produce enteric α-defensins in humans and mice (21). It is worth noting that Defa-rs1 is expressed in the jejunum and ileum at levels comparable to most other enteric defensins (Fig. 2), reinforcing the notion that it is most likely functional in vivo.
α-defensins.
α-defensins in glires and primates including the newly identified ones, the proportion of amino acid differences (p-distance) of full-length peptides was calculated and used with the neighbor-joining method (26) (see Supplemental Fig. S2 for alignment of peptide sequences of the 82 α-defensins used). As shown in Fig. 3A, two major clusters are evident, with α-defensins of the primates, rabbit, and guinea pig forming one cluster supported by a bootstrap value of 65%, and mouse and rat enteric α-defensins forming the other cluster with a bootstrap value of 98%. Rat α-defensins appear to be unique in that the genes of myeloid origin formed a separate cluster from the genes of enteric origin. Rat myeloid α-defensins tend to group with primate, rabbit, and guinea pig α-defensins.
α-defensins formed species-specific clusters in the nonprimate species (rat, mouse, guinea pig, and rabbit), implying that these genes may have undergone repeated duplication after these species diverged from a common ancestor. In contrast, no species-specific clustering was demonstrated among the four primate species examined (human, chimpanzee, rhesus macaque, and olive baboon) (Fig. 3A). Instead, several distinct subclusters of α-defensins exist across the primate species. For example, chimpanzee and human are in complete analogy, with α-defensins present in all subclusters. The α-defensins of Old World monkeys are located in three subclusters with human and chimpanzee DEFA1–3, DEFA5, and DEFA7P, respectively. These results suggest that many of the primate genes likely evolved before the divergence of these species. The primate-specific θ-defensins, which exist in both cercopithecids and hominids, are clearly clustered together with the DEFA10P gene lineage, reinforcing the notion that θ-defensins evolved from α-defensins after divergence of the primates from other mammalian species (6, 16, 20, 21).
To minimize possible biases of selection pressure exerted on exon (and therefore peptide) sequences, the evolutionary relationships of the α-defensin genes were further evaluated by analysis of the intron sequences within the open reading frame. The p-distance of the introns of 59 α-defensins with known gene sequences was calculated and used with the neighbor-joining method (26) (the intron sequences used will be available upon request). Consistent with the results obtained from the full-length peptide sequences, two similar major clusters of genes were obtained, in which mouse and rat enteric α-defensins were separated from the others, supported by a bootstrap value of 100%, and hence most likely evolved from a different ancestor (Fig. 3B). On the other hand, rat myeloid gene introns are clearly clustered together with primate α-defensins and therefore likely originated from the same ancestral gene. As for rat and mouse enteric genes, there are two distinct subclusters, each supported by a bootstrap value of 100%, implying that mouse cryptdins (Defcr) share the same ancestor with rat RD5, Defa13-ps, and Defa14-ps, whereas mouse cryptdin-related sequences (CRS1C and CRS4C) were derived from the same primordial gene as rat Defa6, Defa8, Defa9, and Defa-rs1 (Fig. 3B). Evidently, each subcluster has undergone repeated duplication and diversification to give rise to multiple members in each species after these mammalian species diverged from a common ancestor.
Despite significant sequence conservation in the first exon (Fig. 1, C and D) and intron (data not shown), substantial differences exist in the cysteine motif encoded by the second exon between canonical α-defensins and three subgroups of related sequences in rats and mice, namely Defa-rs1, CRS1C, and CRS4C (Fig. 1, C and D). To further address the origin and evolution of the cysteine motif of α-defensin-related sequences, nucleotide sequences of the second exon were compared with rat enteric α-defensins (Defa6, Defa8, and Defa9) in the same subcluster (Fig. 3B). It is evident that Defa-rs1 shares high homology with the three rat α-defensins throughout the entire coding region in the second exon (Fig. 4), suggesting that, in addition to the first exon and intron, the second exon of Defa-rs1 also originated from the same ancestral gene as rat enteric α-defensins. In fact, the change in the number and pattern of cysteines in Defa-rs1 was mainly due to nucleotide deletions (Fig. 4).
α-defensins and Defa-rs1, patches of identical nucleotides are evident among them throughout the entire second exon (Fig. 4), implying that, similar to Defa-rs1, the changes in the second exon of CRS genes were likely a result of sequence diversification rather than exon shuffling, which commonly occurred during the evolution of many gene families (23). Coupled with the fact that both the intron and the first exon are highly conserved among classic α-defensins and related sequences, the full-length CRS genes most likely were derived from the same ancestral genes as rat enteric α-defensins and Defa-rs1. Apparently, CRS genes and Defa-rs1 evolved independently from classic α-defensins in mice and rats, respectively, following divergence of these two rodent species from each other.
Collectively, our results suggest that all mammalian α-defensins may have evolved from two ancestral genes, each giving rise to one major cluster of daughter genes. The absence of α-defensins in nonmammalian species indicates that α-defensins evolved after the divergence of mammals from other vertebrate species. Formation of species-specific clusters, particularly in nonprimate species, strongly suggests that many α-defensins have evolved independently after these mammalian species diverged from a common ancestor. It is worth noting that, because of the lack of complete repertoires of α-defensin genes in the rabbit and guinea pig, the origin of α-defensins in these two species remains inconclusive. Analysis of coding sequences placed them in the same cluster as primate genes (Fig. 3A). However, comparison of intron sequences indicated that guinea pig myeloid defensins (GNCP1A, GNCP1B, and GNCP2) and rabbit kidney defensins (RK1 and RK2) may share the same ancestor with rat and mouse enteric genes, whereas rabbit myeloid defensins (NP1 and NP2) likely evolved from the ancestral genes for primates (Fig. 3B).
To further understand the driving force for sequence divergence of α-defensins during evolution, we tested whether positive Darwinian selection has occurred by estimating the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN) for different regions of mammalian α-defensins using the method of Nei and Gojobori (19). We computed these values for 11 pairs of closely related genes, which were phylogenetically and thus statistically independent of each other (24). Consistent with earlier results (10, 12), mean dN was significantly greater than mean dS in the mature peptide, but there was no significant difference between mean dS and mean dN in the signal peptide or prosegment (Table 2). These results support the hypothesis that natural selection has acted to diversify the functionally active mature defensin region but not other portions of the molecules (10, 12).
α-defensin gene clusters.
α-defensin genes in the human, chimpanzee, rat, and mouse genome (data not shown). To determine the relative position and orientation of each defensin on the chromosome, individual defensins were searched against the assembled human, chimpanzee, rat, and mouse genomes by using the BLAT program (14) through the UCSC Genome Browser. In the case of human, all 11 known and newly identified α- and θ-defensin genes were found to form a continuous cluster spanning 132 kb on chromosome 8p23 (Fig. 5A). All human defensins are transcribed in the same direction from centromere to telomere, with the θ-defensin pseudogene (DEFT1P) residing in the center. The chimpanzee genome encodes an α-defensin cluster of 10 distinct genes, which spans 117 kb in the proximal region of chromosome 7 (Fig. 5A). The α-defensin clusters in the chimpanzee and human are in nearly perfect synteny. The only difference is the presence of two copies of DEFA1 and DEFT1P in the human rather than a single copy in the chimpanzee. Such syntenic α-defensin loci are also present in rats and mice. In the case of rat, all 14 α-defensin genes are clustered within a 311-kb distance on chromosome 16q12.4-q12.5 (Fig. 5B), whereas six mapped mouse α-defensin genes are located continuously on chromosome 8A1.3 (Fig. 5C).
α-defensin genes in both rodents and primates reside within a β-defensin cluster (data not shown) and are adjacent to β-defensin 1 (Defb1), which is evolutionarily conserved across the mammalian species (Fig. 5). Therefore, the syntenic location and physical proximity of α-defensins in these species provided additional evidence to support the conclusion that α-defensins have arisen from a common ancestor(s) by gene duplication followed by diversification.
Comparison of the open reading frames with genomic sequences of all known and newly identified α-defensins (including α-defensin-related sequences) revealed a highly conserved gene structure, which is composed of two exons separated by a 500- to 700-bp intron, except for DEFA5/PTAD5, DEFA6/PTAD6, RK1, RK2, PAAD1, and PAAD2, whose introns vary from 848 to 1,646 bp (data available upon request). Consistent with earlier findings, the signal sequence and most of the prosegment of α-defensins are encoded in the first exon, whereas the second exon primarily encodes the mature α-defensin sequences. However, based on the α-defensin genes studied thus far, it appears that genes of myeloid origin contain an additional exon encoding the 5'-untranslated region, whereas enteric genes are composed of only two exons (21). The lack of full-length cDNA sequences for these newly identified sequences prevented us from predicting the genomic locations of 5'- and 3'-untranslated regions and thus the possible existence of additional introns.
| DISCUSSION |
|---|
α-defensins.
Complete repertoires of the α-defensin gene family, including a number of novel genes, have been identified in the human, chimpanzee, rat, and mouse (Fig. 1). Although it is highly unlikely, we cannot rule out the possibility that additional α-defensin genes with distant homology might be uncovered in these species by different computational methods, such as hidden Markov models (18, 30, 31). Alignment of all known α-defensin peptide sequences revealed conservation of the signal and negatively charged prosegment sequences as well as the characteristic six-cysteine α-defensin motif, with the exception of pseudogenes and α-defensin-related sequences (Fig. 1). Given the importance of the six cysteines in maintaining spatial structure and biological function, it is notable that most pseudogenes have at least one mutation at a conserved cysteine position (Fig. 1), consistent with their nonfunctional nature.
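As a rough illustration of the cysteine check described above, the following Python sketch counts cysteine residues in a peptide and flags sequences that do not carry exactly six. The function names and both toy sequences are invented for illustration and are not taken from Fig. 1. Counting alone is a crude screen: it ignores the spacing of the cysteines, and a flagged sequence is only a candidate for closer inspection, not a confirmed pseudogene.

```python
def count_cysteines(peptide: str) -> int:
    """Return the number of cysteine (C) residues in a one-letter peptide string."""
    return peptide.upper().count("C")


def has_six_cysteines(peptide: str) -> bool:
    """True if the peptide carries exactly six cysteines (spacing is not checked)."""
    return count_cysteines(peptide) == 6


# Invented toy sequences (hypothetical, not real defensins).
toy_intact = "ACYCRAPACLAGERRYGTCFYRGRLWAFCC"
# Same toy sequence with the third cysteine replaced by serine.
toy_mutant = "ACYCRAPASLAGERRYGTCFYRGRLWAFCC"

for name, seq in [("toy_intact", toy_intact), ("toy_mutant", toy_mutant)]:
    print(name, count_cysteines(seq), has_six_cysteines(seq))
```

A stricter screen would also compare cysteine positions against an alignment, which this sketch deliberately leaves out.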
Among all newly identified genes in the primates, chimpanzee PTAD7 is unique in that it represents a distant, putatively functional gene. Unlike a typical α-defensin, which is positively charged, the putative mature sequence of PTAD7 contains two arginines (R) and two glutamic acids (E) and is therefore predicted to have no net positive charge under physiological conditions. However, its human ortholog, DEFA7P, appears to be a pseudogene because it lacks a start codon and is not transcribed in a wide range of tissues. Interestingly, PAAD3, a DEFA7P ortholog in olive baboon, also appears to be a pseudogene with no start codon at the canonical position (see Supplemental Fig. S1). The unique sequence and evolutionary dynamics of the PTAD7 gene lineage in different primate species suggest that PTAD7 might function differently in the chimpanzee from other α-defensins. Consistent with its novelty, PTAD7 is the only known putatively functional α-defensin with a nonsynonymous substitution to serine at the fourth canonical cysteine position (Fig. 1A).
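The charge argument above is simple residue arithmetic, which can be sketched as follows. This is a minimal approximation that assumes each lysine or arginine contributes +1 and each aspartic or glutamic acid contributes -1 at physiological pH, and that ignores histidine and the peptide termini. The function name and the toy sequences are hypothetical; they are not the PTAD7 sequence.

```python
POSITIVE = {"K", "R"}  # residues counted as +1 (assumption: fully protonated)
NEGATIVE = {"D", "E"}  # residues counted as -1 (assumption: fully deprotonated)


def approximate_net_charge(peptide: str) -> int:
    """Crude net charge: (K + R) minus (D + E); histidine and termini are ignored."""
    seq = peptide.upper()
    positives = sum(residue in POSITIVE for residue in seq)
    negatives = sum(residue in NEGATIVE for residue in seq)
    return positives - negatives


# Invented toy sequences (hypothetical).
toy_balanced = "ACRCEGTCRVECAFCC"  # two R and two E cancel out
toy_cationic = "ACRCRGTCRVKCAFCC"  # three R and one K, no acidic residues

print(approximate_net_charge(toy_balanced))
print(approximate_net_charge(toy_cationic))
```

Under these assumptions a peptide with two arginines and two glutamic acids comes out at zero, which mirrors the reasoning given for PTAD7.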
As opposed to many β-defensins, most α-defensins contain no or only one amino acid following the last two cysteines (Fig. 1). However, Defa8, Defa9, and Defa11 in the rat share an unusually long COOH-terminal tail (9 to 14 amino acids) rich in charged and polar uncharged amino acids. Such COOH-terminal tails, which are predicted to be exposed on the surface and to influence the overall charge and amphipathicity, might confer a different antimicrobial spectrum and/or efficacy on those peptides, as net charge and amphipathicity are strongly correlated with the antimicrobial activity of defensins (25, 28, 32). Further studies are warranted to assess the importance of such long COOH-terminal tails in antimicrobial activity as well as in other biological functions.
Evolution of mammalian α-defensins.
All α-defensins have been found only in certain mammals, and not in any other vertebrate species that we examined, suggesting that α-defensin genes appeared after mammals diverged from other vertebrates. The presence of α-defensin loci in syntenic chromosomal regions of different mammalian species (Fig. 5) is indicative of a common ancestor. Phylogenetic analyses of both full-length peptide and intron sequences of α-defensins revealed two distinct clusters (Fig. 3), implying that they may have evolved independently from two separate ancestral genes. One ancestral gene has undergone significant duplication and diversification, giving rise to the enteric-specific α-defensins in the rat and mouse, whereas the other ancestral gene has given rise to the genes in the primates as well as those expressed in rat myeloid cells.
Phylogenetic analysis of primate α-defensins revealed an interesting evolutionary pattern and pointed to the possible presence of subgroups of α-defensins that are specific to the hominid lineage. Despite extensive searches for the repertoire of α-defensins in both leukocytes (34) and small intestines (33), neither DEFA4- nor DEFA6-like genes have been found in rhesus macaques (Figs. 1A and 3). Such genes are also absent from the nearly 1,000 olive baboon genomic sequences deposited in GenBank as of September 1, 2004. Therefore, it is likely that the DEFA4/PTAD4 and DEFA6/PTAD6 lineages are hominid-specific. On the other hand, although only a single copy of the DEFA5/PTAD5 gene is encoded in the human and chimpanzee genomes, multiple DEFA5-like genes are present in both rhesus macaque (RED1 to RED6, RMAD4/5, and RMAD6/7) and olive baboon (PAAD1 and PAAD2) (Fig. 3A). Apparently, the ancestral gene for DEFA5 has followed a different evolutionary path in the two groups: it duplicated and expanded in cercopithecids but remained a single copy in hominids. In addition, the DEFA5 ancestral gene has diverged in rhesus macaques such that certain descendant genes are expressed in myeloid cells (e.g., RMAD4/5 and RMAD6/7), in addition to those expressed in intestinal Paneth cells (e.g., RED1 to RED6) as in humans.
With regard to the evolution of mammalian α-defensins, Bevins et al. (2) proposed a model, based on dot matrix sequence comparisons of introns of a few mouse and human genes, in which α-defensins were derived from two ancestral genes for human enteric DEFA5 and DEFA6, and a subsequent homologous unequal meiotic crossover of DEFA5 and DEFA6 generated a hybrid gene that further evolved into present-day myeloid α-defensins. Although such a model might hold in primate species (2), our results clearly argue against its generalization to nonprimate species, because the mouse genome apparently encodes neither DEFA5/DEFA6-like genes nor myeloid α-defensins. Therefore, mouse α-defensin genes were apparently not derived from homologous crossover of the primordial genes for DEFA5 or DEFA6. Furthermore, because cercopithecids have multiple DEFA5 genes expressed in both small intestine and leukocytes but lack DEFA6 lineages, the validity of the Bevins evolutionary model in primate species remains to be examined as additional primate genomes become available.
As for the origin of mammalian α-defensins, they most likely evolved from β-defensins after mammals diverged from other vertebrates, as indicated by their physical proximity on the chromosome and their similarity in spatial structure and biological activities (6, 39). As suggested by Nguyen et al. (20), the appearance and further diversification of α- and θ-defensins from an already large β-defensin gene family probably reflects the need for antiviral defense in certain mammalian and/or primate species, as these newly evolved α- and θ-defensins have acquired novel lectin-like activity and are capable of inhibiting the entry of viruses (such as HIV) into host cells (36, 37). It is possible that divergence of α-defensin-related sequences from canonical α-defensins in rodents could lead to novel activities beyond antimicrobial action. Further functional characterization of these α-defensins will shed light on the significance of their diversification during evolution and facilitate their development as a new class of antimicrobial agents.
Our earlier comparative analysis of β-defensins in the chicken, rodents, and human revealed that most gene lineages are conserved across mammalian species and thus evolved before the divergence of mammals from each other (39). In contrast, α-defensins tend to form species-specific clusters, particularly in nonprimate species (Fig. 3). Distinct subsets of α-defensin genes also exist in each species, with most subsets containing multiple members. This holds even for the rat and mouse (Fig. 3), which diverged only 17 to 25 million years ago. It is striking that most α-defensin genes duplicated very recently in evolutionary time and have undergone significant but independent expansion in different species. Clearly, the β-defensin gene family is evolutionarily older than the α-defensin family. Compared with β-defensin genes, α-defensin genes duplicated and expanded much more rapidly. Calculation of the rates of synonymous vs. nonsynonymous substitutions for 11 pairs of representative α-defensins revealed that positive Darwinian selection appears to have acted to diversify α-defensins, particularly in the mature peptide region (Table 2).
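The comparison of synonymous and nonsynonymous rates described above is commonly summarized as the ratio of nonsynonymous substitutions per nonsynonymous site (dN) to synonymous substitutions per synonymous site (dS). The Python sketch below shows only this final interpretation step; it assumes dN and dS have already been estimated by some external method, and the numbers used are illustrative placeholders, not values from Table 2. The function names are hypothetical.

```python
import math


def dn_ds_ratio(dn: float, ds: float) -> float:
    """Return dN/dS; a zero or negative dS makes the ratio undefined."""
    if ds <= 0:
        raise ValueError("dS must be positive for the ratio to be defined")
    return dn / ds


def classify_selection(dn: float, ds: float) -> str:
    """Label a gene pair by its dN/dS ratio (no significance test is applied)."""
    ratio = dn_ds_ratio(dn, ds)
    if math.isclose(ratio, 1.0):
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


# Illustrative placeholder values (hypothetical, not from the paper).
print(classify_selection(0.30, 0.10))  # dN exceeds dS
print(classify_selection(0.05, 0.20))  # dS exceeds dN
```

A ratio above 1 is only suggestive on its own; in practice the difference between dN and dS would be tested for statistical significance before positive selection is inferred.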
Gene duplication followed by positive selection has indeed been observed in several gene families involved in immune responses (11). Divergence of these immune genes often leads either to an additional layer of functional redundancy or to the acquisition of functional novelties, both of which conceivably help the hosts cope more effectively with a broad range of pathogens. In fact, α-defensins have been shown to exhibit selectivity against different microorganisms, and a modest difference in the primary sequence can have a significant impact on the antibacterial spectrum and/or potency (22). Therefore, as different mammals live in quite different ecological niches, the production of species-specific α-defensins would presumably allow them to respond better to the specific microbial challenges that they face. The presence of α-defensin-related sequences in rodents and the functional divergence of θ-defensins from α-defensins in certain primate species provide additional evidence supporting this notion.
| GRANTS |
|---|
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Address for reprint requests and other correspondence: G. Zhang, Dept. of Animal Science, 212D Animal Science Bldg., Oklahoma State Univ., Stillwater, OK 74078 (E-mail: zguolon{at}okstate.edu).
10.1152/physiolgenomics.00150.2004.
1 The Supplemental Material (Supplemental Figs. S1 and S2) for this article is available online at http://physiolgenomics.physiology.org/cgi/content/full/00150.2004/DC1.
| REFERENCES |
|---|