Geoscience Research Institute

PSEUDOGENES AND ORIGINS

L. J. Gibson
Geoscience Research Institute

Modified from Origins 21(2):91-108 (1994)

ABSTRACT - What this article is about.

Pseudogenes are DNA sequences that resemble functional genes but seem to have no purpose. The presence of similar eta globin pseudogenes in humans and chimps has been used as an argument for common ancestry of the two species. The argument has two parts: that the pseudogene sequences actually have no function, and that God would not create similar non-functional sequences in humans and chimps. The latter argument is theological and resembles many other theological arguments that have been proposed and later abandoned. Theological arguments should not be relied on unless well supported by Scripture.
    The argument that the eta globin pseudogene has no function is consistent with most of the data, although lack of function has not been demonstrated. Possible function is suggested by the location of the pseudogene and differences in the extent of divergence of "intronic" and "exonic" sequences. The possibility that the eta globin pseudogene provides a binding site for a molecule involved in gene regulation has not been ruled out. At present, the evidence from pseudogenes fits reasonably well into an evolutionary interpretation, for those who choose to make that interpretation. However, there is much about the operation of the genome in general, and pseudogene sequences in particular, that is not well understood. Rapid progress is being made in understanding how the genome operates, and it is reasonable to expect that greater understanding of the meaning of pseudogenes will be forthcoming.


Introduction

    Theists and naturalists have long argued over whether nature provides evidence of design. Many theists have claimed that nature is so designed that one can infer the existence of a designer. Some theists have made the stronger claim that nature reveals a designer who is the Creator God revealed in the Bible. Many examples of apparent design have been described, ranging from the non-random properties of the universe to the intricate mechanism of a living cell. To the theist, these features speak clearly of the existence of an intelligent Creator who created with a purpose in mind.
    Naturalists have responded with arguments of their own. One argument that is currently popular is the claim that many features in nature are not designed very well. It is affirmed that such poor design indicates either an inferior designer or no designer at all. Several examples of allegedly poor design have been proposed (e.g., Miller 1994). One of the most difficult examples for theists to explain is probably the existence of certain DNA sequences known as pseudogenes. This paper will explore some of the characteristics of pseudogenes and their relationship to the argument for or against design.

What are pseudogenes?

    Ordinary structural genes are made of DNA sequences that contain coded information for making a particular protein molecule. The information includes a start signal, coding for the sequence of amino acids needed to make up the protein, and a stop signal. Additional signals that regulate the timing of gene activity are found adjacent to the gene, and often also at some distance from it. The amino acid-coding sequence is often broken up into portions known as "exons," which are separated by spacer sequences known as "introns." Pseudogenes are DNA sequences that appear similar to functional genes, but contain important defects that appear to make them incapable of producing a functioning protein (Proudfoot 1980). Defects of pseudogenes may include lack of a start codon, presence of extra stop signals, and abnormal or absent flanking regulatory elements. It is thought that mutations in pseudogenes are neutral, and hence free from selection. The first report of a pseudogene was in 1977 (Jacq, Miller and Brownlee 1977). Since that time, a large number of pseudogenes have been described in humans and a wide variety of other species. The supposed defects of pseudogenes have been used as an argument that nature is too poorly designed to attribute its existence to special creation by a supernatural Designer (Miller 1994).
    Two types of pseudogenes are known: unprocessed pseudogenes and processed pseudogenes. Processed pseudogenes are found on different chromosomes from their functional counterparts. They are called "processed" because they appear to be altered copies of active genes. They lack introns (spacer sequences within a gene) and certain regulatory sequences located in front of the gene, they often terminate in a series of adenines, and they are flanked by direct repeats. ("Direct repeats" are associated with movable genetic elements, which may in some cases play a role in inserting a pseudogene into a chromosome.) Processed pseudogenes may be complete copies of the coding sequence, or may be incomplete copies, or may have additional inserted sequences. They seem to be present only in mammals (Vanin 1985). Processed pseudogenes are believed to have arisen in a three-step process. The first step is copying of the DNA message into an RNA transcript. The introns are then edited out of this transcript to produce a messenger RNA (mRNA) molecule. Finally, the mRNA is copied back into a chromosome in a process called reverse transcription (see Vanin 1985 for review; see Tchenio et al. 1993 for an example). The L1 family of repetitive DNA sequences appears to be the result of this process (Jurka 1989).
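    The three-step process proposed for processed pseudogenes can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the segment sequences, the function names and the poly-A tail length are all invented for this example and do not come from any real gene. Reverse transcription is simplified to show the copy on the coding strand rather than the complementary strand.

```python
# Toy model of the three-step origin proposed for processed pseudogenes.
# All sequences are invented for illustration; they are not real gene data.

# A hypothetical gene written as (kind, DNA sequence) segments on the coding strand.
GENE_SEGMENTS = [
    ("exon", "ATGGTGCAC"),
    ("intron", "GTAAGTTTCAG"),
    ("exon", "CTGACTCCT"),
    ("intron", "GTGAGTCCTAG"),
    ("exon", "GAGAAGTAA"),
]


def transcribe(segments):
    """Step 1: copy each DNA segment into RNA (T becomes U)."""
    return [(kind, seq.replace("T", "U")) for kind, seq in segments]


def splice(transcript, poly_a_length=8):
    """Step 2: drop introns, join exons, and add a poly-A tail to give mRNA."""
    exons = "".join(seq for kind, seq in transcript if kind == "exon")
    return exons + "A" * poly_a_length


def reverse_transcribe(mrna):
    """Step 3: copy the mRNA back into DNA (U becomes T).

    Simplification: the result is written on the coding strand.
    """
    return mrna.replace("U", "T")


def make_processed_copy(segments, poly_a_length=8):
    """Run the three steps in order and return the intron-free DNA copy."""
    return reverse_transcribe(splice(transcribe(segments), poly_a_length))


processed_copy = make_processed_copy(GENE_SEGMENTS)
print(processed_copy)
```

    In this sketch the resulting copy has the two features described above: the intron segments are absent, and the sequence ends in a run of adenines.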
    Unprocessed pseudogenes are usually found within clusters of similar, functional sequences on the same chromosome (Harris et al. 1984). They typically have "introns" and flanking regulatory sequences resembling a functional gene. As with processed pseudogenes, expression of an unprocessed pseudogene is generally prevented by stop codons. Numerous other differences interpreted as deletions, insertions and point mutations may also be present. A truncated mRNA transcript may or may not be produced. Unprocessed pseudogenes are found in a wide variety of organisms. They are believed to have arisen by gene duplication, which produced an extra copy of the gene. The extra copy, not being needed, could accumulate mutations without harming the organism. Examples of unprocessed pseudogenes are present in the alpha-globin and beta-globin gene families (e.g., see Hardison and Miller 1993 and references therein).

The argument from shared mistakes

    When genes for equivalent proteins are compared in different species, they are often found to differ in sequence. In general, the more similar two species are taxonomically the more similar are their DNA sequences, both in general and for specific enzymes. Exceptions do occur, but the overall pattern is easily recognized. Two explanations have been proposed for the observed pattern of similarities in molecular sequences.
    One explanation for sequence similarities is that they are inherited from an evolutionary ancestor. Genes are similar because they are both inherited from a common ancestor. Sequence differences are attributed to accumulation of mutations since the species diverged from their common ancestor. A second, contrasting, explanation is that sequence similarities are due to common design for a similar function. Sequence differences may reflect functional differences, such as might be required for protein function in different metabolic environments, or regulatory function in different genetic backgrounds. It seems unlikely that sequence similarities could be due to chance, but some have been interpreted in this way (e.g., Djian and Green 1992).
    Similarities in functional sequences for the same protein in different organisms are to be expected, since they perform similar functions; however, what about similarities in sequences, such as pseudogenes, that seem to have no function? Pseudogenes are commonly thought to be flawed copies of functional genes. It has been argued (Max 1987, Gilbert 1993, Miller 1994) that similar pseudogene sequences shared by two or more species are best explained as the result of common ancestry, assuming that an intelligent designer would not repeatedly make mistakes in creating genes. This can be called the "argument from shared mistakes."
    Comparison of DNA sequences from humans, chimps and other mammals reveals a considerable number of shared pseudogenes that are similar in sequence as well as in positional relationship to other genes. Humans and chimps have many similarities; this is interpreted as indicating a recent common ancestry for humans and chimps (Gilbert 1993). The best known example of a shared pseudogene is the eta globin (psi beta globin) gene, a member of the beta globin gene family.

The beta globin family and the (eta globin) pseudogene in humans

    Human hemoglobin molecules are made of two sets of proteins, produced by the alpha globin genes and the beta globin genes. Both beta globin and alpha globin genes occur in "families" of non-identical copies. The beta globin gene family is located on the short arm of human chromosome 11 (11p15.5), near the gene for insulin (Lalley et al. 1989). A family of alpha globin genes is also present in mammals, but it is located on a different chromosome (16p13).
    The beta globin gene cluster consists of five somewhat similar functional genes and one pseudogene. The five functional genes are arranged on the chromosome in a sequence that corresponds to the sequence of timing of their respective functions during growth and development. The first gene in the series is the "epsilon globin" gene, which helps form hemoglobin molecules early in embryonic development. The second and third genes are called "gamma-G" and "gamma-A." They help form hemoglobin molecules later during fetal development. The "eta globin" pseudogene is next in sequence, followed by the "delta" globin gene, which is expressed at a low rate in adults. The last gene in the series is the "beta" globin gene, which produces most of the adult beta globin, and gives the gene family its name. As the adult globin genes become functional, the fetal genes are turned off. The fact that the sequence of the genes on the chromosome matches the sequence of their activity in the developing organism seems unlikely to be the result of chance, and can easily be interpreted as the result of intelligent design.
    The eta globin sequence has several characteristics of pseudogenes. It resembles the other members of the beta globin gene family, but is most similar to the gamma-A globin gene. However, it has some important differences. Compared with the gamma-A globin gene, the eta globin pseudogene lacks a start codon (AUG) in the appropriate position. It also has numerous extra stop codons which would be expected to prevent production of any protein. No mRNA transcript or protein product has been identified, and it appears that none is produced. No medical defect is known that is traceable to the loss of this pseudogene. In short, the eta globin sequence is not associated with any known function or defect, and appears to be incapable of producing a useful molecule.
    The beta globin gene family is also found in other mammals. Sequences of the human gamma-A globin gene and eta globin pseudogenes from humans and several other species have been compared (Chang and Slightom 1984). The human gamma-A globin gene contains three exons (portions of the DNA that code for amino acids) of 92, 223 and 129 nucleotides, respectively, for a total of 444 nucleotides. The corresponding "exons" of the human eta globin pseudogene differ from the gamma-A globin gene exon sequences in 29, 38 and 43 nucleotide positions, respectively, for an overall difference of 24.8%. The gamma-A globin gene has two introns, of 122 and 877 bases, respectively. These differ from the "intron" sequences of the eta globin pseudogene by 46-79% and 72-94%, respectively (my figures differ somewhat from those of Goodman et al. 1984, probably due to problems in aligning the sequences). The gamma-A-globin exons and pseudogene "exons" are more similar to each other than expected from random sequences, while the "intronic" sequences are so different that no relationship among them can be inferred.

Comparisons of eta globin pseudogenes in humans and other primates

    The arrangement of the beta globin gene family in other primates is very similar to that in humans (Harris et al. 1984). Humans, chimpanzees and gorillas have the same number of beta globin genes arranged in the same sequence. In chimpanzees, the beta globin group is on chromosome 9, which is equivalent to human chromosome 11 (Lalley et al. 1989). Baboons have a similar arrangement, but the delta globin gene appears non-functional, and is classified as a pseudogene. The New World owl monkey has only one gamma globin gene, with a possible partial second gene (Meireles et al. 1995), but the arrangement of genes is otherwise the same as in humans. This is true also for the galago ("bush baby"; Hardison and Miller 1993). Among non-primates, the rabbit has only one gamma globin gene and lacks the eta globin pseudogene, while its delta globin gene appears to be a pseudogene.
    The DNA sequences of the eta globin pseudogene exons in humans, chimpanzees and gorillas are similar. The chimpanzee eta globin pseudogene exonic DNA differs from the human eta globin pseudogene at six nucleotide positions and from the corresponding gorilla pseudogene at seven positions. The gorilla pseudogene exonic DNA has three differences from humans and seven from chimpanzees. This means that chimpanzee and gorilla eta globin exon sequences are both slightly more similar to the human pseudogene than to each other.
    It is clear that the "exon" portions of the eta globin pseudogenes in humans, chimps and gorillas are highly similar. None of their differences involves any of the eight stop codons in the pseudogenes. Several potential initiation codons (AUG) are present, and one of the differences in the chimpanzee produces an additional potential initiation codon in the second exon. However, none of these is sufficient to support protein coding function.

Gene duplication hypothesis

    If evolution is to occur, new genes must somehow be produced. The most popular explanation for the evolution of new genes is that they are modified from extra copies of existing genes. This explanation is known as the gene duplication hypothesis (Ohno 1970). According to this hypothesis, functional genes may be duplicated accidentally. The duplicate gene is not needed by the organism. Both copies of the gene may be subject to selection until one of them suffers a disabling mutation, such as a premature stop signal. This disables the gene so it no longer has any function, and is no longer subject to natural selection. It has become a pseudogene, and all subsequent mutations are neutral. Over time, mutations accumulate in the pseudogene. Eventually, according to the theory, random mutations may produce a new gene with a new function (e.g., see Long and Langley 1993).
    The gene duplication hypothesis, although widely accepted, is not without some theoretical and empirical difficulties. Assuming the original gene had been optimized by selection, mutations in the coding region of the duplicated gene prior to a disabling mutation would likely result in production of inferior protein molecules. Individuals with one gene that produced inferior protein products would likely be selected against. Spread of a duplicated gene should be difficult under these conditions. This problem could be reduced if mutations destroyed the function of the extra gene copy early in its history. However, there are only three stop codons, while there are 61 codons for amino acids. One would expect mutations resulting in destruction of function to be much less common than those resulting in production of variant proteins, most of which could be expected to be inferior. Selection may also oppose maintenance of a pseudogene, since it may retain enough activity to disrupt normal cellular activities. Some pseudogenes are suspected to be involved in causing certain diseases (e.g., Wedell and Luthman 1993, Brakenhoff et al. 1994), which should result in negative selection against them. Thus, establishment and maintenance of a pseudogene by gene duplication may require a rather special sequence of events.
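    The comparison between stop codons and amino acid codons can be made more concrete by enumerating every single-nucleotide substitution in the standard genetic code. The sketch below is a rough illustration only. It assumes the standard code, treats all substitutions as equally likely, and ignores insertions, deletions and mutations outside the coding region, any of which could also disable a gene. The function and variable names are my own.

```python
from itertools import product

# Standard genetic code, listed in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitutions(table):
    """Count single-base changes in amino acid codons by their effect."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in table.items():
        if aa == "*":
            continue  # start only from codons that specify an amino acid
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = table[mutant]
                if new_aa == "*":
                    counts["nonsense"] += 1    # creates a premature stop
                elif new_aa == aa:
                    counts["synonymous"] += 1  # same amino acid
                else:
                    counts["missense"] += 1    # different amino acid
    return counts


sense_codons = sum(1 for aa in CODON_TABLE.values() if aa != "*")
counts = classify_substitutions(CODON_TABLE)
total = sum(counts.values())
print(sense_codons, counts, round(100 * counts["nonsense"] / total, 1))
```

    Under these assumptions, 23 of the 549 possible single-base changes in amino acid codons create a stop codon, or about 4%, which is in line with the expectation stated above.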
    Walsh (1995) has calculated the theoretical conditions thought necessary for establishing the presence of a pseudogene in a population, assuming the pseudogene arose randomly by mutation. Establishment requires a high proportion of favorable mutations, a large number of reproducing individuals in the population, and a high selection coefficient. It seems doubtful that these calculations can explain the frequency of pseudogenes in living species, and some other explanation would be preferred.
    Another problem for the gene duplication hypothesis is that the existence of duplicate copies of a gene does not necessarily permit one of the copies to diverge from the others. For example, seven copies of the "Enhancer of split" gene are present in Drosophila, but it appears that none of them is free to mutate (Maier et al 1993). The "duplicated copies" are not extra, but all seem to be required. Many genes occur in multiple copies that remain similar to each other rather than diverging. This has been explained as due to a process known as gene conversion, in which one DNA sequence is "converted" during copying to match another sequence. This may result in maintenance of similarity among several copies of a sequence. The situation in which multiple copies of a sequence maintain close sequence similarity is known as "concerted evolution" (e.g., Moore et al 1993). Concerted evolution would tend to prevent divergence of duplicated genes, thus presenting a problem for the gene duplication hypothesis. Another problem with the gene duplication hypothesis is that tetraploid species have far fewer pseudogenes than would be expected (Larhammar and Risinger 1994).
    Despite some difficulties in attributing evolution of new information to gene duplication, there seems to be evidence that gene duplication does occur. An apparent example of parallel gene duplications in flies has been described (in Menotti et al. 1991).

Beta globin genes and the gene duplication hypothesis

    It is thought that the eta globin pseudogene originated by duplication of a gamma-A globin gene, because of the similarity in their sequences. Both genes are present in all primates studied. Other mammals may have one or the other of the two genes. For example, gamma globin, but not eta globin, genes are present in rabbits; goats have eta globin but not gamma globin genes (Hardison and Miller 1993); the opossum has neither (Goodman et al 1987).
    It would be useful to review the evolutionary explanation for the distribution of eta globin genes in mammals. The proposed explanation is that the common ancestor of marsupials and placental mammals lacked both genes. After the evolutionary divergence of the marsupials, the gamma globin gene formed by duplication of an existing gene in the beta globin family. Later, but before radiation of the orders of placental mammals, the eta globin gene formed from a duplicated gamma globin gene. This second supposed gene duplication is estimated to have occurred at least 140 million years ago (Harris et al 1984). Gamma and eta genes must both have been present in ancestral placentals, but presumably gamma was lost by goats and eta was lost by rabbits.
    According to this scenario, the eta gene must have been functional at first, because it is functional in goats. It is non-functional in all primates, which is interpreted to mean it was already non-functional in the ancestral primates. According to Martin (1993), primates probably originated in the Late Cretaceous, perhaps 70 to 80 million years ago. This interpretation implies that the eta globin pseudogene has been maintained for more than 70 million years without being converted into a useful new gene and without being eliminated. The persistence of a non-functional DNA sequence in an entire lineage for such a supposedly long period of time seems remarkable in the context of the gene duplication hypothesis.
    The gamma globin gene is believed to have duplicated a second time, producing the A-gamma and G-gamma genes. Humans, apes, Old World monkeys, and some New World monkeys have two functional gamma globin genes. Other mammals, including galagos, tarsiers and rabbits, have only a single gamma globin gene (Hayasaka et al. 1993, Hardison and Miller 1993). To explain this, the gamma globin gene is postulated to have undergone a second duplication after divergence of simians and tarsiers. Current interpretation of the fossil record of primates (Martin 1993) suggests that simians and tarsiers diverged during the Paleocene, perhaps 60 million years ago. It seems remarkable that both copies of a duplicated gene could remain functional for 60 million years if evolution has depended on gene duplication for the source of new genetic information.

Theological presupposition in the argument from shared mistakes

    Several factors need to be considered in interpreting DNA sequence similarities in the eta globin pseudogenes. The argument has been presented that eta globin pseudogene similarities are compelling evidence of shared ancestry. This argument rests almost entirely on two assumptions: that the eta globin pseudogenes have no function; and that God would not create similar non-functioning sequences in separate species. Thus these assumptions must be carefully examined.
    The argument that God would not act in a certain way is a theological argument, and can hardly be addressed by science. The validity of such an argument depends on the kind of God being postulated. The kind of God at issue for most of those involved in this discussion is the God who revealed Himself in the Bible. The question then is: What do the scriptures say about whether God would create structures or DNA sequences for which we can find no use in unrelated organisms? This subject is not addressed in the Bible, leaving us without an answer. We can postulate that God would not do such a thing, but this position would not be based on any evidence other than our own presuppositions, however reasonable they seem.
    Another theological argument that has been advanced against some proposed actions of God is that God would not deceive us by acting in certain ways. This is equivalent to claiming that our understanding of nature can be trusted to accurately reveal God's activities. This argument is especially dangerous because it places human reason above divine revelation. The scriptures do state clearly that God does not deceive us, but they also make it clear that we are naturally prone to make wrong conclusions. The scriptures reveal the truth about history. When God tells us in scripture that He created in a certain way, we need not be deceived by what we believe to be appearances to the contrary. Our experience should teach us that much. The argument that we can figure out what God would or would not do has not done well historically. At various times it has been claimed that God would create only perfectly circular orbits for the planets, or that God would create only perfect species that would not need to adapt to changing circumstances, or that God would not permit man to contaminate space. None of these arguments has survived. Claims about God's activities should be based on scripture.

Scientific presupposition in the argument from shared pseudogenes

    A second assumption underlying the argument from shared mistakes is that shared pseudogenes, in this case the shared eta globin pseudogenes, have no function. Has it been demonstrated that these sequences have no function?
    It is difficult to completely rule out any possibility of polypeptide production based simply on coding sequence. Examples are known in which the apparent DNA message is altered by RNA editing, reading frame-shifting or skipping parts of sequences (Benhar and Engelbert-Kulka 1993, Dietz et al 1993, Gesteland et al. 1992, Landweber and Gilbert 1993). Nevertheless, the available evidence seems to suggest that the eta globin pseudogene does not code for any protein. No RNA transcript or protein product has been identified. Each of the three "exons" contains at least one stop codon in each of the three "reading frames." ("Reading frames" differ in which nucleotide of each base triplet is used as the starting point.) Seven potential start codons are present, but none of them is in "exon" one. These potential start codons are not sufficient for protein coding function. However, some pseudogenes may produce small amounts of polypeptides in specific tissues (Weinshank et al. 1991, Bristow et al. 1993, Misra-Ress, Cooke and Liebhaber 1994), so it is difficult to rule out the possibility that the eta globin sequence might produce a polypeptide.
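    The statement about stop codons in each reading frame describes a check that is easy to express in code. The sketch below is a minimal illustration, not the method used in the studies cited. The function names and both test sequences are invented for this example and are not taken from the eta globin pseudogene. Only the three forward reading frames are examined.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stops_by_frame(dna):
    """Map each forward reading frame (0, 1, 2) to the 0-based start
    positions of the stop codons found when reading in that frame."""
    dna = dna.upper()
    return {
        frame: [
            i for i in range(frame, len(dna) - 2, 3)
            if dna[i:i + 3] in STOP_CODONS
        ]
        for frame in range(3)
    }


def blocked_in_all_frames(dna):
    """True if every forward reading frame contains at least one stop codon."""
    return all(stops_by_frame(dna).values())


# Invented sequences for illustration only.
blocked_example = "ATGTAACTGACTAGCCA"   # one stop codon in each frame
open_example = "ATGGCTGCTGCA"           # no stop codon in any frame

print(stops_by_frame(blocked_example))
print(blocked_in_all_frames(blocked_example), blocked_in_all_frames(open_example))
```

    A sequence that passes this check cannot be translated into a long polypeptide in any forward frame without some additional mechanism, such as the RNA editing or frame-shifting mentioned above.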
    DNA strands come in complementary pairs. One might wonder whether the DNA strand complementary to the pseudogene might have some function, but there seems to be no information available regarding this.
    The eta globin pseudogene does not appear to function in chromosomal structure. Chromosomes are organized into loops that are attached at their bases to a nuclear material often called the nuclear scaffold. Scaffold associated regions are present within the beta gene cluster, and one of them is located near the eta globin pseudogene (Jarman and Higgs 1989). However, it appears that the scaffold associated region is not within the pseudogene sequence itself, making it unlikely that the pseudogene sequence functions in chromosomal structure.
    The observation that the eta globin pseudogene is not associated with any known genetic defect is offered as further argument for its lack of function. Several hemoglobin beta globin abnormalities are known, but none of them is associated specifically with the eta globin pseudogene (Stamatoyannopoulos and Nienhuis 1994). This is interpreted as supporting the assertion that the pseudogene has no function. However, this argument is quite weak. The same result could occur for lethal mutations. No defective individuals would be observed because they do not survive long enough to be observed. Individuals with defective pseudogene sequences have been reported, but their abnormal hemoglobins were attributed to deleted portions outside the pseudogene sequence. It would be helpful to know whether normal individuals exist without the pseudogene sequence. Unless more information is available, the argument that the eta globin pseudogene has no effect on health cannot be said to have been demonstrated.
    The possibility that pseudogenes may have some function is worth exploring further. Some pseudogenes are believed to function as sources of information for producing genetic diversity (Fotaki and Iatrou 1993, Wedell and Luthman 1993), possibly involving a process similar to gene conversion. It is thought that partial pseudogene sequences are copied into functional genes, producing variants of the functional sequence. This phenomenon has been reported many times. Some examples include the immunoglobulins of mice (Selsing et al. 1982) and birds (Reynaud et al. 1989), mouse histone genes (Liu et al. 1987), horse globin genes (Flint et al. 1988) and human beta globin genes (Fullerton et al. 1994). The possible role of the eta globin pseudogene in gene conversion is unknown.
    Regulation of globin genes is not fully understood, but several regulatory sites and protein factors have been identified (Stamatoyannopoulos and Nienhuis 1994). Each of the five functional beta globin genes has its own promoter region that participates in gene regulation. In addition, a locus control region (LCR) is found in a region several thousand bases upstream from the gene for epsilon globin, which is the first to be expressed.
    There is no evidence that the eta globin pseudogene functions in gene regulation of the beta globin gene family (Engel 1993). However, that possibility has been suggested (Goodman et al. 1984, see also Vanin et al. 1980). The chromosomal arrangement of beta globin genes in a sequence corresponding to the timing of their activity is striking. It appears that chromosomal location plays an important role in beta globin gene regulation (Dillon et al. 1991).
    The fact that the eta globin pseudogene is located between the fetal and adult genes suggests that it could play a role in gene switching, that is, turning off the fetal gamma genes and turning on the adult beta gene. There is evidence that gene switching in human beta globin genes depends in some way on the sequence lying between the fetal and adult genes (Townes et al. 1991), although it is not known whether the eta globin sequence itself is involved. Some pseudogenes have been implicated in gene regulation (Singh and Brown 1991, Assinder et al. 1993, Koonin, Bork and Sander 1994). Such a role could involve competition for regulatory proteins, production of signal RNA molecules, or perhaps some other mechanism (e.g., see Enver et al. 1991).
    Further suggestion of possible functionality of the eta globin pseudogene comes from a comparison of the "non-functional" sequences in humans and chimps. Non-functional sequences in this case include the A-gamma gene introns and the entire eta globin pseudogene. One would expect a similar rate of mutation in all non-functional sequences. We can test this by comparing the extent of difference between various regions of the non-functional sequences. Human and chimp A-gamma introns differ by 23 of 999 positions (2.3%). The respective eta globin "introns" differ by 16 of 999 positions (1.6%). The "exons" in the eta globin pseudogene differ by only 6 of 444 positions (1.35%). The figures for A-gamma introns and eta globin "exons" differ by more than one-third. This could be explained as due to variations in the mutation rate, but this would tend to undermine the argument that differences in non-functional sequences are a function of time (the molecular clock hypothesis). It seems reasonable to suspect that mutations in the eta globin pseudogene "exons" are constrained, perhaps because it has some function that has yet to be discovered (cf. discussion of Drosophila Adh locus in Sullivan et al. 1994).
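    The human-chimp divergence figures can be reproduced directly from the counts given above. The sketch below only repeats that arithmetic; the dictionary keys and variable names are my own. It does not test whether the differences are statistically significant, which would matter given how small the counts are.

```python
# (differing positions, positions compared) for human vs. chimp, as quoted in the text.
regions = {
    "a_gamma_introns": (23, 999),
    "eta_introns": (16, 999),
    "eta_exons": (6, 444),
}

# Percent divergence for each region.
percent = {name: 100 * diff / length for name, (diff, length) in regions.items()}

# How much lower the eta "exon" divergence is, relative to the A-gamma introns.
relative_gap = 1 - percent["eta_exons"] / percent["a_gamma_introns"]

for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
print(f"relative gap: {relative_gap:.2f}")
```

    The relative gap comes out at roughly 0.41, consistent with the statement that the two figures differ by more than one-third.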
    Another presupposition of the argument from shared mistakes is that they could not have arisen independently, but must have been inherited from a common ancestor. Although convergence and parallelism are common problems in morphological studies (e.g., Carroll and Currie 1991), it seems improbable that identical nucleotide changes would occur independently. However, there is some evidence that nucleotide changes may not be random. Mutational "hotspots" (e.g., Hardison et al 1991) have been identified, and independent gene duplication events have been inferred (Menotti, Starmer and Sullivan 1991).

Are pseudogenes "Junk DNA"?

    It has been thought that only a small proportion of DNA codes for proteins. Typical estimates have been that perhaps 3% of the genome is involved. Recent discoveries (Wilson et al. 1994) indicate a figure closer to 30%. What is the function of the remaining portion? A large amount of DNA would be required for gene regulation, but this still leaves a significant part of the DNA with unknown function. That DNA fraction with no apparent function has been called "junk DNA." Junk DNA has been thought to include intervening sequences (introns), satellite DNA (a highly repetitive DNA fraction), repetitive sequences, and pseudogenes.
    As knowledge of the genome has increased, functions have been discovered for some of the sequences thought to be "junk" (Nowak 1994). For example, introns function in splicing together transcripts of exons. This constrains the kinds of changes that intronic sequences can tolerate. Some introns contain coding sequences which produce functional gene products (see Doolittle 1993 for review). Satellite DNA appears to be involved in chromosomal structure, especially at the ends (telomeres) and attachment points (centromeres) of the chromosomes. Repetitive DNA seems to have effects that are not well understood. Some diseases seem to be related to repetitive sequences (see Maddox 1994). It was recently noted that repetitive sequences seem to have a genomic arrangement characteristic of some kind of information code (Flam 1994), although the test used for this is apparently a weak test. Some supposed pseudogenes have been shown to be lowly or selectively transcribed (e.g., Yaswen et al. 1992, Imai et al 1993, Vazeux, le Scanf and Fandeur 1993), which might suggest some function. The list of DNA sequences that have no effect on the organism has steadily decreased as knowledge of the operation of the genome has increased. This is reminiscent of the history of vestigial organs, in which apparent lack of function was actually lack of knowledge of what the function was. There is still much about pseudogenes that is not understood (Sullivan et al. 1994).
    In retrospect, it seems perfectly reasonable to expect most DNA sequences, as well as organs, to have some function. One of the rules of nature seems to be that structures that are not useful tend to become lost. This is not to say that all DNA sequences must have a function. Copying errors, unequal crossing over and disruptive transposition all may contribute to the accumulation of useless DNA sequences. Many pseudogenes may indeed be junk DNA. However, the argument that particular DNA sequences must not have a function because we haven't discovered any function for them is an argument from silence. To conclude that pseudogenes are junk DNA seems premature.

Summary and Conclusion

    Pseudogenes are DNA sequences that resemble functional genes but seem to have no purpose. The presence of similar eta globin pseudogenes in humans and chimps has been used as an argument for common ancestry of the two species. The argument has two parts: that the pseudogene sequences actually have no function, and that God would not create similar non-functional sequences in humans and chimps. The latter argument is theological, and is similar to many other theological arguments that have been proposed and later abandoned. Theological arguments should not be relied on unless well supported by scripture.
    The argument that the eta globin pseudogene has no function is consistent with most of the data, although lack of function has not been demonstrated. Possible function is suggested by the location of the pseudogene and differences in the extent of divergence of "intronic" and "exonic" sequences. The possibility that the eta globin pseudogene provides a binding site for a molecule involved in gene regulation has not been ruled out. At present, the evidence from pseudogenes fits reasonably well into an evolutionary interpretation, for those who choose to make that interpretation. However, there is much about the operation of the genome in general, and pseudogene sequences in particular, that is not well understood. Rapid progress is being made in understanding how the genome operates, and it is reasonable to expect that greater understanding of the meaning of pseudogenes will be forthcoming.

 

ACKNOWLEDGMENTS

    The author wishes to thank several reviewers for helpful comments and suggestions.

 

LITERATURE CITED


1994

All contents copyright Geoscience Research Institute. All rights reserved.