
Modified from Origins 21(2):91-108 (1994)
ABSTRACT - What this article is about.
Pseudogenes are DNA sequences that resemble functional genes but seem to have no purpose. The presence of similar eta globin pseudogenes in humans and chimps has been used as an argument for common ancestry of the two species. The argument has two parts: that the pseudogene sequences actually have no function, and that God would not create similar non-functional sequences in humans and chimps. The latter argument is theological and resembles many other theological arguments that have been proposed and later abandoned. Theological arguments should not be relied on unless well supported by Scripture.
The argument that the eta globin pseudogene has no function is consistent with most of the data, although lack of function has not been demonstrated. Possible function is suggested by the location of the pseudogene and differences in the extent of divergence of "intronic" and "exonic" sequences. The possibility that the eta globin pseudogene provides a binding site for a molecule involved in gene regulation has not been ruled out. At present, the evidence from pseudogenes fits reasonably well into an evolutionary interpretation, for those who choose to make that interpretation. However, there is much about the operation of the genome in general, and pseudogene sequences in particular, that is not well understood. Rapid progress is being made in understanding how the genome operates, and it is reasonable to expect that greater understanding of the meaning of pseudogenes will be forthcoming.
Introduction
Theists and naturalists have long argued over whether nature
provides evidence of design. Many theists have claimed that nature is so designed that one
can infer the existence of a designer. Some theists have made the stronger claim that
nature reveals a designer who is the Creator God revealed in the Bible. Many examples of
apparent design have been described, ranging from the non-random properties of the
universe to the intricate mechanism of a living cell. To the theist, these features speak
clearly of the existence of an intelligent Creator who created with a purpose in mind.
Naturalists have responded with arguments of their own. One argument
that is currently popular is the claim that many features in nature are not designed very
well. It is affirmed that such poor design indicates either an inferior designer or no
designer at all. Several examples of allegedly poor design have been proposed (e.g.,
Miller 1994). One of the most difficult examples for theists to explain is probably the
existence of certain DNA sequences known as pseudogenes. This paper will explore some of
the characteristics of pseudogenes and their relationship to the argument for or against
design.
What are pseudogenes?
Ordinary structural genes are made of DNA sequences that contain
coded information for making a particular protein molecule. The information includes a
start signal, coding for the sequence of amino acids needed to make up the protein, and a
stop signal. Additional signals that regulate the timing of gene activity are found
adjacent to the gene, and often also at some distance from it. The amino acid-coding
sequence is often broken up into portions known as "exons," which are separated
by spacer sequences known as "introns." Pseudogenes are DNA sequences that
appear similar to functional genes, but contain important defects that appear to make them
incapable of producing a functioning protein (Proudfoot 1980). Defects of pseudogenes may
include lack of a start codon, presence of extra stop signals, and abnormal or absent
flanking regulatory elements. It is thought that mutations in pseudogenes are neutral, and
hence free from selection. The first report of a pseudogene was in 1977 (Jacq, Miller and
Brownlee 1977). Since that time, a large number of pseudogenes have been described in
humans and a wide variety of other species. The supposed defects of pseudogenes have been
used as an argument that nature is too poorly designed to attribute its existence to
special creation by a supernatural Designer (Miller 1994).
Two types of pseudogenes are known: unprocessed pseudogenes and
processed pseudogenes. Processed pseudogenes are found on different chromosomes from their
functional counterparts. They are called "processed" because they appear to be
altered copies of active genes. They lack introns (spacer sequences within a gene) and
certain regulatory sequences located in front of the gene; they often terminate in a
series of adenines and are flanked by direct repeats. ("Direct repeats" are
associated with movable genetic elements, which may in some cases play a role in inserting
a pseudogene into a chromosome.) Processed pseudogenes may be complete copies of the
coding sequence, or may be incomplete copies, or may have additional inserted sequences.
They seem to be present only in mammals (Vanin 1985). Processed pseudogenes are believed
to have arisen in a three step process. The first step is copying of the DNA message into
an RNA transcript. The introns are then edited out of this transcript to produce a
messenger RNA (mRNA) molecule. Finally, the mRNA is copied back into a chromosome in a
process called reverse transcription (see Vanin 1985 for review; see Tchenio et al 1993
for an example). The L1 family of repetitive DNA sequences appears to be the result of
this process (Jurka 1989).
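The three-step origin proposed for processed pseudogenes can be sketched as a toy model. This is only an illustration: the gene segments, the sequences and the tail length are invented, and the function names are hypothetical. It shows why a copy made by this route would lack introns and end in a run of adenines.

```python
# Toy gene on the coding strand: (segment type, sequence).
# All sequences are invented for illustration only.
gene = [
    ("exon", "ATGGCC"),
    ("intron", "GTAAGTTTTCAG"),
    ("exon", "GATTGA"),
]

def transcribe(segments):
    """Step 1: copy the DNA into a primary RNA transcript (T becomes U)."""
    return [(kind, seq.replace("T", "U")) for kind, seq in segments]

def splice_and_polyadenylate(transcript, tail_length=8):
    """Step 2: edit out the introns; the mature mRNA also carries a poly-A tail.

    The tail length of 8 is an arbitrary assumption for this sketch.
    """
    exons_only = "".join(seq for kind, seq in transcript if kind == "exon")
    return exons_only + "A" * tail_length

def reverse_transcribe(mrna):
    """Step 3: copy the mRNA back into DNA (U becomes T)."""
    return mrna.replace("U", "T")

processed_copy = reverse_transcribe(splice_and_polyadenylate(transcribe(gene)))
print(processed_copy)
```

The resulting copy contains the two exons joined directly and a terminal series of adenines, matching the features described for processed pseudogenes.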
Unprocessed pseudogenes are usually found within clusters of similar,
functional sequences on the same chromosome (Harris et al. 1984). They typically have
"introns" and flanking regulatory sequences resembling a functional gene. As
with processed pseudogenes, expression of an unprocessed pseudogene is generally prevented
by stop codons. Numerous other differences interpreted as deletions, insertions and point
mutations may also be present. A truncated mRNA transcript may or may not be produced.
Unprocessed pseudogenes are found in a wide variety of organisms. They are believed to
have arisen by gene duplication, which produced an extra copy of the gene. The extra copy,
not being needed, could accumulate mutations without harming the organism. Examples of
unprocessed pseudogenes are present in the alpha-globin and beta-globin gene families
(e.g., see Hardison and Miller 1993 and references therein).
The argument from shared mistakes
When genes for equivalent proteins are compared in different
species, they are often found to differ in sequence. In general, the more similar two
species are taxonomically, the more similar are their DNA sequences, both in general and
for specific enzymes. Exceptions do occur, but the overall pattern is easily recognized.
Two explanations have been proposed for the observed pattern of similarities in molecular
sequences.
One explanation for sequence similarities is that they are inherited
from an evolutionary ancestor. Genes are similar because they are both inherited from a
common ancestor. Sequence differences are attributed to accumulation of mutations since
the species diverged from their common ancestor. A second, contrasting, explanation is
that sequence similarities are due to common design for a similar function. Sequence
differences may reflect functional differences, such as might be required for protein
function in different metabolic environments, or regulatory function in different genetic
backgrounds. It seems unlikely that sequence similarities could be due to chance, but some
have been interpreted in this way (e.g., Djian and Green 1992).
Similarities in functional sequences for the same protein in different
organisms are to be expected, since they perform similar functions; however, what about
similarities in sequences, such as pseudogenes, that seem to have no function? Pseudogenes
are commonly thought to be flawed copies of functional genes. It has been argued (Max
1987, Gilbert 1993, Miller 1994) that similar pseudogene sequences shared by two or more
species are best explained as the result of common ancestry, assuming that an intelligent
designer would not repeatedly make mistakes in creating genes. This can be called the
"argument from shared mistakes."
Comparison of DNA sequences from humans, chimps and other mammals
reveals a considerable number of shared pseudogenes that are similar in sequence as well
as in positional relationship to other genes. Humans and chimps have many similarities;
this is interpreted as indicating a recent common ancestry for humans and chimps (Gilbert
1993). The best known example of a shared pseudogene is the eta globin (psi beta globin)
gene, a member of the beta globin gene family.
The beta globin family and the eta globin pseudogene in humans
Human hemoglobin molecules are made of two sets of proteins,
produced by the alpha globin genes and the beta globin genes. Both beta globin and alpha
globin genes occur in "families" of non-identical copies. The beta globin gene
family is located on the short arm of human chromosome 11 (11p15.5), near the gene for
insulin (Lalley et al. 1989). A family of alpha globin genes is also present in mammals,
but it is located on a different chromosome (16p13).
The beta globin gene cluster consists of five somewhat similar
functional genes and one pseudogene. The five functional genes are arranged on the
chromosome in a sequence that corresponds to the sequence of timing of their respective
functions during growth and development. The first gene in the series is the "epsilon
globin" gene, which helps form hemoglobin molecules early in embryonic development.
The second and third genes are called "gamma-G" and "gamma-A." They
help form hemoglobin molecules later during fetal development. The "eta globin"
pseudogene is next in sequence, followed by the "delta" globin gene, which is
expressed at a low level in adults. The last gene in the series is the "beta"
globin gene, which produces most of the adult beta globin, and gives the gene family its
name. As the adult globin genes become functional, the fetal genes are turned off. The
fact that the sequence of the genes on the chromosome matches the sequence of their
activity in the developing organism seems unlikely to be the result of chance, and can
easily be interpreted as the result of intelligent design.
The eta globin sequence has several characteristics of pseudogenes. It
resembles the other members of the beta globin gene family, but is most similar to the
gamma-A globin gene. However, it has some important differences. Compared with the gamma-A
globin gene, the eta globin pseudogene lacks a start codon (AUG) in the appropriate
position. It also has numerous extra stop codons which would be expected to prevent
production of any protein. No mRNA transcript or protein product has been identified, and
it appears that none is produced. No medical defect is known that is traceable to the loss
of this pseudogene. In short, the eta globin sequence is not associated with any known
function or defect, and appears to be incapable of producing a useful molecule.
The beta globin gene family is also found in other mammals. Sequences
of the human gamma-A globin gene and eta globin pseudogenes from humans and several other
species have been compared (Chang and Slightom 1984). The human gamma-A globin gene
contains three exons (portions of the DNA that code for amino acids) of 92, 223 and 129
nucleotides, respectively, for a total of 444 nucleotides. The corresponding
"exons" of the human eta globin pseudogene differ from the gamma-A globin gene
exon sequences in 29, 38 and 43 nucleotide positions, respectively, for an overall
difference of 24.8%. The gamma-A globin gene has two introns, of 122 and 877 bases,
respectively. These differ from the "intron" sequences of the eta globin
pseudogene by 46-79% and 72-94%, respectively (my figures differ somewhat from those of
Goodman et al. 1984, probably due to problems in aligning the sequences). The
gamma-A-globin exons and pseudogene "exons" are more similar to each other than
expected from random sequences, while the "intronic" sequences are so different
that no relationship among them can be inferred.
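The exon figures quoted above can be checked with a few lines of arithmetic. This sketch uses only the exon lengths and difference counts given in the text; the variable names are arbitrary.

```python
# Exon lengths of the human gamma-A globin gene and the number of positions
# at which the eta globin pseudogene "exons" differ, as quoted in the text.
exon_lengths = [92, 223, 129]
exon_differences = [29, 38, 43]

total_length = sum(exon_lengths)            # total exon nucleotides
total_differences = sum(exon_differences)   # total differing positions
overall_percent = 100 * total_differences / total_length

# Per-exon divergence, for comparison with the overall figure.
per_exon_percent = [100 * d / n for d, n in zip(exon_differences, exon_lengths)]

print(total_length, total_differences, round(overall_percent, 1))
print([round(p, 1) for p in per_exon_percent])
```

The totals reproduce the 444 nucleotides and the overall difference of 24.8% stated above.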
Comparisons of eta globin pseudogenes in humans and other primates
The arrangement of the beta globin gene family in other primates is
very similar to that in humans (Harris et al. 1984). Humans, chimpanzees and gorillas have
the same number of beta globin genes arranged in the same sequence. In chimpanzees, the
beta globin group is on chromosome 9, which is equivalent to human chromosome 11 (Lalley
et al. 1989). Baboons have a similar arrangement, but the delta globin gene appears
non-functional, and is classified as a pseudogene. The New World owl monkey has only one
gamma globin gene, with a possible partial second gene (Meireles et al 1995), but the
arrangement of genes is otherwise the same as in humans. This is true also for the galago
("bush baby"; Hardison and Miller 1993). Among non-primates, the rabbit has only
one gamma globin gene, but lacks the eta globin pseudogene, while the delta globin gene
appears to be a pseudogene.
The DNA sequences of the eta globin pseudogene exons in humans,
chimpanzees and gorillas are similar. The chimpanzee eta globin pseudogene exonic DNA
differs from the human eta globin pseudogene at six nucleotide positions and from the
corresponding gorilla pseudogene at seven positions. The gorilla
pseudogene exonic DNA has three differences from humans and seven from chimpanzees. This
means that chimpanzee and gorilla eta globin exon sequences are both slightly more similar
to the human pseudogene than to each other.
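The pairwise comparison can be laid out as a small table. This is a sketch that uses only the three difference counts quoted in the text; the helper name `diff` is hypothetical.

```python
from itertools import combinations

# Pairwise counts of differing exonic positions, as quoted in the text.
differences = {
    frozenset(["human", "chimpanzee"]): 6,
    frozenset(["human", "gorilla"]): 3,
    frozenset(["chimpanzee", "gorilla"]): 7,
}

def diff(a, b):
    """Look up the number of differences between two species, in either order."""
    return differences[frozenset([a, b])]

for a, b in combinations(["human", "chimpanzee", "gorilla"], 2):
    print(f"{a} vs {b}: {diff(a, b)}")

# Each ape sequence is compared with the human sequence and with the other ape.
chimp_closer_to_human = diff("chimpanzee", "human") < diff("chimpanzee", "gorilla")
gorilla_closer_to_human = diff("gorilla", "human") < diff("gorilla", "chimpanzee")
print(chimp_closer_to_human, gorilla_closer_to_human)
```

Both comparisons come out in favor of the human sequence, although the margin for the chimpanzee is a single position.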
It is clear that the "exon" portions of the eta globin
pseudogenes in humans, chimps and gorillas are highly similar. None of their differences
involves any of the eight stop codons in the pseudogenes. Several potential initiation
codons (AUG) are present, and one of the differences in the chimpanzee produces an
additional potential initiation codon in the second exon. However, none of these is
sufficient to support protein coding function.
Gene duplication hypothesis
If evolution is to occur, new genes must somehow be produced. The
most popular explanation for the evolution of new genes is that they are modified from
extra copies of existing genes. This explanation is known as the gene duplication
hypothesis (Ohno 1970). According to this hypothesis, functional genes may be duplicated
accidentally. The duplicate gene is not needed by the organism. Both copies of the gene
may be subject to selection until one of them suffers a disabling mutation, such as a
premature stop signal. This disables the gene so it no longer has any function, and is no
longer subject to natural selection. It has become a pseudogene, and all subsequent
mutations are neutral. Over time, mutations accumulate in the pseudogene. Eventually,
according to the theory, random mutations may produce a new gene with a new function
(e.g., see Long and Langley 1993).
The gene duplication hypothesis, although widely accepted, is not
without some theoretical and empirical difficulties. Assuming the original gene had been
optimized by selection, mutations in the coding region of the duplicated gene prior to a
disabling mutation would likely result in production of inferior protein molecules.
Individuals with one gene that produced inferior protein products would likely be selected
against. Spread of a duplicated gene should be difficult under these conditions. This
problem could be reduced if mutations destroyed the function of the extra gene copy early
in its history. However, there are only three stop codons, while there are 61 codons for
amino acids. One would expect mutations resulting in destruction of function to be much
less common than those resulting in production of variant proteins, most of which could be
expected to be inferior. Selection may also oppose maintenance of a pseudogene, since it
may retain enough activity to disrupt normal cellular activities. Some pseudogenes are
suspected to be involved in causing certain diseases (e.g., Wedell and Luthman 1993,
Brakenhoff et al. 1994), which should result in negative selection against them. Thus,
establishment and maintenance of a pseudogene by gene duplication may require a rather
special sequence of events.
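The comparison of three stop codons with 61 amino acid codons can be made more concrete by counting single-base substitutions. The following sketch assumes the standard genetic code, equal use of all 61 sense codons, and equal probability for every base substitution; none of these simplifications comes from the text. It also ignores insertions, deletions and regulatory mutations, which can disable a gene as well.

```python
from itertools import product

BASES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code

all_codons = ["".join(c) for c in product(BASES, repeat=3)]
sense_codons = [c for c in all_codons if c not in STOP_CODONS]

def single_base_neighbors(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]

total_substitutions = 0
nonsense_substitutions = 0  # substitutions that turn a sense codon into a stop
for codon in sense_codons:
    for neighbor in single_base_neighbors(codon):
        total_substitutions += 1
        if neighbor in STOP_CODONS:
            nonsense_substitutions += 1

fraction = nonsense_substitutions / total_substitutions
print(len(sense_codons), total_substitutions, nonsense_substitutions,
      round(100 * fraction, 1))
```

Under these assumptions only a small fraction of point substitutions in a coding sequence create a premature stop codon. The remainder either change an amino acid or leave it unchanged.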
Walsh (1995) has calculated the theoretical conditions thought
necessary for establishing the presence of a pseudogene in a population, assuming the
pseudogene arose randomly by mutation. Establishment requires a high proportion of
favorable mutations, a large number of reproducing individuals in the population, and a
high selection coefficient. It seems doubtful that these calculations can explain the
frequency of pseudogenes in living species, and some other explanation would be preferred.
Another problem for the gene duplication hypothesis is that the
existence of duplicate copies of a gene does not necessarily permit one of the copies to
diverge from the others. For example, seven copies of the "Enhancer of split"
gene are present in Drosophila, but it appears that none of them is free to mutate
(Maier et al 1993). The "duplicated copies" are not extra, but all seem to be
required. Many genes occur in multiple copies that remain similar to each other rather
than diverging. This has been explained as due to a process known as gene conversion, in
which one DNA sequence is "converted" during copying to match another sequence.
This may result in maintenance of similarity among several copies of a sequence. The
situation in which multiple copies of a sequence maintain close sequence similarity is
known as "concerted evolution" (e.g., Moore et al 1993). Concerted evolution
would tend to prevent divergence of duplicated genes, thus presenting a problem for the
gene duplication hypothesis. Another problem with the gene duplication hypothesis is that
tetraploid species have far fewer pseudogenes than would be expected (Larhammar and
Risinger 1994).
Despite some difficulties in attributing the evolution of new information
to gene duplication, there seems to be evidence that gene duplication does occur. An
apparent example of parallel gene duplications in flies has been described (Menotti et
al. 1991).
Beta globin genes and the gene duplication hypothesis
It is thought that the eta globin pseudogene originated by
duplication of a gamma-A globin gene, because of the similarity in their sequences. Both
genes are present in all primates studied. Other mammals may have one or the other of the
two genes. For example, gamma globin, but not eta globin, genes are present in rabbits;
goats have eta globin but not gamma globin genes (Hardison and Miller 1993); the opossum
has neither (Goodman et al 1987).
It would be useful to review the evolutionary explanation for the
distribution of eta globin genes in mammals. The proposed explanation is that the common
ancestor of marsupials and placental mammals lacked both genes. After the evolutionary
divergence of the marsupials, the gamma globin gene formed by duplication of an existing
gene in the beta globin family. Later, but before radiation of the orders of placental
mammals, the eta globin gene formed from a duplicated gamma globin gene. This second
supposed gene duplication is estimated to have occurred at least 140 million years ago
(Harris et al 1984). Gamma and eta genes must both have been present in ancestral
placentals, but presumably gamma was lost by goats and eta was lost by rabbits.
According to this scenario, the eta gene must have been functional at
first, because it is functional in goats. It is non-functional in all primates, which is
interpreted to mean it was already nonfunctional in the ancestral primates. According to
Martin (1993), primates probably originated in the Late Cretaceous, perhaps 70 to 80
million years ago. This interpretation implies that the eta globin pseudogene has been
maintained for more than 70 million years without being converted into a useful new gene
and without being eliminated. The persistence of a non-functional DNA sequence in an
entire lineage for such a supposed long period of time seems remarkable in the context of
the gene duplication hypothesis.
The gamma globin gene is believed to have duplicated a second time,
producing the A-gamma and G-gamma genes. Humans, apes, Old World monkeys, and some New
World monkeys have two functional gamma globin genes. Other mammals, including galagos,
tarsiers and rabbits, have only a single gamma globin gene (Hayasaka et al. 1993, Hardison
and Miller 1993). To explain this, the gamma globin gene is postulated to have undergone a
second duplication after divergence of simians and tarsiers. Current interpretation of the
fossil record of primates (Martin 1993) suggests that simians and tarsiers diverged during
the Paleocene, perhaps 60 million years ago. It seems remarkable that both copies of a
duplicated gene could remain functional for 60 million years if evolution has depended on
gene duplication for the source of new genetic information.
Theological presupposition in the argument from shared mistakes
Several factors need to be considered in interpreting DNA sequence
similarities in the eta globin pseudogenes. The argument has been presented that eta
globin pseudogene similarities are compelling evidence of shared ancestry. This argument
rests almost entirely on two assumptions: that the eta globin pseudogenes have no
function; and that God would not create similar non-functioning sequences in separate
species. Thus these assumptions must be carefully examined.
The argument that God would not act in a certain way is a theological
argument, and can hardly be addressed by science. The validity of such an argument depends
on the kind of God being postulated. The kind of God at issue for most of those involved
in this discussion is the God who revealed Himself in the Bible. The question then is:
What do the scriptures say about whether God would create structures or DNA sequences for
which we can find no use in unrelated organisms? This subject is not addressed in the
Bible, leaving us without an answer. We can postulate that God would not do such a thing,
but this position would not be based on any evidence other than our own presuppositions,
however reasonable they seem.
Another theological argument that has been advanced against some
proposed actions of God is that God would not deceive us by acting in certain ways. This
is equivalent to claiming that our understanding of nature can be trusted to accurately
reveal God's activities. This argument is especially dangerous because it places human
reason above divine revelation. The scriptures do state clearly that God does not deceive
us, but they also make it clear that we are naturally prone to draw wrong conclusions. The
scriptures reveal the truth about history. When God tells us in scripture that He created
in a certain way, we need not be deceived by what we believe to be appearances to the
contrary. Our experience should teach us that much. The argument that we can figure out
what God would or would not do has not done well historically. At various times it has
been claimed that God would create only perfectly circular orbits for the planets, or that
God would create only perfect species that would not need to adapt to changing
circumstances, or that God would not permit man to contaminate space. None of these
arguments has survived. Claims about God's activities should be based on scripture.
Scientific presupposition in the argument from shared pseudogenes
A second assumption underlying the argument from shared mistakes is
that shared pseudogenes, in this case the shared eta globin pseudogenes, have no function.
Has it been demonstrated that these sequences have no function?
It is difficult to completely rule out any possibility of polypeptide
production based simply on coding sequence. Examples are known in which the apparent DNA
message is altered by RNA editing, reading frame-shifting or skipping parts of sequences
(Benhar and Engelbert-Kulka 1993, Dietz et al 1993, Gesteland et al. 1992, Landweber and
Gilbert 1993). Nevertheless, the available evidence seems to suggest that the eta globin
pseudogene does not code for any protein. No RNA transcript or protein product has been
identified. Each of the three "exons" contains at least one stop codon in each
of the three "reading frames." ("Reading frames" differ in which
nucleotide of each base triplet is used as the starting point.) Seven potential start
codons are present, but none of them is in "exon" one. These potential start
codons are not sufficient for protein coding function. However, some pseudogenes may
produce small amounts of polypeptides in specific tissues (Weinshank et al. 1991, Bristow
et al. 1993, Misra-Ress, Cooke and Liebhaber 1994), so it is difficult to rule out the
possibility that the eta globin sequence might produce a polypeptide.
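The idea of reading frames, and of scanning each frame for stop and start codons, can be illustrated with a short sketch. The sequence below is invented for illustration and is not the eta globin sequence; the function names are hypothetical. Codons are written in DNA form (ATG rather than AUG).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"

def codons_in_frame(seq, frame):
    """Split `seq` into complete triplets, starting at offset `frame` (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]

def scan_frames(seq):
    """For each reading frame, list the codon indices of stop and start codons."""
    report = {}
    for frame in range(3):
        codons = codons_in_frame(seq, frame)
        report[frame] = {
            "codons": codons,
            "stops": [i for i, c in enumerate(codons) if c in STOP_CODONS],
            "starts": [i for i, c in enumerate(codons) if c == START_CODON],
        }
    return report

# Hypothetical 19-base sequence, chosen so that every frame contains a stop codon.
example = "TAACATGAAATAGCTGACC"
report = scan_frames(example)
for frame, info in report.items():
    print(frame, info["codons"], "stops:", info["stops"], "starts:", info["starts"])

# True when no frame could be read through without meeting a stop codon.
all_frames_blocked = all(report[f]["stops"] for f in range(3))
print(all_frames_blocked)
```

In this toy sequence a potential start codon is present in one frame, but a stop codon follows it in the same frame, which parallels the situation described for the pseudogene "exons."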
DNA strands come in complementary pairs. One might wonder whether the
DNA strand complementary to the pseudogene might have some function, but there seems to be
no information available regarding this.
The eta globin pseudogene does not appear to function in chromosomal
structure. Chromosomes are organized into loops that are attached at their bases to a
nuclear material often called the nuclear scaffold. Scaffold associated regions are
present within the beta gene cluster, and one of them is located near the eta globin
pseudogene (Jarman and Higgs 1989). However, it appears that the scaffold associated
region is not within the pseudogene sequence itself, making it unlikely that the
pseudogene sequence functions in chromosomal structure.
The observation that the eta globin pseudogene is not associated with
any known genetic defect is offered as further argument for its lack of function. Several
hemoglobin beta globin abnormalities are known, but none of them is associated
specifically with the eta globin pseudogene (Stamatoyannopoulos and Nienhuis 1994). This
is interpreted as supporting the assertion that the pseudogene has no function. However,
this argument is quite weak. The same result could occur for lethal mutations. No
defective individuals would be observed because they do not survive long enough to be
observed. Individuals with defective pseudogene sequences have been reported, but their
abnormal hemoglobins were attributed to deleted portions outside the pseudogene sequence.
It would be helpful to know whether normal individuals exist without the pseudogene
sequence. Unless more information is available, the argument that the eta globin
pseudogene has no effect on health cannot be said to have been demonstrated.
The possibility that pseudogenes may have some function is worth
exploring further. Some pseudogenes are believed to function as sources of information for
producing genetic diversity (Fotaki and Iatrou 1993, Wedell and Luthman 1993), possibly
involving a process similar to gene conversion. It is thought that partial pseudogene
sequences are copied into functional genes, producing variants of the functional sequence.
This phenomenon has been reported many times. Examples include the immunoglobulins of
mice (Selsing et al. 1982) and birds (Reynaud et al. 1989), mouse histone genes (Liu et
al. 1987), horse globin genes (Flint et al. 1988) and human beta globin genes
(Fullerton et al. 1994). The possible role of the eta globin pseudogene in gene
conversion is unknown.
Regulation of globin genes is not fully understood, but several
regulatory sites and protein factors have been identified (Stamatoyannopoulos and Nienhuis
1994). Each of the five functional beta globin genes has its own promoter region that
participates in gene regulation. In addition, a locus control region (LCR) is found in a
region several thousand bases upstream from the gene for epsilon globin, which is the
first to be expressed.
There is no evidence that the eta globin pseudogene functions in gene
regulation of the beta globin gene family (Engel 1993). However, that possibility has been
suggested (Goodman et al. 1984, see also Vanin et al. 1980). The chromosomal arrangement
of beta globin genes in a sequence corresponding to the timing of their activity is
striking. It appears that chromosomal location plays an important role in beta globin gene
regulation (Dillon et al. 1991).
The fact that the eta globin pseudogene is located between the fetal
and adult genes suggests that it could play a role in gene switching: turning off the
fetal gamma genes and turning on the adult beta gene. There is evidence that gene
switching in human beta globin genes depends in some way on the sequence lying between the
fetal and adult genes (Townes et al. 1991), although it is not known whether the eta
globin sequence itself is involved. Some pseudogenes have been implicated in gene
regulation (Singh and Brown 1991, Assinder et al. 1993, Koonin, Bork and Sander 1994).
Such a role could involve competition for regulatory proteins, production of signal RNA
molecules, or perhaps some other mechanism (e.g., see Enver et al. 1991).
Further suggestion of possible functionality of the eta globin
pseudogene comes from a comparison of the "non-functional" sequences in humans
and chimps. Non-functional sequences in this case include the A-gamma gene introns and the
entire eta globin pseudogene. One would expect a similar rate of mutation in all
non-functional sequences. We can test this by comparing the extent of difference between
various regions of the non-functional sequences. Human and chimp A-gamma introns differ by
23 of 999 positions (2.3%). The respective eta globin "introns" differ by 16 of
999 positions (1.6%). The "exons" in the eta globin pseudogene differ by only 6
of 444 positions (1.35%). The figures for A-gamma introns and eta globin exons differ by
more than one-third. This could be explained as due to variations in the mutation rate,
but this would tend to undermine the argument that differences in non-functional sequences
are a function of time (the molecular clock hypothesis). It seems reasonable to suspect
that mutations in the eta globin pseudogene "exons" are constrained, perhaps
because it has some function that has yet to be discovered (cf. discussion of Drosophila
Adh locus in Sullivan et al. 1994).
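The divergence figures in this comparison can be recomputed directly from the counts given above. This is a sketch of the arithmetic only. The counts are small, and the code makes no attempt to test whether the differences between regions are statistically significant.

```python
# Human-chimp differences and region lengths, as quoted in the text.
regions = {
    "A-gamma introns": (23, 999),
    'eta globin "introns"': (16, 999),
    'eta globin "exons"': (6, 444),
}

percent = {name: 100 * d / n for name, (d, n) in regions.items()}
for name, value in percent.items():
    print(f"{name}: {value:.2f}%")

# Relative gap between the most and least divergent regions,
# expressed as a fraction of the larger figure.
highest = percent["A-gamma introns"]
lowest = percent['eta globin "exons"']
relative_gap = (highest - lowest) / highest
print(round(relative_gap, 2))
```

The relative gap exceeds one-third, in agreement with the statement in the text.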
Another presupposition of the argument from shared mistakes is that
they could not have arisen independently, but must have been inherited from a common
ancestor. Although convergence and parallelism are common problems in morphological
studies (e.g., Carroll and Currie 1991), it seems improbable that identical nucleotide
changes would occur independently. However, there is some evidence that nucleotide changes
may not be random. Mutational "hotspots" (e.g., Hardison et al 1991) have been
identified, and independent gene duplication events have been inferred (Menotti, Starmer
and Sullivan 1991).
Are pseudogenes "Junk DNA"?
It has been thought that only a small proportion of DNA codes for
proteins. Typical estimates have been that perhaps 3% of the genome is involved. Recent
discoveries (Wilson et al. 1994) indicate a figure closer to 30%. What is the function of
the remaining portion? A large amount of DNA would be required for gene regulation, but
this still leaves a significant part of the DNA with unknown function. That DNA fraction
with no apparent function has been called "junk DNA." Junk DNA has been thought
to include intervening sequences (introns), satellite DNA (a highly repetitive DNA
fraction), repetitive sequences, and pseudogenes.
As knowledge of the genome has increased, functions have been
discovered for some of the sequences thought to be "junk" (Nowak 1994). For
example, introns function in splicing together transcripts of exons. This constrains the
kinds of changes that intronic sequences can tolerate. Some introns contain coding
sequences which produce functional gene products (see Doolittle 1993 for review).
Satellite DNA appears to be involved in chromosomal structure, especially at the ends
(telomeres) and attachment points (centromeres) of the chromosomes. Repetitive DNA seems
to have effects that are not well understood. Some diseases seem to be related to
repetitive sequences (see Maddox 1994). It was recently noted that repetitive sequences
seem to have a genomic arrangement characteristic of some kind of information code (Flam
1994), although the test used for this is apparently a weak test. Some supposed
pseudogenes have been shown to be transcribed at low levels or selectively (e.g., Yaswen et al.
1992, Imai et al 1993, Vazeux, le Scanf and Fandeur 1993), which might suggest some
function. The list of DNA sequences that have no effect on the organism has steadily
decreased as knowledge of the operation of the genome has increased. This is reminiscent
of the history of vestigial organs, in which apparent lack of function was actually lack
of knowledge of what the function was. There is still much about pseudogenes that is not
understood (Sullivan et al. 1994).
In retrospect, it seems perfectly reasonable to expect most DNA
sequences, as well as organs, to have some function. One of the rules of nature seems to
be that structures that are not useful tend to become lost. This is not to say that all
DNA sequences must have a function. Copying errors, unequal crossing over and disruptive
transposition all may contribute to the accumulation of useless DNA sequences. Many
pseudogenes may indeed be junk DNA. However, the argument that particular DNA sequences
must not have a function because we haven't discovered any function for them is an
argument from silence. To conclude that pseudogenes are junk DNA seems premature.
Summary and Conclusion
Pseudogenes are DNA sequences that resemble functional genes but
seem to have no purpose. The presence of similar eta globin pseudogenes in humans and
chimps has been used as an argument for common ancestry of the two species. The argument
has two parts: that the pseudogene sequences actually have no function, and that God would
not create similar non-functional sequences in humans and chimps. The latter argument is
theological, and is similar to many other theological arguments that have been proposed
and later abandoned. Theological arguments should not be relied on unless well supported
by scripture.
The argument that the eta globin pseudogene has no function is
consistent with most of the data, although lack of function has not been demonstrated.
Possible function is suggested by the location of the pseudogene and differences in the
extent of divergence of "intronic" and "exonic" sequences. The
possibility that the eta globin pseudogene provides a binding site for a molecule involved
in gene regulation has not been ruled out. At present, the evidence from pseudogenes fits
reasonably well into an evolutionary interpretation, for those who choose to make that
interpretation. However, there is much about the operation of the genome in general, and
pseudogene sequences in particular, that is not well understood. Rapid progress is being
made in understanding how the genome operates, and it is reasonable to expect that greater
understanding of the meaning of pseudogenes will be forthcoming.
ACKNOWLEDGMENTS
The author wishes to thank several reviewers for helpful comments and suggestions.
All contents copyright
Geoscience Research Institute. All rights reserved.