Lecture notes for Evolution II

Last version 1:30 Nov. 21, 1996
David R. Nelson

 

THE ACQUISITION OF MITOCHONDRIA AND CHLOROPLASTS BY EUKARYOTES. HYDROGENOSOMES PROVIDE CLUES

The earliest eukaryotes had no mitochondria or chloroplasts. We can tell this because the
most ancient branches on the eukaryotic tree all represent groups without mitochondria.
The earliest eukaryotes are the diplomonads (includes Giardia), followed by the
microsporidians and then the trichomonads. The evidence is very strong that mitochondria
were taken in as endosymbionts from among the alpha proteobacteria. Phylogenetic trees of
the rRNAs from bacteria and mitochondria show this plainly. A similar case is made for the
origin of chloroplasts among the cyanobacteria. Hydrogenosomes and peroxisomes may also
have begun as endosymbionts, but they do not contain a genome of their own today.

Hydrogenosomes are organelles found only in eukaryotes that do not have mitochondria.
These organelles are surrounded by a double membrane. They are the site of pyruvate
fermentation to produce acetate, CO2 and H2. ATP is formed by substrate level
phosphorylation, so these organelles resemble mitochondria in that they produce ATP.
They do not have pyruvate dehydrogenase. Instead they use an enzyme called pyruvate
ferredoxin oxidoreductase, and they have hydrogenase. These enzymes are not found in
mitochondria, but they are seen in anaerobic bacteria. Hydrogenosomes have neither an
electron transport chain nor an F1F0 ATPase, so they do not perform oxidative
phosphorylation.

One set of proteins found in hydrogenosomes is the heat shock proteins Hsp70, Hsp60
and Hsp10. These are ideal sequences to use for phylogenetic analysis. As we saw above,
Gupta and Golding used them to argue for a hybrid eukaryote genome. A sequence
analysis of the hydrogenosome Hsp70 and Hsp60 proteins showed a signature sequence
characteristic of these proteins in mitochondria and gram negative purple bacteria. Trees
made with other available sequences of all three Hsp proteins placed each of them in the
mitochondrial Hsp group. Since this happened in all three cases, the hydrogenosome Hsps
and the mitochondrial Hsps appear to have a common bacterial ancestor. In anaerobic
environments where these organisms live, it was not useful to retain the OXPHOS genes
encoded in the mitochondrial genome, which may explain why hydrogenosomes have no
genome of their own. These organelles are apparently a degenerate version of the
mitochondrial ancestor.

Before the eukaryotic ancestor could engulf these bacteria, it would have to evolve into a
phagocytic cell, with the ability to take a whole living bacterium inside itself. This is not
possible with a rigid cell wall. Therefore, the precursor of eukaryotes had to lose the
ancestral cell wall.



A TIMELINE FOR THE MAIN EVENTS OF EARLY LIFE ON EARTH.




INTRONS LATE OR INTRONS EARLY

(see PNAS 92, 8507-8511 1995)

Introns are found in eukaryotic genes (Nature 271, 501 1978). There are two opposing
views on what the origin of introns might be. In one view, introns are ancient and were
present in all genes. The amino acid coding regions of genes were made of small pieces of
15-20 codons each and these were all spliced together by removal of the introns. In this
model, genetic diversity was assured by exon shuffling, that combined different exons to
make a very large collection of proteins. This is the exon theory of genes. Since these
types of introns are not in bacteria, the theory goes that bacteria are streamlined and have
lost all their introns. Furthermore, the exons in this theory are supposed to code for units
of protein structure, such as helices and beta strands.

The "introns late" theory says that the original common ancestor did not have introns and
they only evolved in the eukaryote branch of the tree of life. This theory suggests that
introns are placed pretty much at random into genes, and they do not necessarily
correspond to protein structural elements like helices and strands.

Triose phosphate isomerase is an enzyme that is used to support the introns early
hypothesis, because the chicken TPI gene has six introns and they all occur between
structural elements in the protein. When additional sequences from a plant and a fungus
were determined, additional introns were found. Five were in the same place in plants and
animals suggesting they existed in the common ancestor to plants and animals. The total
number of presumed introns was now 11, with different lineages losing different introns
over time. One of the new exons was big and did not fit well with the theory that compact
modules of protein were encoded in the exons. Walter Gilbert predicted that another
sequence would have an intron that would break this exon in two. This was found in a
mosquito.

The issue was still not settled, because more sequences were determined from another
insect, another fungus and C. elegans (a nematode). These identified seven new intron
positions, for a total of 21 introns and an average exon size of 11.2 codons. Twelve introns
occur in only one sequence, which would mean that every other lineage lost that intron.
The exon theory thus becomes very cumbersome. Additional sequences from more insects
showed that a close relative of the Culex mosquito (Aedes mosquitoes) had the intron, but
more distant relatives (Anopheles mosquitoes, flies and moths) did not (PNAS 92,
8503-8506 1995). This is
consistent with late insertion of the intron in an ancestor of the Culex and Aedes
mosquitoes. Since 19 species are missing this intron, at least 10 independent losses of this
intron would be required to fit the introns early model. Therefore, the introns early model
seems to be wrong.
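
The exon-size arithmetic behind this argument can be sketched as follows. This is an illustration only: it assumes that n introns split a coding region into n + 1 exons, and the coding length is back-calculated from the 21 introns and 11.2 codon average quoted above rather than taken from the TPI sequence itself. The function name is hypothetical.

```python
def average_exon_size(coding_codons, n_introns):
    """Mean exon length in codons when n_introns split the coding region."""
    return coding_codons / (n_introns + 1)

# Coding length implied by the figures in the notes (21 introns, 11.2 codons
# per exon). This is a back-calculation, not a measured gene length.
implied_codons = 11.2 * (21 + 1)

# Intron counts mentioned in the notes as more sequences were added.
for n_introns in (6, 11, 21):
    size = average_exon_size(implied_codons, n_introns)
    print(f"{n_introns} introns -> {n_introns + 1} exons, about {size:.1f} codons each")
```

Under these assumptions the average exon shrinks from roughly 35 codons with six introns to roughly 11 codons with 21, which is below the 15-20 codon pieces the exon theory started from.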



SIMILARITIES AND DIFFERENCES BETWEEN ARCHAEA AND EUKARYA


Archaea and eukarya share some distinctive features, including N-linked glycoproteins,
introns in their tRNAs and the absence of formylmethionine (see PNAS 92, 5761-5764 1995).
N-linked glycoproteins are made in eukaryotes in the endoplasmic reticulum and the Golgi.
They are initially formed using a dolichol carrier lipid intermediate. Their presence in
archaea suggests that dolichol is there and that some membrane bound system similar to the
N-linked biosynthetic machinery of the ER is also present. Bacteria use N-formylmethionine
at the beginning of all their proteins. There is a special tRNA for this amino acid. Eukarya
and archaea do not have formylmethionine. Some eukaryotic tRNA genes have introns. In
yeast, there are 262 tRNA genes, and about one third contain introns. Archaea also have
introns in some of their tRNAs. Methanococcus jannaschii has 37 tRNA genes, with
introns in a met and trp tRNA. The transcriptional apparatus of archaea is much more
eukaryote-like than bacteria-like. This may reflect similarities in how DNA is packaged in
the two domains. Methanococcus jannaschii has five histone genes, so the DNA may be
packaged in nucleosomes as in eukaryotes. This may require a more complex transcription
machinery to get at the DNA. Bacteria don't have histones and they have a much simpler
transcriptional apparatus.

Archaea and bacteria both have polycistronic operons, and some of these have the genes
arranged in a similar order in both lineages. This implies that the operons existed in the
common ancestor. Recently, there has been some evidence that C. elegans, a model
organism for genome sequencing, also has operons. This was unheard of in animals before
these C. elegans operons were described.



 

HOMEOBOX GENES AND MACROEVOLUTION

(see Molecular Biology of the Cell, chapter 21)

Humans, worms and flies don't look very similar and they do not go through the same
developmental stages. Yet the genes that control their body shape and organization are
related in sequence. They all share a common sequence called the homeobox. This 180
nucleotide sequence codes for 60 amino acids found in these proteins. The rest of the
proteins may be very different, but this 60 amino acid piece is crucial for their function.
The homeodomain is a helix turn helix DNA binding domain that recognizes a specific
DNA sequence. The homeodomain targets the remainder of the protein to regulate the gene
expression of any genes with the appropriate recognition sequence in their control regions.
There are at least 50 homeobox genes in Drosophila. They fall into two main divisions, the
complex superclass and the dispersed superclass. Those in the complex group are found in
clusters, while those in the dispersed group are solo genes.

One subset of these genes is called the homeotic selector genes. In Drosophila, there are 8
genes arranged in a series along 650,000 base pairs of DNA. This whole region is called
the HOM complex. There are two smaller subsets of these genes in the HOM complex,
the antennapedia complex (5 genes) and the bithorax complex (3 genes). Other insects
have these genes all in one complex, so it looks as though the HOM complex became split
in Drosophila.

Mutations in the 8 genes of the HOM complex cause large scale changes in flies. A
mutation in bithorax causes a fly to have an extra set of wings. Mutation in antennapedia
causes a leg to grow where an antenna should be. These genes are not master switches for
making wings or legs, but they specify position in the fly's body. The order of the genes
on the chromosome is the same as the order of segments in the fly's body where they are
expressed. The left most gene is expressed in the head, the right most gene is expressed in
the abdomen. When a gene is deleted or mutated, the segment where it is normally
expressed cannot tell where it is because its positional cue is gone, so it behaves like the
closest segment to it. That is why a bithorax mutation causes an extra set of wings. The
segments adjacent to the bithorax segment dictate what should be made.

An amazing fact is that these HOM genes have clear homologs in vertebrates. These are
called hox gene clusters. Mice have four hox gene clusters on four different chromosomes.
These are called HoxA, B, C and D. HoxB has all the same genes as HOM plus one more.
They are in exactly the same order. The other three clusters are missing some of the
HOM genes, but they have some extra homeobox genes not in the HOM cluster.

The HOM cluster seems to have arisen by gene duplication of a single homeobox gene long
ago. This cluster was then duplicated to give a total of four copies in the vertebrate lineage.
Some additional gene duplication and deletion resulted in the present day set of Hox genes
in mammals. These genes specify position in mouse embryos, just like they did in flies.
They seem to have a similar function to the HOM cluster in flies, except it is more
complicated in mammals because there are four clusters. Two sets of hox gene clusters are
expressed in limb buds in perpendicular directions. The gene products from one cluster are
expressed along a left to right axis in the limb bud (HoxD) and the other gene cluster is
expressed top to bottom in the same bud (HoxA). This creates a checkerboard pattern that
makes each position in the limb bud unique, like the elements of a mathematical array that
are described by x and y coordinates. If a single gradient in the fly can specify the
development of different symmetrical segments, like head, thorax and abdomen, then a
dual gradient in the limb bud can specify the development of asymmetry in the limbs,
things like the bones and muscles of the hand, the layout of nerves and blood vessels, what
is to be skin and fingernails.
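
The array analogy can be made concrete with a toy sketch. Everything here is illustrative: the grid size, the integer "levels" and the function name are assumptions, not measured expression data. The point is only that two perpendicular gradients give every cell its own pair of values.

```python
from itertools import product

def positional_codes(n_columns, n_rows):
    """Map each (x, y) cell of a toy limb bud to a pair of expression levels."""
    codes = {}
    for x, y in product(range(n_columns), range(n_rows)):
        # One cluster varies only along x, the other only along y.
        codes[(x, y)] = (f"HoxD level {x}", f"HoxA level {y}")
    return codes

codes = positional_codes(4, 3)
# Every cell gets a distinct pair, although each axis alone has few levels.
print(len(codes), len(set(codes.values())))
```

With one gradient alone, the 4 by 3 grid would have only four (or three) distinguishable positions; with both, all twelve cells are distinct.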

The HoxA cluster in mouse has 11 genes, whereas Drosophila has eight genes in the HOM
cluster, so HoxA has added three extra genes. Probably, if one looks back at simpler organisms there
will be some that have fewer homeotic genes in these clusters, or fewer clusters. The
Annual Review of Biochemistry 1994 has an article on homeodomain proteins (Vol. 63,
487-526). There, evidence is cited for one hox cluster in acorn worms (a hemichordate),
two hox clusters in amphioxus (a cephalochordate) and three (or 4) in lamprey (a primitive
vertebrate). It is tempting to extrapolate that gain of hox genes in a cluster increases the
complexity of an organism by allowing additional segments to be specified. Initially these
would be just like adjacent segments, but there would be opportunity to evolve into more
specialized functions. For example, if there are three sets of legs in insects, could another
set of legs be added just by duplicating a hox gene that specified a leg segment of the body?
What do the hox gene clusters of spiders, centipedes and millipedes look like? Are there
dozens of duplicated hox genes that specify many identical segments? This provides the
possibility of macroevolution. Duplication of hox genes, or whole hox gene clusters,
followed by deletion and mutation might alter a species very dramatically in a short time
period.

Another homeobox gene in Drosophila is eyeless. This gene appears to be a master switch
gene that turns on eye formation (see Science 267, 1766-1767 and 1788-1792 1995). If
eyeless is expressed in tissues where it normally would not be active, whole functional
eyes form. These may be on the ends of antennae, on the wings or on the legs. This gene
has homologs in mouse (small eye, Pax-6) and man (aniridia) that also affect the formation
of eyes. In fact, the mouse gene can substitute in Drosophila for the eyeless gene. This
means that eye formation is controlled by a gene that evolved before eyes evolved
(invertebrates and vertebrates diverged 600 million years ago). The common ancestor
apparently had a light sensitive tissue that later evolved into different types of eyes in
insects and vertebrates. Eyeless controlled the development of that light sensitive tissue
and it has continued in that role for at least 600 million years.

A recently discovered gene called Manx is not a homeobox gene, but it is a zinc-finger
transcription factor and it is another candidate for a master switch gene (see Science Nov.
15, 1996, p. 1205 and news section). The Manx gene is found in tunicates, a type of
primitive chordate. These organisms start life as a tadpole-like creature with a tail and
notochord. During maturation, they lose their tail and become sessile on the sea bottom.
Manx is the gene that controls the tail formation. If it is mutated, the tail never forms. This
was demonstrated by William Jeffery and Billie J. Swalla, who found two closely related
tunicates, one with a tail and one without. They bred the two and the hybrid had a small
tail, suggesting that a single gene was responsible and one functional copy could turn on
the pathway. The gene was identified and its expression was blocked by antisense RNA in
the hybrid embryo. When this was done, the tail did not form. They are now looking for
homologs in more complex vertebrates. Manx is similar to eyeless, in that it turns on a
whole developmental program to form a tail. Of course the hunt is on to see what gene
might control Manx and which genes lie downstream of Manx to effect tail development.

A second way to bring about macroevolution is polyploidy. Xenopus laevis has twice the
DNA of Xenopus tropicalis as the result of a genome duplication. This gives Xenopus laevis
a lot of DNA to experiment with and try out new functions for old duplicated genes. One
consequence of doubling the number of genes is an increase in size. Xenopus laevis is
much larger than Xenopus tropicalis, perhaps due to a gene dosage effect. Plants are often
tetraploid or hexaploid, again giving evolution a lot of material to work with.



MITOCHONDRIAL EVE

Mitochondrial DNA mutates at a rate that is about 17 times greater than nuclear DNA. This
is probably due to lack of effective repair mechanisms. Because this DNA changes so
rapidly, it can be used as a monitor of evolution on a time scale of a few hundred thousand
years rather than millions or billions of years. It is a fast molecular clock. In addition,
mitochondria are inherited maternally, so there is no genetic recombination to account for.
The line of descent is direct from mother to mother, because fathers do not contribute
mitochondria to the egg on fertilization. By comparing DNA from people from around the
world and especially in Africa, it is possible to build a tree showing the divergence of
humans over time. This tree can be rooted by using chimpanzee mitochondrial DNA as an
outgroup sequence, and the time of the last common ancestor can be estimated. This was
first done in 1987 (Nature 325, p. 31), but it was criticized for inadequate sampling of
populations and weak methods for making the tree. The process was repeated with DNA
sequence from 189 people (121 from Africa) using a hypervariable region of mitochondrial
DNA. (Science 253, 1503-1507 1991). The results were similar in both cases, though the
critics could not find as much to fault in the second paper.

The result was that the 14 deepest branches on the tree were all of African origin. This
implies that modern humans evolved in Africa. The time of the last common ancestor of all
human mitochondrial DNA types is 166,000 to 249,000 years ago, assuming that
chimpanzees and humans diverged 4-6 million years ago.
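
The way the date range follows from the calibration can be sketched as a simple proportional scaling. This assumes a strict molecular clock, in which sequence divergence accumulates linearly with time. The ratio below is back-calculated from the two ranges quoted above (166,000 / 4,000,000 = 249,000 / 6,000,000 = 0.0415); it is not a value taken from the paper, and the function name is hypothetical.

```python
def scaled_age(calibration_age_years, divergence_ratio):
    """Scale a calibration date by a ratio of sequence divergences."""
    return calibration_age_years * divergence_ratio

# Ratio implied by the figures in the notes, not a published measurement.
RATIO = 166_000 / 4_000_000

for split in (4_000_000, 6_000_000):
    age = scaled_age(split, RATIO)
    print(f"split {split:,} years ago -> common ancestor about {age:,.0f} years ago")
```

Because the estimate is a fixed fraction of the calibration date, any uncertainty in the human-chimpanzee split carries straight through to the age of the mitochondrial ancestor.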

One inference from this conclusion is that there was one woman whose mitochondria gave
rise to all present human mitochondrial genomes. This is the concept of a mitochondrial
Eve. This idea was immediately misunderstood to mean that there was only one woman
alive at this time. The result does not suggest that. Such a finding would create a
tremendous genetic bottleneck in human history. What the evidence does say is that the
present population of human mitochondrial DNA did have one founder mother. She was
the lucky one whose mitochondria have survived 200,000 years. All her contemporaries
have had their lineages fizzle, by not having children, or not having female children to be
more specific. There could have been a large population of humans 200,000 years ago. In
fact, the next thing we will discuss is exactly how big this population was.
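
A small simulation shows how one maternal lineage can come to replace all the others without the population ever shrinking to a single woman. This is a minimal sketch under loud assumptions: a constant number of women, non-overlapping generations, and each daughter drawing her mother at random (a haploid Wright-Fisher style model). The population size of 50 and the function name are arbitrary choices for illustration.

```python
import random

def generations_to_single_founder(n_women, seed=0, max_generations=100_000):
    """Follow maternal lineages until all women trace back to one founder.

    Returns (generations elapsed, founder index), or (None, None) if the
    generation cap is reached first.
    """
    rng = random.Random(seed)
    # Label each founding woman's mitochondrial lineage with her index.
    lineages = list(range(n_women))
    for generation in range(1, max_generations + 1):
        # Each woman in the new generation inherits her mother's lineage;
        # mothers are drawn at random, so some leave no daughters at all.
        lineages = [rng.choice(lineages) for _ in range(n_women)]
        if len(set(lineages)) == 1:
            return generation, lineages[0]
    return None, None

generation, founder = generations_to_single_founder(50)
print(f"all 50 women trace to founder {founder} after {generation} generations")
```

The population holds 50 women in every generation, yet only one of the original 50 ends up as the source of all surviving mitochondria.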



POLYMORPHISMS AND THE SIZE OF HUMAN POPULATIONS.

Polymorphisms are sequence differences within a population in which each variant is carried
by more than 1% of the population. The HLA locus in humans corresponds to the MHC locus in mice and
other vertebrates. This is a highly polymorphic region with about 100 genes. One of these
genes is the DRB1 gene. 59 alleles exist in humans and 60 non-human primate sequences
have been determined. A tree of these sequences gives an estimate of the time for a last
common ancestor of about 60 million years. (see Science 270, 1930-1936 1995) It is
important to point out here, that these 59 alleles are different sequence variants of the same
gene in humans. These do not represent different genes.

To carry that many polymorphisms in a population, the absolute minimum number of
individuals would be 30, one diploid person for every two alleles, and they would all have
to be heterozygotes, each carrying alleles found in no one else. This situation is very unlikely. Six
million years ago, there were 32 lineages of the DRB1 gene, with a minimum population to
carry this number of alleles being 16. Again the actual population would have to be much
greater than that. There is a theory dealing with polymorphisms and population size. This
is called coalescence theory. If the time of coalescence is known, and the number of genes
is known, then this theory will predict what the population size must be for this to happen.
The results of simulations show that for 60 genes to persist for 1.7 million years (time of
humans as Homo sapiens) the population size would have to be about 100,000. If it were
smaller, many of the alleles would be lost over that many generations.
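
The lower bounds quoted above (30 people for 59 alleles, 16 for 32 lineages) follow from each diploid person carrying at most two alleles of one gene. A minimal sketch, with a hypothetical function name:

```python
import math

def minimum_carriers(n_alleles, ploidy=2):
    """Fewest individuals that can carry n_alleles distinct alleles of one gene."""
    return math.ceil(n_alleles / ploidy)

print(minimum_carriers(59))  # 59 human DRB1 alleles
print(minimum_carriers(32))  # 32 lineages six million years ago
```

This is only the floor. It says nothing about how large a population must be to keep the alleles from drifting out over many generations, which is what the coalescence simulations address.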

The numbers are not incompatible with a mitochondrial Eve hypothesis, because
mitochondrial Eve is only considering the inheritance of one small piece of DNA,
equivalent to a single gene. It must be true that a single woman of about 200,000 years ago
is the mother of all of our mitochondrial DNA, but it is not true that she is the mother of all
our other genes.



Y CHROMOSOME ADAM

Males do not have to be left out of this analysis. Portions of the Y chromosome are unique
to males and are inherited paternally. DNA outside the pseudoautosomal region
cannot recombine with other alleles, so it is closely analogous to mitochondrial DNA,
except that it evolves at a much slower rate. To do these same types of calculations with Y
chromosome DNA, a 729bp fragment of the ZFY gene has been sequenced from 38 men.
No differences were found. The number of samples may still be too small, or perhaps a
more variable region should be used, like an intron. Even with no differences detected, the
divergence of the ZFY gene between humans and great apes can give a rate that is usable
with coalescence theory. In this case the number of alleles is two, one in humans and one
in apes. The estimate for the time of the last human common ancestor ZFY gene from these
assumptions is 270,000 years.



HOW MANY GENES DOES IT TAKE TO MAKE A LIVING CELL?

The sequences of several genomes are now available.


Mycoplasma genitalium has the smallest known genome outside the viruses. It codes for 468
proteins, which have been called the minimal set for life. This is not strictly true, since there
are probably some genes in this set that are specific for M. genitalium and won't be found
in other unrelated genomes. Mushegian and Koonin compared the M. genitalium genome
with the H. influenzae genome (1703 protein coding genes) to identify those genes common
to both (see PNAS 93, 10268-10273 1996). These are gram positive and gram negative
organisms, respectively, so they diverged about 1.5 billion years ago. Any homologous genes are
probably essential. They found 240 genes. Some essential genes were missing, because
the same function in some pathways was performed by different non-orthologous genes in
the two organisms, so these missing genes had to be accounted for. That added back 22
genes. Then they looked for redundant functions and parasite specific genes, since both
organisms are parasites, and subtracted 6 more genes to get a total of 256 protein coding
genes as the minimal set.
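
The bookkeeping in this paragraph can be checked in a few lines. The numbers are those given above; the variable names are only labels chosen for this sketch.

```python
shared_genes = 240            # genes common to both genomes
nonorthologous_added = 22     # same function carried out by unrelated genes
redundant_or_parasite = 6     # removed as redundant or parasite specific

minimal_set = shared_genes + nonorthologous_added - redundant_or_parasite
print(minimal_set)

# Share of the 468 M. genitalium proteins that falls in the minimal set.
print(f"{minimal_set / 468:.0%}")
```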

The authors point out that parasites import metabolic intermediates, but they do not import
proteins, so the minimum number of genes illustrates what must be done after all possible
intermediates are imported from a rich environment.



Last updated: 15 December 1996.

created by : <opperdoes_at_bchm.ucl.ac.be>