Molecular Evolution

Last version Nov. 19, 1996 4PM


David R. Nelson

University of Tennessee, Memphis


INTRODUCTION

Evolution is the concept that all life on the planet is derived from a common ancestor.
The biochemistry of living organisms is then a collection of accumulated successful
strategies from billions of years of experimentation in life. The best estimate for
how old life is on earth has just been revised to 3.85 billion years (Nature Nov. 7, 1996).
This is based on carbon isotope ratios in some of the oldest sedimentary rocks known on
earth, from the Isua rock belt in Greenland. These rocks do not contain visible
microfossils, but living cells preferentially incorporate the lighter isotope of carbon, C12 as
opposed to C13 or C14. Material that has originated from living things has a ratio of these
isotopes of carbon that reflects depletion of the heavier isotopes. Carbon from non-
biological organic material has a different ratio. The carbon isotope ratios seen in these
3.85 billion year old rocks look as though they derived from living cells. Similar depletion
of heavy carbon isotopes has been reported (but not published) for a Martian meteorite
thought to contain microfossils (see Science 274, Nov. 8 1996 , p. 918).
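
The depletion described above is usually expressed as a delta value, in parts per thousand,
relative to a reference standard. The short Python sketch below shows only that arithmetic.
The function name, the standard ratio and the sample are illustrative assumptions, not
values taken from the Isua study.

```python
def delta13c(r_sample, r_standard):
    """Return delta-13C in per mil for a 13C/12C ratio relative to a standard."""
    return (r_sample / r_standard - 1.0) * 1000.0

# Assumed 13C/12C ratio of the reference standard (a commonly quoted value,
# used here only for illustration).
R_STANDARD = 0.0112372

# Hypothetical sample whose 13C/12C ratio is 2.5 percent below the standard.
r_sample = R_STANDARD * 0.975

print(round(delta13c(r_sample, R_STANDARD), 1))  # -25.0
```

A negative value means the sample is depleted in the heavier isotope relative to the
standard, which is the direction expected for carbon that has passed through living cells.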

Much of the information that follows is taken from Vital Dust, by Christian de Duve.

What does that mean for the origin of life? The planet formed about 4.5 billion years ago
and it is thought that the surface was either molten or under continuous bombardment
from space early in its history until about 4 billion years ago. Meteor impacts and volcanic
activity would have made the surface unfit for life. The probable existence of life at 3.85
billion years means that life arose on the planet almost as soon as it was possible for it to
exist on the surface. Therefore, the origin of life on earth was very rapid. The earliest
microfossil evidence of cells that resemble cyanobacteria comes from the early Archaean Apex
chert of Western Australia, dated at about 3.5 billion years (Schopf, Science 260, 640-646,
1993). Other evidence of ancient life on earth comes from formations called stromatolites.
These are columns of fossilized material that look just like modern structures seen in
various sites around the world, for example Shark Bay, Australia. These are formed by
bacterial colonies that exist in mats, with phototrophs on the top and heterotrophs lower
down. The colonies grow into columnar structures, accumulating debris and eventually
fossilizing. The structures of 3.5 billion year old stromatolites look nearly identical to
modern stromatolites that are still growing. A recent article (Nature, October 3, 1996,
p. 385) proposed that these structures may have formed from non-biological processes
involving some fractal growth patterns. However, the microfossil evidence and modern
day stromatolite structure tend to support the biological origin of stromatolites.

We will assume from this evidence that life on earth dates back 3.85 billion years. The
microfossils in stromatolites that appear similar to modern cyanobacteria suggest that life
evolved to a form similar to today's bacteria very early and that the bacteria at least have
changed little in the intervening 3.5 billion years. All present day life is based on a nucleic
acid informational molecule, either DNA or RNA that encodes information needed to
make a living cell. The information is coded in a triplet code and though there are some
examples of slight variations in this code, no radical departures from it exist. This
"universal" code is interpreted as proteins by a complex machinery called the ribosome
that is also shared in common among all living things. These main features of information
storage and retrieval are conserved and provide convincing evidence that all life on earth
has one origin and shares a common ancestor.
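
To make the idea of a triplet code concrete, here is a minimal Python sketch. It counts the
possible triplets and translates a short, made-up RNA string using only four entries from
the standard code. The helper name and the example sequence are hypothetical.

```python
from itertools import product

BASES = "UCAG"

# Every possible triplet of the four RNA bases.
ALL_CODONS = ["".join(t) for t in product(BASES, repeat=3)]

# Only four entries of the standard genetic code, enough for the toy example.
PARTIAL_CODE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "Stop"}

def translate(rna):
    """Read an RNA string three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        residue = PARTIAL_CODE[rna[i:i + 3]]
        if residue == "Stop":
            break
        protein.append(residue)
    return protein

print(len(ALL_CODONS))            # 64
print(translate("AUGUUUGGCUAA"))  # ['Met', 'Phe', 'Gly']
```

Sixty-four triplets are more than enough to specify 20 amino acids plus stop signals, which
is why most amino acids are encoded by more than one codon.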

Since life began, it has been changing in permissible directions. Physical constraints on the
chemistry of life including the properties of water, the nature of carbon and other key
aspects of biology have allowed variations on the original theme, but only within certain
boundaries. However, 3.85 billion years is a long time and many variations have been
tried out and many have been successful. Because the informational molecule has to
carry the blueprint of a cell through time, it is the information itself that
has changed. This molecule has been copied billions of times, but not without some errors
creeping in. Today, we can look back in time by comparing the sequences of the
nucleotides and the translated sequences of the proteins from present day organisms. By
quantitating the differences between the sequences and by making some modest
assumptions about rates of change in the sequences, we can estimate when different
organisms diverged from one another. Very similar organisms have very similar
sequences, and more distant relatives have more differences in their code. This is the
concept of the molecular clock. With this very simple idea, one can use sequences to build
trees with branches representing different species. The relationships between organisms
can be graphed in this way. If enough organisms are included and the most conserved
sequences are used, a tree of life can be constructed. This is a genealogy of present day
organisms, and it is very interesting.
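
As a rough illustration of the molecular clock, the Python sketch below counts the
differences between two aligned sequences, applies the Jukes-Cantor correction for multiple
changes at the same site, and converts the result to a divergence time. The two sequences
and the substitution rate are invented for the example, and Jukes-Cantor is only one of
several possible corrections.

```python
import math

def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)

def jukes_cantor(p):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3), in substitutions per site."""
    if p >= 0.75:
        raise ValueError("sequences are too different for this correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def divergence_time(d, rate):
    """Years since the split, assuming both lineages change at `rate` per site per year."""
    return d / (2.0 * rate)

# Hypothetical aligned sequences that differ at 2 of 10 positions.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAA"
p = p_distance(seq_a, seq_b)
d = jukes_cantor(p)

# Assumed rate of 1e-9 substitutions per site per year (illustrative only).
print(p, round(d, 3), divergence_time(d, 1e-9))
```

The factor of two in the last function reflects the fact that changes accumulate along
both lineages after they separate.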

In such a tree, the branches always diverge, they do not merge back together, because
species do not merge, except in very rare events. The place where two branches come
together is the point in time when they were the same species. Farther and farther back on
the tree are divergences that are deeper and more ancient. If we go back far enough, the
most distant branches meet at the common ancestor, the single celled organism that
gave rise to all life on the planet.
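
One simple way to turn pairwise differences into a diverging tree is the UPGMA clustering
method, which repeatedly joins the two closest groups. The Python sketch below is a minimal
version, assuming clock-like rates of change. The organisms and distances are hypothetical,
and this is not necessarily the method used to build the trees discussed in these notes.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA and return the tree as nested tuples."""
    # Each cluster id maps to (subtree, number of taxa in the subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)              # closest pair of clusters
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        new_d = {}
        for c in clusters:
            dac = d[tuple(sorted((a, c)))]
            dbc = d[tuple(sorted((b, c)))]
            # Size-weighted average distance to the merged cluster.
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]

# Hypothetical distances (substitutions per site) between four organisms.
names = ["human", "chimp", "mouse", "yeast"]
dist = [
    [0.00, 0.02, 0.20, 0.60],
    [0.02, 0.00, 0.22, 0.62],
    [0.20, 0.22, 0.00, 0.58],
    [0.60, 0.62, 0.58, 0.00],
]
print(upgma(names, dist))  # ('yeast', ('mouse', ('human', 'chimp')))
```

The most similar pair is joined first and the most distant organism joins last, so the
deepest split in the nested result corresponds to the oldest divergence.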

To build a tree like this is not trivial. It takes some care to pick the right sequences to use,
because not all sequences are appropriate for this job. Even in a single gene, not all of the
gene is useful for this task. Frequently, only the most conserved parts of a gene are
included in building trees. Ideally, the most ancient features common to all life would be
good candidates to compare. This has been done most often with ribosomal RNA, since
all life has this molecule and it is so central to life that it has to be highly conserved. This
is the basis of the rRNA database project. Here more than 3000 rRNA sequences have
been used to make trees. These trees include every kind of life possible to get a pretty
comprehensive tree of life. The results look like this. Here representatives from the major
groups of organisms are shown. Every other organism will fit on this tree very close to an
existing branch, so it does little good to use all the sequences, because it just clutters up
the picture. It is clear from this tree that there are three main divisions in life. These have
been called domains. In a hierarchy of life, domains are higher than kingdoms. The three
domains are bacteria, archaea and eukarya.

Bacteria and archaea are both prokaryotes, without a nucleus. They are as different from
each other as each is from the eukarya, so even though they are prokaryotes, it is incorrect to
lump them together. In fact, the most accepted version of this tree shows the archaea to
be more closely related to eukarya, but this is a debated issue.

The first cell certainly existed before the common ancestor. We do not know how long a
time elapsed before this major split in life occurred. We can compare the three domains
and make some guesses about what the common ancestor was like. Features that are
present in all three domains were probably present in the common ancestor. It is hard to
go beyond that point, except in very general terms.



THE NATURE OF THE COMMON ANCESTOR

The common ancestor was probably a prokaryote, with a cell wall and a lipid bilayer,
incorporating membrane proteins within it. This organism probably used electron
transport and proton pumping to make a proton gradient for ATP synthesis. This strategy
is too common to have arisen later. Along with this, there must have been a minimum set
of membrane transport proteins that could move ions and substrates into the cell (and out
of the cell) without leaking too many protons. This cell must have had the ability to make
many coenzymes and other special molecules like heme, flavins and iron sulfur centers.
The informational molecule was probably DNA, with genes being transcribed to RNA and
then to protein on ribosomes. A DNA polymerase and an RNA polymerase were present.
Oxygen was not available, so the electron acceptor at the end of the electron transport
chain may have been a sulfur compound or Fe3+. This cell probably had the protein
export machinery needed to construct a cell wall outside the plasma membrane.
The cell could make its own lipids, though it is not clear whether it made ether lipids or
ester lipids. Certainly it had some important biochemical pathways for purines,
pyrimidines and probably all 20 amino acids. This last point is supported by the fact that
some enzymes that are clearly present in all three domains can be aligned with some
certainty. The aligned positions of these enzymes include highly conserved positions that
represent all 20 amino acids. Therefore, all 20 were apparently present in the last common
ancestor. It is not clear if it was photosynthetic or not, but the similarities between ancient
stromatolites and modern stromatolites, that have photosynthetic cyanobacteria, suggest
photosynthesis was probably present in the 3.5 billion year old stromatolites. If the
common ancestor was not photosynthetic, then the split between bacteria and archaea had
to predate 3.5 billion years ago. This is the idea favored by Christian de Duve in his book
Vital Dust. He argues that no archaea have chlorophyll, and relatively few bacteria
are photosynthetic. Therefore, chlorophyll and photosynthesis must have developed after
archaea and bacteria split, but before 3.5 billion years ago, when cyanobacteria were
probably photosynthetic. The most ancient lineages of both archaea and bacteria are
mostly thermophilic. This heat loving characteristic may have been present in the common
ancestor. We cannot say if it had introns or not.



THE THREE MAIN BRANCHES OF LIFE. HOW DO THEY CLUSTER?

With three branches on a tree, there are three ways to cluster the sequences. Archaea can
cluster with eukarya, or archaea can cluster with bacteria, or bacteria can cluster with
eukarya. What is the correct tree, and how do we know? This is a highly controversial
issue that is being debated today. See Science of Nov. 15, 1996 in the letters section to
see this debate. The Tree of Life web page also has a discussion on this issue. Trees
made from different protein sequences support different clusterings. Also, as we shall see,
the eukarya may be derived from a fusion between an archaeon and a gram negative
bacterium. If the origin of the eukaryotes is really hybrid, then this issue is even more
complex than it first appears.
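
The three possible clusterings can be listed mechanically: pick any two domains as the
closest pair and the third is left as the early-branching lineage. The Python sketch below
does only that enumeration. The function name is hypothetical.

```python
from itertools import combinations

DOMAINS = ["archaea", "bacteria", "eukarya"]

def rooted_topologies(taxa):
    """Return every ((sister, sister), outlier) arrangement of three taxa."""
    trees = []
    for pair in combinations(taxa, 2):
        outlier = next(t for t in taxa if t not in pair)
        trees.append((pair, outlier))
    return trees

for pair, outlier in rooted_topologies(DOMAINS):
    print(f"({pair[0]}, {pair[1]}) cluster together; {outlier} branches first")
```

The hybrid origin of the eukarya discussed above is not one of these three trees, since a
fusion of lineages cannot be drawn as a purely diverging tree.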

Let me say that there is a true historical answer to this problem. Either the bacteria and
the archaea diverged first followed by eukarya branching off from the archaeal line (most
common view) or they did not. The problem cannot be solved by looking at individual
protein, RNA or DNA sequences. As I have said, all three possible trees can be found this
way. What needs to be done, and what is probably already being done is to compare
whole genomes, or the most useful parts of whole genomes, the highly conserved genes.
This may represent a hundred or two hundred genes that all living things have to have to
be alive. With this much data, the results should be clear enough to pick between the
three possibilities, or the hybrid eukarya option. As of Nov. 1996, whole genomes
(see the page with the ongoing genome projects) from each domain have been sequenced,
and preliminary identity of many of their genes has been made. The raw data is available
for this comparison, and it is almost certainly being done now.

From the bacteria, Haemophilus influenzae and Mycoplasma genitalium have been
sequenced. From eukarya, yeast (Saccharomyces cerevisiae) is done and from archaea,
Methanococcus jannaschii has just been done. The majority of M. jannaschii transcription
and translation genes look like eukaryotic genes. However, the metabolic enzymes seem to
be more like bacterial enzymes. See the paper on the genome (Science 273, 1043-1045
and 1058-1073, 1996) and consult the M. jannaschii genome database. This confuses the
issue, since a single archaeon looks like a hybrid between eukarya and bacteria. If
eukaryotes evolved from an archaeon, then there might be no need to invoke a hybrid
eukaryote made from a fusion of genomes. But this does not explain how M. jannaschii came
to look so hybrid.

[Figure: Tree of life]

Within the archaea, there are two main divisions called Euryarchaeota and Crenarchaeota.
Euryarchaeota is a physiologically variable group including halophiles, thermophiles and
methanogens, whereas the Crenarchaeota are thought to be a more homogeneous group,
consisting exclusively of sulphur-dependent extreme thermophiles. Recent sampling and PCR
amplification have led to the detection of crenarchaeal small subunit rRNA genes in open
marine waters and in terrestrial lake and marsh sediments, indicating that such organisms
are globally distributed and have an important role in the biosphere (see Hershberger et
al. in the 5 December 1996 issue of Nature). M. jannaschii is from the Euryarchaeota. Two
other organisms, Pyrobaculum aerophilum and Sulfolobus solfataricus, are being sequenced
now and they are from the Crenarchaeota. These will be finished in a few months, if they
are not done already. By this time next year, the question of how the main domains of life
diverged should be answered with some confidence.



 

THE ROOT OF THE UNIVERSAL TREE OF LIFE


The root of a phylogenetic tree is the location of the last common ancestor shared by all
members of the tree. The root of a tree cannot be determined without an outgroup, a
sequence that is related to the sequences of interest, but not a direct member of that group.
For example, if you wanted to root the tree of mammalian ADP/ATP carriers, you could
include a fungal ADP/ATP carrier as an outgroup. The point on the tree where the
outgroup joins the other sequences is the root.

When there are only three groups being considered, like archaea, bacteria and eukarya, you
cannot root the tree, because there is no outgroup. Clever molecular evolutionists figured
out a way to root this tree by making a tree with duplicated genes that are very old and are
found in all the domains of life. This means that the duplication preceded the divergence of
these three domains. Two duplicated proteins that were used for this are the alpha and beta
subunits of the vacuolar V-type and F1FO ATPases. This did not work too well, because
there are many different V-type and F1FO ATPases, and there is some evidence that there
might have been lateral gene transfer. Another protein set that is not open to this criticism
is the pair of elongation factors EF-Tu and EF-G. The EF-G branch on the tree serves to root the
EF-Tu branch and vice versa. This was reported in the July 96 PNAS vol 93, 7749-7754.

This analysis strongly supports the bacteria/archaea split at the root of the tree. However,
the frequently drawn picture, in which eukaryotes branch off as a sister group to all of the
archaea, is not what is seen. Instead, eukaryotes branch from within the archaea. They appear
to cluster with the crenarchaeota.
The authors caution that more proteins need to be analyzed in this way and that the data are
not absolutely convincing. They also point out that other results have been seen with glutamine
synthetase, glutamate dehydrogenase and Hsp70 sequences.



GENOME FUSION TO MAKE EUKARYOTES?

What is the evidence for a bacterial-archaeal fusion to make eukaryotes? First, eukaryotes
have a nucleus. Neither archaea nor bacteria have a nucleus, and it is very hard to imagine
how such a structure would evolve. Fusion of two cells would offer an explanation of
how a nucleus could be formed. If a gram negative bacterial cell engulfed an archaeal cell, it
could wrap it in its outer membrane to form a structure that would be like a nucleus. If the
archaeal cell lost its membrane and used the host membrane instead, this would form the
nuclear envelope and the endoplasmic reticulum (see Gupta and Golding, May 96 TIBS,
Fig. 3).

In Feb. 1996 Lynn Margulis published a paper in PNAS suggesting eukaryotes were
formed by an endosymbiosis between prokaryotes. She earlier had promoted the idea
that mitochondria and chloroplasts were endosymbiotic proteobacteria and cyanobacteria,
respectively, and she has been proven correct in that idea. So the idea deserves a fair
hearing. Margulis believes that many genomes have contributed to eukaryotes. Plants, which
could have been formed from an endosymbiosis between fungi and algae, might have seven
different genomes mixed together. In May 1996, Radhey S. Gupta and G. Brian Golding
published a discussion in TIBS on the origin of the eukaryotic cell and also called it an
endosymbiotic event. They gave the following evidence in their favor.

Hsp70 proteins (heat shock proteins, the most conserved proteins found in all three
domains) have a unique insert seen at the same site in sequences from eukaryotes and
gram negative bacteria. This insert is missing in gram positive bacteria and archaea. So, if
eukaryotes evolved from archaea, how can the insert be explained?

Gupta and Golding claim that of 24 proteins they examined, 7 supported eukaryotes
clustering with gram negative bacteria. Nine supported eukaryotes clustering with
archaea. Eight were not clear. This supports the idea of a chimeric origin for eukaryotes.

The results of Gupta and Golding were strongly criticized by Roger and Brown in the Oct.
96 TIBS. The issue is one of small numbers. Many more sequences need to be compared.
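
To get a feel for why small numbers are a problem, one can ask how surprising a 7 to 9 split
among the 16 informative proteins would be if each protein were equally likely to favor
either clustering. The Python sketch below computes an exact two-sided binomial probability
under that assumption. The coin-flip model is an illustrative simplification introduced
here, not the argument made by Roger and Brown.

```python
from math import comb

def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k successes in n trials with probability 1/2."""
    probs = [comb(n, i) / 2 ** n for i in range(n + 1)]
    observed = probs[k]
    # Add up every outcome that is no more likely than the one observed.
    return sum(p for p in probs if p <= observed + 1e-12)

# 7 proteins favored gram negative bacteria, 9 favored archaea (16 informative in total).
print(round(two_sided_binomial_p(7, 16), 2))  # 0.8
```

A split this even is entirely ordinary under the coin-flip model, so 16 proteins cannot by
themselves distinguish a chimeric origin from noise in the individual trees.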



THE RNA WORLD

Several years ago (Cell 31, 147-157, 1982) Tom Cech discovered that RNA could splice
introns out of itself (group I and group II introns are self splicing) without any protein being
present. This meant that RNA had enzymatic activity. This changed our concept of the
origin of life, by saying that proteins were not needed early in the evolution of life. A
complete DNA to RNA to protein apparatus was not needed. DNA was not even needed.
Everything could be done by RNA. The concept of life without DNA or protein is the
RNA world.

What are some present day relics of the RNA world? There are several key biochemical
processes that depend on RNA components. These are listed below.



SOME FEATURES THAT ARE NOT CONSERVED AMONG THE THREE DOMAINS

One feature that is not conserved in all three domains is ribonucleotide reductase. Each
domain has a very different enzyme that catalyzes this important reaction. The reaction is
the removal of the 2' OH group from ribonucleotides. The absence of a conserved protein
for this function suggests that this was being catalyzed by a ribozyme in the last
common ancestor. This ribozyme has since been replaced, but independently in all three
lineages.

The lipids of modern archaea are ether linked lipids, while bacteria and eukarya have ester
linked lipids. The archaea with the highest heat tolerance have ether linked lipids that are
covalently coupled across the bilayer. This adds to their membrane stability at high
temperatures. Christian de Duve suggests that the common ancestor was an ether linked
species that had to switch to ester linkages to survive a drop in temperature. The present
day archaea have some members that can grow at 110 degrees C, but the most thermophilic
bacteria can only stand about 80 degrees C. He suggests this is related to lipid composition.
If the common model of branching is correct, then the transition to ester linked lipids had to
occur twice, once in the bacteria and again in the eukarya. If the chimeric model of eukarya
is correct, the gram negative bacteria that fused with archaea could have brought the ester
linked lipid biosynthetic genes along.

Murein is found only in bacteria, not archaea. It is a cell wall component made of sugars
and D- and L-amino acids. The presence of both D- and L-amino acids suggests an ancient
origin for this material. Archaea may have lost the ability to make murein; however, it is
possible that the synthesis of murein evolved after bacteria and archaea split.

Cholesterol is a eukaryotic lipid. Not all eukaryotes make it. Yeast make ergosterol instead
and plants make cycloartenol derivatives, though palms do make cholesterol. There are
reports that three bacteria can make partially demethylated lanosterol products, on the way
to cholesterol. But, for the most part cholesterol biosynthesis seems to be a eukaryotic
phenomenon.



DEVELOPMENT OF PHOTOSYNTHESIS AND THE OXYGEN CRISIS

Life on the early earth was anaerobic. In the history of photosynthesis, photosystem I must
have developed first. PSI cannot split water to make O2, so the first source of electrons for
this pathway had to be mineral compounds. As we saw in the lecture on photosynthesis,
purple bacteria recycle their electrons through a bc1 complex and generate a proton gradient
that way, so this is also a possibility. Eventually, PSII evolved from PSI and the ability to
split water and form O2 changed the planet. O2 is the source of many toxic oxygen
compounds such as superoxide anion, hydrogen peroxide and hydroxyl radical. Cells had
to protect themselves from these oxygen byproducts of photosynthesis. Of course, the
cells that produced these compounds had to protect themselves first. This required the
evolution of free radical scavengers like vitamins C and E, and enzymes like superoxide
dismutase and catalase.

Oxygen released to the environment by photosynthesizing cyanobacteria would react with
any oxidizable substance. The early oceans were believed to contain a lot of Fe2+ that
could react with oxygen to form rust. There is evidence from 3.75 billion year old
sedimentary deposits called banded-iron formations, that Fe2+ served as an oxygen sink.
These deposits formed from 3.75 billion years to about 1.7 billion years ago. As these deposits
began to lessen, oxygen began to appear in the atmosphere, starting about 2 billion years
ago. It has remained pretty constant since 1.5 billion years ago. The inference is that iron
in the oceans served as an oxygen trap for almost 2 billion years until it was all oxidized.
As the iron became depleted, oxygen concentrations in the atmosphere could rise.

Organisms that were not O2 producers either had to hide from O2 or adapt. There are still
anaerobes that hide from the toxic effects of O2. The other organisms adapted and
developed protections from oxygen. These organisms eventually learned how to exploit
O2 as the ultimate electron acceptor of their electron transport chains and gained a great
advantage in the amount of energy that could be recovered in the form of ATP.

The presence of oxygen also opened up the possibilities of oxygen chemistry that were not
available before. Note that none of the amino acid biosynthesis pathways requires
molecular oxygen, but many of the breakdown pathways do. You should also be aware
that cholesterol biosynthesis requires oxygen to remove a 14 alpha methyl group from the
sterol ring structure. This is needed to make cholesterol planar, which is an important
feature of this critical eukaryotic lipid. Cholesterol cannot be made without oxygen and the
cytochrome P450 CYP51 (lanosterol 14 alpha demethylase).


Last updated: 15 December 1996.


created by : <opperdoes_at_bchm.ucl.ac.be>