What are orthologous and what are paralogous genes?

The original definition is by Walter Fitch (1970, Systematic Zoology 19:99-113):

"Where the homology is the result of gene duplication so that both copies have descended side by side during the history of an organism, (for example, alpha and beta hemoglobin) the genes should be called paralogous (para = in parallel). Where the homology is the result of speciation so that the history of the gene reflects the history of the species (for example alpha hemoglobin in man and mouse) the genes should be called orthologous (ortho = exact)."

This is also well explained in the book "Fundamentals of Molecular Evolution" by Li & Graur 1991, Ed. Sinauer Associates, Inc., Sunderland, Mass., USA.

What terms should be used when a speciation event is followed by duplication events in both lineages? For example:

                          _________ Rat_gene_1
                Rat      |
                _______ X
               |         |_________ Rat_gene_2
           ---( )
               |          _________ Mouse_gene_1
               | Mouse   |
               |_______ X
                         |_________ Mouse_gene_2


where ( ) indicates speciation and X indicates gene duplication. Here speciation comes before duplication.

According to Fitch's definition, Mouse_gene_1 and Mouse_gene_2 are paralogous, as are Rat_gene_1 and Rat_gene_2.

But Rat_gene_1 is orthologous both to Mouse_gene_1 and to Mouse_gene_2, since Rat_gene_1 and the ancestor of the two mouse genes diverged as the result of a speciation event. By the same reasoning, Rat_gene_2 is orthologous to both mouse genes.

Hence, it seems acceptable to say that the mouse gene family (that includes Mouse_gene_1 and Mouse_gene_2) is orthologous to the rat gene family (Rat_gene_1 and Rat_gene_2).
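The reasoning above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from Fitch or from any phylogenetics package: the nested-tuple encoding of the tree and the function names (leaves, lca_event, relationship) are assumptions made for this example. It applies Fitch's rule by looking at the event at which two genes last shared a common ancestor.

```python
from itertools import combinations

# Hypothetical encoding of the rat/mouse tree above: an internal node is
# (event, left, right); a leaf is simply a gene name.
TREE = ("speciation",
        ("duplication", "Rat_gene_1", "Rat_gene_2"),
        ("duplication", "Mouse_gene_1", "Mouse_gene_2"))


def leaves(node):
    """Return the set of gene names found below a node."""
    if isinstance(node, str):
        return {node}
    _, left, right = node
    return leaves(left) | leaves(right)


def lca_event(node, a, b):
    """Return the event at the most recent common ancestor of genes a and b."""
    event, left, right = node
    for child in (left, right):
        # Descend while a single internal child still contains both genes.
        if not isinstance(child, str) and {a, b} <= leaves(child):
            return lca_event(child, a, b)
    return event


def relationship(tree, a, b):
    """Fitch's rule: speciation gives orthologs, duplication gives paralogs."""
    return "orthologous" if lca_event(tree, a, b) == "speciation" else "paralogous"


for a, b in combinations(sorted(leaves(TREE)), 2):
    print(f"{a:13s} {b:13s} {relationship(TREE, a, b)}")
```

With this encoding, the two genes within each species come out as paralogous, and every rat/mouse pair comes out as orthologous, in agreement with the text.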

Note that many molecular biologists confuse "orthology" and "functional equivalence". For example Koonin et al. (Trends in Genetics 1996 12:334-336) wrote:

"By definition, orthologs are genes that are related by vertical descent from a common ancestor and encode proteins with the same function in different species. By contrast, paralogs are homologous genes that have evolved by duplication and code for protein with similar, but not identical functions."

Does Rat_gene_1 have the same function as Mouse_gene_1 or as Mouse_gene_2?

Indeed, the phylogenetic analysis allows one to determine orthology relationships but not functional equivalence.

In fact, whereas it is likely that two orthologs have similar function, these functions are not necessarily "identical".

Thus it is important not to confuse "orthology" with "functional equivalence". Now that the importance of comparative genomics is well recognized, it is essential to avoid misunderstandings!

The case of lactate dehydrogenase

Let us now take the reverse situation: two closely related genes in one organism that are the result of a duplication that occurred before a speciation event, e.g. the genes of the mammalian lactate dehydrogenase isoenzymes. For this tetrameric enzyme, the gene products of the liver type (LDH_L) and of the muscle type (LDH_M) can combine into five different isoenzymes in total: M4, M3L, M2L2, ML3 and L4, depending on the type of tissue and the level of expression of the two genes. When one constructs a phylogenetic tree, the result should look something like this:


                      _________ Rat_LDH_L
            LDH_L    |
            _______( )
           |         |_________ Mouse_LDH_L
       --- X
           |          _________ Rat_LDH_M
           |         |
           |_______( )
            LDH_M    |_________ Mouse_LDH_M


where X indicates gene duplication and ( ) indicates speciation. Here gene duplication came before speciation. Suppose someone is not aware of the presence of isoenzymes in the organisms, because only one sequence per organism is available (e.g. LDH_L for mouse and LDH_M for rat) and isoenzyme data are missing. The two sequences then trace back to the duplication event rather than to the speciation event, so the resulting phylogeny would suggest a much earlier separation of mouse and rat than actually occurred. This will inevitably lead to erroneous phylogenies.

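According to Walter Fitch's definition, Rat_LDH_L and Mouse_LDH_L are orthologous, as are Rat_LDH_M and Mouse_LDH_M, whereas each LDH_L gene is paralogous to each LDH_M gene.

Hence, it seems acceptable to say that the LDH_M gene family is paralogous to the LDH_L gene family.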
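The sampling pitfall in the LDH case can be made concrete with a small Python sketch. It is an illustration under stated assumptions only: the nested-tuple encoding of the LDH tree and the helper names (leaves, lca_event, reflects_speciation) are invented for this example. It checks whether two sampled sequences last shared an ancestor at the speciation event or at the earlier duplication.

```python
# Hypothetical encoding of the LDH tree: an internal node is
# (event, left, right); a leaf is a gene name. Duplication precedes speciation.
LDH_TREE = ("duplication",
            ("speciation", "Rat_LDH_L", "Mouse_LDH_L"),
            ("speciation", "Rat_LDH_M", "Mouse_LDH_M"))


def leaves(node):
    """Return the set of gene names found below a node."""
    if isinstance(node, str):
        return {node}
    return leaves(node[1]) | leaves(node[2])


def lca_event(node, a, b):
    """Return the event at the most recent common ancestor of genes a and b."""
    event, left, right = node
    for child in (left, right):
        # Descend while a single internal child still contains both genes.
        if not isinstance(child, str) and {a, b} <= leaves(child):
            return lca_event(child, a, b)
    return event


def reflects_speciation(tree, a, b):
    """True if a and b diverged at a speciation node, i.e. are orthologous."""
    return lca_event(tree, a, b) == "speciation"


# Same isoenzyme from both species: the divergence is the rat/mouse split.
print(reflects_speciation(LDH_TREE, "Rat_LDH_L", "Mouse_LDH_L"))  # True
# Mixed sample (rat M, mouse L): the divergence is the older duplication.
print(reflects_speciation(LDH_TREE, "Rat_LDH_M", "Mouse_LDH_L"))  # False
```

A False result flags a pair of paralogs, whose divergence predates the separation of the two species and therefore should not be used to date it.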


Last updated: 8 August 1997.
Created by: Fred Opperdoes