Genesis of life on Earth

Genesis of life on Earth told in core protein and DNA sequences: RNA polymerases reading DNA for content

To describe the genesis, divergence and complexity of life on Earth, one must understand two major transitions: 1) between the ancient RNA-protein world and LUCA (the last universal common ancestor); and 2) between LUCA and LECA (the last eukaryotic common ancestor). On Earth, life divides into three domains: bacteria, archaea and eukaryotes. Humans, plants and other organisms with complex cell and body structures are eukaryotes, so eukaryotes may have advantages in cell metabolism and developmental complexity compared to some single-celled organisms. Eukaryotes, however, result from chimeric fusion of multiple bacteria and at least one now-extinct archaeon. Fused characteristics from bacteria and archaea license eukaryotes to support more complicated and rapid metabolism and increasingly intricate developmental programs.

Information flow within living organisms requires genetic material in DNA to be read and deciphered. Because, within the double helix, DNA bases project inward, DNA is more stable than single-stranded RNA, in which bases are more exposed. LUCA, therefore, developed a more stable DNA genome than was possible in the RNA-protein world. The DNA double helix, however, is difficult to unzip, so proteins must act on DNA to separate strands. The relative security of a DNA genome compared to a fragmented RNA genome comes with conditions.    

LUCA (~3.5 to 3.8 billion years ago) was one of the first organisms with a streamlined DNA genome. The Earth is only ~4.5 billion years old, so LUCA is very ancient and the time for the RNA-protein world that preceded LUCA was short. To read the genetic information within a DNA genome requires a mobile protein factory called RNA polymerase. To initiate RNA synthesis on a DNA template, RNA polymerases found in the three domains of life require protein factors that help them to recognize and open a promoter DNA sequence. In bacteria, sigma factors help RNA polymerases bind promoter DNA. In archaea, TFB (Transcription factor B) and TBP (TATA binding protein) cooperate to help RNA polymerase recognize a promoter. The evolutionary relationship between bacterial sigma factors and archaeal TFB-TBP has not been clear.

Burton and Burton (2014)¹ (son and father) demonstrate that bacterial sigma is related in evolution to archaeal TFB and eukaryotic TFIIB. Bacterial sigma factors have four helix-turn-helix (HTH) motifs, and archaeal TFBs have two related HTH motifs. HTH motifs are common DNA-binding features of proteins. Burton and Burton hypothesize that both sigma factors and TFB arose from a primordial initiation factor at LUCA with four HTH motifs, and that bacteria and archaea diverged in evolution about the time of LUCA largely because of this difference in genome reading styles. Archaeal TFB lost two HTH motifs and gained compensating functions, including cooperation with TBP. Because bacteria and archaea interpret (read; "transcribe") their DNA genomes differently to synthesize RNA, these organisms became distinct.

Evolution of promoter DNA sequences has long been a mystery, but bacterial promoters somewhat resemble archaeal and eukaryotic promoters. Evolutionary relatedness of bacterial sigma factors and archaeal TFB, however, generates a simple story about promoter evolution. According to this view, a primordial promoter was an AT-rich DNA sequence with a neighboring anchor DNA sequence. The anchor DNA sequence gives directionality to RNA synthesis by pointing RNA polymerase in the direction that it must read the gene. Because of the chemistry of DNA bases (A-T pairs are held by two hydrogen bonds, G-C pairs by three), DNA strand separation is easier at AT-rich DNA than at GC-rich DNA. Both bacterial sigma factors and archaeal TFB have a C-terminal HTH motif to bind the anchor DNA sequence, which remains double-stranded when the promoter DNA is opened closer to the start point for RNA polymerase to initiate an RNA chain and transcribe a gene. Another HTH motif helps to open up the DNA helix at or near a focused AT-rich DNA segment. So, evolutionary relatedness of bacterial sigma factors and archaeal TFB helps to describe the structure and evolution of directional DNA promoters with which these core protein factors interact. This view accounts for the anchor DNA sequences found in all of life and for the focused AT-rich DNA sequences common in promoters, from LUCA to the present.
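The idea of a "focused AT-rich DNA segment" can be made concrete with a small sketch. The Python below slides a fixed-width window along a sequence and reports the window with the highest fraction of A and T bases. The function names (`at_fraction`, `most_at_rich_window`), the window width and the toy sequence are all assumptions made up for illustration; this is not a method from Burton and Burton, and real promoter prediction is considerably more involved.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def most_at_rich_window(seq: str, width: int) -> tuple:
    """Return (start index, AT fraction) of the most AT-rich window.

    Windows are scanned left to right; ties go to the leftmost window.
    """
    if width <= 0 or width > len(seq):
        raise ValueError("width must be between 1 and len(seq)")
    best_start, best_frac = 0, -1.0
    for start in range(len(seq) - width + 1):
        frac = at_fraction(seq[start:start + width])
        if frac > best_frac:  # strict '>' keeps the leftmost best window
            best_start, best_frac = start, frac
    return best_start, best_frac


# Hypothetical sequence invented for illustration, not a real promoter:
# GC-rich flanks around a short AT-rich core.
toy = "GGCCGCGGTATAAATGCCGCGG"
start, frac = most_at_rich_window(toy, 6)
print(start, toy[start:start + 6], frac)
```

In this toy case the scan picks out the AT-rich core between the GC-rich flanks, which is the kind of segment where, on the account above, strand opening would be easiest.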

Opinions and politics are influenced by what humans read and by the ways they access information (e.g. compare blinking at Fox News to reading the New York Times). Analogous to the stark divisions of modern American politics, very early in evolution, bacteria and archaea came to read and interpret genomic information in DNA in fundamentally different ways, causing bacteria and archaea to diverge into distinct domains. At LECA (~1.6-2.5 billion years ago) an archaeon fused with multiple bacteria to form the first eukaryote. Sadly, it seems unlikely that a comparable synthesis of Republican and Democratic ideologies could be possible.

Understanding LUCA and LECA is key to describing divergence and evolution of life on Earth and also human complexity compared to single-celled organisms. The protein structures and sequences of sigma factors, TFB and multi-subunit RNA polymerases relate surprisingly straightforward stories about LUCA and LECA. Promoter evolution is also described. Essentially, protein structures and sequence comparisons reveal the most defining events in evolution of DNA-based organisms on Earth. A story of LECA and eukaryote complexity was recently told in a preceding paper by Burton (2014).²

Because life evolved from an RNA-protein world and DNA originally came from RNA, the story of the genesis of life on Earth is more easily told from the point of view of RNA polymerases rather than DNA polymerases.³ Strangely, bacterial and archaeal DNA polymerases are unrelated in evolution, indicating that these essential mobile factories evolved separately after divergence. But exiting the RNA-protein world to LUCA must have initially involved RNA-dependent RNA polymerases and reverse transcriptases to synthesize DNA. If reverse transcriptase DNA synthesis mechanisms continued into LUCA, bacteria and archaea could have separately evolved distinct and unrelated DNA polymerases.


A, G, C, T:
the bases that make up DNA
Anchor DNA:
the part of the promoter that remains double-stranded and points RNA polymerase in the appropriate direction to transcribe a gene
DNA polymerase:
a molecular machine to synthesize DNA
Helix-turn-helix motifs:
a common DNA binding motif in proteins
LUCA:
last universal common ancestor
LECA:
last eukaryotic common ancestor
Promoter:
a DNA sequence from which RNA polymerase begins transcription of a gene
Reverse transcriptase:
a molecular machine to synthesize DNA from an RNA template
RNA polymerase:
a molecular machine that runs on DNA to polymerize RNA
Sigma factor:
a protein factor that directs initiation by bacterial RNA polymerase
TFB:
a protein factor that directs initiation by archaeal RNA polymerase
TBP:
TATA box binding protein
Transcription:
the process of RNA synthesis by RNA polymerase


  1. Burton SP, Burton ZF. The sigma enigma: Bacterial sigma factors, archaeal TFB and eukaryotic TFIIB are homologs. Transcription 2014; 5:e967599.

  2. Burton ZF. The Old and New Testaments of gene regulation: Evolution of multi-subunit RNA polymerases and co-evolution of eukaryote complexity with the RNAP II CTD. Transcription 2014; 5.

  3. Koonin EV. The origins of cellular life. Antonie van Leeuwenhoek 2014; 106:27-41.