Where do promoters come from?

"Now the rovin' gambler, he was very bored
Tryin' to create a next world war
He found a promoter who nearly fell off the floor

He said, "I never engaged in this kind of thing before
But yes, I think it can be very easily done

We'll just put some bleachers out in the sun
And have it on Highway 
Bob Dylan

So, where do promoters come from?

Figure 1: A model for a primordial promoter DNA sequence. Modern bacterial and archaeal promoters are derived from an anchor DNA sequence and an extended AT-rich sequence. See the text for details.

Promoters are the DNA sequences from which RNA polymerase initiates an RNA chain. Burton and Burton (2014)[1] proposed a simple model for promoter evolution from LUCA (the last universal cellular common ancestor of bacteria, archaea and eukaryotes; ~3.5 to 3.8 billion years ago) (Figure 1). In this model, ancient promoters closely resemble present-day promoter DNA sequences. An ancient promoter is posited to be an AT-rich DNA sequence next to (or encompassing) an anchor DNA sequence. The upstream anchor sequence remains double-stranded while the strands of the downstream promoter are opened to expose the template strand for RNA synthesis. The role of the anchor is to position and orient RNA polymerase so that, with the help of general transcription factors, it opens the promoter downstream of the anchor.

Bacteria use sigma factors to recruit RNA polymerase and to bind and open promoter DNA. Sigma factors have four helix-turn-helix (HTH) motifs: HTH1, HTH2, HTH3 and HTH4. HTH4 binds the anchor DNA (TTGACA, positions -35 to -30). HTH3 binds the extended -10 region of the promoter (TGX, positions -15 to -13). HTH2 binds the -10 region of the promoter (TATAAT, positions -12 to -7) and separates the DNA strands by flipping out the non-template strand bases A at -11 and T at -7 of the -10 consensus sequence. The AT-rich -10 region is posited to have been focused, over evolution, from the more extended AT-rich primordial promoter.
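The coordinates quoted above can be checked with a short sketch. The helper name `element_positions` and the `SIGMA_ELEMENTS` dictionary are illustrative assumptions, not part of the Burton and Burton model; only the sequences and start coordinates come from the text.

```python
# Illustrative sketch: the names below are assumptions, not from the source.
# Coordinates are relative to the transcription start site (+1).

def element_positions(start, seq):
    """Return {coordinate: base} for an element whose first base is at `start`."""
    return {start + i: base for i, base in enumerate(seq)}

# Sigma factor motif -> (start coordinate, consensus), as quoted in the text.
# "X" stands for an unspecified base.
SIGMA_ELEMENTS = {
    "HTH4": (-35, "TTGACA"),  # anchor DNA
    "HTH3": (-15, "TGX"),     # extended -10 region
    "HTH2": (-12, "TATAAT"),  # -10 region
}

for motif, (start, seq) in SIGMA_ELEMENTS.items():
    positions = element_positions(start, seq)
    print(motif, seq, min(positions), max(positions))

# The two non-template strand bases flipped out by HTH2.
minus10 = element_positions(*SIGMA_ELEMENTS["HTH2"])
print(minus10[-11], minus10[-7])
```

Running the loop confirms that each element ends at the coordinate given in the text, and that the flipped bases at -11 and -7 are the A and T of TATAAT.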

Archaea use the general transcription factors TFB (Transcription Factor B), TBP (TATA-box binding protein) and TFE (Transcription Factor E) to initiate RNA synthesis from a promoter. TFB includes two HTH motifs related to HTH3 and HTH4 of bacterial sigma factors; it is posited to have lost the other two HTH motifs during evolution. The remaining TFB HTH motifs are designated here HTH1 and HTH2. For historical reasons, they are also referred to as cyclin-like repeats (CLR). TFB HTH2 binds the BREup (TFB recognition element upstream of TATA; i.e. GGGCGCC, positions -38 to -32) and TFB HTH1 binds the BREdown (TFB recognition element downstream of TATA; i.e. GTTTTTT, positions -23 to -17). TBP binds the TATA box of the promoter (i.e. TATAAAAG, positions -31 to -24), which is posited to have been focused from the ancient AT-rich promoter sequence.
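One consequence of the archaeal coordinates is that the three elements tile the promoter without gaps. A minimal sketch, assuming a hypothetical `tile` helper and tuple layout chosen only for illustration, makes this explicit:

```python
# Illustrative sketch: `tile` and the tuple layout are assumptions, not from the source.
# Each entry is (element name, start coordinate, sequence) as quoted in the text.
ARCHAEAL_ELEMENTS = [
    ("BREup", -38, "GGGCGCC"),      # bound by TFB HTH2
    ("TATA box", -31, "TATAAAAG"),  # bound by TBP
    ("BREdown", -23, "GTTTTTT"),    # bound by TFB HTH1
]

def tile(elements):
    """Concatenate elements, requiring each to start right after the previous one."""
    core = ""
    expected_start = elements[0][1]
    for name, start, seq in elements:
        if start != expected_start:
            raise ValueError(f"{name} starts at {start}, expected {expected_start}")
        core += seq
        expected_start = start + len(seq)
    # Return the joined sequence with its first and last coordinates.
    return core, elements[0][1], expected_start - 1

core, first, last = tile(ARCHAEAL_ELEMENTS)
print(core, first, last)
```

With the coordinates given in the text, the three elements join into a single 22-base stretch running from -38 to -17, with the TATA box flanked directly by the two TFB recognition elements.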

Promoters remain overall AT-rich compared to neighboring protein coding sequences. 
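AT-richness is simple to quantify. The sketch below uses an assumed helper, `at_fraction`, and applies it only to the consensus elements quoted in this article. Comparing a real promoter with its neighboring coding sequence would mean applying the same function to both regions, and no such sequences are given here.

```python
# Illustrative sketch: `at_fraction` is an assumed helper, not from the source.

def at_fraction(seq):
    """Fraction of A or T among the A/C/G/T bases of `seq` (X or N are ignored)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return sum(b in "AT" for b in bases) / len(bases)

# Elements quoted in the text.
ELEMENTS = {
    "bacterial -10 region": "TATAAT",
    "bacterial anchor": "TTGACA",
    "archaeal TATA box": "TATAAAAG",
    "archaeal BREup": "GGGCGCC",
    "archaeal BREdown": "GTTTTTT",
}

for name, seq in ELEMENTS.items():
    print(f"{name}: {at_fraction(seq):.3f}")
```

On these inputs the recognition sequences (the -10 region and the TATA box) come out almost entirely AT, while the archaeal BREup anchor contains no A or T at all.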

Over nearly 4 billion years of evolution, promoters have remained very similar to those posited to have existed at LUCA: they are AT-rich overall, they include an anchor DNA sequence that remains double-stranded as the promoter opens for transcription, and they include an AT-rich recognition sequence. Remarkably, LUCA can be glimpsed by examining present-day promoter DNA sequences and general transcription factor sequences.


[1] S.P. Burton, Z.F. Burton, The sigma enigma: Bacterial sigma factors, archaeal TFB and eukaryotic TFIIB are homologs, Transcription, 5 (2014) e967599.