A DNA Structural Alphabet Distinguishes Structural Features of DNA Bound to Regulatory Proteins and in the Nucleosome Core Particle
Bohdan Schneider, Paulina Bozikova, Petr Cech, Daniel Svozil and Jiri Cerny [1]
Molecular Tour
We observe significant structural differences between DNA in complexes with transcription factors and DNA in complexes with histone proteins. To analyze these two types of structures, we used the DNA structural alphabet called CANA, which we had developed earlier by analyzing hundreds of crystal structures containing DNA molecules (Journal:Acta_Cryst_D:2) [2]. The structural alphabet makes it possible to "translate" the three-dimensional (3D) structure of DNA building blocks, called dinucleotides, into a series of letters. Each letter represents the structure of one dinucleotide, and a chain of these letters can be analyzed as a "word" representing the whole 3D DNA structure. The translation of a 3D structure into the alphabet letters can be performed on the web server dnatco.org [3].
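As a minimal sketch of how such a "word" could be analyzed once it is available, the snippet below counts how often each letter occurs. The function name `letter_percentages` and the toy word are made up for illustration; the word is not taken from any real structure, and the snippet does not reproduce the dnatco.org assignment itself.

```python
from collections import Counter


def letter_percentages(word):
    """Return the percentage of each letter in a list of CANA letters."""
    counts = Counter(word)
    total = len(word)
    return {letter: 100.0 * n / total for letter, n in counts.items()}


# Hypothetical toy word: one CANA letter per dinucleotide step.
# These values are invented for illustration only.
toy_word = ["BB2", "AAA", "AAA", "miB", "BB2", "NAN", "AAA", "BB2"]

for letter, pct in sorted(letter_percentages(toy_word).items()):
    print(f"{letter}: {pct:.1f}%")
```

Storing the word as a list of strings rather than one concatenated string keeps multi-character letters such as BB2 and miB unambiguous.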
The different patterns of CANA letters observed in complexes with transcription factors and with histone proteins can be interpreted as features that discriminate the specific binding of DNA to transcription factors from its non-specific binding to histone proteins. Especially noteworthy is the role of two DNA structural forms, the so-called A-DNA and BII-DNA ("B-two-DNA"), which are described by the CANA letters AAA and BB2. AAA structures are found quite frequently in DNA specifically bound to transcription factors, at regions where the DNA duplex bends around the protein. AAA is avoided in non-specific complexes with histone proteins, where BB2 plays the essential role: the wrapping of the DNA duplex around the histone proteins can be explained by the periodic occurrence of the CANA letter BB2 every 10.3 steps along the DNA strand.
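The idea of a periodic occurrence can be illustrated with a short sketch that measures the spacing between successive occurrences of one letter along a strand. The helper names, the placeholder letter "xxx" (standing for any step that is not BB2), and the toy positions are all assumptions made for illustration; they are not data from the study and this is not necessarily how the periodicity was computed there.

```python
def occurrence_spacings(word, letter):
    """Distances (in steps) between successive occurrences of `letter`."""
    positions = [i for i, current in enumerate(word) if current == letter]
    return [b - a for a, b in zip(positions, positions[1:])]


def mean_spacing(word, letter):
    """Average spacing, or None if the letter occurs fewer than twice."""
    spacings = occurrence_spacings(word, letter)
    if not spacings:
        return None
    return sum(spacings) / len(spacings)


# Hypothetical strand of 32 steps; "xxx" is a placeholder for any non-BB2 letter.
toy_strand = ["xxx"] * 32
for position in (0, 10, 21, 31):  # invented BB2 positions
    toy_strand[position] = "BB2"

print(occurrence_spacings(toy_strand, "BB2"))
print(mean_spacing(toy_strand, "BB2"))
```

Because individual spacings are whole numbers of steps, a non-integer average period can only emerge from a mixture of shorter and longer spacings, which is what the toy positions mimic.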
A high incidence of atypical conformers (e.g. the CANA letters miB and NAN) and a lower occurrence of the most typical DNA structure type, called BI-DNA, in DNA regions not bound to proteins indicate that the tools crystallographers use to refine DNA structures from diffraction data need to be improved by combining the best geometrical restraints provided by the CANA alphabet with the electron density maps.
In summary, we showed that the plasticity of the DNA double helix can be described by the DNA structural alphabet, and we characterized the different binding strategies of DNA sequences that are specifically recognized by regulatory proteins and of those bound non-specifically to histone proteins.
Percentages of the CANA alphabet letters and dinucleotide sequences in transcription factors and histone proteins.
- displays DNA from the structure of human TFIIB-related factor 2 and TATA box binding protein bound to U6#2 promoter DNA (PDB code 4roc), where the DNA backbone acquires the A-DNA form (CANA letter AAA) at the bend. Dinucleotides adopting the structure described by the CANA letter AAA are highlighted in red, and those described by BB2 in blue.
- depicts the first 75 base pairs of DNA in the nucleosome core particle (NCP) structure with PDB code 5f99.