Structure of Prp40
by Kelly Hrywkiw
Introduction
Prp40 has been implicated in the early steps of spliceosomal assembly and in binding to the phosphorylated C-terminal domain (phospho-CTD) of RNA polymerase II (RNAPII)[1]. Prp40 contains two types of domains: two WW domains and six consecutive FF domains[2]. WW domains contain two highly conserved tryptophan residues and bind to proline-rich sequences, thereby mediating protein-protein interactions[2]. Both WW domains of Prp40 can interact with PPxY motifs (x is any residue) and PPΨΨP motifs (Ψ is an aliphatic residue)[2]. FF domains contain two highly conserved phenylalanine residues and can be found in arrays of up to six domains[3]. Interestingly, these tandem domains do not appear to bind cooperatively; rather, they may provide multiple independent binding sites[3]. With regard to the binding partners of the different domains, the WW domains bind to the splicing proteins branch point-binding protein (BBP, also known as Msl5 and ySF1) and Prp8; the FF domains interact with the splicing factor Clf1 (Syf3p in yeast) and the phospho-CTD of RNAPII[3][2].
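The two motif classes recognized by the WW domains can be written as simple sequence patterns. The following is a minimal sketch of how a peptide sequence might be scanned for them. It assumes that "aliphatic" means Ala, Val, Leu, or Ile (other definitions exist), and the function name, motif labels, and example peptide are hypothetical and not taken from the cited studies. A pattern match only flags a candidate site; it does not demonstrate binding.

```python
import re

# Assumption: "aliphatic" is taken to mean Ala, Val, Leu, or Ile.
ALIPHATIC = "AVLI"

# Lookahead patterns so that overlapping motif occurrences are all reported.
MOTIFS = {
    "PPxY": re.compile(r"(?=(PP.Y))"),
    "PP(psi)(psi)P": re.compile(rf"(?=(PP[{ALIPHATIC}]{{2}}P))"),
}

def find_motifs(sequence):
    """Return (motif label, 1-based start position, matched text) tuples,
    sorted by position in the sequence."""
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(sequence.upper()):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical peptide for illustration only, not a real Prp40 ligand.
example = "GSPPAYTTPPLVPEE"
for hit in find_motifs(example):
    print(hit)
```

Positions are reported as 1-based residue numbers to match the way residues are numbered in the rest of this page.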
Role in spliceosomal assembly
The processing of pre-mRNA is carried out by a large, dynamic machine known as the spliceosome, which removes introns and splices exons together to create a mature mRNA[4][5]. The spliceosome is composed of five snRNA molecules (snRNAs U1, U2, U4, U5, and U6) and over one hundred associated proteins[4][5]. Assembly of the spliceosome is thought to take place in a stepwise manner around the pre-mRNA transcript[4][5]. The first step involves recognition of the 5’ splice site (ss) by U1 snRNP, followed by recognition of the branch point sequence by U2 snRNP[4][5]. The U4/U6•U5 tri-snRNP then binds[4][5]. Together the five snRNPs form the precatalytic spliceosome, which must undergo a series of changes before it can actively splice[6][5][7][4].
Prp40 participates in cross-intron bridging through interactions with the 5’ss and BBP, which bring the 5’ss and the branch point into spatial proximity[2]. In addition, Prp40 interacts with Prp8, so it is possible that one WW domain interacts with Prp8 while the other simultaneously interacts with the 5’ss[2]. In fact, this interaction is thought to bridge the 5’ss to the 3’ss after U2 snRNP association displaces BBP and causes Prp8 to interact with the tri-snRNP at the 3’ss.
Role in transcription
RNAPII is responsible for mRNA synthesis, the first step of gene expression in eukaryotes[8]. During the mRNA synthesis cycle, RNAPII interacts with up to six general transcription factors and numerous regulatory proteins. It is composed of 12 subunits (Rpb1 through Rpb12) and contains several disordered regions, including the NH2 tails of Rpb6 and Rpb12, short exposed loops in Rpb1, Rpb2, and Rpb8, and the CTD of the largest subunit, Rpb1[8]. The CTD is subject to hyperphosphorylation, so RNAPII can exist in either a hyperphosphorylated form (RNAPII0) or a non-phosphorylated form (RNAPIIA)[9]. The phospho-CTD helps coordinate pre-mRNA processing events, for example by localizing and activating the 5’ capping complex. Interestingly, transcriptionally active RNAPII0 in the nuclear matrix colocalizes with splicing factors[10].
Prp40 can bind to RNAPII0 at the phospho-CTD[2][10]. The WW domains of Prp40 have been shown to interact with the phospho-CTD; however, they do not interact with the conserved YSPTSPS sequence of the phospho-CTD. The FF domains have been shown to interact with the phospho-CTD repeats[3]. Not all FF domains of Prp40 behave alike: the first and the fourth FF domains do not interact with the phospho-CTD[3]. The ability of Prp40 to bind to both spliceosomal factors and the phospho-CTD of RNAPII0 could connect the phospho-CTD to the earliest stages of spliceosomal commitment complex formation[10].
Structure of Prp40
Yeast Prp40 is 583 residues long and contains two WW domains and four FF domains connected by amino acid linkers[10].
Schematic representation of the domain organization in Prp40.
The WW domains
The WW domain pair consists of two curved, three-stranded antiparallel β-sheets connected to each other by an α-helical linker. Each WW domain is built from three β-strands (β1, β2, and β3). Three residues lie together on the convex surface of each domain. On the concave surface lies an aromatic pocket. These pockets make up the ligand-binding sites of the WW domains; however, they do not form one large pocket but rather face away from each other. The linker residues Leu32 and Leu40 fold back into the hydrophobic cores of the WW domains[2].
The first FF domain (FF1)
FF1 consists of three α-helices (α1, α2, and α3) and one 3₁₀ helix located between α2 and α3. The helices are composed of the following residues: α1 (134-146), α2 (154-163), 3₁₀ (167-170), and α3 (175-187). The core of the domain is made up of a series of aromatic and aliphatic residues. A type I β-turn is formed by the residues Asp149, Ser150, Thr151, and Trp152[3].
The fourth FF domain (FF4)
FF4 exhibits a compact four-helix bundle fold with an α1-α2-3₁₀-α3 topology. The composition of each helix is as follows: α1 (Glu489-Thr507), α2 (Trp519-Leu526), 3₁₀ (Tyr532-Gly536), and α3 (Asp539-Phe549). There is a series of interactions between the different helices; for example, Tyr532 is in contact with Phe500, Leu503, Ser523, and Arg542. One difference between FF1 and FF4 is the presence of five extra amino acids in FF4, which gives the region located between α1 and α2 an extra turn. This insertion, however, does not increase the flexibility of FF4 compared to FF1[1].
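For a quick side-by-side comparison of the two domains, the helix boundaries quoted above can be tabulated and converted to helix lengths. The sketch below assumes that the residue ranges for each domain are listed in topology order (α1, α2, 3₁₀, α3); the dictionary and function names are illustrative only.

```python
# Helix boundaries as quoted in the text (inclusive residue numbers).
# Assumption: ranges are listed in topology order alpha1, alpha2, 3_10, alpha3.
HELICES = {
    "FF1": [("alpha1", 134, 146), ("alpha2", 154, 163),
            ("3_10", 167, 170), ("alpha3", 175, 187)],
    "FF4": [("alpha1", 489, 507), ("alpha2", 519, 526),
            ("3_10", 532, 536), ("alpha3", 539, 549)],
}

def helix_lengths(domain):
    """Return the number of residues in each helix of the given domain."""
    return {name: end - start + 1 for name, start, end in HELICES[domain]}

for domain in HELICES:
    print(domain, helix_lengths(domain))
```

The lengths follow directly from the inclusive residue ranges; they say nothing about the flexibility of the intervening loops, which was assessed experimentally[1].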