Lsm Protein Structure
by Kelly Hrywkiw
Introduction
Sm-like (Lsm) proteins closely resemble Sm proteins, and both families are found in all three domains of life [1]. Sm proteins play a large role in spliceosome biogenesis by mediating U1, U2, U4, and U5 snRNP assembly[2]. The Sm family can be broken down into seven specific proteins (SmB, SmD1, SmD2, SmD3, SmE, SmF, and SmG in humans), all of which share a conserved Sm motif that is also found in the Lsm proteins[2][3]. Eukaryotes have 16 or more Lsm proteins encoded in their genomes; in contrast, archaeal species have only one to three [3]. A total of nine Lsm proteins are found in yeast (Lsm1-Lsm9). Lsm2 to Lsm7 most closely resemble SmD1 to SmG, whereas Lsm1 and Lsm8 most closely resemble the SmB protein [2]. Lsm9 does not appear to resemble any of the Sm proteins, although some related structures have been found in archaeal genomes [2]. Several studies have shown that the Sm proteins form seven-membered rings that bind to the Sm binding site, a U-rich sequence found in all but U6 snRNA[2]. Lsm proteins can form homomeric hexameric, heptameric, or octameric rings[1]. In addition, they have been found to associate predominantly into three complexes: Lsm2-8, Lsm1-7, and Lsm2-7 [1]. The function of each complex, whether in pre-mRNA splicing, mRNA decay, or other roles, is dictated by its composition, structure, and cellular location [1][2].
The Different Lsm Complexes
Evidence suggests that there are two main Lsm complexes: Lsm1-7, which is associated with mRNA decay, and Lsm2-8, which is associated with pre-mRNA splicing[1][2]. The first piece of evidence for different roles is the cellular localization of the complexes[1]. For example, Lsm1 is predominantly cytoplasmic (where mRNA degradation takes place), as is the Lsm1-7 complex[1]. The Lsm2-8 complex is most likely nuclear because the nucleus is the location of U6 snRNP assembly [4]. In addition, mutation experiments have shown that while Lsm2 to Lsm7 mutants have altered mRNA decay and splicing function, Lsm1 mutants have only altered mRNA decay and Lsm8 mutants have only altered pre-mRNA splicing [2]. Further evidence comes from immunoprecipitation experiments: Lsm2 to Lsm7 co-immunoprecipitate with both U6 snRNA and mRNA decay factors, whereas Lsm1 co-immunoprecipitates only with mRNA degradation factors and Lsm8 only with U6 snRNA [2]. Given the difference in function between Lsm1 and Lsm8, it is interesting to note that both are closely related structurally to the SmB protein[2].
Role in Pre-mRNA splicing
The processing of pre-mRNA is carried out by a large dynamic machine known as the spliceosome, which removes introns and splices exons together to create a mature mRNA[5][6]. The spliceosome is composed of five snRNA molecules (U1, U2, U4, U5, and U6) and over one hundred associated proteins[5][6]. Assembly of the spliceosome is thought to take place in a stepwise manner around the pre-mRNA transcript[5][6]. The first step involves recognition of the 5’ splice site by U1 snRNP, followed by recognition of the branch point sequence by U2 snRNP[5][6]. The remaining snRNPs, U4, U5, and U6, then join as a preformed tri-snRNP[5][6]. Together the five snRNPs form the precatalytic spliceosome, which must undergo a series of rearrangements before it can actively splice[5][6][7][8].
The Lsm2-8 complex is involved in pre-mRNA splicing through association with the 3’-terminal poly(U) tract of U6 snRNA [4]. U6 snRNP differs from the other snRNPs because it is assembled entirely in the nucleus, whereas the other snRNAs first travel to the cytoplasm [4][5]. While the exact mechanism by which the Lsm2-8 complex acts is unclear, it is thought to stabilize the U6 snRNP and support its function[2]. For example, several experiments using point mutants of Lsm2 to Lsm8 have shown splicing defects that correlate with low levels of U6 snRNA [2]. The complex may also play a role in the various rearrangements that are necessary throughout the splicing cycle, and it has been shown to be important in the assembly of the U4/U6 di-snRNP and the U4/U6.U5 tri-snRNP [2]. An interesting difference between the Sm and Lsm proteins is that RNA must be present for the Sm ring to assemble, whereas Lsm ring assembly does not require RNA [4]. Overall, there is significant evidence to suggest that Lsm2 to Lsm8 play a key role in spliceosome biogenesis and architecture.
Role in mRNA decay
In yeast, degradation of mRNA begins with shortening of the poly(A) tail, followed by removal of the 5’ cap by Dcp1, a decapping protein [4]. The major exoribonuclease in mRNA decay, Xrn1, then quickly degrades the decapped mRNA [4]. Interestingly, when the Lsm1-7 complex is purified, Xrn1, Dcp1, Mrt1 (an additional protein associated with mRNA decapping), and mRNA co-immunoprecipitate with it [4]. However, Lsm1 is not the only Lsm protein that plays a role in mRNA decay: mutants of Lsm2 to Lsm7 have increased amounts of capped, deadenylated mRNAs [4]. Two functions have been postulated for the Lsm1-7 complex in mRNA decay. The first is that the Lsm ring binds to the mRNA and then recruits Dcp1. The second is that the complex facilitates rearrangements of the mRNP complex that allow decapping enzymes access to the 5’ cap [2]. Lsm proteins therefore appear to play a role not only in creating functional mRNA but also in degrading mature mRNA.
Other roles of Lsm proteins
A third complex of Lsm proteins (Lsm2-7) is found in the nucleoli of Saccharomyces cerevisiae [1]. It is believed to play a role in the function or biogenesis of snoRNAs. Other potential roles of the Lsm proteins include processing of tRNAs, snoRNAs, and rRNAs, histone mRNA decapping, miRNA biogenesis, and maturation and/or stabilization of nascent RNA polymerase III transcripts [1][2].
Structure of Lsm proteins
Lsm proteins share the Sm motif, which consists of an N-terminal α-helix followed by a twisted β-sheet [1][3]. Loop L4, located between strands β3 and β4 of the β-sheet, varies from 3 to 30 residues in length across the different Lsm proteins [3]. The β-sheet encloses a set of hydrophobic residues [3]. When the Lsm ring is assembled, this hydrophobic region extends into the adjacent Lsm monomers [3]. In the assembled ring, hydrogen bonds form between strand β4 of one subunit and strand β5 of its neighbor [3]. These interactions provide the Lsm ring with enough contacts to make a very stable structure [3]. The ring has two faces, the helix face and the loop face, found on opposite sides of the ring [1]. It has been postulated that a U-rich RNA may bind to the inner portion of the helix face and take part in hydrogen-bonding interactions with residues located on loops 3 and 5, as well as potentially pass through the pore itself [3].
Single Lsm proteins can form homomeric hexamers, heptamers, or octamers, in addition to the heteromeric Lsm1-7 and Lsm2-8 heptamers. Lsm proteins have also been found to form higher-order quaternary structures during crystallization [3][1]. These interactions form between helix-helix faces, loop-loop faces, and helix-loop faces [3]. The crystal structures available for analysis do not include full Lsm1-7 or Lsm2-8 complexes. However, the Lsm3 monomer, the N-terminal region of Lsm4, and an Lsm5-7 complex have been crystallized.
Lsm3
The Lsm3 protein has been crystallized from Schizosaccharomyces pombe and from Saccharomyces cerevisiae, hereafter referred to as SpLsm3 and ScLsm3, at 2.7Å and 2.5Å resolution respectively.
ScLsm3
The ScLsm3 crystal structure takes the form of a ring composed of eight monomeric subunits. Each monomer contains the Sm motif, comprising the N-terminal α-helix (Pro4-Leu10) and the curved β-sheet (Glu14-Ser77). Strands β3 and β4 are long, which causes the loop L4 residues to stick out and twist away from the main body of the ring. The only other Sm/Lsm protein to exhibit this is the human Sm protein SmB. Between neighboring subunits there are hydrogen-bonding interactions between the C-terminal region of β4 and the adjacent β5. In addition, several residues are buried at this interface, including Phe67, Ile68, Thr74, and Ile76. The overall ring is approximately 75Å wide and 50Å thick. The pore is approximately 20Å wide at the helix face and 25Å wide at the loop face. These measurements are greater than those of six- or seven-membered Lsm rings [3].
SpLsm3
As in ScLsm3, SpLsm3 exhibits the Sm motif, comprising an N-terminal α-helix (residues 10-17) and a curved β-sheet (residues 19-89). However, rather than an octameric ring, it formed a heptameric ring in crystallization experiments. SpLsm3 monomers interact through the same β4-β5 pairing as in ScLsm3. The overall ring is 61.5Å wide and 31Å thick, and the pore is approximately 20.7Å wide. In this crystal structure loop L4 is distorted[1].
Lsm4
The Lsm4 crystal structure contains a trimer of Lsm4 monomers. Each monomer contains the Sm motif, consisting of a distorted α-helix and a β-sheet formed by five antiparallel strands (residues 14-70)[1].
Lsm5/6/7
A 2.5Å resolution structure of Lsm5, Lsm6, and Lsm7 has been determined in which the crystal contains two hexameric Lsm657-657 rings. Lsm5 is located between Lsm6 and Lsm7, which is analogous to the arrangement of their Sm counterparts. In the hexameric ring each subunit interacts in the same manner as in the other Lsm proteins (i.e., through the β4 strand of one subunit and the β5 strand of the next) to form a continuous β-sheet around the whole ring. Each of the Lsm proteins exhibits the Sm motif, with only very small differences between them [9].
Role in RNA binding
With respect to RNA binding, the pore of the Lsm657-657 ring is positively charged, which would favor interactions with negatively charged RNA. The structure of the Archaeoglobus fulgidus Sm ring in complex with polyU RNA shows that each Sm protein interacts with one RNA base through residues in loops 3 and 5, and that the RNA passes through the pore. Because the residues of the Sm and Lsm proteins are fairly well conserved, it is possible that the Lsm proteins act through a similar mechanism. Two main differences can be seen, however. Loop 5 of Lsm5 should contain a canonical arginine or lysine that forms a hydrogen bond to an RNA base, yet an asparagine is present instead. In addition, loop 3 of Lsm7 should contain a canonical aromatic residue that provides stacking interactions with an RNA base, yet a leucine is present instead. While these differences prevent one from directly applying the RNA-protein interactions of Sm proteins to Lsm proteins, future studies may elucidate the exact mechanism [9].