Monday, March 14, 2011

Protein Folding (Biochemistry)

I apologize in advance for the overuse of the words “protein,” “fold,” and “structure.”  There are only so many ways to say these things!  But there are lots of colorful pictures in this post.

 
                The central dogma (see post of the same name) explains how we get from DNA to protein and underscores how essential proteins are to our existence.  They are the workhorses of cells and, without proteins, we would not be alive.  Proteins carry oxygen around our bloodstream (hemoglobin), they catalyze reactions that would not otherwise happen (enzymes), they protect our cells against damage (p53), etc.  The list is seriously endless.

                What do proteins look like?  So far, I’ve explained that proteins are strings of the 20 different amino acids.  What does that mean?  

Let’s first look at amino acids (Figure 7.1).  All amino acids are small molecules that have an amine group (where we get the “amino” part) on one end and a carboxylic acid group (where we get the “acid” part) on the other.  In between is one carbon that bears the “R” group.  Each amino acid has its own “R” group.  For example, if we are talking about glycine, the “R” group is a hydrogen.  If we are talking about cysteine, the “R” group is a carbon (with two hydrogens) attached to a sulfur bound to a hydrogen.  All amino acids share the same basic scaffold, but vary at the “R” position to give some variety (the spice of life, people).
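If it helps to see the “shared scaffold plus R group” idea as bookkeeping, here is a minimal Python sketch that builds an amino acid’s molecular formula by adding the atoms of its R group to the atoms of the common scaffold.  The names `SCAFFOLD`, `R_GROUPS` and `formula` are my own inventions for illustration, and only three of the 20 amino acids are included.

```python
from collections import Counter

# Atoms shared by every amino acid: H2N-CH-COOH
# (amine group, central carbon, carboxylic acid group).
SCAFFOLD = Counter({"C": 2, "H": 4, "N": 1, "O": 2})

# Atoms in the R group of a few amino acids (illustrative subset, not all 20).
R_GROUPS = {
    "glycine": Counter({"H": 1}),                   # -H
    "alanine": Counter({"C": 1, "H": 3}),           # -CH3
    "cysteine": Counter({"C": 1, "H": 3, "S": 1}),  # -CH2-SH
}

def formula(name):
    """Return the molecular formula of a free amino acid as a string."""
    atoms = SCAFFOLD + R_GROUPS[name]
    # Write elements in a fixed order, omitting the count when it is 1.
    return "".join(
        f"{el}{atoms[el] if atoms[el] > 1 else ''}"
        for el in ("C", "H", "N", "O", "S")
        if atoms[el] > 0
    )

for name in R_GROUPS:
    print(name, formula(name))
```

The scaffold never changes; swapping the entry in `R_GROUPS` is the only thing that distinguishes one amino acid from another.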



                The ribosome is where all these amino acids are linked together.  Since each amino acid shares the same scaffold, they are all linked together in the exact same way: via a peptide bond (Figure 7.2).  The carboxylic acid of the first amino acid comes close to the amine group of the next amino acid and some magic happens: the N binds to the C to form the peptide bond, while the O plus two Hs (in the form of H2O or water) leave the molecule.  
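Because one water leaves for every peptide bond, a chain of n amino acids has lost n - 1 waters by the time it is finished.  The sketch below does that arithmetic in Python for a made-up three amino acid chain.  The masses are approximate average molecular weights rounded to two decimals, and `FREE_MASS` and `peptide_mass` are hypothetical names, not part of any library.

```python
# Approximate average molecular weights (g/mol) of a few free amino acids.
FREE_MASS = {"glycine": 75.07, "alanine": 89.09, "cysteine": 121.16}
WATER = 18.02  # one H2O leaves for every peptide bond formed

def peptide_mass(residues):
    """Mass of a linear chain: sum of the free amino acids minus the water lost."""
    if not residues:
        return 0.0
    bonds = len(residues) - 1  # n amino acids are joined by n - 1 peptide bonds
    return sum(FREE_MASS[r] for r in residues) - bonds * WATER

# Made-up chain, listed from N terminus to C terminus.
chain = ["glycine", "alanine", "cysteine"]
print(f"{len(chain) - 1} peptide bonds, about {peptide_mass(chain):.2f} g/mol")
```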



                The first amino acid in a protein (or polypeptide) still has a free amine and is called the N terminus of the protein.  The last amino acid has a free carboxylic acid and is called the C terminus of the protein. 

                With the facts listed above and a known sequence of amino acids, you could draw an entire protein molecule.  (Yup – you could!)  Your gorgeous drawing would be called the primary structure of the protein.  
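In practice, a primary structure is usually written down as a string of one-letter codes, read from the N terminus to the C terminus.  Here is a small Python sketch that spells such a string back out.  The one-letter codes are the standard ones; the function name `describe` and the example sequence are made up for illustration.

```python
# One-letter codes for the 20 standard amino acids.
NAMES = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartate",
    "C": "cysteine", "E": "glutamate", "Q": "glutamine", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

def describe(sequence):
    """Spell out a primary sequence from N terminus to C terminus."""
    residues = [NAMES[letter] for letter in sequence.upper()]
    return "N terminus - " + " - ".join(residues) + " - C terminus"

# Hypothetical four amino acid sequence, not a real protein.
print(describe("MGCA"))
```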

                I’m sure you realize, however, that proteins are not just long straight lines of amino acids floating around.  Think of a pipe cleaner – sure, it can be a long straight line, but it can also fold up on itself into any number of conformations.  Ah, the same is true of proteins!  The long string of amino acids will fold up to form a three dimensional structure and, once properly folded, it will become a fully active protein.  Scientists further break down protein structure into three more levels: secondary structure, tertiary structure and quaternary structure.

Secondary Structure: Because peptide bonds link essentially the same scaffold over and over again, there are some predictable ways that individual amino acids within the long string can interact with each other.  Sometimes they will wrap around to form a helical shape, known as an alpha helix, and sometimes they will line up to make a flat sheet, known as a beta sheet (Figure 7.3).  Not all amino acids are involved in forming beta sheets or alpha helices.  Amino acids not involved in either are said to be in “random coil.”  These types of amino acid arrangements help form the three dimensional structure of a protein and are referred to by scientists as secondary structure. 



Biochemists (like me) can use algorithms to predict secondary structure from a protein’s primary sequence.  That sounds fancy, but honestly, we just type in the primary sequence of the protein and a smart computer with a smarter programmer far away does some complicated calculations and emails us back the prediction in about two minutes.  The computer assigns each amino acid to alpha helix, beta sheet, or random coil.  Mind you, this is just a prediction.  To know for sure what is going on, we need to see the protein’s tertiary structure…
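To give a feel for what a per-amino-acid prediction looks like, here is a minimal Python sketch that takes a prediction string and collapses it into stretches of helix, strand and coil.  It assumes the common convention of one letter per amino acid (H for alpha helix, E for beta strand, C for random coil); the prediction string itself is invented, and this is not the exact format any particular server emails back.

```python
from itertools import groupby

# Assumed letter convention: H = alpha helix, E = beta strand, C = random coil.
LABELS = {"H": "alpha helix", "E": "beta strand", "C": "random coil"}

def segments(prediction):
    """Collapse a per-residue prediction string into (label, start, end) runs.

    Positions are 1-based and inclusive, counted from the N terminus.
    """
    runs = []
    position = 1
    for state, group in groupby(prediction):
        length = len(list(group))
        runs.append((LABELS[state], position, position + length - 1))
        position += length
    return runs

# Hypothetical prediction for a 20 amino acid sequence, one letter per amino acid.
prediction = "CCHHHHHHHCCCEEEEECCC"
for label, start, end in segments(prediction):
    print(f"{start:>2}-{end:<2} {label}")
```

Real prediction servers also report a confidence for each position, which this sketch leaves out.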



Tertiary Structure: This level explains how the rest of the amino acids are arranged in relation to and around the secondary structures.  Figure 7.4 shows some tertiary structures of proteins to give you a sense of how different proteins can look from each other and how secondary structure is incorporated into the tertiary structure.  Both the left and right sides show the same thing for each protein: the left side highlights the secondary structure (can you see the alpha helices and beta strands?) while the right shows the exact same view but with all the atoms filled in (called a space-filling model).  Think of the difference between the left and right as a tree with no leaves (left) and the same tree full of leaves (right).



Quaternary Structure: For many proteins, one properly folded protein molecule is enough to fulfill the protein’s function.  However, other proteins need to interact with another protein molecule before becoming fully active.  For example, hemoglobin is really composed of four protein molecules (two copies each of two very similar chains, called alpha and beta) that are all hanging out together.  Quaternary structure tells us how many protein molecules must come together to form the fully active protein.  If it is just one, then the quaternary structure is a monomer.  If it is four, then the quaternary structure is a tetramer.  (No, we are not limited to one or four – everything is possible!  I’ve heard of proteins with a quaternary structure that is dodecameric!)   
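The naming follows the usual Greek number prefixes, so it is easy to turn a count of protein molecules into the right word.  A tiny Python sketch, with a hypothetical helper `quaternary_name` and a fallback of “n-mer” for counts I have not listed:

```python
# Names for the number of protein molecules in the active assembly.
OLIGOMER_NAMES = {
    1: "monomer", 2: "dimer", 3: "trimer", 4: "tetramer",
    5: "pentamer", 6: "hexamer", 7: "heptamer", 8: "octamer",
    9: "nonamer", 10: "decamer", 12: "dodecamer",
}

def quaternary_name(n_chains):
    """Return the oligomer name for a count of chains, or 'n-mer' if unlisted."""
    return OLIGOMER_NAMES.get(n_chains, f"{n_chains}-mer")

for n in (1, 4, 12, 24):
    print(n, quaternary_name(n))
```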

Scientists know the full structures (primary – quaternary) of many proteins, but there are thousands of others out there on which we still have little information.  For many of these, all we have is the primary sequence of amino acids and a guess at secondary structure from the available algorithms.  Part of understanding how a protein works is seeing what it looks like.  However, obtaining tertiary and detailed quaternary structures is an arduous process that involves nuclear magnetic resonance (NMR) spectroscopy or X-ray crystallography (suffice it to say, neither technique yields an answer in a day – more like several months to several years).

Wouldn’t it be lovely if we could look at a primary sequence of a protein and know how it will fold up?  Of course!  But, unfortunately, we can’t predict much beyond secondary structure (YET!).  Labs all over the world are trying to work out algorithms that will look at the primary sequence and spit back out an estimate of the tertiary structure.  So far, it’s still a work in progress, but advances are being made all the time!


Okay!  I think we are coming to the end of background information on proteins.  Excellent – this means I can discuss lots of literature for you!  The next Spanish Influenza post will look at the tertiary structure of hemagglutinin from 1918 and other influenza viruses (woo!).  I also have some posts ready on p53/cancer and multiple folded forms of proteins.  Ooooh…


Amine group: a nitrogen bound to three Hs (or R groups)
Carboxylic acid group: a carbon bound to both a double bonded oxygen and a single bonded OH or O-
Peptide bond: the bond which holds two amino acids together
Polypeptide: synonym for protein
Primary structure: the sequence of amino acids within a protein from N terminus (beginning) to C terminus (end)
Secondary structure: structures that amino acids held together by peptide bonds tend to form – three types: alpha helix, beta strand/beta sheet and random coil
Tertiary structure: how all the amino acids within the protein fold up or are placed in three dimensions
Quaternary structure: how many of each protein molecule must come together before an active protein is achieved
Alpha helix: coils of amino acids, type of secondary structure
Beta strand / Beta sheet: flat string of amino acids (beta strand); two or more strands coming together create a beta sheet
Random coil: amino acids not involved in alpha helices or beta strands/sheets that adopt no set conformation at the level of secondary structure

ADDED NOTE: I covered a little of protein folding on Dr. Amedeo in a post called Protein Knotting if you'd like to read a little more!


References
Alberts et al. “Molecular Biology of the Cell, 4th Edition.”  Garland Science, New York, New York. (2002).

Bryson K, McGuffin LJ, Marsden RL, Ward JJ, Sodhi JS. & Jones DT. (2005) Protein structure prediction servers at University College London. Nucl. Acids Res. 33(Web Server issue):W36-38.

All protein structures came from the Protein Data Bank (www.pdb.org) and were rendered in PyMOL.

