The Structures of Life
Chapter 1: Proteins are the Body's Worker Molecules
You've probably heard that proteins are important nutrients that help you build muscles. But they are much more than that. Proteins are worker molecules that are necessary for virtually every activity in your body. They circulate in your blood, seep from your tissues, and grow in long strands out of your head. Proteins are also the key components of biological materials ranging from silk fibers to elk antlers.
Proteins are like long necklaces with differently shaped beads. Each "bead" is a small molecule called an amino acid. There are 20 standard amino acids, each with its own shape, size, and properties.
Proteins typically contain from 50 to 2,000 amino acids hooked end-to-end in many combinations. Each protein has its own sequence of amino acids.
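To get a feel for "many combinations," a rough count helps: with 20 standard amino acids, a chain of n amino acids has 20 to the power n possible sequences. The short Python sketch below only illustrates that arithmetic. The function name is made up for this example, and it counts every conceivable sequence, not just the ones found in nature.

```python
# Illustrative arithmetic only: count every conceivable chain of a given
# length that could be built from the 20 standard amino acids.
STANDARD_AMINO_ACIDS = 20


def possible_sequences(chain_length: int) -> int:
    """Return 20 ** chain_length, the number of distinct sequences."""
    if chain_length < 0:
        raise ValueError("chain_length must be non-negative")
    return STANDARD_AMINO_ACIDS ** chain_length


if __name__ == "__main__":
    for n in (2, 50):
        count = possible_sequences(n)
        print(f"chain of {n} amino acids: a {len(str(count))}-digit number of sequences")
```

Even for a small protein of 50 amino acids, the count is a 66-digit number.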
These amino acid chains do not remain straight and orderly. They twist and buckle, folding in upon themselves, the knobs of some amino acids nestling into grooves in others.
This process is complete almost immediately after proteins are made. Most proteins fold in less than a second, although the largest and most complex proteins may require several seconds to fold. Most proteins need help from other proteins, called "chaperones," to fold efficiently.
Because proteins have diverse roles in the body, they come in many shapes and sizes. Studies of these shapes teach us how the proteins function in our bodies and help us understand diseases caused by abnormal proteins.
To learn more about the proteins shown here, and many others, check out the Molecule of the Month section of the RCSB Protein Data Bank (http://www.pdb.org).
Molecule of the Month images by David S. Goodsell, The Scripps Research Institute
Decades ago, scientists who wanted to study three-dimensional molecular structures spent days, weeks, or longer building models out of rods, balls, and wire scaffolding.
Today, they use computer graphics. Within seconds, scientists can display a molecule in several different ways (like the three representations of a single protein shown here), manipulate it on the computer screen, simulate how it might interact with other molecules, and study how defects in its structure could cause disease.
Sometimes, an error in just one amino acid can cause disease. Sickle cell disease, which most often affects those of African descent, is caused by a single error in the gene for hemoglobin, the oxygen-carrying protein in red blood cells.
This error, or mutation, results in an incorrect amino acid at one position in the molecule. Hemoglobin molecules with this incorrect amino acid stick together and distort the normally smooth, lozenge-shaped red blood cells into jagged sickle shapes.
The most common symptom of the disease is unpredictable pain in any body organ or joint, caused when the distorted blood cells jam together, unable to pass through small blood vessels. These blockages prevent oxygen-carrying blood from getting to organs and tissues. The frequency, duration, and severity of this pain vary greatly between individuals.
The disease affects about 1 in every 500 African Americans. About 1 in 12 carry the trait: they do not have the disease themselves, but they can pass it on to their children.
Another disease caused by a defect in one amino acid is cystic fibrosis. This disease is most common in those of northern European descent, affecting about 1 in 2,500 Caucasians in the United States. Another 1 in 25 or 30 are carriers.
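The carrier figures quoted for both diseases can be checked with a little arithmetic. The sketch below assumes the Hardy-Weinberg model from population genetics (random mating, two gene variants, and disease only when both copies are faulty). That model is not named in the text above and is used here only as a rough consistency check. The function name is illustrative.

```python
import math


def carrier_fraction(disease_fraction: float) -> float:
    """Estimate the carrier fraction from the disease fraction.

    Assumes Hardy-Weinberg proportions: if q is the frequency of the faulty
    gene variant, affected people make up q**2 of the population and
    carriers make up 2 * q * (1 - q).
    """
    if not 0.0 <= disease_fraction <= 1.0:
        raise ValueError("disease_fraction must be between 0 and 1")
    q = math.sqrt(disease_fraction)
    return 2.0 * q * (1.0 - q)


if __name__ == "__main__":
    examples = (
        ("sickle cell disease", 1 / 500),
        ("cystic fibrosis", 1 / 2500),
    )
    for name, affected in examples:
        carriers = carrier_fraction(affected)
        print(f"{name}: roughly 1 in {1 / carriers:.0f} are carriers")
```

Under these assumptions the estimates come out near 1 in 12 for sickle cell trait and roughly 1 in 26 for cystic fibrosis, in line with the figures quoted above.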
The disease is caused when a protein called CFTR is incorrectly folded. This misfolding is usually caused by the deletion of a single amino acid in CFTR. The function of CFTR, which stands for cystic fibrosis transmembrane conductance regulator, is to allow chloride ions (a component of table salt) to pass through the outer membranes of cells.
When this function is disrupted in cystic fibrosis, glands that produce sweat and mucus are most affected. A thick, sticky mucus builds up in the lungs and digestive organs, causing malnutrition, poor growth, frequent respiratory infections, and difficulty breathing. Those with the disorder usually die from lung disease around the age of 35.
When proteins fold, they don't randomly wad up into twisted masses. Often, short sections of proteins form recognizable shapes. Where a protein chain curves into a corkscrew, that section is called an alpha helix. Where it forms a flattened strip, it is a beta sheet.
These organized sections of a protein pack together with each other (or with other, less organized sections) to form the final, folded protein. Some proteins contain mostly alpha helices (red in the ribbon diagrams). Others contain mostly beta sheets (light blue), or a mix of alpha helices and beta sheets.
Many scientists use computers to try to solve the protein folding problem. One example is David Baker, a mountain climber and computational biologist at the University of Washington. He designs software to predict protein structures—and harnesses unused computer power from college dorm rooms to do so. Read more about David Baker at https://publications.nigms.nih.gov/findings/
A given sequence of amino acids almost always folds into a characteristic, three-dimensional structure. So scientists reason that the instructions for folding a protein must be encoded within this sequence. Researchers can easily determine a protein's amino acid sequence. But for more than 50 years they've tried—and failed—to crack the code that governs folding.
Scientists call this the "protein folding problem," and it remains one of the great challenges in structural biology. Although researchers have teased out some general rules and, in some cases, can make rough guesses of a protein's shape, they cannot accurately and reliably predict the position of every atom in the molecule based only on the amino acid sequence.
The medical incentives for cracking the folding code are great. Diseases including Alzheimer's, cystic fibrosis, and "mad cow" disease are thought to result from misfolded proteins. Many scientists believe that if we could decipher the structures of proteins from their sequences, we could better understand how the proteins function and malfunction. Then we could use that knowledge to improve the treatment of these diseases.
The potential value of cracking the protein folding code skyrocketed after the launch, in the 1990s, of genome sequencing projects. These ongoing projects give scientists ready access to the complete genetic sequence of hundreds of organisms—including humans.
From these genetic sequences, scientists can easily obtain the corresponding amino acid sequences using the "genetic code."
The availability of complete genome sequences (and amino acid sequences) has opened up new avenues of research, such as studying the structure of all proteins from a single organism or comparing, across many different species, proteins that play a specific biological role.
The ultimate dream of structural biologists around the globe is to determine directly from genetic sequences not only the three-dimensional structure, but also some aspects of the function of all proteins.
They are partially there: They have identified amino acid sequences that code for certain structural features, such as a cylinder woven from beta sheets.
Researchers have also cataloged structural features that play specific biological roles. For example, a characteristic cluster of alpha helices strongly suggests that the protein binds to DNA.
But that is a long way from accurately determining a protein's structure based only on its genetic or amino acid sequence. Scientists recognized that achieving this long-term goal would require a focused, collaborative effort. So was born a new field called structural genomics.
Scientists in the Protein Structure Initiative (PSI), a structural genomics effort, are taking a calculated shortcut. Their strategy relies on two facts.
First, proteins can be grouped into families based on their amino acid sequence. Members of the same protein family often have similar structural features, just as members of a human family might all have long legs or high cheek bones.
Second, sophisticated computer programs can use previously solved structures as guides to predict other protein structures.
The PSI team expects that, if they solve a few thousand carefully selected protein structures, they can use computer modeling to predict the structures of hundreds of thousands of related proteins.
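The idea of sorting proteins into families by sequence can be sketched in a few lines of Python. The example below is a toy, not the PSI team's actual method: it assumes the sequences are already aligned and of equal length, scores each pair by the fraction of matching positions, and greedily assigns each sequence to the first family whose representative it resembles closely enough. The sequences, the 0.5 threshold, and all function names are made up for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of positions at which two pre-aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def group_into_families(sequences: dict[str, str],
                        threshold: float = 0.5) -> list[list[str]]:
    """Greedy grouping: the first member of each family is its representative."""
    families: list[list[str]] = []
    for name, seq in sequences.items():
        for family in families:
            representative = sequences[family[0]]
            if percent_identity(seq, representative) >= threshold:
                family.append(name)
                break
        else:
            # No existing family is similar enough, so start a new one.
            families.append([name])
    return families


if __name__ == "__main__":
    # Hypothetical one-letter amino acid sequences, invented for this sketch.
    toy_sequences = {
        "protA": "MKTAYIAKQR",
        "protB": "MKTAYLAKQK",
        "protC": "GSHWLPEDVN",
        "protD": "GSHWIPEDVN",
    }
    print(group_into_families(toy_sequences))
```

In the strategy described above, one solved structure per family would then serve as the guide for modeling the other members.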
Already, the PSI team has solved a total of more than 2,400 structures. Of these, more than 1,600 appear unrelated, suggesting that they might serve as guides for modeling the structures of other proteins in their families.
Perhaps even more significant, PSI researchers have developed new technologies that improve the speed and ease of determining molecular structures. Many of these new technologies are robots that automate previously labor-intensive steps in structure determination. Thanks to these robots, it is possible to solve structures faster than ever before. Besides benefiting the PSI team, these technologies have accelerated research in other fields.
PSI scientists (and structural biologists worldwide) send their findings to the Protein Data Bank at http://www.pdb.org. There, the information is freely available to advance research by the broader scientific community.
In addition to the protein folding code, which remains unbroken, there is another code, a genetic code, that scientists cracked in the mid-1960s. The genetic code reveals how living organisms use genes as instruction manuals to make proteins.
GAU aspartic acid
GAC aspartic acid
GAA glutamic acid
GAG glutamic acid
This table shows all possible mRNA triplets and the amino acids they specify. Note that most amino acids may be specified by more than one mRNA triplet. The highlighted entries are shown in the illustration below.
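The lookup that the table describes can be sketched in Python. The dictionary below holds only the four triplets listed above, not the full genetic code, and the function name and the example mRNA string are made up for illustration.

```python
# Partial codon table: only the four mRNA triplets listed in the text.
CODON_TABLE = {
    "GAU": "aspartic acid",
    "GAC": "aspartic acid",
    "GAA": "glutamic acid",
    "GAG": "glutamic acid",
}


def translate(mrna: str) -> list[str]:
    """Read an mRNA string three letters at a time and look up each triplet."""
    mrna = mrna.upper()
    if len(mrna) % 3 != 0:
        raise ValueError("mRNA length must be a multiple of three")
    amino_acids = []
    for i in range(0, len(mrna), 3):
        codon = mrna[i:i + 3]
        # Triplets outside this partial table are flagged rather than guessed.
        amino_acids.append(CODON_TABLE.get(codon, f"unknown ({codon})"))
    return amino_acids


if __name__ == "__main__":
    # Hypothetical mRNA fragment built from triplets in the table.
    print(translate("GAUGAGGAC"))
```

Because GAU and GAC both map to aspartic acid, two different mRNA strings can specify the same chain of amino acids, which is the redundancy noted above.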
- Each one of us has several hundred thousand different proteins in our body.
- Spider webs and silk fibers are made of the strong, pliable protein fibroin. Spider silk is stronger than a steel rod of the same diameter, yet it is much more elastic, so scientists hope to use it for products as diverse as bulletproof vests and artificial joints. The difficult part is harvesting the silk, because spiders are much less cooperative than silkworms!
- The light of fireflies (also called lightning bugs) is made possible by a protein called luciferase. Although most predators stay away from the bitter-tasting insects, some frogs eat so many fireflies that they glow!
- The deadly venoms of cobras, scorpions, and puffer fish contain small proteins that act as nerve toxins. Some sea snails stun their prey (and occasionally, unlucky humans) with up to 50 such toxins. One of these toxins has been developed into a drug called Prialt®, which is used to treat severe pain that is unresponsive even to morphine.
- Sometimes ships in the northwest Pacific Ocean leave a trail of eerie green light. The light is produced by a protein in jellyfish when the creatures are jostled by ships. Because the trail traces the path of ships at night, this green fluorescent protein has interested the Navy for many years. Many cell biologists also use it to fluorescently mark the cellular components they are studying.
- If a recipe calls for rhino horn, ibis feathers, and porcupine quills, try substituting your own hair or fingernails. It's all the same stuff—alpha-keratin, a tough, water-resistant protein that is also the main component of wool, scales, hooves, tortoise shells, and the outer layer of your skin.
What is a protein?
Name three proteins in your body and describe what they do.
What do we learn from studying the structures of proteins?
Describe the protein folding problem.