The Structures of Life

Chapter 1: Proteins are the Body's Worker Molecules

You've probably heard that proteins are important nutrients that help you build muscles. But they are much more than that. Proteins are worker molecules that are necessary for virtually every activity in your body. They circulate in your blood, seep from your tissues, and grow in long strands out of your head. Proteins are also the key components of biological materials ranging from silk fibers to elk antlers.

Proteins have many different functions in our bodies. By studying the structures of proteins, we are better able to understand how they function normally and how some proteins with abnormal shapes can cause disease.

Proteins Are Made From Small Building Blocks

Proteins are like long necklaces with differently shaped beads. Each "bead" is a small molecule called an amino acid. There are 20 standard amino acids, each with its own shape, size, and properties.

Proteins typically contain from 50 to 2,000 amino acids hooked end-to-end in many combinations. Each protein has its own sequence of amino acids.
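To get a feel for what "many combinations" means, here is a minimal sketch that counts how many different chains could be built at a given length. It assumes that any of the 20 standard amino acids may occupy each position independently; the function name is illustrative only.

```python
# Count the possible amino acid sequences of a given length, assuming any of
# the 20 standard amino acids can occupy each position.
NUM_AMINO_ACIDS = 20


def possible_sequences(length: int) -> int:
    """Return the number of distinct chains of `length` amino acids."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return NUM_AMINO_ACIDS ** length


if __name__ == "__main__":
    for n in (2, 3, 50):
        count = possible_sequences(n)
        digits = len(str(count))
        print(f"{n} amino acids: a {digits}-digit number of possible sequences")
```

Even at the short end of the range quoted above, the number of possible sequences is far larger than the number of proteins found in any organism.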

Proteins are made of amino acids hooked end-to-end like beads on a necklace.
To become active, proteins must twist and fold into their final, or "native," conformation.
This final shape enables proteins to accomplish their function in your body.

These amino acid chains do not remain straight and orderly. They twist and buckle, folding in upon themselves, the knobs of some amino acids nestling into grooves in others.

This process is complete almost immediately after proteins are made. Most proteins fold in less than a second, although the largest and most complex proteins may require several seconds to fold. Most proteins need help from other proteins, called "chaperones," to fold efficiently.

Proteins in All Shapes and Sizes

Because proteins have diverse roles in the body, they come in many shapes and sizes. Studies of these shapes teach us how the proteins function in our bodies and help us understand diseases caused by abnormal proteins.

To learn more about the proteins shown here, and many others, check out the Molecule of the Month section of the RCSB Protein Data Bank (http://www.pdb.org).

Molecule of the Month images by David S. Goodsell, The Scripps Research Institute

Collagen in our cartilage and tendons gains its strength from its three-stranded, ropelike structure.
Antibodies are immune system proteins that rid the body of foreign material, including bacteria and viruses. The two arms of the Y-shaped antibody bind to a foreign molecule. The stem of the antibody sends signals to recruit other members of the immune system.
Some proteins latch onto and regulate the activity of our genetic material, DNA. Some of these proteins are donut shaped, enabling them to form a complete ring around the DNA. Shown here is DNA polymerase III, which cinches around DNA and moves along the strands as it copies the genetic material.
Enzymes, which are proteins that facilitate chemical reactions, often contain a groove or pocket to hold the molecule they act upon. Shown here (clockwise from top) are luciferase, which creates the yellowish light of fireflies; amylase, which helps us digest starch; and reverse transcriptase, which enables HIV and related viruses to enslave infected cells.

Computer Graphics Advance Research

A ribbon diagram highlights organized regions of the protein (red and light blue).
A space-filling molecular model attempts to show atoms as spheres whose sizes correlate with the amount of space the atoms occupy. The same atoms are colored red and light blue in this model and in the ribbon diagram.
A surface rendering of the same protein shows its overall shape and surface properties. The red and blue coloration indicates the electrical charge of atoms on the protein's surface.

Decades ago, scientists who wanted to study three-dimensional molecular structures spent days, weeks, or longer building models out of rods, balls, and wire scaffolding.

Today, they use computer graphics. Within seconds, scientists can display a molecule in several different ways (like the three representations of a single protein shown here), manipulate it on the computer screen, simulate how it might interact with other molecules, and study how defects in its structure could cause disease.

To try one of these computer graphics programs, go to http://www.proteinexplorer.org or http://www.pdb.org.

Small Errors in Proteins Can Cause Disease

Normal Red Blood Cells

Sometimes, an error in just one amino acid can cause disease. Sickle cell disease, which most often affects those of African descent, is caused by a single error in the gene for hemoglobin, the oxygen-carrying protein in red blood cells.

This error, or mutation, results in an incorrect amino acid at one position in the molecule. Hemoglobin molecules with this incorrect amino acid stick together and distort the normally smooth, lozenge-shaped red blood cells into jagged sickle shapes.
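The effect of a single-letter error can be sketched in a few lines of code. The text above does not name the substitution; it is widely documented that the sickle cell change swaps glutamic acid for valine at one position in hemoglobin, which in mRNA terms corresponds to one codon changing from GAG to GUG. The three-codon fragment below is a made-up illustration rather than the real hemoglobin sequence, and the function names are hypothetical.

```python
# Codon meanings copied from the genetic code table in this chapter
# (only the three codons needed for this illustration).
CODON_TABLE = {
    "CCU": "proline",
    "GAG": "glutamic acid",
    "GUG": "valine",
}


def translate(mrna: str) -> list[str]:
    """Split an mRNA string into triplets and look up each amino acid."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna), 3)]
    return [CODON_TABLE[codon] for codon in codons]


def substitute(mrna: str, position: int, new_base: str) -> str:
    """Return a copy of `mrna` with the base at `position` (0-indexed) replaced."""
    return mrna[:position] + new_base + mrna[position + 1:]


# Hypothetical fragment, chosen for illustration only.
normal = "CCUGAGGAG"
# Change a single letter: the A in the middle codon becomes U (GAG -> GUG).
mutant = substitute(normal, 4, "U")

if __name__ == "__main__":
    print("normal:", translate(normal))
    print("mutant:", translate(mutant))
```

Only one of the nine letters differs between the two strings, yet the middle amino acid changes from glutamic acid to valine.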

The most common symptom of the disease is unpredictable pain in any body organ or joint, caused when the distorted blood cells jam together, unable to pass through small blood vessels. These blockages prevent oxygen-carrying blood from getting to organs and tissues. The frequency, duration, and severity of this pain vary greatly between individuals.

The disease affects about 1 in every 500 African Americans. About 1 in 12 carry the trait: they can pass it on to their children but do not have the disease themselves.

Sickled Red Blood Cells

Another disease caused by a defect in one amino acid is cystic fibrosis. This disease is most common in those of northern European descent, affecting about 1 in 2,500 Caucasians in the United States. Another 1 in 25 or 30 are carriers.

The disease is caused when a protein called CFTR is incorrectly folded. This misfolding is usually caused by the deletion of a single amino acid in CFTR. The function of CFTR, which stands for cystic fibrosis transmembrane conductance regulator, is to allow chloride ions (a component of table salt) to pass through the outer membranes of cells.

When this function is disrupted in cystic fibrosis, glands that produce sweat and mucus are most affected. A thick, sticky mucus builds up in the lungs and digestive organs, causing malnutrition, poor growth, frequent respiratory infections, and difficulties breathing. Those with the disorder usually die from lung disease around the age of 35.
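The carrier and disease frequencies quoted in this section are roughly consistent with each other under a simplified inheritance model. The sketch below assumes that both diseases are recessive (a child is affected only after inheriting the altered gene from both parents), that carriers pair up at random, and that each child of two carriers has a 1 in 4 chance of being affected. These assumptions are not stated in the text; they are common simplifications, and the function name is illustrative.

```python
from fractions import Fraction


def expected_disease_frequency(carrier_frequency: Fraction) -> Fraction:
    """Estimate how often a recessive disease appears, given carrier frequency.

    Simplified model: both parents must be carriers (random pairing), and
    each child of two carriers has a 1 in 4 chance of being affected.
    """
    both_parents_carriers = carrier_frequency * carrier_frequency
    return both_parents_carriers * Fraction(1, 4)


sickle_cell = expected_disease_frequency(Fraction(1, 12))
cystic_fibrosis_high = expected_disease_frequency(Fraction(1, 25))
cystic_fibrosis_low = expected_disease_frequency(Fraction(1, 30))

if __name__ == "__main__":
    print(f"Sickle cell estimate: 1 in {sickle_cell.denominator}")
    print(f"Cystic fibrosis estimate: 1 in {cystic_fibrosis_high.denominator}"
          f" to 1 in {cystic_fibrosis_low.denominator}")
```

The estimates (1 in 576 for sickle cell disease, and 1 in 2,500 to 1 in 3,600 for cystic fibrosis) land close to the reported figures of about 1 in 500 and 1 in 2,500.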

Parts of Some Proteins Fold Into Corkscrews

A mix of alpha helices and beta sheets. Image courtesy of RCSB Protein Data Bank (http://www.pdb.org).
Mostly beta sheets. Image courtesy of RCSB Protein Data Bank.
Mostly alpha helices. Image courtesy of RCSB Protein Data Bank.

When proteins fold, they don't randomly wad up into twisted masses. Often, short sections of proteins form recognizable shapes. Where a protein chain curves into a corkscrew, that section is called an alpha helix. Where it forms a flattened strip, it is a beta sheet.

These organized sections of a protein pack together with each other (or with other, less organized sections) to form the final, folded protein. Some proteins contain mostly alpha helices (red in the ribbon diagrams). Others contain mostly beta sheets (light blue), or a mix of alpha helices and beta sheets.
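The three categories described above can be sketched as a simple counting exercise. This example assumes each amino acid in a chain has already been labeled with one letter: "H" for alpha helix, "E" for beta sheet, and "C" for anything else. That labeling convention, the function name, and the 2-to-1 cutoff for "mostly" are all assumptions made for illustration; they do not come from the text.

```python
from collections import Counter


def summarize_secondary_structure(labels: str, dominant_ratio: float = 2.0) -> str:
    """Describe a protein by the balance of helix (H) and sheet (E) labels.

    `dominant_ratio` is an arbitrary cutoff: one type counts as "mostly"
    when it is at least that many times as common as the other.
    """
    counts = Counter(labels)
    helix, sheet = counts["H"], counts["E"]
    if helix == 0 and sheet == 0:
        return "no helices or sheets"
    if helix >= dominant_ratio * sheet:
        return "mostly alpha helices"
    if sheet >= dominant_ratio * helix:
        return "mostly beta sheets"
    return "a mix of alpha helices and beta sheets"


if __name__ == "__main__":
    # Made-up label strings, one letter per amino acid.
    for labels in ("HHHHHHHHCCEE", "EEEEEECCHH", "HHHHCCEEEE"):
        print(labels, "->", summarize_secondary_structure(labels))
```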

Mountain Climbing and Computational Modeling

Many scientists use computers to try to solve the protein folding problem. One example is David Baker, a mountain climber and computational biologist at the University of Washington. He designs software to predict protein structures and harnesses unused computer power from college dorm rooms to do so. Read more about David Baker at http://publications.nigms.nih.gov/findings/sept05/business.html.

The Problem of Protein Folding

A given sequence of amino acids almost always folds into a characteristic, three-dimensional structure. So scientists reason that the instructions for folding a protein must be encoded within this sequence. Researchers can easily determine a protein's amino acid sequence. But for more than 50 years they've tried—and failed—to crack the code that governs folding.

Scientists call this the "protein folding problem," and it remains one of the great challenges in structural biology. Although researchers have teased out some general rules and, in some cases, can make rough guesses of a protein's shape, they cannot accurately and reliably predict the position of every atom in the molecule based only on the amino acid sequence.

The medical incentives for cracking the folding code are great. Diseases including Alzheimer's, cystic fibrosis, and "mad cow" disease are thought to result from misfolded proteins. Many scientists believe that if we could decipher the structures of proteins from their sequences, we could better understand how the proteins function and malfunction. Then we could use that knowledge to improve the treatment of these diseases.

Structural Genomics: From Gene to Structure, and Perhaps Function

As part of the Protein Structure Initiative, research teams across the nation have determined thousands of molecular structures, including this structure of a protein from the organism that causes tuberculosis. Courtesy of the TB Structural Genomics Consortium.

The potential value of cracking the protein folding code skyrocketed after the launch, in the 1990s, of genome sequencing projects. These ongoing projects give scientists ready access to the complete genetic sequence of hundreds of organisms—including humans.

From these genetic sequences, scientists can easily obtain the corresponding amino acid sequences using the "genetic code."

The availability of complete genome sequences (and amino acid sequences) has opened up new avenues of research, such as studying the structure of all proteins from a single organism or comparing, across many different species, proteins that play a specific biological role.

The ultimate dream of structural biologists around the globe is to determine directly from genetic sequences not only the three-dimensional structure, but also some aspects of the function of all proteins.

They are partially there: They have identified amino acid sequences that code for certain structural features, such as a cylinder woven from beta sheets.

Researchers have also cataloged structural features that play specific biological roles. For example, a characteristic cluster of alpha helices strongly suggests that the protein binds to DNA.

But that is a long way from accurately determining a protein's structure based only on its genetic or amino acid sequence. Scientists recognized that achieving this long-term goal would require a focused, collaborative effort. So was born a new field called structural genomics.

In 2000, NIGMS launched a project in structural genomics called the Protein Structure Initiative or PSI. This multimillion-dollar project involves hundreds of scientists across the nation.

Members of the Protein Structure Initiative determined this structure of an enzyme from a common soil bacterium. Courtesy of the New York Structural GenomiX Consortium.

The PSI scientists are taking a calculated shortcut. Their strategy relies on two facts.

First, proteins can be grouped into families based on their amino acid sequence. Members of the same protein family often have similar structural features, just as members of a human family might all have long legs or high cheek bones.

Second, sophisticated computer programs can use previously solved structures as guides to predict other protein structures.

The PSI team expects that, if they solve a few thousand carefully selected protein structures, they can use computer modeling to predict the structures of hundreds of thousands of related proteins.
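The idea of grouping proteins into families and solving one representative per family can be sketched as follows. This is a toy illustration, not the PSI's actual method: it assumes the sequences are already lined up and of equal length, writes amino acids as one-letter codes, and uses an arbitrary 70 percent identity cutoff. The sequences, names, and cutoff are all invented for the example.

```python
def identity(a: str, b: str) -> float:
    """Fraction of positions where two equal-length sequences match."""
    if not a or len(a) != len(b):
        raise ValueError("this toy comparison needs non-empty, equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def group_into_families(sequences: dict[str, str],
                        threshold: float = 0.7) -> list[list[str]]:
    """Greedily assign each protein to the first family whose representative
    (the family's first member) it resembles closely enough."""
    families: list[list[str]] = []
    for name, seq in sequences.items():
        for family in families:
            representative = sequences[family[0]]
            if identity(seq, representative) >= threshold:
                family.append(name)
                break
        else:
            # No existing family is similar enough, so start a new one.
            families.append([name])
    return families


# Invented sequences for illustration only.
toy_proteins = {
    "protA": "MKTAYIAKQR",
    "protB": "MKTAYIAKQK",
    "protC": "MKSAYIGKQR",
    "protD": "GLWPHNDCEF",
}

if __name__ == "__main__":
    for family in group_into_families(toy_proteins):
        print(f"solve {family[0]} experimentally; model the rest: {family[1:]}")
```

In this sketch only the first member of each family would need an experimentally solved structure; the others would be candidates for computer modeling guided by that structure.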

Already, the PSI team has solved a total of more than 2,400 structures. Of these, more than 1,600 appear unrelated, suggesting that they might serve as guides for modeling the structures of other proteins in their families.

Perhaps even more significant, PSI researchers have developed new technologies that improve the speed and ease of determining molecular structures. Many of these new technologies are robots that automate previously labor-intensive steps in structure determination. Thanks to these robots, it is possible to solve structures faster than ever before. Besides benefiting the PSI team, these technologies have accelerated research in other fields.

PSI scientists (and structural biologists worldwide) send their findings to the Protein Data Bank at http://www.pdb.org. There, the information is freely available to advance research by the broader scientific community.

The Genetic Code

In addition to the protein folding code, which remains unbroken, there is another code, a genetic code, that scientists cracked in the mid-1960s. The genetic code reveals how living organisms use genes as instruction manuals to make proteins.

First  Second letter
letter U                   C                   A                   G
U      UUU phenylalanine   UCU serine          UAU tyrosine        UGU cysteine
       UUC phenylalanine   UCC serine          UAC tyrosine        UGC cysteine
       UUA leucine         UCA serine          UAA stop            UGA stop
       UUG leucine         UCG serine          UAG stop            UGG tryptophan
C      CUU leucine         CCU proline         CAU histidine       CGU arginine
       CUC leucine         CCC proline         CAC histidine       CGC arginine
       CUA leucine         CCA proline         CAA glutamine       CGA arginine
       CUG leucine         CCG proline         CAG glutamine       CGG arginine
A      AUU isoleucine      ACU threonine       AAU asparagine      AGU serine
       AUC isoleucine      ACC threonine       AAC asparagine      AGC serine
       AUA isoleucine      ACA threonine       AAA lysine          AGA arginine
       AUG methionine      ACG threonine       AAG lysine          AGG arginine
G      GUU valine          GCU alanine         GAU aspartic acid   GGU glycine
       GUC valine          GCC alanine         GAC aspartic acid   GGC glycine
       GUA valine          GCA alanine         GAA glutamic acid   GGA glycine
       GUG valine          GCG alanine         GAG glutamic acid   GGG glycine

This table shows all possible mRNA triplets and the amino acids they specify. Note that most amino acids may be specified by more than one mRNA triplet. The highlighted entries are shown in the illustration below.
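Reading the table amounts to a lookup, which can be sketched in code. The example below builds all 64 entries in the same order as the table and then reads an mRNA string three letters at a time. It writes amino acids as standard one-letter codes (for example F for phenylalanine and M for methionine), with "*" standing for stop; those abbreviations, the function names, and the sample DNA string are assumptions added for illustration. The `transcribe` step is a simplification that treats the input as the DNA strand matching the mRNA, so it only swaps T for U.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the same order as the table above:
# first letter, then second, then third, each running U, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): amino
    for triplet, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna_coding_strand: str) -> str:
    """Convert a DNA coding strand to mRNA by replacing T with U."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read mRNA triplet by triplet until a stop codon or the end of the string."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":  # stop codon: translation ends here
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGTTTGGATAA")  # made-up sequence
    print(mrna, "->", translate(mrna))
```

Because several triplets map to the same letter in `AMINO_ACIDS`, the lookup also shows why most amino acids can be specified by more than one triplet.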

Provocative Proteins

  • Each one of us has several hundred thousand different proteins in our body.
  • Spider webs and silk fibers are made of the strong, pliable protein fibroin. Spider silk is stronger than a steel rod of the same diameter, yet it is much more elastic, so scientists hope to use it for products as diverse as bulletproof vests and artificial joints. The difficult part is harvesting the silk, because spiders are much less cooperative than silkworms!
  • The light of fireflies (also called lightning bugs) is made possible by a protein called luciferase. Although most predators stay away from the bitter-tasting insects, some frogs eat so many fireflies that they glow!
  • The deadly venoms of cobras, scorpions, and puffer fish contain small proteins that act as nerve toxins. Some sea snails stun their prey (and occasionally, unlucky humans) with up to 50 such toxins. One of these toxins has been developed into a drug called Prialt®, which is used to treat severe pain that is unresponsive even to morphine.
  • Sometimes ships in the northwest Pacific Ocean leave a trail of eerie green light. The light is produced by a protein in jellyfish when the creatures are jostled by ships. Because the trail traces the path of ships at night, this green fluorescent protein has interested the Navy for many years. Many cell biologists also use it to fluorescently mark the cellular components they are studying.
  • If a recipe calls for rhino horn, ibis feathers, and porcupine quills, try substituting your own hair or fingernails. It's all the same stuff: alpha-keratin, a tough, water-resistant protein that is also the main component of wool, scales, hooves, tortoise shells, and the outer layer of your skin.

Got It?

What is a protein?

Name three proteins in your body and describe what they do.

What do we learn from studying the structures of proteins?

Describe the protein folding problem.

This page last reviewed on October 27, 2011