By Paul Dohrman
Updated Mar 24, 2022
Proteins are long polymers built from the 20 standard amino acids. Although some proteins incorporate non‑canonical residues, the backbone of every protein is a chain of amino acids linked by peptide bonds.
The journey begins in the nucleus where a segment of DNA is transcribed into messenger RNA (mRNA). The mRNA exits the nucleus and binds to a ribosome, the cell’s protein‑synthesizing machine. Transfer RNA (tRNA) molecules bring the appropriate amino acids to the ribosome, where they are sequentially added to the growing polypeptide chain.
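As a rough illustration of the decoding step, the sketch below reads an mRNA string three bases (one codon) at a time and looks up the amino acid that a matching tRNA would deliver. The function name `translate`, the example sequence, and the deliberately partial table are assumptions made for this illustration. The codon assignments shown follow the standard genetic code, but a real ribosome selects its start site in a more involved way than finding the first AUG.

```python
# Partial codon table (standard genetic code). Only the codons used in the
# example are listed; a complete table has 64 entries.
CODON_TABLE = {
    "AUG": "Met",  # methionine, also the usual start codon
    "GCU": "Ala",
    "GGC": "Gly",
    "UUU": "Phe",
    "AAA": "Lys",
    "UAA": None,   # stop codons carry no amino acid
    "UAG": None,
    "UGA": None,
}


def translate(mrna):
    """Return the list of residues encoded by an mRNA string (hypothetical helper)."""
    start = mrna.find("AUG")  # simplification: begin at the first AUG
    if start == -1:
        return []
    peptide = []
    # Step through the message one codon (three bases) at a time.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon!r} is not in this partial table")
        residue = CODON_TABLE[codon]
        if residue is None:  # stop codon: release the finished chain
            break
        peptide.append(residue)
    return peptide


print(translate("GGAUGGCUUUUAAAUAA"))  # ['Met', 'Ala', 'Phe', 'Lys']
```

The loop mirrors the sequential addition described above: each pass appends one residue to the growing chain until a stop codon ends it.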
Adjacent amino acids are joined head‑to‑tail via peptide bonds: the carboxyl group (‑COOH) of one residue bonds to the amino group (‑NH₂) of the next. The resulting chain is called a polypeptide. The peptide bond itself is rigid and planar, but the single bonds on either side of each residue's central carbon can rotate, giving the chain the flexibility it needs to fold.
Each amino acid has a distinct side chain (R‑group) attached to its central carbon. These side chains differ in size, charge, and hydrophobicity, influencing how the chain interacts with itself and with the aqueous cellular environment. Polar side chains tend to orient toward the solvent, while nonpolar groups cluster inside the protein core, driving the folding process.
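One common way to put a number on side‑chain hydrophobicity is the Kyte‑Doolittle hydropathy scale, in which positive values mark nonpolar residues and negative values mark polar or charged ones. The sketch below averages those values over a short peptide written in one‑letter codes. The helper name `mean_hydropathy` and the example peptides are hypothetical, and a simple average is only a crude indicator, not a prediction of where a segment ends up in the folded protein.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino-acid code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a peptide (hypothetical helper for illustration)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# A stretch of nonpolar residues scores positive (more likely to be buried),
# while a stretch of charged residues scores negative (more likely exposed).
print(mean_hydropathy("LIVLIV") > 0)  # True
print(mean_hydropathy("KDEKDE") < 0)  # True
```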
The primary amino‑acid sequence determines the three‑dimensional shape of the protein. Because the backbone is flexible, a polypeptide can sample many conformations, and most settle spontaneously into a single, energetically favored one. Even a single amino‑acid substitution can disrupt folding and render the protein nonfunctional.
With 20 amino acids available, there are 20ⁿ theoretical polypeptides of length n. However, only a minuscule fraction of these sequences fold into stable, functional proteins. The vast majority would be unstable or adopt multiple low‑energy conformations, so evolutionary pressure selects only the few sequences that meet the organism’s functional needs.
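To get a feel for how quickly that count grows, the short sketch below raises 20 to the power n for a few chain lengths. The function name `sequence_space` and the chosen lengths are assumptions made for illustration only.

```python
def sequence_space(n, alphabet_size=20):
    """Number of distinct chains of length n built from alphabet_size residues."""
    return alphabet_size ** n


for n in (1, 2, 10, 100):
    count = sequence_space(n)
    # Python integers are arbitrary precision, so the count is exact;
    # the digit count gives a sense of scale for the larger values.
    print(f"n = {n:>3}: {len(str(count))} digits")

print(sequence_space(2))   # 400
print(sequence_space(10))  # 10240000000000
```

Even a modest chain of 100 residues has a 131‑digit number of possible sequences, which is why functional proteins occupy such a tiny corner of the theoretical space.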