The essential building blocks of proteins and their properties
Amino acids are the fundamental building blocks of proteins and peptides. Every amino acid shares a common core structure but differs in its side chain (R group), which determines its unique chemical and physical properties.
At physiological pH (~7.4), amino acids exist as zwitterions, molecules that carry both a positive charge (on the amino group) and a negative charge (on the carboxyl group) simultaneously:
     NH₃⁺
     |
R — Cα — COO⁻
     |
     H
This dipolar form is the predominant species in aqueous solution at neutral pH. An amino acid with a non-ionizable side chain has zero net charge, yet it is not uncharged: both the positive and negative charges are present and electrostatically active.
Why this matters: The zwitterionic character affects solubility, crystal packing, and reactivity. It also explains why amino acids have relatively high melting points (strong ionic interactions) compared to similar-sized uncharged molecules.
Twenty amino acids are encoded by the standard genetic code and commonly found in proteins. Each is abbreviated by both a three-letter code and a single-letter code:
Common mnemonics for remembering amino acids:
Amino acids can be classified in multiple ways based on their chemical and physical properties. The most common classification is by side chain polarity and charge.
Nonpolar (hydrophobic) members: Gly (G), Ala (A), Val (V), Leu (L), Ile (I), Met (M), Phe (F), Trp (W), Pro (P)
Characteristics:
Special cases:
• Glycine (G): Smallest side chain (R = H), achiral, provides flexibility
• Proline (P): Cyclic structure, restricts backbone flexibility, "helix breaker"
• Aromatic (F, W, Y): Contain benzene-like rings; Trp and Tyr absorb UV light near 280 nm, while Phe absorbs only weakly (near 257 nm)
Polar uncharged members: Ser (S), Thr (T), Cys (C), Tyr (Y), Asn (N), Gln (Q)
Characteristics:
Special cases:
• Cysteine (C): Thiol group (–SH) can form disulfide bonds (S–S), pKa ~8.3
• Tyrosine (Y): Phenolic –OH can be ionized at high pH, pKa ~10.1
• Serine & Threonine: Hydroxyl groups, sites for phosphorylation (PTM)
Acidic (negatively charged) members: Asp (D), Glu (E)
Characteristics:
Nomenclature note: "Aspartate" and "glutamate" refer to the ionized forms (–COO⁻), while "aspartic acid" and "glutamic acid" refer to the protonated forms (–COOH). At pH 7.4, they exist almost entirely as aspartate and glutamate.
Basic (positively charged) members: Lys (K), Arg (R), His (H)
Characteristics:
Special cases:
• Lysine (K): pKa ~10.5, almost always protonated (+) at pH 7.4
• Arginine (R): pKa ~12.5, always protonated (+) at biological pH
• Histidine (H): pKa ~6.0, can switch between protonated/deprotonated near pH 7.4
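The protonation states quoted for Lys, Arg, and His follow from the Henderson-Hasselbalch relationship. The sketch below is a minimal illustration that uses the approximate side-chain pKa values listed above; the function name is hypothetical, and real pKa values shift with the local environment in a folded protein.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a titratable group in its protonated form at a given pH
    (Henderson-Hasselbalch rearranged)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Approximate side-chain pKa values quoted in the text above.
BASIC_SIDE_CHAIN_PKA = {"Lys": 10.5, "Arg": 12.5, "His": 6.0}

for name, pka in BASIC_SIDE_CHAIN_PKA.items():
    frac = protonated_fraction(pka, ph=7.4)
    print(f"{name}: {frac:.1%} protonated at pH 7.4")
```

With these values, Lys and Arg come out essentially fully protonated at pH 7.4, while His is only a few percent protonated and reaches 50% at pH 6.0, which is why small pH changes near neutrality can switch its charge.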
Understanding amino acid categories helps predict:
Amino acids link together through peptide bonds to form chains. Understanding peptide bond formation, structure, and properties is essential for comprehending peptide behavior.
Key points:
The peptide bond's partial double bond character makes it remarkably stable. This stability is essential for proteins to maintain their structure, but it also means:
When amino acids link together through peptide bonds, they form peptide chains with distinct structural features and conventions.
Peptide sequences are written from N-terminus to C-terminus (left to right), using either three-letter or one-letter amino acid codes:
Why N→C convention? This matches the direction of ribosomal synthesis (proteins are built from N-terminus to C-terminus) and is universally used in biochemistry.
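As an illustration of the two notations, the sketch below converts a one-letter sequence, read N-terminus to C-terminus, into hyphenated three-letter codes. The function name and the example sequence are hypothetical; only the 20 standard amino acids are handled.

```python
# Standard one-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence: str) -> str:
    """Convert a one-letter sequence (N to C) to hyphenated three-letter codes."""
    try:
        return "-".join(THREE_LETTER[aa] for aa in sequence.upper())
    except KeyError as err:
        raise ValueError(f"Not a standard amino acid code: {err.args[0]!r}") from None


print(to_three_letter("GAVK"))  # Gly-Ala-Val-Lys
```

The leftmost residue (Gly here) carries the free amino group and the rightmost (Lys) carries the free carboxyl group.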
N-terminus structure: Free amino group (–NH₃⁺ at pH 7.4)
Also called: Amino terminus, N-terminal end
Charge: Usually +1 at physiological pH (pKa ~9.0)
Can be modified: Acetylation (Ac-) removes the charge
C-terminus structure: Free carboxyl group (–COO⁻ at pH 7.4)
Also called: Carboxyl terminus, C-terminal end
Charge: Usually -1 at physiological pH (pKa ~3.1)
Can be modified: Amidation (–NH₂) removes the charge
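The terminal charges and the effect of capping can be estimated with the Henderson-Hasselbalch relationship. This is a rough sketch that assumes the approximate pKa values quoted above (9.0 for the N-terminus, 3.1 for the C-terminus); the function and parameter names are illustrative.

```python
def terminal_charges(ph: float = 7.4, acetylated: bool = False, amidated: bool = False):
    """Return (N-terminal charge, C-terminal charge) at the given pH.

    An acetylated N-terminus or an amidated C-terminus is treated as uncharged.
    """
    n_term_pka, c_term_pka = 9.0, 3.1  # approximate values from the text
    n_charge = 0.0 if acetylated else 1.0 / (1.0 + 10.0 ** (ph - n_term_pka))
    c_charge = 0.0 if amidated else -1.0 / (1.0 + 10.0 ** (c_term_pka - ph))
    return n_charge, c_charge


free_n, free_c = terminal_charges()
capped_n, capped_c = terminal_charges(acetylated=True, amidated=True)
print(f"free termini:   N = {free_n:+.2f}, C = {free_c:+.2f}")
print(f"capped termini: N = {capped_n:+.2f}, C = {capped_c:+.2f}")
```

Capping one terminus but not the other shifts the peptide's net charge by roughly one unit, which is worth keeping in mind when comparing a synthetic capped peptide with its uncapped sequence.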
| Feature | Peptides | Proteins |
|---|---|---|
| Length | Typically 2-50 amino acids | Typically >50 amino acids |
| Structure | Often linear, may have some secondary structure | Complex 3D structure with defined tertiary/quaternary structure |
| Synthesis | Can be chemically synthesized (solid-phase peptide synthesis) | Usually produced by recombinant expression in cells |
| Stability | Generally stable, flexible | Require proper folding for stability and function |
| Examples | Oxytocin (9 aa), Insulin B-chain (30 aa), Melittin (26 aa) | Lysozyme (129 aa), Hemoglobin (574 aa), Titin (38,138 aa) |
Note: The boundary between "peptide" and "protein" is a convention, not a strict definition. Chains shorter than about 50 amino acids are generally called peptides and longer chains proteins, but usage varies near the cutoff: insulin, at 51 residues, sits right at the boundary and is usually called a protein.
Beyond simple linear peptides, several special categories exist:
Several key properties can be calculated or predicted from a peptide's amino acid sequence. These properties are essential for understanding peptide behavior and designing experiments.
What it is (molecular weight): The sum of atomic masses of all atoms in the peptide.
How to calculate: Add the mass of each amino acid residue (amino acid minus H₂O) plus terminal groups (H for N-term, OH for C-term).
Why it matters: Essential for mass spectrometry, concentration calculations, and gel electrophoresis (SDS-PAGE).
Units: Daltons (Da) or g/mol
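The residue-sum recipe above can be sketched as follows. The average residue masses are standard tabulated values (amino acid minus H₂O), and one water is added back for the terminal H and OH. The names are illustrative, and the sketch assumes an unmodified linear peptide with free termini and no disulfide bonds.

```python
# Average residue masses in Da (amino acid minus H2O).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # terminal H (N-term) + OH (C-term)


def molecular_weight(sequence: str) -> float:
    """Average molecular weight (Da) of a linear, unmodified peptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


print(f"Gly-Ala: {molecular_weight('GA'):.2f} Da")
```

For mass spectrometry, monoisotopic residue masses would be used in place of the average masses shown here.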
What it is (net charge): The algebraic sum of all positive and negative charges at a given pH.
How to calculate: Identify the ionizable groups (N-terminus, C-terminus, K, R, H, D, E, plus C and Y at higher pH), use the Henderson-Hasselbalch equation to get each group's fractional charge from the pH and its pKa, and sum the results.
Why it matters: Determines electrostatic interactions, membrane binding, and purification strategy (ion exchange chromatography).
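A minimal sketch of the net-charge calculation follows. It uses the pKa values quoted elsewhere in this article for the termini, Lys, Arg, His, Cys, and Tyr. The Asp and Glu side-chain pKa values (3.9 and 4.3) are not given in the text and are assumed typical textbook values. All names are illustrative, and every Cys is treated as a free thiol.

```python
# pKa values: termini, K, R, H, C, Y from the text; D and E are ASSUMED typical values.
POSITIVE_PKA = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"C_term": 3.1, "D": 3.9, "E": 4.3, "C": 8.3, "Y": 10.1}


def net_charge(sequence: str, ph: float = 7.4) -> float:
    """Estimate the net charge of a linear peptide with free termini."""
    # Terminal groups: protonated amine is +1, deprotonated carboxyl is -1.
    charge = 1.0 / (1.0 + 10.0 ** (ph - POSITIVE_PKA["N_term"]))
    charge -= 1.0 / (1.0 + 10.0 ** (NEGATIVE_PKA["C_term"] - ph))
    for aa in sequence.upper():
        if aa in POSITIVE_PKA:    # basic side chains: fraction protonated
            charge += 1.0 / (1.0 + 10.0 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:  # acidic side chains: fraction deprotonated
            charge -= 1.0 / (1.0 + 10.0 ** (NEGATIVE_PKA[aa] - ph))
    return charge


for seq in ("KKKK", "DDDD", "GAVL"):
    print(f"{seq}: {net_charge(seq):+.2f} at pH 7.4")
```

Different pKa tables give slightly different results, so a computed charge is best read as an estimate rather than an exact value.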
What it is (isoelectric point, pI): The pH at which the peptide carries zero net charge.
How to calculate: Find the pH where positive charges = negative charges using iterative numerical methods (bisection, Newton-Raphson).
Why it matters: Minimum solubility at pI, maximum aggregation at pI, critical for isoelectric focusing (IEF) and formulation.
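Because net charge decreases monotonically as pH rises, bisection is enough to locate the pI. The sketch below is self-contained and repeats a simple Henderson-Hasselbalch charge function. The Asp and Glu pKa values (3.9 and 4.3) are assumed typical values that are not given in the text, and all names are illustrative.

```python
# pKa values: termini, K, R, H, C, Y from the text; D and E are ASSUMED typical values.
POSITIVE_PKA = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"C_term": 3.1, "D": 3.9, "E": 4.3, "C": 8.3, "Y": 10.1}


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a linear peptide with free termini at the given pH."""
    groups = ["N_term", "C_term", *sequence.upper()]
    charge = 0.0
    for g in groups:
        if g in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10.0 ** (ph - POSITIVE_PKA[g]))
        elif g in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10.0 ** (NEGATIVE_PKA[g] - ph))
    return charge


def isoelectric_point(sequence: str, lo: float = 0.0, hi: float = 14.0,
                      tol: float = 1e-4) -> float:
    """Find the pH of zero net charge by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid  # still positive: the pI lies at higher pH
        else:
            hi = mid  # zero or negative: the pI lies at lower pH
    return (lo + hi) / 2.0


for seq in ("GG", "KKKK", "DDDD"):
    print(f"{seq}: pI ≈ {isoelectric_point(seq):.2f}")
```

For a peptide with no ionizable side chains, such as Gly-Gly, the result reduces to the average of the two terminal pKa values, (9.0 + 3.1) / 2 = 6.05 under the values assumed here.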
What it is (hydrophobicity): A measure of how "water-fearing" vs "water-loving" the peptide is.
How to calculate: Sum hydrophobicity values for each amino acid using scales like Wimley-White (based on water→membrane transfer free energy).
Why it matters: Predicts membrane interactions, retention in reverse-phase HPLC, and tendency to aggregate. Essential for cell-penetrating peptides.
Units: ΔG (kcal/mol)
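A hedged sketch of the summation is shown below. The per-residue values are the Wimley-White interfacial (water to bilayer interface) free energies as commonly tabulated, with Asp, Glu, Lys, and Arg in their charged forms and His neutral. Treat them as assumptions and verify them against the original tables before relying on the numbers. Negative totals indicate favorable partitioning into the membrane interface. Names are illustrative.

```python
# Wimley-White interfacial scale, kcal/mol (ASSUMED from commonly tabulated values;
# D, E, K, R in charged form, H neutral). Verify before use.
WIMLEY_WHITE_INTERFACE = {
    "A": 0.17, "R": 0.81, "N": 0.42, "D": 1.23, "C": -0.24,
    "Q": 0.58, "E": 2.02, "G": 0.01, "H": 0.17, "I": -0.31,
    "L": -0.56, "K": 0.99, "M": -0.23, "F": -1.13, "P": 0.45,
    "S": 0.13, "T": 0.14, "W": -1.85, "Y": -0.94, "V": 0.07,
}


def hydrophobicity(sequence: str, scale: dict = WIMLEY_WHITE_INTERFACE) -> float:
    """Sum per-residue transfer free energies (kcal/mol) over the sequence."""
    return sum(scale[aa] for aa in sequence.upper())


for seq in ("WFLW", "KDEK"):
    print(f"{seq}: {hydrophobicity(seq):+.2f} kcal/mol")
```

Passing the scale as a parameter makes it easy to swap in another hydrophobicity scale without changing the summation.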
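What it is (extinction coefficient): How strongly the peptide absorbs UV light at 280 nm wavelength.
How to calculate: Sum the contributions of the groups that absorb at 280 nm: ε₂₈₀ = 5,500 × (number of Trp) + 1,490 × (number of Tyr) + 125 × (number of Cys-Cys disulfides).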
Why it matters: Allows accurate concentration measurement using UV-Vis spectroscopy via Beer-Lambert law (A = εcl).
Units: M⁻¹cm⁻¹
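The counting rule and the Beer-Lambert law combine into a short concentration estimate. In this sketch the function names, the example sequence, and the absorbance reading are hypothetical. The number of disulfide bonds must be supplied by the user because it cannot be read from the sequence alone.

```python
def extinction_coefficient_280(sequence: str, n_disulfides: int = 0) -> int:
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_disulfides


def molar_concentration(absorbance: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law rearranged: c = A / (epsilon * l), in mol/L."""
    if epsilon <= 0:
        raise ValueError("No Trp, Tyr, or disulfides: A280 cannot be used.")
    return absorbance / (epsilon * path_cm)


eps = extinction_coefficient_280("GWAYK")     # one Trp + one Tyr
conc = molar_concentration(0.35, eps)         # hypothetical A280 reading
print(f"epsilon = {eps} M^-1 cm^-1, c = {conc * 1e6:.1f} µM")
```

Peptides without Trp, Tyr, or disulfides have essentially no absorbance at 280 nm under this rule, so another quantification method is needed for them.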
What it is (molecular formula): The count of each element (C, H, N, O, S) in the peptide.
How to calculate: Sum the elemental composition of each amino acid residue plus terminal groups (H and OH).
Why it matters: Needed for accurate mass calculations, isotope labeling experiments, and elemental analysis.
Example: Gly-Ala = C₅H₁₀N₂O₃
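The Gly-Ala example can be reproduced by summing residue compositions and adding the terminal H and OH. This is a minimal sketch for unmodified linear peptides in their neutral form; the table layout and function name are illustrative.

```python
# Residue composition (amino acid minus H2O) as (C, H, N, O, S) atom counts.
RESIDUE_ATOMS = {
    "G": (2, 3, 1, 1, 0),  "A": (3, 5, 1, 1, 0),  "S": (3, 5, 1, 2, 0),
    "P": (5, 7, 1, 1, 0),  "V": (5, 9, 1, 1, 0),  "T": (4, 7, 1, 2, 0),
    "C": (3, 5, 1, 1, 1),  "L": (6, 11, 1, 1, 0), "I": (6, 11, 1, 1, 0),
    "N": (4, 6, 2, 2, 0),  "D": (4, 5, 1, 3, 0),  "Q": (5, 8, 2, 2, 0),
    "K": (6, 12, 2, 1, 0), "E": (5, 7, 1, 3, 0),  "M": (5, 9, 1, 1, 1),
    "H": (6, 7, 3, 1, 0),  "F": (9, 9, 1, 1, 0),  "R": (6, 12, 4, 1, 0),
    "Y": (9, 9, 1, 2, 0),  "W": (11, 10, 2, 1, 0),
}
ELEMENTS = "CHNOS"


def molecular_formula(sequence: str) -> str:
    """Return the molecular formula of a linear, unmodified peptide."""
    totals = [0, 2, 0, 1, 0]  # start with terminal H + OH (one H2O)
    for aa in sequence.upper():
        for i, count in enumerate(RESIDUE_ATOMS[aa]):
            totals[i] += count
    parts = []
    for element, count in zip(ELEMENTS, totals):
        if count:  # skip absent elements; omit the subscript when count is 1
            parts.append(element if count == 1 else f"{element}{count}")
    return "".join(parts)


print(molecular_formula("GA"))  # C5H10N2O3
```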
PepDraw calculates all of these properties automatically from a peptide sequence entered in the app.
With the exception of glycine (R = H), every standard amino acid has a chiral center at the α-carbon, so it can exist as two non-superimposable mirror images (enantiomers).
     COO⁻
     |
H₃N⁺—C—H
     |
     R
(L-amino acid)
How L-amino acids are rendered in PepDraw (up and down):
  COO⁻
  |
H—C—NH₃⁺
  |
  R
(D-amino acid)
How D-amino acids are rendered in PepDraw (up and down):
There are two systems for naming stereochemistry:
Important: Most L-amino acids are (S), but L-cysteine is (R): its –CH₂SH side chain outranks the carboxyl group under the Cahn-Ingold-Prelog priority rules because sulfur has a higher atomic number than oxygen. Don't assume L = S.
The 20 standard amino acids differ only in their side chains (R groups), yet these variations create enormous functional diversity. Understanding side chain properties is key to predicting peptide behavior.
Side chains range from the smallest (Gly, R = H) to large bulky groups (Trp, with an indole ring).
Functional impact: Small residues (Gly) provide flexibility; large residues restrict conformations and drive specific folding patterns.
This is the most common classification scheme (covered in detail above). Briefly:
Several side chains can donate or accept hydrogen bonds:
Functional impact: H-bonding stabilizes secondary structure, mediates protein-protein interactions, and is critical in enzyme active sites.
Some amino acids have unique reactive groups:
Cysteine (Cys, C):
• Can form disulfide bonds (Cys-S-S-Cys)
• Nucleophilic at high pH (thiolate, –S⁻)
• Redox active (oxidation ↔ reduction)
• pKa ~8.3, partially deprotonated at pH 7.4
Tyrosine (Tyr, Y):
• Weakly acidic (pKa ~10.1)
• Can be phosphorylated (PTM)
• Absorbs UV light (280 nm)
• Neutral at physiological pH
Histidine (His, H):
• pKa ~6.0 (near physiological pH!)
• Acts as both acid and base
• pH sensor in proteins
• Essential in many enzyme active sites
Tryptophan (Trp, W):
• Largest standard side chain
• Strongly absorbs UV (280 nm, ε = 5,500 M⁻¹cm⁻¹)
• Hydrophobic but can H-bond
• Fluorescent (intrinsic fluorescence)
Arginine (Arg, R):
• Protonated at all biological pH values (pKa ~12.5)
• Delocalized positive charge
• Forms strong salt bridges
• Critical for DNA/RNA binding
Methionine (Met, M):
• Contains sulfur but is NOT ionizable
• Susceptible to oxidation
• Hydrophobic character
• Important in protein folding
Certain amino acids serve as sites for chemical modifications that expand protein functionality: