Foundational Discovery · 1955

Sanger's Sequencing of Insulin

Portrait of Frederick Sanger, who sequenced the insulin protein — Public domain (Wikimedia Commons)

In the late 1940s, proteins were known to be long chains of amino acids, but whether each protein had a single fixed sequence was still an open question. Some biochemists entertained the idea that proteins might be statistical mixtures of sequences, or that sequence might vary between individuals or species. Frederick Sanger, working in the Department of Biochemistry at the University of Cambridge (with Medical Research Council support from 1951), set out to answer the question by reading the complete sequence of a protein small enough to be tractable. Insulin, available in pure form and in large quantities from commercial pancreatic preparations, was the obvious candidate.

The technical obstacle was that no method existed to read a protein's sequence directly. Sanger developed one. He introduced 2,4-dinitrofluorobenzene, later called Sanger's reagent, to label the free amino terminus of a peptide chain. The label survives acid hydrolysis, so after the chain is broken into its amino acids the tagged terminal residue can be identified. By cleaving insulin at different positions with partial acid hydrolysis and with proteolytic enzymes, he generated sets of overlapping fragments, sequenced each small fragment, and used the overlaps to work out where each one fell in the overall chain. In this way he assembled the full sequence of both chains over roughly ten years of work. The A chain had 21 residues; the B chain had 30. Two disulfide bonds connected the chains at specific positions, and a third lay within the A chain itself.
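The overlap logic described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a reconstruction of Sanger's procedure: the 12-residue peptide and its four fragments are invented for the example, the function names `overlap` and `assemble` are hypothetical, and the greedy "merge the best-overlapping pair" strategy is a simple stand-in for work that was done by hand from chromatography data.

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments: list[str]) -> str:
    """Greedily merge the two fragments with the largest overlap until one remains."""
    # Drop exact duplicates and fragments fully contained in another fragment.
    unique = list(dict.fromkeys(fragments))
    pieces = [p for p in unique if not any(p != q and p in q for q in unique)]

    while len(pieces) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    size = overlap(a, b)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            # No two fragments share an end: a different cleavage is needed.
            raise ValueError("fragments do not overlap; more cleavage data needed")
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces[0]


# Invented fragments of a made-up peptide, in one-letter amino acid codes.
# Imagine each came from a different cleavage and was sequenced separately.
fragments = ["CASLY", "GIVE", "LYNF", "VEQCA"]
print(assemble(fragments))  # GIVEQCASLYNF
```

The toy peptide uses each residue only once, so every overlap is unambiguous. Real proteins repeat residues and short motifs, which is one reason several independent cleavage methods were needed to fix the order with confidence.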

The paper describing the complete structure appeared in the Biochemical Journal in 1955. The demonstration that a protein had one and only one defined sequence was not merely a fact about insulin; it resolved the conceptual question for all proteins. If sequence was fixed, it had to be encoded somewhere in the cell, and whatever stored genetic information had to specify an exact linear order. The finding created logical pressure on everyone thinking about genes and heredity.

Watson and Crick had published the double helix structure two years earlier, in 1953. Sanger's result provided the complementary constraint: if DNA was the hereditary molecule, it needed to encode specific sequences, which the base-pair template mechanism could accomplish. The two discoveries together made the genetic code a solvable problem, one that Crick, Brenner, Nirenberg, and others solved by the mid-1960s. Sanger received the Nobel Prize in Chemistry in 1958 for the insulin work.

The peptide-sequencing techniques Sanger developed became standard tools in protein biochemistry through the 1960s and early 1970s. He then turned to nucleic acid sequencing, developing chain-terminating dideoxynucleotide sequencing for DNA in 1977. That method, refined through the early 1980s, became the basis for automated DNA sequencing used in the Human Genome Project. Sanger received a second Nobel Prize in Chemistry in 1980 for that work, making him the first person to win the Chemistry prize twice and one of only a handful of people to receive two Nobel Prizes.

Key People

Frederick Sanger — Biochemist who sequenced insulin; Nobel Prizes in 1958 and 1980.
Hans Tuppy — Austrian biochemist who collaborated with Sanger on the B chain sequence.
E.O.P. Thompson — Collaborated with Sanger on the A chain sequence analysis.

Original paper: Biochemical Journal, 1955

