ABOUT 120,000 types of protein molecule have yielded up their structures to science. That sounds a lot, but it isn't. The techniques, such as X-ray crystallography and nuclear-magnetic resonance (NMR), which are used to elucidate such structures do not work on all proteins. Some types are hard to produce or purify in the volumes required. Others do not seem to crystallise at all, a prerequisite for probing them with X-rays. As a consequence, those structures that have been determined include representatives of less than a third of the 16,000 known protein families. Researchers can build reasonable computer models for around another third, because the structures of these resemble ones already known. For the remainder, however, there is nothing to go on.
In addition to this lack of information about protein families, there is a lack of information about those from the species of most interest to researchers: Homo sapiens. Only a quarter of known protein structures are human. A majority of the rest come from bacteria. This paucity is a problem, for in proteins, form and function are intimately related. A protein is a chain of smaller molecules, called amino acids, that is often hundreds or thousands of links long. By a process not well understood, this chain folds up, after it has been made, into a specific and complex three-dimensional shape. That shape determines what the protein does: acting as a channel, say, to admit a chemical into a cell; or as an enzyme to accelerate a chemical reaction; or as a receptor, to receive chemical signals and pass them on to a cell's molecular machinery. (Models of all three, in that order, are shown above.)
Almost all drugs work by binding to a particular protein in a particular place, thereby altering or disabling that protein's function. Designing new drugs is easier if binding sites can be identified in advance. But that means knowing the protein's structure. To be able to predict this from the order of the amino acids in the chain would thus be of enormous value. That is a hard task, but it is starting to be cracked.
Chain gang
One of the leading researchers in the field of protein folding is David Baker of the University of Washington, in Seattle. For the past 20 years he and his colleagues have used increasingly sophisticated versions of a program they call Rosetta to generate various possible shapes for a given protein, and then work out which is most stable and thus most likely to be the real one. In 2015 they predicted the structures of representative members of 58 of the missing protein families. Last month they followed that up by predicting 614 more.
Even a small protein can fold up into tens of thousands of shapes that are more or less stable. According to Dr Baker, a chain a mere 70 amino acids long (a tiddler in biological terms) has to be folded virtually inside a computer about 100,000 times in order to cover all the possibilities and thus find the optimum. Since it takes a standard microprocessor ten minutes to do the computations needed for a single one of these virtual foldings, even for a protein this small, the project has, for more than a decade, relied on cadging processing power from thousands of privately owned PCs. Volunteers download a version of Dr Baker's program, called rosetta@home, that runs in the background when a computer is otherwise idle.
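To see why borrowed processing power matters, the figures quoted above can be turned into a back-of-the-envelope estimate. The sketch below uses only the two numbers given (about 100,000 foldings at ten minutes each); the pool of 5,000 machines is a hypothetical stand-in for "thousands", and the calculation assumes, unrealistically, that the work divides perfectly and every machine is always available.

```python
# Back-of-the-envelope estimate from the figures quoted in the text.
FOLDINGS = 100_000         # virtual foldings for a 70-amino-acid chain
MINUTES_PER_FOLDING = 10   # on one standard microprocessor

MINUTES_PER_DAY = 60 * 24


def wall_clock_days(machines: int) -> float:
    """Days needed if the foldings are split evenly across `machines`.

    Assumes perfect parallelism and no idle time, which real volunteer
    computing will not achieve.
    """
    total_minutes = FOLDINGS * MINUTES_PER_FOLDING
    return total_minutes / machines / MINUTES_PER_DAY


if __name__ == "__main__":
    single = wall_clock_days(1)
    pooled = wall_clock_days(5_000)  # hypothetical pool size
    print(f"one processor: {single:.0f} days ({single / 365:.1f} years)")
    print(f"5,000 machines: {pooled * 24:.1f} hours")
```

On these assumptions a single processor would need the better part of two years for one small protein, whereas a pool of several thousand machines brings that down to hours.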
This citizen science has helped a lot. But the real breakthrough, which led to those 672 novel structures, is a shortcut known as protein-contact prediction. This relies on the observation that chain-folding patterns seen in nature bring certain pairs of amino acids close together predictably enough for the fact to be used in the virtual-folding process.
An amino acid has four arms, each connected to a central carbon atom. Two arms are the amine group and the acid group that give the molecule its name. Protein chains form because amine groups and acid groups like to react together and link up. The third is a single hydrogen atom. But the fourth can be any combination of atoms able to bond with the central carbon atom. It is this fourth arm, called the side chain, which gives each type of amino acid its individual characteristics.
One common protein-contact prediction is that, if the side chain of one member of a pair of amino acids brought close together by folding is long, then that of the other member will be short, and vice versa. In other words, the sum of the two lengths is constant. If you have but a single protein sequence available, knowing this is not much use. Recent developments in genomics, however, mean that the DNA sequences of lots of different species are now available. Since DNA encodes the amino-acid sequences of an organism's proteins, the composition of those species' proteins is now known, too. That means slightly different versions, from related species, of what is essentially the same protein can be compared. The latest version of Rosetta does so, looking for co-variation (eg, in this case, two places along the length of the proteins' chains where a shortening of an amino acid's side chain in one is always accompanied by a lengthening of it in the other). In this way, it can identify parts of the folded structure that are close together.
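The idea of co-variation can be illustrated with a toy calculation. The sketch below is not Rosetta's method, which the article does not describe in detail; it simply takes invented side-chain sizes for the same protein in five hypothetical species and flags pairs of chain positions where one shrinks as the other grows, so that their sum stays roughly constant. The data, the function names and the threshold of -0.9 are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

# Invented side-chain sizes (arbitrary units). Each row is one hypothetical
# species' version of the protein; each column is a position in the chain.
ALIGNMENT = [
    [1, 5, 3, 4],
    [2, 5, 3, 3],
    [4, 5, 2, 1],
    [3, 5, 4, 2],
    [5, 5, 3, 0],
]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either column never varies."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def compensating_pairs(alignment, threshold=-0.9):
    """Return (i, j, r) for position pairs whose sizes move in opposition."""
    columns = list(zip(*alignment))  # transpose rows into per-position columns
    hits = []
    for i, j in combinations(range(len(columns)), 2):
        r = pearson(columns[i], columns[j])
        if r <= threshold:
            hits.append((i, j, r))
    return hits


if __name__ == "__main__":
    for i, j, r in compensating_pairs(ALIGNMENT):
        print(f"positions {i} and {j} co-vary (r = {r:.2f}): likely in contact")
```

In this made-up alignment only the first and last positions are flagged, because their sizes always add up to the same total. A position that never varies across species carries no signal at all, which is one reason the approach needs many related sequences.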
Though it is still early days, the method seems to work. None of the 614 structures Dr Baker modelled most recently has yet been elucidated by crystallography or NMR, but six of the previous 58 have. In each case the prediction closely matched reality. Moreover, when used to hindcast the shapes of 81 proteins with known structures, the protein-contact-prediction version of Rosetta got them all right.
There is a limitation, though. Of the genomes well-enough known to use for this trick, 88,000 belong to bacteria, the most speciose type of life on Earth. Only 4,000 belong to eukaryotes (the branch of life, made of complex cells, which includes plants, fungi and animals). There are, then, not yet enough relatives of human beings in the mix to look for the co-variation Dr Baker's method relies on.
Others think they have an answer to that problem. They are trying to extend protein-contact prediction to look for relationships between more than two amino acids in a chain. This would reduce the number of related proteins needed to draw structural inferences and might thus bring human proteins within range of the technique. But to do so, you need a different computational approach. Those attempting it are testing out the branch of artificial intelligence known as deep learning.
Linking the links
Deep learning employs pieces of software called artificial neural networks to fossick out otherwise-abstruse patterns. It is the basis of image- and speech-recognition programs, and also of the game-playing programs that have recently beaten human champions at Go and poker.
Jianlin Cheng, of the University of Missouri, in Columbia, who was one of the first to apply deep learning in this way, says such programs should be able to spot correlations between three, four or more amino acids, and thus need fewer related proteins to predict structures. Jinbo Xu, of the Toyota Technological Institute in Chicago, claims to have achieved this already. He and his colleagues published their method in PLOS Computational Biology, in January, and it is now being tested.
If the deep-learning approach to protein folding lives up to its promise, the number of known protein structures should multiply rapidly. More importantly, so should the number that belong to human proteins. That will be of immediate value to drugmakers. It will also help biologists understand better the fundamental workings of cells, and thus what, at a molecular level, it truly means to be alive.
Read the original:
How to determine a protein's shape - The Economist