How a Boy From an Indian Village Broke the Code of Life and Won the Nobel Prize

in #biology, 7 years ago (edited)

463px-GeneticCode21-version-2-.jpg

The code of life.
Source: Wikimedia Commons contributors

Intro: From 4 to 20

After discovering that DNA is the seat of heredity, and that DNA is made up of 4 "letters": G, A, T, and C (the Great GATC-by as some — namely I — have called it), biologists discovered that they were apparently 16 letters short. "Blimey, someone robbed us!" they cried.

You see, life is made of proteins, and proteins are made out of 20 (or 21) amino acids. The Central Dogma states that "DNA encodes RNA encodes proteins". How is DNA supposed to do this, if it only has 4 letters? How do you get from 4 DNA letters to 20 amino acids?

These days we might be less than impressed with this puzzle, seeing how a computer screen can create what you're reading right now — and so much more — out of just two digits: 0 and 1. From that perspective, having 4 letters looks overly generous!

DNA is getting rather pissed right now! It's saying, "I'll tell you what's overly generous! Your size!"

Eniac-.jpg

"Yo computta so big, it fills up a whole room."
(This is a 'yo computta fact'. DNA doesn't do jokes.)
Source: Wikimedia Commons contributors

DNA, as the ancient pre-PC creature it is, is not afraid to fat-shame computers (which also happen to be called PCs). I mean, look at that thing! That's what biologists were faced with at the time, so pardon them for thinking going from 4 to 20 is worth a "hmm, that's funny". Even today, the smallest computer in existence is bigger than a DNA strand. It's visible to the naked eye for X's sake (X as in Xmas). And to DNA, being visible = being fat.

Computers want to say something about how much time DNA had at its disposal to evolve to such small sizes, DNA gives a comeback to the effect of "I was born small", and on it goes. We'll leave them to their squabbles, and just note that at least computers prove that going from few to many is possible in principle.

But thinking that getting 20 out of 4 is possible is quite different from finding out exactly how it is possible, and it took a small boy from India to find out how that happens.

RNA threesome

PDB_1l2y_EBI-.jpg

"My birth certificate says my name is AACCUGUACAUCCAGUGGCUGAAGGACGGCGGCCCCAGCAGCGGCAGGCCCCCCCCCAGC, but I've shortened it to NLYIQWLKDGGPSSGRPPPS to make it easier for people."
Source: Wikimedia Commons contributors

Take this RNA sequence of the smallest known protein called Trp-cage: AACCUGUACAUCCAGUGGCUGAAGGACGGCGGCCCCAGCAGCGGCAGGCCCCCCCCCAGC

(Note that in the process of getting transcribed from DNA to RNA, the Ts were replaced with Us. Don't worry about that, it's trivial. Except if I write a post about it, in which case it's definitely not trivial and you should definitely read and upvote!)

The protein is made up of these amino acids: NLYIQWLKDGGPSSGRPPPS

How did we get from that RNA sequence to these amino acids? Well, if you count them, you'll see we get 20 amino acids from 60 letters, so the answer must come in the form of 3s.
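There's also a counting argument for why it has to be at least 3, and it fits in a few lines of Python. This is just a sketch: the function name is made up for illustration, and the only numbers it uses are the 4 letters and 20 amino acids from above.

```python
def min_word_length(alphabet_size: int, meanings_needed: int) -> int:
    """Smallest word length n such that alphabet_size**n >= meanings_needed."""
    n = 1
    while alphabet_size ** n < meanings_needed:
        n += 1
    return n


# How many different "words" can 4 letters spell at each word length?
for n in (1, 2, 3):
    print(n, 4 ** n)  # prints: 1 4, then 2 16, then 3 64

print(min_word_length(4, 20))  # 3: pairs give only 16 words, triplets give 64
print(min_word_length(2, 20))  # 5: a two-digit alphabet would need longer words
```

One letter gives 4 words, two give 16 (still short of 20), and three give 64, which is more than enough. The leftover capacity is why several triplets can end up meaning the same amino acid.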

If you're even more OCD, you'll notice that the same amino acids correspond to the same 3-letter code. For instance CCC always codes for the amino acid P (proline). If you go up to the title picture of this post, and maybe zoom in, and start from the center-most part of that wheel of fortune, and read CCC, you'll be led, slowly but surely, to proline.
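If you'd rather let a computer read the wheel for you, here is a minimal Python sketch of the lookup. The names (CODON_TABLE, translate, TRP_CAGE_RNA) are my own for illustration; the 64-letter string is the standard genetic code written out in U, C, A, G order, with * marking a stop codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UUU, UUC, UUA, UUG, UCU, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str) -> str:
    """Read the RNA three letters at a time and look up each codon."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))


TRP_CAGE_RNA = "AACCUGUACAUCCAGUGGCUGAAGGACGGCGGCCCCAGCAGCGGCAGGCCCCCCCCCAGC"

print(CODON_TABLE["CCC"])       # P
print(translate(TRP_CAGE_RNA))  # NLYIQWLKDGGPSSGRPPPS
```

The table is just a dictionary with 64 entries, which is all the title picture is, too.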

It's easy when you've got the chart, and the DNA sequencers. But how did people know back then, in the 1960s, when such technology had yet to exist?

The triplet nature of the genetic code[1]

Tina_Cole_My_Three_Sons_triplets_1969-.jpg

What if I told you we all carry triplets, inside our genetic code?
Source: Wikimedia Commons contributors

In what is referred to as the Crick et al. experiment (because if you're Francis Crick, of double-helix fame, you tend to outshine co-authors), the triplet nature of the genetic code was discovered. The experiment, like many of its time and before, was magnificently simple.

Crick and the unnamed used (like many geneticists do) mutants. They took T4 bacteriophage (a virus) and used a substance on it called proflavine that had the effect of deleting a etter. You can see how that would render a word meaningless. It did the same to the T4 gene they were targeting.

Besides deletions, they used the same method to make inusertions. That's insertions, with an added letter 'u'. You can see why that's no good either.

On their own, insertions and deletions alike rendered the gene non-functional. The reason was a thing called frame shift. You see, if you insert a letter into CCC, you don't just stop making proline. The whole sequence has now moved to the right: the frame of reading has shifted. Thish ada ne ffects imilart om ovinga llt hes paceso nep lacet ot her ight. (aka "This had an effect similar to moving all the spaces one place to the right.")

Here's where things got interesting, though: when they made three deletions, or three insertions, the gene was functional again. That's because "the bases were shifted back into the correct reading frame".[4]

I'm sorry, I have to stop here for a moment. I'm getting teary-eyed. Please take a moment to realize how brilliantly simple that was.

One didn't do it. Two didn't do it. Three did it. It meant the genetic code is read in triplets.
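The one-two-three logic can be played out with a toy message made of three-letter words. This is only an analogy sketched in Python, not the actual T4 gene: the sentence, the function names and the deletion positions are all made up for illustration.

```python
def read_in_threes(letters: str) -> str:
    """Split a string into 3-letter words, dropping any incomplete word at the end."""
    return " ".join(letters[i:i + 3] for i in range(0, len(letters) - 2, 3))


def delete_at(letters: str, positions) -> str:
    """Remove the letters at the given 0-based positions."""
    drop = set(positions)
    return "".join(ch for i, ch in enumerate(letters) if i not in drop)


message = "THEFATCATATETHEBIGRAT"

print(read_in_threes(message))                         # THE FAT CAT ATE THE BIG RAT
print(read_in_threes(delete_at(message, [3])))         # THE ATC ATA TET HEB IGR
print(read_in_threes(delete_at(message, [3, 7])))      # THE ATC TAT ETH EBI GRA
print(read_in_threes(delete_at(message, [3, 7, 11])))  # THE ATC TAT THE BIG RAT
```

With one or two deletions everything downstream is gibberish. With three, a short stretch in the middle is still scrambled, but the rest of the message snaps back into frame, which is enough for a gene to work if the scrambled stretch isn't critical.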

Those triplets are now called codons.

(CCC for instance is the codon for proline.)

But what's the codon's code?!

457px-MNirenberg-.jpg

Nirenberg, back when killing yourself with cigs was all the rage.
Source: Wikimedia Commons contributors

Knowing it comes in threes and knowing the specific letters are two different things. In a process that you can read more about here, Nirenberg (no relation to the trial) came up with a way to make RNA composed only of uracil (the letter U). So this RNA spelled UUU, and so it made ... let's check the chart ... phenylalanine! Then he managed to do the same with other letters. What did AAA give? Lysine! CCC? You got it: proline!

And it didn't stop there. He had no way of reading the sequence of the RNA he made, so with two letters in the mix, say A and U, he could be getting AUU, UUA, AUA, UAU ... he had no way of knowing. But by tweaking combinations and concentrations, and studying the results probabilistically, he could make educated guesses about what the codons for the other amino acids were.
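To get a feel for the probabilistic part, here is a rough Python sketch. It assumes the letters of the synthetic RNA are strung together independently and at random, and the 5-to-1 mix of U to A is a number I picked purely for illustration, not one taken from the actual experiments.

```python
from itertools import product


def triplet_frequencies(base_fractions: dict[str, float]) -> dict[str, float]:
    """Expected frequency of each triplet when every base is drawn independently."""
    freqs = {}
    for triplet in product(base_fractions, repeat=3):
        p = 1.0
        for base in triplet:
            p *= base_fractions[base]
        freqs["".join(triplet)] = p
    return freqs


# Hypothetical mix: 5 parts U to 1 part A.
mix = {"U": 5 / 6, "A": 1 / 6}
freqs = triplet_frequencies(mix)

# List the triplets from most to least common.
for triplet, p in sorted(freqs.items(), key=lambda item: -item[1]):
    print(f"{triplet}: {p:.3f}")
```

Under these assumptions UUU makes up about 58% of the triplets, and each triplet with two Us and one A about 12%. So an amino acid that shows up at roughly a fifth the rate of phenylalanine probably has a codon with two Us and one A, but the numbers alone can't tell UUA from UAU from AUU.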

But that's what they were, guesses.

That was until a boy from an Indian village entered the picture.

Har Gobind Khorana

Har_Gobind_Khorana-.jpg

Not a boy studying in a school that was literally just a tree anymore.
Source: Wikimedia Commons contributors

Is it too late to issue a clickbait warning? The boy, you see, wasn't a boy when he cracked the code of life. He was an adult. Also, he didn't crack it all by himself: he did it in combination with the work of Nirenberg and another person, Robert W. Holley, whose work I won't examine here. With these two he shared the 1968 Nobel Prize in Physiology or Medicine.

So let's redo the title:

How a man who was once a boy from an Indian village deciphered the DNA code along with two other scientists.

That's a mouthful. You see why I had to shorten it.

But once you read this guy's story, you can see why I made him the centerpiece.

He was born in a village in India. He didn't even know for sure what his birthdate was, because records were somewhat lacking or unreliable. In his autobiography he states that:

Although poor, my father was dedicated to educating his children and we were practically the only literate family in the village inhabited by about 100 people.[11]

Wikipedia states:

The first four years of his education were provided under a tree, a spot that was, in effect, the only school in the village.[11]

That's the closest I've ever heard to the myth of Newton and the apple being true, and for that Khorana certainly deserves the title of this post, even though he wasn't aged 4 when he discovered the code of life. Still, to go from such humble beginnings to such great accomplishments is noteworthy, and says a lot about educational opportunities: who knows how many brilliant Khoranas exist right now beyond the reaches of education, where (to take a selfish angle) they can't benefit us with their discoveries?

As for the research, it's not as beautifully simple or as easy to explain as Nirenberg's, so I will leave it untreated like Holley's, but I thought his autobiography, albeit presented superficially, was interesting to share.

A coda to the codon

362px-Newton's_tree,_Botanic_Gardens,_Cambridge.JPG

Newton's apple tree, one of many descendants of the original by vegetative propagation.
Source: Wikimedia Commons contributors

The DNA code was later shown to be universal to all life: all codons (with minor exceptions) code for the same amino acids in all living things. That spoke to many things, among them the fact of the single origin of all living organisms on this planet.[12] (Who knew: we're all relatives!)

If that's not good enough a coda to the story of the codon, I don't know what is.


REFERENCES

1. Wikipedia contributors, "Central dogma of molecular biology," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Central_dogma_of_molecular_biology&oldid=821939211 (accessed January 29, 2018).

2. Rex Sakamoto, "This is the world's smallest computer," CBS News, April 6, 2015. https://www.cbsnews.com/news/the-worlds-smallest-computer-university-of-michigan-micro-mote/

3. Yanofsky C. Establishing the triplet nature of the genetic code. Cell. 2007 Mar 9;128(5):815-8. https://www.ncbi.nlm.nih.gov/pubmed/17350564

4. Wikipedia contributors, "Crick, Brenner et al. experiment," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Crick,_Brenner_et_al._experiment&oldid=810634661 (accessed January 29, 2018).

5. Crick FHC, Barnett L, Brenner S, Watts-Tobin RJ. General Nature of the Genetic Code for Proteins. Nature. 1961 Dec 30;192:1227-1232. doi:10.1038/1921227a0. https://www.nature.com/articles/1921227a0 & https://profiles.nlm.nih.gov/ps/access/SCBCBJ.pdf

6. Wikipedia contributors, "Enterobacteria phage T4," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Enterobacteria_phage_T4&oldid=822464505 (accessed January 29, 2018).

7. Wikipedia contributors, "Proflavine," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Proflavine&oldid=818924160 (accessed January 29, 2018).

8. Wikipedia contributors, "Marshall Warren Nirenberg," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Marshall_Warren_Nirenberg&oldid=819575274 (accessed January 29, 2018).

9. Wikipedia contributors, "Robert W. Holley," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Robert_W._Holley&oldid=819463713 (accessed January 29, 2018).

10. "The Nobel Prize in Physiology or Medicine 1968". Nobelprize.org. Nobel Media AB 2014. Web. 29 Jan 2018. http://www.nobelprize.org/nobel_prizes/medicine/laureates/1968/

11. Wikipedia contributors, "Har Gobind Khorana," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Har_Gobind_Khorana&oldid=822971848 (accessed January 29, 2018).

12. Koonin EV, Novozhilov AS. Origin and evolution of the genetic code: the universal enigma. IUBMB Life. 2009;61(2):99-111. doi:10.1002/iub.146. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3293468/

13. Wikipedia contributors, "Frameshift mutation," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Frameshift_mutation&oldid=812212368 (accessed January 29, 2018).

14. Wikipedia contributors, "Genetic code," Wikipedia, The Free Encyclopedia, https://en.wikipedia.org/w/index.php?title=Genetic_code&oldid=822537001 (accessed January 29, 2018).


Earlier Introduction to Biology episodes:

11: The Most Beautiful Experiment In Biology

10: The Great GATC-by: The Most Famous Science Paper of the 20th Century

9: The Great Kitchen Blender Experiments: How DNA was proved to be the seat of heredity

8: Finding, Counting, and Ordering Genes Using Incredibly Sophisticated Biomolecular Megatechnology

7: Christmas Disease — Yes, it's real, 100% scientifically proven!

6: The Most Famous All-Nighter in the History of Genetics

5: Mendel's Lucky Number Seven — The law of genetics that almost wasn't

4: How Cells Use Logic To Do The Impossible

3: Armchair Science — The Discovery of Proteins' Secondary Structure

2: How Cell Membranes Form Spontaneously

1: Eduard Buchner: The Man Who Killed Vitalism


steemSTEM is the go-to place for science on Steemit. Check it out at @steemstem or browse the #steemSTEM tag or chat live at steemit.chat or discord


Finally, after all this time! It's my first time getting one of these! 😄

You should always post them, to let people who stop by know about the project.

It is quite fascinating that many scientific terms are borrowed directly from religious sources, such as the Central Dogma of genetics and Laws of Newton. Do the scientists consciously label their findings in terms of religion or faith because such labels provide symbols of legitimacy? Or maybe the scientists perceive themselves as doing God's work? Or are they truly cursed with monumental ego?

It's an interesting question! I'd say the reasons are probably all of those!

Some have monumental egos, no doubt about that.

Some simply honestly believe these are laws in the sense that they held and will hold forever (and since Newton did believe in God, he probably thought he was simply revealing the laws set down by his Creator).

And for some I think the words might simply sound nice! You have to call it something. In the case of the Central Dogma, since it was Crick who named it, I think it was mainly a case of making it memorable and being playful with words; he wasn't the type to call something by a dry name like others would, and I'm with him on that, we need to make science more interesting even when it comes to the names we give to things.

Another great entry in this series. Didn't know that Crick was involved in figuring out the triplet nature of the genetic code so thanks for that tidbit.

I totally understand why you don't want to get into how Khorana figured out what each triplet codes. It's complicated enough to fill an entire article by itself, but I'd be happy to read it if you posted. If I remember correctly, it was kind of like solving a multi-step riddle using incredibly difficult to synthesize oligos.

This was really fun - and leaves me with two observations:

  1. I often think about the immense waste of potential squandered by humanity's failure to evenly distribute wealth, and the trappings of wealth, across our species. Poverty, social and cultural ostracism, and the limited education that often accompanies such things, have kept countless amazing people from realizing their potential for the overall benefit of our species. It's tragic and absurd.

  2. It's amazing to me that, if I'm understanding correctly, the core problem of "DNA to RNA to Proteins" is essentially cryptographic in nature. I guess sort of obvious - re: referring to your DNA as your "genetic code" - but reading this really laid the cryptographic puzzle bare for me.

With a key readily available now, I can even imagine a future where CRISPR-like technology allows state intelligence agencies to send coded messages in the form of crafted DNA strands. In theory you could send a ton of information that way.

  1. Absolutely right.

  2. Lots of problems in biology were, in a sense, essentially math problems. Probably the same is true for much of science. ... I remember when I first learned about junk DNA, I immediately thought up a sci fi story (which I never wrote, it remained an idea, that doubtless many others by now must've dreamed up) about how humans were in fact a coded message (meant for some aliens perhaps), with a tiny bit of the message-DNA coding for an actual living organism whose real purpose is to preserve the message - hence species propagation to copy the message, instincts to avoid situations that will destroy the message, etc. The message can only be decoded by those who hold the key to the code. So, in essence, a fancy USB stick with legs! .... Well, it doesn't have to be realistic, it's just a story! And today we learn more and more that 'junk' DNA isn't just so much junk, it fulfills certain functions, and biologists don't even call it that most of the time, they call it noncoding DNA.

Appreciate you reading and commenting!

That's a fabulous idea! I don't know if it's been written yet, but if you're not gonna write anything about it, I would love to - it's just an awesome premise!

No problem! I'd love to see what you'd do with it! There's so much on my plate, and so many stories that I'd need to write before this one got its turn, that I'll probably postpone it indefinitely, so sure have a go!

Wonderful read, I'll keep my eye out for your posts!

You write really well, I enjoyed reading your article. It's often people who struggled in life that end up doing great things.

hey great job with this it was enjoyable and a great story to share. :) love stem inspo

Wow, I believe this little boy was born to lead and reign. I love smart people.

I think he was born to be a genius. Great and wonderful research.

smart boy from 🇮🇳️