In the genetic code, each type of amino acid corresponds. Code within the code: second genetic code revealed

Through transcription and translation, information in the cell is transferred from DNA to protein: DNA → mRNA → protein. The genetic information in DNA and mRNA is contained in the sequence of nucleotides in these molecules. How is information transferred from the “language” of nucleotides to the “language” of amino acids? This translation is carried out using the genetic code. A code, or cipher, is a system of symbols for translating one form of information into another. The genetic code is a system for recording information about the sequence of amino acids in proteins using the sequence of nucleotides in messenger RNA. How important the order of the same elements (four nucleotides in RNA) is for understanding and preserving the meaning of information can be seen in a simple example: by rearranging the letters in the word cod, we get a word with a different meaning, doc. What properties does the genetic code have?

1. The code is triplet. RNA consists of 4 nucleotides: A, G, C, U. If we tried to designate one amino acid with one nucleotide, then 16 out of 20 amino acids would remain unencoded. A two-letter code would encode 16 amino acids (four nucleotides give 16 different combinations of two nucleotides each). Nature has created a three-letter, or triplet, code. This means that each of the 20 amino acids is encoded by a sequence of three nucleotides, called a triplet or codon. From 4 nucleotides you can create 64 different combinations of 3 nucleotides each (4*4*4=64). This is more than enough to encode 20 amino acids, and it would seem that 44 codons are superfluous. However, this is not so.

2. The code is degenerate. This means that each amino acid is encoded by more than one codon (from two to six). The exceptions are the amino acids methionine and tryptophan, each of which is encoded by only one triplet. (This can be seen in the genetic code table.) The fact that methionine is encoded by the single triplet AUG has a special meaning that will become clear later.

3. The code is unambiguous. Each codon codes for only one amino acid. In all healthy people, in the gene carrying information about the beta chain of hemoglobin, the triplet GAA or GAG located in the sixth position encodes glutamic acid. In patients with sickle cell anemia, the second nucleotide in this triplet is replaced by U. As can be seen from the table, the triplets GUA or GUG, which are formed in this case, encode the amino acid valine. You already know what such a replacement leads to from the section on DNA.

4. There are “punctuation marks” between genes. In printed text there is a period at the end of each phrase. Several related phrases make up a paragraph. In the language of genetic information, such a paragraph is an operon and its complementary mRNA. Each gene in the operon encodes one polypeptide chain, a phrase. Since in some cases several different polypeptide chains are synthesized in sequence on the mRNA template, they must be separated from each other. For this purpose, there are three special triplets in the genetic code, UAA, UAG and UGA, each of which signals the termination of synthesis of one polypeptide chain. Thus, these triplets function as punctuation marks. They are found at the end of every gene.

5. There are no “punctuation marks” inside the gene. Since the genetic code is similar to a language, let us analyze this property using a phrase composed of three-letter words: the fat cat ate the big rat. The meaning is clear despite the absence of punctuation marks. If we remove one letter in the first word (one nucleotide in the gene) but still read in triplets of letters, the result is nonsense: hef atc ata tet heb igr at. The meaning is likewise destroyed when one or two nucleotides are lost from a gene. The protein read from such a damaged gene will have nothing in common with the protein encoded by the normal gene.

6. The code is universal. The genetic code is the same for all creatures living on Earth. In bacteria and fungi, wheat and cotton, fish and worms, frogs and humans, the same triplets encode the same amino acids.

Previously, we emphasized that nucleotides have a feature important for the formation of life on Earth: in the presence of one polynucleotide chain in a solution, a second (complementary) chain forms spontaneously through the complementary pairing of related nucleotides. The same number of nucleotides in both chains and their chemical affinity are an indispensable condition for this type of reaction. However, during protein synthesis, when information from mRNA is realized in the structure of a protein, the principle of complementarity cannot apply. This is because mRNA and the synthesized protein differ not only in the number of monomers but also, and this is especially important, in structure: there is no structural similarity between them (nucleotides on the one hand, amino acids on the other). Clearly, a new principle was needed for accurately translating information from a polynucleotide into the structure of a polypeptide. Such a principle arose in evolution, and its basis is the genetic code.

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a certain alternation of nucleotide sequences in DNA or RNA that form codons corresponding to amino acids in the protein.

The genetic code has several properties.

    Triplet nature.

    Degeneracy or redundancy.

    Unambiguity.

    Polarity.

    Non-overlapping.

    Compactness.

    Universality.

It should be noted that some authors also propose other properties of the code, related to the chemical features of the nucleotides that make up the code, to the frequency of occurrence of individual amino acids in the proteins of the body, and so on. However, these properties follow from those listed above, so we will consider them under the corresponding headings.

a. Triplet nature. The genetic code, like many complexly organized systems, has a smallest structural unit and a smallest functional unit. A triplet is the smallest structural unit of the genetic code. It consists of three nucleotides. A codon is the smallest functional unit of the genetic code. Typically, triplets of mRNA are called codons. In the genetic code, a codon performs several functions. Firstly, its main function is to encode a single amino acid. Secondly, a codon may not code for an amino acid, in which case it performs another function (see below). As can be seen from the definition, a triplet is a concept that characterizes the elementary structural unit of the genetic code (three nucleotides). A codon characterizes the elementary semantic unit of the genome: three nucleotides determine the attachment of one amino acid to the polypeptide chain.

The elementary structural unit was first deduced theoretically, and then its existence was confirmed experimentally. Indeed, 20 amino acids cannot be encoded with one or two nucleotides, because there are only 4 different nucleotides. Combinations of three nucleotides drawn from the four give 4^3 = 64 variants, which more than covers the number of amino acids found in living organisms (see Table 1).
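The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `count_words` is an assumption introduced here and does not come from the text.

```python
from itertools import product

BASES = "AGCU"  # the four RNA nucleotides named in the text
AMINO_ACIDS = 20  # number of amino acids that must be encoded


def count_words(length: int) -> int:
    """Count the distinct nucleotide 'words' of the given length."""
    return sum(1 for _ in product(BASES, repeat=length))


for length in (1, 2, 3):
    words = count_words(length)
    # Report whether words of this length are enough for 20 amino acids.
    print(length, words, words >= AMINO_ACIDS)
```

Words of length one and two give 4 and 16 combinations, fewer than 20, while length three gives 64, so three is the shortest word length that can cover all the amino acids.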

The 64 nucleotide combinations presented in Table 1 have two features. Firstly, of the 64 triplet variants, only 61 are codons that encode an amino acid; they are called sense codons. Three triplets do not encode amino acids but are stop signals indicating the end of translation. These three triplets, UAA, UAG and UGA, are also called “meaningless” (nonsense) codons.

Table 1. Messenger RNA codons and corresponding amino acids

As a result of a mutation that replaces one nucleotide in a triplet with another, a nonsense codon can arise from a sense codon. This type of mutation is called a nonsense mutation. If such a stop signal is formed inside the gene (in its coding part), protein synthesis will be interrupted at this point every time, and only the first part of the protein (before the stop signal) will be synthesized. A person with this pathology will lack the protein and will experience symptoms associated with this deficiency. For example, a mutation of this kind was identified in the gene encoding the hemoglobin beta chain. A shortened, inactive hemoglobin chain is synthesized and quickly destroyed. As a result, a hemoglobin molecule devoid of a beta chain is formed. Clearly, such a molecule cannot fully perform its function. A serious disease develops in the form of hemolytic anemia (beta-zero thalassemia, from the Greek word thalassa, “sea”, referring to the Mediterranean, where this disease was first discovered).

The mechanism of action of stop codons differs from the mechanism of action of sense codons. This follows from the fact that for all codons encoding amino acids, corresponding tRNAs have been found. No tRNAs were found for nonsense codons. Consequently, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (and sometimes GUG in bacteria) not only encodes an amino acid (methionine and valine, respectively) but also serves as the translation initiator.

b. Degeneracy or redundancy.

61 of the 64 triplets encode 20 amino acids. This three-fold excess of triplets over amino acids means that two coding options were possible. Firstly, only 20 of the 64 codons could be used to encode the 20 amino acids; secondly, each amino acid could be encoded by several codons. Research has shown that nature used the latter option.

The advantage of this option is obvious. If only 20 of the 64 possible triplets were involved in encoding amino acids, then 44 triplets (out of 64) would remain non-coding, i.e. meaningless (nonsense codons). Previously, we pointed out how dangerous it is for the life of a cell when a mutation transforms a coding triplet into a nonsense codon: this prematurely terminates protein synthesis, ultimately leading to the development of diseases. Currently, three codons in our genome are nonsense; now imagine what would happen if the number of nonsense codons increased about 15-fold. It is clear that in such a situation the probability of a normal codon turning into a nonsense codon would be immeasurably higher.

A code in which one amino acid is encoded by several triplets is called degenerate or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets: UUA, UUG, CUU, CUC, CUA, CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are each encoded by one codon. The property of recording the same information with different symbols is called degeneracy.
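The codon counts quoted above can be read off the standard code programmatically. The sketch below assumes the conventional compact layout of the standard table (bases in the order U, C, A, G, one-letter amino acid symbols, `*` for a stop codon); the names `CODON_TABLE` and `codons_for` are assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group the sense codons by the amino acid they encode.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        codons_for[aa].append(codon)

single_codon = sorted(aa for aa, codons in codons_for.items() if len(codons) == 1)
print("Leu:", sorted(codons_for["L"]))
print("Val:", len(codons_for["V"]), "Phe:", len(codons_for["F"]))
print("amino acids with a single codon:", single_codon)
```

Grouping by amino acid rather than by codon makes the asymmetry visible: the mapping from codon to amino acid is a plain dictionary lookup, while the reverse mapping returns a list.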

The number of codons designated for one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is represented in the genome, the higher the likelihood of its damage by mutagenic factors. Therefore, it is clear that a mutated codon has a greater chance of encoding the same amino acid if it is highly degenerate. From this perspective, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. The bulk of the information in a codon is carried by the first two nucleotides; the base in the third position of the codon is of little importance. This phenomenon is called “degeneracy of the third base.” This feature minimizes the effect of mutations. For example, the main function of red blood cells is to transport oxygen from the lungs to the tissues and carbon dioxide from the tissues to the lungs. This function is performed by the respiratory pigment hemoglobin, which fills the entire cytoplasm of the erythrocyte. Hemoglobin consists of a protein part, globin, which is encoded by the corresponding gene, and heme, which contains iron. Mutations in globin genes lead to the appearance of different hemoglobin variants. Most often, a mutation replaces one nucleotide with another and creates a new codon in the gene, which may encode a new amino acid in the hemoglobin polypeptide chain. Any nucleotide in a triplet, the first, second or third, can be replaced as a result of mutation. Several hundred mutations are known that affect the integrity of the globin genes. About 400 of them are associated with the replacement of single nucleotides in a gene and the corresponding amino acid replacement in a polypeptide. Of these, only 100 replacements lead to instability of hemoglobin and to diseases ranging from mild to very severe. The other 300 (approximately 64%) substitution mutations do not affect hemoglobin function and do not lead to pathology. One of the reasons for this is the above-mentioned “degeneracy of the third base”: replacing the third nucleotide in a triplet encoding serine, leucine, proline, arginine or some other amino acids produces a synonymous codon encoding the same amino acid. Such a mutation will not manifest itself phenotypically. In contrast, replacing the first or second nucleotide in a triplet almost always leads to the appearance of a new hemoglobin variant. But even in this case there may be no severe phenotypic disorder, because the amino acid in hemoglobin may be replaced by another with similar physicochemical properties, for example when a hydrophilic amino acid is replaced by another hydrophilic amino acid.

Hemoglobin consists of the iron porphyrin group, heme (to which oxygen and carbon dioxide molecules attach), and a protein, globin. Adult hemoglobin (HbA) contains two identical α-chains and two identical β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located in the short arm of chromosome 16, and the β-gene in the short arm of chromosome 11. Substitution of the first or second nucleotide in the gene encoding the β-chain of hemoglobin almost always leads to the appearance of a new amino acid in the protein, disruption of hemoglobin function and serious consequences for the patient. For example, replacing “C” in one of the CAU (histidine) triplets with “U” produces a new triplet, UAU, encoding another amino acid, tyrosine. Phenotypically this manifests itself as a severe disease. Such a substitution of histidine by tyrosine at position 63 of the β-chain destabilizes hemoglobin, and methemoglobinemia develops. Replacement of glutamic acid by valine at position 6 of the β-chain as a result of mutation is the cause of the most severe disease, sickle cell anemia. Let us not continue the sad list.

We only note that when one of the first two nucleotides is replaced, an amino acid with physicochemical properties similar to the previous one may appear. Thus, replacing the second nucleotide in one of the triplets encoding glutamic acid (GAA) in the β-chain with “U” produces a new triplet (GUA) encoding valine, while replacing the first nucleotide with “A” produces the triplet AAA, encoding the amino acid lysine. Glutamic acid and lysine are similar in physicochemical properties: both are hydrophilic. Valine is a hydrophobic amino acid. Therefore, replacing hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin, which ultimately leads to sickle cell anemia, while replacing hydrophilic glutamic acid with hydrophilic lysine changes the function of hemoglobin to a lesser extent: patients develop a mild form of anemia. When the third base is replaced, the new triplet may encode the same amino acid as the previous one. For example, if uracil in the CAU triplet is replaced by cytosine, giving the CAC triplet, virtually no phenotypic change will be detected in humans. This is understandable, because both triplets code for the same amino acid, histidine.
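The single-nucleotide substitutions discussed above can be traced through the standard code with a small lookup. This is a sketch under stated assumptions: the table uses the conventional compact layout (bases ordered U, C, A, G, one-letter symbols, `*` for stop), and the helper `substitute` is a hypothetical name introduced for illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitute(codon: str, position: int, base: str) -> str:
    """Return the codon with the nucleotide at a 1-based position replaced."""
    return codon[:position - 1] + base + codon[position:]


# (original codon, position replaced, new base), as in the hemoglobin examples.
examples = [("GAA", 2, "U"), ("GAA", 1, "A"), ("CAU", 1, "U"), ("CAU", 3, "C")]
for codon, position, base in examples:
    mutant = substitute(codon, position, base)
    print(codon, CODON_TABLE[codon], "->", mutant, CODON_TABLE[mutant])
```

In one-letter symbols E is glutamic acid, V valine, K lysine, H histidine and Y tyrosine, so the four lines correspond to Glu to Val, Glu to Lys, His to Tyr, and the silent His to His change.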

In conclusion, it is appropriate to emphasize that the degeneracy of the genetic code and the degeneracy of the third base from a general biological point of view are protective mechanisms that are inherent in evolution in the unique structure of DNA and RNA.
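One way to make the “degeneracy of the third base” concrete is to count, for each codon position, how many single-nucleotide substitutions in sense codons leave the encoded amino acid unchanged. The sketch below assumes the standard code in its conventional compact layout (bases ordered U, C, A, G, `*` for stop); the variable names are illustrative assumptions.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Synonymous substitutions per codon position (1, 2, 3), sense codons only.
synonymous_by_position = {1: 0, 2: 0, 3: 0}
for codon, aa in CODON_TABLE.items():
    if aa == "*":
        continue
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == aa:
                synonymous_by_position[pos + 1] += 1

print(synonymous_by_position)
```

Each position admits 61 × 3 = 183 substitutions. Under these assumptions nearly all silent changes fall in the third position, a handful occur in the first position (among leucine and arginine codons), and none in the second.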

c. Unambiguity.

Each triplet (except nonsense) encodes only one amino acid. Thus, in the direction codon - amino acid the genetic code is unambiguous, in the direction amino acid - codon it is ambiguous (degenerate).

Codon → amino acid: unambiguous

Amino acid → codon: degenerate

And in this case, the need for unambiguity in the genetic code is obvious. In another option, when translating the same codon, different amino acids would be inserted into the protein chain and, as a result, proteins with different primary structures and different functions would be formed. Cell metabolism would switch to the “one gene – several polypeptides” mode of operation. It is clear that in such a situation the regulatory function of genes would be completely lost.

d. Polarity

Information from DNA and mRNA is read in only one direction. Polarity is important for determining higher-order structures (secondary, tertiary, etc.). Earlier we discussed how lower-order structures determine higher-order structures. Tertiary and higher-order structures form as soon as the synthesized RNA chain leaves the DNA molecule or the polypeptide chain leaves the ribosome. While the free end of an RNA or polypeptide acquires a tertiary structure, the other end of the chain is still being synthesized on DNA (if RNA is being transcribed) or on a ribosome (if a polypeptide is being translated).

Therefore, the unidirectional process of reading information (during the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized substance, but also for the strict determination of secondary, tertiary and higher structures.

e. Non-overlapping.

The code may be overlapping or non-overlapping. In most organisms the code does not overlap. Overlapping code is found in some phages.

The essence of a non-overlapping code is that a nucleotide of one codon cannot simultaneously be a nucleotide of another codon. If the code were overlapping, then a sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine) (Fig. 33, A), as in the case of a non-overlapping code, but three (if one nucleotide is shared) (Fig. 33, B) or five (if two nucleotides are shared) (see Fig. 33, C). In the last two cases, a mutation of any one nucleotide would alter two or three amino acids in the sequence.

However, it has been established that a mutation of one nucleotide always disrupts the inclusion of one amino acid in a polypeptide. This is a significant argument that the code is non-overlapping.

Let us explain this using Figure 34. Bold lines show the triplets encoding amino acids in the case of non-overlapping and overlapping code. Experiments have clearly shown that the genetic code is non-overlapping. Without going into the details of the experiment, we note what happens if the third nucleotide in the sequence (see Fig. 34), U (marked with an asterisk), is replaced by some other nucleotide:

1. With a non-overlapping code, the protein controlled by this sequence would have a substitution of one (first) amino acid (marked with asterisks).

2. With an overlapping code, in option B a substitution would occur in two (first and second) amino acids (marked with asterisks). In option C, the replacement would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is disrupted, the disruption in the protein always affects only one amino acid, which is typical for a non-overlapping code.

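Sequence in all three variants: GCUGCUG (the third nucleotide, U, is the one replaced; affected amino acids are marked with asterisks)

A. Non-overlapping code: GCU GCU; Ala* - Ala

B. Overlapping code (one shared nucleotide): GCU UGC CUG; Ala* - Cys* - Leu

C. Overlapping code (two shared nucleotides): GCU CUG UGC GCU CUG; Ala* - Leu* - Cys* - Ala - Leu

Fig. 34. A diagram explaining the presence of a non-overlapping code in the genome (explanation in the text).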
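The three reading schemes for the sequence GCUGCUG can be reproduced by changing the step between successive codons. This is a minimal sketch; the function names `read_codons` and `codons_containing` are assumptions introduced for illustration.

```python
def read_codons(seq: str, step: int) -> list[str]:
    """Read triplets from seq, moving `step` nucleotides between codons.

    step=3 is a non-overlapping code; step=2 and step=1 are overlapping
    codes sharing one and two nucleotides between neighbouring codons.
    """
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]


def codons_containing(seq: str, step: int, index: int) -> int:
    """Count the codons that include the nucleotide at a 0-based index."""
    return sum(1 for i in range(0, len(seq) - 2, step) if i <= index < i + 3)


SEQ = "GCUGCUG"
for step in (3, 2, 1):
    # The third nucleotide (index 2) is the one mutated in the text.
    print(step, read_codons(SEQ, step), codons_containing(SEQ, step, 2))
```

The count in the last column is the number of amino acids a single substitution at the third nucleotide would change under each scheme, which is the quantity the experiments compared.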

The non-overlap of the genetic code is associated with another property - the reading of information begins from a certain point - the initiation signal. Such an initiation signal in mRNA is the codon encoding methionine AUG.

It should be noted that humans still have a small number of genes that deviate from the general rule and overlap.

f. Compactness.

There is no punctuation between codons. In other words, triplets are not separated from each other, for example, by one meaningless nucleotide. The absence of “punctuation marks” in the genetic code has been proven in experiments.
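Because nothing separates one codon from the next, losing a single nucleotide shifts every downstream triplet. The sketch below illustrates this on a short made-up mRNA; the sequence and the helper `translate` are hypothetical, and the table is the standard code in its conventional compact layout (bases ordered U, C, A, G, `*` for stop).

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(rna: str) -> str:
    """Translate complete triplets from the first nucleotide up to a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


normal = "AUGGCUGCUGAAUAA"          # hypothetical mRNA: AUG GCU GCU GAA UAA
damaged = normal[:3] + normal[4:]   # the fourth nucleotide is lost

print(translate(normal))   # Met-Ala-Ala-Glu, then the stop codon
print(translate(damaged))  # every codon after AUG is read differently
```

In the damaged sequence the original stop codon is no longer in frame, so in a real message reading would continue until a stop codon happened to appear in the shifted frame.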

g. Universality.

The code is the same for all organisms living on Earth. Direct evidence of the universality of the genetic code was obtained by comparing DNA sequences with corresponding protein sequences. It turned out that all bacterial and eukaryotic genomes use the same sets of code values. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. They concerned the terminator codon UGA, which in these mitochondria is read in the same way as the codon UGG, encoding the amino acid tryptophan. Other, rarer deviations from universality were also found.
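A variant code can be modelled as the standard table with a few entries overridden. The sketch below uses the vertebrate mitochondrial assignments listed in the comparison table later in this text (UGA as tryptophan, AUA as methionine, AGA and AGG as stop); the test sequence and the names `VERTEBRATE_MITO` and `translate` are hypothetical, and the standard table is written in its conventional compact layout.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Copy of the standard code with the four vertebrate mitochondrial changes.
VERTEBRATE_MITO = dict(STANDARD, UGA="W", AUA="M", AGA="*", AGG="*")


def translate(rna: str, table: dict[str, str]) -> str:
    """Translate complete triplets with the given table, stopping at a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = table[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


message = "AUGUGAAUAAGA"  # hypothetical: AUG UGA AUA AGA
print(translate(message, STANDARD))         # stops at UGA
print(translate(message, VERTEBRATE_MITO))  # reads through UGA, stops at AGA
```

Keeping the variant as a set of overrides makes it easy to see how small the deviations are: 60 of the 64 codons keep their standard meaning.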

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a certain alternation of nucleotide sequences in DNA or RNA that form codons corresponding to amino acids in protein. The genetic code has several properties.

Nitrogenous bases of DNA and RNA nucleotides
  1. Purines: adenine, guanine
  2. Pyrimidines: cytosine, thymine (uracil in RNA)

Codon - a triplet of nucleotides encoding a specific amino acid.

Table 1. Amino acids that are commonly found in proteins
    Name            Abbreviation
 1. Alanine         Ala
 2. Arginine        Arg
 3. Asparagine      Asn
 4. Aspartic acid   Asp
 5. Cysteine        Cys
 6. Glutamic acid   Glu
 7. Glutamine       Gln
 8. Glycine         Gly
 9. Histidine       His
10. Isoleucine      Ile
11. Leucine         Leu
12. Lysine          Lys
13. Methionine      Met
14. Phenylalanine   Phe
15. Proline         Pro
16. Serine          Ser
17. Threonine       Thr
18. Tryptophan      Trp
19. Tyrosine        Tyr
20. Valine          Val

The genetic code, also called the amino acid code, is a system for recording information about the sequence of amino acids in a protein using the sequence of nucleotide residues in DNA, each of which contains one of 4 nitrogenous bases: adenine (A), guanine (G), cytosine (C) and thymine (T). However, since the double-stranded DNA helix is not directly involved in the synthesis of the protein (the protein is synthesized on RNA transcribed from one of the DNA strands), the code is written in the language of RNA, which contains uracil (U) instead of thymine. For the same reason, it is customary to say that the code is a sequence of nucleotides, and not of nucleotide pairs.

The genetic code is represented by certain code words, called codons.

The first code word was deciphered by Nirenberg and Matthaei in 1961. They obtained an extract from E. coli containing ribosomes and other factors necessary for protein synthesis. The result was a cell-free system for protein synthesis, which could assemble proteins from amino acids if the necessary mRNA was added to the medium. By adding synthetic RNA consisting only of uracils to the medium, they found that a protein was formed consisting only of phenylalanine (polyphenylalanine). Thus, it was established that the nucleotide triplet UUU (a codon) corresponds to phenylalanine. Over the next 5-6 years, all codons of the genetic code were determined.
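The logic of the poly-U experiment can be mirrored in a few lines: a message consisting only of uracil contains only the codon UUU, so its translation contains only one amino acid. The sketch assumes the standard code in its conventional compact layout (bases ordered U, C, A, G, `*` for stop); the helper `translate` and the message length are illustrative choices. The other homopolymers are simply read off the same table and are not a description of the original experiments.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(rna: str) -> str:
    """Translate every complete triplet of the message, in order."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))


poly_u = "U" * 30  # synthetic RNA made only of uracil
print(translate(poly_u))  # ten residues of phenylalanine (F)

# What the standard table gives for the other homopolymers.
for base in "CAG":
    print(base * 3, CODON_TABLE[base * 3])
```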

The genetic code is a kind of dictionary that translates text written with four nucleotides into protein text written with 20 amino acids. The remaining amino acids found in protein are modifications of one of the 20 amino acids.

Properties of the genetic code

The genetic code has the following properties.

  1. Triplet nature - each amino acid corresponds to a triplet of nucleotides. It is easy to calculate that there are 4^3 = 64 codons. Of these, 61 are sense codons and 3 are nonsense (termination, or stop) codons.
  2. Continuity (no separating marks between triplets) - absence of intragenic punctuation marks;

    Within a gene, each nucleotide is part of a significant codon. In 1961 Francis Crick, Sydney Brenner and colleagues experimentally proved the triplet nature of the code and its continuity (compactness).

    The essence of the experiment: “+” mutation - insertion of one nucleotide. "-" mutation - loss of one nucleotide.

    A single mutation ("+" or "-") at the beginning of a gene, or a double mutation ("++" or "--"), spoils the entire gene.

    A triple mutation ("+++" or "---") at the beginning of a gene spoils only part of the gene.

    A quadruple “+” or “-” mutation again spoils the entire gene.

    The experiment was carried out on two adjacent phage genes and showed that

    1. the code is triplet and there is no punctuation inside the gene
    2. there are punctuation marks between genes
  3. Presence of intergenic punctuation marks - among the triplets there are initiating codons (which begin protein biosynthesis) and terminator codons (which indicate the end of protein biosynthesis);

    Conventionally, the AUG codon, the first after the leader sequence, also belongs to punctuation marks. It functions as a capital letter. In this position it encodes formylmethionine (in prokaryotes).

    At the end of each gene encoding a polypeptide there is at least one of 3 stop codons, or stop signals: UAA, UAG, UGA. They terminate translation.

  4. Colinearity - correspondence between the linear sequence of codons in mRNA and of amino acids in the protein.
  5. Specificity - each amino acid corresponds only to certain codons, which cannot be used for another amino acid.
  6. Unidirectionality - codons are read in one direction, from the first nucleotide to the subsequent ones.
  7. Degeneracy, or redundancy - one amino acid can be encoded by several triplets (there are 20 amino acids and 64 possible triplets, 61 of them sense codons, i.e. on average each amino acid corresponds to about 3 codons); the exceptions are methionine (Met) and tryptophan (Trp).

    The reason for the degeneracy of the code is that the main semantic load is carried by the first two nucleotides in the triplet, and the third is not so important. Hence the code degeneracy rule: if two codons have the same first two nucleotides and their third nucleotides belong to the same class (purine or pyrimidine), then they code for the same amino acid.

    However, there are two exceptions to this ideal rule. These are the AUA codon, which by the rule should correspond not to isoleucine but to methionine, and the UGA codon, which is a stop codon, whereas by the rule it should correspond to tryptophan. The degeneracy of the code obviously has an adaptive significance.

  8. Universality - all of the above properties of the genetic code are characteristic of all living organisms.
    Codon   Universal code   Mitochondrial codes
                             Vertebrates   Invertebrates   Yeast   Plants
    UGA     STOP             Trp           Trp             Trp     STOP
    AUA     Ile              Met           Met             Met     Ile
    CUA     Leu              Leu           Leu             Thr     Leu
    AGA     Arg              STOP          Ser             Arg     Arg
    AGG     Arg              STOP          Ser             Arg     Arg

    More recently, the principle of code universality was shaken by Barrell's discovery in 1979 of the ideal code of human mitochondria, in which the rule of code degeneracy is satisfied. In the mitochondrial code, the UGA codon corresponds to tryptophan, and AUA to methionine, as required by the code degeneracy rule.

    Perhaps at the beginning of evolution, all simple organisms had the same code as mitochondria, and then it underwent slight deviations.

  9. Non-overlapping - the triplets of the genetic text are independent of each other, and one nucleotide is part of only one triplet; the figure shows the difference between overlapping and non-overlapping code.

    In 1976 the DNA of phage φX174 was sequenced. It has single-stranded circular DNA consisting of 5375 nucleotides. The phage was known to encode 9 proteins. For 6 of them, genes located one behind the other were identified.

    It turned out that there is an overlap. Gene E is located entirely within gene D. Its start codon appears as a result of a frame shift of one nucleotide.

    Gene J begins where gene D ends. The start codon of gene J overlaps with the stop codon of gene D as a result of a two-nucleotide shift. This arrangement is called a “reading frameshift” by a number of nucleotides that is not a multiple of three. To date, overlap has only been shown for a few phages.

  10. Noise immunity - the ratio of the number of conservative substitutions to the number of radical substitutions.

    Nucleotide substitution mutations that do not lead to a change in the class of the encoded amino acid are called conservative. Nucleotide substitution mutations that lead to a change in the class of the encoded amino acid are called radical.

    Since the same amino acid can be encoded by different triplets, some substitutions in triplets do not lead to a change in the encoded amino acid (for example, UUU -> UUC leaves phenylalanine). Some substitutions change an amino acid to another from the same class (non-polar, polar, basic, acidic); other substitutions also change the class of the amino acid. In each triplet, 9 single substitutions can be made: there are three ways to choose which position to change (1st, 2nd or 3rd), and the selected letter (nucleotide) can be changed to 4-1=3 other letters (nucleotides). The total number of possible nucleotide substitutions is 61 × 9 = 549.


By direct calculation using the genetic code table, you can verify that of these, 23 nucleotide substitutions lead to the appearance of translation-terminating (stop) codons.

134 substitutions do not change the encoded amino acid.
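The direct calculation mentioned above can be carried out mechanically. The following sketch enumerates every single-nucleotide substitution in the 61 sense codons of the standard code and classifies the result; the table is written in its conventional compact layout (bases ordered U, C, A, G, `*` for stop), and the counter names are assumptions made for this illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

total = to_stop = synonymous = 0
for codon, aa in CODON_TABLE.items():
    if aa == "*":
        continue  # only substitutions in sense codons are counted
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            total += 1
            if CODON_TABLE[mutant] == "*":
                to_stop += 1      # sense codon becomes a terminator
            elif CODON_TABLE[mutant] == aa:
                synonymous += 1   # encoded amino acid is unchanged

print(total, to_stop, synonymous, total - to_stop - synonymous)
```

The last printed number is the remainder: substitutions that replace one amino acid with another. Splitting that remainder into conservative and radical changes would additionally require a table of amino acid classes, which is not attempted here.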

When scientists managed to study the properties of the genetic code, universality was recognized as one of the main ones. Yes, strange as it may sound, everything is united by one, universal, common genetic code. It was formed over a long period of time, and the process ended about 3.5 billion years ago. Consequently, in the structure of the code one can trace traces of its evolution, from the moment of its inception to today.

When we talk about the sequence of arrangement of elements in the genetic code, we mean that it is far from chaotic, but has a strictly defined order. And this also largely determines the properties of the genetic code. This is equivalent to the arrangement of letters and syllables in words. Once we break the usual order, most of what we read on the pages of books or newspapers will turn into ridiculous gobbledygook.

Basic properties of the genetic code

Usually a code contains information encrypted in a special way. In order to decipher the code, you need to know its distinctive features.

So, the main properties of the genetic code are:

  • triplet nature;
  • degeneracy or redundancy;
  • unambiguity;
  • continuity;
  • the universality already mentioned above.

Let's take a closer look at each property.

1. Triplety

Three consecutive nucleotides in a nucleic acid molecule (DNA or RNA) form a triplet. Each triplet encodes one amino acid and thereby determines its position in the peptide chain.

Codons (also called code words) differ from one another in which nucleotides they contain and in the order of those nucleotides.

Genetics distinguishes 64 codons. They are all the possible combinations of the four types of nucleotides taken three at a time, which is equivalent to raising 4 to the third power: 4*4*4 = 64.
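The arithmetic can be reproduced by listing the combinations directly. The sketch below is only an illustration; the variable names are arbitrary.

```python
from itertools import product

NUCLEOTIDES = "AGCU"  # the four RNA nucleotides

# Number of distinct "words" of length 1, 2 and 3 over a 4-letter alphabet.
words_by_length = {
    n: len(list(product(NUCLEOTIDES, repeat=n))) for n in (1, 2, 3)
}
print(words_by_length)  # {1: 4, 2: 16, 3: 64}

# The 64 triplets themselves, as strings.
codons = ["".join(t) for t in product(NUCLEOTIDES, repeat=3)]
print(codons[:4])  # ['AAA', 'AAG', 'AAC', 'AAU']
```

Only words of length three give enough combinations (64) to cover 20 amino acids; words of length one or two give 4 and 16.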

2. Redundancy of the genetic code

This property means that one amino acid can be encoded by several codons, usually from two to six. Only methionine and tryptophan are each encoded by a single triplet.
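To see the redundancy at a glance, the codons can be grouped by the amino acid they encode. This is a sketch that assumes the standard code table; amino acids are given in one-letter notation, and the names used are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in the order UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# amino acid -> list of its synonymous codons (terminators "*" are skipped)
synonyms = defaultdict(list)
for triplet, aa in zip(product(BASES, repeat=3), AMINO):
    if aa != "*":
        synonyms[aa].append("".join(triplet))

# number of codons -> amino acids encoded by that many codons
by_degeneracy = defaultdict(list)
for aa, codons in synonyms.items():
    by_degeneracy[len(codons)].append(aa)

for n in sorted(by_degeneracy):
    print(n, "".join(sorted(by_degeneracy[n])))
# 1 MW
# 2 CDEFHKNQY
# 3 I
# 4 AGPTV
# 6 LRS
```

Methionine (M) and tryptophan (W) are the only amino acids with a single codon, while leucine (L), arginine (R) and serine (S) have six each.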

3. Unambiguity

This property means that each triplet encodes only one amino acid. Its importance is clear from an example of an inherited disease. In healthy people, the GAA (or GAG) triplet in sixth place in the gene for the beta chain of hemoglobin encodes glutamic acid. In people with sickle cell anemia, the second nucleotide of this triplet is replaced by U, so the triplet (GUA or GUG) encodes valine instead, and the altered hemoglobin causes the disease.
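A lookup table makes the idea concrete: each triplet is a key with exactly one value. The sketch below assumes the standard code table; the helper substitute is hypothetical and exists only for this illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in the order UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Each of the 64 codons is a dictionary key, so it maps to one meaning only.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitute(codon, position, base):
    """Return the codon with the nucleotide at a 0-based position replaced."""
    return codon[:position] + base + codon[position + 1:]


normal = "GAA"
mutant = substitute(normal, 1, "U")  # second nucleotide replaced by U

print(normal, CODE[normal])  # GAA E  (glutamic acid)
print(mutant, CODE[mutant])  # GUA V  (valine)
```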

4. Continuity

This property means that codons, like links in a chain, follow one another directly in the nucleic acid chain, with no gaps or separating signs between them. Within a coding region, the sequence is read continuously, triplet after triplet, without interruption.

5. Universality

We should never forget that all life on Earth is united by a common genetic code. In primates and humans, in insects and birds, in a hundred-year-old baobab tree and in a blade of grass that has barely emerged from the ground, the same triplets encode the same amino acids.

It is the genes that contain the basic information about the properties of a particular organism: a kind of program that the organism inherits from its ancestors and that is written in the genetic code.

GENETIC CODE, a system for recording hereditary information in the form of a sequence of nucleotide bases in DNA molecules (in some viruses - RNA), which determines the primary structure (the sequence of amino acid residues) of protein (polypeptide) molecules. The problem of the genetic code was formulated after the genetic role of DNA had been proven (American microbiologists O. Avery, C. MacLeod, M. McCarty, 1944) and its structure deciphered (J. Watson, F. Crick, 1953), and after it had been established that genes determine the structure and functions of enzymes (the principle of “one gene - one enzyme” of G. Beadle and E. Tatum, 1941) and that the spatial structure and activity of a protein depend on its primary structure (F. Sanger, 1955). The question of how combinations of 4 nucleic acid bases determine the alternation of 20 common amino acid residues in polypeptides was first posed by G. Gamow in 1954.

Based on an experiment in which they studied the interactions of insertions and deletions of a nucleotide pair in one of the genes of the T4 bacteriophage, F. Crick and other scientists determined the general properties of the genetic code in 1961. The code is triplet, i.e. each amino acid residue in the polypeptide chain corresponds to a set of three bases (a triplet, or codon) in the DNA of the gene. Codons within a gene are read from a fixed point, in one direction and “without commas”, that is, the codons are not separated from each other by any signs. The code is degenerate, or redundant: the same amino acid residue can be encoded by several codons (synonymous codons). The authors assumed that the codons do not overlap (each base belongs to only one codon). Direct study of the coding capacity of triplets was continued using a cell-free protein synthesis system under the control of synthetic messenger RNA (mRNA). By 1965, the genetic code had been completely deciphered in the works of S. Ochoa, M. Nirenberg and H. G. Khorana. Unraveling the secrets of the genetic code was one of the outstanding achievements of biology in the 20th century.
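Why insertions and deletions are so informative can be shown on a toy sequence. The sketch below is not a model of the T4 experiment itself: the sequence is invented, and the helper name split_codons is hypothetical. It only shows that, when reading starts from a fixed point and has no commas, one inserted base shifts every downstream triplet, and a deletion further along restores the original frame.

```python
def split_codons(seq):
    """Cut seq into consecutive, non-overlapping triplets from a fixed start."""
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing triplet
    return [seq[i:i + 3] for i in range(0, usable, 3)]


original = "AUGGCUGAAUUUCCGUGG"
inserted = original[:4] + "C" + original[4:]  # one base inserted
restored = inserted[:10] + inserted[11:]      # one base deleted downstream

for name, seq in [("original", original),
                  ("inserted", inserted),
                  ("restored", restored)]:
    print(name, " ".join(split_codons(seq)))
# original AUG GCU GAA UUU CCG UGG
# inserted AUG GCC UGA AUU UCC GUG
# restored AUG GCC UGA AUU CCG UGG
```

After the insertion, every triplet from the second onward differs from the original. After the compensating deletion, only the triplets between the two changes remain altered, and the last two (CCG UGG) are read correctly again.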

The implementation of the genetic code in a cell occurs during two matrix processes - transcription and translation. The mediator between the gene and the protein is mRNA, which is formed during transcription on one of the DNA strands. In this case, the sequence of DNA bases, which carries information about the primary structure of the protein, is “rewritten” in the form of a sequence of mRNA bases. Then, during translation on ribosomes, the nucleotide sequence of the mRNA is read by transfer RNAs (tRNAs). The latter have an acceptor end, to which an amino acid residue is attached, and an adapter end, or anticodon triplet, which recognizes the corresponding mRNA codon. The interaction of a codon and an anticodon occurs on the basis of complementary base pairing: adenine (A) - uracil (U), guanine (G) - cytosine (C); in this way, the base sequence of the mRNA is translated into the amino acid sequence of the synthesized protein. Different organisms use synonymous codons for the same amino acid with different frequencies. Reading of the mRNA encoding the polypeptide chain begins (is initiated) at the AUG codon, which corresponds to the amino acid methionine. Less commonly, the initiation codons in prokaryotes are GUG (valine), UUG (leucine), AUU (isoleucine), and in eukaryotes - UUG (leucine), AUA (isoleucine), ACG (threonine), CUG (leucine). This sets the so-called frame, or phase, of reading during translation: from that point, the entire nucleotide sequence of the mRNA is read by tRNAs triplet by triplet until one of the three terminator codons, often called stop codons, is encountered on the mRNA: UAA, UAG, UGA (table). Reading of these triplets leads to the completion of the synthesis of the polypeptide chain.
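The reading process described above can be sketched in a few lines. This is a deliberately simplified illustration, not a model of the ribosome: it assumes the standard code table, starts at the first AUG only (the rarer initiation codons are ignored), and pairs bases strictly as A-U and G-C. The function names and the example mRNA are invented.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in the order UUU, UUC, UUA, UUG, UCU, ... GGG.
# "*" marks the terminator codons UAA, UAG and UGA.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PAIR = str.maketrans("AUGC", "UACG")  # A pairs with U, G pairs with C


def anticodon(codon):
    """Complementary triplet, reversed so that it also reads 5' to 3'."""
    return codon.translate(PAIR)[::-1]


def translate(mrna):
    """Read triplets from the first AUG until a terminator codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon, nothing is synthesized
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break  # terminator codon: synthesis is completed
        peptide.append(aa)
    return "".join(peptide)


print(translate("GGAUGUUUGAAUGGUAAGC"))  # MFEW
print(anticodon("AUG"))                  # CAU
```

In the example, reading starts at the AUG in position 3 and proceeds as AUG UUU GAA UGG UAA, giving methionine, phenylalanine, glutamic acid and tryptophan before the UAA terminator.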

AUG and stop codons appear at the beginning and end of the mRNA regions encoding polypeptides, respectively.

The genetic code is quasi-universal. This means that the meaning of some codons varies slightly between organisms. This applies primarily to terminator codons, which can act as sense codons: for example, in the mitochondria of some eukaryotes and in mycoplasmas, UGA encodes tryptophan. In addition, in some mRNAs of bacteria and eukaryotes, UGA encodes an unusual amino acid - selenocysteine, and in one of the archaebacteria UAG encodes pyrrolysine.
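One way to picture such variation is as a small set of reassignments applied to the standard table. The sketch below is an assumption-laden illustration: the helper variant and the name MITO_LIKE are hypothetical, only the UGA reassignment mentioned above is applied, and the example message is invented and read from its first base.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in the order UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def variant(code, **reassigned):
    """Copy a code table and change the meaning of the given codons."""
    return {**code, **reassigned}


# Hypothetical variant table: UGA reads as tryptophan (W), not as a terminator.
MITO_LIKE = variant(STANDARD, UGA="W")


def translate(mrna, code):
    """Read triplets from the first base until a terminator codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


message = "AUGUGAUUUUAA"
print(translate(message, STANDARD))   # M
print(translate(message, MITO_LIKE))  # MWF
```

The same message yields a different product depending on the table: with the standard code, synthesis stops at UGA, while with the variant table it continues to the UAA terminator.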

There is a point of view according to which the genetic code arose by chance (the “frozen accident” hypothesis). It is more likely that it evolved. This assumption is supported by the existence of a simpler and, apparently, more ancient version of the code, which is read in mitochondria according to the “two out of three” rule, when the amino acid is determined by only two of the three bases in the triplet.

Lit.: Crick F. H. C. a. o. General nature of the genetic code for proteins // Nature. 1961. Vol. 192; The genetic code. N.Y., 1966; Ycas M. Biological code. M., 1971; Inge-Vechtomov S. G. How the genetic code is read: rules and exceptions // Modern natural science. M., 2000. T. 8; Ratner V. A. Genetic code as a system // Soros educational journal. 2000. T. 6. No. 3.

S. G. Inge-Vechtomov.