New Classification Scheme of the Genetic Code

A new classification scheme of the genetic code is based on the purine-pyrimidine classification of the nucleotides (purines: A and G; pyrimidines: C and U) and on the number of hydrogen bonds (H-bonds) established by the first two bases of a codon (C-G pairs via 3 H-bonds, A-U pairs via 2 H-bonds).

The new scheme of the genetic code consists of 8 rows numbered from 000 to 111, corresponding to the 2³ = 8 binary (purine/pyrimidine) patterns of a codon. Each row in turn covers 8 codons; for instance, row 000 (three pyrimidines) represents the 8 codons CCC, CCU, ..., UUU. Because of the third-position degeneracy, the number of columns can be reduced to four. Each field stands for two codons, with the alternative third bases given in parentheses. For instance, CC(C/U) means that the two codons CCC and CCU both encode proline.
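The binary row labels described above can be reproduced with a short sketch. It assumes the encoding given in the caption of Fig. 1 (purine = 1, pyrimidine = 0); the names `binary_row` and `rows` are hypothetical and chosen only for illustration.

```python
from itertools import product

PURINES = {"A", "G"}  # pyrimidines are C and U

def binary_row(codon):
    """Return the row label of a codon: 1 for a purine, 0 for a pyrimidine."""
    return "".join("1" if base in PURINES else "0" for base in codon)

# Group all 4^3 = 64 codons by their binary row label.
rows = {}
for codon in map("".join, product("UCAG", repeat=3)):
    rows.setdefault(binary_row(codon), []).append(codon)

print(len(rows))            # 8 rows, labelled 000 ... 111
print(sorted(rows["000"]))  # the 8 all-pyrimidine codons, CCC ... UUU
```

Each of the 8 labels collects exactly 8 codons, which is why every row of the scheme has to accommodate 8 possibilities before the third-position degeneracy is used to merge them into four fields.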

The four combinations of the first column (CC*, GC*, CG*, and GG*) always imply 6 hydrogen bonds in complementary base-pairing with the corresponding anticodon of the tRNA. They are called strong codons (3+3=6 H-bonds); their third base does not matter for determining the corresponding amino acid (they are always family codons). In the next two columns the first two bases yield 5 H-bonds: these are the mixed codons (3+2=2+3=5 H-bonds). The upper half of these two columns contains family codons; the lower half does not. The weak codons have 4 H-bonds (2+2=4). Their third base is always needed for translation.
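The strong/mixed/weak classification and its relation to family codons can be checked against the standard genetic code. The following is a minimal sketch, not part of the original scheme: the compact table `AMINO` (one-letter amino acid codes, `*` for stop, codons ordered U, C, A, G in each position) and the helper names `strength` and `is_family` are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# H-bonds each base forms with its complementary base in the anticodon.
H_BONDS = {"C": 3, "G": 3, "A": 2, "U": 2}
CLASS_BY_BONDS = {6: "strong", 5: "mixed", 4: "weak"}

def strength(doublet):
    """Classify the first two codon bases by their total H-bond count."""
    return CLASS_BY_BONDS[H_BONDS[doublet[0]] + H_BONDS[doublet[1]]]

def is_family(doublet):
    """True if all four third bases yield the same amino acid."""
    return len({CODE[doublet + third] for third in BASES}) == 1

# Count (class, family?) combinations over all 16 doublets.
tally = Counter(
    (strength(d), is_family(d)) for d in map("".join, product(BASES, repeat=2))
)
for key in sorted(tally):
    print(key, tally[key])
```

Under these assumptions the tally shows four strong doublets (all family), eight mixed doublets (four family, four non-family), and four weak doublets (none family), in line with the description above.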

 

Fig. 1. The new classification scheme of the genetic code. In the codons, purines are encoded as 1 and pyrimidines as 0. The first two bases of strong codons form 6 H-bonds in complementary base-pairing (mixed: 5 H-bonds, weak: 4 H-bonds). The point in the centre marks Halitsky's family/non-family symmetry operation. The red horizontal line indicates the codon-anticodon symmetry axis, and the red vertical line the purine-pyrimidine symmetry of the codons.

If the third position is needed to determine the correct amino acid, it is sufficient to determine whether it is a purine or a pyrimidine. The only exceptions are the two fields Trp/Stop and Met/Ile, where the translation machinery has to identify the third purine base exactly. (Interestingly, they correspond to the beginning and end of translation.) Therefore, the new scheme contains only 32 fields instead of 64, although the information content of both tables is the same.
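The reduction to 32 fields and the two exceptional fields can be verified directly. The sketch below assumes the standard genetic code in the same compact form (`AMINO`, with `*` for stop); the names `fields` and `exceptions` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THIRD_CLASSES = {"pyrimidine": "CU", "purine": "AG"}

# One field per (first two bases, class of third base): 16 * 2 = 32 fields.
fields = {}
for doublet in map("".join, product(BASES, repeat=2)):
    for cls, thirds in THIRD_CLASSES.items():
        fields[(doublet, cls)] = {CODE[doublet + t] for t in thirds}

# A field is exceptional if its two codons have different meanings.
exceptions = {key: aas for key, aas in fields.items() if len(aas) > 1}

print(len(fields))  # 32
for (doublet, cls), aas in exceptions.items():
    print(doublet, cls, sorted(aas))
```

Only the purine fields of UG (Stop/Trp) and AU (Ile/Met) contain two different meanings; in every other field, knowing the class of the third base is enough.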

Fig. 2. The common classification scheme of the genetic code. The four rows stand for the first base in the codon, the four columns represent the second base, and the right side indicates the third base in the triplet. The yellow regions indicate “family codons”, where the encoded amino acid is independent of the third position. Since there are 20 different amino acids and 4³ = 64 possible codons, the genetic code is redundant.
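The redundancy mentioned in the caption can be quantified by counting how many codons encode each amino acid. This is an illustrative sketch assuming the standard genetic code in compact form (`AMINO`, with `*` for stop); `degeneracy` and `stop_codons` are hypothetical names.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per meaning, then separate the stop signal.
degeneracy = Counter(CODE.values())
stop_codons = degeneracy.pop("*")

print(len(CODE))                 # 64 codons in total
print(len(degeneracy))           # 20 amino acids
print(stop_codons)               # 3 stop codons
print(degeneracy.most_common())  # codon counts per amino acid
```

Leucine, serine, and arginine are each encoded by six codons, whereas methionine and tryptophan have a single codon each, so the 61 sense codons are distributed unevenly over the 20 amino acids.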