Linus Pauling and the Structure of Proteins: A Documentary History

Memorandum from Linus Pauling to Robert Corey. September 8, 1954.
Pauling writes to express his opinion that they are nearly complete in their efforts to determine the structure of collagen and to propose a detailed plan for refining and finalizing their model.

Transcript

I think that the time has now come for us to finish up the job on collagen.

I have finally become convinced that there is no reasonable alternative to the (10,10,a) structure for collagen, and that our job now is to refine the structure somewhat (this involves moving the atoms only a few hundredths of an Angstrom), and to gather and present the evidence about it.

I have just carried out another effort to find an alternative structure to (10,10,a), without success.

I think that it is very unlikely that I have overlooked the correct structure in my search. Moreover, the fact that three structures for collagen have been published in the last few months, by Crick, by Huggins, and by Ramachandran and his coworkers, shows that other investigators have been searching, too, and it is of some significance that they have not proposed a structure that is acceptable. It is, of course, true that they have apparently not discovered the (10,10,a) structure, and that accordingly they might have overlooked the correct structure, if it is a different one.

I shall summarize the present situation in the following paragraphs.

1. It is likely that the amide groups are all in the trans configuration, although it is possible that the proline amide group has the cis configuration. (I shall use proline in this discussion to refer to both proline and hydroxyproline, when discussing the main chain.) There is evidence, in the papers of Mizushima and Simanouti, that the trans configuration is more stable than the cis configuration, by a significant amount. I think that the argument would, however, not apply to the proline and hydroxyproline groups, because for them the nitrogen atom forms two single bonds to carbon atoms, which would behave in a nearly equivalent manner. Professor Badger says that the spectroscopic data pretty well show that the amide groups have the trans configuration. It is, I think, likely that the spectroscopic argument involves the N-H vibration, and that his conclusion accordingly applies only to the non-proline groups.

Our original structure was of the type cis-cis-trans. It does not agree with the x-ray data. I have not been able to formulate a cis-cis-trans structure that is acceptable. The Huggins structure, which is of this type, is probably unsatisfactory because of a non-coplanarity of the bonds formed by the nitrogen atom. I have not made a further detailed study of it.

I have not been able to find any satisfactory structure of the type cis-trans-trans. I think that all structures of this type can be ruled out if the assumptions are made that two hydrogen bonds must be formed for every three residues, the repeating unit being a three-residue unit, and that the orientation of the hydrogen bonds must conform roughly to the limitation set by the infrared dichroism.

The following detailed discussion relates accordingly to trans-trans-trans structures, involving a repeating unit of three residues.

2. Elimination of structures involving three chains twisted about one another. On 5 March 1954 Professor Badger told me that in glycylglycine the N-H vibration, 3300 cm⁻¹, lies about 10° from the N-H axis, in the N···O direction, and that the amide II vibration (which is probably to be attributed largely to the wagging of N-H) is very nearly perpendicular to it, and in the plane of the group.

The direction assigned to the amide II vibration by Badger is within 2 degrees of the line stretching across the amide group between the two α-carbon atoms at its ends. Accordingly the observed dichroism of the amide II vibration, which occurs at about 1550 cm⁻¹, provides a very simple way of drawing conclusions about the folding of the polypeptide chain.

Let us assume that the amide II vibration lies within 10 degrees of the αC-αC axis, and draw conclusions from this.

If there are three chains, each amide group must stretch 2.86 A along the fiber axis. The distance between α-carbon atoms is 3.83 A. Hence the angle between the αC-αC axis and the basal plane is 48°. This corresponds to a positive dichroism of 2.54, which is far greater than the observed 1.22 (Badger). The observed dichroism corresponds to an angle of 38°, which is 10° less than calculated, and accordingly possible. However, there is no acceptable way of arranging three chains, with suitable hydrogen bonds.
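The angle and dichroism figures in this paragraph can be checked with a short sketch. It assumes the usual relation for a transition moment averaged cylindrically about the fiber axis, ratio = 2 tan²θ with θ measured from the basal plane; the memorandum does not state this formula, and the function names are illustrative. Under that assumption the calculated ratio comes out near 2.5 (the memorandum quotes 2.54, presumably a rounding difference), and the observed 1.22 maps back to about 38°.

```python
import math

def tilt_angle_deg(axial_rise, span):
    """Angle (degrees) between a segment of length `span` and the basal plane,
    given the distance `axial_rise` it advances along the fiber axis."""
    return math.degrees(math.asin(axial_rise / span))

def dichroic_ratio(angle_deg):
    """Parallel/perpendicular absorption ratio for a transition moment tilted
    `angle_deg` out of the basal plane (assumed relation: 2 * tan^2)."""
    t = math.tan(math.radians(angle_deg))
    return 2.0 * t * t

def angle_from_ratio(ratio):
    """Inverse of dichroic_ratio: tilt angle (degrees) for a given ratio."""
    return math.degrees(math.atan(math.sqrt(ratio / 2.0)))

# Figures from the memorandum: 2.86 A rise per amide group, 3.83 A between alpha carbons.
theta = tilt_angle_deg(2.86, 3.83)
calculated_ratio = dichroic_ratio(theta)
observed_angle = angle_from_ratio(1.22)

print(f"tilt angle: {theta:.1f} degrees")
print(f"calculated dichroic ratio: {calculated_ratio:.2f}")
print(f"angle implied by observed 1.22: {observed_angle:.1f} degrees")
```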

3. Structures involving a single chain.

I have made an exhaustive analysis of structures involving a single chain, with a repeating unit of three trans groups, forming two hydrogen bonds.

Several of these structures have an axial length per unit of 2.86 A. These include (10,10,a), (14,14,a), (16,16,a), and (10,16,a). The last three of these four are easily eliminated by the observed infrared dichroism.

First let us consider the amide II vibration. The observed dichroism is positive, with value 1.22. If we assume that this vibration is along the α-carbon axis of the amide group, then the positive dichroism eliminates all structures of the single-chain all-trans type in which the x coordinates of the α-carbon atoms show only increase in value along the chain. The chain must progress the distance 2.86 A along the z axis, in the unit of three residues. If each residue occupies one third of this distance, 0.95 A, the axis is at the angle 14.4° with the basal plane, and the dichroism is negative, the ratio being 1/7.6. The closest approximation to positive dichroism occurs when two of the residues have zero component along the figure axis, and the third the full component 2.86. The dichroism is still then negative, with ratio 1/2.2. This is so far from the observed 1.22/1 that we conclude that any single-chain all-trans structure must be a retrograde structure, with one residue having a negative component of its α-carbon axis along the fiber axis. (Or perhaps two residues having negative components.)
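The two limiting cases in this paragraph can be reproduced with a minimal sketch. It assumes each amide II moment lies along its 3.83 A α-carbon axis and that the three residues are averaged with the cylindrical relation (parallel absorption proportional to sin²θ, perpendicular to ½cos²θ); neither the formula nor the function name comes from the memorandum. Under these assumptions equal rises give about 1/7.6, consistent with the 14.4° angle quoted, and the most favorable non-retrograde case gives about 1/2.2. The last set of rises is hypothetical and only illustrates how a retrograde residue can make the ratio positive.

```python
ALPHA_CARBON_SPAN = 3.83  # A, distance between alpha carbons (from the memorandum)
UNIT_RISE = 2.86          # A, axial length of the three-residue unit (from the memorandum)

def mean_dichroic_ratio(rises, span=ALPHA_CARBON_SPAN):
    """Parallel/perpendicular ratio summed over residues whose alpha-carbon
    axes advance by `rises` (A) along the fiber axis."""
    parallel = 0.0
    perpendicular = 0.0
    for rise in rises:
        sin_sq = (rise / span) ** 2           # sin^2 of the tilt from the basal plane
        parallel += sin_sq
        perpendicular += 0.5 * (1.0 - sin_sq)  # half of cos^2, cylindrical average
    return parallel / perpendicular

equal_rises = mean_dichroic_ratio([UNIT_RISE / 3] * 3)
one_residue_rises = mean_dichroic_ratio([0.0, 0.0, UNIT_RISE])

# Hypothetical retrograde arrangement (values not from the memorandum):
# two residues rise 2.5 A each and the third goes back 2.14 A, net 2.86 A.
retrograde = mean_dichroic_ratio([2.5, 2.5, -2.14])

print(f"equal rises: 1/{1 / equal_rises:.1f}")
print(f"one residue carries the whole rise: 1/{1 / one_residue_rises:.1f}")
print(f"hypothetical retrograde case: {retrograde:.2f}")
```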

The structures (14,14,a), (16,16,a), and (10,16,a) are all non-retrograde in character. All have calculated dichroisms for amide II that are strongly negative, approximately 1/7 in each case. These structures may accordingly be ruled out.

All three structures may also be ruled out by consideration of the infrared dichroism for the N-H stretching vibration and for the amide I vibration (which is largely the C=O stretching vibration). These vibrations both show negative dichroism. The calculated dichroisms for the three structures just discussed are strongly positive, and the structures are hence eliminated.

4. The (10,10,a) structure and the observed infrared dichroisms.

The calculated dichroism for the amide II vibration is 1.23 for the (10,10,a) structure when all three amide groups are counted, and 1.16 when 70 percent of one amide group is occupied by proline and hydroxyproline, and considered not to contribute to this vibration; these values correspond to positive dichroism, and are acceptable.

The calculated dichroic ratios for the N-H and amide I vibrations, assuming them to be along the N-H axis or C=O axis, are approximately 1, whereas these vibrations are observed to show negative dichroism, 1/1.8 and 1/1.2, respectively. There is, however, enough uncertainty about the effective directions of the dipole moments for these vibrations in the complex structure of the coupled hydrogen bonds that the lack of complete agreement cannot be used to rule out the structure. The structure may be considered to be composed of three amide groups (alternating in the polypeptide chain) that are coupled together by two hydrogen bonds, and that may form a unit responsible for the infrared absorption. (The interaction is probably large for these vibrations, and small for the amide II vibration.) The configuration of this unit in space is such as to indicate that the effective dipole moments form a smaller angle with the basal plane than that formed by the N-H and C=O axes. This is presumably the explanation of the observed negative dichroisms. It may be desirable to carry out a somewhat detailed theoretical discussion of this question.

5. Dimensions of the structure.

The x-ray data indicate that there are three residues for collagen in the axial length 2.86 A, and that there probably is a repeating unit of three residues forming a helix with ten units in three turns or in seven turns. The (10,10,a) structure with planar amide groups and 110° bond angles gives 2.75 A per unit and 4.5 units per turn, rather than 3.3. If two of the residues in a unit are twisted about the C'-N axis, through about 6°, the structure can be deformed to 2.86 A per unit and 3.3 units per turn. This corresponds to a strain energy of about 0.3 kcal/mole for each of the two residues. The same result can be achieved by distributing the deformation over several features of the structure – changing several bond angles by about 1 degree apiece, bending the double C'-N bond a little, and rotating about it through a smaller angle, perhaps three degrees. A total strain distributed over n features rather than one decreases the strain energy to 1/n of its value. It is accordingly probable that the amount of deformation involved here is not more than a few tenths of a kcal/mole per unit. The deformation is presumably the result of steric repulsion between the δ-carbon atom of the proline ring and the adjacent (once removed) amide group, and perhaps also between the carbonyl oxygens and the adjacent amide groups.
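The 1/n statement follows if the strain energy is taken to be quadratic in the deformation. The sketch below makes that assumption explicit and, as a further simplification not found in the memorandum, gives every feature the same force constant, calibrated so that a 6° twist costs 0.3 kcal/mole. In a real structure bond-angle bending and rotation about the C'-N bond would have different constants, so the numbers are only indicative.

```python
# Assumed harmonic model: energy = k * deformation^2 for each feature.
# k is calibrated from the memorandum's figure of 0.3 kcal/mole for a 6 degree twist.
FORCE_CONSTANT = 0.3 / 6.0 ** 2  # kcal/mole per degree^2 (assumed equal for all features)

def total_strain_energy(total_deformation, n_features, k=FORCE_CONSTANT):
    """Energy when `total_deformation` (degrees) is shared equally among
    `n_features` features, each obeying the same quadratic law."""
    per_feature = total_deformation / n_features
    return n_features * k * per_feature ** 2

single_feature = total_strain_energy(6.0, 1)
for n in (1, 2, 3, 6):
    energy = total_strain_energy(6.0, n)
    print(f"n = {n}: {energy:.3f} kcal/mole ({energy / single_feature:.3f} of the single-feature value)")
```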

The orientation of the α-carbon atom adjacent to the proline nitrogen is satisfactory for formation of the proline ring.

6. Side chains.

The structure provides satisfactory positions for side chains. Of the two non-proline α-carbon atoms, one projects toward the axis of the compound helix, and one is on the outside. It cannot be concluded, however, that the former must represent only glycine residues, because there seems to be room enough for a larger side chain.

7. Radial distribution curve.

An experimental radial distribution curve published by Riley and Arndt has now been replaced by another one, sent to me by Dr. Riley. In the region from 6 A to 12 A the new curve shows acceptable agreement with the calculated radial distribution curve, made by Dr. Pasternak. There is a pronounced difference between the two curves, in that the experimental curve has a larger peak at 5 A, which is not shown on the theoretical curve.

It seems to me likely that the large peak at 5 A is to be attributed to the effect of atoms of side chains, not taken into consideration in the theoretical calculation. The (10,10,a) structure is in the form of a compound helix, which may be described as a rod with the 3₁₀ structure in which every third hydrogen bond is broken, permitting the rod to be twisted into a helix. The rod is about 6 A in diameter, including the van der Waals radii of the atoms, and it is twisted into a helix with average radius about 3.5 A and pitch about 9.5 A. There is accordingly a helical cavity about 4 A in diameter, which must be filled up with side-chain atoms. Consideration of the model shows that an atom in this helical cavity will have some neighbors at about 3 A distance, and a number of neighbors at about 5 A distance.

I think that it would be wise to have Dr. Pasternak carry out, without delay, a determination of the radial distribution function for collagen. Possibly this should be done by Dr. Marsh, using the spectrometer and a disoriented sample of collagen or of gelatin, or perhaps both.

It might be worth while to repeat this work with a hydrated specimen, to see whether or not the same radial distribution curve is obtained.

8. The distribution of intensities along the equator.

I think that it would be worth while to make a comparison of observed and calculated intensities along the equator of a fiber diagram.

As for the calculated intensities, it would, I think, be enough to use the Bessel-function formula, which is given in our paper on Atomic Coordinates and Structure Factors for Two Helical Configurations of Polypeptide Chains. I think that you have a record of the atomic coordinates for the structure – the atomic coordinates were used by Dr. Pasternak in his radial distribution calculation. They may have to be refined somewhat later on, but I do not think that any changes that will be made are significant for this calculation. Could you have Mrs. Oberhettinger carry it out?

It might be worth while also to have Dr. Marsh make a spectrometer study of the equatorial intensities for kangaroo-tail tendon, both in the dried state and hydrated.

9. I feel that it would be well worth while to make some new collagen photographs. In particular, I think that we should examine the large-angle region along the meridian, to see whether reflections at 1.43 A, 0.95 A, and 0.72 A can be picked up – perhaps even at 0.57 A. If the orders from 1 to 5 of the 2.86-A meridional reflection could be observed, and if their relative intensities were found to correspond to the proposed structure, we would have a pretty strong argument in favor of the structure. There will no doubt be a very heavy general background in this region, but I feel that it would be wise to make a vigorous effort to obtain experimental values of the intensities of these reflections. You remember how much more could be seen on the silk photographs than had been reported in the literature.
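The spacings named here are simply the 2.86-A spacing divided by the order of reflection. A minimal sketch, with an illustrative function name, lists them for orders 1 to 5; to within rounding they match the 1.43, 0.95, 0.72, and 0.57 A quoted above.

```python
FUNDAMENTAL_SPACING = 2.86  # A, the meridional reflection discussed in the memorandum

def order_spacings(fundamental, max_order):
    """Spacing d/n for each order n of a reflection with spacing `fundamental`."""
    return {n: fundamental / n for n in range(1, max_order + 1)}

spacings = order_spacings(FUNDAMENTAL_SPACING, 5)
for order, spacing in spacings.items():
    print(f"order {order}: {spacing:.3f} A")
```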

10. Ramachandran has stated that the 2.86-A reflection is not a meridional reflection. Perhaps a Weissenberg photograph could be made to check this point.

11. I think that it might be worth while to evaluate the complete form factor for the structure. I think that it should be done for only one azimuthal orientation of the helix. It is true that there are only ten orientations of different units, rather than eighteen in the α helix, but the result obtained by Dr. Yakel for the α helix, that the form factor varies only slightly with change in azimuthal orientation of the fiber relative to the x-ray beam, probably would apply pretty well to this structure also. The calculation would be made by use of the formula of Cochran, Crick, and Vand.

12. A simple calculation might be made to explain the observed meridional reflections on the third and seventh layer lines – perhaps also on the sixth layer line (I had thought that I could see such a reflection).

The chemical analyses of collagen indicate that in 30 residues there are about three hydroxyproline residues and four proline residues, a total of seven Pro + Hypro. There are, however, ten positions which might be occupied by these seven residues. Presumably three of the positions are occupied by some other amino acid.

I have assumed the sequence x, pro, hypro, x, pro, hypro, pro, x, pro, hypro, as a repeating sequence, in the identity distance 28.6 A.

I have assumed a scattering center with scattering power unity at each of the pro + hypro positions, and have calculated the following values of the structure factor for the first ten orders of reflection (00l): 0.4, 0.6, 2.6, 1.6, 1.0, 1.6, 2.6, 0.6, 0.4, 7.

We see that there should be a very strong tenth order, the observed 2.86 A meridional reflection. The two next strongest reflections are the third and seventh; these are observed. Next are the fourth and sixth; of these the fourth is not present, but I have thought that I could see the sixth. It is calculated to be about 40 percent as intense as the third or seventh.
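The structure factors listed above can be reproduced with a short sketch. It assumes, as the text implies, that the ten positions are equally spaced along the 28.6 A identity distance, so that position j sits at fractional coordinate j/10, and that each pro or hypro position scatters with unit power while the x positions contribute nothing. The function and variable names are illustrative.

```python
import cmath
import math

# Repeating sequence assumed in the memorandum; "x" marks a non-ring residue.
SEQUENCE = ["x", "pro", "hypro", "x", "pro", "hypro", "pro", "x", "pro", "hypro"]

def structure_factor(sequence, order):
    """|F(00l)| for unit scatterers at the occupied positions, assuming the
    positions are equally spaced along the repeat (fractional z = j / n)."""
    n = len(sequence)
    total = sum(
        cmath.exp(2j * math.pi * order * j / n)
        for j, residue in enumerate(sequence)
        if residue != "x"
    )
    return abs(total)

amplitudes = [structure_factor(SEQUENCE, order) for order in range(1, 11)]
rounded = [round(value, 1) for value in amplitudes]
print(rounded)

# Intensity goes as the square of the amplitude: sixth order relative to third.
sixth_to_third = (amplitudes[5] / amplitudes[2]) ** 2
print(f"I(006) / I(003) = {sixth_to_third:.2f}")
```

Rounded to one decimal, the amplitudes agree with the ten values given in the memorandum, and the sixth-to-third intensity ratio comes out near 0.38, in line with the "about 40 percent" quoted.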

Perhaps the calculation should be carried out with use of the z coordinates of the atoms γ-C and δ-C for the proline and hydroxyproline ring, and perhaps also for the oxygen atom of hydroxyproline. The β-carbon atom of the ring probably should not be included, because presumably there is a β-carbon atom in the residue taking the place of the ring. It might turn out that interference of the γ-C, δ-C, and O would cause the intensity of 004 to be decreased, and perhaps that of 006 to be increased.