1. Random sequences and unique protein structures
Alexei Finkelstein
Physical theory shows that a substantial fraction (~10^-4 to 10^-8 %)
of random amino acid sequences can form unique, stable 3-D structures
under physiological conditions, fold rapidly, and have architectures
typical of protein molecules. The main peculiarity of a "protein-like"
chain is a considerable gap separating the energy of its lowest-energy
fold from the energies of its other folds; this gap plays an essential
role in the thermodynamics and kinetics of folding.
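
The notion of an energy gap can be illustrated with a toy calculation. The sketch below is not the theory referred to above: it assumes, purely for illustration, that the energies of the alternative folds of each random sequence are independent draws from a standard Gaussian (a random-energy-type assumption), and it counts how often the lowest energy is separated from the next one by at least a chosen threshold. The function name, the number of folds, and the thresholds are all hypothetical, so the printed fractions should not be compared with the percentages quoted in the abstract.

```python
import random


def fraction_with_gap(gap_threshold, n_sequences=500, n_folds=200, seed=0):
    """Fraction of toy 'sequences' whose lowest fold energy lies at least
    gap_threshold below the second-lowest one.

    Assumption (not from the abstract): fold energies are i.i.d. standard
    Gaussian values, measured in units of their standard deviation.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    hits = 0
    for _ in range(n_sequences):
        # One toy sequence: a sorted spectrum of fold energies.
        energies = sorted(rng.gauss(0.0, 1.0) for _ in range(n_folds))
        gap = energies[1] - energies[0]
        if gap >= gap_threshold:
            hits += 1
    return hits / n_sequences


if __name__ == "__main__":
    for threshold in (0.5, 1.0, 1.5):
        frac = fraction_with_gap(threshold)
        print(f"gap >= {threshold:.1f} sigma: fraction = {frac:.4f}")
```

In this toy model, large gaps become rarer as the threshold grows. This is only meant to show how sequences with a pronounced gap can be a small subset of all random sequences.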
2. How can a protein chain find its unique fold?
The problem of how a protein chain can find its most stable structure
without exhaustively searching all of its possible conformations is
known as the "Levinthal paradox". In this lecture I shall show that the
lowest-energy fold is attained rapidly when folding occurs in the
vicinity of a thermodynamic "all-or-none" transition from the coil to
the lowest-energy fold. Such a transition requires an "edited" chain
with enhanced stability of its lowest-energy fold. In the vicinity of
the mid-transition point, the misfolded and semi-folded states cannot
"trap" the folding since, even taken together, all these states are
less stable than both the initial coil and the final stable fold of the
chain.
Therefore, a stable fold can be reached rather rapidly here via a
"nucleation and growth" folding pathway that provides continuous
entropy-by-energy compensation in the course of folding and thus a low
transition-state free energy. At mid-transition, an N-residue chain
normally folds in ~exp(N^(2/3)) nsec. Therefore, a 100-residue chain
normally finds its most stable fold within minutes rather than in
10^100 psec ~ 10^80 years, as in the famous paradoxical estimate of
Levinthal.
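
The two time estimates above can be checked with a few lines of arithmetic. The sketch below evaluates exp(N^(2/3)) nsec for a chain of N residues and converts the Levinthal figure of 10^100 psec into years. The function names and the `coeff` parameter are assumptions introduced here for illustration: the "~" in the estimate leaves room for a factor of order one in the exponent, and `coeff` simply shows how sensitive the result is to it.

```python
import math

SECONDS_PER_YEAR = 3.156e7  # approximate length of a year in seconds


def nucleation_fold_time_seconds(n_residues, coeff=1.0):
    """Mid-transition folding time ~exp(coeff * N^(2/3)) nsec, in seconds.

    coeff is a hypothetical order-one factor, not a value from the text;
    coeff=1.0 reproduces the expression exactly as written there.
    """
    exponent = coeff * n_residues ** (2.0 / 3.0)
    return math.exp(exponent) * 1e-9  # nsec -> sec


def levinthal_log10_years(log10_psec=100.0):
    """log10 of the exhaustive-search time in years, given log10 of psec."""
    log10_seconds = log10_psec - 12.0  # 1 psec = 1e-12 sec
    return log10_seconds - math.log10(SECONDS_PER_YEAR)


if __name__ == "__main__":
    n = 100
    for coeff in (1.0, 1.2):
        t = nucleation_fold_time_seconds(n, coeff)
        print(f"N = {n}, coeff = {coeff}: about {t:.3g} sec")
    print(f"Levinthal: about 10^{levinthal_log10_years():.1f} years")
```

With the exponent taken literally, a 100-residue chain comes out on the scale of seconds, and a modestly larger exponent moves it to minutes. Either way the result is many tens of orders of magnitude below the exhaustive-search estimate.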
3. Introduction to protein structure prediction
This is a review of the state of the art in the recognition and
prediction of protein folds from their sequences. I pay special
attention to the physical background of the predictive methods. In
particular, I review secondary structure prediction and the "threading"
methods used for the recognition of protein folds. It is shown that all
the predictive methods can use only part of the interactions operating
in the chain, and that even the energies of these interactions are not
known precisely. This is the principal source of errors and
uncertainties. The errors can be reduced by using many distant
homologs, but this only makes it possible to predict the secondary
structure and a generalised folding pattern rather than the particular
fold of a given chain in full detail.