SCIENTIFIC FRONTIER III:
Gene Regulation
The process of converting a gene in a chromosome to a functional
protein is carried out by the tremendously sophisticated molecular
machinery of the cell (see Figure 2). Since the levels of the
protein product are critical to proper cell function, the overall
throughput for each gene is regulated, and this regulation occurs
at all levels from accessibility of the gene in the chromosome
through release of the mature protein and its ultimate degradation.
The expression of specific genes at various times in the life
cycle of the cell leads to differentiation, and also provides
the possibility for a cell to respond to environmental stresses.
However, loss of regulation for even a single gene product can
also lead to a metabolic disorder. For example, most of the primary
causes of cancer are believed to involve the breakdown of normal
regulatory steps. For these reasons, understanding gene regulation
has become, over the past decade, one of the focal points of molecular
biology. Although tremendous progress has been made, much remains
to be learned before we can realize the full potential for developing
methods to regain control of aberrant processes, to treat diseases,
and to manipulate cells to perform new functions for biotechnology.
As the molecular players in the complex and interconnected regulatory
pathways become identified, it has been natural to ask how they
act at the molecular level. Here too, despite the wonderful progress
resulting from the synergistic efforts of molecular biology, X-ray
crystallography and NMR spectroscopy, much more remains to be
done. We have now learned how some of the first defined DNA-binding
motifs, such as the helix-turn-helix and zinc fingers, provide
an interface to DNA for sequence-specific recognition. At the
same time came the finding that this process is complex: the facts
that some protein side chains in the interface are disordered
and that water molecules act as bridges in recognition were not
anticipated and are still not well understood. Elucidation of
how proteins modulate transcriptional activity has led to additional
surprises: it has been found that DNA bending and other distortions
frequently occur in response to protein binding, and that large
assemblies of proteins form and interact with the primary machine,
the polymerase.
Some regulatory proteins enhance the rate of complex formation,
while others cause the DNA at the promoter to shift from the closed
state to the open state, from which transcription can actually begin. Many of
the proteins involved have multiple domains, some to interact
with DNA, others to regulate their own assembly, and yet others
to contact the additional proteins involved. It has been found
that these domains are often flexibly linked (a scientific contribution
largely credited to NMR, which is sensitive to local dynamics in
flexible regions), which makes elucidation of their structures
more difficult. Indeed, some domains have been found to be completely
unfolded random chains in the absence of their interacting partners
(either DNA or protein), becoming ordered only upon assembly. So
far, only the first glimpses of the myriad protein-protein
interactions required for regulation have been obtained.

Once the primary transcript has been made, it must be processed
to give the mature message: it is spliced by huge protein-RNA
assemblies to remove noncoding regions, with alternative splicing
sometimes occurring to control protein function. Proteins recognize specific
messages to act as genetic switches analogous to the repressors
on genes, but little is yet known about how the correct messenger
RNAs are recognized. Proteins add caps and poly-A sequences to
messages, preparing them for translation and controlling the rate
at which they are degraded. Other proteins change the availability
of key structures for translation of the message, yet another
process subject to regulation by the cell. Though many proteins
involved in these steps have been identified, very few have been
structurally characterized.
Once synthesized, proteins must fold, some needing the help of
chaperonins to prevent aggregation. They are transported within the
cell and processed to add anchors, active-site metals, or carbohydrates,
all of which are regulated processes. Some are carried through the
cell and enter the nucleus; others are taken to the surface and
released to the surroundings.
All of these processes of regulation have in common the need for
specific recognition of one molecule by another, and often the
ability to then execute a chemical or mechanical function. The
molecules involved have proven to be challenging as structural
targets. A significant number of individual domain structures
have been solved, but the real action comes with the interaction
of these domains resulting in assemblies growing to a size at
the upper end of what has been feasible to study by solution NMR.
Higher fields and new probe technology have improved sensitivity,
and the widespread use of uniform stable-isotope labeling with ¹⁵N,
¹³C and ²H, together with multidimensional spectroscopy, has steadily
expanded the molecular weight range that can be studied. A remarkable combination
of developments occurring now stands to significantly extend the
range so that many more of these complexes can be studied. The
use of selective isotope labeling provides an approach to reduce
spectral complexity in large complexes, while retaining key probes
for intermolecular interactions. New methods for chemical synthesis,
in vitro translation and metabolic labeling of proteins will provide
many more options than previously available. Methods for enhancing
and measuring dipolar couplings will provide new restraints to
enhance calculation of accurate structures, and solve some of
the problems that have plagued nucleic acid structure determinations.
Higher magnetic fields will continue to improve spectral resolution
and sensitivity, which in spite of steady gains has always been
a limiting factor. Higher fields will now play a further special
role in transverse relaxation-optimized spectroscopy (TROSY),
which can yield sharp resonances even for very large complexes
of proteins or nucleic acids. Now the process of solving structures
of important domains and watching their assembly into relevant
complexes can be realized to a far greater extent than ever before.
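As a rough illustration of why higher fields matter, the sketch below
estimates the gain in spectral dispersion and in sensitivity when moving
between two spectrometers. It rests on assumptions not stated in this
report: that peak separation in Hz grows linearly with field strength,
and that signal-to-noise follows the commonly quoted approximation of
field strength to the power 3/2. The function name and the 600 and 900
MHz instruments are hypothetical, and the additional line narrowing
offered by TROSY is not modeled.

```python
# Rough rules of thumb only; the exponents below are textbook
# approximations (assumptions), not values taken from this report.
DISPERSION_EXPONENT = 1.0   # peak separation in Hz grows linearly with field
SENSITIVITY_EXPONENT = 1.5  # signal-to-noise is often quoted as ~ field^(3/2)


def field_gain(old_mhz: float, new_mhz: float, exponent: float) -> float:
    """Return the relative gain when moving between two spectrometers,
    each identified by its proton resonance frequency in MHz."""
    if old_mhz <= 0 or new_mhz <= 0:
        raise ValueError("frequencies must be positive")
    return (new_mhz / old_mhz) ** exponent


if __name__ == "__main__":
    old, new = 600.0, 900.0  # hypothetical pair of instruments
    print(f"dispersion gain:  {field_gain(old, new, DISPERSION_EXPONENT):.2f}x")
    print(f"sensitivity gain: {field_gain(old, new, SENSITIVITY_EXPONENT):.2f}x")
```

Under these assumptions the move from 600 to 900 MHz gives a 1.5-fold
gain in dispersion and roughly a 1.8-fold gain in sensitivity; real
gains depend on probe design, sample conditions, and relaxation behavior.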
The processes involved in gene regulation should not be viewed as
isolated or unique in the cell. Many other key biological functions,
including recognition and repair of the genome (which in fact couples
to gene regulation), replication, and recombination, as well as many
signaling processes, rely on the formation of multicomponent complexes
that raise the same issues for NMR analysis. The ability to
deal with larger complexes by NMR will facilitate work in all
of these other areas.