Simon Levin, Princeton University, Co-Chair
Alberto Apostolico, University of Padova
Marjorie Asmussen, University of Georgia
Bruce L. Bush, Merck Research Labs
Carlos Castillo-Chavez, Cornell University
Robert Eisenberg, Rush Medical College
Bard Ermentrout, University of Pittsburgh
Christopher Fields, Santa Fe Institute
John Guckenheimer, Cornell University
Alan Hastings, University of California, Davis
Michael Hines, Yale University
Barry Honig, Columbia University
Lynn Jelinski, Cornell University
Nancy Kopell, Boston University
Don Ludwig, University of British Columbia
Terry Lybrand, University of Washington
George Oster, University of California, Berkeley
Alan Perelson, Los Alamos National Labs
Charles Peskin, Courant Institute of Mathematical Sciences
Greg Petsko, Brandeis University
John Rinzel, National Institutes of Health
Robert Silver, Marine Biological Laboratory
Sylvia Spengler, Lawrence Berkeley Labs
DeWitt Sumners, Florida State University
Carla Wofsy, University of New Mexico
I. Summary
II. Introduction
III. Molecular and Cellular Biology
IV. Organismal Biology
V. Ecology and Evolution
VI. Cross-Cutting Issues
VII. Educational Issues
VIII. References
The common theme of this report is the
tremendous potential of mathematical and computational approaches in leading to
fundamental insights and important practical benefits in research on biological
systems. Mathematical and computational approaches have long been appreciated
in physics and in the last twenty years have played an ever-increasing role in
chemistry. In our opinion, they are just coming into their own in biology.
The goals of these mathematical and computational approaches are to elucidate mechanisms
for seemingly disparate phenomena. For example, how does the atomic level
structure of an enzyme lead to its function, enzyme catalysis? To understand
this structure/function relationship requires fundamental quantum mechanical
and molecular dynamical calculations, but successful simulations may lead to
understanding of disease and drug therapy. Knowing the three dimensional
structure of the muscle protein kinesin may lead to understanding of muscle
action as well as other cellular motors. Simulations of the embryonic and fetal
heart at different stages of development are helping to elucidate the role of
fluid forces in shaping the developing heart. The structure and dynamics of
earth's ecosystems are critical elements in how they function and mathematical/computational
methods play a critical role in understanding their function.
In these examples and the many others in the body of this report (sections
III-V), mathematical/computational methods, based either on fundamental
physical laws (e.g. quantum mechanics), empirical data, or a combination of
both, are providing a key element in biological research. These methods can
provide hypotheses that let one go beyond the empirical data and can be
constantly tested for their range of validity.
Our report also highlights (section VI) computational issues that are common
across biology, from the molecular to the ecosystem. Computers are getting more
powerful at a prodigious rate and, in parallel, the potential for applying computational
methods to ever more complex systems is also increasing. Thus, it is essential
that the next generation of biological scientists have a strong training in
mathematics and computation from kindergarten through graduate school. We
discuss educational issues in section VII of our report.
A purpose of this report is to increase the awareness among biological
scientists of the ever-increasing utility of mathematical and computational
approaches in biology. Sometimes newly emerging areas and interdisciplinary
areas are in danger of falling between the cracks at funding agencies.
Specifically, we hope that this report will raise the level of awareness at the
National Science Foundation and other funding agencies on nurturing
computational and mathematical research in the biological sciences.
II. INTRODUCTION
Characterization of biological systems has reached an unparalleled level of
detail. To organize this detail and arrive at a better fundamental
understanding of life processes, it is imperative that powerful conceptual
tools from mathematics and the physical sciences be applied to the frontier
problems in biology. Modeling of biological systems is evolving into an
important partner of experimental work. All facets of biology, environmental,
organismic, cellular and molecular biology are becoming more accessible to
chemical, physical and mathematical approaches. This area of opportunity was
highlighted in a 1992 report, supported by the National Science Foundation,
entitled "Mathematics and Biology, the Interface, Challenges and Opportunities"
(MBICO).
A workshop was held at the National Science Foundation (NSF) on March 14 and
15, 1996 that built on the findings of MBICO in order to critically evaluate
its findings and to suggest which areas were the most promising as foci for further
research. This workshop brought together 25 scientists, with expertise ranging
from the molecular to the cellular to the organism to the ecosystem level, all
of whom have an interest in applications of mathematical/computational
approaches to biological systems. The goal of the workshop was to identify
important research areas where theoretical/computational studies could be of
most use in giving insight and in aiding related experimental work. This is
done below. Because of the small size of our group, the limited time we had,
and our not unlimited vision, one must view the areas of research opportunities
presented below as representative, not exhaustive. Hopefully, our report can
provide some guidance and a historical marker as to the state of the art
in Modeling of Biological Systems, ca. 1996.
Our report is divided into five sections. We follow the organization of the NSF
in dividing our description of research opportunities into three areas:
Molecular and Cellular Biology, Organismal Biology and Ecology and Evolution.
These three sections are followed by a section focussing on issues that cross
the boundaries between these areas and a final section on educational issues.
III. MOLECULAR AND CELLULAR BIOLOGY
OVERVIEW
A central organizing theme in Molecular and Cellular Biology is the
relationship between structure of molecules and high level complexes of
molecules and their function, both in normal and aberrant biological contexts.
The connection between structure and function was most clearly illustrated in
the paper that began Molecular Biology, the elucidation of the structure of DNA
by Watson and Crick.
This study immediately illustrated how DNA can replicate and retain the
original information stored in it. Thus, the structure showed how this molecule
functions. But this example also shows the important role of mathematics,
chemistry and physics in elucidating structure/function relationships in
biology. Both the information contained in DNA duplexes and their higher order
structures have been usefully analyzed by mathematics, as the sections below on
the GENOME and MOLECULAR HISTOLOGY illustrate, and important questions have
been answered and many still remain unanswered in these areas.
The developments in physics and chemistry have played fundamental roles in
enabling structure determination of the essential molecules in biology -
proteins, nucleic acids, membranes and saccharides - and in that fashion,
helping one to understand their function. Some aspects of these efforts are
described below in the sections on PROTEIN STRUCTURE and NUCLEIC ACIDS. The use
of the simulation methodologies first developed in the physics and chemistry
communities to simulate molecules of biological interest is described in
SIMULATIONS. Evolution has occurred on a molecular as well as a macroscopic
scale and some of the molecules and their properties that have evolved are
quite astonishing. The section on BIOINSPIRED MATERIALS points out the
possibilities inherent in making use of some of the materials that have evolved
in the process of molecular evolution.
Although much progress has been made in understanding structures of molecules
of biological interest and using this to infer function, a tremendous amount
remains to be done. Some of the key questions include: What is the structure of
the DNA in the nucleus and how does this structure govern DNA transcription?
Given the DNA sequence, what determines the RNA and protein structures that the
DNA codes for? Given the protein structure, what is its function? How did this
function evolve and is it optimized? How can one use this function to design
pharmaceuticals that will have a real impact on disease without upsetting the
rest of the delicately balanced biological system? What can we learn from other
organisms, some that grow under extreme conditions of temperature and pressure,
about the nature and limits of living cells and the molecules that make them
up?
The above are just some of the key questions, but it is clear from their nature
that mathematical and physical/chemical methods will be essential in answering
these questions. These methods provide the tools and language of molecular
structure from the smallest to the largest molecules and the fundamental laws
to explain how molecules interact and form their three dimensional shape. It is
this three dimensional shape which determines the molecular function. We have
reached an incredibly exciting time in the determination of protein structure,
with over 200 different types of globular protein structures known and an
estimate of the order of 1,000 expected to exist in all of biology. Thus, we
may soon have examples of every type of globular protein structure, as well as
insight into the nature of the gene which determines it.
It is clear that the nature of biological signaling pathways is very complex
and involves many feedback loops and fail safe mechanisms. The tools of
mathematics are essential to understanding these. These signaling pathways are
just one example where there is a connection between the material presented in
Molecular and Cellular and in Organismal Biology. How do these molecular
signals ultimately get transmitted into neural signals and how can we
understand possible defects at every level of these pathways: are defects due
to mutations in the proteins, subtle changes in concentration of normal
molecules or some external influence? These are exciting and extremely
important questions that involve understanding the connections from the
molecular to the cellular to the organismal level.
In the four years since the MBICO report, genomic sequence
information has continued its exponential growth. Sequencing technology is
being applied directly to sequence diversity analysis and gene expression
analysis via high throughput, chip-based, automated assay systems. This influx
has changed both the questions that are asked, as well as the range of the
interactions considered.
For example, high throughput expression data are now both tissue-specific and
specific to stages of development. Over 300,000 human expressed sequence tags
are now available in public databases, representing at least 40,000 human
genes. Moreover within the next 5 years, as many as 50 complete genomes will be
sequenced. Indeed the complete genomes of a number of simple organisms have
already been sequenced (see e.g. Fleischman, 1995), the sequence of the yeast
genome has recently been published (see e.g. Williams, 1996), and C. elegans is
reported to be a year or two away. No one is sure as to how best to exploit
genomic data but it is clear that there will soon be an explosion of biological
information on an unprecedented scale.
It will become increasingly important to carry out comparisons of entire
genomes rather than just single genes, with a concomitant expansion in the time
to compute. Multiple comparisons remain even more problematic. A similar
expansion of queries from local regions of interest (say 50,000 bp) to long
range patterns of sequence or expression is now necessary, with syntenic regions
on the order of 25 Mb considered a reasonable length for consideration.
Biological and biochemical research is producing exponentially-growing data
sets. In addition to the examples cited above of DNA sequences (currently
doubling about every 6 months) and gene expression data (chips supporting 1000s
of assays per day), combinatorial library screens (10,000s of compounds against
1000s of targets) are producing vast quantities of systematic data on function.
Technological developments will increase these data acquisition rates by an
order of magnitude or more in the next few years.
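The growth figures above can be turned into simple arithmetic. The following is a minimal sketch, assuming a constant doubling time (the 6-month figure quoted for DNA sequence data); the function names are illustrative and not part of any existing tool.

```python
import math


def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Multiplicative growth after elapsed_months, given a fixed doubling time."""
    return 2.0 ** (elapsed_months / doubling_months)


def months_to_reach(factor: float, doubling_months: float) -> float:
    """Months needed to grow by `factor`, given a fixed doubling time."""
    return doubling_months * math.log2(factor)


if __name__ == "__main__":
    # Assumed doubling time of 6 months, as quoted for DNA sequence data.
    print(growth_factor(24, 6))              # growth over two years
    print(round(months_to_reach(10, 6), 1))  # months for one order of magnitude
```

Under this assumption, a data set grows 16-fold in two years and gains an order of magnitude in a little under 20 months. Real acquisition rates need not follow a constant doubling time.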
Significant work is required to develop data management systems to make these
data not just retrievable, but usable as input to computations and amenable to
complex, ad hoc queries across multiple data types. Significant work is also
required on techniques for integrating data obtained for multiple observables,
at different scales, with different uncertainties (data fusion) and for
formulating meaningful queries against such heterogeneous data (data mining).
For example, it should be possible in the future to ask what differences to
expect in the kinetic efficiencies of a signal-transduction pathway across
multiple individuals, given the differences in the sequences of the proteins
involved in the pathway. Answering such queries will require improvements in
data models, heterogeneous database management systems, multivariate
correlation analysis, molecular structure prediction, constrained-network
modeling, and uncertainty management.
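As one very simple instance of combining data with different uncertainties, independent measurements of the same quantity can be merged by inverse-variance weighting. This is a minimal sketch under the assumption of independent, Gaussian-distributed errors; it is not a description of any particular data fusion system, and the function name is illustrative.

```python
def fuse_measurements(values, sigmas):
    """Inverse-variance weighted mean of independent measurements.

    values: measured values of the same quantity
    sigmas: standard deviations (uncertainties) of those measurements
    Returns (fused_value, fused_sigma).
    """
    if not values or len(values) != len(sigmas):
        raise ValueError("values and sigmas must be non-empty and equal in length")
    # More precise measurements (smaller sigma) receive larger weights.
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    fused = sum(w * v for w, v in zip(weights, values)) / total
    return fused, (1.0 / total) ** 0.5


if __name__ == "__main__":
    # Two hypothetical measurements; the second is half as precise.
    print(fuse_measurements([10.0, 14.0], [1.0, 2.0]))
```

The fused estimate lies closer to the more precise measurement, and its uncertainty is smaller than that of either input. Heterogeneous biological data will generally require far richer models than this.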
As the amount of genomic data
grows, three dimensional structure will provide an increasingly important means
for exploiting and organizing this information. Structure provides a unique yet
largely unexplored vehicle for deducing gene function from sequence data.
Structure also links genomic information to biological assays and serves as a
basis for rational development of bioactive compounds, including drugs and
vaccines.
Research opportunities in this area can be divided into four distinct
categories: experimental structure determination, structure prediction,
structure exploitation of globular proteins, and modeling of membrane proteins,
where the determination of high resolution structures is much more difficult.
During the past decade, advances in protein crystal growth, diffraction data collection and experimental phase determination have led to an explosion of structural information (Ringe and Petsko, 1996). Despite this rapid growth, the demand for new structural data remains high. Areas where mathematical and computational approaches are still needed to increase the throughput further include direct phase determination, improved structure solution by molecular replacement, and automated electron density map interpretation.
NMR: Solution NMR is now providing high resolution protein, DNA and RNA structures that rival those from x-ray crystallography. The limit to the size of molecules whose structures can be solved by NMR is dictated by chemical complexity, solubility, redundancy and molecular tumbling time. For proteins, the current upper size limit is about 200 KDa. Mathematical and modeling issues that pertain to NMR structures include developing methods to account for the effects of molecular dynamics, to specify the reliability of the structures, and to specify regions of molecular disorder and/or under-determined constraints. Data reduction for NMR should be automated. Current research along this line includes developing data management tools for assigning resonances and keeping track of cross-peaks from multi-dimensional experiments (Johnson and Blevins, 1994).
The most effective methods of structure prediction currently
available involve constructing models of proteins with unknown structures based
on templates derived from protein structures that have been determined (see
e.g. Bowie et al, 1991). There has been remarkable progress in the
development of these "fold recognition" methods in the past few years
and they offer new opportunities in structure prediction that simply did not
exist a few years ago (see e.g. the November 1995 issue of Proteins (Asilomar,
1995)).
Fold recognition methods can be used to predict the structures of proteins that
have not yet been determined experimentally and to find homology relationships
between proteins that cannot be detected with traditional sequence alignment
methods. The challenges that now arise offer research opportunities in a number
of areas. These include the integration of structural information in sequence alignment
methods, the development of improved scoring functions for the association of a
given sequence with a given structure (see e.g. Bryant and Lawrence, 1993), and
the identification of folding templates that focus on key structural elements
to be matched to sequence fragments (Orengo et al, 1995). These problems
will all require the development of new computational methods that allow the
analysis and integration of large quantities of structural and sequence data
and new simplified physical models that are designed to the requirements of
this emerging field.
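The idea of a scoring function that associates a sequence with a structure can be illustrated with a toy example. The sketch below assumes a gapless alignment, a two-state (buried/exposed) description of each template position, and a preference table whose values are invented for illustration; it is not taken from any of the cited methods.

```python
# Hypothetical residue preferences for two structural environments.
# The numbers are illustrative only, not from a published scoring function.
PREFERENCE = {
    "buried":  {"L": 1.0, "V": 0.8, "A": 0.3, "K": -1.0, "D": -1.2, "S": -0.3},
    "exposed": {"L": -0.8, "V": -0.6, "A": 0.0, "K": 0.9, "D": 1.0, "S": 0.4},
}

# Hypothetical templates: one environment label per structural position.
TEMPLATES = {
    "fold_A": ["buried", "exposed", "buried", "exposed"],
    "fold_B": ["exposed", "buried", "exposed", "buried"],
}


def threading_score(sequence, environments):
    """Sum each residue's preference for the environment at its position."""
    if len(sequence) != len(environments):
        raise ValueError("sequence and template must have the same length")
    # Residues missing from the toy table contribute zero.
    return sum(PREFERENCE[env].get(res, 0.0)
               for res, env in zip(sequence, environments))


def best_template(sequence, templates):
    """Return the name of the template with the highest score."""
    return max(templates, key=lambda name: threading_score(sequence, templates[name]))


if __name__ == "__main__":
    print(best_template("LKVD", TEMPLATES))
```

Here the sequence LKVD places hydrophobic residues at the buried positions of fold_A, so fold_A scores higher. Practical fold recognition methods must also handle gaps, pairwise contacts and statistical significance, which this sketch omits.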
Once an overall structural template has been derived, there is a need for
methods to predict three dimensional structure at the atomic level. There has
been significant progress in the past few years in the building of side chain
conformations onto backbone templates (see e.g. Lee and Subbiah, 1991) but
faster and more accurate solutions to this problem would be extremely useful.
Assuming the conserved structural framework regions are known, there is also a
need for new methods which model the structures of loops onto fixed structural
framework regions (see e.g. Levitt, 1993), a problem which is of unique
importance for membrane proteins. These can benefit from fast minimization and
conformational search procedures and from improved physical models which relate
structure to free energy (see e.g. Smith and Honig, 1994).
The growing body of structural information provides a new
way of organizing biological data, with applications including the prediction
of function given a structure, the discovery of new principles of
protein-protein interactions, and the discovery of new evolutionary
relationships that were not evident from sequence alone. Structure
determination is usually done to address fundamental problems in cell biology,
biochemistry or pharmacology. The specific questions raised by a structure
include: where on the protein surface are the binding sites? What are the
chemical groups that prefer to bind to these sites? How do the protein and
ligand structures change in response to binding? What are the roles of protein
and cofactor groups in catalysis? How do protein dynamic properties influence
protein function?
The construction of a new class of protein structure/function databases offers
a possible approach to these problems. For example, the characterization of
different protein binding sites in terms of physical and geometric properties
will be useful in predicting the function of new proteins whose structures have
been determined and, more generally, provides a new way of organizing and
interpreting biological data. This area offers research opportunities in
problems including the construction of new methods to represent three
dimensional objects and their incorporation into databases, the merging of
these databases with sequence and function databases, and the development of
new physical models to characterize functionally active regions in proteins.
Structure-based drug design requires locating all usable binding sites followed
by the design of small molecules that bind tightly and specifically to them
(Guida, 1994). Existing computational methods often fail because they do not
adequately account for solvent effects (see e.g. Eisenberg and McLachlan, 1986)
nor for the possibility of conformational adjustment (Kearsley et al,
1994). Better procedures are urgently needed.
Studies of enzyme catalysis ultimately require simulation of entire reaction
pathways including all bond breaking and bond-making steps as well as the
random motion of the enzyme substrate system. Existing methods of combining
quantum mechanical and molecular mechanical potential functions to carry out
such simulations are still rather inaccurate. This is particularly true for the
interactions of metal ions and clusters which are found in a high percentage of
enzymes. Improved mathematical and computational methods are needed in all of
these areas and it is an area of much active research (Gao, 1996).
One new experimental area that is certain to have major impact on the
exploitation of structural information is combinatorial chemistry (Gordon et
al, 1994). New techniques for high-speed parallel synthesis of novel organic
compounds are generating libraries of literally hundreds of thousands of
molecules, many of which bind to important biological targets. Methods must be
developed for organizing, correlating and interpreting the plethora of
structure/activity data produced by screening such libraries. The union of
combinatorial chemistry and structural biology offers the possibility of
deducing the rules for molecular recognition, which may ultimately allow us to
build accurate models of multiprotein complexes from the structures of their
components. The merging of small molecule and structural databases offers
unique and important challenges in this regard.
Membrane Proteins
Study of membrane proteins presents special challenges, but also promises to
yield exciting and important information. Greater understanding of membrane
protein structure and function will enhance dramatically our understanding of
basic biochemical processes such as signal transduction, and make possible
significant advances in biotechnology (e.g., receptor-based biosensors) and
biomedical sciences (e.g., structure-aided drug design). Technical problems
make it difficult or impossible to determine high-resolution structures for
most membrane proteins at present. However, a great deal of experimental data
is available for many membrane proteins, and this information can often be used
in concert with computational tools to generate reasonable three-dimensional
models (Findlay, 1996). The models in turn are beneficial in formulation of
hypotheses and design of future experiments (Kontoyianni and Lybrand, 1993). A
number of developmental issues must be addressed to enhance modeling
capabilities for study of proteins in general, and membrane proteins in
particular. For example, it is not well understood at present how much
"constraint" information is needed to permit construction of a reasonable
three-dimensional model structure, or even which types of experimental
information are most useful in model building exercises. Additional
methodological developments are also needed for improved representation and
treatment of lipid bilayers (e.g., efficient treatment of long-range
electrostatic interactions, modified Hamiltonians for representation of
anisotropic pressure tensors, etc.) and lipid-protein interactions. A number of
prokaryotic membrane proteins are now quite well characterized (e.g., bacterial
chemotaxis receptors (Bourret et al, 1991) and porins (Kreusch and
Schulz, 1994)), and can serve as useful models for more complex membrane
proteins from higher organisms. These systems are ideal test cases for
evaluation of new procedures for membrane protein modeling.
Rapid progress in the understanding of membrane protein structure and function
has been hindered by the lack of a large number of high-resolution structures.
Structures from x-ray crystallography are limited to those complexes that
crystallize, whereas those from high-resolution solution NMR are limited to
cases where the assemblies have sufficiently short correlation times to produce
narrow lines. Techniques from solid state NMR, including rotational resonance
(RR) and rotational echo double resonance (REDOR) and EPR spectroscopy
(Steinhoff et al, 1994), offer special opportunities for obtaining
highly specific distance constraints for membrane proteins. A promising avenue
of research is to delineate the minimum amount of distance information needed
to specify a structure, and to predict in what order one could perform the
fewest specific NMR or EPR experiments to arrive at a structure.
NUCLEIC ACIDS
The problem of RNA structure prediction and DNA and RNA interactions with
proteins is of central biological interest. There is a need here for improved
physical models to describe the interactions of nucleic acids which differ from
most proteins in that they induce large local electric fields. Recently,
methods have been developed for treating highly charged macromolecules which
are surrounded by concentrated ion atmospheres (see e.g. Misra et al,
1993; York et al, 1995). These and related methods open up a variety of
opportunities for simulating important biological phenomena involving nucleic
acids at atomic level resolution.
The explosive growth of information about RNA structure and function offers new
opportunities that were nonexistent a few years ago. Requirements in this area
range from computational and mathematical techniques to describe the
interaction of large fragments (see e.g. Easterwood et al, 1994), which
are treated as rigid structural units, to accurate atomic-level representations.
Similarly, methods must be developed to integrate experimental and phylogenetic
data into modeling studies (Jaeger et al, 1994).
Simulations of molecules of
biological interest use computational representations that range from simple
lattice models to full quantum mechanical wave functions of nuclei and
electrons. If one has access to a macromolecular structure derived from NMR or
X-ray crystallography, then one can begin with a full atom representation and
fruitfully examine "small changes" in the system such as ligand
binding or site specific mutation. Again, the goal is to reproduce and predict
structure, dynamics and thermodynamics. In fact, simulations can provide the
connecting link between structure (X-ray and NMR) and function (experimental
measurements of thermodynamic properties).
In the last 10 years, because of increased computer power, molecular dynamics
calculations have progressed from the short-time simulation of a macromolecule
without explicit solvent to full representations of solvent and counterions
carried out over a few nanoseconds (Berendsen, 1996). Developments in both
hardware and software for parallel computing have played a major role. However,
the longest time simulations that have been carried out are still 9 orders of
magnitude away from the typical time scale for experimental protein folding.
Simplified but realistic models, for example using a continuum treatment of the
solvent (Gilson et al, 1995), could increase the time scale by 1-2
orders of magnitude. Continuum representations may be more readily incorporated
into Monte Carlo methods and thus allow large movements of the molecule during
simulation (Senderowitz et al, 1996). In some cases, the use of Langevin
and Brownian dynamics and multiple time step algorithms (Humphreys et al,
1994) may be warranted. The simulation of biological molecules at the molecular
level has generated much excitement and these approaches have become an
increasingly important partner with experimental studies of these complex
systems.
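The remark that Monte Carlo methods allow large movements during simulation can be illustrated on a toy problem. The following is a minimal sketch of the Metropolis acceptance rule applied to a one-dimensional double-well energy function; the energy function, step size and parameter names are assumptions chosen for illustration, not a model of any real molecule.

```python
import math
import random


def double_well(x: float) -> float:
    """Toy energy surface with minima at x = -1 and x = +1 (in units of kT)."""
    return 4.0 * (x * x - 1.0) ** 2


def metropolis(energy, x0, n_steps, max_move, kT=1.0, seed=0):
    """Sample positions with the Metropolis rule; returns the full trajectory."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    trajectory = [x]
    for _ in range(n_steps):
        # Trial moves may be large; there is no integration time step.
        x_new = x + rng.uniform(-max_move, max_move)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / kT).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = metropolis(double_well, x0=-1.0, n_steps=2000, max_move=1.5)
    right = sum(1 for x in traj if x > 0)
    print(f"fraction of samples in the right-hand well: {right / len(traj):.2f}")
```

Because a single trial move can span the barrier between the two wells, the sampler can visit both minima without following a continuous path over the barrier, which is the sense in which Monte Carlo permits large movements. Unlike molecular dynamics, the resulting sequence of states carries no direct time information.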
Electrostatic interactions are a crucial component in the structure and function
of biological macromolecules. In the last few years electrostatic models based
on numerical solutions to the Poisson-Boltzmann (PB) equation have been used
extensively as a basis for interpreting experimental observations on proteins
and nucleic acids (Honig and Nicholls, 1995), including, for example, the
prediction of the pKa's of ionizable groups (see e.g. Bashford and Karplus,
1990). Electrostatic potential plays a special role in membrane phenomena: the
energies involved are large and the experimental effects of potential changes
are also large, often dominant. The extension of PB methods to membranes and
channels is an area of great interest.
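For reference, one common way of writing the Poisson-Boltzmann equation mentioned above is given here in SI units. The symbols are introduced for this sketch and are not taken from the cited papers: \psi is the electrostatic potential, \epsilon(\mathbf{r}) the position-dependent permittivity, \rho_f the fixed charge density of the macromolecule, c_i^{\infty} and z_i the bulk number density and valence of mobile ion species i, q the elementary charge, and k_B T the thermal energy. The ionic term applies only in the solvent region accessible to ions.

```latex
\nabla \cdot \left[ \epsilon(\mathbf{r}) \, \nabla \psi(\mathbf{r}) \right]
  = -\rho_f(\mathbf{r})
    - \sum_i c_i^{\infty} z_i q \,
      \exp\!\left( -\frac{z_i q \, \psi(\mathbf{r})}{k_B T} \right)

% Linearized form, valid when |z_i q \psi| \ll k_B T and the bulk solution
% is electroneutral (\sum_i c_i^{\infty} z_i = 0):
\nabla \cdot \left[ \epsilon(\mathbf{r}) \, \nabla \psi(\mathbf{r}) \right]
  - \epsilon(\mathbf{r}) \, \kappa^2(\mathbf{r}) \, \psi(\mathbf{r})
  = -\rho_f(\mathbf{r}),
\qquad
\kappa^2 = \frac{q^2}{\epsilon \, k_B T} \sum_i c_i^{\infty} z_i^2
```

The linearized form follows from expanding the exponential to first order; the constant term vanishes by electroneutrality, leaving a screening term proportional to \psi.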
Bio-inspired materials represent a special area of opportunity for developing new high-performance engineering materials based on ideas inferred from Nature (Tirrell et al, 1994). For example, the proteins derived from spider silk serve as the inspiration for high-strength fibers (Simmons et al, 1996); the adhesives from barnacles suggest how to produce glues that cure and function underwater; and the complex protein-inorganic interactions in mollusk shells supply ideas for producing ceramics that are less brittle than current ones. It is likely that ultimate bio-inspired materials will be chimeric, that is, they will be produced as a hybrid between biological and synthetic components. Consequently, these materials represent a special class of the protein folding problem and of polymer physics. In addition to the molecular level interactions, the ultimate mechanical properties of such materials derive also from long-range interactions, orientation and crystallite size. Models from polymer science and from protein folding must be combined and adapted to predict how mechanical properties such as modulus, strength and elasticity depend on these physical parameters. Once such models are also able to explain the mechanical properties of wild-type biomaterials, they can be used in a predictive sense to guide the production of chimeric materials.
Understanding the spatial conformation of biological
macromolecules (DNA, RNA, protein) and functional changes in conformation
provides an ongoing challenge to mathematics. Analytical and computational
models based in geometry and topology continue to be very successful in
providing a theoretical and computational framework for the analysis of enzyme
mechanism and macromolecular conformation (Rybenkov, 1993; Schlick and Olson,
1992; White, 1992; Sumners et al, 1995; Lander and Waterman, 1995).
New experimental modalities, such as cryo-electron microscopy (Stasiak et
al, 1996) and optical tweezers (Smith et al, 1996), provide spatial and
structural data of ever-increasing resolution. This new spectrum of high-resolution
data will require correspondingly high-resolution mathematical models to aid in
the design and interpretation of experiments. Refinement of existing models
will provide a starting point, but new ideas and new combinations of old ideas
are needed. One particularly important need is the development of efficient
descriptors of spatial conformation of macromolecules; descriptors that will
afford efficient database entry and retrieval of information, while encoding
biologically significant structural information.
IV. ORGANISMAL BIOLOGY
OVERVIEW
The central organizing theme for Cells and Cell Systems is how behavior and
function at one level of organization emerge from the structure and
interactions of components at lower levels. In the set of topics described in
this section the lower level of organization is subcellular or cellular. Though
some of the subcellular components that play a role in these models are
molecular, the focus is not on the structure of those molecules, but on the
part that they play in cellular and multicellular function. The section on CELL
SIGNALING deals with the role of specific molecules in regulation of processes
such as cell division, cellular communication, and gene expression. In the
MECHANICS AND EMBRYOLOGY section, the focus is on how mechanochemical processes
at the molecular level can drive the processes that lead to macroscopic changes
in shapes of tissues and organs. The problems discussed in BIOFLUID DYNAMICS
again start at the level of individual (bacterial) cells, with substructures
(flagella) interacting at tiny scales with the hydrodynamics to produce
macroscopic behavior (swimming).
The sections on IMMUNOLOGY AND VIROLOGY and NEUROSCIENCES focus on scientific
problems that involve larger multicellular systems. Understanding the immune
system requires insights about how classes of molecules found on the cell
surface generate the complex signals which lead to a normal immune response;
this response, which includes a memory of previous interactions with antigens,
is a property of the entire immune system, not of individual cells. Similarly,
the nervous system can be studied at the level of individual cells, to
understand how the biophysical properties of cellular membranes contribute to
the responses of individual cells; but an understanding of the functioning of
the nervous system also requires a study of the behavior of large scale
networks of neurons.
Control of cellular processes,
mediated by interactions of signaling molecules and their cell surface
receptors, is a central and unifying theme in current experimental cell
biology. Within the past five years, techniques of molecular biology have
revealed many of the kinases, phosphatases and other molecules involved in
signal transduction pathways, as well as molecular sub-domains and sequence
motifs that determine distinct functions. New techniques for measuring
phosphorylation, calcium fluxes, and other early biochemical responses to
receptor interactions are being applied to study many cell signaling systems
(e.g., chemotactic bacteria, neurons and lymphocytes). Genetically engineered
experimental systems consisting of homogeneous cell lines, transfected with
homogeneous populations of wild type and mutant receptors and effector
molecules, have facilitated acquisition of much of the new information about
the intracellular molecules that mediate signal transduction. Improved
measurement and experimental design make mathematical modeling an increasingly
feasible tool for testing ideas about the interactions of these molecules.
Modeling has contributed to our understanding of key cell surface interactions
(e.g., ligand-induced receptor aggregation, cell-cell interactions, and cell
adhesion). Modeling has also clarified the nature and effects of cellular
responses (e.g., internalization and secretion of proteins, cell division and
differentiation, and cell motility). Recent combinations of modeling and
experiment have brought a deeper understanding of the role of calcium in the
regulation of cell division, neuronal communication, regulation of muscle
contraction, pollination, and other cellular processes. (Silver, 1996)
Representative descriptions of collaborative work applying mathematics to
problems in experimental cell biology are found in Alt et al, 1996;
Goldstein and Wofsy, 1994 and Lauffenburger and Linderman, 1993. Other recent
examples of the productive application of theory to cell signaling and cell
motility include Alon et al, 1995; Bray, 1995; Jafri and Keizer, 1994;
Naranja et al, 1994; Tranquillo and Alt, 1996 and Tyson et al,
1996. Over the next few years, we can expect mathematical modeling to play a
central role in the design and interpretation of experiments aimed at
understanding in detail the biochemical reactions leading from receptor
interactions to changes in gene expression, cell division, and other functional
responses.
Recent advances in
instrumentation have made it possible to measure motions and mechanical forces
at the molecular scale (Svoboda and Block, 1994). Concomitant with these new
mechanical measurements are crystallographic and x-ray diffraction techniques
that have revealed the atomic structure and molecular geometry of
mechanochemical enzymes to angstrom resolutions (Rayment and Holden, 1994).
Together, these techniques have begun to supply data that has revived interest
in cellular mechanics, and reinvigorated the view of enzymes as mechanochemical
devices. It is now possible to make realistic models of molecular
mechanochemical processes that can be related directly to experimentally
observable, and controllable, parameters (Peskin and Oster, 1995). These
advances in experimental technology have initiated a renaissance in theoretical
efforts to readdress the central question: how do protein machines work? More
precisely, how is chemical energy transduced into directed mechanical forces
that drive so many cellular events?
Embryology has also moved beyond descriptive observation to encompass genetic
control of development and localization of protein effectors. The stress and
strain measurements that are now possible at the cellular scale promise to
unite the genetics, biochemistry and biomechanics of development (Oliver et
al, 1995). By characterizing the mechanical properties of embryonic cells
and tissues, mathematical models can be used to discriminate between various
possible mechanisms for driving morphogenesis (Davidson et al, 1995).
Examples encompass all phenomena that involve the coordinated movement of
macromolecules, cells or tissues. How do embryonic cells crawl and bacteria
swim (Dembo, 1989; Berg, 1995; Mogilner and Oster, 1996)? How are proteins
shuttled about the cell (Scholey, 1994)? What drives the grand progression of
cell division (Murray and Hunt, 1993)? What drives the shaping of tissues and
organs during embryonic development (Murray and Oster, 1984; Brodland, 1994)
and the reshaping of organs after injury (Tranquillo and Murray, 1993; Olsen et
al, 1995)?
Because of the ongoing
revolution in computer technology, we can now solve fluid dynamics problems in
three spatial dimensions and time (Ellington and Pedley, 1995). This opens
up biological opportunities on many different scales of size. On the organ scale,
for example, one can now perform fluid dynamics simulations of the embryonic
and fetal heart at different stages of development. Such models will help to
elucidate the role of fluid forces in shaping the developing heart. The
swimming mechanics of microorganisms are also accessible to computer
simulation. A particularly challenging problem in this field concerns the
intense hydrodynamic interaction among the different flagella of the same
bacterium: When the flagella are spinning so that their helical waves propagate
away from the cell body, they wrap around each other to form a kind of
superflagellum that propels the bacterium steadily along; when their motors are
reversed and the flagella spin the other way, the superflagellum unravels and
the bacterium tumbles in place. Because of the difficulty of measuring
microscopic fluid flows, hydrodynamics within cells is a much neglected aspect
of cellular and intracellular biomechanics. Indeed, computation provides our
only window onto this important aspect of cellular physiology. The
incompressibility and viscosity of water have the effect of coupling motions
along different axes, and between objects quite distant from one another;
biomolecular processes are also modulated by the necessity of moving water out
of the way. A new feature in this realm of micro and nano hydrodynamics is the
importance of Brownian motion and the related significance of osmotic mechanics
(including sol-gel transformations) for controlling fluid motions.
Progress in this field will depend on access to large-scale scientific
computing. It is important that the best technology be made available to
scientists on a scale sufficient to sustain this kind of research. This will
also necessitate supporting people with the expertise to make effective use of
these powerful machines. At universities, such people are often in non-faculty,
non-tenured research positions. We need support to sustain their crucial role.
During the last two years
mathematical modeling has had a major impact on research in immunology and
virology. Serious collaborations between theorists and experimentalists provided
breakthroughs by viewing experiments in which AIDS patients were given potent
anti-retroviral drugs as perturbations of a dynamical system. Mathematical
modeling combined with analysis of data obtained during drug clinical trials
established for the first time that HIV is rapidly cleared from the body and
that approximately 10 billion virus particles are produced daily (Ho et al,
1995). This work had tremendous impact on the AIDS community and has, for the
first time, given them a quantitative picture of the disease process. The
impact of this type of analysis has extended beyond AIDS, and opportunities
exist for developing realistic and useful models of many viral diseases.
Challenges remain in studying drug therapy as a nonlinear control problem, and
the issue of how rapidly viruses mutate and become drug resistant under
different therapeutic regimes needs to be considered. Such issues also apply to
the development of antibiotic resistance in bacterial disease.
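The perturbation argument in the preceding paragraph can be illustrated with a minimal sketch. It assumes a deliberately simplified model, not necessarily the one used in the cited study: free virus obeys dV/dt = P - cV, the pre-treatment state is steady (so P = c*V0), and a fully effective drug sets P to zero, giving V(t) = V0*exp(-c*t). The viral loads below are synthetic, and the names fit_decay_rate, true_c and v0 are hypothetical.

```python
import math

def fit_decay_rate(times, loads):
    """Least-squares slope of ln(load) against time, returned as a positive decay rate."""
    logs = [math.log(v) for v in loads]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx

# Synthetic, noise-free data (illustrative values only, not clinical measurements).
true_c = 0.35                      # assumed clearance rate, per day
v0 = 1.0e5                         # assumed pre-treatment viral load, copies per ml
times = [0.0, 1.0, 2.0, 3.0, 5.0, 7.0]
loads = [v0 * math.exp(-true_c * t) for t in times]

c_hat = fit_decay_rate(times, loads)      # estimated clearance rate, per day
half_life = math.log(2.0) / c_hat         # days for the load to halve
production = c_hat * v0                   # steady-state production, copies per ml per day
```

Under these assumptions the decay slope alone fixes the clearance rate, and the pre-treatment steady state then converts that rate into a production rate. Real data would require a noise model and allowance for incomplete drug efficacy.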
Opportunities exist for substantial advances in immunology by the use of
modeling techniques. Molecular modeling is providing insights into the
structure and function of the cell surface molecules crucial for the operation
of the immune system: immunoglobulin, the T cell receptor, and molecules coded
for by the major histocompatibility complex genes, as well as molecules being
recognized by the immune system. The biochemical sequelae of molecular
recognition involve the generation of complex biochemical and enzymatic
signals, whose net effects are changes in gene expression followed in many cases
by cell proliferation, cell differentiation and cell movement. How these
changes are orchestrated to produce an immune response remains to be elucidated.
However, modeling can give us insights into how cells interact by direct
contact and via secreted molecules, cytokines, to produce the coordinated
behavior necessary to meet immune system challenges.
The fundamental challenge in neuroscience is to understand
how behavior emerges from properties of neurons and networks of neurons.
Advances in experimental methodologies are providing detailed information on
ionic channels, their distribution over the dendritic and axonal membranes of
cells, their regulation by modulatory agents, and the kinetics of synaptic
interactions. The development of fast computing, sophisticated simulation
tools, and improved numerical algorithms has enabled the development of
detailed biophysically-based computational models that reproduce the complex
dynamic firing properties of neurons and networks. Such computations provide a
two-fold opportunity for advancing our knowledge: (1) they both explain and drive
new experiments, (2) they provide the basis for new mathematical theories that
enable one to obtain reduced models that retain the quantitative essence of the
detailed models. These reduced models, which allow the bridging of multiple
spatial and temporal scales, are the building blocks for higher level models.
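As a hedged illustration of what a reduced model looks like, the sketch below integrates the FitzHugh-Nagumo equations, a classical two-variable caricature of excitable membrane dynamics. The choice of this model, the parameter values (a = 0.7, b = 0.8, eps = 0.08), and the function name simulate_fhn are assumptions made for the example; the report itself does not single out any particular reduction.

```python
def simulate_fhn(current, a=0.7, b=0.8, eps=0.08, dt=0.01, t_end=200.0):
    """Forward-Euler integration of dv/dt = v - v**3/3 - w + I, dw/dt = eps*(v + a - b*w).

    Returns the voltage-like variable v sampled over the second half of the run,
    so that the initial transient is discarded.
    """
    v, w = -1.2, -0.6              # start near the resting state for I = 0
    steps = int(t_end / dt)
    trace = []
    for k in range(steps):
        dv = v - v ** 3 / 3.0 - w + current
        dw = eps * (v + a - b * w)
        v += dt * dv
        w += dt * dw
        if k >= steps // 2:
            trace.append(v)
    return trace

resting = simulate_fhn(current=0.0)   # no injected current
firing = simulate_fhn(current=0.5)    # constant injected current
```

With no input the model settles to rest, while a constant input produces sustained oscillations, a two-variable analogue of repetitive firing. Forward Euler is used only for brevity; a stiff or adaptive solver would be preferable for detailed biophysical models.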
Modeling tools and mathematical analysis allow us to address the central
question: What are the cellular bases for neural computations and tasks such as
sensory processing, motor behavior and cognition? (Koch and Segev, 1989; Bower,
1992) More specifically, how do intrinsic properties of neurons combine in
networks with synaptic properties, connectivity, and the cable properties of
dendrites to produce our interaction with the world? Neural modulators affect
both the intrinsic currents and the synaptic interactions between neurons.
(Harris-Warrick et al, 1992) The effects of these changes at the network
level are difficult to work out even for small networks. The largest challenge
in this area is to understand how systems with enormous numbers of degrees of
freedom and large numbers of different modulators combine to produce flexible
but stable behavior. The geometry and electrical cable properties of the
branching dendrites of neurons also affect network activity. (Stuart and
Sakmann, 1994) Mathematical analysis is needed to interpret the results of
massive computations, and to incorporate the insights into network models.
The dynamics of neural networks (Golomb et al, 1996; Kopell and LeMasson, 1994)
affect both cognitive and sensory-motor behavior. To understand motor behavior,
one must construct models that illuminate the role of feedback between neural
and mechanical subsystems. For sensory systems, one of the most important
problems is to understand how the brain controls the data that it receives,
including understanding more rigorously the quantitative
parameterization/description of natural stimuli. A current active area of
inquiry is the characterization of codes used in information processing in the
nervous system. (Softky and Koch, 1993; Shadlen and Newsome, 1995; Softky,
1995) Among the issues raised by this question is how the complex dynamics of
the cortex can help shape responses to stimuli, including selecting pathways
that lead to different behavior.
Modeling has become an accepted and central tool in neurobiology. The current
scientific goals listed above create specific challenges in modeling. Some of
these concern the handling and interpretation of the far greater volume of data
that is now, or potentially, available, e.g. through multiunit recording
techniques. With very large and complex models (Whittington et al,
1995), techniques for systematically choosing parameters are important, as are
methods for comparing models and understanding their differences. Both
computers and mathematical analysis will play major roles in dealing with the
technical problems; mathematical analysis remains the fundamental tool for
providing a deep understanding of how models differ in their predictions.
V. ECOLOGY AND EVOLUTIONARY BIOLOGY
OVERVIEW
Evolution is the central organizing theme in biology (e.g. Roughgarden, 1979),
and its manifestation in the relationships among types of organisms spans
levels of organization, and reaches out from biology to earth and social
sciences. Thus, the core problems in ecology and evolution run the gamut from
those that address fundamental biological issues to those that address the role
of science in human affairs. Fundamental challenges facing ecologists and
evolutionary biologists relate to the threats of the loss of biological
diversity, global change, and the search for a sustainable future, as well as
to the continued search for an understanding of the biological world and how it
came to assume its present form. To what extent is the organization of the
biological world the predictable and unique playing out of the fundamental
rules governing its evolution, and to what extent has it been constrained by
historical accident? How are the interactions among species, ranging from the
tight interdependence of host and parasite to the more diffuse connections
among plant species in a forest, manifested in their coevolutionary patterns
and life history evolution? What are the evolutionary relationships among closely
related species, in terms of their shared phylogenetic histories? How do human
influences, such as the use of antibiotics and pesticides, exploitation of
fisheries and land, and accelerated patterns of global change, influence the
evolutionary dynamics of species and patterns of invasion? To what extent can
an evolutionary perspective help us to prepare for the future, in terms of
understanding what species might be best suited to new environments? The latter
is important both in terms of natural patterns of change, and deliberate
manipulations through breeding and species introductions.
Among the central issues are those relating to biodiversity (Tilman, 1994): how
it is maintained, how it supports ecosystem services, likely patterns of
change, and steps to preserve it. This leads to a fundamental set of core
issues, both in terms of their importance, and in terms of their ripeness for
success:
What factors maintain biodiversity? How can new approaches to phylogenetic analyses, in clarifying the evolutionary relationships within and among species, help us to understand how we should measure biodiversity? How are ecosystems organized into functional groups, ecologically and evolutionarily, and how does that organization translate into the maintenance of critical ecosystem processes, such as productivity and biogeochemical cycles, as well as climate mediation, sequestering of toxicants, and other issues of importance to human life on earth?
What are the connections between the physical and biological parts of the global biosphere, and the multiple scales of space, time and organizational complexity on which critical processes are played out? (Bolker et al, 1995) In particular, how are individual plants influenced by changes in atmospheric patterns; and, more difficult, how do those effects on individual plants feed back to influence regional and global patterns of climate and biological diversity? How do effects on phytoplankton and zooplankton relate to each other, and to the broader patterns that may be observed?
How do patterns of population growth and resource use, as well as the profligate use of antibiotics, contribute to the emergence and reemergence of deadly new diseases, many of them antibiotic resistant? (Ewald, 1995) Are there approaches to management of the diversity of those diseases, guided by both an evolutionary and an ecosystem perspective, that can reduce the threat and provide new strategies for mitigation?
The history of the management of our sources of food and
fiber is not one of unmitigated successes, and many of these crucial resources
are threatened to a level that they will be unable to support the needs of
humanity in the coming decades. The prospect of large-scale alterations of the
earth's physical and biological systems creates a potential conflict between
human needs, desires and capabilities. (Walters and Parma, 1996; Walters and
Maguire, 1996) This situation is further complicated by the limitations of our
understanding and ability to control complex biological systems. We must
develop methods for decision-making and management that are appropriate for an
uncertain future. (Hilborn et al, 1995)
In all of these issues, there are a variety of cross-cutting themes, some
biological, some methodological or conceptual. From a biological point of view,
the essential point is that all that we see has been shaped by evolutionary
processes; from an ecological point of view, it is that organisms do not exist
in isolation, but have existed within the context of other species and an
abiotic environment, making essential an ecosystem perspective on issues
ranging from the management of diseases to the management of our global
surroundings. Indeed, a central challenge is to understand how the properties
even of ecosystems, those loose assemblages of species in particular habitats,
can be understood in terms of the diffuse coevolution of the components within
very open systems.
From a modeling point of view, a fundamental issue remains how to deal with
variation within as well as variation among units, for example in the
importance of heterogeneity in evolutionary processes or infectious transfer.
The interplay among processes operating on very different scales also pervades
these questions, from evolution through global change. And finally, techniques
for simplification, and for relating behaviors at the level of individuals to
macroscopic descriptions, provide the tools for making the essential
connections.
Progress in all of these research areas will derive from the application of a
suite of approaches, ranging from explicit spatial and stochastic simulations
to more compact (Durrett and Levin, 1994) mathematical descriptions that allow
analysis and simplification. Recent advances in computer technology have opened
up the possibility of including much more biological detail than ever before in
simulation approaches.
This detail comes at a cost, however. The ability to generate information does
not equal understanding, and the mathematical challenge is to develop
techniques which can include the essential details driving the complex models,
while allowing an understanding of the features driving the biological behavior
at a deeper level that will allow generalization. This will require both close
attention to the underlying biological details and fundamental mathematical
progress in taking appropriate limits and achieving manageable simplification
of complex, spatially explicit, stochastic models.
Below, we focus on modeling opportunities in some of the specific subfields in
the general areas of ecology and evolution.
While evolution is the great unifying principle underlying
all of biology, evolutionary genetics forms the foundation of evolution.
Challenging mathematical and computational applications in this critical area
range from the development of theoretical frameworks from which to infer the operation
of evolutionary mechanisms such as natural selection at the molecular level
through the organismal level, to understanding the genetic basis of
interactions among species.
One critical area, still in its infancy, concerns the identification and genetic
analysis (Coyne et al, 1991) of genes that play key roles in species and
environmental interactions. The mapping of such quantitative trait loci
consists of three interrelated inference problems: detecting the effects of
these loci, determining the number of major loci affecting a trait, and
locating them relative to genomic markers. A complete solution thus involves
problems of testing, model selection, and estimation. Once ecological and
genetic analysis of traits limiting adaptive responses is complete, it will be
possible to address crucial evolutionary questions such as the relative
importance of gene flow, genetic trade-offs, and genetic constraints.
A second exciting area concerns life history evolution, which often focuses
upon the timing of life history events or the allocation of organismal
resources and time among conflicting demands such as longevity and fecundity.
Evolution of these traits can be studied from quantitative genetic descriptions
in which transient dynamics are explored (Tuljapulkar and Wiener, 1995), while
the selective environment is reduced to a selection gradient. Alternatively,
the nature of the environment's selective effect on a trait can be explored
through optimization approaches. There is a pressing need for more complex
formulations, such as models bridging the gap between problems of allocation and
timing, models explicitly incorporating how genes act at different ages and
over time (Charlesworth, 1994), models at the interface between life history
evolution and behavior (Charlesworth, 1994), and models examining how life
histories are influenced by temporal and spatial variation in the environment
(Tuljapulkar, 1994).
Beyond the species level, the coevolutionary dynamics of the quantitative
traits that are often involved in species interactions pose many challenges and
opportunities to theoretical, computational, and mathematical biologists that
cut across all areas of ecology and evolution. For example, the study of the
evolution of virulence (Frank, 1993, 1994) in insect-parasitoid-host systems
and fungi-virus interactions in plants and the study of mechanisms of
specialization and the analysis of hybrid zones are part of the cutting-edge
research being conducted at the interface of biology and the mathematical
sciences.
With the rapid accumulation of sequence data for entire genomes, we are now
poised to analyze the set of genes, their order and organization, codon usage,
etc. across taxa (Griffiths and Tavare, 1996) and how and perhaps why this has
evolved over time. (Thorne et al, 1992) This requires an increased
ability to model how information is represented and acted upon in biological
systems (Griffiths and Tavare, 1996) based on tools from such fields as
discrete mathematics, combinatorics, and formal languages. Novel, perhaps
ad-hoc formulations are needed to form the mathematical basis of genomic
analyses because classical quantitative formulations of notions such as
information, similarity, and classification - all inextricably related to
biology - are inadequate. Correspondingly, methods for organizing vast sequence
data into data structures and databases suited for the most efficient data
storage and access are needed, along with improved algorithms for sequence
analysis and the identification of homologies among sequences.
Population genetic surveys of the genetic structure of natural populations are
a critical tool from which to deduce the evolutionary history of, and
evolutionary forces at work in, natural populations. Current population genetic
theory and data analysis methods are largely based upon single or a few genetic
loci, each with two alternate forms (alleles). Current data, however, typically
includes the genetic makeup at a large number of genetic markers which, with
the advent of new molecular techniques such as the polymerase chain reaction,
are increasingly hypervariable with a large number of alternate forms
segregating at each. New theoretical frameworks and statistical methods are
needed to extract and utilize the full evolutionary information contained in
these complex data sets.
Virtually all important questions in conservation biology
require making predictions, so theory and mathematical methods have played and
will continue to play a central role. Although many of the underlying
scientific issues have been defined during the past decade, many questions
remain to be resolved. What species would be lost in the wake of an invasion,
and what are the effects on ecosystem function? For example, what are the consequences
of the replacement of native fish species by introduced species? Substantial
progress is likely (and needed) in the near future in understanding the
dynamics of invading exotic species, determining more carefully the role
genetics plays in the dynamics of rare or endangered species, and in the
ecological dynamics of threatened species.
Theoretical studies have focused on the population size or characteristics
needed to allow species to maintain the genetic diversity necessary to allow
long term persistence. (Lande, 1993, 1994) These studies have shown that a
sufficiently large effective population size is required, but further work is needed to understand
how effective population size is related to actual population size and
structure and life history characteristics -- what can actually be observed.
These lead to interesting mathematical challenges dealing with structured
populations, and with integrating ecological and genetical models.
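Two textbook approximations give a feel for how far effective size can fall below the census count. The sketch below is illustrative only: it uses the standard formulas for an unequal breeding sex ratio, Ne = 4*Nm*Nf/(Nm + Nf), and for fluctuating population size (the harmonic mean across generations). The function names and the example numbers are hypothetical, and neither formula accounts for the age or spatial structure discussed above.

```python
def ne_unequal_sex_ratio(breeding_males, breeding_females):
    """Effective size for a population with unequal numbers of breeding males and females."""
    total = breeding_males + breeding_females
    return 4.0 * breeding_males * breeding_females / total

def ne_fluctuating(sizes):
    """Effective size over several generations: the harmonic mean of per-generation sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)

# Hypothetical populations, each with a census size of 100 breeders.
balanced = ne_unequal_sex_ratio(50, 50)
skewed = ne_unequal_sex_ratio(10, 90)

# Hypothetical census sizes over three generations, including one bottleneck.
bottlenecked = ne_fluctuating([100, 100, 10])
```

In both cases the effective size is dominated by the scarcer sex or the smallest generation, which is why census counts alone can be a poor guide to genetic risk.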
The impact of invading exotic species on existing native ecological communities
and species is perhaps the most important conservation issue today (OTA, 1993).
There has been almost no development of theories predicting rates of spread of
species within the context of even simple communities, and the related
mathematical problems of coupled reaction diffusion equations are challenging
as well. Although the basic mathematical models of spatial spread can be traced
at least as far back as Fisher (1937), recent work has shown that the situation
is far more complex, as rates of spread can vary by at least an order of
magnitude as model assumptions are changed. (e.g. Lewis & Kareiva, 1993;
Zadocks and Van Den Bosch, 1994) Further work should lead to robust
quantitative predictions of rates of spread.
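For the simplest single-species case, Fisher's model u_t = D*u_xx + r*u*(1 - u), the classical result is that a front started from a localized introduction advances asymptotically at speed 2*sqrt(r*D). The sketch below only evaluates that formula. The function name and the parameter values are hypothetical, and the community-level and long-distance dispersal effects behind the sensitivity noted above lie outside this model.

```python
import math

def fisher_front_speed(growth_rate, diffusion):
    """Asymptotic front speed 2*sqrt(r*D) for u_t = D*u_xx + r*u*(1 - u)."""
    if growth_rate < 0 or diffusion < 0:
        raise ValueError("growth_rate and diffusion must be non-negative")
    return 2.0 * math.sqrt(growth_rate * diffusion)

# Hypothetical units: r per year, D in km^2 per year, speed in km per year.
baseline = fisher_front_speed(growth_rate=1.0, diffusion=1.0)
faster = fisher_front_speed(growth_rate=1.0, diffusion=100.0)
speed_ratio = faster / baseline
```

Because the speed scales with the square root of each parameter, a hundredfold change in the diffusion coefficient is needed to change the predicted speed tenfold within this model.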
In recent years there has been an abrupt shift in management
philosophy. (Hilborn et al, 1995) The old goal of managing individual
species in order to reach and maintain optimal conditions has been replaced by
a new goal of maintaining ecosystem function and adapting to new conditions or
changes in the system. This shift reflects a more mature attitude towards
nature that recognizes the limitations of our knowledge and capabilities, the
importance of interactions between species and an appreciation of the dangers
of a command and control mode of operation.
This new approach to management makes it possible to apply elements of the
scientific method in a new and significant context: we may design experimental
management schemes to provide information that is required to improve the
management process and adapt to changes, even unforeseen changes. This new
approach challenges our mathematical and statistical skills. Successful
adaptation requires effective and timely organization of data through estimation
of parameters that affect system dynamics, including the dynamics of our
learning. That information then must be translated into an assessment of the
likely consequences of management strategies and actions.
The major challenges facing the human species cannot be met by a reductionist
or piecemeal approach. Instead we must muster all of our ingenuity and
resources to learn about the behavior of intact natural systems under stress
and perturbation, and adapt our human institutions to a finite and vulnerable
world.
Climate change and associated
changes in greenhouse gases have made imperative the examination of the
potential impacts on natural systems, and associated feedbacks. Advances in
computational capabilities have made possible the construction of detailed
individual-based models that take account of the responses of individual trees
to changes in environmental conditions, and their mutual effects. Yet such
models are tremendously data-hungry, and have great potential for error
propagation. To make their predictions robust, and to allow those predictions
to be interfaced with the much broader scale predictions of climate models, and
the masses of broad scale information that are becoming available from remote
sensing, we must find ways to reduce dimensionality and simplify those overly
detailed models. Similar comments apply to models of other systems, such as the
aggregation of social organisms from cellular slime molds to marine and
terrestrial invertebrates and vertebrates. Methods such as moment closure and
hydrodynamic limits, borrowed from other disciplines, are proving remarkably
promising, especially when coupled with experimental approaches (Levin and
Pacala, 1996).
This represents one of the most challenging and important issues in ecosystem
science. At the same time, masses of data are becoming available from global
observation systems, and critical experiments are providing understanding of
the linkages between ecosystem structure and function, and in particular the
role of biodiversity in maintaining system processes. The next 5-10 years hold
remarkable potential for integrated theoretical, empirical and computational
approaches to elucidate profound and important issues (Field, 1992; Bolker,
1995).
Infectious disease dynamics, studied mathematically for a century, is one of the oldest and most successful subjects in mathematical biology, and has seen powerful advances in recent years in mathematical theory, and in the application of that theory to management strategies (see, for example, Anderson and May, 1991). Much of the literature has assumed homogeneous mixing, so that every individual is equally likely to infect every other individual; but such models are inadequate to describe the central qualitative features of most diseases, especially those that are sexually transmitted, or for which spatial or socioeconomic structure localizes interactions. The classical work of Hethcote and Yorke (1984) on core-group dynamics highlighted the importance of such effects, and formed the basis upon which much recent work rests. Such work, involving spatial structure, frequency and density dependence, and behavioral factors, has not only forced us to revise old paradigms, but has reenergized the interplay among nonlinear dynamics, ecology and epidemiology.
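The homogeneous-mixing assumption can be made concrete with a minimal sketch of the standard SIR model, in which susceptible, infected and recovered fractions obey dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I. The function name simulate_sir, the forward-Euler scheme, and all parameter values are assumptions chosen for illustration. The core-group and spatial effects described above are exactly what this baseline omits.

```python
def simulate_sir(beta, gamma, i0=1e-3, dt=0.01, t_end=200.0):
    """Forward-Euler integration of the homogeneous-mixing SIR model (population fractions).

    Returns the final (s, i, r) fractions and the peak infected fraction.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(int(t_end / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return s, i, r, peak

# Hypothetical rates per day; the basic reproduction number is beta / gamma.
s_a, i_a, r_a, peak_a = simulate_sir(beta=0.5, gamma=0.25)   # ratio 2: epidemic
s_b, i_b, r_b, peak_b = simulate_sir(beta=0.2, gamma=0.25)   # ratio 0.8: fade-out
```

When beta/gamma exceeds one the infection spreads through a large fraction of the population; below one it dies out without growing. Structured models replace the single rate beta with a contact matrix or a spatial kernel.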
VI. CROSS-CUTTING ISSUES
MATHEMATICAL AND COMPUTATIONAL ISSUES SPANNING ALL DOMAINS
RELATIONSHIP BETWEEN SIMULATION AND MATHEMATICS
The revolution in computer technology enables us to perform complex simulations
only dreamed of a decade ago. Effective use of this technology requires
substantial use of mathematics throughout all stages of the simulation process:
the quantitative (or qualitative) formulation of models, the design of appropriate
data types and algorithms, translation of models into efficient computer
implementations, estimation of parameter values, visualization of the output,
and comparison of simulation results with results of further experimentation.
Mathematics is also essential in the critical step of developing algorithms
that compute important properties of models without recourse to numerical
simulation.
Furthermore, mathematics can significantly enhance our understanding of
processes that are studied through simulation. For example, theories of
dynamical systems describe patterns that are widespread, so much so that they
have been called "universal." The elucidation of such recurring
patterns is a central part of mathematics. Mathematics provides a common language,
a context that gives meaning to simulation results and a firm foundation for
the algorithmic infrastructure of simulation. Such a foundation ensures that
simulation methods are generalizable and capable of generating predictions.
Moreover, theory can serve as a basis for reducing models without loss of
information, thereby improving the efficiency of large-scale simulations.
MAJOR CHALLENGING ISSUES THAT SPAN ALL AREAS OF MODELING SYSTEMS
A. Integrating data and developing models of complex systems across multiple spatial and temporal scales.
B. Structure-function relationships
C. Image analysis and visualization
D. Basic mathematical issues
E. Data management
VII. EDUCATIONAL ISSUES
As noted above, mathematical analysis and computer modeling have become
indispensable tools in biology in recent years. These techniques have had a
major impact in areas ranging from ecology and population biology to
neurosciences to gene and protein sequence analysis and three-dimensional
molecular modeling. Mathematical and modeling techniques make it possible to
analyze and interpret enormous amounts of data, yielding information and
revealing patterns and relationships that would otherwise remain hidden.
Given the essential role that mathematical and modeling techniques play in so
many diverse areas of biology, there is a clear need for appropriate training
opportunities in computational, mathematical, and theoretical biology. Suitable
and practical mechanisms to encourage and nurture training in computational
biology might include 1) graduate training grant programs that involve faculty
engaged in both computational and experimental approaches, 2) postdoctoral
fellowships to encourage mathematicians and computational scientists to pursue
research training in biology, and to enable biologists to acquire computational
and modeling skills, and 3) summer workshops and short courses to help
practicing biologists, mathematicians, and computational scientists to begin to
bridge the gap between these rather diverse disciplines.
In addition to training of computational biology specialists, there is a clear
and dramatic need for enhanced training in mathematics and computational
methods for biological science students or others who might enter the workforce
in any scientific discipline. A systematic approach, beginning at the K-12
level, that emphasizes the importance of mathematics and modeling in biology
activities (as outlined in the National Science Standards) would help ensure
that students are better prepared to utilize mathematical approaches in
undergraduate biology curricula, and less likely to avoid mathematically
rigorous courses in undergraduate programs because of weak mathematics
backgrounds or "math phobia". Improved mathematics training at the
earliest levels will also likely increase the number of students interested in
pursuing graduate study in interdisciplinary areas of mathematical and
computational biology. Greater emphasis on mathematics and computational
studies at the K-12 and undergraduate levels can also be coupled effectively
with programs to encourage women and underrepresented minorities to pursue
careers in science, especially in interdisciplinary areas that bridge the
biological, mathematical, and computational sciences.
Finally, it should be recognized that computer simulations and mathematical
modeling tools can be effective teaching aids in the biological sciences.
Topics like protein structure-function relationships benefit greatly from
interactive, three-dimensional graphics demonstrations. Computer simulations
and animations based on mathematical models can be an extremely effective way
to illustrate the behavior and properties of complex systems, ranging from
protein-ligand interactions to migration behavior of large animal populations.
Therefore, inclusion of mathematical and computational course work as a logical
and sequential theme articulated in K-12 and undergraduate curricula will
likely have far-reaching benefits for biology education.
VIII. REFERENCES
Alon, R., Hammer, D. A., and Springer, T. A. (1995). Lifetime of
the P-Selectin-Carbohydrate Bond and Its Response to Tensile Force in
Hydrodynamic Flow. Nature 374, 539-542.
Alt, W., Deutsch, A., and Dunn, G., eds. (1996). Mechanisms of
Cell and Tissue Motion. Birkhaeuser Verlag, Basel.
Anderson, R.M. and May, R.M. (1991). Infectious Diseases of
Humans. Oxford Univ. Press.
Arnold, B. and Rossmann, M.G. (1986). Effect of Errors,
Redundancy, and Solvent Content in the Molecular Replacement Procedure for the
Structure Determination of Biological Macromolecules. Proc. Natl. Acad. Sci. USA 83, 5489-5493.
Asilomar (1995). Proteins 23, 295-460.
Bashford, D. and Karplus, M. (1990). pKa's of Ionizable
Groups in Proteins - Atomic Detail from a Continuum Electrostatic Model.
Biochemistry 29, 10219-10225.
Berendsen, H. J. C. (1996). Bio-molecular Dynamics Comes of Age. Science 271, 954-955.
Berg, H. (1995). Torque Generation by the Flagellar Rotary
Motor. Biophys. J. 68 (4 Suppl), 163S-166S.
Bolker, B. M., Pacala, S.W., Canham, C., Bazzaz, F. and Levin, S.A.
(1995) Species Diversity and Ecosystem Response to Carbon Dioxide
Fertilization: Conclusions from a Temperate Forest Model. Global Change Biology
1, 373-381.
Bourret, R. B., Borkovich, K. A. and Simon, M. I. (1991). Signal
Transduction Pathways Involving Protein Phosphorylation in Prokaryotes. Ann.
Rev. Biochem. 60, 401-441.
Bower, J.M., Guest Editor (1992). Special Issue: Modeling the
Nervous System. Trends In Neuroscience 15, #11.
Bowie, J. U., Luthy, R. and Eisenberg, D. (1991). A Method to Identify
Protein Sequences That Fold into a Known Three-Dimensional Structure. Science
253, 164-170.
Brodland, G. (1994). Finite Element Methods for Developmental
Biology. In International Review of Cytology, 150, Academic Press, Inc. pp.
95-118.
Bray, D. (1995). Protein Molecules as Computational Elements in
Living Cells. Nature 376, 307-312.
Bryant, S. H. and Lawrence, C. E. (1993). An Empirical Energy
Function for Threading Protein Sequence Through the Folding Motif. Proteins 16,
92-112.
Charlesworth, B. (1994). Evolution in Age-Structured Populations.
Cambridge University Press, 2nd edition.
Coyne, J.A., Aulard, S. and Berry, A. (1991). Lack of
Underdominance in a Naturally Occurring Pericentric Inversion in
Drosophila melanogaster and Its Implications for Chromosome Evolution. Genetics
129, 791-802.
Coyne, J.A., Charlesworth, B. and Orr, H.A. (1991). Haldane's
Rule Revisited. Evolution 45, 1710-1714.
Davidson, L., Koehl, M., Keller, A. and Oster, G. (1995). How Do
Sea Urchins Gastrulate? Distinguishing Between Mechanism of Primary
Invagination Using Biomechanics. Development 121, 2005-2018.
Dembo, M. (1989). Mechanics and Control of the Cytoskeleton in
Amoeba proteus. Biophys. J. 55, 1053-1080.
Doering, C., Ermentrout, B. and Oster, G. (1995). Rotary DNA
Motors. Biophys. J. 69, 2256-2267.
Durrett, R. and Levin, S.A. (1994). Stochastic Spatial Models: A
User's Guide to Ecological Applications. Phil. Trans. R. Soc. Lond. B 343,
329-350.
Easterwood, T. R., Major, F., Malhotra, A. and Harvey, S. C.
(1994). Orientations of Transfer RNA in the Ribosomal A and P Sites. Nucl.
Acids Res. 22, 3779-3789.
Ewald, P.W. (1995). The Evolution Of Virulence - A Unifying Link
Between Parasitology and Ecology. J. Parasitology 81, 659-669.
Eisenberg, D. and McLachlan, A. D. (1986). Solvation Energy in
Protein Folding and Binding. Nature 319, 199-203.
Ellington, C.P. and Pedley, T.J., eds. (1995). Biological Fluid
Dynamics. Company of Biologists Limited, Cambridge UK
Field, C.B., Chapin III, F. S., Matson, P. A. and Mooney, H. A.
(1992). Responses of Terrestrial Ecosystems to the Changing Atmosphere: A
Resource-Based Approach. Ann. Rev. Ecol. Syst. 23, 201-235.
Findlay, J. B. C. (1996). Membrane Protein Models. BIOS, Oxford
Fisher, R.A. (1937). The Wave of Advance of Advantageous Genes.
Ann. Eugen. (Lond.) 7, 355-369.
Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., et al. (1995).
Whole-Genome Random Sequencing and Assembly of Haemophilus influenzae Rd.
Science 269, 496-512.
Frank, S.A. (1993). Evolution of Host-Parasite Diversity.
Evolution 47, 1721-1732.
Frank, S.A. (1994). Coevolutionary Genetics of Host and Parasites
with Quantitative Inheritance. Evolutionary Ecology 8, 74-94.
Gao, J. (1996). Hybrid Quantum and Molecular Mechanical
Simulations - An Alternative Avenue to Solvent Effects in Organic Chemistry.
Acc. Chem. Res. 29, 298-305.
Gilson, M. K., McCammon, J. A. and Madura, J. D. (1995).
Molecular Dynamics Simulation with a Continuum Electrostatic Model of the
Solvent. J. Comput. Chem. 16, 1081-1095.
Goldstein, B., and Wofsy, C., eds. (1994). Lectures on
Mathematics in the Life Sciences 24: Cell Biology. American Mathematical
Society, Providence, RI.
Golomb, D., Wang, X-J, and Rinzel, J. (1996). Propagation of
Spindle Waves in a Thalamic Slice Model. J. Neurophysiol. 75, 750-769.
Griffiths, R.C. and Tavare, S. (1996). Computational Methods for
the Coalescent. IMA volume, P. Donnelly and S. Tavare, eds. In press.
Guida, W. C. (1994). Software for Structure-Based Drug Design.
Curr. Opin. Struc. Biol. 4, 777-781.
Harris-Warrick, R., Marder, E., Selverston, A. and Moulins, M.,
eds. (1992). Dynamic Biological Networks: The Stomatogastric Nervous System,
MIT Press
Hethcote, H.W. and Yorke, J.A. (1984) Gonorrhea: Transmission
Dynamics and Control. Lect. Notes in Biomath. 56, 1-105.
Hilborn, R., Walters, C.J. and Ludwig, D. (1995) Sustainable
Exploitation of Renewable Resources. Annual Review Of Ecology And Systematics
26, 45-67.
Ho, D. D., Neumann, A. U., Perelson, A. S., Chen, W., Leonard, J.
M. and Markowitz, M. (1995). Rapid Turnover of Plasma Virions and CD4
Lymphocytes in HIV-1 Infection. Nature 373, 123-126.
Honig, B. and Nicholls, A. (1995). Classical Electrostatics in
Biology and Chemistry. Science 268, 1144-1149.
Humphreys, D. D., Friesner, R. A. and Berne, B. J. (1994). A
Multiple Time-Step Molecular Dynamics Algorithm for Macromolecules. J. Phys.
Chem. 98, 6884-6892.
Jaeger, L., Michel, F., and Westhof, E. (1994). Involvement of a
GNRA Tetraloop in Long-Range Tertiary Interactions. J. Mol. Biol. 236,
1271-1276.
Jafri, S. M., and Keizer, J. (1994) Diffusion of Inositol
1,4,5-Trisphosphate But Not Ca2+ Is Necessary for a Class of Inositol 1,4,5-Trisphosphate-Induced
Ca2+ Waves. Proc. Natl. Acad. Sci. USA 91, 9485-9489.
Johnson, B. A. and Blevins, R. A. (1994). NMRView: A Computer
Program for the Visualization and Analysis of NMR Data. J. Biomol. NMR 4,
603-614.
Kearsley, S.K., Underwood, D.J., Sheridan, R.P. and Miller, M.D.
(1994). Flexibases - A Way to Enhance the Use of Molecular Docking Methods. J.
Comp. Assist. Mol. Design 8, 565-582.
Koch, C. and Segev, I., eds. (1989). Methods in Neuronal
Modeling: From Synapses to Networks, MIT Press, Cambridge MA. 2nd Edition, in
press 1996.
Kontoyianni, M. and Lybrand, T. P. (1993). Three Dimensional
Models for Integral Membrane Proteins: Possibilities and Pitfalls. Perspect.
Drug Disc. Design 1, 291-300.
Kopell, N. and LeMasson, G. (1994). Rhythmogenesis, Amplitude
Modulation, and Multiplexing in a Cortical Architecture. Proc. Natl. Acad. Sci.
USA 91, 10586-10590.
Kreusch, A. and Schulz, G. E. (1994) Refined Structure of the
Porin from Rhodopseudomonas blastica. Comparison with the Porin from Rhodobacter
capsulatus. J. Mol. Biol. 243, 891-905.
Lande, R. (1993). Risks of Population Extinction from Demographic
and Environmental Stochasticity. Am. Nat. 142, 911-927.
Lande, R. (1994). Risks of Population Extinction from New
Deleterious Mutations. Evolution 48, 1460-1469.
Lander, E.S. and Waterman, M.S. (1995). Calculating the Secrets
of Life, National Academy Press, Washington, D.C.
Lauffenburger, D. A. and Linderman, J. J. (1993). Receptors:
Models for Binding, Trafficking, and Signaling. Oxford University Press,
Oxford.
Leahy, D.J., Hendrickson, W.A., Aukhil, I. and Erickson, H.P.
(1992). Structure of a Fibronectin Type III Domain from Tenascin Phased by MAD
Analysis of the Selenomethionyl Protein. Science 258, 987-991.
Lee, C. and Subbiah, S. (1991). Prediction of Protein Side-Chain
Conformation by Packing Optimization. J. Mol. Biol. 217, 373-388.
Levitt, M. (1993). Accurate Modeling of Protein Conformation by
Automatic Segment Matching. J. Mol. Biol. 226, 507-533.
Levin, S.A. and Pacala, S.W. (1996). Theories of Simplification
and Scaling of Spatially Distributed Processes. In press, 1997. In Spatial
Ecology: The Role of Space in Population Dynamics and Interspecific
Interactions. D. Tilman and P. Kareiva, eds, Princeton University Press, Princeton
NJ.
Lewis, M.A. and Kareiva, P. (1993). Allee Dynamics and the Spread of
Invading Organisms. Theoretical Population Biology 43, 141-158.
Lyubchenko, Y., Shlyakhtenko, L., Harrington, R., Oden, P. and
Lindsay, S. (1993). Atomic Force Microscopy of Long DNA: Imaging in Air and
Under Water. Proc. Natl. Acad. Sci. USA 90, 2137-2140.
Misra, V.K., Hecht, J.L., Sharp, K.A., Friedman, R.A. and Honig,
B. (1993). Salt Effects on Protein-DNA Interactions - The Lambda-CI Repressor
and EcoRI Endonuclease. J. Mol. Biol. 238, 264-280.
Mogilner, A. and Oster, G. (1996). Cell Motility Driven by Actin
Polymerization. Biophys. J., in press.
Murray, A. and Hunt, T. (1993). The Cell Cycle: An Introduction.
New York, W.H. Freeman.
Murray, J. and Oster, G. (1984). Cell Traction Models for
Generating Pattern and Form in Morphogenesis. J. Math. Biol 19, 265-80.
Naranjo, D., Latorre, R., Cherbavaz, D., McGill, P., and
Schumaker, M. F. (1994). A Simple Model for Surface Charges on Ion Channels.
Biophysical J. 66, 59-70.
OTA (Office of Technology Assessment), (Sept, 1993). Harmful
Non-Indigenous Species in the United States. OTA-F-565. US Govt. Printing
Office, Washington D.C.
Oliver, T., Dembo, M. and Jacobson, K. (1995). Traction Forces in
Locomoting Cells. Cell Motil. Cytoskel. 31, 225-240.
Olsen, L., Sherratt, J. and Maini, P. (1995). A Mechanochemical
Model for Adult Dermal Wound Contraction and the Permanence of the Contracted
Tissue Displacement Profile. J. Theor. Biol. 177(2), 113-128.
Orengo, C. A., Swindell, M. B., Michie, A. D., Zvelebil, M. J.,
Driscoll, P. C., Waterfield, M. D. and Thornton, J. M. (1995). Structural
Similarity between Pleckstrin Homology Domain and Verotoxin: The Problem of
Measuring and Evaluating Structural Similarity. Prot. Sci. 4, 1977-1983.
Peskin, C. and Oster, G. (1995). Coordinated Hydrolysis Explains
the Mechanical Behavior of Kinesin. Biophys. J. 68(4), 202s-210s.
Rayment, I. and Holden, H. (1994). The Three-Dimensional Structure
of a Molecular Motor. TIBS 19, 129-134.
Ringe, D. and Petsko, G.A. (1996). A User's Guide to Protein
Crystallography. In Protein Engineering and Design, P.R. Carey, ed. Academic
Press, San Diego.
Roughgarden, J. (1979). Theory of Population Genetics and
Evolutionary Ecology: An Introduction. Macmillan, New York.
Rybenkov, V.V., Cozzarelli, N.R. and Vologodskii, A.V. (1993).
Probability of DNA Knotting and the Effective Diameter of the DNA Double Helix.
Proc. Natl. Acad. Sci. USA 90, 5307-5311.
Schlick, T. and Olson, W.K. (1992). Supercoiled DNA Energetics
and Dynamics by Computer Simulation, J. Mol. Biol. 223, 1089-1119.
Scholey, J. (1994). Kinesin-Based Organelle Transport. In Modern
Cell Biology: Microtubules. J. S. Hyams and C. W. Lloyd, eds. New York,
Wiley-Liss. 13: pp. 343-365.
Senderowitz, H., Guarnieri, F. and Still, W. C. (1996). A Smart
Monte Carlo Technique for Free Energy Simulations of Multiconformational
Molecules, Direct Calculation of the Conformational Populations of Organic
Molecules. J. Amer. Chem. Soc. 117, 8211-8219.
Shadlen, M. and Newsome, W. (1994). Noise, Neural Codes and
Cortical Organization. Curr. Opin. Neurobiol. 4, 569-579.
Silver, R.B. Calcium, BOBs, Microdomains and a Cellular Decision:
Control of Mitotic Cell Division in Sand Dollar Blastomeres, Cell (in press).
Simmons, A. H., Michal, C. A. and Jelinski, L. W. (1996).
Molecular Orientation and Two-Component Crystalline Fraction of Spider Dragline
Silk. Science 271, 84-87.
Smith, K. C. and Honig, B. (1994). Evaluation of the
Conformational Free Energies of Loops in Proteins. Proteins: Structure,
Function, and Genetics 18, 119-132.
Smith, S.B., Cui, Y. and Bustamante, C. (1996). Overstretching
B-DNA: The Elastic Response of Individual Double-Stranded and Single-Stranded
DNA Molecules, Science 271, 795-799.
Softky, W.R. (1995). Simple Codes Versus Efficient Codes.
(Commentary) Curr. Opin. Neurobiol. 5, 239-247.
Softky, W.R. and Koch, C. (1993). The Highly Irregular Firing of
Cortical Cells is Inconsistent with Temporal Integration of Random EPSPs. J.
Neuroscience 13, 334-350.
Stasiak, A., et al. (1996). Determination of DNA Helical
Repeat and of the Structure of Supercoiled DNA by Cryo-Electron Microscopy. In
Mathematical Approaches to Biomolecular Structure and Dynamics, IMA Proceedings
82, Springer Verlag, New York, p. 117.
Steinhoff, H. J., Mollaaghababa, R., Altenbach, C., Khorana, H.
G. and Hubbell, W. L. (1994). Site-Directed Spin Labeling Studies of Structure
and Dynamics in Bacteriorhodopsin. Biophys. Chem. 56, 89-94.
Stuart, G.J. and Sakmann, B. (1994). Active Propagation of
Somatic Action Potentials into Neocortical Pyramidal Cell Dendrites. Nature
367, 69-72.
Sumners, D.W., Ernst, C., Spengler, S.J. and Cozzarelli, N.R.
(1995). Analysis of the Mechanism of DNA Recombination Using Tangles, Quarterly
Reviews of Biophysics 28, 253-313.
Svoboda, K. and Block, S. (1994). Force and Velocity Measured for
Single Kinesin Molecules. Cell 77, 773-84.
Thorne, J.L., Kishino, H. and Felsenstein, J. (1992). Inching
Toward Reality: An Improved Likelihood Model of Sequence Evolution. J. Mol.
Evolution 34, 3-16.
Tilman, D. (1994) Competition and Biodiversity in Spatially
Structured Habitats. Ecology 75, 2-16.
Tirrell, J. G., Fournier, M. J., Mason, T. L. and Tirrell, D. A.
(1994). Biomolecular Materials. Chem. Eng. News, December 19, 40-51.
Tranquillo, R. T., and Alt, W. (1996). Stochastic Model of
Receptor-Mediated Cytomechanics and Dynamic Morphology of Leukocytes. J. Math.
Biol. 34, 361-412.
Tranquillo, R. and Murray, J. D. (1993). Mechanistic Model of
Wound Contraction. J. Surg. Res. 55, 233-47.
Tuljapurkar, S. and Wiener, P. (1994). Migration in Variable Environments:
Exploring Life History Evolution Using Structured Population Models. J. Theor.
Biol. 166, 75-90.
Tuljapurkar, S. (1994). Stochastic Demography and Life Histories.
In Frontiers in Mathematical Biology, S.A. Levin, ed. Springer-Verlag, Berlin,
pp. 254-262.
Tyson, J. J., Novak, B., Odell, G. M., Chen, K., and Thron, C. D.
(1996). Chemical Kinetic Theory: Understanding Cell-Cycle Regulation. Trends in
Biochemical Sciences 21, 89-96.
Walters, C. and Maguire, J.J. (1996). Lessons For Stock
Assessment from the Northern Cod Collapse. Reviews In Fish Biology And
Fisheries 6, 125-137.
Walters, C. and Parma, A.M. (1996). Fixed Exploitation Rate
Strategies for Coping with Effects of Climate Change. Canadian Journal Of
Fisheries And Aquatic Sciences 53, 148-158.
Williams, N. (1996). Yeast Genome Sequence Ferments New Research.
Science 272, 481.
White, J.H. (1992). Geometry and Topology. In Proceedings of
Symposia in Applied Mathematics 45, American Mathematical Society, Providence,
R.I., 17.
Whittington, M.A., Traub, R.D., and Jefferys, J.G.R. (1995).
Synchronized Oscillations in Interneuron Networks Driven by Metabotropic
Glutamate Receptor Activation. Nature 373, 612-615.
Wofsy, C., Kent, U. K., Mao, S-Y., Metzger, H., and Goldstein, B.
(1995). Kinetics of Tyrosine Phosphorylation When IgE Dimers Bind to Fcε Receptors
on Rat Basophilic Leukemia Cells. J. Biol. Chem. 270, 20264-20272.
York, D.M., Yang, W.T., Lee, H., Darden, T. and Pedersen, L.G.
(1995). Toward the Accurate Modeling of DNA - The Importance of Long-Range
Electrostatics. J. Amer. Chem. Soc. 117, 5001-5002.
Zadoks, J.C. and Van Den Bosch, F. (1994). On The Spread Of Plant
Disease - A Theory On Foci. Annual Review Of Phytopathology 32, 503-521.