Mathematical Approaches to Molecular Structural Biology
Ebook, 571 pages, 3 hours


About this ebook

Mathematical Approaches to Molecular Structural Biology offers a comprehensive overview of the mathematical foundations behind the study of biomolecular structure. Initial chapters provide an introduction to the mathematics associated with the study of molecular structure, such as vector spaces and matrices, linear systems, matrix decomposition, vector calculus, probability and statistics. The book then moves on to more advanced areas of molecular structural biology based on the mathematical concepts discussed in earlier chapters. Here, key methods such as X-ray crystallography and cryo-electron microscopy are explored, in addition to biomolecular structure dynamics within the context of mathematics and physics.

This book equips readers with an understanding of the fundamental principles behind structural biology, providing researchers with a strong groundwork for further investigation in both this and related fields.

  • Includes a detailed introduction to key mathematical principles and their application to molecular structural biology
  • Explores the mathematical underpinnings behind advanced techniques such as X-ray crystallography and cryo-electron microscopy
  • Features step-by-step protocols that illustrate mathematical and statistical principles for studying molecular structure and dynamics
  • Provides a basis for further investigation into the field of computational molecular biology
  • Includes figures and graphs throughout to visually demonstrate the concepts discussed
Language: English
Release date: Nov 19, 2022
ISBN: 9780323906630
Author

Subrata Pal

Subrata Pal obtained his bachelor’s and master’s degrees, both in physics, from Calcutta University. Subsequently, he pursued his predoctoral research in molecular biology and received a PhD degree from the same university in 1982. He carried out his postdoctoral research in DNA replication and gene expression at two of the premier institutions in the USA: the National Institutes of Health, Bethesda, Maryland, and Harvard Medical School, Boston, Massachusetts. At Harvard, he received a Claudia Adams Barr special investigator award for basic contribution to cancer research. Professor Pal has long teaching experience at the undergraduate and graduate levels; the areas of his teaching include physics, molecular biology, and genomics.

    Mathematical Approaches to Molecular Structural Biology

    Subrata Pal

    Table of Contents

    Cover image

    Title page

    Copyright

    Dedication

    About the author

    Preface

    Acknowledgments

    Table of symbols

    Chapter 1. Mathematical preliminaries

    Abstract

    1.1 Functions

    1.2 Vectors

    1.3 Matrices and determinants

    1.4 Calculus

    1.5 Series and limits

    Exercise 1

    Further reading

    Chapter 2. Vector spaces and matrices

    Abstract

    2.1 Linear systems

    Exercises 2.1

    2.2 Sets and subsets

    Exercise 2.2

    2.3 Vector spaces and subspaces

    Exercise 2.3

    2.4 Linear combination/linear independence

    Exercise 2.4

    2.5 Basis vectors

    Exercise 2.5

    2.6 Dimension and rank

    Exercise 2.6

    2.7 Inner product space

    Exercise 2.7

    2.8 Orthogonality

    Exercise 2.8

    2.9 Mapping and transformation

    Exercise 2.9

    2.10 Change of basis

    Exercise 2.10

    Further reading

    Chapter 3. Matrix decomposition

    Abstract

    3.1 Eigensystems from different perspectives

    Exercise 3.1

    3.2 Eigensystem basics

    Exercise 3.2

    3.3 Singular value decomposition

    Exercises 3.3

    Further reading

    Chapter 4. Vector calculus

    Abstract

    4.1 Derivatives of univariate functions

    4.2 Derivatives of multivariate functions

    4.3 Gradients of scalar- and vector-valued functions

    4.4 Gradients of matrices

    4.5 Higher-order derivatives – Hessian

    4.6 Linearization and multivariate Taylor series

    Exercise 4

    Further reading

    Chapter 5. Integral transform

    Abstract

    5.1 Fourier transform

    5.2 Dirac delta function

    5.3 Convolution and deconvolution

    5.4 Discrete Fourier transform

    5.5 Laplace transform

    Exercise 5

    Further reading

    Chapter 6. Probability and statistics

    Abstract

    6.1 Probability—definitions and properties

    6.2 Random variables and distribution

    6.3 Multivariate distribution

    6.4 Covariance and correlation

    6.5 Principal component analysis

    Exercise 6

    Further reading

    Chapter 7. X-ray crystallography

    Abstract

    7.1 X-ray scattering

    7.2 Scattering by an atom

    7.3 Diffraction from a crystal – Laue equations

    7.4 Diffraction and Fourier transform

    7.5 Convolution and diffraction

    7.6 The electron density equation

    Exercise 7

    Further reading

    Chapter 8. Cryo-electron microscopy

    Abstract

    8.1 Quantum physics

    8.2 Wave optics of electrons—scattering

    8.3 Theory of image formation

    8.4 Image processing by multivariate statistical analysis—principal component analysis

    8.5 Clustering

    8.6 Maximum likelihood

    Exercise 8

    Reference

    Further reading

    Chapter 9. Biomolecular structure and dynamics

    Abstract

    9.1 Comparison of biomolecular structures

    9.2 Conformational optimization

    9.3 Molecular dynamics

    9.4 Normal mode analysis

    Exercise 9

    References

    Further reading

    Index

    Copyright

    Academic Press is an imprint of Elsevier

    125 London Wall, London EC2Y 5AS, United Kingdom

    525 B Street, Suite 1650, San Diego, CA 92101, United States

    50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

    The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

    Copyright © 2023 Elsevier Inc. All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    ISBN: 978-0-323-90397-4

    For Information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

    Publisher: Andre G. Wolff

    Acquisitions Editor: Michelle Fisher

    Editorial Project Manager: Tracy I. Tufaga

    Production Project Manager: Swapna Srinivasan

    Elsevier Cover Designer: Vicky Pearson

    Cover Image Designer: Sharmishtha Pal

    Typeset by MPS Limited, Chennai, India

    Dedication

    To my parents

    About the author

    Subrata Pal

    Bachelor of Science in Physics, Calcutta University, Kolkata, India, 1972.

    Master of Science in Physics, Calcutta University, Kolkata, India, 1974.

    Ph.D. in Molecular Biology, Calcutta University, Kolkata, India, 1982.

    Assistant Professor, Physics, in a Calcutta University-affiliated college, 1979–82.

    Visiting Fellow, National Institutes of Health, National Cancer Institute, Bethesda, MD, United States, 1984–87.

    Fellow/Claudia Adams Barr Investigator, Harvard Medical School, Dana-Farber Cancer Institute, Boston, MA, United States, 1988–92.

    Assoc. Professor, Molecular Biology, Jadavpur University, Kolkata, India, 1993–2001.

    Professor, Molecular Biology, Genomics and Proteomics, Jadavpur University, Kolkata, India, 2001–15.

    Claudia Adams Barr special investigator award for basic contribution to cancer research at the Dana-Farber Cancer Institute (Harvard Medical School), Boston, MA, United States, 1991.

    Preface

    Subrata Pal

    With the discovery of the double-helical structure of DNA in 1953 and, a few years later, the structures of the proteins, myoglobin and hemoglobin, molecular biology rose to the status of molecular structural biology (MSB). Since then, scientific literature and databases have been flooded with biomolecular structural information for which three different physics-based techniques, namely, X-ray crystallography (XRC), nuclear magnetic resonance spectroscopy, and cryo-electron microscopy (cryoEM), have been essentially responsible. It is now possible to mechanistically explain and predict functions of biomolecules based on their structures.

    As technologies improved and molecules became discernible at the atomic scale, it was further revealed that biological macromolecules are intrinsically flexible and naturally exist in multiple conformations. The conformational flexibility is important for their function. Biomolecular functional dynamics has become amenable to investigations by the application of physical principles aided by mathematical and statistical tools. Furthermore, rapid developments in computer hardware and software have remarkably facilitated mathematical analysis of biomolecular structure and dynamics. Needless to say, mathematical approaches are not a total substitute for experimental investigations in MSB but, beyond any doubt, an unchallenged complement.

    Concomitantly, molecular biology courses are also going through a desirable transformation—they are no longer restricted to mere description of biological phenomena but adopting a more mechanistic format. Further, some of the interdisciplinary courses at the graduate level, such as structural biology, macromolecular machines, protein folding, molecular recognition, interaction, etc., being built on structural dynamics of biomolecules, are becoming intensely dependent on mathematics and statistics.

    It is true that most of these courses have mathematics as their prerequisites. However, it is also not unexpected that in such exclusive mathematics courses, students are exposed to a variety of topics which are important for different areas of science and technology but may not all be required for MSB. Consequently, it is likely that the students may not be able to do equal justice to all the topics and, in the process, those who would be going for MSB-related courses later discount the importance of the topics which, subsequently, turn out to be indispensable.

    The book has attempted to sort out and succinctly offer some topics in mathematics and statistics which build the foundation of a discourse in MSB and orient the reader in the appropriate direction. Further, in order to reinforce the orientation, the last three chapters of the book have illustrated how this mathematical background can be applied to a few of the most important MSB problems.

    It is well known that both XRC and cryoEM are capable of solving biomacromolecular structures at atomic resolution. Invariably, they are both dependent on rigorous mathematical and computational analysis. Chapter 7 has discussed the basic physics of X-ray diffraction and gone on to describe how diffraction data from macromolecular crystals are channelized into structural models based on the mathematical theory of Fourier transform and convolution.

    CryoEM is based on the scattering of electron waves by the object under investigation. In Chapter 8, the quantum physics of electron scattering has been briefly reviewed. Further, the chapter has discussed how the issues of heterogeneity and noise in cryoEM have been addressed by statistical approaches—principal component analysis and multivariate statistical analysis.

    Chapter 9 highlights the essential mathematics concerning some selected and widely used computational techniques to investigate biomolecular structure and dynamics. These include the quaternion approach to biomolecular structure comparison, molecular dynamics simulations to predict the movement of each atom in a molecular system, and normal mode analysis, which considers the constituent atoms of a biomolecule as a set of simple harmonic oscillators.

    Undoubtedly, each of the above stated topics in MSB deserves an entire book for the coverage of all its aspects. Nonetheless, the intention of the presentation has been to restrict to the fundamentals and care has been taken to avoid overloading.

    The book should be useful for the students at different levels, advanced undergraduate, graduate, or research, in regular molecular (structural) biology or related courses, who would not like to restrict themselves to the narrative aspects of biomolecular phenomena but take a keen interest in the mechanistic and mathematical aspects of MSB.

    Acknowledgments

    The book is dedicated to the memory of my father who introduced me to mathematics in my early childhood and my mother in whose tenderness I grew up.

    I respectfully express my indebtedness to the late Professor Binayak Dutta-Roy, from whom I learnt quantum mechanics.

    Plenty of thanks to Michelle Fisher, Megan Ashdown, Tracy Tufaga, and the entire Elsevier team for their unhesitating help and cooperation in addressing my queries and difficulties right from the inception of the project to the publication of the book.

    My wife has been very supportive of the effort in all possible ways.

    And, last but not least, I am extremely proud of my daughter Sharmishtha who has designed the cover page of the book.

    Table of symbols

    $x, y, z, a, b, c, \alpha, \beta, \gamma, \lambda$    scalars

    $\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{u}, \mathbf{v}, \mathbf{r}$    vectors

    $A, B, C$    matrices

    $\mathbf{x}^T$    transpose of a vector

    $A^T$    transpose of a matrix

    $A^{-1}$    inverse of a matrix

    $\mathbb{Z}$    integers

    $\mathbb{N}$    natural numbers

    $\mathbb{R}$    real numbers

    $\mathbb{C}$    complex numbers

    $\mathbb{R}^n$    n-dimensional vector space of real numbers

    $\mathbb{R}^{m \times n}$    m×n ordered array of real numbers

    $a := b$    a defined as b

    $a =: b$    b defined as a

    $\gg$    much greater than

    $>$    greater than

    $\geq$    greater than or equal to

    $\approx$    approximately equal to

    $\leq$    less than or equal to

    $<$    less than

    $\ll$    much less than

    $\Rightarrow$    implies

    $\Leftrightarrow$    if and only if

    Chapter 1

    Mathematical preliminaries

    Abstract

    Molecular structural biology is essentially based on structural dynamics-function correlation of biomacromolecules. The last three chapters of the book have attempted to illustrate the mathematical approaches to investigate this structure-function paradigm currently overwhelming the field of theoretical and computational molecular biology. This chapter reviews the fundamental mathematical concepts that are needed for the said purpose. To begin with, functions—algebraic, trigonometric, exponential and logarithmic, and complex—which are used to describe physical systems (including biological systems), have been recollected. This has been followed by the introduction of vectors and matrices that are of paramount importance in the study on cryo-electron microscopy of biomolecules, in particular, and biomolecular structure and dynamics, in general. The nature of changes in a function, with respect to a variable(s) on which it depends, has been discussed in the section on calculus.

    Keywords

    Functions; algebra; trigonometry; complex variable; vector; matrix; determinant; calculus; series

    Structural dynamics-function correlation of biomacromolecules is perhaps the most important theme of molecular structural biology. The last three chapters of the book have attempted to illustrate the mathematical approaches to investigate this structure-function paradigm currently overwhelming the field of theoretical and computational molecular biology. This chapter reviews the fundamental mathematical concepts that are needed for the said purpose. To begin with, functions—algebraic, trigonometric, exponential and logarithmic, and complex—which are used to describe physical systems (including biological systems), have been recollected. This has been followed by the introduction of vectors and matrices that are of paramount importance in the study on cryo-electron microscopy of biomolecules, in particular, and biomolecular structure and dynamics, in general. The nature of changes in a function, with respect to a variable(s) on which it depends, has been discussed in the section on calculus.

    1.1 Functions

    In the biological literature, amino acids are often symbolized by single letters. For example, the letter G denotes the amino acid glycine, K denotes lysine, and so on. Clearly, we have two sets, one consisting of the letters and the other of the amino acids, and a defined correspondence such that, for each letter in the first set, a unique amino acid can be identified in the second set. This special kind of correspondence between two sets is called a function; the first set is called the domain and the second set the range of the function.

    Definition 1.1.1 A function is a correspondence between one set, called the domain, and a second set, called the range, such that each member of the domain corresponds to exactly one member of the range.
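As a minimal sketch of Definition 1.1.1, the letter-to-amino-acid correspondence can be written as a Python dictionary. The entries for G and K come from the text; A and W are added here as further standard one-letter codes, and the names `ONE_LETTER_CODE` and `amino_acid` are hypothetical, chosen only for illustration.

```python
# Illustrative only: a function viewed as a mapping from a domain to a range.
# G and K are the examples given in the text; A and W are extra standard codes.
ONE_LETTER_CODE = {
    "G": "glycine",
    "K": "lysine",
    "A": "alanine",
    "W": "tryptophan",
}


def amino_acid(letter: str) -> str:
    """Return the unique amino acid that corresponds to a one-letter code."""
    return ONE_LETTER_CODE[letter]


domain = set(ONE_LETTER_CODE)                # the set of letters
amino_acids = set(ONE_LETTER_CODE.values())  # the set of amino acids (the range)

print(amino_acid("G"))  # glycine
print(amino_acid("K"))  # lysine
```

A dictionary enforces the key property of a function: each key (member of the domain) is associated with exactly one value.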

    1.1.1 Algebraic functions

    Physical systems (including biological systems) and their dynamics are quantitatively described in terms of observables or entities which can be measured. In a specific physical problem, we may have an entity, for example, the distance of an object (represented geometrically by a point P) with respect to a reference point O and the value of the entity is denoted by a variable x. Now, it may happen that there is another entity (say, a force on the object), denoted by a variable y, which is related to x. The relation between the two variables may be expressed as

    $y = f(x)$    (1.1)

    Formally, it is said that if there is a unique value of y for each value of x, then y is a function of x. The set of all permitted values of x is called a domain and that of all permitted values of y, a range. Being a function of a single variable, f(x) is also called a univariate function which can be used to describe and analyze a one-dimensional system. Graphically, it is represented by a line. Examples of two univariate functions are shown in Fig. 1.1.

    Figure 1.1 Graphical representation of univariate algebraic functions.

    On the other hand, if there be a unique value of x for each value of y, we can write

    $x = g(y)$    (1.2)

    which is defined as the inverse function. g(y) is also a univariate function.

    However, we may have a function f(x,y) which depends on two variables x and y. In this case, it is required that for any pair of values (x,y), f(x,y) has a well-defined value. The notion can be extended to a function f(x1, x2,…, xn) that depends on n number of variables x1, x2,…, xn. Functions of two variables can be represented by a surface in a three-dimensional space; functions with higher number of variables are usually difficult to visualize. Functions involving more than one variable are called multivariate.

    In certain physical problems, a function can be expressed as a polynomial in x.

    $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$    (1.3)

    When f(x) is set equal to zero, the polynomial equation

    $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0$    (1.4)

    is satisfied by specific values of x known as the roots of the equation. The integer n > 0 is known as the degree of the polynomial and of the equation. The coefficients $a_0, a_1, \ldots, a_n$ ($a_n \neq 0$) are real quantities determined by the physical properties of the system under study.

    In (1.4), if n=1, the equation takes the form of a linear equation

    $a_0 + a_1 x = 0$    (1.5)

    whose solution (root) is given by $\alpha_1 = -a_0/a_1$.

    For n=2, (1.4) becomes a quadratic equation

    $a_0 + a_1 x + a_2 x^2 = 0$    (1.6)

    the roots of which are

    $\alpha_{1,2} = \dfrac{-a_1 \pm \sqrt{a_1^2 - 4 a_0 a_2}}{2 a_2}$    (1.7)

    n=3 gives a cubic equation.
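The linear and quadratic cases can be checked numerically. The sketch below assumes the coefficient convention implied by the root −a0/a1 of the linear equation, namely that a0 is the constant term and a2 multiplies x², and applies the standard quadratic formula. The function names are hypothetical, and `cmath.sqrt` is used so that a negative discriminant yields complex roots rather than an error.

```python
import cmath


def linear_root(a0, a1):
    """Root of a0 + a1*x = 0, assuming a1 != 0."""
    if a1 == 0:
        raise ValueError("a1 must be nonzero for a linear equation")
    return -a0 / a1


def quadratic_roots(a0, a1, a2):
    """Both roots of a0 + a1*x + a2*x**2 = 0, assuming a2 != 0."""
    if a2 == 0:
        raise ValueError("a2 must be nonzero for a quadratic equation")
    # Square root of the discriminant; complex when the discriminant is negative.
    sqrt_disc = cmath.sqrt(a1 * a1 - 4 * a0 * a2)
    return (-a1 + sqrt_disc) / (2 * a2), (-a1 - sqrt_disc) / (2 * a2)


# x**2 - 3x + 2 = 0 factors as (x - 1)(x - 2), so the roots are 2 and 1.
r1, r2 = quadratic_roots(2, -3, 1)
print(r1, r2)

# x**2 + 1 = 0 has no real roots; the roots are +i and -i.
c1, c2 = quadratic_roots(1, 0, 1)
print(c1, c2)
```

Returning complex numbers in every case keeps the interface uniform; a caller who expects real roots can inspect the `.imag` part.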

    1.1.2 Trigonometric functions

    Physical systems that involve the periodic motion of a point can be represented by trigonometric functions, which are also periodic in nature. Here, the point in question is considered as the projection of another point moving along a circle. To illustrate, let us consider a circle in the xy-plane (Fig. 1.2). The radius of the circle is r and its center is at the origin of the xy-coordinate system. P′ is a point on the circle with coordinates (x, y), and the line joining the origin to P′ makes an angle ϕ with the x-axis. P is the projection of P′ on the x-axis.

    Figure 1.2 Position of point P′ indicated by Cartesian coordinates (x, y) and polar coordinates (r, ϕ).

    As P′ moves counterclockwise along the circle starting from the x-axis, ϕ increases from 0 to 2π radians or 360 degrees. With P′(x, y) associated with angle ϕ, we have the following definitions:

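    $\sin\phi = \dfrac{y}{r}, \quad \cos\phi = \dfrac{x}{r}, \quad \tan\phi = \dfrac{y}{x}$    (1.8)

    and the reciprocal relations

    $\csc\phi = \dfrac{r}{y}, \quad \sec\phi = \dfrac{r}{x}, \quad \cot\phi = \dfrac{x}{y}$    (1.9)

    If the circle is of unit radius, that is, r = 1, then $\sin\phi = y$ and $\cos\phi = x$.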

    All these relations are collectively called trigonometric functions.

    Referring to Fig. 1.2, one can see that the point P′ can be represented also by polar coordinates r (the radial coordinate) and ϕ (the angular coordinate). The relations between the polar coordinates and Cartesian coordinates are given by

    $x = r\cos\phi, \quad y = r\sin\phi; \qquad r = \sqrt{x^2 + y^2}, \quad \tan\phi = \dfrac{y}{x}$    (1.10)
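A small sketch of the conversion between the two coordinate systems follows. It assumes the usual relations x = r cos ϕ and y = r sin ϕ, which follow from the definitions of sine and cosine given earlier. The function names are hypothetical. `math.atan2` is used instead of a plain arctangent of y/x so that the quadrant of the point is respected, and the result is reduced to the interval [0, 2π) to match the counterclockwise sweep described in the text.

```python
import math


def to_cartesian(r, phi):
    """Polar (r, phi) -> Cartesian (x, y); phi is in radians."""
    return r * math.cos(phi), r * math.sin(phi)


def to_polar(x, y):
    """Cartesian (x, y) -> polar (r, phi), with phi reduced to [0, 2*pi)."""
    r = math.hypot(x, y)                      # sqrt(x**2 + y**2)
    phi = math.atan2(y, x) % (2 * math.pi)    # quadrant-aware angle
    return r, phi


# A point on a circle of radius 2 at 90 degrees lies on the positive y-axis.
print(to_cartesian(2.0, math.pi / 2))

# The point (0, -1) lies on the unit circle at 270 degrees (3*pi/2 radians).
print(to_polar(0.0, -1.0))
```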

    Further, from (1.8), we have cos 0=cos 2π=1 and cos π=−1. Therefore it can easily be visualized that as P′ moves along the circle from 0 to 2π, its projection P moves along the x-axis from +1 to −1 and back to +1. The cycle is repeatable between 2π and 4π and so on. It can be seen that

    $\cos(\phi + 2n\pi) = \cos\phi$    (1.11)

    where n = 1, 2, 3,…. Thus f(ϕ) = cos ϕ is a periodic function with period 2π. Similarly, sin ϕ is also a periodic function. Fig. 1.2 also shows that

    $\sin^2\phi + \cos^2\phi = 1$    (1.12)

    The basic trigonometric functions, as given previously, are periodic. Hence, their inverses are not single-valued. However, by restricting the domain to an appropriate interval, each of the inverses can be defined as a (single-valued) function.

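    $y = \sin^{-1} x, \; -\pi/2 \leq y \leq \pi/2; \quad y = \cos^{-1} x, \; 0 \leq y \leq \pi; \quad y = \tan^{-1} x, \; -\pi/2 < y < \pi/2$    (1.13)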
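The need for a restricted interval can be seen numerically. In the sketch below, several angles that differ by multiples of 2π share the same sine, so "the angle whose sine is s" is ambiguous. Python's `math.asin` resolves the ambiguity by returning the value in [−π/2, π/2], and `math.acos` returns the value in [0, π]. The specific angle 2.5 radians is an arbitrary choice for illustration.

```python
import math

phi = 2.5  # radians; deliberately outside [-pi/2, pi/2]

# Angles differing by 2*pi*n all have the same sine (periodicity).
shifted = [phi + 2 * math.pi * n for n in range(1, 4)]
same_sine = all(abs(math.sin(s) - math.sin(phi)) < 1e-9 for s in shifted)
print(same_sine)

# asin picks the single representative in [-pi/2, pi/2]: here pi - 2.5.
principal = math.asin(math.sin(phi))
print(principal, math.pi - phi)

# acos picks the single representative in [0, pi]: cos(-1) = cos(1).
principal_cos = math.acos(math.cos(-1.0))
print(principal_cos)
```

The round trip `asin(sin(phi))` therefore recovers `phi` only when `phi` already lies in the restricted interval.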

    1.1.3 Exponential and logarithmic functions

    We have seen that a polynomial contains terms like $a_n x^n$, where x is the variable and n is a fixed number (an integer). However, there can be occasions when a function contains terms like $n^x$, where n is a fixed number (n > 0 and n ≠ 1) and the variable x is in the exponent.

    For example, in a steadily growing and dividing bacterial culture, the number of cells after x generations is given by $N = N_0 2^x$, where $N_0$ is the initial number of cells. Such functions are called exponential functions.
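The bacterial growth relation, in which the cell number doubles every generation, can be tabulated with a few lines of code. This is only a sketch: the function name `cell_count` and the starting population of 100 cells are assumptions made for illustration.

```python
def cell_count(n0, generations):
    """Number of cells after the given number of generations (doubling each time)."""
    return n0 * 2 ** generations


# Starting from 100 cells, the population doubles at every generation.
for x in range(0, 11, 2):
    print(x, cell_count(100, x))
```

After 10 generations the population has grown by a factor of 2^10 = 1024, which shows how quickly an exponential outpaces any fixed power of x.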

    Definition 1.1.2

    An exponential function f is given by

    $f(x) = a^x$    (1.14)

    where x is a real number and the base a > 0, a ≠ 1.

    Example 1.1.1

    Commonly used exponentials are $10^x$ and $2^x$, where the bases are, respectively, 10 and 2.

    However, the most commonly used exponential is $e^x$, where the number e (≈2.718281828459), which can be defined as a limiting sum of an infinite series (see Section 1.5), is called the natural base.

    All exponential functions follow the same rules of manipulations

    $a^x a^y = a^{x+y}$    (1.15a)

    and

    $(a^x)^y = a^{xy}$    (1.15b)

    The exponential function $f(x) = a^x$ is one-to-one with domain (−∞,∞) and range (0,∞). Hence, it has an inverse function, defined as the logarithmic function with base a.
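The inverse relationship can be checked numerically. The sketch below uses the logarithm to base a, which the definition that follows introduces formally. The helper name `log_base` is hypothetical; it wraps Python's two-argument `math.log(x, a)`. Comparisons use a small tolerance because floating-point logarithms are not exact.

```python
import math


def log_base(a, x):
    """Logarithm of x to base a, for a > 0, a != 1, and x > 0."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("require a > 0, a != 1, and x > 0")
    return math.log(x, a)


# Exponentiation followed by the logarithm returns the original exponent.
y = 3.7
x = 2 ** y
print(log_base(2, x))  # approximately 3.7

# The logarithm followed by exponentiation returns the original number.
print(10 ** log_base(10, 250.0))  # approximately 250.0

# Doubling example: a 1024-fold increase corresponds to about 10 doublings.
print(log_base(2, 1024))
```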

    Definition 1.1.3

    If $x = a^y$, then $y = \log_a x$, where a > 0 and a ≠ 1.

    The number a is called the logarithmic
