Handbook of Statistical Systems Biology
Ebook, 1,158 pages, 13 hours

About this ebook

Systems Biology is now entering a mature phase in which the key issues are characterising uncertainty and stochastic effects in mathematical models of biological systems. The area is moving towards a full statistical analysis and probabilistic reasoning over the inferences that can be made from mathematical models. This handbook presents a comprehensive guide to the discipline for practitioners and educators, providing a full and detailed treatment of these important and emerging subjects. Leading experts in systems biology and statistics have come together to provide insight into the major ideas in the field, in particular methods for specifying and fitting models and for estimating the unknown parameters.

This book:

  • Provides a comprehensive account of inference techniques in systems biology.
  • Introduces classical and Bayesian statistical methods for complex systems.
  • Explores networks and graphical modeling as well as a wide range of statistical models for dynamical systems.
  • Discusses various applications for statistical systems biology, such as gene regulation and signal transduction.
  • Features statistical data analysis on numerous technologies, including metabolic and transcriptomic technologies.
  • Presents an in-depth treatment of reverse engineering approaches.
  • Provides colour illustrations to explain key concepts.

This handbook will be a key resource for researchers practising systems biology, and those requiring a comprehensive overview of this important field.

Language: English
Publisher: Wiley
Release date: Sep 9, 2011
ISBN: 9781119952046

    Book preview

    Handbook of Statistical Systems Biology - Michael Stumpf

    Part A

    Methodological Chapters

    Chapter 1

    Two Challenges of Systems Biology

    William S. Hlavacek

    Theoretical Division, Los Alamos National Laboratory, Los Alamos, USA

    1.1 Introduction

    Articulating the challenges faced by a field, or rather the problems that seem worthy of pursuit, can be a useful exercise. A famous example is Hilbert's problems (Hilbert 1902), which profoundly influenced mathematics. This entire handbook can be viewed as an attempt to define challenges faced by systems biologists, especially challenges requiring a statistical approach, and to provide tools for addressing these challenges. This particular chapter provides a personal perspective. If the editors had invited someone else to write it, I am sure it would have materialized in a very different form, as there are many challenges worthy of pursuit in this field.

    Much has been written about systems biology, and excellent overviews of the field are available (Kitano 2002). For many years, I believe one of the challenges faced by systems biology has been community building. In recent years, thanks in part to events such as the International Conference on Systems Biology (Kitano 2001) and the q-bio Summer School and Conference (Edwards et al. 2007), a community of systems biology researchers has established itself. This community identifies with the term ‘systems biology’ and broadly agrees upon its loose definition and the general goals of the field. If one is looking for a definition of systems biology, the work presented at the meetings mentioned above serves the purpose well.

    A characteristic of systems biology is the systems approach, which is aided by modeling. In systems biology, there has been special interest in the study and modeling of cellular regulatory systems, such as genetic circuits (Alon 2006). This chapter will be focused on a discussion of challenges faced by modelers. Of course, modeling would be aided by technology development, the generation of new quantitative data and in other ways, but I will be concerned mainly with the practice of modeling. Moreover, I will focus on modeling of cell signaling systems, although I believe the discussion is relevant for modeling of other types of cellular regulatory systems.

    We need models of cellular regulatory systems to accelerate elucidation of the molecular mechanisms of cellular information processing, to understand design principles of cellular regulatory systems (i.e. the relationships between system structures and functions), and ultimately, to engineer cells for useful purposes. Cell engineering encompasses a number of application areas, including manipulation of cellular metabolism for production of desirable metabolic products (Keasling 2010), such as biofuels; identification of drug targets for effective treatment or prevention of diseases (Klipp et al. 2010), such as cancer; and creation of cells with entirely new functions and properties through synthetic biology approaches (Khalil and Collins 2010). In each of these areas, predictive models can be used to guide the manipulation of cellular phenotypes through interventions at the molecular level (e.g. genetic modifications or pharmacological perturbations). In traditional engineering fields, predictive models play a central role, and I believe predictive modeling will be just as central to cell engineering efforts in the future. Another more immediate reason to seek models of cellular regulatory systems is that model-guided studies of the molecular mechanisms of cellular information processing can accelerate the pace at which these mechanisms and their design principles are elucidated (Wall et al. 2004; Alon 2006; Novák and Tyson 2008; Rafelski and Marshall 2008; Mukherji and van Oudenaarden 2009). A model of a cellular regulatory system essentially represents a hypothesis about how the system operates, and such a model can be used to interpret experimental observations, to design experiments, and generally to extend the reach of human intuition and reasoning. A model can be constructed in a systematic, step-by-step fashion, and the logical consequences of a model, no matter how complicated, can usually be determined. Although models are useful because they make predictions, models are also useful for other reasons, which are sometimes overlooked, as recently discussed by Lander (2010).

    It is often assumed, presumably because of the long tradition of modeling in science and engineering and the many successes of modeling (e.g. celestial modeling), that modeling of cell signaling systems should be fairly straightforward and that all the necessary tools (e.g. numerical methods and software implementations of these methods) should be more or less readily available. I do not believe that this is the case or even close to the case. I believe that predictive modeling of cell signaling systems will require the solution of unique and difficult problems and that there is much work to be done.

    1.2 Cell Signaling Systems

    Cell signaling (Hunter 2000; Scott and Pawson 2009) controls the phenotypes and fates of cells. By conveying information through a signal-initiated cascade of molecular interactions, a cell signaling system enables a cell to sense changes in its environment and respond to them. Cell signaling is mediated in large part by complex networks of interacting proteins, each of which is itself complex. A signaling protein may contain a catalytic subunit that is tightly regulated, such as a tyrosine kinase (Bradshaw 2010; Lemmon and Schlessinger 2010), as well as other functional components, including modular protein interaction domains [e.g. a Src homology 2 (SH2) domain] (Pawson and Nash 2003), short linear motifs (e.g. a proline-rich sequence that can be recognized by an SH3 domain) (Gould et al. 2010), and sites of post-translational modification (e.g. substrates of kinases and phosphatases) (Walsh et al. 2005). Many of the components of signaling proteins enable and regulate interactions with binding partners, and regulated assembly of protein complexes plays an important role in cell signaling by co-localizing enzymes and substrates, which affects both enzyme activity and specificity (Ptashne and Gann 2001). Although many of the molecular-level properties of specific cell signaling systems have been elucidated and characterized, we are only now beginning to investigate system-level properties, i.e. how the parts work together to process information (Aldridge et al. 2006; Kholodenko 2006; Kholodenko et al. 2010).

    A systems-level understanding of cell signaling is central to a basic understanding of cell biology. This type of understanding is also relevant for finding better treatments for a number of diseases involving dysregulation of cell signaling systems, such as cancer. The malignancy of cancer cells is driven and/or sustained in part by mutations that affect signaling proteins, many of which are the products of oncogenes and tumor suppressor genes (Hanahan and Weinberg 2000). We can hope to reverse the effects of deleterious mutations by understanding the system-level properties of cell signaling systems (Kreeger and Lauffenburger 2010; Xu and Huang 2010). However, the complexity of cell signaling systems poses a formidable barrier to understanding and rational manipulation of system-level properties. There is a pressing need to find better ways to reason about cell signaling, so that we can relate the molecular interactions in cell signaling systems to the higher level phenomena that emerge from these interactions (Nurse 2008). Addressing this need now seems timely, given recent advances in proteomics that allow quantitative multiplex assays of the molecular states of cell signaling systems (Gstaiger and Aebersold 2009; Choudhary and Mann 2010; Kolch and Pitt 2010) and opportunities to take advantage of molecular therapeutics (Gonzalez-Angulo et al. 2010), i.e. drugs that specifically target signaling proteins, such as imatinib (Deininger et al. 2005; Hunter 2007).

    A fairly standard aid for reasoning about a cell signaling system is a network wiring diagram, which is used to provide a visual representation and summary of available knowledge about a system, in particular the molecules involved and their interactions. Wiring diagrams come in many forms, from simple cartoons to elaborate depictions that make use of formalized iconography (Kohn 1999; Kitano et al. 2005). A testament to the importance of wiring diagrams is the Systems Biology Graphical Notation (SBGN) project, an effort to develop standardized conventions for diagrammatic representation of biological knowledge (Le Novère et al. 2009). Although diagrams are clearly useful, they lack predictive power.

    A more powerful aid for reasoning about a cell signaling system is a mathematical model (Kholodenko et al. 2010; Kreeger and Lauffenburger 2010; Xu and Huang 2010). Models of cell signaling systems are less commonly used and less accessible than network wiring diagrams, but modeling provides a means to cast available knowledge in a form that allows predictions to be made via computation. Once a model has been formulated, computational procedures can usually be applied to elucidate the logical consequences of the model. We can expect modeling to be useful for understanding cell signaling systems, especially given emerging capabilities in synthetic biology (Bashor et al. 2010; Grünberg and Serrano 2010; Lim 2010), which present opportunities for model-guided cell engineering. However, it is becoming clear that traditional modeling approaches, such as the formalism of ordinary differential equations (ODEs), will not be entirely adequate. Modeling of cell signaling systems presents unique challenges for which new approaches are needed. Two of these challenges are discussed below.

    1.3 The Challenge of Many Moving Parts

    To address many interesting questions, we need large models (Hlavacek 2009a). The reason is twofold. First, we need models of entire cell signaling systems and indeed models that encompass sets of connected cell signaling systems and other regulatory systems (e.g. metabolic circuits) to understand how the molecular states of cells affect their fates and phenotypes. The molecular state of a cell can be perturbed and potentially rationally manipulated. For example, pharmacological inhibition of a receptor tyrosine kinase might be used to impact cell-cycle control and reduce proliferation. However, the effect of a perturbation must generally propagate through a network of regulatory systems to bring about a change in phenotype, and these systems are rife with interconnections and feedback loops (Tyson et al. 2003; Brandman and Meyer 2008). The effects of some perturbations can be intuited. Others can be predicted with greater accuracy on the basis of simple or coarse-grained models or on the basis of mechanistically detailed but circumscribed models (Tyson and Novák 2010). However, at some point, to obtain deep understanding and/or exquisite control, we will seek to develop models that are large in scope and heavy in detail. Such models may only promise minor improvements in understanding or controllability, but even minor improvements, given the potential benefits, could provide strong motivation to pursue ambitious modeling efforts. In the United States, the annual cost of treating cancer is estimated to now be greater than US $100 billion and rising (Mariotto et al. 2011). This cost could perhaps be reduced and/or care could be improved with greater predictive capabilities with regard to the effects of perturbations on cell signaling.

    The second reason we need large models is that a cell signaling system typically comprises a large number of proteins and other signaling molecules (e.g. lipids), as well as a larger number of molecular interactions. For example, documentation of a cell signaling system in NetPath (netpath.org) includes, on average, information about >80 different molecules (counts from ‘Molecules Involved’ fields) and >150 molecular interactions (counts from ‘Physical Interactions’ and ‘Enzyme Catalysis’ fields) (Kandasamy et al. 2010). This problem can be dramatically exacerbated when one attempts to translate biological knowledge of a cell signaling system into mathematical form because the number of formal elements in a conventional, or traditional, model (e.g. variables or equations) is typically much larger than the number of molecules or interactions considered in the model. For example, a recent model encompassing only about 30 proteins is composed of nearly 500 ODEs (Chen et al. 2009).

    Models that are far more comprehensive than available models, such as models that couple receptor signaling to cell cycle control, can be developed, evaluated, and incrementally improved through an iterative process, including parameter estimation, experimental tests of model predictions, and model refinement. This iterative process is often discussed, but the process is difficult to implement. It will continue to be difficult to implement until certain types of resources and conventions are developed to aid modelers. In addition, methods are needed to make modeling more automatic, such as methods for data-driven model specification [for an example of inference of network structure in the context of cell signaling, see Ciaccio et al. (2010)], automated comparison of model predictions to known, formalized system properties (Batt et al. 2005; Calzone et al. 2006; Gong et al. 2010), and automated model-guided design of experiments (Bongard and Lipson 2007). The resources I have in mind would provide the type of information available at the Tyson Lab website (mpf.biol.vt.edu), where, for example, one can find an impressively long list of budding yeast mutants and their cell cycle phenotypes. This collection of information, which would be time consuming to reproduce, serves to document the basis of the model of Chen et al. (2004) and also provides a set of known system properties, which can be used to validate any new model for the budding yeast cell cycle or to compare alternative models. There is a wealth of information in the primary literature about system properties, which are mostly qualitative in nature. This information could perhaps be better used by modelers to (tightly) constrain the values of parameters in models.

    The information available about a cell signaling system is often so abundant that only a fraction of the available information can be held in memory at a time. In fact, it may be impossible to even enter into memory all the available information. Available information must be systematized (ideally in the form of reusable and extensible models), reporting of new knowledge must be made more formal, and protocols and standardized conventions for accomplishing these tasks are needed. Translating available knowledge into formal elements of a model specification or into benchmarks for validating a model is currently a rate-limiting step in the modeling process, and it requires an unusual combination of skills. (It also requires a rare temperament.) With the adoption of conventions for reporting observed system properties, this burden could be lifted from modelers. A standard language for specifying dynamical system properties could perhaps be developed on the basis of a temporal logic (Calzone et al. 2006; Gong et al. 2010). An important element of such a language will be a capacity to represent system perturbations, such as mutations. Progress on formal representation of perturbations has been made by Danos et al. (2009).

    It should be noted that a common concern about the development of a large model is the difficulty of estimating the many parameters in such a model. Recently, Apgar et al. (2010) demonstrated, via an assumed ground truth model and synthetic data, that the parameters of a model can be identified via fitting to measurements from an unexpectedly small number of experiments, if the experiments are well chosen and under the assumptions that the model structure is correct and all variables of the model are observable in the form of time-series data. Despite these caveats, this study suggests cautious optimism. The parameters in large models may be more identifiable than conventional wisdom suggests, and new statistical methods could perhaps be developed for better addressing the problem of parameter estimation in the context of large models for cell signaling systems. One need in this area is the development of parameter estimation methods that respect thermodynamic constraints, which are easily enforced in the case of a small model but not so easily enforced in the case of a large model (Yang et al. 2006; Ederer and Gilles 2007, 2008; Vlad and Ross 2009; Jenkinson et al. 2010).

    1.4 The Challenge of Parts with Parts

    The parts of cell signaling systems (e.g. proteins) themselves have parts (e.g. sites of post-translational modification). To faithfully represent the mechanisms of cell signaling, we need models that account for the site-specific details of molecular interactions (Hlavacek et al. 2006; Hlavacek and Faeder 2009b; Mayer et al. 2009). A signaling protein is typically composed of multiple functional components or sites. Tracking information about these sites is critical for modeling the system-level dynamics of molecular interactions because protein interactions are context sensitive, i.e. they depend on site-specific details. For example, many protein–protein interactions are regulated by phosphorylation (Pawson and Kofler 2009). It is difficult to document and/or visualize the contextual dependencies of molecular interactions in a way that is both clear and precise. Online resources, such as NetPath (Kandasamy et al. 2010), Phospho.ELM (Dinkel et al. 2011) and Reactome (Matthews et al. 2009), provide (limited) information about the site-specific details of protein–protein interactions through combinations of illustrations, lists (e.g. a list of sites of phosphorylation), comments, and literature citations. It is also difficult to incorporate the site-specific details of molecular interactions into a conventional or traditional model (e.g. ODEs) (Hlavacek et al. 2003). Nevertheless, our mechanistic understanding of site-specific details is fairly advanced and such details can be monitored experimentally (Gstaiger and Aebersold 2009; Choudhary and Mann 2010; Kolch and Pitt 2010), which encourages a consolidation of this knowledge in the form of models.

    As mentioned above, ODEs are traditionally used to model the kinetics of (bio)chemical reaction systems, and many models of cell signaling systems take the form of a system of coupled ODEs (Kholodenko et al. 1999; Schoeberl et al. 2002). Another commonly used modeling approach involves using Gillespie's method, or kinetic Monte Carlo (KMC), to execute a stochastic simulation of the kinetics of a given list of reactions (Gillespie 2007; Voter 2007). This approach is typically applied in preference to the ODE-based approach when one is concerned about stochastic fluctuations that arise from small population sizes (McAdams and Arkin 1999). Both approaches require a listing of the reactions that are possible and the chemical species that can be populated in a system. The approach of using ODEs to represent the kinetics of chemical reaction systems was first applied by Harcourt and Esson (1865), long before we understood how cells process information. Cell signaling systems have features that distinguish these systems from other reaction systems, such as many reaction systems of importance in the chemical processing industry. In a typical commercial reactor or test tube, there are large numbers of molecules that can populate a relatively small number of chemical species. In contrast, in a cell signaling system, because of the combinatorial potential of protein interactions, relatively small numbers of molecules populate a tiny fraction of a large, possibly astronomical or even super-astronomical, number of possible chemical species (Endy and Brent 2001; Bray 2003; Hlavacek et al. 2003, 2006). This feature of cell signaling systems has been called combinatorial complexity. Combinatorial complexity hinders application of traditional modeling approaches, because in these approaches one must write an equation for each chemical species that could be populated in a system, or the equivalent.
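    To make the stochastic approach concrete, the sketch below implements Gillespie's direct method for a minimal hypothetical two-reaction system (reversible ligand-receptor binding). The species, rate constants and copy numbers are illustrative assumptions, not values taken from this chapter; the sketch is meant only to show the structure of the algorithm (draw an exponential waiting time, then pick a reaction with probability proportional to its propensity).

```python
import random

# Minimal sketch of Gillespie's direct method for a hypothetical reversible
# binding system: L + R <-> LR. Rates and copy numbers are arbitrary.

def propensities(state, kon, koff):
    L, R, LR = state
    return [kon * L * R,   # binding:   L + R -> LR
            koff * LR]     # unbinding: LR -> L + R

# Stoichiometric change of (L, R, LR) for each reaction channel.
changes = [(-1, -1, +1), (+1, +1, -1)]

def gillespie(state, kon, koff, t_end):
    t, trajectory = 0.0, [(0.0, state)]
    while t < t_end:
        a = propensities(state, kon, koff)
        a0 = sum(a)
        if a0 == 0:
            break                          # no reaction can fire
        t += random.expovariate(a0)        # exponential waiting time
        r, cum = random.random() * a0, 0.0
        for j, aj in enumerate(a):         # pick a reaction proportionally to its propensity
            cum += aj
            if r < cum:
                state = tuple(s + d for s, d in zip(state, changes[j]))
                break
        trajectory.append((t, state))
    return trajectory

# Example run: 100 ligands, 50 receptors, no complexes at t = 0.
print(gillespie(state=(100, 50, 0), kon=0.001, koff=0.1, t_end=10.0)[-1])
```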

    To overcome the problem of combinatorial complexity, the rule-based modeling approach has been developed (Hlavacek et al. 2006). In this approach, graphs are used to model proteins, and graph-rewriting rules are used to model protein interactions. Rules can be specified using a formal language, such as the BioNetGen language (BNGL) (Faeder et al. 2009) or Kappa (Feret et al. 2009; Harmer et al. 2010), which is closely related to BNGL. The two basic ideas underlying the rule-based modeling approach are to treat molecules as the building blocks of chemical species and to model the molecular interactions that generate chemical species as rules rather than to specify a list of reactions connecting the chemical species that can be populated, which is equivalent to specifying a set of ODEs under the assumption of mass action kinetics. A rule can be viewed as implicitly defining a set of individual reactions, which involve a common transformation. A rule includes a specification of the necessary and sufficient conditions that a chemical species must satisfy to undergo a transformation defined by the rule and a rate law for the transformation. If the conditions are minimal, the number of individual reactions implied by the rule is maximal. For example, if ligand–receptor binding is independent of receptor signaling, then a rule can be specified that captures ligand–receptor interaction without considering the interaction of the receptor with cytosolic signaling proteins. However, there is no free lunch. The cost of the rule-based modeling approach is that the reactions implied by a rule are all assumed to be characterized by the same rate law. Thus, the approach relies on a coarse-grained description of reaction kinetics, because each individual reaction in a system may truly be characterized by a unique rate law. Although it is important to keep this limitation of rule-based modeling in mind, the reliance on coarse-grained reaction kinetics is not onerous, as the granularity of a rule can be adjusted as needed. At the finest level of granularity, a rule implies only a single reaction, i.e. a rule is equivalent to a reaction in the traditional sense. Thus, the rule-based modeling approach can be viewed as a generalization of conventional modeling approaches. Rule-based modeling provides a practical solution to the problem of combinatorial complexity.
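    To make the notion of a rule concrete, here is a small, purely illustrative Python sketch (not BioNetGen or Kappa) in which one binding rule, whose condition ignores a receptor's phosphorylation site, implicitly defines one concrete reaction per phosphorylation state, all sharing the same rate constant. The site names, states and rate value are assumptions chosen for illustration.

```python
# Illustrative sketch (not BioNetGen/Kappa): a receptor R with a ligand-binding
# site 'l' and a phosphorylation site 'y'. Because the rule's condition ignores
# 'y', the single rule implies one concrete reaction per 'y' state, and all
# implied reactions share the same coarse-grained rate constant.

receptor_states = [{"l": "free", "y": y} for y in ("U", "P")]  # unphosphorylated / phosphorylated

binding_rule = {
    "name": "L binds R",
    "condition": lambda R: R["l"] == "free",    # necessary and sufficient condition
    "transform": lambda R: {**R, "l": "bound"},
    "rate_constant": 1e-3,                      # one rate law for every implied reaction
}

implied_reactions = [
    (R, binding_rule["transform"](R), binding_rule["rate_constant"])
    for R in receptor_states
    if binding_rule["condition"](R)
]

for reactant, product, k in implied_reactions:
    print(f"L + R{reactant} -> L.R{product}   (k = {k})")
```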

    A number of methods for simulating rule-based models have been developed and these methods have been implemented in various software tools (Blinov et al. 2004; Lok and Brent 2005; Meier-Schellersheim et al. 2006; Moraru et al. 2008; Colvin et al. 2009, 2010; Lis et al. 2009; Mallavarapu et al. 2009; Andrews et al. 2010; Gruenert et al. 2010; Ollivier et al. 2010; Sneddon et al. 2010). A few recently developed tools are briefly discussed below to point to newly available modeling capabilities and to identify capability gaps. The tools discussed below all implement statistical methods. Once a rule-based model has been specified, it can be simulated in a number of different ways. For example, the set of ODEs for mass action kinetics corresponding to the list of reactions implied by a model can be derived from the rules of the model (Blinov et al. 2004; Faeder et al. 2005), or a smaller set of ODEs for the dynamics of collective variables can be derived from the rules (Borisov et al. 2008; Conzelmann and Gilles 2008; Koschorreck and Gilles 2008; Feret et al. 2009; Harmer et al. 2010). However, statistical methods are particularly well suited for simulation of a rule-based model, because tracking individual molecules in a stochastic simulation can be easier than tracking the population levels of chemical species in a deterministic simulation (Morton-Firth and Bray 1998; Danos et al. 2007; Yang et al. 2008; Colvin et al. 2009, 2010).

    The stochastic simulation compiler (SSC) of Lis et al. (2009) implements a ‘next subvolume’ method for simulation of spatial rule-based models, in which subvolumes of a reaction volume are considered to be well mixed but transport between subvolumes is included to capture the effects of diffusion. Within a subvolume, KMC is used to simulate reaction kinetics, i.e. Gillespie's method is applied. SSC uses a constructive solid geometry approach, in which Boolean operators are used to combine primitive shapes/objects, to define surfaces and objects. A limitation of SSC is its reliance on the computationally expensive procedure of network generation, as SSC implements a ‘generate-first’ approach to simulation (Blinov et al. 2005). In this approach, a rule set is expanded into the corresponding reaction network (or equivalently, the list of reactions implied by the rules) before a simulation is performed. Network generation is expensive and impractical for many rule-based models. It is also unnecessary, as various ‘network-free’ approaches are available (Danos et al. 2007; Yang et al. 2008; Colvin et al. 2009, 2010). In these approaches, rules are used directly to advance a stochastic simulation. In fact, network-free simulation can be viewed as an extension of Gillespie's method in which a list of reactions is replaced with a list of rules. A number of software tools implement network-free simulation methods, including DYNSTOC (Colvin et al. 2009), RuleMonkey (Colvin et al. 2010) and NFsim (Sneddon et al. 2010), which are all compliant with BNGL. None of these tools have a spatial modeling capability, i.e. the methods implemented in DYNSTOC, RuleMonkey and NFsim all rely on the assumption of a well-mixed reaction compartment. Network-free simulation tools compliant with Kappa with similar capabilities and limitations, such as KaSim, are available at kappalanguage.org/tools.

    Another software tool for simulation of spatial rule-based models is Smoldyn (Andrews et al. 2010), which performs particle-based reaction-diffusion calculations. These types of calculations are well suited for dealing with complex geometrical constraints, such as a reconstructed 3D surface (Mazel et al. 2009). Smoldyn implements an ‘on-the-fly’ approach to simulation (Faeder et al. 2005; Lok and Brent 2005), which limits network generation by expanding a list of reactions as chemical species become populated during the course of a simulation. However, on-the-fly simulation still relies on network generation, and because of this, on-the-fly simulation is impractical for many rule-based models (Hlavacek et al. 2006; Yang et al. 2008). Interestingly, the network-free simulation approaches implemented in DYNSTOC, RuleMonkey, NFsim and KaSim are particle-based methods, and it should be straightforward to extend any of these methods to include molecule positions (spatial coordinates) and propagation of molecules in space, on or off a lattice. Powerful tools are available for specifying model geometry and visualizing results in 3D (Czech et al. 2009), and such tools could perhaps be coupled to rule-based simulation capabilities.

    Finally, I wish to call attention to the simulation capability provided by the SRSim package (Gruenert et al. 2010), which couples BNGL (Faeder et al. 2009) to LAMMPS (Plimpton 1995), a package for performing molecular dynamics (MD) calculations. The MD simulation capability provided by SRSim allows molecules and their binding sites to be modeled as patchy particles and molecular interactions to be modeled in terms of force fields. This capability could be used to study the effects of orientation constraints on molecular interactions, molecular size (i.e. volume exclusion), and both translational and rotational diffusion. A wealth of structural information is available about the modular domains of signaling proteins, other molecules involved in signaling, and biomolecular complexes (Rose et al. 2011). Structures or models for protein signaling complexes could perhaps be built and used to identify potential steric clashes and other factors that potentially impact protein–protein interactions involved in signaling. Relating such structural information to function could perhaps be aided by system-level modeling of the dynamics of protein–protein interactions. A model for a signaling complex is static, whereas assembly of a signaling complex during the course of cell signaling is dynamic. Tools such as SRSim could provide a bridge for connecting structural data available about signaling proteins to system-level dynamics of protein–protein interactions, i.e. to protein function. The study of Monine et al. (2010) provides another example, but no general-purpose software, of how structural constraints on molecular interactions can be considered within the framework of rule-based modeling.

    1.5 Closing Remarks

    Interest in modeling of cellular regulatory systems has increased dramatically in recent years, especially interest in stochastic modeling. In 2010, the classic paper of Gillespie (1977) was cited over 300 times, which is more than the number of times the paper was cited in the 20 years before Gillespie's method was applied by McAdams and Arkin (1997) to study gene regulation. Although modeling of cellular regulatory systems is more commonplace now than ever, the models being considered are usually small models, which also tend to incorporate little of the known mechanistic details (e.g. the site-specific details of protein–protein interactions). Much can be learned from small models, but cellular regulatory systems are large, and there are interesting questions that can be addressed with large and mechanistically detailed models. I am reminded of the saying attributed to Einstein, ‘everything should be as simple as it can be, but not simpler.’

    It will not be easy to develop useful large models of cell signaling systems for a variety of reasons. One reason is that there is no biological expert available to help the ambitious modeler, who may be a physicist or mathematician by training, with formalization of the known biology. I could be wrong, but I do not believe there is a person alive today who has what can be considered a complete command of the known facts about any well-studied cell signaling system. One could be discouraged by this situation or motivated by it, as the systematization of knowledge provided by a model could help us make better use of available knowledge. Another difficulty for the ambitious modeler is the lack of support that can be expected from colleagues through the peer review process. Once an ODE-based model, for example, crosses some threshold of complexity, a modeler can expect little proofreading of equations, questioning of model assumptions, or helpful suggestions on how to improve parameter estimates. This situation points to a need to develop methods for better communicating the content of large models. Finally, one should keep in mind that models should be formulated to address specific questions, not for the sake of modeling. Finding the right questions to ask, the ones that can be answered with the help of large models, is the crux of the matter. Whether, or rather when, these questions can be found will determine when large modeling efforts become commonplace in the future.

    References

    Aldridge BB, Burke JM, Lauffenburger DA and Sorger PK 2006 Physicochemical modelling of cell signalling pathways. Nat. Cell Biol. 8, 1195–1203.

    Alon U 2006 An Introduction to Systems Biology: Design Principles of Biological Circuits. Chapman and Hall/CRC, Boca Raton, FL.

    Andrews SS, Addy NJ, Brent R and Arkin AP 2010 Detailed simulations of cell biology with Smoldyn 2.1. PLoS Comput. Biol. 6, e1000705.

    Apgar JF, Witmer DK, White FM and Tidor B 2010 Sloppy models, parameter uncertainty, and the role of experimental design. Mol. Biosyst. 6, 1890–1900.

    Bashor CJ, Horwitz AA, Peisajovich SG and Lim WA 2010 Rewiring cells: synthetic biology as a tool to interrogate the organizational principles of living systems. Annu. Rev. Biophys. 39, 515–537.

    Batt G, Ropers D, de Jong, H, Geiselmann J, Mateescu R, Page M and Schneider D 2005 Validation of qualitative models of genetic regulatory networks by model checking: analysis of the nutritional stress response in Escherichia coli. Bioinformatics 21, i19–i28.

    Blinov ML, Faeder JR, Goldstein B and Hlavacek WS 2004 BioNetGen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics 20, 3289–3291.

    Blinov ML et al. 2005 ‘On-the-fly’ or ‘generate-first’ modeling? Nat. Biotechnol. 23, 1344–1345.

    Bongard J and Lipson H 2007 Automated reverse engineering of nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 104, 9943–9948.

    Borisov NM, Chistopolsky AS, Faeder JR and Kholodenko BN 2008 Domain-oriented reduction of rule-based network models. IET Syst. Biol. 2, 342–351.

    Bradshaw JM 2010 The Src, Syk, and Tec family kinases: distinct types of molecular switches. Cell. Signal. 22, 1175–1184.

    Brandman O and Meyer T 2008 Feedback loops shape cellular signals in space and time. Science 322, 390–395.

    Bray D 2003 Molecular prodigality. Science 299, 1189–1190.

    Calzone L, Fages F and Soliman S 2006 BIOCHAM: an environment for modeling biological systems and formalizing experimental knowledge. Bioinformatics 22, 1805–1807.

    Chen KC, Calzone L, Csikasz-Nagy, A, Cross FR, Novak B and Tyson JJ 2004 Integrative analysis of cell cycle control in budding yeast. Mol. Biol. Cell 15, 3841–3862.

    Chen WW, Schoeberl B, Jasper PJ, Niepel M, Nielsen UB, Lauffenburger DA and Sorger PK 2009 Input-output behavior of ErbB signaling pathways as revealed by a mass action model trained against dynamic data. Mol. Syst. Biol. 5, 239.

    Choudhary C and Mann M 2010 Decoding signalling networks by mass spectrometry-based proteomics. Nat. Rev. Mol. Cell Biol. 11, 427–439.

    Ciaccio MF, Wagner JP, Chuu CP, Lauffenburger DA, and Jones RB 2010 Systems analysis of EGF receptor signaling dynamics with microwestern arrays. Nat. Methods 7, 148–155.

    Conzelmann H and Gilles ED 2008 Dynamic pathway modeling of signal transduction networks: a domain-oriented approach. Methods Mol. Biol. 484, 559–578.

    Colvin J, Monine MI, Faeder JR, Hlavacek WS, Von Hoff, DD and Posner, RG 2009 Simulation of large-scale rule-based models. Bioinformatics 25, 910–917.

    Colvin J, Monine MI, Gutenkunst RN, Hlavacek WS, Von Hoff, DD and Posner, RG 2010 RuleMonkey: software for stochastic simulation of rule-based models. BMC Bioinformatics 11, 404.

    Czech J, Dittrich M and Stiles JR 2009 Rapid creation, Monte Carlo simulation, and visualization of realistic 3D cell models. Methods Mol. Biol. 500, 237–287.

    Danos V, Feret J, Fontana W and Krivine J 2007 Scalable simulation of cellular signaling networks. Lect. Notes Comput. Sci. 4807, 139–157.

    Danos V, Feret J, Fontana W, Harmer R and Krivine J 2009 Rule-based modelling and model perturbation. Lect. Notes Comput. Sci. 5750, 116–137.

    Deininger M, Buchdunger E and Druker BJ 2005 The development of imatinib as a therapeutic agent for chronic myeloid leukemia. Blood 105, 2640–2653.

    Dinkel H et al. 2011 Phospho.ELM: a database of phosphorylation sites - update 2011. Nucleic Acids Res. 39, D261–D267.

    Ederer M and Gilles ED 2007 Thermodynamically feasible kinetic models of reaction networks. Biophys. J. 92, 1846–1857.

    Ederer M and Gilles ED 2008 Thermodynamic constraints in kinetic modeling: thermodynamic-kinetic modeling in comparison to other approaches. Eng. Life Sci. 8, 467–476.

    Edwards JS et al. 2007 q-bio 2007: a watershed moment in modern biology. Mol. Syst. Biol. 3, 148.

    Endy D and Brent R 2001 Modelling cellular behaviour. Nature 409, 391–395.

    Faeder JR, Blinov ML, Goldstein B and Hlavacek WS 2005 Rule-based modeling of biochemical networks. Complexity 10, 22–41.

    Faeder JR, Blinov ML and Hlavacek WS 2009 Rule-based modeling of biochemical systems with BioNetGen. Methods Mol. Biol. 500, 113–167.

    Feret J, Danos V, Krivine J, Harmer R and Fontana W 2009 Internal coarse-graining of molecular systems. Proc. Natl. Acad. Sci. USA 106, 6453–6458.

    Gillespie DT 1977 Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361.

    Gillespie DT 2007 Stochastic simulation of chemical kinetics. Annu. Rev. Phys. Chem. 58, 35–55.

    Gonzalez-Angulo AM, Hennessy BT and Mill GB 2010 Future of personalized medicine in oncology: a systems biology approach. J. Clin. Oncol. 28, 2777–2783.

    Gong H, Zuliani P, Komuravelli A, Faeder JR and Clarke EM 2010 Analysis and verification of the HMGB1 signaling pathway. BMC Bioinformatics 11, S10.

    Gould CM et al. 2010 ELM: the status of the 2010 eukaryotic linear motif resource. Nucleic Acids Res. 38, D167–D180.

    Gruenert G, Ibrahim B, Lenser T, Lohel M, Hinze T and Dittrich P 2010 Rule-based spatial modeling with diffusing, geometrically constrained molecules. BMC Bioinformatics 11, 307.

    Grünberg R and Serrano L 2010 Strategies for protein synthetic biology. Nucleic Acids Res. 38, 2663–2675.

    Gstaiger M and Aebersold R 2009 Applying mass spectrometry-based proteomics to genetics, genomics and network biology. Nat. Rev. Genet. 10, 617–627.

    Hanahan D and Weinberg RA 2000 The hallmarks of cancer. Cell 100, 57–70.

    Harcourt AV and Esson W 1865 On the laws of connexion between the conditions of a chemical change and its amount. Proc. R. Soc. 14, 470–474.

    Harmer R, Danos V, Feret J, Krivine J and Fontana W 2010 Intrinsic information carriers in combinatorial dynamical systems. Chaos 20, 037108.

    Hilbert D 1902 Mathematical problems. Bull. Am. Math. Soc. 8, 437–479.

    Hlavacek WS 2009a How to deal with large models? Mol. Syst. Biol. 5, 240.

    Hlavacek WS and Faeder JR 2009b The complexity of cell signaling and the need for a new mechanics. Sci. Signal. 2, pe46.

    Hlavacek WS, Faeder JR, Blinov ML, Perelson AS and Goldstein B 2003 The complexity of complexes in signal transduction. Biotechnol. Bioeng. 84, 783–794.

    Hlavacek WS, Faeder JR, Blinov ML, Posner RG, Hucka M and Fontana W 2006 Rules for modeling signal-transduction systems. Sci. STKE 2006, re6.

    Hunter T 2000 Signaling–2000 and beyond. Cell 100, 113–127.

    Hunter T 2007 Treatment for chronic myelogenous leukemia: the long road to imatinib. J. Clin. Invest. 117, 2036–2043.

    Jenkinson G, Zhong X and Goutsias J 2010 Thermodynamically consistent Bayesian analysis of closed biochemical reaction systems. BMC Bioinformatics 11, 547.

    Kandasamy K, Mohan SS, Raju R, Keerthikumar S, Sameer Kumar GS, Venugopal AK, Telikicherla D, Navarro JD, Mathivanan S, Pecquet C, Kanth Gollapudi S, Gopal Tattikota S, Mohan S, Padhukasahasram H, Subbannayya Y, Goel R, Jacob HKC, Zhong J, Sekhar R, Nanjappa V, Balakrishnan L, Subbaiah R, Ramachandra YL, Abdul Rahiman B, Keshava Prasad TS, Lin JX, Houtman JCD, Desiderio S, Renauld JC, Constantinescu SN, Ohara O, Hirano T, Kubo M, Singh S, Khatri P, Draghici S, Bader GD, Sander C, Leonard WJ and Pandey A 2010 NetPath: a public resource of curated signal transduction pathways. Genome Biol. 11, R3.

    Keasling JD 2010 Manufacturing molecules through metabolic engineering. Science 330, 1355–1358.

    Khalil AS and Collins JJ 2010 Synthetic biology: applications come of age. Nat. Rev. Genet. 11, 367–379.

    Kholodenko BN, Demin OV, Moehren G and Hoek JB 1999 Quantification of short term signaling by the epidermal growth factor receptor. J. Biol. Chem. 274, 30169–30181.

    Kholodenko BN 2006 Cell-signalling dynamics in time and space. Nat. Rev. Mol. Cell Biol. 7, 165–176.

    Kholodenko BN, Hancock JF and Kolch W 2010 Signalling ballet in space and time. Nat. Rev. Mol. Cell Biol. 11, 414–426.

    Kitano H (ed.) 2001 Foundations of Systems Biology. MIT Press, Cambridge, MA.

    Kitano H 2002 Systems biology: a brief overview. Science 295, 1662–1664.

    Kitano H, Funahashi A, Matsuoka Y and Oda K 2005 Using process diagrams for the graphical representation of biological networks. Nat. Biotechnol. 23, 961–966.

    Klipp E, Wade RC and Kummer U 2010 Biochemical network-based drug-target prediction. Curr. Opin. Biotechnol. 21, 511–516.

    Kohn KW 1999 Molecular interaction map of the mammalian cell cycle control and DNA repair systems. Mol. Biol. Cell 10, 2703–2734.

    Kolch W and Pitt A 2010 Functional proteomics to dissect tyrosine kinase signalling pathways in cancer. Nat. Rev. Cancer 10, 618–629.

    Koschorreck M and Gilles ED 2008 ALC: automated reduction of rule-based models. BMC Syst. Biol. 2, 91.

    Kreeger PK and Lauffenburger DA 2010 Cancer systems biology: a network modeling perspective. Carcinogenesis 31, 2–8.

    Lander AD 2010 The edges of understanding. BMC Biol. 8, 40.

    Lemmon MA and Schlessinger J 2010 Cell signaling by receptor tyrosine kinases. Cell 141, 1117–1134.

    Le Novère N, Hucka M, Mi H, Moodie S, Schreiber F, Sorokin A, Demir E, Wegner K, Aladjem MI, Wimalaratne SM, Bergman FT, Gauges R, Ghazal P, Kawaji H, Li L, Matsuoka Y, Villeger A, Boyd SE, Calzone L, Courtot M, Dogrusoz U, Freeman TC, Funahashi A, Ghosh S, Jouraku A, Kim S, Kolpakov F, Luna A, Sahle S, Schmidt E, Watterson S, Wu G, Goryanin I, Kell DB, Sander C, Sauro H, Snoep JL, Kohn K and Kitano H 2009 The systems biology graphical notation. Nat. Biotechnol. 27, 735–741.

    Lim WA 2010 Designing customized cell signalling circuits. Nat. Rev. Mol. Cell Biol. 11, 393–403.

    Lis M, Artyomov MN, Devadas S and Chakraborty AK 2009 Efficient stochastic simulation of reaction-diffusion processes via direct compilation. Bioinformatics 25, 2289–2291.

    Lok L and Brent R 2005 Automatic generation of cellular reaction networks with Moleculizer 1.0. Nat. Biotechnol. 23, 131–136.

    Mallavarapu A, Thomson M, Ullian B and Gunawardena J 2009 Programming with models: modularity and abstraction provide powerful capabilities for systems biology. J. R. Soc. Interface 6, 257–270.

    Mariotto AB et al. 2011 Projections of the cost of cancer care in the United States: 2010–2020. J. Natl. Cancer Inst. 103, 117–128.

    Matthews L, Gopinath G, Gillespie M, Caudy M, Croft D, de Bono B, Garapati P, Hemish J, Hermjakob H, Jassal B, Kanapin A, Lewis S, Mahajan S, May B, Schmidt E, Vastrik I, Wu G, Birney E, Stein L and D'Eustachio P 2009 Reactome knowledge base of human biological pathways and processes. Nucleic Acids Res. 37, D619–D622.

    Mayer BJ, Blinov ML and Loew LM 2009 Molecular machines or pleiomorphic ensembles: signaling complexes revisited. J. Biol. 8, 81.

    Mazel T, Raymond R, Raymond-Stintz M, Jett S and Wilson BS 2009 Stochastic modeling of calcium in 3D geometry. Biophys. J. 96, 1691–1706.

    McAdams HH and Arkin A 1997 Stochastic mechanisms in gene expression. Proc. Natl. Acad. Sci. USA 94, 814–819.

    McAdams HH and Arkin A 1999 It's a noisy business! Genetic regulation at the nanomolar scale. Trends Genet. 15, 65–69.

    Meier-Schellersheim M, Xu K, Angermann B, Kunkel EJ, Jin T and Germain RN 2006 Key role of local regulation in chemosensing revealed by a new molecular interaction-based modeling method. PLoS Comput. Biol. 2, e82.

    Monine MI, Posner RG, Savage PB, Faeder JR and Hlavacek WS 2010 Modeling multivalent ligand-receptor interactions with steric constraints on configurations of cell-surface receptor aggregates. Biophys. J. 96, 48–56.

    Moraru II, Schaff JC, Slepchenko BM, Blinov ML, Morgan F, Lakshminarayana A, Gao F, Li Y and Loew LM 2008 Virtual Cell modelling and simulation software environment. IET Syst. Biol. 2, 352–362.

    Morton-Firth CJ and Bray D 1998 Predicting temporal fluctuations in an intracellular signalling pathway. J. Theor. Biol. 192, 117–128.

    Mukherji S and van Oudenaarden A 2009 Synthetic biology: understanding biological design from synthetic circuits. Nat. Rev. Genet. 10, 859–871.

    Novák B and Tyson JJ 2008 Design principles of biochemical oscillators. Nat. Rev. Mol. Cell Biol. 9, 981–991.

    Nurse P 2008 Life, logic and information. Nature 454, 424–426.

    Ollivier JF, Shahrezaei V and Swain PS 2010 Scalable rule-based modelling of allosteric proteins and biochemical networks. PLoS Comput. Biol. 6, e1000975.

    Pawson T and Kofler M 2009 Kinome signaling through regulated protein–protein interactions in normal and cancer cells. Curr. Opin. Cell Biol. 21, 147–153.

    Pawson T and Nash P 2003 Assembly of cell regulatory systems through protein interaction domains. Science 300, 445–452.

    Plimpton SJ 1995 Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19.

    Ptashne M and Gann A 2001 Genes & Signals. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

    Rafelski SM and Marshall WF 2008 Building the cell: design principles of cellular architecture. Nat. Rev. Mol. Cell Biol. 9, 593–602.

    Rose PW, Beren B, Bi C, Bluhm WF, Dimitropoulos D, Goodsell DS, Prlić A, Quesada M, Quinn GB, Westbrook JD, Young J, Yukich B, Zardecki C, Berman HM and Bourne PE 2011 The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Res. 39, D392–D401.

    Schoeberl B, Eichler-Jonsson C, Gilles ED and Müller G 2002 Computational modeling of the dynamics of the MAP kinase cascade activated by surface and internalized EGF receptors. Nat. Biotechnol. 20, 370–375.

    Scott JD and Pawson T 2009 Cell signaling in space and time: where proteins come together and when they’re apart. Science 326, 1220–1224.

    Sneddon MW, Faeder JR and Emonet T 2010 Efficient modeling, simulation and coarse-graining of biological complexity with NFsim. Nat. Methods doi:10.1038/nmeth.1546.

    Tyson JJ and Novák B 2010 Functional motifs in biochemical reaction networks. Annu. Rev. Phys. Chem. 61, 219–240.

    Tyson JJ, Chen KC and Novak B 2003 Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr. Opin. Cell Biol. 15, 221–231.

    Vlad MO and Ross J 2009 Thermodynamically based constraints for rate coefficients of large biochemical networks. WIREs Syst. Biol. Med. 1, 348–358.

    Voter AF 2007 Introduction to the kinetic Monte Carlo method. In Radiation Effects in Solids (eds Sickafus KE, Kotomin EA, and Uberuaga BP), pp. 1–23. Springer, Dordrecht.

    Wall ME, Hlavacek WS and Savageau MA 2004 Design of gene circuits: lessons from bacteria. Nat. Rev. Genet. 5, 34–42.

    Walsh CT, Garneau-Tsodikova S and Gatto Jr, GJ 2005 Protein posttranslational modifications: the chemistry of proteome diversifications. Angew. Chem. Int. Ed. Engl. 44, 7342–7372.

    Xu AM and Huang PH 2010 Receptor tyrosine kinase coactivation networks in cancer. Cancer Res. 70, 3857–3860.

    Yang J, Bruno WJ, Hlavacek WS and Pearson JE 2006 On imposing detailed balance in complex reaction mechanisms. Biophys. J. 91, 1136–1141.

    Yang J, Monine MI, Faeder JR and Hlavacek WS 2008 Kinetic Monte Carlo method for rule-based modeling of biochemical networks. Phys. Rev. E 78, 031910.

    Chapter 2

    Introduction to Statistical Methods for Complex Systems

    Tristan Mary-Huard and Stéphane Robin

    Agro ParisTech and INRA, Paris, France

    2.1 Introduction

    The aim of the present chapter is to introduce and illustrate some concepts of statistical inference useful in systems biology. Here we limit ourselves to the classical, so-called ‘frequentist’ statistical inference where parameters are fixed quantities that need to be estimated. The Bayesian approach will be presented in Chapter 3.

    Modelling and inference techniques are illustrated in three recurrent problems in systems biology:

    Class comparison aims at assessing the effect of some treatment or experimental condition on some biological response. This requires proper statistical modelling to account for the experimental design, various covariates or stratifications or dependence between the measurements. As systems biology often deals with high-throughput technologies, it also raises multiple testing issues.

    Class prediction refers to learning techniques that aim at building a rule to predict the status (e.g. well or ill) of an individual, based on a set of biological descriptors. An exhaustive list of classification algorithms is out of reach, but general techniques such as regularization or aggregation are of prime interest in systems biology, where the number of variables often far exceeds the number of observations. Evaluating the performance of a classifier also requires relevant tools.

    Class discovery aims at uncovering some structure in a set of observations. These techniques include distance-based or model-based clustering methods and make it possible to identify distinct groups of individuals in the absence of a prior classification. However, the underlying structure may have more complex forms, each raising specific issues in terms of inference.

    This chapter focuses on generic statistical concepts and methods that can be applied no matter which technology is used for data acquisition. In practice, applications to any biological problem will necessitate both a relevant strategy for data collection and a careful tuning of the methods to obtain meaningful results. These two steps of data collection (or experimental design conception) and adaptation of the generic methods require taking into account the nature of the data. They therefore depend on the data acquisition technology and will be discussed in Part B of this Handbook.

    In this chapter, the data are assumed to arise from a static process. The analysis of a dynamic biological system would require more sophisticated methods, such as partial differential equations or network modelling.

    These topics are not discussed here as they will be reviewed in depth in Parts C and D.

    Lastly, a basic knowledge in statistics is assumed, covering topics including point estimation (in particular maximum likelihood estimation), hypothesis testing, and a background in regression and linear models.

    2.2 Class Comparison

    We consider here the general problem of assessing the effect of some treatment, experimental condition or covariate on some response. We first address the problem of modelling the data resulting from the experiments, focusing on how to account for the dependency between the observations. We then turn to the problem of multiple testing, which is recurrent in high-throughput data analyses.

    2.2.1 Models for Dependent Data

    Many biological experiments aim at observing the effects of a given treatment (or combination of treatments) on a given response. ‘Treatment’ is used here in a very broad sense, including controlled experimental conditions, uncontrolled covariates, time, population structure, etc. In the following, $n$ will stand for the total number of experiments.

    Linear (Gaussian) models (Searle 1971; Dobson 1990) provide a general framework to describe the influence of a set of controlled conditions and/or uncontrolled covariates, summarized in an $n \times p$ design matrix $X$, on the observed responses gathered in an $n$-dimensional vector $Y$ as

    (2.1) $Y = X\theta + E, \qquad E \sim \mathcal{N}(0, \Sigma),$

    where $\theta$ is the $p$-dimensional vector containing all parameters. In the most classical setting, the response is supposed to be Gaussian, and the dependency structure between the observations is then fully specified by the (co-)variance matrix $\Sigma$, which contains the variance of each observation on the diagonal, and the covariances between pairs of observations elsewhere. In the simplest setting, the responses are supposed to be independent with the same variance $\sigma^2$, that is $\Sigma = \sigma^2 I_n$.

    2.2.1.1 Writing the Right (Mixed) Model

    In more complex experiments, the assumption that observations are independent does not hold and the structure of $\Sigma$ needs to be adapted. Because it contains $n(n+1)/2$ parameters, the shape of $\Sigma$ has to be strongly constrained to allow good inference. We first present here some typical experimental settings and the associated dependency structures.

    Variance Components

    Consider the study of the combined effects of the genotype (indexed by $i$) and of the cell type (indexed by $j$) on some gene expression. Several individuals (indexed by $k$) from each genotype are included and cells from each type are harvested in each of them. In such a setting the expected response is $\mathbb{E}(Y_{ijk}) = \mu_{ij}$, which is often decomposed into a genotype effect, a cell type effect and an interaction as $\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$.

    The most popular way to account for the dependency between measures obtained on the same individual is to add a random term associated with each individual. The complete model can then be written as

    (2.2) $Y_{ijk} = \mu_{ij} + U_{ik} + E_{ijk}$

    where all $U_{ik}$ and $E_{ijk}$ are independent centred Gaussian variables with variance $\sigma^2_U$ and $\sigma^2_E$, respectively. The variance of one observation is then $\mathbb{V}(Y_{ijk}) = \sigma^2_U + \sigma^2_E$, where $\sigma^2_U$ is the ‘biological’ variance and $\sigma^2_E$ is the ‘technical’ one (Kerr and Churchill 2001). The random effect $U_{ik}$ induces a uniform correlation between observations from the same individual since:

    (2.3) $\mathrm{Cov}(Y_{ijk}, Y_{ij'k}) = \sigma^2_U \quad \text{for } j \neq j',$

    and 0 if the observations come from different individuals. The matrix form of this model is a generalization of (2.1):

    (2.4) $Y = X\theta + ZU + E$

    where $Z$ describes the individual structure: each row corresponds to one measurement and each column to one individual, and $Z$ contains a 1 at the intersection if the measurement has been made on the individual, and a 0 otherwise. The denomination ‘mixed’ of ‘linear mixed models’ comes from the simultaneous presence of fixed effects ($\theta$) and random effects ($U$). It corresponds to the simplest form of so-called ‘variance components’ models. The variance matrix corresponding to (2.3) is $\Sigma = \sigma^2_U Z Z^\top + \sigma^2_E I_n$. Application of such a model to gene expression data can be found in Wolfinger et al. (2001) or Tempelman (2008).
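    As a rough illustration of fitting such a variance-components model in practice, the sketch below simulates data with a per-individual random effect and fits a linear mixed model with the Python package statsmodels; the column names, sample sizes and variance values are illustrative assumptions, and MixedLM is only one of several suitable tools (lme4 in R is another common choice).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Toy design: 2 genotypes x 3 cell types, 5 individuals per genotype.
rows = []
for geno in ("G1", "G2"):
    for ind in range(5):
        u = rng.normal(0.0, 1.0)   # 'biological' random effect of the individual
        for cell in ("A", "B", "C"):
            mu = {"G1": 0.0, "G2": 1.0}[geno] + {"A": 0.0, "B": 0.5, "C": 1.0}[cell]
            rows.append({"genotype": geno, "celltype": cell,
                         "individual": f"{geno}_{ind}",
                         "expr": mu + u + rng.normal(0.0, 0.5)})  # 'technical' noise
data = pd.DataFrame(rows)

# Fixed effects: genotype, cell type and their interaction;
# random intercept per individual, as in Equation (2.2).
model = smf.mixedlm("expr ~ genotype * celltype", data, groups=data["individual"])
fit = model.fit(reml=True)   # ReML estimation (see Section 2.2.1.2)
print(fit.summary())
```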

    Repeated Measurements

    One considers a similar design where, in place of cell types, we compare successive harvesting times (indexed by $t$) within each individual. The uniform correlation within each individual given in (2.3) may then seem inappropriate, for it does not account for the delay between times of observation. A common dependency form is then the so-called ‘autoregressive’ structure, which states that, for two observations made at times $t$ and $t'$ on the same individual,

    $\mathrm{Cov}(Y_{itk}, Y_{it'k}) = \sigma^2_U\, \rho^{|t - t'|}, \qquad 0 \le \rho < 1,$

    and 0 otherwise. This is to assume that the correlation decreases (at an exponential rate) with the time delay. Such a variance structure cannot be put in a simple matrix form similar to (2.4). Note that Equation (2.1) is still valid, but with a nondiagonal variance matrix $\Sigma$.

    Spatial Dependency

It is also desirable to account for spatial dependency when observations have some spatial localization. Suppose one wants to compare treatments (indexed by $i$), and that the replicates ($k = 1, \dots, K$) have respective localizations $x_{ik}$. A typical variance structure (Cressie 1993) is

$\mathrm{Cov}(Y_{ik}, Y_{i'k'}) = \sigma^2 \exp\left[-\delta\, d(x_{ik}, x_{i'k'})\right] + \tau^2\, \mathbb{1}\{(i,k) = (i',k')\},$

where $d(\cdot,\cdot)$ is a distance, $\tau^2$ accounts for the measurement error variability and $\delta$ controls the speed at which the dependency decreases with distance.
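A possible encoding of such a spatial structure is nlme::gls with an exponential spatial correlation and a nugget (measurement error) term; the coordinates and responses below are simulated and purely illustrative.

```r
## Sketch: exponential spatial correlation with a nugget effect via nlme::gls.
library(nlme)
set.seed(4)
d <- data.frame(treatment = factor(rep(1:3, each = 10)),
                xcoord = runif(30), ycoord = runif(30))
d$y <- as.numeric(d$treatment) + rnorm(30)
fit <- gls(y ~ treatment, data = d,
           correlation = corExp(form = ~ xcoord + ycoord, nugget = TRUE))
fit$modelStruct$corStruct   # estimated range and nugget parameters
```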

The dependency structures described above can of course be combined, and this list is far from exhaustive. In practice, the limitation often comes from the software at hand or from the specific computational developments one is able to make. A large catalogue of such structures can be found in software such as SAS (2002-03) or R (www.r-project.org).

    2.2.1.2 Inference

    Some problems related to the inference of mixed linear models are still unresolved. We only provide here an introduction to the most popular approaches and emphasize some practical issues that can be faced when using them.

    Estimation

Mixed model inference requires estimating both $\theta$ and $\Sigma$. We start with the estimation of $\Sigma$, which reduces to the estimation of a few variance parameters, such as $\sigma_U^2$ and $\sigma_E^2$ in the examples given above.

Moment estimates can be obtained (Searle 1971; Demidenko 2004), typically for variance component models. Such estimates are often based on sums of squares, that is, squared distances between $Y$ and its projection onto various linear spaces, such as span$(X)$, span$(Z)$ or span$(X, Z)$. The expectation of these sums of squares can often be related to the different variance parameters, and the estimation then reduces to solving a set of linear equations.

The maximum likelihood (ML) estimator is defined as

$(\widehat{\theta}, \widehat{\Sigma}) = \arg\max_{\theta, \Sigma} \log L(Y; \theta, \Sigma)$

and can be used for all models. Unfortunately, ML variance estimates are known to be biased in many (almost all) situations, because both $\theta$ and $\Sigma$ have to be estimated at the same time. The most popular way to circumvent this problem consists of changing to a model where the expectation is known (Verbeke and Molenberghs 2000). Defining some matrix $T$ such that $TX = 0$, we may define the Gaussian vector $\tilde{Y} = TY$, which satisfies

$\tilde{Y} \sim \mathcal{N}(0, T \Sigma T').$

The most natural choice for $T$ is the projector onto the linear space orthogonal to span$(X)$. The so-called 'restricted' maximum likelihood (ReML) estimator of $\Sigma$ is then defined as

$\widehat{\Sigma}_{\mathrm{ReML}} = \arg\max_{\Sigma} \log L(\tilde{Y}; \Sigma).$

Note that ReML estimates are ML estimates (in the transformed model) and therefore inherit all their properties (consistency, asymptotic normality). Also note that ReML provides less biased variance estimates than ML; the ReML estimates can be shown to be unbiased only in some specific cases, such as orthogonal designs, where they are equivalent to moment estimates.
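The ML/ReML distinction is directly visible in software; the sketch below refits a small simulated variance components data set with lme4 while switching the REML flag (illustrative only, with arbitrary simulated values).

```r
## ML vs ReML variance estimates with lme4 on simulated variance components data.
library(lme4)
set.seed(5)
dat <- expand.grid(genotype = factor(1:3), individual = 1:4, celltype = factor(1:2))
dat$id <- interaction(dat$genotype, dat$individual)
dat$y <- rnorm(nlevels(dat$id), sd = 1)[as.integer(dat$id)] +   # individual effects
         as.numeric(dat$genotype) + rnorm(nrow(dat), sd = 0.3)  # fixed part + technical noise
fit_ml   <- lmer(y ~ genotype * celltype + (1 | id), data = dat, REML = FALSE)  # ML
fit_reml <- lmer(y ~ genotype * celltype + (1 | id), data = dat, REML = TRUE)   # ReML
VarCorr(fit_ml)    # ML variances, typically biased downwards
VarCorr(fit_reml)  # ReML variances
```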

The estimates of $\theta$ (and the variance of this estimator) are the same for ML and generalized least squares (Searle 1971):

$\widehat{\theta} = (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1} Y, \qquad \mathbb{V}(\widehat{\theta}) = (X'\Sigma^{-1}X)^{-1}.$

These estimates are often calculated with a simple plug-in of one of the estimates of $\Sigma$ described above. However, as it relies on the shape of $\Sigma$ specified in the model, such an estimator of $\mathbb{V}(\widehat{\theta})$ is not robust to a misspecification of the dependency structure. Alternative estimators, such as the 'sandwich' estimator, can be defined that are both consistent and robust (Diggle et al. 2002).
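For concreteness, the generalized least squares estimator and its variance can be computed directly from these formulas; the function below is a small sketch for a user-supplied covariance matrix, not a general-purpose tool.

```r
## Direct computation of the GLS estimator and its variance for a given Sigma.
gls_estimate <- function(X, Y, Sigma) {
  Sinv <- solve(Sigma)
  V <- solve(t(X) %*% Sinv %*% X)          # variance of the estimator: (X' Sigma^-1 X)^-1
  theta_hat <- V %*% t(X) %*% Sinv %*% Y   # (X' Sigma^-1 X)^-1 X' Sigma^-1 Y
  list(theta = drop(theta_hat), variance = V)
}
## Tiny example with an arbitrary compound-symmetry covariance on 4 observations
X <- cbind(1, c(0, 0, 1, 1))
Sigma <- 0.5 * diag(4) + 0.2 * matrix(1, 4, 4)
Y <- c(1.1, 0.9, 2.2, 1.8)
gls_estimate(X, Y, Sigma)
```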

    Tests

As many experiments aim to assess the significance of a given effect, we are often interested in testing the hypothesis $H_0$ stating that the corresponding contrast between the elements of $\theta$ is null. The global procedure is similar to that of model (2.1): writing the contrast under study as a linear combination $c'\theta$ of the parameters, we get the usual test statistic:

(2.5) $T = \dfrac{c'\widehat{\theta}}{\sqrt{\widehat{\mathbb{V}}(c'\widehat{\theta})}}$

A central practical issue is the determination of the (approximate) degrees of freedom of the Student distribution, as they depend on both the experimental design and the contrast itself.

As an example, in model (2.2), we first consider the average difference between genotypes $i$ and $i'$, $\delta^G = \mu_{i\cdot} - \mu_{i'\cdot}$, which is estimated by $\widehat{\delta}^G = Y_{i\cdot\cdot} - Y_{i'\cdot\cdot}$ (the notation '$\cdot$' means that the variable is averaged over the index it replaces) (Searle 1971). We then consider the difference between two cell types $j$ and $j'$, which is $\delta^C = \mu_{\cdot j} - \mu_{\cdot j'}$, with estimate $\widehat{\delta}^C = Y_{\cdot j \cdot} - Y_{\cdot j' \cdot}$. If the design is completely balanced with $I$ genotypes, $K$ individuals per genotype and $J$ cell types within each individual, the respective variances of these estimates are

$\mathbb{V}(\widehat{\delta}^G) = \frac{2}{K}\left(\sigma_U^2 + \frac{\sigma_E^2}{J}\right), \qquad \mathbb{V}(\widehat{\delta}^C) = \frac{2\,\sigma_E^2}{IK}.$

We see here that the biological variance $\sigma_U^2$ does not affect the contrast on the cell types (which are harvested within each individual), whereas it is predominant for the genotype contrast, which refers to differences between individuals. The distribution of the test statistic is Student in both cases, but with different degrees of freedom: $I(K-1)$ for the genotype and $I(K-1)(J-1)$ for the cell types. The intuition is that the relevant number of observations is the number of individuals for the genotype, whereas it is the total number of measurements for the cell type. Hence, in such a design the power will be (much) greater for distinguishing cell types than genotypes.
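These two variance formulas can be checked by a small Monte Carlo simulation; the sketch below assumes the balanced design described above, with arbitrary values for $I$, $J$, $K$ and the two variance components.

```r
## Monte Carlo check of the two contrast variances in the balanced design.
set.seed(6)
I <- 3; K <- 4; J <- 2; sU <- 1; sE <- 0.5   # arbitrary design sizes and standard deviations
sim_once <- function() {
  U <- matrix(rnorm(I * K, sd = sU), I, K)             # individual effects U_ik
  Y <- array(rnorm(I * J * K, sd = sE), c(I, J, K)) +  # technical noise E_ijk
       aperm(array(U, c(I, K, J)), c(1, 3, 2))         # add U_ik to every cell type j
  c(geno = mean(Y[1, , ]) - mean(Y[2, , ]),            # genotype contrast (true value 0)
    cell = mean(Y[, 1, ]) - mean(Y[, 2, ]))            # cell type contrast (true value 0)
}
est <- replicate(10000, sim_once())
apply(est, 1, var)                                     # empirical variances
c(geno = 2 / K * (sU^2 + sE^2 / J), cell = 2 * sE^2 / (I * K))  # theoretical values
```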

For more complex dependency structures, such as repeated measurements or spatial data, the distribution of (2.5) is only approximately Student and the degrees of freedom need to be approximated (Fai and Cornelius 1996; Kenward and Roger 1997; Khuri et al. 1998).

    2.2.1.3 Generalized Linear Mixed Models

Observations cannot always be modelled with a Gaussian distribution. The linear model can however easily be generalized to most usual distributions, such as binomial, Poisson, Gamma, etc., giving rise to the 'generalized' linear model (Dobson 1990; Demidenko 2004; Zuur et al. 2009). A key ingredient is the introduction of a 'link function' $g$ between the expectation of the responses and the linear predictor, as $g[\mathbb{E}(Y)] = X\theta$.

For non-Gaussian models, a canonical parametrization of the correlation between the observations does not always exist. A specific modelling of the dependency is then required. As an example, in the Poisson case, the variance components model (2.4) will typically be rewritten as

$g[\mathbb{E}(Y \mid U)] = X\theta + Z \log U,$

where $g$ is the log function and the coordinates of $U$ are independent Gamma distributed variables. However, the Gaussian modelling can be re-used in all generalized linear models, stating that

$g[\mathbb{E}(Y \mid U)] = X\theta + ZU,$

where $U$ is a centred Gaussian vector with a variance matrix similar to those seen above in the Gaussian context.
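A Poisson GLMM with a Gaussian random intercept can be fitted, for instance, with lme4::glmer and a log link; the counts and variable names below are simulated assumptions for illustration.

```r
## Sketch: Poisson GLMM with a Gaussian random individual effect (log link).
library(lme4)
set.seed(7)
d <- expand.grid(treatment = factor(1:2), individual = factor(1:10), rep = 1:3)
u <- rnorm(nlevels(d$individual), sd = 0.5)                    # Gaussian random effects
eta <- log(5) + 0.4 * (d$treatment == "2") + u[as.integer(d$individual)]
d$count <- rpois(nrow(d), lambda = exp(eta))                   # counts, conditional on u
fit <- glmer(count ~ treatment + (1 | individual), data = d, family = poisson)
summary(fit)$coefficients                                      # fixed effects on the log scale
```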

    2.2.2 Multiple Testing

Models such as those described in the preceding section allow one to assess or reject the existence of some effect on a given response. They typically allow one to assess whether a given gene is differentially expressed between several conditions in microarray experiments, accounting for possible covariates and dependency between the replicates. Because of the large number of genes spotted on each microarray, we are then faced with a multiple testing problem, since we get one test statistic similar to (2.5) for each of the genes. This example has become commonplace (Dudoit et al. 2003), but similar issues are encountered in SNP studies, QTL detection and mass spectrometry, that is, in all statistical analyses dealing with high-throughput technologies.

    2.2.2.1 The Basic Problem

    One Test Setting

We first consider the hypothesis $H_g$ that states that the expression level of gene $g$ is not affected by the treatment under study. To assess $H_g$, a test statistic $T_g$ is calculated from the observations, such that a large value of $T_g$ will lead to rejecting $H_g$ (gene $g$ is then said to be 'differentially expressed'). More precisely, denoting $F$ the cumulative distribution function (c.d.f.) of $T_g$ if $H_g$ holds, we calculate a $p$-value defined as

$P_g = 1 - F(T_g)$

and reject $H_g$ if $P_g$ is below a pre-specified level $\alpha$. $P_g$ measures the significance of the test and $\alpha$ is the level of the test, that is, the risk (probability) of rejecting $H_g$ whereas it is true. When testing one hypothesis at a time, $\alpha$ is generally set to 1% or 5%.
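In R, the $p$-value is obtained by evaluating the c.d.f. of the null distribution at the observed statistic; the numbers and the Student reference distribution below are arbitrary illustrations.

```r
## Sketch of the p-value computation for a single gene g.
T_g <- 2.4                                  # observed test statistic, e.g. from (2.5)
df  <- 10                                   # degrees of freedom of the null distribution
p_g <- 1 - pt(T_g, df = df)                 # one-sided p-value, P_g = 1 - F(T_g)
p_g_two <- 2 * (1 - pt(abs(T_g), df = df))  # two-sided version, often used in practice
alpha <- 0.05
c(p_g, p_g_two, reject = p_g < alpha)       # reject H_g at level alpha?
```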

    Multiple Testing

Now consider $m$ null hypotheses $H_1, \dots, H_m$. Suppose that among these hypotheses, $m_0$ actually hold. For a given level $\alpha$, denote $N(\alpha)$ the number of false positives, that is the number of tests for which $H_g$ actually holds whereas $P_g$ falls below $\alpha$. The expected number of false positives is

$\mathbb{E}[N(\alpha)] = m_0\,\alpha.$

This number can exceed several hundreds when $m$ reaches tens of thousands or even millions, as in microarray or SNP analyses. The primary goal of a multiple testing procedure (MTP) is to keep this number of false positives small, by tuning the threshold $\alpha$.
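A quick numerical illustration of this relation (the values of $m_0$ below are arbitrary):

```r
## Expected number of false positives E[N(alpha)] = m0 * alpha for a few values of m0.
alpha <- 0.05
m0 <- c(1e3, 1e4, 1e5)   # hypothetical numbers of true null hypotheses
m0 * alpha               # expected false positives: 50, 500, 5000
```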

Distribution of the p-Value

Most MTPs apply to the set of $p$-values $\{P_g\}$, and the control of $N(\alpha)$ relies essentially on their distribution. If $H_g$ is true, we have

$\Pr(P_g \le t) = \Pr[F(T_g) \ge 1 - t] = t, \qquad t \in [0, 1],$

which means that, when $H_g$ holds, $P_g$ is uniformly distributed over the interval [0, 1]. As we shall see, this property is one of the key ingredients of all MTPs. It is therefore highly recommended to check it graphically by simply plotting the histogram of the $P_g$s: it should show a peak close to 0 (corresponding to truly differentially expressed genes, associated with small $P_g$s) and a roughly uniform profile along the rest of the interval [0, 1] [see Figure 2.1(a)].
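The recommended histogram check is easy to reproduce on simulated $p$-values; the mixture proportion and effect size below are arbitrary choices, meant only to mimic the shape of Figure 2.1(a).

```r
## Simulated p-value histogram: uniform p-values for true nulls, small p-values otherwise.
set.seed(8)
m <- 10000; m0 <- 9000
p_null <- runif(m0)                           # under H_g, P_g ~ U[0, 1]
p_alt  <- 1 - pnorm(rnorm(m - m0, mean = 3))  # shifted statistics give small p-values
hist(c(p_null, p_alt), breaks = 50, main = "", xlab = "p-value")
```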

    Figure 2.1 (a) Typical distribution of the p-values in a multiple testing setting. (b) Receiver operating characteristic (ROC) curves for four different classifiers A, B, C and D.

    2.2.2.2 Global Risk

Because $N(\alpha)$ is random, an MTP aims at controlling some characteristic of its distribution. Such a characteristic can be viewed as a global risk since it considers all the tests at the same time. Our goal is to maintain it at some pre-specified target value $\alpha^*$. We assume that $m_0$ is known; if not, it can be either estimated or replaced by a surrogate (see later).

    Family-Wise Error Rate (FWER)

The most drastic approach is to keep $N(\alpha)$ close to 0, that is to make

$\mathrm{FWER} = \Pr[N(\alpha) \ge 1]$

small. The goal is then to find the right level $\alpha$ to guarantee a targeted FWER $\alpha^*$.

When all tests are assumed to be independent, $N(\alpha)$ has a binomial $\mathcal{B}(m_0, \alpha)$ distribution, so

$\mathrm{FWER} = 1 - (1 - \alpha)^{m_0}, \qquad \text{which equals } \alpha^* \text{ for } \alpha = 1 - (1 - \alpha^*)^{1/m_0}.$

This MTP is known as the Sidak procedure. When independence is not assumed, the FWER can be upper bounded via the Bonferroni inequality, which leads to

$\alpha = \alpha^*/m_0.$

For a small $\alpha^*$, we have

$1 - (1 - \alpha^*)^{1/m_0} \approx \alpha^*/m_0,$

so the two MTPs lead to similar thresholds $\alpha$.
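The two thresholds are easily compared numerically; the sketch below uses an arbitrary $m_0$ and also points to p.adjust, which implements these (and other) corrections in base R, where $m$ usually plays the role of a surrogate for $m_0$.

```r
## Sidak and Bonferroni per-test thresholds for a targeted FWER alpha* = 0.05.
alpha_star <- 0.05
m0 <- 10000                                        # arbitrary illustration
alpha_sidak <- 1 - (1 - alpha_star)^(1 / m0)
alpha_bonf  <- alpha_star / m0
c(sidak = alpha_sidak, bonferroni = alpha_bonf)    # nearly identical for small alpha*
## Ready-made p-value adjustments, e.g. p.adjust(p, method = "bonferroni") or "BH"
```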

    False Discovery Rate (FDR)

    Because
