Computational Systems Biology

A. Khalid and S. K. Jha, “ Calibration of Rule-Based Stochastic Biochemical Models using Statistical Model Checking ,” IEEE International Conference on Bioinformatics and Biomedicine. 2018.Abstract

Computational modeling of biochemical networks assists in understanding the functioning of a biochemical process as well as enables in silico experimentation. However, such models are often stochastic in nature and contain unknown parameters that must be estimated against expert behavioral specifications. In this paper, we present a new statistical model checking approach for calibrating rule-based stochastic biochemical models such that the calibrated model satisfies a given behavioral temporal logic specification. Our approach computes a distribution of aquantitative tightness metric describing how well a specification is satisfied by a parameterized rule-based stochastic biochemical model. Then, sequential selection of the larger of the means of two such distributions is used to guide a simulated annealing based search for the correctly calibrated model that satisfies the given behavioral temporal logic specification. Our experimental results demonstrate the successful calibration of a rule-based T- cell receptor model while achieving a significant speedup in time.

A. Khalid and S. K. Jha, “ Calibration of Rule-Based Stochastic Biochemical Models using Statistical Model Checking ,” International Conference on Bioinformatics and Biomedicine (BIBM). 2018.
A. Khalid and S. K. Jha, “ Parameter Estimation of Stochastic Biochemical Models using Multiple Hypothesis Testing ,” in 8th IEEE International Conference on Computational Advances in Bio and medical Sciences (ICCABS), 2018.
S. Raj, et al., “ A theorem proving approach for automatically synthesizing visualizations of flow cytometry data ,” BMC Bioinformatics, vol. 18, no. 8, pp. 245, 2017. Publisher's VersionAbstract

Polychromatic flow cytometry is a popular technique that has wide usage in the medical sciences, especially for studying phenotypic properties of cells. The high-dimensionality of data generated by flow cytometry usually makes it difficult to visualize. The naive solution of simply plotting two-dimensional graphs for every combination of observables becomes impractical as the number of dimensions increases. A natural solution is to project the data from the original high dimensional space to a lower dimensional space while approximately preserving the overall relationship between the data points. The expert can then easily visualize and analyze this low-dimensional embedding of the original dataset.

S. K. Jha and C. J. Langmead, “ Stochastic computational model parameter synthesis system ”, US Patent: US9558300 B2, 2017.Abstract


A stochastic computational model parameter synthesis system comprising at least one processor capable of executing processor executable code, and a non-transitory computer memory operably coupled with the at least one processor and storing processor executable code, which when executed by processor, causes processor to synthesize at least one parameter of a stochastic computational model to satisfy one or more behavioral specifications of properties observed in a modeled system. The processor generates and searches randomized projections of a first parameter space having n dimensions into one or more second abstract parameter space having d dimensions, where d is less than n, and outputs a signal indicative of a synthesized parameter value to the user.



US Patent App. 13/673,575

F. Hussain, C. J. Langmead, Q. Mi, J. Dutta-Moscato, Y. Vodovotz, and S. K. Jha, “ Automated parameter estimation for biological models using Bayesian statistical model checking ,” BMC bioinformatics, vol. 16, no. 17, pp. S8, 2015.Abstract


Background: Probabilistic models have gained widespread acceptance in the systems biology community as a useful way to represent complex biological systems. Such models are developed using existing knowledge of the structure and dynamics of the system, experimental observations, and inferences drawn from statistical analysis of empirical data. A key bottleneck in building such models is that some system variables cannot be measured experimentally. These variables are incorporated into the model as numerical parameters. Determining values of these parameters that justify existing experiments and provide reliable predictions when model simulations are performed is a key research problem. Domain experts usually estimate the values of these parameters by fitting the model to experimental data. Model fitting is usually expressed as an optimization problem that requires minimizing a cost-function which measures some notion of distance between the model and the data. This optimization problem is often solved by combining local and global search methods that tend to perform well for the specific application domain. When some prior information about parameters is available, methods such as Bayesian inference are commonly used for parameter learning. Choosing the appropriate parameter search technique requires detailed domain knowledge and insight into the underlying system.

Results: Using an agent-based model of the dynamics of acute inflammation, we demonstrate a novel parameter estimation algorithm by discovering the amount and schedule of doses of bacterial lipopolysaccharide that guarantee a set of observed clinical outcomes with high probability. We synthesized values of twenty-eight unknown parameters such that the parameterized model instantiated with these parameter values satisfies four specifications describing the dynamic behavior of the model.

Conclusions: We have developed a new algorithmic technique for discovering parameters in complex stochastic models of biological systems given behavioral specifications written in a formal mathematical logic. Our algorithm uses Bayesian model checking, sequential hypothesis testing, and stochastic optimization to automatically synthesize parameters of probabilistic biological models. 




F. Hussain, et al., “ SANJAY: Automatically synthesizing visualizations of flow cytometry data using decision procedures ,” in Computational Advances in Bio and Medical Sciences (ICCABS), 2015 IEEE 5th International Conference on, 2015, pp. 1–1.
F. Hussain, A. Ramanathan, L. L. Pullum, and S. K. Jha, “ EpiSpec: A formal specification language for parameterized agent-based models against epidemiological ground truth ,” in Computational Advances in Bio and Medical Sciences (ICCABS), 2014 IEEE 4th International Conference on, 2014, pp. 1–6.
F. Hussain, C. J. Langmead, Q. Mi, J. Dutta-Moscato, Y. Vodovotz, and S. K. Jha, “ Parameter discovery for stochastic computational models in systems biology using Bayesian model checking ,” in Computational Advances in Bio and Medical Sciences (ICCABS), 2014 IEEE 4th International Conference on, 2014, pp. 1–2.
F. Hussain, S. K. Jha, S. Jha, and C. J. Langmead, “ Parameter discovery in stochastic biological models using simulated annealing and statistical model checking ,” International Journal of Bioinformatics Research and Applications 2, vol. 10, no. 4-5, pp. 519–539, 2014.
F. Hussain, A. Velasquez, E. Sassano, and S. K. Jha, “ Putting humpty-dumpty together: Mining causal mechanistic biochemical models from big data ,” in Computational Advances in Bio and Medical Sciences (ICCABS), 2014 IEEE 4th International Conference on, 2014, pp. 1–6.
A. R. Kolli, et al., “ A computational metabolic model of the NG108-15 cell for high content drug screening with electrophysiological readout ,” in Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, 2012, pp. 530–532.
S. K. Jha and C. J. Langmead, “ Exploring behaviors of stochastic differential equation models of biological systems using change of measures ,” BMC bioinformatics, vol. 13, no. 5, pp. S8, 2012.


S. K. Jha and A. Ramanathan, “ Quantifying uncertainty in epidemiological models ,” in BioMedical Computing (BioMedCom), 2012 ASE/IEEE International Conference on, 2012, pp. 80–85. quantifyinguncertainty.pdf
S. Jha, et al., “ Parameter estimation and synthesis for systems biology: New algorithms for nonlinear and stochastic models ,” Journal of critical care, vol. 26, no. 2, pp. e8, 2011.
S. K. Jha and C. J. Langmead, “ Synthesis and infeasibility analysis for stochastic models of biochemical systems using statistical model checking and abstraction refinement ,” Theoretical Computer Science, vol. 412, no. 21, pp. 2162–2187, 2011.
S. K. Jha, E. M. Clarke, C. J. Langmead, A. Legay, A. Platzer, and P. Zuliani, “ A bayesian approach to model checking biological systems ,” in International Conference on Computational Methods in Systems Biology, 2009, pp. 218–234. cmsb09_bayesianstatisticalmodelchecking_jha.pdf
C. J. Langmead and S. K. Jha, “ Symbolic approaches for finding control strategies in Boolean networks ,” Journal of Bioinformatics and Computational Biology, vol. 7, no. 02, pp. 323–338, 2009.Abstract


We present algorithms for finding control strategies in Boolean Networks (BN). Our approach uses symbolic techniques from the field of model checking. We show that despite recent hardness-results for finding control policies, a model checking-based approach is often capable of scaling to extremely large and complex models. We demonstrate the effectiveness of our approach by applying it to a BN model of embryogenesis in D. melanogaster with 15,360 Boolean variables. 


E. M. Clarke, J. R. Faeder, C. J. Langmead, L. A. Harris, S. K. Jha, and A. Legay, “ Statistical model checking in biolab: Applications to the automated analysis of t-cell receptor signaling pathway ,” in International Conference on Computational Methods in Systems Biology, 2008, pp. 231–250.Abstract


We present an algorithm, called BioLab, for verifying tem- poral properties of rule-based models of cellular signalling networks.

BioLab models are encoded in the BioNetGen language, and prop- erties are expressed as formulae in probabilistic bounded linear temporal logic. Temporal logic is a formalism for representing and reasoning about propositions qualified in terms of time. Properties are then verified using sequential hypothesis testing on executions generated using stochastic simulation. BioLab is optimal, in the sense that it generates the mini- mum number of executions necessary to verify the given property. Bio- Lab also provides guarantees on the probability of it generating Type-I (i.e., false-positive) and Type-II (i.e., false-negative) errors. Moreover, these error bounds are pre-specified by the user. We demonstrate Bio- Lab by verifying stochastic effects and bistability in the dynamics of the T-cell receptor signaling network. 


C. J. Langmead and S. K. Jha, “ Predicting protein folding kinetics via temporal logic model checking ,” in International Workshop on Algorithms in Bioinformatics, 2007, pp. 252–264.Abstract


We present a novel approach for predicting protein folding kinetics using techniques from the field of model checking. This represents the first time model checking has been applied to a problem in the field of structural biology. The protein’s energy landscape is encoded sym- bolically using Binary Decision Diagrams and related data structures. Questions regarding the kinetics of folding are encoded as formulas in the temporal logic CTL. Model checking algorithms are then used to make quantitative predictions about the kinetics of folding. We show that our approach scales to state spaces as large as 1023 when using exact algo- rithms for model checking. This is at least 14 orders of magnitude larger than the number of configurations considered by comparable techniques. Furthermore, our approach scales to state spaces at least as large as 1032 unique configurations when using approximation algorithms for model checking. We tested our method on 19 test proteins. The quantitative predictions regarding folding rates for these test proteins are in good agreement with experimentally measured values, achieving a correlation coefficient of 0.87. 


C. J. Langmead, S. K. Jha, and E. M. Clarke, “ Temporal-logics as query languages for dynamic Bayesian networks: application to D. melanogaster embryo development ,” 2006.Abstract


This paper introduces novel techniques for exact and approximate inference in Dynamic Bayesian Networks (DBNs) based on algorithms, data structures, and formalisms from the field of model checking. Model checking comprises a family of techniques from for formally verifying systems of concurrent reactive processes. We discuss: i) the use of temporal logics as a query language for inference over DBNs; ii) translation of DBNs into probabilistic reactive modules; and iii) the use of symbolic data structures and algorithms for deciding complex stochastic temporal logic formu- las. We demonstrate the effectiveness of these new algorithms by examining the behavior of an enhanced expression model of embryogenesis in D. melanogaster. In particular, we converted an existing deterministic developmental model over a one-dimensional arrays of cells into a stochas- tic model over a two dimensional array of cells. Our results confirm that the rules which govern the one-dimensional model also display wild-type expression patterns in the two-dimensional case within certain parameter bounds.