Application of (bio)chemical engineering principles and lumping analysis in modelling

Living cells are organized, self-replicating, self-adjustable, evolvable structures that respond to environmental stimuli. Attempts to model metabolic cell reactions and processes are not new: an adequate dynamic model is the one engineering alternative for representing the cell metabolism in silico in a coherent, consistent, and systematic way, with the aim of studying the cell response to various perturbations (reviews [5,6,8,17,18]). Synthetic Biology and Systems Biology have thus emerged as sciences focused on the engineering-driven, model-based building of complex biological entities. They aim to apply the engineering principles of systems design to biology in order to produce predictable and robust biological systems with novel functions in a broad range of applications, such as therapy of diseases (gene therapy), design of new biotechnological processes, new devices based on cell-cell communicators, and biosensors. Systems Biology can be defined as "the science of discovering, modelling, understanding and ultimately engineering at the molecular level the dynamic relationships between the biological molecules that define living organisms" (Leroy Hood, President, Institute for Systems Biology, Seattle, USA, cited by [17,18]).


Introduction
Because metabolic processes are highly complex and only partly known, detailed mathematical modelling at the molecular level remains an unsettled issue, even though remarkable progress and the development of extended simulation platforms have been reported. The general modelling rules, based on physico-chemical-biological and chemical engineering principles and on statistical data treatment, are more difficult to apply to living systems. The reason is that metabolic cell processes present a low observability relative to the very large number of species, of the order O(10^4), of reactions, O(10^5), and of transport parameters. Advanced lumping techniques can increase the model estimability by reducing the number of reactions and/or variables while keeping the most influential terms. Model quality tests, parameter and species sensitivity analysis, principal component analysis, and algorithms that find invariant subspaces are common tools for reducing extended model structures. The cost of the reduction is a loss of information on certain species and reactions, and a loss in model generality, prediction capability, and physical meaning for some rate constants.
To overcome the low structural identifiability of living cell processes, the current trend is to use all types of information 'translated' from the 'language' of molecular biology to that of mechanistic chemistry, while preserving the cell structural hierarchy and species functions. Applying (bio)chemical engineering concepts and modelling methods, together with nonlinear systems control theory, improves the cell model quality and may offer a detailed simulation of how the cell metabolism adapts to environmental changes, which is useful for designing modified genetic circuits and modified micro-organisms.
Applications [2,5,8-13,15-19,31,39,43,45,50,60] refer to the in silico design (that is, design by using mathematical models and tools) of mutant cells with desirable motifs, such as genetic switches acting as biosensors, or genetic circuits that amplify exogenous stimuli, or that are involved in signal transduction or in oscillatory cell processes such as glycolysis. Another case study [31] presents a multi-layer/multi-scale model that couples a structured representation of a metabolic genetic regulatory circuit at the molecular level with the macroscopic mass balances of the relevant state variables of a fluidized-bed bioreactor used for mercury uptake from wastewaters by immobilized E. coli bacteria. The model proved useful for process design and optimal control purposes, as it allows predicting how the metabolism of the wild/cloned bacteria (i.e. the response of the mer operon expression) adapts over several cell generations to the dynamic operating conditions of the bioreactor.
The present paper reviews the general concepts of the variable-volume whole-cell (VVWC) modelling approach used to develop modular kinetic representations of the homeostatic gene expression regulatory modules (GERM) that control the protein synthesis and the homeostasis of metabolic processes. The paper also reviews some published contributions, including past and current experience with GERM linking rules, in order to point out how optimized, globally efficient kinetic models of the genetic regulatory circuits (GRC) can be obtained to reproduce experimental observations.

Applied Concepts When Modelling the GRC Dynamics
It is well known that most of the concepts and numerical methods of chemical engineering can also be used to model the dynamics of enzymatic [20-36,40-42] and metabolic cell processes [1,2,6-16]. The reviewed simple case studies of VVWC modular kinetic models of GERM-s [6-16] (see Figure 1 for a reduced representation of a GERM) proved that chemical and biochemical engineering principles, together with the control theory of nonlinear systems, are fully applicable to modelling complex metabolic cell processes, including the sophisticated GRC-s that control the syntheses of cell enzymes and the metabolic fluxes [52].
Figure 1: The reaction scheme of the expression of a generic gene G. The regulatory module of G(P)1 type was used by Maria [22] to exemplify the synthesis of a generic protein P in the E. coli cell. To improve the stability of the system homeostasis, that is, the quasi-invariance of key species concentrations (enzymes, proteins, metabolites) despite perturbations in nutrients Nut* and metabolites Met*, or internal cell changes, a very rapid buffering reaction G + P <===> GP (inactive) has been added. Horizontal arrows indicate reactions; vertical arrows indicate catalytic actions; G = gene encoding protein P; MetG, MetP = lumped DNA and protein precursor metabolites, respectively.
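As a rough illustration of the buffering reaction G + P <===> GP in the scheme above, the sketch below computes how a fixed amount of gene splits between the active form G and the inactive complex GP. It assumes the reaction is at equilibrium (it is described as very rapid) and characterizes it by a dissociation constant; the constant `kd_nM`, the total concentrations, and the function name are illustrative assumptions, not values or names taken from the reviewed models.

```python
import math

def buffer_equilibrium(g_total_nM, p_total_nM, kd_nM):
    """Equilibrium of G + P <=> GP given totals and Kd = [G][P]/[GP].

    All inputs are hypothetical illustration values in nM.
    Returns (free G, free P, complex GP).
    """
    # Mass balances: G_tot = G + GP and P_tot = P + GP.
    # Substituting into Kd*GP = (G_tot - GP)*(P_tot - GP) gives
    # GP^2 - b*GP + G_tot*P_tot = 0 with b = G_tot + P_tot + Kd.
    b = g_total_nM + p_total_nM + kd_nM
    disc = b * b - 4.0 * g_total_nM * p_total_nM
    gp = (b - math.sqrt(disc)) / 2.0  # smaller root is the physical one
    return g_total_nM - gp, p_total_nM - gp, gp

if __name__ == "__main__":
    # Assumed totals: 1 nM of gene, varying protein level, Kd = 1000 nM.
    for p_total in (500.0, 1000.0, 2000.0):
        g, p, gp = buffer_equilibrium(1.0, p_total, kd_nM=1000.0)
        print(f"P_total={p_total:7.1f} nM  G={g:.3f} nM  GP={gp:.3f} nM")
```

Under these assumptions the free (catalytically active) fraction of G decreases as the protein level rises, so a surplus of P reduces the amount of active gene available to catalyse further synthesis.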

These principles and tools include:
i) Molecular species conservation law (stoichiometry analysis; species differential mass balance set);
ii) Atomic species conservation law (atomic species mass balance);
iii) Thermodynamic analysis of reactions (quantitative assignment of reaction directionality) [3]: setting equilibrium reactions; Gibbs free energy balance analysis; setting cyclic reactions; finding species at quasi-steady-state; improved evaluation of steady-state flux distributions, which provide important information for metabolic engineering and on cell metabolism [4];
iv) Application of species and/or reaction lumping rules to the ordinary differential equation (ODE) model [1,38];
v) Analysis of cyclic reactions, and of the stability of the cell system steady-state (homeostasis).
Such kinetic analysis at the molecular level includes modelling the kinetics of complex GRC-s [1,2,6-16] that control protein synthesis, the distribution of cell metabolic fluxes, and the resource allocation in branched pathways through genetic switches according to environmental conditions [1,2,6-13,39,43,50]. Such a cell model can eventually be used to design modified micro-organisms. Deterministic (model-based) simulation platforms allow the in silico design of modified cells with desirable gene circuits and 'motifs' of practical application in the biosynthesis industry, environmental engineering, and medicine [5,13-19,45,60], or the modelling and design of new drug delivery systems with a controlled drug release [15,37,44,46,49,55-57].
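The molecular species conservation law of item i) can be checked mechanically: a weighted sum of species amounts is conserved when its weight vector gives zero net change for every reaction column of the stoichiometric matrix. The sketch below applies this to a tiny closed toy network, chosen here only for illustration (the buffering reaction G + P <===> GP plus one lumped protein synthesis step); the species ordering, the reaction set, and the helper name are assumptions, and an open growing cell that imports nutrients would not conserve the precursor pool.

```python
# Species order: [G, P, GP, MetP]; each column is one reaction.
# R1: G + P -> GP   (buffering, written in the forward direction)
# R2: MetP  -> P    (lumped protein synthesis; G acts as catalyst, so it is unchanged)
STOICH = [
    [-1,  0],  # G
    [-1,  1],  # P
    [ 1,  0],  # GP
    [ 0, -1],  # MetP
]

def is_conserved(weights, stoich):
    """Return True if sum_j weights[j]*n_j is left unchanged by every reaction."""
    n_reactions = len(stoich[0])
    for r in range(n_reactions):
        net_change = sum(w * row[r] for w, row in zip(weights, stoich))
        if net_change != 0:
            return False
    return True

if __name__ == "__main__":
    print("G + GP conserved:       ", is_conserved([1, 0, 1, 0], STOICH))
    print("P + GP + MetP conserved:", is_conserved([0, 1, 1, 1], STOICH))
    print("P + GP conserved:       ", is_conserved([0, 1, 1, 0], STOICH))
```

Each conserved combination removes one differential equation from the species mass balance set, which is one of the simplest ways of reducing the number of model variables.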
Some of these applications have been developed, for instance, for modelling the gene expression regulation [1,2,6-16]. Because of the near-astronomic complexity of cell processes (see Figure 2 for an example of the central carbon metabolism [51,54]) and the huge number of species involved, of order O(10^4), and of reactions, O(10^5), advanced lumping procedures [38,56] and modularization techniques [1,6,8,10,12,13,31] have been applied to obtain reduced models by lumping species and/or reactions while keeping the main cell functions and the structural, functional, and temporal hierarchy. When whole-cell models are developed on a mechanistic (deterministic) basis by using continuous-variable models, all enzymatic processes are closely linked to ensure an optimum metabolism. The metabolic reactions must therefore occur with maximum reaction rates, use a minimum of resources (substrates, energy), and produce a minimum amount of reaction intermediates. In addition, the reactions and the homeostasis of key species should be only weakly influenced by environmental perturbations, which is achieved by simple GRC-s, preferably with a cascade control of the gene expression, that minimize the transition or recovery times of the quasi-steady-states (QSS) [7,43]. It quickly appeared that many continuous cell processes can be modelled by using the concepts of (bio)chemical kinetic modelling. For simplicity, the modelling problem was decomposed into modelling "functional modules" which will eventually be linked to recreate the whole-cell structure [15,16], that is, "modules that can be elaborated ... by 'translating' from the 'language' of molecular biology to that of mechanistic chemistry, by preserving the cell structural hierarchy and component functions" [1].
Other attempts to model complex continuous cell processes so as to reproduce the three main properties of the cell metabolism, i) dynamics, ii) feedback, and iii) optimality, by using "electronic circuit"-like models [1] (Figure 3) failed because such models cannot reproduce in detail molecular interactions with slow and continuous responses to perturbations. As expected, because the complexity of a deterministic model increases sharply with the level of detail (Figure 4), a satisfactory model complexity that realizes the best trade-off between model simplicity and predictive quality should be adopted [7]. When a structured dynamic model with continuous variables is developed, there are strong reasons to use lumping techniques to reduce its complexity: i) the cell mechanisms are too complex relative to the available data; ii) there is a large number of species, reactions, transport parameters, and interactions that are difficult to model in detail; iii) structured experimental data (standard kinetic data) are difficult to obtain and usually present a low observability and reproducibility because of the variability of the metabolic process; iv) a reduced cell process model allows an easier interpretation of the cell complexity and quick simulations of the cell behaviour under various environmental conditions. Besides, the computational tractability of the reduced mathematical models allows the application of well-known algorithmic rules of chemical engineering, nonlinear system theory, and numerical calculus.
Inherently, the reduced dynamic cell models suffer from a series of drawbacks, such as:
i. Multiple reduced structures of different characteristics may exist for the same cell process, and they are difficult to discriminate;
iii. Loss of information on certain species and reaction steps;
iv. Loss in system flexibility, due to the reduced number of intermediates and species interactions included in the mathematical model (stability, multiplicity, sensitivity, regulatory characteristics).
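The trade-off between a simpler structure and lost information can be seen on a generic textbook scheme, which is not taken from the reviewed cell models: the first-order sequence A -> B -> C with a fast second step. Treating the intermediate B as being at quasi-steady state lumps the two steps into A -> C. The sketch below compares the analytical solutions of the full and lumped schemes; the rate constants and function names are assumed illustration values.

```python
import math

def full_model(t, k1, k2, a0=1.0):
    """Analytical solution of A -> B -> C (first order, k1 != k2, B0 = C0 = 0)."""
    a = a0 * math.exp(-k1 * t)
    b = a0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    c = a0 - a - b  # overall mass balance
    return a, b, c

def lumped_model(t, k1, a0=1.0):
    """Reduced scheme A -> C, with the fast intermediate B at quasi-steady state."""
    a = a0 * math.exp(-k1 * t)
    return a, a0 - a

if __name__ == "__main__":
    k1, k2 = 1.0, 100.0  # assumed values: second step 100 times faster
    for t in (0.1, 1.0, 3.0):
        _, b, c_full = full_model(t, k1, k2)
        _, c_lumped = lumped_model(t, k1)
        print(f"t={t:4.1f}  C_full={c_full:.4f}  C_lumped={c_lumped:.4f}  B={b:.4f}")
```

The lumped scheme has one rate constant instead of two and predicts C with an error equal to the small amount of B it no longer tracks; the price is that B and the constant k2 disappear from the model, which is the loss of information on species and reaction steps listed above.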
Starting from some case studies, elaborate procedures have been proposed to build up a whole-cell simulator of modular construction [1,2,7-14], useful for evaluating the efficiency of genetic regulatory circuits (GRC) and for designing genetically modified cells with desirable characteristics ("motifs"). There is a large number of important contributions to the modelling of regulatory cell processes. Among the pioneers are to be mentioned Heinrich and Schuster [52], Athel Cornish-Bowden [53], and Torres and Voit [58], as well as the work on the construction of modular GRC-s [19,38,39,45,47,53,60], to mention only a few. It is also worth noticing that, according to Scopus, the number of papers in the area of Systems Biology has increased by three orders of magnitude over the last 10 years.
Basically, the main hypotheses of a VVWC model are the following [1]:
(i) The cell system consists of a sum of hierarchically organized components, e.g. metabolites, genes (DNA), proteins, RNA, intermediates, etc., interrelated through transcription, translation, DNA replication, and other processes; the cell is separated from the environment (containing nutrients) by a membrane.
(ii) The membrane, of negligible volume, presents a negligible resistance to nutrient diffusion; the membrane dynamics are neglected in the cell model and are assumed to follow the cell growth dynamics.
(iii) The cell is an isothermal system with a uniform content (perfectly mixed case); species behave ideally and present uniform concentrations within the cell. The cell system is not only homogeneous but also isotonic (constant osmotic pressure), with no inner gradients or species diffusion resistance.
(iv) The cell is an open system interacting with the environment through a semi-permeable membrane.
(v) To better reproduce the properties of a GERM interconnected with the rest of the cell, the other cell species are lumped together in the so-called "cell ballast".
(vi) To fulfil the adopted Pfeffer's law of diluted solutions, $V(t) = \frac{RT}{p_{cyt}} \sum_j n_j(t)$ (where T = absolute temperature; R = universal gas constant; V = cell (cytosol) volume; p_cyt = inner osmotic pressure; t = time; n_j = number of moles of species j), Maria [1,2,6-13] proposed that the lumped genome and proteome replication also be considered in such cell models.
(vii) The inner osmotic pressure (p_cyt) is constant and at all times equal to the environmental pressure, thus ensuring the membrane integrity (p_cyt = p_env = constant). As a consequence, the isotonic osmolarity under isothermal conditions leads to the equality RT/p_cyt = RT/p_env, which indicates that the sum of the cell species concentrations must equal that of the environment, i.e. $\sum_j C_j^{cyt} = \sum_j C_j^{env}$. Otherwise, osmosis will eventually equalize the osmotic pressures (p_cyt = p_env). In a real cell this equality is only approximately fulfilled, because of perturbations and transport gradients; nevertheless, and in spite of nutrients migrating from the environment into the cell, the overall environment concentration is considered to remain unchanged. On the other hand, species inside the cell transform the nutrients into metabolites and react to make more cell components. In turn, increased amounts of polymerases are then used to import increasing amounts of nutrients. The net result is an exponential increase of cellular components in time, which translates, through the isotonic osmolarity assumption, into an exponential increase in volume with time [1]. The overall concentration of cellular components is time-invariant (homeostasis), because the rate at which the cell volume increases equals the rate at which the overall number of moles increases, leading to a constant ratio.
(viii) The species concentrations C_j = n_j/V at the cell level are usually expressed in nanomolar units and are computed with the relationship given in [1].
When metabolic cell processes are modelled with deterministic continuous-variable models that use the novel concepts of the VVWC modelling approach, certain advantages have to be underlined, as follows [1,2,6-15,60].
The more realistic "whole-cell-variable-volume" (VVWC) approach was proved to lead to a more effective representation of cell processes, and is very useful when developing modular kinetic representations of the GERM-s that control the protein synthesis and the homeostasis of metabolic processes.
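To give a feel for the nanomolar scale mentioned in hypothesis (viii), the sketch below converts a copy number per cell into a concentration by using C_j = n_j/V and the Avogadro constant. It is a generic unit conversion, not the specific relationship of [1]; the function name and the cell volume of 1e-15 L are assumptions chosen only as a typical bacterial order of magnitude.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

def copies_to_nanomolar(copies_per_cell, cell_volume_litres):
    """Convert a copy number per cell into a concentration in nM (C = n / V)."""
    moles = copies_per_cell / AVOGADRO
    molar = moles / cell_volume_litres
    return molar * 1e9  # mol/L -> nmol/L

if __name__ == "__main__":
    assumed_volume_l = 1e-15  # hypothetical cytosol volume, about 1 femtolitre
    for copies in (1, 100, 10000):
        c_nm = copies_to_nanomolar(copies, assumed_volume_l)
        print(f"{copies:6d} copies/cell -> {c_nm:10.2f} nM")
```

With this assumed volume, a single molecule per cell corresponds to a concentration of the order of 1 nM, which is why nanomolar is a convenient unit for genes and low-copy regulatory proteins.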
The novel VVWC modelling framework for building up kinetic cell models has the advantage of explicitly including in the model equations the isotonicity constraint and the link between the variable cell volume and the cell reactions. The holistic approach reveals the role played by the "cell ballast" in smoothing the effect of perturbations coming from the environment. The VVWC modelling framework was successfully used to derive effective kinetic models describing various genetic regulatory circuits (GRC) and individual gene expression regulatory modules (GERM) [1,2,6-16].
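How the isotonicity constraint enters the species balances can be sketched from two relations: the concentration definition C_j = n_j/V and Pfeffer's law, assumed here in the form p_cyt V = RT Σ n_j with constant T and p_cyt. The derivation below is a reconstruction under these assumptions; the symbol D for the logarithmic volume growth rate is introduced here for convenience, and the notation may differ in detail from that of [1].

```latex
\begin{align}
V(t) &= \frac{RT}{p_{cyt}} \sum_j n_j(t), \qquad C_j = \frac{n_j}{V} \\
\frac{dC_j}{dt} &= \frac{1}{V}\frac{dn_j}{dt} - \frac{C_j}{V}\frac{dV}{dt}
                 = \frac{1}{V}\frac{dn_j}{dt} - D\,C_j \\
D &= \frac{1}{V}\frac{dV}{dt} = \frac{RT}{p_{cyt}} \sum_j \frac{1}{V}\frac{dn_j}{dt} \\
\sum_j C_j &= \frac{p_{cyt}}{RT} = \text{constant}
\end{align}
```

The second line shows that every species balance carries a dilution term D C_j produced by the volume growth, the third line ties D to the net rate at which all reactions and transport steps create moles, and the last line is the time-invariant overall concentration. For a constant D, the volume grows as V(t) = V(0) exp(D t).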
The VVWC model formulation was also proved suitable for accurately modelling the cell growth and division [59]. Such a formulation allows studying various regulatory properties of GERM-s, the response of coupled GERM-s to dynamic or stationary continuous perturbations in the environment, and the 'inertial' effect of the cell 'ballast' in the face of continuous changes in the cell and the environment [1,2,6-13]. As the growing phase takes ca. 80% of the cell cycle period, and assuming a quasi-constant osmotic pressure and a constant logarithmic rate of volume growth, the VVWC cell model can be considered satisfactory for studying the GRC effectiveness.
Trends Biomedical Eng & Biosci. 2017; 1(4): 555566.

The novel VVWC modelling framework allows a realistic analysis of the rules to be used for GERM linking when building up complex GRC dynamic models, thus offering the possibility i) to simulate the regulatory performance of a gene expression or of an operon expression, ii) to simulate the regulatory performance of a GRC (switch, amplifier, filter, etc. [39,43,50]), and iii) to design in silico genetically modified or cloned micro-organisms with target plasmids in order to obtain desirable characteristics for industrial or medical applications. Some examples include the maximization of succinate production in E. coli [60], the efficient removal of mercury from wastewaters [10,12,29,31,34], and the design of a genetic switch with desirable characteristics [2,9,13].
The mechanistic approach proved very adequate for developing a structured complex kinetic model (Figure 6) that simulates the efficiency of the GRC responsible for the induced expression of the mer operon in Gram-negative bacteria (Pseudomonas putida, E. coli), which regulates the uptake of mercury ions from wastewaters [10,12,29,31,34]. The model was tested in the new VVWC modelling framework against literature data and was used to optimize the operation of an industrial fluidized-bed bioreactor used for mercury removal from wastewaters [31,34].

Conclusion
As revealed by this very brief review, general chemical engineering modelling principles have proved to be valuable tools for representing both the stationary and the dynamic characteristics of complex cell biochemical processes. The elaboration of reduced models of satisfactory quality is closely related to the ability to select suitable lumping rules, key parameters, and influential terms, and to apply conventional or non-conventional multi-objective estimation criteria that realize the best trade-off between model simplicity and predictive quality.
Note: This paper is the extended version of the invited plenary lecture: Maria, G., Applications of Chemical Engineering Principles and the Lumping Analysis in Modelling the Living Systems - A trade-off between simplicity and model quality, presented at University Babes-Bolyai, Cluj (Romania), Department of Chemistry, Nov. 8, 2013. http://www.chem.ubbcluj.ro/~chimie/anunturi.html