6 Conclusions

The contributions of this work were listed in Chapter 1, and particular conclusions can be found at the end of each main chapter.

Here, some general conclusions are drawn and discussed together, along with some proposed lines for future work.

  • (Chapter 3) The intraviral PPIN is highly connected.

    The four topological parameters studied seem to depend on the protein degree.

    Moreover, the cumulative distributions of these parameters and of the degree increase in a quasi-linear way.

    The results were very similar across the detection methods used.

    Interactions with lower intensity can be as vital to virus development as the more intense ones.

  • (Chapter 3) Some viral proteins focus their effect on only one host hub, while others diversify their effects among several proteins through direct interactions.

    There are significant differences in the propagation speed across different viral proteins.

    Some proteins spread their effects in similar patterns, hinting at a common functionality.

  • (Chapter 3) The PLS modelling applied to genomic, proteomic and phenotypic data sets allows for the integration of the mutations performed on viral proteins, their effects on the PPIN, and their influence on the experimentally quantified organismal fitness.

    Three biological functional modules affecting the PPIN and influencing the fitness positively have been detected.

    Two additional modules are identified affecting a single protein.

    Different mutations affecting the same protein induce different behaviours in the activity of the PPIN and the resulting fitness.

  • (Chapter 3) Data fusion allows unveiling two significant features: (i) the mutations are related to topological changes in the network and their subsequent influence on the fitness, and (ii) the mutations not affecting the network can also be related to the fitness.

  • (Chapter 4) The E. coli constraint-based model shows excellent agreement with the experimental dataset.

    Constraint-based models constitute a perfectly valid approach to modelling metabolism under steady-state conditions.

    Considering purely the number of scenarios that each approach can tackle, MFA and FVA achieve a 100% success rate and MFAg only 50%.

  • (Chapter 4) The flux distributions obtained through each approach are very similar to one another.

    All MFA solutions look alike.

    This also happens for MFAg and FVA solutions.

    Just by looking at the shape of the intervals, it is evident that each approach finds very similar solutions across the complete dataset.

  • (Chapter 4) The size (width) of the solutions, however, differs significantly between approaches.

    MFA offers the widest solutions, many of them biologically infeasible.

    MFAg returns narrower and more biologically sound solutions.

    Finally, FVA almost always returns a single value for each flux in the model.

    These solutions also make the most biological sense.

    Ranked by the size of the solutions they provide, FVA is the best, followed by MFAg and finally MFA.

    This seems to validate the main assumption on which FBA and FVA are based: that cellular growth can be understood as a biomass production optimization process.

  • (Chapter 4) The Partial Least Squares regression indicates that there is no correlation between the variability of the original variables and the variability of the ratios that define the performance of the different approaches.

    The three approaches obtain similar results no matter the value of the original variables.

    Therefore, the conclusions about the performance of the different approaches are stable and robust.

  • (Chapter 4) The PCA carried out was able to describe the system with two components.

    These components fit the main carbon metabolic route and the overflow metabolism.

    The behaviour of the cells seems to be defined by a combination of both, which agrees with previously reported descriptions of metabolic regimes in E. coli.

  • (Chapter 5) The construction of two types of graph was presented, in which nodes represent reactions and directed edges represent metabolites produced by one reaction and consumed by another.

    In the Probabilistic Flux Reaction Graph (PRG), the edge weights describe the probability that any two reactions respectively produce and consume a molecule of any metabolite.

    To incorporate the effect of the environment, the Flux Balance Graph (FBG) is proposed, in which edge weights are the total flux of metabolites between reactions predicted by Flux Balance Analysis (FBA).

  • (Chapter 5) By computing FBA solutions for different exchange fluxes between the cell and its environment, one can systematically build metabolic graphs for different compositions of the growth media.

    When applied to the core E. coli metabolic model, the topology of the FBG effectively captures known metabolic adaptations such as the glycolytic-gluconeogenic switch, overflow metabolism, and the effects of anoxia.

  • (Chapter 5) The proposed FBG draws a novel connection between modern network theory (which studies graphs) and constraint-based methods widely employed in metabolic modelling.

    The FBG exploits the physiological predictions from FBA to construct metabolic graphs that are more informative of cell physiology and are directly grounded in specific environmental conditions.

    The resulting graphs are smaller and less connected than those built from the complete metabolic blueprint, but they shed further light on the organisation of metabolic activity in realistic physiological conditions.

  • (Chapter 5) Different environmental scenarios produce significantly different metabolic graphs and community structures for E. coli.

    The most stable partition is formed by three communities found for the aerobic scenario.

    This robust partition has a concrete interpretation: community C1glc contains reactions in charge of processing carbon from D-glucose to pyruvate.

    Community C2glc harbours the bulk of the cell’s ATP production.

    Community C3glc contains the reactions in charge of producing NADH and NADPH (the cell’s reductive power).
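
The two graph constructions summarised above for Chapter 5 can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that every reaction is irreversible with a non-negative flux (a reversible reaction would first be split into forward and backward halves), that for the PRG a metabolite is picked uniformly at random and its producer and consumer are picked in proportion to their stoichiometric coefficients, and that for the FBG the flux of a metabolite produced by one reaction is shared among its consumers in proportion to their consumption flux. The function names, the toy stoichiometric matrix S_toy and the flux vector v_toy are hypothetical, and v_toy is a hand-written steady-state flux vector rather than an actual FBA solution.

```python
def split_stoichiometry(S):
    """Split S (metabolites x reactions) into production and consumption parts."""
    S_plus = [[max(x, 0.0) for x in row] for row in S]
    S_minus = [[max(-x, 0.0) for x in row] for row in S]
    return S_plus, S_minus


def prg_weights(S):
    """PRG-style weights: probability that reaction i produces and reaction j
    consumes a molecule of a randomly chosen metabolite (assumed sampling scheme)."""
    S_plus, S_minus = split_stoichiometry(S)
    n_met, n_rxn = len(S), len(S[0])
    D = [[0.0] * n_rxn for _ in range(n_rxn)]
    for k in range(n_met):
        w_plus, w_minus = sum(S_plus[k]), sum(S_minus[k])
        if w_plus == 0 or w_minus == 0:
            continue  # metabolite is never produced or never consumed
        for i in range(n_rxn):
            for j in range(n_rxn):
                D[i][j] += (S_plus[k][i] / w_plus) * (S_minus[k][j] / w_minus) / n_met
    return D


def fbg_weights(S, v):
    """FBG-style weights: flux of metabolites flowing from reaction i to
    reaction j, given a non-negative flux vector v (assumed proportional sharing)."""
    S_plus, S_minus = split_stoichiometry(S)
    n_met, n_rxn = len(S), len(S[0])
    M = [[0.0] * n_rxn for _ in range(n_rxn)]
    for k in range(n_met):
        produced = [S_plus[k][i] * v[i] for i in range(n_rxn)]
        consumed = [S_minus[k][j] * v[j] for j in range(n_rxn)]
        throughput = sum(produced)
        if throughput == 0:
            continue  # metabolite carries no flux under these conditions
        for i in range(n_rxn):
            for j in range(n_rxn):
                M[i][j] += produced[i] * consumed[j] / throughput
    return M


# Hypothetical toy network: R0: -> A, R1: A -> B, R2: A -> C, R3: B ->, R4: C ->
S_toy = [
    [1, -1, -1, 0, 0],   # metabolite A
    [0, 1, 0, -1, 0],    # metabolite B
    [0, 0, 1, 0, -1],    # metabolite C
]
v_toy = [10.0, 7.0, 3.0, 7.0, 3.0]  # hand-written steady-state fluxes

D = prg_weights(S_toy)         # depends only on the stoichiometry
M = fbg_weights(S_toy, v_toy)  # depends on the flux state, i.e. on the environment
```

In this toy case the PRG gives both branches leaving metabolite A the same weight, whereas the FBG weights them by the flux they actually carry, which is how the environment enters the graph.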


This thesis addressed problems related to the analysis and modelling of molecular biological networks.

The objective was to develop and use a series of techniques and methodologies that shed light on complex biological systems and increase our understanding of them.

The main conclusion of all the work done, and the thread that joins all the dots together, is the power and capability of network analysis in molecular biology.

This systems approach has resulted in a series of valuable studies where the focus was always on the whole system and its general and emergent properties.

Ideally, this thesis will contribute to establishing systems biology, that is, the systemic or integral approach to biology, as a fundamental part of almost any biological research done in the future.


Future work

The study carried out in this thesis opens new research avenues.

Regarding Chapter 3, the VHPIN analysis can be explored further using more complex metrics or graph kernels, or by integrating additional available biological information such as subcellular localization or biological function.

Especially relevant would be integrating the regulatory network of the host into this topology.

The interactions between viral proteins and host transcription factors are particularly sensitive points that merit further study.

Besides this, further work of interest includes testing the proposed methodology with a larger dataset containing more mutants, and extending the analysis to larger PPINs, in order to build multivariate models with a higher predictive power, exploiting the features of the projection to latent structure methods.

Regarding Chapter 4, additional work would involve extending the scope of the dataset even further to check whether the constraint-based model still performs well.

On the other hand, larger models (ideally genome-scale) could be used to check if they improve the level of agreement with the experimental data.

Moreover, constraint-based models of higher organisms should be addressed as well.

In these organisms, it is much harder to define an objective function, and therefore the methods that do not assume one could improve their relative performance.

Finally, the graph description of metabolic networks proposed in Chapter 5 could be used in many different applications.

Community detection through Markov Stability is one particular example among many possible ones.

Using real fluxomics data is the most straightforward application.

Calculating the synergy and competition versions of the PRG would bring out an even more accurate picture of the relationships between reactions.

The reverse matrix (using metabolites as nodes instead of reactions) would give us even more insight into the metabolic system.

Furthermore, it would be interesting to analyse the transition between two extreme states (such as aerobic and anaerobic conditions) and to study whether the structure of the networks varies uniformly or whether there are jumps between stable intermediate metabolic states.

In addition, the applications already discussed in Section 5.6 are also worth pursuing.
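
As an illustration of the transition analysis suggested above, the following minimal sketch compares consecutive graphs along a gradient between two states and flags unusually large changes. It is only one possible way to pose the question, not a method used in this thesis. The distance measure (a normalised L1 difference between edge-weight matrices), the threshold rule (a multiple of the median step), the function names and the toy sequence of graphs are all assumptions made for the example.

```python
from statistics import median


def graph_distance(A, B):
    """Normalised L1 distance between two weighted adjacency matrices of equal shape."""
    diff = sum(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))
    total = sum(abs(a) + abs(b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))
    return diff / total if total > 0 else 0.0


def find_jumps(graphs, factor=3.0):
    """Return the step-to-step distances and the indices of steps that exceed
    `factor` times the median step (hypothetical criterion for a 'jump')."""
    steps = [graph_distance(graphs[t], graphs[t + 1]) for t in range(len(graphs) - 1)]
    baseline = median(steps)
    jumps = [t for t, d in enumerate(steps) if d > 0 and d > factor * baseline]
    return steps, jumps


def toy_graph(respiratory, fermentative):
    """Hypothetical 3-reaction graph with one 'respiratory' and one 'fermentative' edge."""
    return [
        [0.0, respiratory, fermentative],
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
    ]


# Made-up sequence of edge weights along a decreasing-oxygen gradient
sequence = [toy_graph(a, f) for a, f in
            [(10, 0), (9, 0), (8, 0), (2, 7), (1, 8), (0, 9)]]

steps, jumps = find_jumps(sequence)
```

A roughly constant list of step sizes would suggest a uniform variation of the network structure, whereas isolated large steps would point to jumps between more stable intermediate states. In practice the graphs would be FBGs computed for successive environmental conditions rather than the made-up matrices used here.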
