2.1.4 Experimental data integration
As noted earlier, high-throughput technologies gave rise to the systemic approach in molecular biology.
More often than not, the shape and content of the experimental data they produce are not directly comparable.
This is a common problem when new experimental techniques appear.
In this particular scenario the problem becomes even more significant because of the massive amount of information these methods can produce.
Data integration is important for any biological research, but it is particularly relevant in systems biology.
This discipline is based on the idea that some properties of a biological system can only be studied when the system is considered as a whole.
It is therefore vital that data coming from different parts of the system fit together as seamlessly as possible.
Several problems hinder data integration in systems biology (Hwang et al. 2005).
First, the types of data can differ greatly in shape, and they range from discrete to continuous.
Second, each method has a different degree of reliability and comes with particular uncertainties, errors and biases.
Third, very useful information is available outside the high-throughput techniques, in small-scope experiments, computational predictions and high-quality curated databases.
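One way to picture how these three problems can be handled together is to express every piece of evidence on a common scale and weight it by the reliability of its source. The sketch below does this with a weighted Stouffer Z-score combination of one-sided p-values. It is offered only as an illustration under stated assumptions, not as the procedure of the works cited here: the function name `combine_evidence`, the p-values and the reliability weights are all hypothetical, and each p-value is assumed to come from an independent test and to lie strictly between 0 and 1.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def combine_evidence(p_values, weights):
    """Combine independent one-sided p-values with a weighted Stouffer Z-score.

    p_values: one p-value per data source, each strictly between 0 and 1.
    weights:  one positive reliability weight per data source (hypothetical).
    Returns the combined one-sided p-value.
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("need one weight per p-value and at least one p-value")

    # Map each p-value to a Z-score, so that discrete and continuous
    # measurements end up on the same scale.
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]

    # More reliable sources contribute more to the combined score.
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    combined_z = numerator / denominator

    return 1.0 - _STD_NORMAL.cdf(combined_z)


# Illustrative values only: a high-throughput assay, a small-scope
# experiment and a computational prediction for the same hypothesis.
example_p = combine_evidence([0.04, 0.01, 0.20], [1.0, 2.0, 0.5])
```

In this sketch the small-scope experiment is given the largest weight, which reflects the assumption that it is the most reliable source; in practice the weights would have to be estimated or justified for each method.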
Klipp et al. (2011) proposed dividing data integration into several levels of complexity.
The first level refers to common standards for information storage, representation and transfer.
The second level focuses on developing shared schemes for biological models and pathways.
A good example of this is the Systems Biology Markup Language (SBML) (Hucka et al. 2003).
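To give a sense of what such a shared scheme looks like, the following sketch reads a deliberately stripped-down SBML-like fragment with the Python standard library and lists its species and reactions. The fragment omits elements and attributes that a complete SBML file would carry, the model content is purely illustrative, and the helper name `summarize_model` is hypothetical. In practice a dedicated library such as libSBML would normally be used instead of raw XML parsing.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core elements.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Illustrative fragment only; not a complete, validated SBML document.
TOY_MODEL = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_pathway">
    <listOfSpecies>
      <species id="glucose" compartment="cytosol"/>
      <species id="g6p" compartment="cytosol"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="hexokinase" reversible="false">
        <listOfReactants>
          <speciesReference species="glucose"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="g6p"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def summarize_model(sbml_text):
    """Return (species ids, {reaction id: (reactant ids, product ids)})."""
    root = ET.fromstring(sbml_text)
    model = root.find("sbml:model", SBML_NS)

    species = [
        s.get("id")
        for s in model.findall("sbml:listOfSpecies/sbml:species", SBML_NS)
    ]

    reactions = {}
    for reaction in model.findall("sbml:listOfReactions/sbml:reaction", SBML_NS):
        reactants = [
            ref.get("species")
            for ref in reaction.findall(
                "sbml:listOfReactants/sbml:speciesReference", SBML_NS
            )
        ]
        products = [
            ref.get("species")
            for ref in reaction.findall(
                "sbml:listOfProducts/sbml:speciesReference", SBML_NS
            )
        ]
        reactions[reaction.get("id")] = (reactants, products)

    return species, reactions


species_ids, reaction_map = summarize_model(TOY_MODEL)
```

Because every tool reads the same element names, a model written by one program can be inspected by another without any format-specific conversion, which is the point of this second level of integration.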
The final level revolves around data correlation and fusion.
In other words, it aims to develop analysis and modelling tools that allow researchers to integrate very different kinds of information in order to understand and explain biological processes.