ModelExplorer – software for visual inspection and inconsistency correction of genome-scale metabolic reconstructions

Bioinformatics

ModelExplorer visualizes reconstructed metabolic networks as bipartite graphs: metabolites and reactions are represented by nodes, and links (shown as arrows) only connect metabolites to reactions and vice versa. The arrows may be unidirectional or bidirectional, depending on the reaction reversibility encoded in the metabolic reconstruction. Metabolites and reactions are automatically grouped by their compartment, as specified in the reconstruction. The compartment grouping is visualized and may be highlighted.
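The bipartite representation described above can be illustrated with a minimal sketch. The toy reactions, their identifiers and the function name `build_bipartite_edges` are hypothetical and are not taken from ModelExplorer; the sketch only shows how every arrow joins a metabolite to a reaction, and how a reversible reaction yields arrows in both directions.

```python
# Hypothetical toy reconstruction: reaction id -> (reactants, products, reversible)
reactions = {
    "R_import": ([], ["glc"], False),
    "R1": (["glc"], ["g6p"], False),
    "R2": (["g6p"], ["f6p"], True),
}

def build_bipartite_edges(reactions):
    """Return directed edges that always join one metabolite and one reaction."""
    edges = []
    for rxn, (reactants, products, reversible) in reactions.items():
        for met in reactants:
            edges.append((met, rxn))      # metabolite is consumed by the reaction
            if reversible:
                edges.append((rxn, met))  # reverse direction: two-headed arrow
        for met in products:
            edges.append((rxn, met))      # reaction produces the metabolite
            if reversible:
                edges.append((met, rxn))
    return edges

edges = build_bipartite_edges(reactions)

# Bipartite check: exactly one endpoint of each edge is a reaction node.
is_bipartite = all((src in reactions) != (dst in reactions) for src, dst in edges)
```

In this toy case the reversible reaction `R2` contributes four directed edges, which a viewer would draw as two double-headed arrows.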

The network layout is calculated using a force-directed algorithm, as we found this to give the most aesthetically pleasing results. In the Network View (Fig. 1, mark (1)) the user may zoom, pan, select and hover the cursor over nodes in the network in order to explore or make changes. In ModelExplorer, the user is provided with a set of options for network visualization, exploration and editing. These options may be accessed through function menus in the Command Panel (Fig. 1, mark (2)).

Fig. 1

General layout of ModelExplorer: The network view (1) shows a bipartite graph representation of the metabolic model, which is both zoomable and pannable. Reactions and species are represented with different shades of the same colour (reactions are bright and species are dark). The base colour is green if the reaction/species is active, and red if it is blocked. Endpoint species (those which are either not produced or not consumed) have a light blue outline, while the biomass (growth) reaction has a thick yellow outline. Connections between reactions and metabolites are represented with grey semitransparent arrows that have either one or two arrowheads depending on reversibility. Compartment contours are shown in orange, and all species that belong to a compartment must be localized within its contour. The command panel (2) contains a set of function menus and a Search tool, which can be used to find species and reactions by their name or id. When using neighbour view in the Command Panel, and hovering the cursor over a reaction/species or selecting it, the name of the reaction/species and a list of its properties, neighbours, ancestors or blocked module mates are shown in the text panel on the right (3).

Some of the tools in ModelExplorer can also output information to the Text Panel (Fig. 1, mark (3)).

Finding blocked reactions

The core of the ModelExplorer functionality is the identification of blocked reactions and metabolites that cannot be produced by a reconstructed metabolic network. This is called consistency checking. ModelExplorer provides the user with three different methods for doing this, named FBA, Bi-directional and Dynamic mode. The FBA and Bi-directional methods have previously been published in different implementations [16, 17].

In the FBA mode, a reaction is declared (and marked) blocked if it is unable to carry a steady-state (FBA) flux. A metabolite is shown as blocked if all reactions that can generate it are blocked. In order to reduce the time it takes to perform consistency checking in FBA mode, we have developed a radically improved version of the FastCC algorithm [17], which we call ExtraFastCC. It uses 36-80 times fewer optimization rounds than its predecessor. Detailed speed and complexity comparisons of our algorithm against FastCC can be found in the “Comparison with other software” section.

The FBA mode is useful for identifying which parts of a model may be removed without affecting the results of any FBA simulation. Restoring the consistency of these reactions may improve the model’s resilience against knock-outs. Using this mode, we consistency-checked 13 models from the OpenCOBRA model repository used by Ebrahim et al. [18] (iMM1415, iAF1260, iCac802, iAN840m, iMM904, iBsu1103, iND750, iMO1056, iJN746, iJR904, iNJ661, iFF708 and iRsp1095) and found 28% of all reactions to be blocked on average, with a standard deviation of 11%. This highlights that blocked reactions are a significant problem for most metabolic reconstructions.

In the bi-directional mode, we initiate the analysis by setting all reactions to be reversible. This step is followed by running the same algorithm as used for the FBA mode. The main purpose of the bi-directional mode is not to serve as an alternative to the FBA mode, but to provide the user with a quick way to check whether the inactivity of a certain part of the model is caused by an over-constrained or misdirected reaction. In addition to helping identify obvious errors, comparing the two modes can address a deeper dilemma: it is not always trivial to establish the reversibility of a reaction, as it is influenced by the relative concentrations of the participating chemical species. Concentrations may change depending on the abundance of available nutrients, altering reaction reversibility.
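The difference between the FBA mode and the bi-directional mode can be illustrated with a minimal sketch. It tests every reaction of a hypothetical four-reaction toy network with two linear programs (minimize and maximize its flux at steady state). This is the naive per-reaction test, not FastCC or ExtraFastCC, and it is not ModelExplorer's code: the network, the unit flux bounds, the tolerance and the function name `blocked_reactions` are assumptions made for the example, and SciPy's `linprog` stands in for the Clp optimizer.

```python
import numpy as np
from scipy.optimize import linprog

def blocked_reactions(S, reversible, tol=1e-6):
    """Return indices of reactions that cannot carry any steady-state flux."""
    n_mets, n_rxns = S.shape
    # Assumed bounds: [-1, 1] for reversible reactions, [0, 1] otherwise.
    bounds = [(-1.0, 1.0) if rev else (0.0, 1.0) for rev in reversible]
    blocked = []
    for j in range(n_rxns):
        c = np.zeros(n_rxns)
        c[j] = 1.0
        largest = 0.0
        for sign in (1.0, -1.0):  # +1 minimizes v_j, -1 maximizes v_j
            res = linprog(sign * c, A_eq=S, b_eq=np.zeros(n_mets),
                          bounds=bounds, method="highs")
            largest = max(largest, abs(res.fun))
        if largest <= tol:
            blocked.append(j)
    return blocked

# Toy network (rows: metabolites A, B, C). Reaction r1 is written as B -> A,
# i.e. it is misdirected relative to the intended route A -> B.
names = ["imp_A", "r1", "exp_B", "r_dead"]
S = np.array([
    [1.0,  1.0,  0.0, -1.0],   # A: made by imp_A and r1, used by r_dead
    [0.0, -1.0, -1.0,  0.0],   # B: used by r1 and exp_B
    [0.0,  0.0,  0.0,  1.0],   # C: made by r_dead, never consumed
])

# FBA mode: encoded reversibility (all irreversible in this toy case).
fba_blocked = [names[j] for j in blocked_reactions(S, [False] * 4)]
# Bi-directional mode: every reaction is treated as reversible.
bidir_blocked = [names[j] for j in blocked_reactions(S, [True] * 4)]
```

In this toy case every reaction is blocked in the FBA mode, while only `r_dead` remains blocked in the bi-directional mode. The three reactions that differ between the modes point to a directionality problem (here, `r1`), whereas `r_dead` is blocked by a dead-end metabolite regardless of direction.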

Finally, in the dynamic mode, a species is declared (and marked) blocked if it will block the biomass (growth) reaction when added to the list of its reactants. A reaction is then determined to be blocked if any of its reactants are blocked. The dynamic mode is useful in the process of assessing the fidelity of a draft reconstruction, since it allows us to identify which existing metabolites may potentially be part of the biomass reaction without blocking it. It is the only mode that will show valid results when the user has not yet added a biomass function or any export reactions to the model, as the dynamic mode can solely rely on imports. This mode also adds a higher level of realism compared to the FBA mode, since it shows if the reconstructed network can support a constant concentration of a metabolite during exponential growth. Unfortunately, this topic is usually overlooked as exponential growth cannot be directly addressed using the steady state approximation of FBA. The Dynamic mode is the fastest to compute among the three modes, and it always needs only one round of optimization. The details of the algorithm will be published elsewhere [Martyushenko, Almaas. In preparation].

If the user wants to know whether a specific reaction is blocked, ModelExplorer makes it possible to look up reactions and metabolites directly by their name through the search function, and highlights them in the network view (Fig. 2, panel a).

Fig. 2

ModelExplorer graphical features: (a) The search tool in the command panel can be used to search for reactions and species by their id or name. If one selects the desired item from a drop-down list of matches, a purple circle is drawn around the target, and a line of similar colour is drawn from the lower left corner of the network view to the circle. (b) In node ancestry mode, one can view the shortest pathway (all the way to import reactions) necessary to produce a species or to make a reaction active. The pathway is highlighted in dark purple if it is non-cyclic. If it is cyclic, the cycle (strongly connected component) is highlighted in black.

Exploring the network

When blocked reactions and metabolites are identified, the user is presented with four tracking tools for determining the source of error (accessed through the “Neighbour view” menu).

The first option, called “None,” does not highlight anything except the node itself. However, in this mode one may edit existing species, reactions and compartments by clicking on them. By using the Text Panel, it is possible to change their properties.

The second option, called “Ego-centric,” highlights the selected node’s direct neighbours and can be used for brute force exploration of blocked nodes. For instance, it makes it easy to distinguish reactants from products, as well as to assess which reactions produce and consume a metabolite.
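The kind of information the ego-centric view exposes can be sketched as a simple lookup of the reactions that produce and consume a given metabolite. The toy reactions and the function name `ego_neighbours` are hypothetical and only illustrate the idea.

```python
# Hypothetical toy stoichiometry: reaction id -> reactants and products
reactions = {
    "R1": {"reactants": ["glc"], "products": ["g6p"]},
    "R2": {"reactants": ["g6p"], "products": ["f6p"]},
    "R3": {"reactants": ["g6p"], "products": ["6pgl"]},
}

def ego_neighbours(metabolite, reactions):
    """Return (producers, consumers): the direct reaction neighbours of a metabolite."""
    producers = sorted(r for r, s in reactions.items() if metabolite in s["products"])
    consumers = sorted(r for r, s in reactions.items() if metabolite in s["reactants"])
    return producers, consumers

producers, consumers = ego_neighbours("g6p", reactions)
```

A metabolite with an empty producer list or an empty consumer list corresponds to an endpoint species, which is a natural starting point when hunting for the cause of a blocked region.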

The third option, called “Node ancestry,” is more intricate. Here, ModelExplorer will highlight the smallest subset of the network necessary to synthesize a species or activate a reaction in question, given that a non-cyclic solution exists. One such path is highlighted by ModelExplorer in Fig. 2 panel b. If the path is cyclic, the “Node ancestry” mode will instead highlight the cycle, defined as the strongly connected component.
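For the cyclic case, the strongly connected component that contains a given node can be computed as the intersection of the nodes reachable from it and the nodes that can reach it. The sketch below shows this standard graph technique on a hypothetical toy edge list; it is not ModelExplorer's implementation, and it does not attempt the harder task of finding the smallest producing subnetwork.

```python
from collections import defaultdict, deque

def reachable(start, adjacency):
    """Breadth-first search: all nodes reachable from start (start included)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def strongly_connected_component(node, edges):
    """Nodes that both reach `node` and are reachable from it."""
    forward = defaultdict(list)
    backward = defaultdict(list)
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)
    return reachable(node, forward) & reachable(node, backward)

# Hypothetical directed bipartite edges: a -> R1 -> b -> R2 -> a forms a cycle.
edges = [
    ("imp", "a"), ("a", "R1"), ("R1", "b"),
    ("b", "R2"), ("R2", "a"), ("b", "R3"), ("R3", "c"),
]
cycle = strongly_connected_component("a", edges)
```

Here the component of `a` consists of `a`, `R1`, `b` and `R2`; the import `imp` and the downstream branch through `R3` are outside the cycle.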

The fourth option is called “Blocked Module” and highlights unconnected modules, as described by Ponce-de-Leon et al. [10]. Each module is an unconnected (to other modules) group of blocked species and reactions, which can be addressed independently of other groups. This tool shows an output only when hovering over blocked items, highlighting the same module when hovering over any of its members. The “Blocked Module” tracking tool is special, because the user can choose to view the module separately from the rest of the network. This is done through the View menu. The layout algorithm is then run only on the blocked module, and the module is plotted on its own. This makes it much easier to visually identify the source of the inconsistency, as crowding in the visual display of the network is very much reduced. Model editing and tracking can be done on the module in the same way as on the whole model, with all changes being applied to the model itself.
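One plausible reading of a blocked module is a connected component of the subgraph formed by blocked species and reactions only. The sketch below follows that reading on hypothetical toy data; the exact definition in [10] and the implementation in ModelExplorer may differ, and the function name `blocked_modules` is an assumption.

```python
def blocked_modules(edges, blocked):
    """Group blocked nodes into modules that share no edge with each other."""
    neighbours = {node: set() for node in blocked}
    for a, b in edges:
        if a in blocked and b in blocked:   # ignore edges touching active nodes
            neighbours[a].add(b)
            neighbours[b].add(a)
    modules, seen = [], set()
    for start in sorted(blocked):
        if start in seen:
            continue
        stack, module = [start], set()
        while stack:                        # depth-first flood fill
            node = stack.pop()
            if node in module:
                continue
            module.add(node)
            stack.extend(neighbours[node] - module)
        seen |= module
        modules.append(module)
    return modules

# Hypothetical example: two independent blocked groups and one active pair.
edges = [("x", "Rx"), ("Rx", "y"), ("p", "Rp"), ("a", "Ra")]
blocked = {"x", "Rx", "y", "p", "Rp"}
modules = blocked_modules(edges, blocked)
```

Each returned set could then be laid out and inspected on its own, which mirrors the separate module view described above.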

Editing the network

ModelExplorer allows the user to interactively edit, add and delete any species, reaction or compartment in the model. It can even be used to build models from scratch by hand. Editing can be performed on any object with the “None” tracking tool option activated, by right-clicking the object and then altering its properties in the Text Panel. Adding and deleting objects is done through the “Add” and “Purge” menus. In addition to deleting objects one by one, ModelExplorer provides the user with several en masse node purging functions. These tools may be useful if, for instance, a reconstructed network has boundary (or extracellular) metabolites instead of import reactions. In that case, ModelExplorer can purge such species, allowing reactions consuming these metabolites to become import reactions.

We have observed that many publicly available reconstructed networks consist of multiple disconnected graphs, where all graphs except the one containing the biomass reaction are useless from a modelling perspective. To remove these, ModelExplorer includes a function to purge disconnected clusters. This function can also be useful after a purge of boundary or extracellular metabolites, which may leave behind rudimentary, disconnected reactions. The user also has the choice to purge only species and reactions that are unconnected to any other species or reaction, since we have observed some models to contain hundreds of unused metabolites.
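Purging disconnected clusters amounts to keeping only the part of the graph that is connected to the biomass reaction. A minimal sketch under that assumption follows; the node names and the function name `purge_disconnected` are hypothetical, and connectivity is treated as undirected.

```python
def purge_disconnected(edges, biomass):
    """Keep only edges whose endpoints are connected to the biomass node."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    keep, stack = set(), [biomass]
    while stack:                       # flood fill from the biomass reaction
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(neighbours.get(node, set()) - keep)
    return [(a, b) for a, b in edges if a in keep and b in keep]

# Hypothetical network with one stray cluster (x, R9, y).
edges = [("a", "R1"), ("R1", "b"), ("b", "BIOMASS"), ("x", "R9"), ("R9", "y")]
kept = purge_disconnected(edges, "BIOMASS")
```

Completely isolated species or reactions have no edges at all, so they never appear in an edge list like this one; removing them is the separate, narrower purge option mentioned above.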

Comparison with other software

To our knowledge, there are at least five other packages that address the issue of visualization of metabolic networks: MetDraw [19], Escher [20], Gephi [21], Cytoscape [22] with the cy3sbml [23] plugin, and MetExploreViz [24]. None of these tools can perform or visualize consistency checking, edit the underlying model or track neighbours, ancestry or unconnected modules.

MetDraw is also based on Graphviz. However, it does not provide an interactive network view, since it only outputs still images. Escher and MetExploreViz are interactive web applications centered around pathway visualization. These tools draw networks disentangled into pathways, for which human input is necessary, since the way one divides a network into pathways is strictly subjective. This approach means that side-metabolites appear plotted multiple times, which could complicate deciphering inconsistencies and tracking ancestry if such options were to be implemented.

Cytoscape and Gephi, on the other hand, are generalist network visualization tools. Cytoscape can use the cy3sbml plugin to import, lay out and view SBML files, while Gephi accepts only standard graph formats such as “dot”, requiring a prior conversion from SBML into one of these formats. Both tools can produce layouts similar to that of ModelExplorer, but they lack the other functionality mentioned above.

This highlights a principal difference between comparable existing software and ModelExplorer: Escher, MetExploreViz, Cytoscape and Gephi are mainly designed for network visualization of finished models, whereas ModelExplorer is designed for consistency checking and correction of metabolic models at any stage of construction and refinement. This does not mean that ModelExplorer is inferior at visualization; in terms of speed, the opposite is true. Visualization speed can be measured in terms of the frame rate, which is the number of times the image can change per second. A low frame rate slows down navigation around the network and can be very frustrating for the user. In order to conduct a reasonable side-by-side comparison of these very different visualization tools (except MetDraw, which makes still images and thus does not have a frame rate), we tested the frame rates when visualizing the iTO977 model using a Dell laptop with an Intel Core i5-5300U CPU (see Table 1). The comparisons revealed that ModelExplorer is approximately 10.7 times faster than Cytoscape, 9.3 times faster than Gephi (given similar visualization settings), 8.4 times faster than MetExploreViz and 2.8 times faster than Escher. It is important to note that Escher was tested with a model that was about 7 times smaller than iTO977. The reason was that the Escher documentation did not recommend launching bigger models for reasons of speed, and we therefore used the largest model provided on the Escher website.

Table 1

Frame rate comparison of ModelExplorer with similar software, when visualizing the iTO977 model

Software         Frame rate (frames per second)
ModelExplorer    16.0
Escher           5.7
Gephi            1.7
Cytoscape        1.5
MetExploreViz    1.9
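The speed-up factors quoted in the text can be approximately recomputed from the frame rates in Table 1. The short sketch below does this; because the table values are rounded to one decimal, the recomputed ratios may differ slightly from the figures quoted in the text.

```python
# Frame rates from Table 1 (frames per second)
frame_rates = {
    "ModelExplorer": 16.0,
    "Escher": 5.7,
    "Gephi": 1.7,
    "Cytoscape": 1.5,
    "MetExploreViz": 1.9,
}

reference = frame_rates["ModelExplorer"]

# Speed-up of ModelExplorer relative to each of the other tools
speedups = {
    name: reference / fps
    for name, fps in frame_rates.items()
    if name != "ModelExplorer"
}

for name, factor in sorted(speedups.items(), key=lambda item: -item[1]):
    print(f"{name}: {factor:.1f}x")
```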

Another area in which ModelExplorer offers a significant advance is the consistency checking algorithm we developed. Our algorithm, called “ExtraFastCC”, is based on the “FastCC” algorithm [17], but has a significantly improved ability to check the consistency of dead reversible reactions. The FastCC algorithm is used in consistency checking tools such as PSAMM [25]. Other tools, such as MC3 [26], base their consistency checking on flux variability analysis (FVA) [16] of every reaction, which is a much slower approach. When tested against FastCC, our algorithm performs 36-80 times better in terms of the number of linear optimization problems that need to be solved, and 3-15 times better in terms of CPU time. The FastCC algorithm needs to test nearly all of the dead reversible reactions in both directions, which can be seen from the numbers in Table 2. Note that ModelExplorer uses the relatively slow open-source optimizer Clp, whereas we ran FastCC from the COBRA toolbox together with the state-of-the-art commercial optimizer Gurobi [27]. Although our algorithm still has better computing times, the difference would be considerably larger with an optimizer faster than Clp. As this paper is focused on the visualization tool ModelExplorer, we leave the precise details of the algorithm to another paper.

Table 2

Run time and complexity comparisons of the ModelExplorer consistency checking algorithm “ExtraFastCC” against its predecessor “FastCC”

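Model     Reactions   Dead reversible reactions   FastCC LPs   FastCC CPU time   ExtraFastCC LPs   ExtraFastCC CPU time
iTO977    1536        120                         215          8.0               6                 0.8
iJO1366   2583        241                         489          27.2              6                 9.0
Recon1    3719        395                         794          117.6             21                7.9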
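The improvement factors quoted in the text can be checked against Table 2 with a short calculation. The sketch assumes that the last four values in each row of Table 2 are, in order, the number of linear programs and the CPU time for FastCC, followed by the same two quantities for ExtraFastCC; the table as extracted carries no column headers, so this reading is an assumption based on the surrounding text.

```python
# Assumed reading of the last four values of each Table 2 row:
# (FastCC LPs, FastCC CPU time, ExtraFastCC LPs, ExtraFastCC CPU time)
table2 = {
    "iTO977": (215, 8.0, 6, 0.8),
    "iJO1366": (489, 27.2, 6, 9.0),
    "Recon1": (794, 117.6, 21, 7.9),
}

# Ratio of FastCC to ExtraFastCC for LP count and CPU time, per model
lp_ratio = {m: row[0] / row[2] for m, row in table2.items()}
time_ratio = {m: row[1] / row[3] for m, row in table2.items()}

for model in table2:
    print(f"{model}: {lp_ratio[model]:.1f}x fewer LPs, "
          f"{time_ratio[model]:.1f}x less CPU time")
```

Under this reading, the LP-count ratios span roughly 36 to 82 and the CPU-time ratios roughly 3 to 15, in line with the ranges stated in the text.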
