GeNeCK: a web server for gene network construction and visualization

Bioinformatics

GeNeCK allows users to construct networks using 11 different methods (summarized in Additional file 3: Table S1). Readers can refer to Yu et al. [7] for a comprehensive review of the different network construction methods.

Network inference methods

Partial correlation-based methods estimate the inverse covariance matrix Ω (also known as the precision matrix) of the gene expression levels, in which ωj,h=0 indicates that genes j and h are conditionally independent given the expression of all the other genes. GeneNet [11] employs the Moore-Penrose pseudoinverse and bootstrap methods to obtain a shrinkage estimate of Ω. Meinshausen and Bühlmann [12] proposed the neighborhood selection (NS) method, which converts the precision matrix estimation problem into a regression problem by fitting a LASSO model to each gene using the others as predictors. Sparse partial correlation estimation (SPACE) is a joint sparse regression model developed by Peng et al. [13], which solves a symmetrically constrained and L1-regularized regression problem under high-dimensional settings.
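To make the link between Ω and conditional independence concrete, the following minimal sketch inverts a small covariance matrix and converts the result to partial correlations using the standard identity \(\rho_{jh} = -\omega_{jh}/\sqrt{\omega_{jj}\,\omega_{hh}}\). The function names and the toy covariance matrix are illustrative assumptions and are not part of GeNeCK or of any of the cited packages.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the input with the identity matrix: [A | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest absolute entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]

def partial_correlations(cov):
    """Return the partial correlation matrix implied by a covariance matrix."""
    omega = invert(cov)  # precision matrix
    p = len(omega)
    return [[1.0 if j == h else -omega[j][h] / math.sqrt(omega[j][j] * omega[h][h])
             for h in range(p)]
            for j in range(p)]

# Toy example (hypothetical values): three genes in a chain 0 - 1 - 2.
# Genes 0 and 2 are marginally correlated (0.25) but only through gene 1.
toy_cov = [[1.0, 0.5, 0.25],
           [0.5, 1.0, 0.5],
           [0.25, 0.5, 1.0]]
toy_pcor = partial_correlations(toy_cov)
```

In this toy case the entry for genes 0 and 2 in the precision matrix is zero, so their partial correlation vanishes even though their marginal correlation does not. With real expression data the number of genes typically exceeds the number of samples, the sample covariance is not invertible, and regularized estimators such as those described above are needed instead of a plain inverse.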

Likelihood-based approaches, such as graphical LASSO (GLASSO [14]) and GLASSO with a reweighted strategy for scale-free networks (GLASSO-SF [15]), optimize a penalized maximum likelihood function to estimate Ω. Bayesian graphical LASSO (BayesianGLASSO [16]) is a fully Bayesian treatment of GLASSO that uses a double exponential prior and employs a block Gibbs sampler for exploring the posterior distribution.
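For reference, the penalized likelihood problem solved by graphical LASSO is commonly written as follows. The symbols S (the sample covariance matrix), λ (a non-negative tuning parameter), and \(\|\Omega\|_1\) (the elementwise L1 norm) are notation introduced here for illustration and do not appear in the text above.

$$\hat{\Omega} = \underset{\Omega \succ 0}{\arg\max} \left\{ \log\det\Omega - \operatorname{tr}(S\Omega) - \lambda \|\Omega\|_1 \right\} $$

Larger values of λ shrink more entries of Ω to exactly zero and therefore yield sparser networks.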

Mutual information (MI) is an information-theoretic measure of the pairwise dependency between two variables. Zhang et al. [17] proposed a path consistency algorithm based on conditional mutual information (PCACMI) to infer the graphical structure, and later proposed the conditional mutual inclusive information-based network inference (CMI2NI [18]) method, which improves on PCACMI.
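As a small illustration of MI as a dependency measure, the sketch below computes the MI of two expression vectors under a Gaussian assumption, where it reduces to \(\tfrac{1}{2}\log\left(\sigma_x^2\sigma_y^2 / |C(x,y)|\right) = -\tfrac{1}{2}\log(1-\rho^2)\). The Gaussian assumption, the function name, and the toy data are choices made for this example; the sketch is not the PCACMI or CMI2NI implementation.

```python
import math

def gaussian_mutual_information(x, y):
    """Estimate MI (in nats) between two samples, assuming joint normality."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    # Determinant of the 2x2 joint covariance matrix C(x, y).
    det_joint = var_x * var_y - cov_xy ** 2
    if det_joint <= 0:
        # Perfectly (anti-)correlated samples: MI is unbounded.
        return math.inf
    return 0.5 * math.log(var_x * var_y / det_joint)

# Toy expression vectors (hypothetical values) with sample correlation 0.8.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 1.0, 4.0, 3.0, 5.0]
mi_ab = gaussian_mutual_information(gene_a, gene_b)
```

An MI of zero corresponds to uncorrelated (and, under normality, independent) genes, and the value grows without bound as the absolute correlation approaches one.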

Hub gene incorporation

Gene networks usually have scale-free characteristics. In other words, there are usually a few hub genes regulating many others. In practice, most such hub genes in biological pathways have been well studied and validated through biological experiments. To properly incorporate this prior knowledge, Yu et al. [7] proposed the extended sparse partial correlation estimation (ESPACE) and extended graphical LASSO (EGLASSO) methods. In these methods, hub gene information is incorporated during the estimation step of the original SPACE and GLASSO methods to improve network inference.
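One way such prior knowledge could enter a penalized estimator is through edge-specific penalty weights, with edges that touch a known hub gene penalized less than the rest. The sketch below builds such a weight matrix. It assumes this reduced-penalty formulation; the function name, the gene labels, the factor `alpha`, and the choice to leave the diagonal unpenalized are hypothetical and are not taken from the ESPACE or EGLASSO software.

```python
def hub_penalty_weights(genes, hub_genes, alpha=0.5):
    """Build a symmetric matrix of penalty multipliers for a weighted L1 penalty.

    Edges involving at least one hub gene get multiplier `alpha` (assumed to lie
    in (0, 1]); all other edges get 1.0. Diagonal entries are set to 0.0, i.e.
    left unpenalized, which is an assumption of this sketch.
    """
    hubs = set(hub_genes)
    weights = []
    for gene_j in genes:
        row = []
        for gene_h in genes:
            if gene_j == gene_h:
                row.append(0.0)
            elif gene_j in hubs or gene_h in hubs:
                row.append(alpha)
            else:
                row.append(1.0)
        weights.append(row)
    return weights

# Hypothetical gene labels: HUB1 is treated as a known hub.
example_weights = hub_penalty_weights(["HUB1", "G2", "G3"], ["HUB1"], alpha=0.5)
```

Multiplying the overall tuning parameter by these weights makes hub-connected edges cheaper to include, so the estimated network is more likely to retain them.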

Network integration

An ensemble-based network aggregation (ENA) method [6] combines networks reconstructed from different methods. The original ENA algorithm does not report the confidence level of the estimated edges. To derive the p-value of an edge between a pair of genes, we adapted ENA by adding a permutation step that generates the null distribution. We first permute the given gene expression dataset to obtain a resampled dataset \(D^{(m)}\). Then we apply the ENA algorithm to obtain the ensemble rank matrix \(\tilde{R}^{(m)}\) for this dataset. This procedure is repeated M times. The empirical null distribution \(F^{\text{null}}\) of all possible pairwise connections among p genes is obtained from all the harmonic means in the M permutations, i.e. \(\left\{\tilde{r}^{(m)}_{jh},\ m=1,\ldots,M,\ 1\leq j < h \leq p\right\}\). The p-value of the estimated edge between genes j and h is then approximated by the quantile of \(\tilde{r}_{jh}\) in the null distribution \(F^{\text{null}}\), with Benjamini-Hochberg adjustment [19] to account for multiple comparisons.

$$D\xrightarrow{\text{permutate}} \left\{ \begin{array}{ccc} D^{(1)} & \xrightarrow{\text{ENA}} & \tilde{R}^{(1)}\\ \vdots & & \vdots \\ D^{(M)} & \xrightarrow{\text{ENA}} & \tilde{R}^{(M)} \end{array} \right\} \rightarrow F^{\text{null}}, $$

$${\begin{aligned} {p}\text{-value}(jh)=\text{BHadjust}\left(\frac{\#\text{ of }\tilde{r}_{jh}\leq \text{ permutated}~r \text{ value in }F^{\text{null}}} {\text{Total }\#\text{ of permutated}~r \text{ values in }F^{\text{null}}}\right). \end{aligned}} $$
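The permutation p-value and the Benjamini-Hochberg adjustment can be sketched as follows. This is a minimal illustration, not the GeNeCK implementation: it assumes the ENA scores have already been computed, that a larger score indicates a stronger edge (so the p-value is the fraction of null scores at least as large as the observed one), and that the null scores from all M permutations are pooled into a single list. The function names and toy numbers are hypothetical.

```python
import bisect

def empirical_p_values(observed, null_scores):
    """Fraction of pooled null scores that are >= each observed edge score.

    observed: dict mapping an edge (gene_j, gene_h) to its ensemble score.
    null_scores: list of scores pooled over all permutations and gene pairs.
    """
    sorted_null = sorted(null_scores)
    total = len(sorted_null)
    p_values = {}
    for edge, score in observed.items():
        # bisect_left gives the number of null scores strictly below `score`.
        count_ge = total - bisect.bisect_left(sorted_null, score)
        p_values[edge] = count_ge / total
    return p_values

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values for a dict of raw p-values."""
    edges = sorted(p_values, key=p_values.get)  # ascending raw p-value
    m = len(edges)
    adjusted = {}
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        edge = edges[rank - 1]
        running_min = min(running_min, p_values[edge] * m / rank)
        adjusted[edge] = running_min
    return adjusted

# Toy example with hypothetical scores.
null_pool = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
observed_scores = {("g1", "g2"): 0.95, ("g1", "g3"): 0.5, ("g2", "g3"): 0.15}
raw_p = empirical_p_values(observed_scores, null_pool)
adjusted_p = benjamini_hochberg(raw_p)
```

With a finite number of permutations this estimator can return a raw p-value of exactly zero, so the smallest reportable p-value is limited by the size of the pooled null distribution.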

In the simulation studies, we combined the networks constructed by NS, GLASSO, GLASSO-SF, PCACMI, SPACE, and BayesianGLASSO into an ensemble. GeneNet and CMI2NI were excluded because GeneNet performed the worst in all scenarios (Additional file 4: Figure S1-S8) and CMI2NI produced exactly the same results as PCACMI under default settings. We ran all the processes on a single node of the UT Southwestern BioHPC cluster (Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz, 32GB RAM).
