Drug repositioning for dengue haemorrhagic fever by integrating multiple omics analyses

Bioinformatics

Transcriptomic analysis

We identified 1,817 signature genes with significant differences in expression (p < 0.015 in t-test) between the 10 DHF patients and 8 normal controls from the GSE18090 dataset. A heat map of the expression data is shown in Supplementary Fig. S1. We also identified 1,809 and 1,018 signature genes with significant differences in expression between DHF patients and normal controls in Nicaraguan children from the GSE25226 and GSE38246 datasets, respectively. The 3,892 signature genes that we obtained by integrating the results of the three GEO datasets are listed in Supplementary Table S1 and are presented in a Venn diagram in Fig. 2, which shows that 88 signature genes were common among the three datasets. We also analysed the signature genes with expression fold changes >1.5 or <0.666 (Supplementary Fig. S1). Based on the fold changes, 57 signature genes were common among the three datasets. The GSEA of the 3,892 signature genes identified 419 pathways among the canonical pathways of MSigDB and PWO as significant for DHF. Table 1 shows the top 10 pathways; 7 are from the Reactome database, 2 are from KEGG, and 1 is from PWO.
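The fold-change filter and the three-way comparison behind the Venn diagram can be illustrated with a minimal sketch. It assumes that the fold change is the ratio of mean expression in patients to mean expression in controls on a linear scale, which the text does not state explicitly. The function name, dataset labels, gene identifiers, and values are all hypothetical and are not taken from the GEO datasets.

```python
def fold_change_signature(case_mean, control_mean, up=1.5, down=0.666):
    """Return genes whose case/control ratio is above `up` or below `down`."""
    selected = set()
    for gene, case_value in case_mean.items():
        control_value = control_mean.get(gene)
        if not control_value:
            # Skip genes with a missing or zero control mean.
            continue
        ratio = case_value / control_value
        if ratio > up or ratio < down:
            selected.add(gene)
    return selected


# Hypothetical mean expression values (linear scale) for three toy datasets.
# Each entry is (patient means, control means).
toy_datasets = {
    "dataset_a": ({"g1": 3.0, "g2": 1.0, "g3": 0.5, "g4": 2.0},
                  {"g1": 1.0, "g2": 1.0, "g3": 1.0, "g4": 1.0}),
    "dataset_b": ({"g1": 2.0, "g2": 0.5, "g3": 0.6, "g4": 1.2},
                  {"g1": 1.0, "g2": 1.0, "g3": 1.0, "g4": 1.0}),
    "dataset_c": ({"g1": 1.6, "g2": 1.0, "g3": 0.9, "g5": 4.0},
                  {"g1": 1.0, "g2": 1.0, "g3": 1.0, "g5": 2.0}),
}

signatures = {
    name: fold_change_signature(case, control)
    for name, (case, control) in toy_datasets.items()
}

# Union: every gene that is a signature gene in at least one dataset.
union_genes = set().union(*signatures.values())
# Intersection: the centre of the Venn diagram.
common_genes = set.intersection(*signatures.values())
```

In this toy example only `g1` passes the filter in all three datasets, so it is the sole member of the central Venn region.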

Figure 2

Venn diagram of the signature genes detected in the transcriptomic analysis. The numbers of signature genes from the three GEO gene expression datasets (GSE18090, GSE25226, GSE38246) are shown in the green, light blue, and light green circles, respectively. The numbers of signature genes that overlap between these datasets are also shown.

Table 1 Top 10 pathways identified by GSEA of the transcriptomic expression data.

The probe list of the top 300 upregulated and bottom 300 downregulated signature genes from GSE18090 that were compatible with the HG-U133A platform of CMap was used to query the CMap system. Using the CMap permuted results, 85 compounds with statistical significance (p < 0.1) were identified, and 40 of them were assigned ATC codes and KEGG DGroups. We classified the selected compounds based on drug use, and analysed the distribution of both the ATC codes and KEGG DGroups for these compounds (Supplementary Fig. S3). At the first level of the anatomical main group in the hierarchy of ATC classification, 29 of the 40 compounds were assigned to the top six groups, namely “C” (cardiovascular system; 11 compounds), “S” (sensory organs; 6 compounds), “N” (nervous system; 5 compounds), “D” (dermatologicals; 5 compounds), “J” (anti-infectives for systemic use; 5 compounds), and “A” (alimentary tract and metabolism; 5 compounds). In KEGG DGroups, 30 of the 40 compounds were assigned to the top five groups, namely “Antibacterial” (16 compounds), “Cyp substrate” (12 compounds), “Cardiovascular agent” (11 compounds), “Neuropsychiatric agent” (9 compounds), and “Other” (5 compounds).
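The construction of the query probe list can be sketched as follows. This is an illustration only: it assumes that probes are ranked by a signed score such as a log fold change, which the text does not specify, and the function name, probe identifiers, and scores are hypothetical. It does not call the CMap system itself.

```python
def select_query_probes(scores, platform_probes, n_up=300, n_down=300):
    """Split platform-compatible probes into up- and down-regulated query lists."""
    # Keep only probes that exist on the target platform.
    eligible = {p: s for p, s in scores.items() if p in platform_probes}
    # Rank from most upregulated to most downregulated.
    ranked = sorted(eligible, key=eligible.get, reverse=True)
    up = ranked[:n_up]
    # Take the down list from what is left so the two lists never overlap.
    remaining = ranked[len(up):]
    down = remaining[-n_down:] if n_down > 0 else []
    return up, down


# Hypothetical signed scores; probe_6 is not on the (toy) platform.
toy_scores = {
    "probe_1": 2.5, "probe_2": -1.8, "probe_3": 0.3,
    "probe_4": 1.1, "probe_5": -0.7, "probe_6": 3.0,
}
toy_platform = {"probe_1", "probe_2", "probe_3", "probe_4", "probe_5"}

up_probes, down_probes = select_query_probes(toy_scores, toy_platform, n_up=2, n_down=2)
```

With the default arguments the same function would return the top 300 and bottom 300 probes, mirroring the list sizes described above.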

Proteomic analysis

We used the 389 signature proteins reported by Chiu et al.27 in the proteomic analysis (Supplementary Table S2). The GSEA was applied to the significant proteins in the same manner as in the transcriptomic analysis. A total of 255 statistically significant (p < 0.05) pathways were identified, and the top 10 pathways are shown in Table 2. Three of the drug pathways identified in PWO, the etoposide, sorafenib, and irinotecan pathways, are for drugs that are known antineoplastic agents. The drug pathways were found to be statistically significant (p < 0.05) by a chi-squared test at the first level of the hierarchy in pathway classification of PWO. Previously, Chiu et al.27 analysed the statistical significance of these pathways based on Gene Ontology and found that the protein degradation pathway was statistically significant. This pathway was also statistically significant (p = 0.0028) in our analysis.

Table 2 Top 10 pathways identified by GSEA of the proteomics data.

By searching STITCH 5.0 for drug candidates based on the proteomic data, we found 548 drug candidates that interacted with the 389 significant proteins. Of these, 412 compounds were assigned ATC codes and KEGG DGroups (KEGGID). The top five ATC groups were “C” (cardiovascular system), “L” (antineoplastic and immunomodulating agents), “A” (alimentary tract and metabolism), “G” (genitourinary system and sex hormones), and “S” (sensory organs), and contained 93, 90, 85, 82, and 81 compounds, respectively. The top five KEGG DGroups were “Cyp substrate”, “Other”, “Antibacterial”, “Cardiovascular agent”, and “Neuropsychiatric agent”, and contained 115, 92, 90, 79, and 63 drug candidates, respectively. The distribution of the compounds in ATC and KEGG DGroups was different (correlation coefficients −0.173 and −0.297, respectively) between the drug candidates obtained from the transcriptome analysis and those obtained from the proteome analysis. This is because the drug candidates from the transcriptomic analysis were identified based on the effect of the drug–disease relationship on the gene expression profiles in patient data (DHF vs ND), whereas the candidates from the proteomic analysis were obtained based on the relationship between chemical compounds and proteins in the cellular response.
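The comparison of group distributions between two analyses can be sketched as a correlation over aligned group counts. The sketch assumes a Pearson correlation coefficient, which the text does not name explicitly, and the group counts are hypothetical rather than the reported values. Because the Pearson coefficient is invariant to scaling, raw counts and normalized proportions give the same result.

```python
import math


def align_counts(counts_a, counts_b):
    """Put two group-count dictionaries on a shared, sorted list of groups."""
    groups = sorted(set(counts_a) | set(counts_b))
    x = [counts_a.get(g, 0) for g in groups]
    y = [counts_b.get(g, 0) for g in groups]
    return groups, x, y


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts of compounds per first-level group in two analyses.
analysis_one = {"C": 4, "S": 3, "N": 2}
analysis_two = {"C": 8, "S": 6, "N": 4}
analysis_three = {"C": 2, "S": 3, "N": 4}

_, x, y = align_counts(analysis_one, analysis_two)
similar = pearson(x, y)

_, x, z = align_counts(analysis_one, analysis_three)
dissimilar = pearson(x, z)
```

Groups missing from one analysis are counted as zero, so a group present in only one candidate list lowers the correlation rather than being silently dropped.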

Interactomic analysis

We searched the literature for reported PPIs between human and dengue virus and found several relevant publications29,30,31,32,33. The proteins reported in these dengue virus–human PPI studies included all 10 dengue virus proteins (capsid protein C, membrane protein M, envelope protein E, nonstructural protein (NS)1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5). Folly et al.29 reported 36 human–dengue virus protein interactions that included the dengue virus C, M, and E proteins. Khadka et al.30 reported 139 human–dengue virus protein interactions that included the dengue virus M, NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5 proteins. Le Breton et al.31 reported 186 human–dengue virus protein interactions that included the dengue virus NS3 and NS5 proteins. Mairiang et al.32 reported 46 human–dengue virus protein interactions that included the dengue virus C, NS3, and NS5 proteins29,30,31,32. We integrated the PPI data from these studies and obtained a total of 268 PPIs (Supplementary Table S3), which involved all 10 dengue virus proteins and 221 human proteins. The human–viral PPI network incorporating these identified PPIs is shown in Fig. 3. The top five dengue virus proteins in terms of the number of interactions with human proteins were NS5, NS3, C, NS2A, and NS2B, which displayed 72, 68, 29, 28, and 20 interactions, respectively. Eight of the 10 virus proteins interacted with more than 10 human proteins. NS4B had the lowest number of interactions with only three. Human protein DDX5 interacted with four virus proteins, the highest number of interactions, followed by 5 and 32 human proteins that interacted with three and two virus proteins, respectively. By contrast, 183 human proteins interacted with only one virus protein each.
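The integration of interaction lists and the per-protein interaction counts can be sketched as follows. The sketch assumes that each study is available as a list of (viral protein, human protein) pairs and that viral protein names are matched after normalizing letter case. The edge lists and human protein labels are hypothetical placeholders, not entries from the cited studies.

```python
from collections import Counter, defaultdict


def integrate_ppis(*studies):
    """Merge edge lists from several studies into one de-duplicated edge set."""
    merged = set()
    for study in studies:
        for viral, human in study:
            # Normalize case so that e.g. "NS2a" and "NS2A" are the same node.
            merged.add((viral.upper(), human))
    return merged


def partner_tables(ppis):
    """Map each viral protein to its human partners and vice versa."""
    viral_partners = defaultdict(set)
    human_partners = defaultdict(set)
    for viral, human in ppis:
        viral_partners[viral].add(human)
        human_partners[human].add(viral)
    return viral_partners, human_partners


# Hypothetical edge lists standing in for three published studies.
study_1 = [("C", "H1"), ("NS3", "H2")]
study_2 = [("NS3", "H2"), ("NS5", "H1"), ("NS2a", "H3")]
study_3 = [("NS2A", "H3"), ("NS5", "H2")]

network = integrate_ppis(study_1, study_2, study_3)
viral_partners, human_partners = partner_tables(network)

# Number of human partners per viral protein.
viral_degree = {v: len(h) for v, h in viral_partners.items()}
# How many human proteins interact with 1, 2, ... viral proteins.
human_degree_histogram = Counter(len(v) for v in human_partners.values())
```

Storing edges in a set means that an interaction reported by more than one study is counted once in the integrated network.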

Figure 3

Human protein and dengue virus protein interaction network. Blue nodes represent the dengue viral proteins and are labelled with the corresponding gene names. Pink nodes represent the human proteins and are labelled with the corresponding UniProt ID. The black edges show the interactions between human proteins and dengue viral proteins as determined by our interactomic analysis of the experimental data.

In searching for drug candidates from the interactomic data, we found 415 drug candidates in STITCH that interacted with the 221 human proteins identified in the human–viral PPIs. Of these 415 compounds, 315 were assigned ATC codes and KEGG DGroups. We analysed the distribution of these ATC codes and KEGG DGroups in the same manner as in the proteome analysis described above. For the ATC codes, the top five anatomical main groups were “C” (cardiovascular system), “A” (alimentary tract and metabolism), “G” (genitourinary system and sex hormones), “L” (antineoplastic and immunomodulating agents), and “S” (sensory organs), and contained 77, 75, 69, 61, and 59 compounds, respectively. For the KEGG DGroups, the top five groups in the first level were “Cyp substrate”, “Cardiovascular agent”, “Other”, “Antibacterial”, and “Neuropsychiatric agent”, and contained 85, 72, 69, 46, and 39 compounds, respectively. The distributions of compounds in ATC and KEGG DGroups between the proteomics analysis and the interactomic analysis were similar (correlation coefficients 0.888 and 0.832, respectively). The high number of overlapping proteins between the proteome and interactome analyses is clearly responsible for the similarity in the distribution of the compounds obtained from STITCH.

Multiple omics analysis

We conducted a multiple-step comparison of the 3,892 significant genes identified by the transcriptomic analysis, 389 proteins identified by the proteomic analysis, and 221 human proteins identified by the human–dengue virus PPIs as shown in Fig. 4. The genes and proteins are listed in Supplementary Table S4. We found that 41 proteins overlapped between the signature gene products and the human proteins from the human–virus PPIs, and 11 proteins overlapped between the signature proteins and the human proteins from the human–virus PPI network, namely ACTG1, CALR, CLU, ERC1, HSPA5, KTN1, NUP50, PABPC1, PAIP1, RRP12, and SYNE2. Notably, six of these proteins (CALR, ERC1, HSPA5, KTN1, NUP50, and SYNE2) are likely to play important roles in the infectious mechanisms of DHF. HSPA5, also known as GRP78, encodes a 78-kDa glucose-regulated protein, which is the HSP70 molecular chaperone in the endoplasmic reticulum. Previous transcriptome and proteome analyses have shown that HSPA5 was upregulated and a 78-kDa glucose-regulated protein was enriched in dengue virus-infected cells47,48,49. It has been suggested that the 78-kDa glucose-regulated protein may be a component of the dengue virus receptor complex that supports dengue virus entry or facilitates viral protein production48,50. CALR and ERC1 are two of six significant proteins in replication of a dengue virus replicon. CALR encodes calreticulin, which colocalized with viral dsRNA and with the viral NS3 and NS5 proteins in dengue virus-infected cells, consistent with a direct role for calreticulin in dengue virus replication30. NUP50 encodes nuclear pore protein Nup50, which is a component of the nuclear pore complex. The nuclear pore complex is involved in transporting the dengue virus genome and has been suggested as a therapeutic target for many virus infections51. KTN1 encodes kinectin, which is an endoplasmic reticulum protein that extends the endoplasmic reticulum along microtubules. The endoplasmic reticulum is known to be a central organelle in dengue virus replication. Our transcriptomics, proteomics, and interactomics analyses identified these four proteins as significant proteins in dengue infection.

Figure 4

Venn diagram of signature genes, signature proteins, and human proteins that interact with dengue virus proteins. The number of signature genes obtained from the gene expression data with significant differences between DHF patients and normal controls is shown in the blue circle. The number of signature proteins obtained from the protein expression data is shown in the orange circle. The number of human proteins that interact with dengue virus proteins in human–virus PPIs is shown in the green circle. The numbers of gene products and proteins that overlap between the groups are also shown.

Figure 5 shows a comparison of the expression profiles for the pathways identified by GSEA of the transcriptomic and proteomic data, which revealed 115 pathways as common between the two analyses (Supplementary Table S5). A total of 559 detected pathways were categorized into Reactome, PWO, and KEGG pathways for hierarchical analysis (Fig. 6). We examined the distribution of assigned pathways in the first hierarchical level to study the coarse-grained functions for a large set of pathways. Based on the distributions shown in Fig. 6, we assigned the pathways that either existed only among the common pathways or that increased the ratio drastically in these pathways. Specifically, among the 115 common pathways, “Metabolism of proteins,” which is a parent of the “Unfolded protein response” in Reactome, “Regulatory pathway,” which is a parent of the “Protein degradation pathway” in PWO, and “Cellular Processes and Human Diseases” in KEGG, were obtained from the transcriptome and proteome analyses and may play central roles in the infectious mechanism of dengue virus. Furthermore, the top three common pathways in the combined transcriptomics and proteomics analysis (Fig. 6, top panel) are “Drug pathway”, “Regulatory pathway”, and “Metabolism”. The “Drug pathway” in PWO is a pharmacokinetics and pharmacodynamics pathway that is elicited by the administration of specific drugs.

Figure 5

Venn diagram of the significant pathways in the transcriptomic and proteomic analyses. The numbers of significant pathways identified by GSEAs of the transcriptome and proteome are shown in the blue and orange circles, respectively. The number of pathways that overlap between the two analyses is also shown.

Figure 6

Histograms showing the normalized distribution probability of pathways identified by transcriptomics alone, proteomics alone, and the combination of transcriptomics and proteomics. The bars show the proportion of each group among the total number of pathway types identified by Reactome (blue), PWO (orange), or KEGG (green). The top panel shows the distribution of the 115 common pathways identified by both transcriptomics and proteomics GSEAs (T&P). The middle panel shows the distribution of the 304 pathways identified only by the transcriptomics GSEA (T). The bottom panel shows the distribution of the 140 pathways identified only by the proteomics GSEA (P). The sum of all the groups in each section is 1.0 for all three panels.
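The pathway counts quoted for the three panels follow from the two GSEA results and their overlap, and the short sketch below checks that arithmetic and illustrates the normalization to a sum of 1.0. The three input counts (419, 255, and 115) are the values reported in the text. The category counts passed to the hypothetical `normalize` helper are made up for illustration.

```python
# Pathway counts reported in the text.
transcriptome_pathways = 419  # significant in the transcriptomic GSEA
proteome_pathways = 255       # significant in the proteomic GSEA
common_pathways = 115         # significant in both

# Pathways found by only one of the two analyses.
transcriptome_only = transcriptome_pathways - common_pathways
proteome_only = proteome_pathways - common_pathways

# Inclusion-exclusion: every pathway detected by at least one analysis.
total_pathways = transcriptome_pathways + proteome_pathways - common_pathways


def normalize(counts):
    """Convert category counts into proportions that sum to 1.0."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize an empty distribution")
    return {category: n / total for category, n in counts.items()}


# Hypothetical first-level category counts for one panel.
toy_panel = {"Drug pathway": 5, "Regulatory pathway": 2, "Metabolism": 3}
toy_distribution = normalize(toy_panel)
```

The derived values reproduce the 304 transcriptome-only pathways, the 140 proteome-only pathways, and the 559 pathways in total.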

Figure 7 shows the drug candidates selected using CMap with gene expression profiles and interactomic relationships between signature proteins and chemical compounds. In DHF vs ND, 85 drug candidates were found by CMap. To identify drug candidates in the proteomic analysis, we searched STITCH 5.0 and identified 548 drug candidates that interacted with the 389 significant proteins selected by the protein expression data. For the interactome, we applied the same method to the 221 human proteins found to interact with dengue virus proteins in the human–dengue virus PPIs and obtained 415 drug candidates. Finally, we detected 13 drug candidates that were common to the analyses of all three layers (Fig. 7).

Figure 7

Venn diagram for the drug candidates identified by the transcriptomic, proteomic, and interactomic analyses. The numbers of drug candidates identified by the transcriptomic, proteomic, and interactomic analyses are shown in the blue, orange, and green circles, respectively. The numbers of drug candidates that overlap between each analysis type are also shown.

A diagram of the process used to filter and narrow down the drug candidates based on the common proteins (intersection of proteins in Fig. 4), the common pathways (intersection of pathways in Fig. 5), and the common drugs (intersection of drugs in Fig. 7) of the three layers is shown in Fig. 8. This process yielded 11 target proteins, 115 target pathways, and 13 drug candidates. Based on the interactions between the drug candidates and the target proteins, we found eight drug candidates and nine proteins. When we focused on the proteome and transcriptome analyses, we found that seven of the 11 target proteins could be mapped to 43 of the 115 target pathways. By combining the above findings, we narrowed the drug candidates that targeted the human proteins in human–dengue virus PPIs and the signature proteins in the proteomic analysis, mapped to 33 significant pathways, down to only eight. We found that five proteins (ACTG1, CALR, ERC1, HSPA5, SYNE2) out of the seven could be mapped to these 33 pathways. These five proteins interacted with one viral protein each. For example, ACTG1 interacted with NS3; CALR, ERC1, and HSPA5 interacted with NS5; and SYNE2 interacted with NS2A, as shown in Fig. 3. The likely relationships among drug candidates, proteins, and pathways are shown in Fig. 9. The eight identified drugs are likely to be effective for DHF and may be suitable for drug repositioning for this purpose. The eight drug candidates are listed in Table 3, and their structures and clustering analysis are shown in Supplementary Fig. S6(a,b).

Figure 8

Schematic diagram of the filtering method used to narrow down the drug candidates. Thirteen drugs were identified as the intersection of the drug candidates from the transcriptomic, proteomic, and interactomic analyses (Fig. 7). Eleven proteins were identified as the intersection between the signature proteins of the proteomic analysis and the human proteins in the human–viral PPIs (Fig. 4). A total of 115 pathways were identified as the intersection between the significant pathways determined by GSEA of the transcriptomic analysis and those determined by GSEA of the proteomic analysis (Fig. 5). First, nine proteins were selected as the intersection between the proteins that interacted with the 13 drugs and the 11 signature proteins. These nine proteins interacted with eight of the 13 drugs. Second, seven proteins were selected as the intersection between the proteins participating in the 115 pathways and the 11 signature proteins. These seven proteins were mapped to 43 of the 115 pathways. Finally, five proteins were selected as the intersection between the nine proteins that interacted with the 13 drugs and the seven proteins participating in the 43 pathways. These five proteins interacted with eight drugs and participated in 33 of the 43 pathways. These eight drugs were then identified as candidates for use in future drug repositioning.
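The three filtering steps can be sketched with plain set operations. This is an illustration under stated assumptions, not the authors' implementation: it assumes that drug-protein interactions and pathway memberships are available as simple lookup tables, and the function name and all drug, protein, and pathway labels are hypothetical.

```python
def filter_candidates(candidate_drugs, target_proteins, target_pathways,
                      drug_targets, pathway_members):
    """Narrow drugs, proteins, and pathways down to a mutually connected core."""
    # Step 1: target proteins that interact with at least one candidate drug.
    drug_hit = set()
    for drug in candidate_drugs:
        drug_hit |= set(drug_targets.get(drug, ()))
    drug_hit &= target_proteins

    # Step 2: target proteins that participate in at least one target pathway.
    pathway_hit = set()
    for pathway in target_pathways:
        pathway_hit |= set(pathway_members.get(pathway, ()))
    pathway_hit &= target_proteins

    # Step 3: proteins that satisfy both conditions, plus the drugs and
    # pathways that remain connected to them.
    final_proteins = drug_hit & pathway_hit
    final_drugs = {d for d in candidate_drugs
                   if final_proteins & set(drug_targets.get(d, ()))}
    final_pathways = {w for w in target_pathways
                      if final_proteins & set(pathway_members.get(w, ()))}
    return final_proteins, final_drugs, final_pathways


# Hypothetical inputs standing in for the three intersections.
toy_drugs = {"drug_a", "drug_b", "drug_c"}
toy_proteins = {"P1", "P2", "P3", "P4"}
toy_pathways = {"path_x", "path_y", "path_z"}
toy_drug_targets = {"drug_a": {"P1", "P9"}, "drug_b": {"P2"}, "drug_c": {"P8"}}
toy_pathway_members = {"path_x": {"P1", "P3"}, "path_y": {"P7"}, "path_z": {"P1"}}

proteins, drugs, pathways = filter_candidates(
    toy_drugs, toy_proteins, toy_pathways, toy_drug_targets, toy_pathway_members
)
```

In the toy data, `P2` is reachable from a drug but lies in no target pathway, and `P3` lies in a pathway but has no candidate drug, so only `P1` survives the final intersection.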

Figure 9

Chord diagram of the likely relationships among drug candidates, proteins, and pathways identified by the multiple omics analyses. Based on the drug repositioning method, eight drugs, five proteins, and 33 pathways were selected as potential candidates for use in the development of treatments for DHF. The drug candidates, proteins, and pathways are shown in green, orange, and cyan, respectively. Connections show the interactions among the drug candidates, proteins, and pathways. The Reactome, PWO, and KEGG pathways are shown in blue, orange, and green, respectively.

Table 3 The eight drug candidates identified by multiple omics analysis in this study.

Five of the eight drug candidates identified in this study (valproic acid, sirolimus, resveratrol, vorinostat, and Y-27632) have already been reported as effective in inhibiting infections with other flaviviruses. Valproic acid inhibited contact with the E protein of other flaviviruses, so it is expected to be similarly effective against dengue virus52. Sirolimus inhibited viral growth and viral protein expression in flaviviruses53,54. Resveratrol exerted a negative effect on dengue virus replication55. Vorinostat was shown to have a potential synergistic effect for the treatment of West Nile virus encephalitis56. Y-27632, a Rho-associated coiled-coil forming kinase (ROCK) inhibitor, was found to block dengue virus type 2 infection57. The likely relationships and connections among the identified drug candidates, proteins, and pathways based on their categorization by Reactome, PWO, and KEGG are shown in Fig. 9. Ranked by the number of connections to the five proteins, the leading drugs are valproic acid, resveratrol, estradiol, and diethylstilbestrol, in that order. The remaining drug candidates each have one or two connections. Valproic acid is the drug most expected to inhibit dengue infection, and it has been previously demonstrated to be an inhibitor of other flaviviruses. All of the drugs except vorinostat, seven in total, interact with ACTG1, which has the highest number of connections to the drugs among the five proteins. ACTG1 plays a role in reorienting the cytoskeleton and interacts with vimentin. It is known that rearrangement of the cytoskeleton in host cells is closely involved in the virus life cycle, from virus entry to egress. Vimentin is the major component of mesenchymal cells. The vimentin rearrangement induced by dengue infection can be blocked by the inhibitor drugs. Figure 9 is broken down in Supplementary Figs S4 and S5 to show the relationships based on Reactome and PWO separately.

HSPA5 has the highest number of connections to the represented pathways. It is known that the unfolded protein response is a pro-survival cellular reaction induced in response to dengue virus-mediated endoplasmic reticulum stress58,59. We found that these pathways were related directly to the unfolded protein response. These pathways are included in Reactome as “activation of chaperones by ATF6-alpha” and “activation of chaperone genes by ATF6-alpha” as well as in PWO as “pathway pertinent to protein folding, sorting, modification, translocation and degradation” and “protein degradation pathway”. HSPA5 and CALR are always involved in these pathways. They interact with chaperone proteins and regulate their functions in protein folding and protein degradation. All of the drugs except Y-27632, seven in total, interact with the proteins annotated to the unfolded protein response. A potential anti-dengue virus agent that targets HSPA5 has been developed. HSPA5 localizes to the cell surface and associates with dengue virus receptor complexes, and the agent was shown to block entry of the dengue virus by disrupting this association60. The drug candidates identified in our study are expected to suppress gene expression levels and disrupt the association of host proteins with dengue virus proteins.

Our computational approach using multiple omics data for drug repositioning successfully detected five drugs, out of the eight candidates, that have already been used to treat infectious diseases caused by flaviviruses. The five reported drugs are all non-anticancer agents. Excluding the anti-cancer agents in Table 3, we found two drugs, estradiol and simvastatin, that have not been reported previously for the treatment of flavivirus infections. The seven drug candidates are considered to be strong candidates for drug repositioning to prevent dengue virus replication or ameliorate symptoms.
