1| Introduction

SubpathwayMiner is an integrative subpathway identification platform for transcriptomics and/or metabolomics data. Genes and/or metabolites of interest are taken as input and mapped onto KEGG pathways, and the pathways are then split into subpathways. For each subpathway, the hypergeometric test is used to calculate the significance of enrichment, and a Cytoscape plugin displays the network constructed from the subpathways.
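To make the enrichment step concrete, the following is a minimal sketch of a one-sided hypergeometric test in Python. It is not SubpathwayMiner's own code; the function name and the argument names (`background_size`, `background_hits`, `submitted_size`, `submitted_hits`) are chosen here for illustration, and the assumption is that the p-value is the probability of seeing at least the observed number of submitted molecules in the subpathway.

```python
from math import comb


def hypergeom_pvalue(background_size, background_hits, submitted_size, submitted_hits):
    """Upper-tail hypergeometric probability P(X >= submitted_hits).

    background_size: all molecules in the background (assumed name)
    background_hits: background molecules annotated to the subpathway
    submitted_size:  molecules submitted by the user
    submitted_hits:  submitted molecules annotated to the subpathway
    """
    total = comb(background_size, submitted_size)
    upper = min(submitted_size, background_hits)
    favourable = sum(
        comb(background_hits, i)
        * comb(background_size - background_hits, submitted_size - i)
        for i in range(submitted_hits, upper + 1)
    )
    return favourable / total


# Toy numbers: 10 background molecules, 5 of them in the subpathway,
# 5 submitted molecules, all 5 of which fall in the subpathway.
p = hypergeom_pvalue(10, 5, 5, 5)
```

A small p-value means that the submitted molecules fall in the subpathway more often than random sampling from the background would suggest.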

2| Data Upload

Click the "Data Upload" item in the left menu to send data to SubpathwayMiner. Both genes and compounds can be uploaded; by selecting different tags at the top of the window, genes and compounds can be uploaded individually or together. A gene list or compound list can be pasted into the text area or uploaded as a file, with one item per row. After uploading, choose the corresponding ID type from the drop-down list and click the "save and next step" button to continue the analysis.
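The expected input is simply one identifier per row. As an illustration, the sketch below prepares such a list in Python before pasting or uploading it. Trimming whitespace, skipping blank rows and dropping duplicates are assumptions about sensible cleaning, not documented behavior of the server, and `parse_id_list` is a hypothetical helper name.

```python
def parse_id_list(text):
    """Return the identifiers in `text`, one per row, in first-seen order."""
    seen = set()
    items = []
    for line in text.splitlines():
        item = line.strip()
        # Skip blank rows and identifiers that were already listed.
        if item and item not in seen:
            seen.add(item)
            items.append(item)
    return items


# Example with gene identifiers, one per row, including a blank row
# and a repeated entry.
genes = parse_id_list("7157\n\n 672 \n7157\n")
upload_text = "\n".join(genes)
```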

A variety of gene ID types are supported, including ENTREZID, EMBLID, HGNCID, HGNCSIMBOL, PROTEINID, UNIGENEID, ENSEMBLID, UCSCID, UNIPROTGENENAME and UNIPROTSWISSPROTID. Many compound ID types can also be used, such as KEGGID, CHEBIID, CHEMBID, CID, HMDBID, SID and METABOLITE_NAME.

Because compound names may be ambiguous, fuzzy matching is offered for metabolite names. The matching result is then displayed, and a metabolite name shown in blue indicates that it was matched by fuzzy matching.

Clicking "VIEW" at the left of the frame opens a pop-up window in which users can select the mapping they want.
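The matching algorithm used by SubpathwayMiner is not described here. As a rough illustration of what fuzzy name matching does, the sketch below uses Python's standard `difflib` module; the function name `fuzzy_match`, the similarity cutoff of 0.6 and the example names are all assumptions made for this example.

```python
import difflib


def fuzzy_match(query, known_names, max_hits=3, cutoff=0.6):
    """Return up to `max_hits` known names that resemble `query`.

    Matching is case-insensitive; results are ordered best match first.
    """
    lookup = {name.lower(): name for name in known_names}
    hits = difflib.get_close_matches(
        query.lower(), list(lookup), n=max_hits, cutoff=cutoff
    )
    return [lookup[hit] for hit in hits]


# A misspelled metabolite name is matched to its closest known name.
candidates = fuzzy_match("glucos", ["Glucose", "Fructose", "Citrate"])
```

When several candidates are returned, the user still has to pick the intended one, which is the purpose of the mapping selection window.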

3| Pathway Reconstruction

This step lets users choose the types of pathways to use: metabolic or non-metabolic pathways, pathways represented in terms of KO or EC, and directed or undirected pathways. After selecting a species from the pull-down menu, users can combine these types by clicking the button in front of each one. Further operations on pathways are available by clicking "detail" at the bottom of the page.

i. FilterNode: filters nodes out of the pathway by type; the node types covered are gene, compound and map.

ii. simplifyGraph: simplifies a pathway into a network with only one type of node; the node types covered are again gene, compound and map. Nodes of the unneeded types are omitted, and where an omitted node lies between two required nodes, an edge is created between them.

iii. expandNode: in many pathway maps, a node may contain multiple molecules. This option expands such a node into several nodes, each representing a single molecule. Users can choose which type of node to expand: gene, compound or map.

iv. mergeNode: merges different nodes with the same name into a single node while keeping all edges.
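The simplifyGraph operation can be sketched as follows. This is an illustration on a small undirected graph, not the tool's implementation: the data layout (a dict of node types plus an edge list) and the function name `simplify_graph` are assumptions, and the sketch only bridges a single omitted node lying between two kept nodes, not chains of omitted nodes.

```python
from itertools import combinations


def simplify_graph(node_types, edges, keep_type):
    """Keep only nodes of `keep_type`; bridge kept nodes across omitted ones.

    node_types: dict mapping node name -> "gene", "compound" or "map"
    edges:      iterable of (node, node) pairs, treated as undirected
    Returns (kept_nodes, new_edges), with edges stored as frozensets.
    """
    neighbours = {node: set() for node in node_types}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    kept = {node for node, kind in node_types.items() if kind == keep_type}
    # Edges that already connect two kept nodes survive unchanged.
    new_edges = {frozenset((a, b)) for a, b in edges if a in kept and b in kept}
    # Each omitted node links every pair of its kept neighbours.
    for node in node_types:
        if node not in kept:
            for a, b in combinations(sorted(neighbours[node] & kept), 2):
                new_edges.add(frozenset((a, b)))
    return kept, new_edges


# Two genes joined through a compound become directly connected.
types = {"g1": "gene", "c1": "compound", "g2": "gene", "g3": "gene"}
links = [("g1", "c1"), ("c1", "g2"), ("g2", "g3")]
gene_nodes, gene_edges = simplify_graph(types, links, "gene")
```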

4| Pathway Mining

In this section, users choose the type of pathway to mine: entire pathways or subpathways. Subpathway mining, the distinctive feature of SubpathwayMiner, is highly recommended, and two mining methods are provided: the 'K-Cliques' method and the 'lenient' distance similarity method. The method is chosen from the pull-down menu, and each method has parameters that must be set. The parameter k for 'K-Cliques' defaults to 4 and can be changed by users as they see fit. The lenient distance similarity method has two parameters, n and s, both of which default to 5.
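To give a feel for the role of k, the sketch below assumes the distance-based definition of a k-clique from social network analysis: a set of nodes in which every pair is at most k steps apart in the graph. Whether SubpathwayMiner applies exactly this definition is an assumption here, and the function names and toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path distance (in edges) from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def is_k_clique(adjacency, nodes, k):
    """True if every pair in `nodes` is at most k steps apart in the graph."""
    nodes = list(nodes)
    for i, a in enumerate(nodes):
        dist = bfs_distances(adjacency, a)
        for b in nodes[i + 1:]:
            if dist.get(b, float("inf")) > k:
                return False
    return True


# A linear pathway A-B-C-D-E-F stored as an adjacency dict.
chain = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E"],
}
within_k = is_k_clique(chain, ["A", "B", "C", "D", "E"], 4)
too_far = is_k_clique(chain, ["A", "B", "C", "D", "E", "F"], 4)
```

Under this reading, a larger k admits larger and looser subpathways, while a smaller k keeps only tightly connected regions.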

5| Execute

After the parameters above have been selected, this page shows a summary for users to confirm, including the mining method and the parameters set. If everything is correct, click the "Start Mining" button to start the analysis.

6| Result

A table displays the mining result once the calculation finishes, which takes a few minutes. The table shows the ID and name of each pathway or subpathway, together with a series of statistics: the annotated molecule ratio, the annotated background ratio, the hypergeometric p-value and the false discovery rate.
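The false discovery rate corrects the p-values for the number of pathways or subpathways tested. The correction procedure is not named here; the sketch below assumes the widely used Benjamini-Hochberg adjustment purely as an illustration, with hypothetical p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four subpathways.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Rows of the result table are usually filtered on the adjusted value rather than the raw p-value when many subpathways are tested at once.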

7| Visualization

After subpathways are mined, SubpathwayMiner provides two flexible ways to visualize them: in KEGG or in Cytoscape. Clicking the name of a pathway redirects to the corresponding graph in KEGG, in which nodes annotated in subpathways are red and nodes unannotated in subpathways are yellow.

Clicking "view" displays a subpathway graph created by R or rendered by the Cytoscape plugin. The graph is interactive: nodes can be dragged anywhere, and the picture size can be adjusted with a menu at the bottom right of the picture. Annotated nodes are again colored red.

Clicking a node in the graph shows the detailed KEGG information for that node.