Differences

This shows you the differences between two versions of the page.

--- prediction [2021/01/27 09:52] – [Design data panel] cloucera
+++ prediction [2021/01/30 14:22] (current) – [Model explanation] krian
@@ Line 7: / Line 7: @@
 {{ :prediction_form.png?nolink |}}
 ===== Prediction form =====
-The main page of a the tool is its filling in form. This form includes all the information and parameters that the tool needs to process a study. The form is divided in different panels:
+The main page of the tool is its input form. This form includes all the information and parameters that the tool needs to process a study. The form is divided into different panels:
 ==== Type panel ====
 The type panel allows you to choose the kind of prediction analysis you want to perform. You can choose between two options:
-  * **Train new model**: A prediction model is generated with your selected data.
+  * **Build new predictor**: Train a new prediction model with your selected data.
 {{ ::typepanel.png?nolink |}}
-  * **Test existing model**: You can choose an already trained model to predict the phenotype of new samples. Existing prediction models can be selected from your project folders stored in HiPathia's server.
+  * **Use existing predictor**: Choose an already trained model to predict the phenotype of new samples. Existing prediction models can be selected from your project folders stored on HiPathia's server.
 {{ ::typetestpanel.png?nolink |}}
 ==== Input data panel ====
@@ Line 21: / Line 21: @@
 When we select a gene expression file, the number of samples of this matrix will appear under the "file browser" button as shown below.
 {{ ::diffnumbersamples.png?nolink |}}
-==== Design data panel ====
+==== Design data panel (For training) ====
+This panel appears only if //Build new predictor// is selected.
 The design data panel allows you to choose the kind of experiment you want to perform:
-  * **Two group comparison**: The comparison is performed between the two groups described in the experimental design file.\\ The experimental design file must include **two columns**: the first one with the names of the samples, the second one with the class to which each sample belongs.
+  * **Two-Class predictor**: The comparison is performed between the two classes described in the experimental design file.\\ The experimental design file must include **two columns**: the first one with the names of the samples, the second one with the class to which each sample belongs.
 {{ ::designdatacomp.png?nolink |}}
-If the experimental design file contains more than two classes then you have to select the appropriate class for each condition.\\ **Note:** the condition 2 will be taken as a reference condition.
+If the experimental design file contains more than two classes then you have to select the appropriate class for each condition.\\ **Note:** condition 2 will be taken as a reference condition.
 {{ ::morethantwoclass.png?nolink |}}
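The two-column layout described above can be checked before uploading. The following is a minimal sketch, not the tool's own parser: it assumes a tab-separated file without a header row, and the function name ''read_design'' and the sample and class names are hypothetical.

```python
import csv
import io
from collections import Counter


def read_design(handle, delimiter="\t"):
    """Return {sample_name: class_label} from a two-column design file."""
    design = {}
    for row_number, row in enumerate(csv.reader(handle, delimiter=delimiter), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {row_number}: expected 2 columns, found {len(row)}")
        sample, label = (field.strip() for field in row)
        design[sample] = label
    return design


# Hypothetical design file content (sample names and classes are made up).
example = io.StringIO("s1\tTumor\ns2\tNormal\ns3\tTumor\ns4\tBasal\n")
design = read_design(example)
class_counts = Counter(design.values())

# With more than two classes, a class has to be chosen for each condition.
needs_class_selection = len(class_counts) > 2
print(dict(class_counts), needs_class_selection)
```

In this made-up example the file contains three classes, so a class would have to be selected for each of the two conditions in the panel.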
-==== Species ====
+==== Species (For training) ====
 Here we must choose the species of our experiment.
 You can choose among:
@@ Line 36: / Line 38: @@
   * Rat (Rattus norvegicus).
 {{ ::species.png?nolink |}}
-==== Experimental design ====
+==== Experimental design (For training) ====
 This panel includes further parameters necessary to run an analysis.\\
-**Filter circuits**: Check to obtain the circuits that best differentiate your phenotype. This option is only available from //Prediction// tool.
+**Rank and filter circuits**: Check to obtain the circuits that best differentiate your phenotype. This option is only available for the //Build new predictor// type.
 {{ ::filtercircuits.png?nolink |}}
 ==== Pathways ====
@@ Line 63: / Line 65: @@
 {{ ::mystudiespredict.png?nolink |}}
-===== Prediction report =====
+===== Training report =====
 The report page of the Prediction tool includes different output results. You can download any table or image shown in the results page by clicking on the name right before it. You can also download the pathway and function matrices by clicking on //Circuit values//.
@@ Line 86: / Line 89: @@
 {{ ::circuitvaluespredreport.png?nolink |}}
 This matrix file indicates, for each "effector circuit", the level of activation calculated using the Hipathia method for each sample.
-==== Model training ====
+==== Model evaluation ====
 Here you can visualize the results from the prediction analysis.
-  * **K-fold cross-validation**: The number of equal sized subsamples in which the original sample is randomly partitioned.
+=== K-fold cross-validation ===
+K is the number of equal-sized subsamples into which the original sample is randomly partitioned.
+A table of test model statistics is then shown; each value is the mean across the holdout folds for the corresponding metric or score:
 {{ ::k-foldcrossvalidation.png?nolink |}}
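To make the averaging over holdout folds concrete, here is a minimal sketch of K-fold cross-validation in plain Python. It is an illustration under stated assumptions, not the tool's implementation: the function names are hypothetical and ''dummy_evaluate'' is a stand-in for training a model and scoring it on the holdout fold.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition range(n_samples) into k folds of (almost) equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, k, evaluate, seed=0):
    """Average each metric returned by evaluate(train, holdout) over the k folds."""
    folds = k_fold_indices(n_samples, k, seed)
    per_fold = []
    for i, holdout in enumerate(folds):
        # Every fold except the current one is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        per_fold.append(evaluate(train, holdout))
    return {metric: mean(scores[metric] for scores in per_fold) for metric in per_fold[0]}


# Hypothetical stand-in for "train a model, then score it on the holdout fold".
def dummy_evaluate(train, holdout):
    return {"holdout_fraction": len(holdout) / (len(train) + len(holdout))}


folds = k_fold_indices(10, 5)
summary = cross_validate(10, 5, dummy_evaluate)
print(summary)
```

Each sample lands in exactly one holdout fold, so every sample is used for testing once and for training K-1 times.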
-  * **SVM-RBF hypermeter performance**
-{{ ::hyperparameterpredreport.png?nolink |}}
+=== Validation of typical split ===
-  * **Test model statistics**:
-{{ ::testmodelstatistics.png?nolink |}}
+Next you will find several plots for the train-test split validation, in which 30% of the data is randomly held out for testing while the remaining samples are used to train the model. The plots show the receiver operating characteristic (ROC) and precision-recall curves for each split.
-  * **Split Train pr**:
+  * **Split Train precision and recall**:
 {{ ::splittrainpr.png?nolink |}}
-  * **Split Train roc**:
+  * **Split Train receiver operating characteristic**:
 {{ ::splittrainroc.png?nolink |}}
-  * **Split Test pr**:
+  * **Split Test precision and recall**:
 {{ ::splittestpr.png?nolink |}}
-  * **Split Test roc**:
+  * **Split Test receiver operating characteristic**:
 {{ ::splittestroc.png?nolink |}}
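The split and the two kinds of curves can be sketched as follows. This is a plain-Python illustration only, not the code behind the plots: the function names, labels and scores are hypothetical, and labels are assumed to be 1 for the positive class and 0 for the negative class.

```python
import random


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Randomly hold out a fraction of the sample indices for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


def curve_points(labels, scores):
    """Return ROC points (FPR, TPR) and PR points (recall, precision).

    Assumes labels are 1 (positive) or 0 (negative) and both classes occur.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    roc, pr = [(0.0, 0.0)], []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            roc.append((fp / negatives, tp / positives))
            pr.append((tp / positives, tp / (tp + fp)))
    return roc, pr


train_idx, test_idx = holdout_split(10)

# Made-up labels and predicted probabilities for six samples.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
roc, pr = curve_points(labels, scores)
print(roc[-1], pr[0])
```

Computing the same points separately on the training and test indices gives one ROC curve and one precision-recall curve per split, matching the four plots listed above.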
-==== Prediction model ====
+=== Probability distribution ===
+Here you can find a boxplot for the (predicted) probability distribution of the positive class over the test split with respect to the original labels:
+{{ ::testprobdistboxplot.png?nolink |}}
+A table with the statistics of the model over the test set follows, in the same format as the one presented for the k-fold experiment. A model well suited to the problem at hand should not show a large gap between its performance in the training and testing phases:
+{{ ::testmodelstat.png?nolink |}}
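One way to inspect that gap is to subtract the test value from the training value for each metric. The sketch below uses made-up numbers, and the tolerance ''MAX_GAP'' is a hypothetical choice for illustration; the tool itself does not define such a threshold.

```python
def performance_gaps(train_metrics, test_metrics):
    """Train-minus-test difference for every metric present in both tables."""
    shared = train_metrics.keys() & test_metrics.keys()
    return {name: train_metrics[name] - test_metrics[name] for name in sorted(shared)}


# Made-up numbers for illustration only.
train_metrics = {"accuracy": 0.95, "f1": 0.94}
test_metrics = {"accuracy": 0.80, "f1": 0.78}

MAX_GAP = 0.10  # hypothetical tolerance, not a value used by the tool

gaps = performance_gaps(train_metrics, test_metrics)
flagged = [name for name, gap in gaps.items() if gap > MAX_GAP]
print(gaps, flagged)
```

A metric that is much higher in training than in testing suggests that the model may be overfitting the training samples.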
+==== Model explanation ====
+Here you will find a table with the most relevant circuits along with their interaction sign.
+You can download the filtered circuits that best differentiate your phenotype. This section is only available when the //Rank and filter circuits// option is selected.
 {{ ::predictionmodelreport.png?nolink |}}
-==== Model statistics ====
+===== Test report =====
-You can download the model statistics.
+When you select //Use existing predictor// you will have a different report for your test prediction study.
-  * **Selected features**: You can download the filtered paths that best differentiate your phenotype. This section is only available when selecting //filter paths// option.
+The test report is divided into four different panels:
+{{ ::testpresdictionreport.png?nolink |}}
+==== Study Information ====
+As explained before, here you can find the information about the current study.
+==== Input Parameters ====
+These are the parameters with which the test study was launched, such as the name of the expression file used and the species.
+==== Circuit values ====
+This matrix file indicates, for each “effector circuit”, the level of activation calculated using the Hipathia method for each sample.
+==== Prediction model ====
+This is the most important result: the table is the predicted design file for your selected expression matrix, obtained with a previously trained model.
 ===== Workflow =====
 The prediction tool is based on a machine learning module. This module of the Hipathia web tool can be summarized as follows:
@@ Line 140: / Line 172: @@
       * Note that all curve visualizations have been done using the specialized R package ''PRROC'' [3]
+/*
 === Breast Cancer Molecular Subtype Classification ===
@@ Line 258: / Line 290: @@
 {{ :test_probability_boxplot.png?400 | ROC curve for the test split. }}
+*/
 ===== Bibliography =====