worked_example_prediction_-_train

Differences

This shows you the differences between two versions of the page.

worked_example_prediction_-_train [2021/01/30 16:07] – [Test: Use existing predictor] krian
worked_example_prediction_-_train [2021/01/30 16:16] (current) – removed krian
Line 1: Line 1:
-====== Worked example prediction ======
-===== Training: Build new predictor ===== 
- 
-**1.** Log in to HiPathia. For further information on this step visit [[logging_in|logging in]].
- 
-**2.** Collect the data. We will work with a breast cancer dataset from The Cancer Genome Atlas (TCGA) repository [[https://portal.gdc.cancer.gov/projects/TCGA-BRCA | Link to dataset]].\\ 
-More information on the proposed dataset is available here: 
-  * https://www.nature.com/articles/nature11412 
-  * https://pubmed.ncbi.nlm.nih.gov/23644459/ 
- 
-Before use in HiPathia, the dataset must be normalized. We recommend using the [[https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-3-r25 | logarithm of the trimmed mean of M values]] (log2TMM). 
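The normalization step can be illustrated with a minimal sketch. It is a simplified, unweighted variant of TMM written for illustration only: the function names (`tmm_factor`, `log2_tmm`), the choice of reference sample and the prior count of 1 are assumptions, not part of HiPathia. For real analyses, an established implementation such as edgeR's `calcNormFactors` is the safer choice.

```python
import math

def _ranks(values):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks

def tmm_factor(sample, ref, m_trim=0.30, a_trim=0.05):
    """Simplified, unweighted TMM scaling factor of `sample` against `ref`."""
    n_s, n_r = sum(sample), sum(ref)
    m_vals, a_vals = [], []
    for y_s, y_r in zip(sample, ref):
        if y_s > 0 and y_r > 0:  # genes with a zero count are skipped
            p_s, p_r = y_s / n_s, y_r / n_r
            m_vals.append(math.log2(p_s / p_r))        # log fold change (M)
            a_vals.append(0.5 * math.log2(p_s * p_r))  # mean log abundance (A)
    n = len(m_vals)
    lo_m, lo_a = int(n * m_trim), int(n * a_trim)
    m_rank, a_rank = _ranks(m_vals), _ranks(a_vals)
    # Keep genes that survive trimming on both the M and the A values.
    kept = [m for m, rm, ra in zip(m_vals, m_rank, a_rank)
            if lo_m <= rm < n - lo_m and lo_a <= ra < n - lo_a]
    if not kept:
        return 1.0
    return 2 ** (sum(kept) / len(kept))

def log2_tmm(sample, factor, prior=1.0):
    """log2 of counts per million, using the TMM-adjusted library size.

    The prior count avoids log2(0); its value here is an assumption.
    """
    effective_size = sum(sample) * factor
    return [math.log2(y / effective_size * 1e6 + prior) for y in sample]

# Toy counts: the second sample is the first one sequenced twice as deep,
# so the composition is identical and the scaling factor should be 1.
reference = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
deeper = [2 * y for y in reference]
factor = tmm_factor(deeper, reference)
normalized = log2_tmm(deeper, factor)
```

Unlike the published method, this sketch does not weight genes by their estimated variance, so its factors will differ slightly from those of a full implementation.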
- 
-We have selected breast cancer tumor samples from the dataset, annotated as luminal A and luminal B (the molecular annotations come from [[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3465532/ | this paper]]). You can learn more about breast cancer molecular subtypes [[https://link.springer.com/article/10.1007/s10549-009-0499-6 | here]]. The purpose of this study is to train a predictor that learns to distinguish molecular subtypes from gene expression data using the HiPathia mechanistic models, and to evaluate the model with a controlled set of samples.
- 
-The expression matrix and the experimental design can be downloaded from these links: 
- 
-  * Expression matrix: [[http://hipathia.babelomics.org/data/brca_sub_class_exp_train.txt|brca_sub_class_exp_train.txt]] 
-  * Experimental design: [[http://hipathia.babelomics.org/data/brca_sub_class_des_train.txt|brca_sub_class_des_train.txt]]
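Before uploading, it can help to check that every sample in the expression matrix has a class in the experimental design. The sketch below assumes a layout that is not confirmed by this page: a tab-separated matrix whose header row holds the sample names, and a tab-separated design file with one `sample<TAB>class` pair per line. The helper names and the toy sample and class labels are hypothetical.

```python
import csv
import io

def read_samples(matrix_handle):
    """Return the sample names from the header row of the expression matrix."""
    header = next(csv.reader(matrix_handle, delimiter="\t"))
    return [name for name in header if name]  # drop an empty corner cell

def read_design(design_handle):
    """Return a {sample: class} mapping from the experimental design."""
    design = {}
    for row in csv.reader(design_handle, delimiter="\t"):
        if len(row) >= 2:
            design[row[0]] = row[1]
    return design

def check_inputs(samples, design):
    """Return the samples lacking a class and the classes actually used."""
    missing = [s for s in samples if s not in design]
    classes = sorted({design[s] for s in samples if s in design})
    return missing, classes

# Toy stand-ins for the two downloaded files (contents are made up).
matrix_text = "\tS1\tS2\tS3\nGENE1\t1.0\t2.0\t3.0\n"
design_text = "S1\tclassA\nS2\tclassB\n"

samples = read_samples(io.StringIO(matrix_text))
design = read_design(io.StringIO(design_text))
missing, classes = check_inputs(samples, design)
```

To run the same check on the real files, replace the `io.StringIO` objects with `open(path, newline="")` handles. A two-class predictor expects exactly two entries in `classes` and an empty `missing` list.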
- 
-**3.** Click the //Prediction// button. 
-{{ :hipathia_bar_pred.png?600 |}} 
- 
-**4.** Upload the normalized data to HiPathia by clicking //My data// in the data panel, or click the //Run training example// button. For further information on this step visit [[upload_your_data|Upload your data]]. 
-{{ ::runtrainingexmpl.png?nolink |}} 
- 
-**5.** In the //Type// panel select //Train new predictor//.
- 
-**6.** In the //Input data// panel, select //Expression matrix//. Click the //File browser// of the //Expression matrix// section and select the desired file. 
- 
-{{ :hipathia_Pred_work1.png?600 |}} 
- 
-**7.** In the //Design data// panel, select //Class predictor//. Click the //File browser// of the //Experimental design// section and select the desired file. The //Condition 1// and //Condition 2// fields are then filled in automatically. Select "Tumor" for //Condition 1// and "Normal" for //Condition 2//.
- 
-{{ :hipathia_Pred_work2.png?600 |}} 
- 
-**8.** Select Human (Homo sapiens) as species (default). 
- 
- 
-**9.** In the //Pathways// panel, select all the pathways (default). 
- 
-**10.** In the //Study information// panel, click the //File browser// button and select the desired output folder. In this case, we will use //analysis_BRCA//. Give a name to the study, for example "BRCA train model". 
- 
-{{ :hipathia_work4.png?600 |}} 
- 
-**11.** Click the //Run analysis// button. A Study will be created and listed in the studies panel. You can access this panel by clicking on the //My studies// button. 
- 
-===== Training report ===== 
-Once the study has finished, the report will be available in //My studies//. 
-The report page of the Prediction-training tool includes several output results. You can download any table or image shown on the results page by clicking on the name right before it. You can also download the pathway and function matrices by clicking on //Circuit values//. For more information about each result, please read the [[prediction#training_report|Prediction - Training report]] and [[prediction#workflow|Prediction - Workflow]] sections.
- 
-=== Breast Cancer Molecular Subtype Classification === 
- 
-The experiment consists of classifying a given sample as Luminal A or Luminal B (molecular subtype). We use TCGA data; no pathway filtering was done by hand.
- 
-**Model Analysis** 
- 
-Hyperparameter search: 
- 
-{{ :svm.performance.heatmap.png?direct&400 | Hyperparameter search heatmap}} 
- 
-CV Performance: 
-{{ :model_stats.txt| CV stats }} 
- 
-The most relevant features, along with the sign of their coefficient:
- 
-|**Selected circuit name** |**Coefficient sign** |
-|ErbB signaling pathway: ELK1* |-     | 
-|Progesterone-mediated oocyte maturation: CDK1 |+     | 
-|Pathways in cancer: BCL2 |-     | 
-|Cell cycle: CDC45 MCM7 MCM6 MCM5 MCM4 MCM3 MCM2 |+     | 
-|Neurotrophin signaling pathway: NFKB1 |+     | 
-|Pathways in cancer: E2F1 |+     | 
-|p53 signaling pathway: SERPINB5 |-     | 
-|p53 signaling pathway: CDK1 CCNB3 |+     | 
-|Vascular smooth muscle contraction: ACTA2 |-     | 
-|Neurotrophin signaling pathway: JUN |-     | 
-|Apoptosis: TP53 |+     | 
-|PPAR signaling pathway: ACAA1 |-     | 
-|Hippo signaling pathway: BIRC5 |+     | 
-|PPAR signaling pathway: CPT1C |+     | 
-|ErbB signaling pathway: GSK3B |+     | 
-|cAMP signaling pathway: GRIN3A |+     | 
-|Oocyte meiosis: CPEB2 |+     | 
-|Pathways in cancer: CSF3R |+     | 
-|Choline metabolism in cancer: WAS |+     | 
-|NOD-like receptor signaling pathway: CASP5 |+     | 
-|Hippo signaling pathway: SMAD1 SMAD4 |-     | 
-|Pathways in cancer: BIRC5 |+     | 
-|ErbB signaling pathway: STAT5A* |-     | 
-|ErbB signaling pathway: STAT5A |-     | 
-|p53 signaling pathway: CDK2 CCNE1 |+     | 
-|Proteoglycans in cancer: MAPK1 |-     | 
-|Pathways in cancer: CTBP1 HDAC1 |-     | 
-|cGMP-PKG signaling pathway: CNGB1 |-     | 
-|Ras signaling pathway: RAP1A |-     | 
-|Oxytocin signaling pathway: EEF2 |-     | 
-|Gap junction: C00681 |-     | 
-|Insulin signaling pathway: G6PC |-     | 
-|Fc gamma R-mediated phagocytosis: ARF6* |+     | 
-|Platelet activation: PIK3R5* |-     | 
-|Progesterone-mediated oocyte maturation: PIK3R5 |-     | 
-|PI3K-Akt signaling pathway: CDKN1B |-     | 
-|PPAR signaling pathway: FADS2 |+     | 
-|Thyroid hormone signaling pathway: WNT4 |-     | 
-|Thyroid hormone signaling pathway: CTNNB1 |-     | 
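One way to read the sign column, assuming the predictor behaves like a linear model (the page does not state the kernel): the score is a weighted sum of circuit activities, so a circuit with a positive coefficient pushes the score up as its activity rises, and a circuit with a negative coefficient pushes it down. In the sketch below the two circuit names and their signs come from the table, while the coefficient magnitudes, the activity values and the function name are made up for illustration. Which subtype corresponds to a positive score is not stated on this page.

```python
def decision_score(activity, coef, intercept=0.0):
    """Linear decision score: intercept + sum of coefficient * activity."""
    return intercept + sum(
        weight * activity.get(name, 0.0) for name, weight in coef.items()
    )

# Signs taken from the table; magnitudes are hypothetical.
coef = {
    "Progesterone-mediated oocyte maturation: CDK1": 0.5,   # "+" in the table
    "Pathways in cancer: BCL2": -0.25,                      # "-" in the table
}

# Two made-up samples with opposite activity profiles.
high_cdk1 = {
    "Progesterone-mediated oocyte maturation: CDK1": 0.8,
    "Pathways in cancer: BCL2": 0.2,
}
high_bcl2 = {
    "Progesterone-mediated oocyte maturation: CDK1": 0.2,
    "Pathways in cancer: BCL2": 0.8,
}

score_high_cdk1 = decision_score(high_cdk1, coef)  # positive side
score_high_bcl2 = decision_score(high_bcl2, coef)  # negative side
```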
- 
- 
-**Split Analysis** 
- 
-Split Performance: 
-{{ :test_model_stats.txt| Test stats }} 
- 
-PR curve over the test set:
- 
-{{ :split_test_pr.png?400 | Precision-recall (PR) curve for the test split. }} 
- 
-ROC curve over the test set: 
- 
-{{ :split_test_roc.png?400 | ROC curve for the test split. }} 
- 
-Probability for the test set:
- 
-{{ :test_probability_boxplot.png?400 | Probability boxplot for the test set. }}
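The quantities behind the PR and ROC plots can be recomputed from the per-sample probabilities. The following is a minimal sketch on made-up labels and scores, not the code HiPathia runs: the area under the ROC curve is computed as the fraction of (positive, negative) pairs that the model ranks correctly, and precision and recall are computed at a single threshold. The function names are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up test split: 1 marks the positive class, scores are probabilities.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]

auc = roc_auc(labels, scores)
precision, recall = precision_recall(labels, scores, threshold=0.5)
```

Sweeping `threshold` over the observed scores and plotting the resulting precision and recall pairs gives the PR curve.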
- 
-===== Test: Use existing predictor ===== 
- 
  
worked_example_prediction_-_train.1612022869.txt.gz · Last modified: 2021/01/30 16:07 by krian