data_format
Differences
This shows you the differences between two versions of the page.
data_format [2020/01/28 19:21] – krian | data_format [2024/02/27 13:17] (current) – [Experimental design file format] krian | ||
---|---|---|---|
Line 3: | Line 3: | ||
Different types of data are used in Hipathia. Some of these data require a certain structure, explained at the following links: | Different types of data are used in Hipathia. Some of these data require a certain structure, explained at the following links: | ||
+ | **Note:** The recommended file extensions are ' | ||
- | [[Expression matrix file format |Expression matrix file format]] | + | ===== Expression matrix file format ===== |
- | ====== Headline ====== | + | |
- | ===== Headline ===== | + | |
- | ====== Headline ====== | + | |
- | + | ||
- | + | ||
- | + | ||
- | + | ||
- | ====== | + | |
The expression matrix file is a tab-separated values file. | The expression matrix file is a tab-separated values file. | ||
Line 57: | Line 50: | ||
**Note**: If probe expression values are provided, they are converted to gene expression values, computed as the average of all the probes mapping to the gene. | **Note**: If probe expression values are provided, they are converted to gene expression values, computed as the average of all the probes mapping to the gene. | ||
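The probe-to-gene averaging described in the note can be sketched as follows. This is only an illustration of the idea, not Hipathia's actual implementation: the function name `average_probes_by_gene`, the probe-to-gene mapping, and all values are hypothetical, and skipping unmapped probes is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import mean


def average_probes_by_gene(probe_values, probe_to_gene):
    """Average per-sample probe values over all probes mapping to each gene.

    probe_values:  dict of probe ID -> list of expression values (one per sample)
    probe_to_gene: dict of probe ID -> gene ID (hypothetical mapping)
    """
    grouped = defaultdict(list)
    for probe, values in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # assumption: probes without a mapping are skipped
            grouped[gene].append(values)
    # zip(*rows) walks the samples column by column
    return {
        gene: [mean(column) for column in zip(*rows)]
        for gene, rows in grouped.items()
    }


# Hypothetical input: three probes, two samples each
probes = {"probe_a": [1.0, 2.0], "probe_b": [3.0, 4.0], "probe_c": [5.0, 6.0]}
mapping = {"probe_a": "Gene_1", "probe_b": "Gene_1", "probe_c": "Gene_2"}
gene_values = average_probes_by_gene(probes, mapping)
```

Here `Gene_1` receives the per-sample mean of `probe_a` and `probe_b`, while `Gene_2` keeps the values of its single probe.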
+ | ===== Experimental design file format ===== | ||
+ | The experimental design file is a tab-separated values file with two columns: the first contains the sample name and the second contains the phenotype. | ||
+ | < | ||
+ | sample1 Group_1 | ||
+ | sample2 Group_1 | ||
+ | sample3 Group_2 | ||
+ | </ | ||
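A file in this layout could be read and checked with a few lines of Python. This is a minimal sketch under stated assumptions: the helper name `read_design` is hypothetical, and rejecting rows that do not have exactly two columns is a choice made for this example, not documented Hipathia behaviour.

```python
import csv
import io


def read_design(handle):
    """Return a list of (sample, phenotype) tuples from a two-column TSV."""
    rows = []
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:  # skip blank lines
            continue
        if len(row) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 columns, found {len(row)}"
            )
        rows.append((row[0].strip(), row[1].strip()))
    return rows


# The three-sample example from the text, with explicit tab separators
example = "sample1\tGroup_1\nsample2\tGroup_1\nsample3\tGroup_2\n"
design = read_design(io.StringIO(example))
```

For a file on disk, the same function can be called on a handle opened with `open(path, newline="")`.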
- | [[Experimental design file format | Experimental design file format ]] | + | **Note**: For **paired data**, the experimental design file must be **ordered**. |
- | [[Gene list file format | Gene list file format ]] | + | Here is an example of a file with 4 paired samples (sample1_Normal and sample1_Treated are the same sample before and after treatment): |
+ | < | ||
+ | sample1_Normal Group_1 | ||
+ | sample2_Normal Group_1 | ||
+ | sample1_Treated Group_2 | ||
+ | sample2_Treated Group_2 | ||
+ | </ | ||
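The page does not spell out the ordering rule. One plausible reading of the example is that the i-th sample of the first group is paired with the i-th sample of the second group. The sketch below checks a design under that assumption; the function name `pairs_by_position` and the `<subject>_<condition>` naming convention used for the final check are hypothetical, taken only from the sample names in the example.

```python
def pairs_by_position(design):
    """Pair the i-th sample of the first group with the i-th of the second.

    design: list of (sample, group) tuples in file order.
    Assumes exactly two groups of equal size (an assumption of this sketch).
    """
    groups = {}
    for sample, group in design:
        groups.setdefault(group, []).append(sample)  # dicts keep file order
    if len(groups) != 2:
        raise ValueError("expected exactly two groups")
    first, second = groups.values()
    if len(first) != len(second):
        raise ValueError("groups differ in size, so samples cannot be paired")
    return list(zip(first, second))


design = [
    ("sample1_Normal", "Group_1"),
    ("sample2_Normal", "Group_1"),
    ("sample1_Treated", "Group_2"),
    ("sample2_Treated", "Group_2"),
]
pairs = pairs_by_position(design)

# Optional sanity check: both members of a pair share the prefix before "_"
mismatched = [
    (a, b) for a, b in pairs if a.rsplit("_", 1)[0] != b.rsplit("_", 1)[0]
]
```

If `mismatched` is not empty, the rows of one group are probably in a different order from the rows of the other.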
+ | |||
+ | For another example file, see {{: | ||
+ | |||
+ | ===== Gene list file format ===== | ||
+ | |||
+ | The gene list file is a tab-separated values file with a single column containing the Entrez IDs of the genes (one Entrez ID per line). | ||
+ | |||
+ | |||
+ | Here is an example of a file with 4 genes to be evaluated: | ||
+ | |||
+ | < | ||
+ | Gene_1 | ||
+ | Gene_2 | ||
+ | Gene_3 | ||
+ | Gene_4 | ||
+ | </ | ||
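In the example above, names such as Gene_1 stand in for Entrez IDs, which are plain integers. A gene list could be screened before upload with a short check like the one below. This is a sketch only: the function name `read_gene_list` is hypothetical, the input values are illustrative, and treating non-numeric entries as invalid is an assumption of this example rather than documented Hipathia behaviour.

```python
def read_gene_list(lines):
    """Split one-ID-per-line input into numeric IDs and rejected entries."""
    ids, rejected = [], []
    for line in lines:
        entry = line.strip()
        if not entry:  # ignore blank lines
            continue
        if entry.isdigit():
            ids.append(entry)
        else:
            rejected.append(entry)
    return ids, rejected


# Illustrative input: two numeric IDs, a blank line, and a placeholder name
ids, rejected = read_gene_list(["7157\n", "672\n", "\n", "Gene_3\n"])
```

The function accepts any iterable of lines, so an open file handle can be passed directly.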
+ | ====== Character encoding ====== | ||
+ | We recommend using the **[[https:// |
data_format.1580239286.txt.gz · Last modified: 2020/01/28 19:21 by krian