Benchmark Multi-Omics Datasets for Methods Comparison (Q6847)

Dataset published at Zenodo repository.

Language	Label	Description	Also known as
English	Benchmark Multi-Omics Datasets for Methods Comparison	Dataset published at Zenodo repository.

Statements

instance of

data set

0 references

description

Pathway Multi-Omics Simulated Data
These are synthetic variations of the TCGA COADREAD data set (original data available at http://linkedomics.org/data_download/TCGA-COADREAD/). This data set is used as a comprehensive benchmark to compare multi-omics tools in the manuscript "pathwayMultiomics: An R package for efficient integrative analysis of multi-omics datasets with matched or un-matched samples". There are 100 sets (stored as 100 subfolders, the first 50 in pt1 and the second 50 in pt2) of random modifications to centred and scaled copy number, gene expression, and proteomics data, saved as compressed data files for the R programming language. These data sets are stored in subfolders labelled sim001, sim002, ..., sim100. Each folder contains the following:
(1) indicatorMatricesXXX_ls.RDS is a list of simple triplet matrices showing which genes (in which pathways) and which samples received the synthetic treatment (where XXX is the simulation run label: 001, 002, ...);
(2) CNV_partitionA_deltaB.RDS is the synthetically modified copy number variation data (where A represents the proportion of genes in each gene set to receive the synthetic treatment [partition 1 is 20%, 2 is 40%, 3 is 60%, and 4 is 80%] and B is the signal strength in units of standard deviations);
(3) RNAseq_partitionA_deltaB.RDS is the synthetically modified gene expression data (same parameter legend as CNV); and
(4) Prot_partitionA_deltaB.RDS is the synthetically modified protein expression data (same parameter legend as CNV).
Supplemental Files
The file cluster_pathway_collection_20201117.gmt is the collection of gene sets used for the simulation study, in Gene Matrix Transposed format. Scripts to create and analyze these data sets are available at: https://github.com/TransBioInfoLab/pathwayMultiomics_manuscript_supplement
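The naming scheme described above can be turned into a small helper for locating and decoding files. The following is a minimal sketch, not part of the published data set or its scripts: all function and class names are hypothetical, it assumes that "pt1" and "pt2" are usable as labels for the two halves of the archive, and it assumes that the delta value B appears in the file name as a plain decimal number (the description does not state the exact format). It only handles names; reading the .RDS contents would still require R's readRDS or an equivalent reader.

```python
import re
from typing import NamedTuple, Optional

# Partition index -> proportion of genes in each gene set that received
# the synthetic treatment (taken from the data set description).
PARTITION_PROPORTION = {1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8}

# ASSUMPTION: B is written as a plain decimal number, e.g. "delta0.3".
_DATA_FILE = re.compile(
    r"^(CNV|RNAseq|Prot)_partition([1-4])_delta(\d+(?:\.\d+)?)\.RDS$"
)


class SimFile(NamedTuple):
    """Decoded parameters of one simulated data file (hypothetical helper)."""
    platform: str      # "CNV", "RNAseq", or "Prot"
    partition: int     # 1..4
    proportion: float  # fraction of genes treated in each gene set
    delta: float       # signal strength in standard deviations


def parse_data_file(name: str) -> Optional[SimFile]:
    """Decode a file name; return None if it does not follow the pattern."""
    match = _DATA_FILE.match(name)
    if match is None:
        return None
    partition = int(match.group(2))
    return SimFile(
        platform=match.group(1),
        partition=partition,
        proportion=PARTITION_PROPORTION[partition],
        delta=float(match.group(3)),
    )


def sim_label(run: int) -> str:
    """Zero-padded simulation run label: 1 -> '001', 100 -> '100'."""
    if not 1 <= run <= 100:
        raise ValueError("simulation run must be between 1 and 100")
    return f"{run:03d}"


def sim_folder(run: int) -> str:
    """Subfolder name for a simulation run, e.g. 'sim007'."""
    return f"sim{sim_label(run)}"


def archive_part(run: int) -> str:
    """ASSUMED label of the archive half: runs 1-50 in pt1, 51-100 in pt2."""
    sim_label(run)  # validates the range
    return "pt1" if run <= 50 else "pt2"


def indicator_file(run: int) -> str:
    """Name of the indicator-matrix list for a simulation run."""
    return f"indicatorMatrices{sim_label(run)}_ls.RDS"
```

For instance, under these assumptions `parse_data_file("RNAseq_partition3_delta0.5.RDS")` would report the gene expression platform with 60% of genes treated, and `sim_folder(7)` gives the subfolder name `sim007`.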

0 references

publication date

11 November 2021

0 references

Creative Commons Attribution 4.0 International

0 references