Variable selection in transcriptomics data using knockoffs in a classification framework
1 : Centre de Bioinformatique
Mines Paris - PSL (École nationale supérieure des mines de Paris)
2 : Institut Curie
PSL - University
3 : Oncologie Computationnelle (U1331)
Institut National de la Santé et de la Recherche Médicale - INSERM
4 : LOPF Califrais' Machine Learning Lab
Califrais
* : Corresponding author
The emergence of new sequencing technologies has facilitated the acquisition of large amounts of biological data, which have proven useful for better understanding biological systems. One way to exploit sequencing data is to use them to identify relationships between biological units (e.g. genes) and phenotypic characteristics (e.g. disease outcomes). This question, formulated as a variable selection problem, remains difficult because of the size of the data (n
