TY - JOUR
AU1 - Xing, Eric P.
AU2 - Karp, Richard M.
AB - We present CLIFF, an algorithm for clustering biological samples using gene expression microarray data. This clustering problem is difficult for several reasons, in particular the sparsity of the data, the high dimensionality of the feature (gene) space, and the fact that many features are irrelevant or redundant. Our algorithm iterates between two computational processes, feature filtering and clustering. Given a reference partition that approximates the correct clustering of the samples, our feature filtering procedure ranks the features according to their intrinsic discriminability, relevance to the reference partition, and irredundancy to other relevant features, and uses this ranking to select the features to be used in the following round of clustering. Our clustering algorithm, which is based on the concept of a normalized cut, clusters the samples into a new reference partition on the basis of the selected features. On a well-studied problem involving 72 leukemia samples and 7130 genes, we demonstrate that CLIFF outperforms standard clustering approaches that do not consider the feature selection issue, and produces a result that is very close to the original expert labeling of the sample set. Contact: epxing@cs.berkeley.edu
TI - CLIFF: clustering of high-dimensional microarray data via iterative feature filtering using normalized cuts
JF - Bioinformatics
DO - 10.1093/bioinformatics/17.suppl_1.s306
DA - 2001-06-01
UR - https://www.deepdyve.com/lp/oxford-university-press/cliff-clustering-of-high-dimensional-microarray-data-via-iterative-9EuNS1u3No
SP - S306
EP - S315
VL - 17
IS - suppl_1
DP - DeepDyve
ER -