
Self-taught learning: transfer learning from unlabeled data

Self-taught Learning: Transfer Learning from Unlabeled Data

Rajat Raina, Alexis Battle, Honglak Lee, Benjamin Packer, Andrew Y. Ng
Computer Science Department, Stanford University, CA 94305 USA

Abstract: We present a new machine learning framework called "self-taught learning" for using unlabeled data in supervised classification tasks. We do not assume that the unlabeled data follows the same class labels or generative distribution as the labeled data. Thus, we would like to use a large number of unlabeled images (or audio samples, or text documents) randomly downloaded from the Internet to improve performance on a given image (or audio, or text) classification task. Such unlabeled data is significantly easier to obtain than in typical semi-supervised or transfer learning settings, making self-taught learning widely applicable to many practical learning problems. We describe an approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data. These features form a succinct input representation and significantly improve classification performance. When using an SVM for classification, we further show how a Fisher kernel can be learned for this representation.
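The pipeline the abstract describes — learn sparse-coding bases from unlabeled data, then represent each labeled input by its sparse activation coefficients before feeding it to a classifier — can be sketched as follows. This is a minimal illustration using scikit-learn's `DictionaryLearning` as a stand-in solver with synthetic random "patches"; the paper's own optimization procedure and data differ.

```python
# Sketch of self-taught feature learning: fit sparse-coding bases on
# unlabeled data, then featurize labeled inputs as sparse coefficients.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Stand-in for unlabeled data: 200 random 8x8 "patches", flattened.
unlabeled = rng.standard_normal((200, 64))

# Learn a set of basis vectors b_1..b_k from the unlabeled data.
dico = DictionaryLearning(
    n_components=32,
    transform_algorithm="lasso_lars",  # L1-penalized coding step
    transform_alpha=1.0,
    max_iter=20,
    random_state=0,
)
dico.fit(unlabeled)

# Stand-in for labeled inputs: each example is re-represented by its
# sparse activations a(x), which replace the raw input as classifier features.
labeled = rng.standard_normal((10, 64))
features = dico.transform(labeled)

print(features.shape)  # (10, 32): one 32-dimensional sparse code per example
```

The resulting `features` matrix would then be passed to any supervised learner (e.g. an SVM) in place of the raw inputs.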


Association for Computing Machinery — Jun 20, 2007



Datasource: Association for Computing Machinery
Copyright: © 2007 ACM Inc.
ISBN: 978-1-59593-793-3
DOI: 10.1145/1273496.1273592

