CMU-CS-19-132
Computer Science Department
School of Computer Science, Carnegie Mellon University

Uncertainty and Diversity in Deep Active Image Classification
Hariank Muthakana
M.S. Thesis
December 2019
Deep neural networks have revolutionized computer vision, with state-of-the-art performance across multiple tasks. Training such networks depends on the availability of large, high-quality labeled datasets, which makes building new datasets a significant hurdle when approaching novel tasks or domains. In many cases, acquiring labels is difficult, expensive, or time-consuming. Active learning seeks to improve label efficiency and lower overall labeling cost by allowing the learning system to intelligently pick which samples to label. Active learning is well studied for classical machine learning models, but many of these approaches have been shown to be ineffective for deep models and modern image datasets. This raises the question of how to develop and use active strategies in these settings. In this work, we seek to build intuitions for deep active learning by conducting a comprehensive empirical analysis of existing approaches for image classification tasks. Critical to this analysis is the distinction between uncertainty-based and diversity-based strategies and how each performs in various settings. Our experiments show surprising results regarding the efficacy of existing approaches in commonly tested settings. We find that active learning is more useful in settings such as low data availability, class imbalance, and transfer learning. Finally, our results provide heuristics for the active learning practitioner to decide which strategy to use and, more crucially, whether to use active learning at all.

41 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department