CMU-CS-16-119
Computer Science Department
School of Computer Science, Carnegie Mellon University
Active Transfer Learning
Xuezhi Wang
June 2016
Ph.D. Thesis
Transfer learning algorithms are used when one has sufficient training data for one supervised learning task (the source task) but only very limited training data for a second task (the target task) that is similar but not identical to the first. These algorithms use varying assumptions about the similarity between the tasks to carry information from the source to the target task. A common assumption is that only certain specific marginal or conditional distributions have changed while all else remains the same. Moreover, little work on transfer learning has considered the case where a few labels in the test domain are available. Alternatively, if one has only the target task, but also has the ability to choose a limited amount of additional training data to collect, then active learning algorithms are used to make the choices that will most improve performance on the target task. These algorithms may be combined into active transfer learning, but previous efforts have had to apply the two methods in sequence or use restrictive transfer assumptions.
This thesis focuses on active transfer learning under the model shift assumption.
We start by proposing two transfer learning algorithms that allow changes in all
marginal and conditional distributions but assume the changes are smooth in order
to achieve transfer between the tasks. We then propose an active learning algorithm
for the second method that yields a combined active transfer learning algorithm.
By analyzing the risk bounds for the proposed transfer learning algorithms, we show
that when the conditional distribution changes, we are able to obtain a generalization
error bound of O(i/λ …
On the other hand, multi-task learning attempts to simultaneously leverage data from multiple domains in order to estimate related functions on each domain. Similar to transfer learning, multi-task problems are also solved by imposing some kind of "smooth" relationship among tasks. We study how different smoothness assumptions on task relations affect the upper bounds of algorithms proposed for these problems under different settings. Finally, we propose methods to predict the entire distributions P(Y) and P(Y|X) by transfer, while allowing both marginal and conditional distributions to change. Moreover, we extend this framework to multi-source distribution transfer. We demonstrate the effectiveness of our methods on both synthetic examples and real-world applications, including yield estimation on the grape image dataset, predicting air quality from Weibo posts for cities, predicting whether a robot successfully climbs over an obstacle, examination score prediction for schools, and location prediction for taxis.
117 pages
Thesis Committee:
Frank Pfenning, Head, Computer Science Department