CMU-ML-12-100
Machine Learning Department
School of Computer Science, Carnegie Mellon University




Integrating Representation Learning and Skill Learning in a Human-Like Intelligent Agent

Nan Li, Noboru Matsuda,
William W. Cohen, Kenneth R. Koedinger

January 2012



Keywords: Representation learning, deep feature learning, intelligent agent


Building an intelligent agent that simulates human learning of math and science could benefit both education, by contributing to the understanding of human learning, and artificial intelligence, by advancing the goal of creating human-level intelligence. However, constructing such a learning agent currently requires manual encoding of prior domain knowledge. Besides being a poor model of how humans acquire prior knowledge, manual knowledge encoding is time-consuming and error-prone. Previous work showed that one of the key factors that differentiate experts from novices is how they represent knowledge: experts view the world in terms of deep functional features, while novices view it in terms of shallow perceptual features. Moreover, since the performance of many existing learning algorithms is sensitive to representation, deep features are also important for effective learning. In this paper, we present an efficient algorithm that acquires representation knowledge in the form of "deep features" for specific domains, and demonstrate its effectiveness in the domain of algebra as well as in synthetic domains. We integrate this algorithm into a machine-learning agent, SimStudent, which learns procedural knowledge by observing a tutor solve sample problems and by getting feedback while actively solving problems on its own. We show that learning "deep features" reduces the requirements for knowledge engineering. Moreover, we propose an approach that automatically discovers student models using the extended SimStudent. By fitting the discovered model to real student learning curve data, we show that it is a better student model than human-generated models, and demonstrate how the discovered model may be used to improve a tutoring system's instructional strategy.

41 pages

