CMU-CS-05-145
Computer Science Department
School of Computer Science, Carnegie Mellon University
Automated Modeling and Nonlinear Axis Scaling
Leejay Wu
May 2005
Ph.D. Thesis
Keywords: Scaling, modeling, feature selection
This thesis examines nonlinear axis scaling and its impact on
the modeling of inter-attribute relationships. Through automated
methods, the described system identifies possible scaling methods;
decides which attributes serve as inputs or outputs; and builds
regression trees that quantify these relationships. While the
experiments focus on the accuracy and complexity of these models,
both of which can be examined quantitatively, the results also
consider applicability to the inherently more qualitative task
of rule-based outlier or anomaly detection.
The results demonstrate that nonlinear axis scaling, even in an
automated system, can yield significantly more accurate models
than the unscaled case without proportionally higher complexity
costs. It can also help reveal unusual tuples in which no
individual value is unusual, but the combination of values is.
179 pages