CMU-CS-03-154
Computer Science Department
School of Computer Science, Carnegie Mellon University



CMU-CS-03-154

A Fast Multi-Resolution Method for Detection of
Significant Spatial Overdensities

Daniel B. Neill, Andrew W. Moore

June 2003

CMU-CS-03-154.ps
CMU-CS-03-154.pdf


Keywords: Algorithms, biosurveillance, cluster detection, spatial scan statistic


Given an N x N grid of squares, where each square sij has a count cij and an underlying population pij, our goal is to find the square region S with the highest density, and to calculate the significance of this region by Monte Carlo testing. Any density measure D, which depends on the total count and total population of the region, can be used. For example, if each count cij represents the number of disease cases occurring in that square, we can use Kulldorff's spatial scan statistic DK to find the most significant spatial disease cluster. A naive approach to finding the region of maximum density would be to calculate the density measure for every square region: this requires O(RN3) calculations, where R is the number of Monte Carlo replications, and hence is generally computationally infeasible. We present a novel multi-resolution algorithm which partitions the grid into overlapping regions, bounds the maximum score of subregions contained in each region, and prunes regions which cannot contain the maximum density region. For sufficiently dense regions, this method finds the maximum density region in optimal O(RN2) time, and in practice it results in significant (10-200x) speedups as compared to the naive approach.

17 pages


Return to: SCS Technical Report Collection
School of Computer Science homepage

This page maintained by [email protected]