The problem of identifying subtle time-space clustering of disease, as may be occurring in leukemia, is described and reviewed. Published approaches that do not depend on knowledge of the underlying population for their validity, generally associated with studies of leukemia, seek to identify clustering by establishing a relationship between the temporal and the spatial separations for the n(n - 1)/2 possible pairs which can be formed from the n observed cases of disease. Here it is proposed that statistical power can be improved by applying a reciprocal transform to these separations. While a permutational approach can give valid probability levels for any observed association, it is suggested, for reasons of practicability, that the observed association be tested relative to its permutational variance. Formulas and computational procedures for doing so are given.
While the distance measures between points represent symmetric relationships subject to mathematical and geometric regularities, the variance formula developed is appropriate for arbitrary relationships. Simplified procedures are given for the case of symmetric and skew-symmetric relationships. The general procedure is indicated as being potentially useful in other situations, such as the study of interpersonal relationships. Viewing the procedure as a regression approach, the possibility of extending it to nonlinear and multivariate situations is suggested.
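As a notational sketch of the procedure summarized above, let X_ij denote the spatial measure and Y_ij the temporal measure for the pair of cases i and j, each possibly after a reciprocal transform. The symbols X_ij, Y_ij, and Z are labels assumed here for illustration only. The observed association can then be written as a sum of cross products and referred to its permutational mean and variance:

\[
Z = \sum_{i \ne j} X_{ij} Y_{ij}, \qquad
E(Z) = \frac{\left(\sum_{i \ne j} X_{ij}\right)\left(\sum_{i \ne j} Y_{ij}\right)}{n(n-1)}, \qquad
\text{deviate} = \frac{Z - E(Z)}{\sqrt{\operatorname{Var}(Z)}}
\]

Here the expectation is taken over all n! relabelings of the cases in one of the two arrays, and the sums run over ordered pairs so that the form also covers relationships that are not symmetric. The expression for Var(Z) is not reproduced in this sketch.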
Other aspects of the problem and of the procedure developed are discussed.
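The pairwise idea summarized above can be illustrated with a minimal Python sketch. It is not the computational procedure of this paper: instead of testing the association against its permutational variance, it approximates the permutational probability level directly by Monte Carlo relabeling of the cases. The function names, the additive offsets used to keep the reciprocals finite, and the number of permutations are all assumptions made for illustration.

```python
import math
import random


def closeness_matrix(items, separation, offset):
    """Symmetric matrix of reciprocal-transformed separations.

    Entry (i, j) is 1 / (separation(items[i], items[j]) + offset).
    The offset is an assumed constant that keeps the reciprocal finite
    when two cases coincide. The diagonal is left at zero.
    """
    n = len(items)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = 1.0 / (separation(items[i], items[j]) + offset)
    return m


def cross_product(x, y, order):
    """Sum over the n(n - 1)/2 pairs of x[i][j] * y[order[i]][order[j]]."""
    n = len(order)
    return sum(
        x[i][j] * y[order[i]][order[j]]
        for i in range(n)
        for j in range(i + 1, n)
    )


def clustering_test(locations, times, space_offset=1.0, time_offset=1.0,
                    n_perm=999, seed=0):
    """Return the observed cross-product sum and a Monte Carlo p-value.

    The p-value is the share of random relabelings of the onset times
    whose statistic is at least as large as the observed one, counting
    the observed arrangement itself as one of the relabelings.
    """
    x = closeness_matrix(locations, math.dist, space_offset)
    y = closeness_matrix(times, lambda a, b: abs(a - b), time_offset)
    identity = list(range(len(times)))
    z_obs = cross_product(x, y, identity)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)
        if cross_product(x, y, order) >= z_obs:
            hits += 1
    return z_obs, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: two groups of cases close in both space and time.
    locations = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    times = [1, 2, 3, 50, 51, 52]
    z, p = clustering_test(locations, times)
    print(f"Z = {z:.3f}, Monte Carlo p = {p:.3f}")
```

Because the reciprocal transform gives most weight to pairs that are close together, a large cross-product sum indicates that cases near in space also tend to be near in time. For larger n, relabeling becomes expensive, which is the practical motivation for a variance-based test.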
Similarly, pure temporal clustering can be identified by a study of incidence rates in periods of widespread epidemics. In point of fact, many epidemics of communicable diseases are somewhat local in nature and so these do actually constitute temporal-spatial clusters. For leukemia and similar diseases in which cases seem to arise substantially at random rather than as clear-cut epidemics, it is necessary to devise sensitive and efficient procedures for detecting any nonrandom component of disease occurrence.
Various ingenious procedures which statisticians have developed for the detection of disease clustering are reviewed here. These procedures can be generalized so as to increase their statistical validity and efficiency. The technique to be given below for imparting statistical validity to the procedures already in vogue can be viewed as a generalized form of regression, with possible useful application to problems arising in quite different contexts.