

Environmental Health Perspectives Volume 109, Supplement 3, June 2001
The Spatial Association between Community Air Pollution and Mortality: A New Method of Analyzing Correlated Geographic Cohort Data
Richard Burnett,1,2,7 Renjun Ma,2 Michael Jerrett,3 Mark S. Goldberg,4,5 Sabit Cakmak,1 C. Arden Pope III,6 and Daniel Krewski2,7
1Healthy Environments and Consumer Safety Branch, Health Canada, Tunney's Pasture, Ottawa, Ontario, Canada; 2Department of Epidemiology and Community Medicine, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; 3School of Geography and Geology and Institute of Environment and Health, McMaster University, Hamilton, Ontario, Canada; 4Department of Medicine, McGill University, Montreal, Quebec, Canada; 5Joint Departments of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Quebec, Canada; 6Economics Department, Brigham Young University, Provo, Utah, USA; 7Institute of Population Health, University of Ottawa, Ottawa, Ontario, Canada
Abstract
We present a new statistical model for linking spatial variation in ambient air pollution to mortality. The model incorporates risk factors measured at the individual level, such as smoking, and at the spatial level, such as air pollution. We demonstrate that the spatial autocorrelation in community mortality rates, an indication that potentially confounding risk factors for the air pollution-mortality association have not been fully characterized, can be accounted for through the inclusion of location in the model assessing the effects of air pollution on mortality. Our methods are illustrated with an analysis of the American Cancer Society cohort to determine whether all-cause mortality is associated with concentrations of sulfate particles. The relative risk associated with a 4.2 µg/m3 interquartile range of the sulfate distribution for all causes of death was 1.051 (95% confidence interval 1.036-1.066) based on the Cox proportional hazards survival model, assuming subjects were statistically independent. Inclusion of community-based random effects yielded a relative risk of 1.055 (95% confidence interval 1.033-1.077), which represented a doubling in the residual variance compared to that estimated by the Cox model. Residuals from the random-effects model displayed strong evidence of spatial autocorrelation (p = 0.0052). Further inclusion of a location surface reduced the sulfate relative risk and the evidence for autocorrelation as the complexity of the location surface increased, with a range in relative risks of 1.055-1.035. We conclude that these data display both extravariation and spatial autocorrelation, characteristics not captured by the Cox survival model. Failure to account for extravariation and spatial autocorrelation can lead to an understatement of the uncertainty of the air pollution association with mortality. Key words: air pollution, cohort, epidemiology, mortality, spatial regression, sulfate particles, survival. -- Environ Health Perspect 109(suppl 3):375-380 (2001).
http://ehpnet1.niehs.nih.gov/docs/2001/suppl-3/375-380burnett/abstract.html
Address correspondence to R.T. Burnett, 200 Environmental Health Center, Tunney's Pasture, Ottawa, Ontario, Canada K1A 0L2, PA: 0800B1. Telephone: (613) 952-1364. Fax: (613) 941-3883. E-mail: rick_burnett@hc-sc.gc.ca
This research was motivated by a comprehensive reanalysis of the Harvard Six Cities Study and the American Cancer Society (ACS) Study of Particulate Air Pollution and Mortality sponsored by the Health Effects Institute (HEI) in which several of the authors (R. Burnett, R. Ma, M. Jerrett, M. Goldberg, and D. Krewski) participated. We are grateful to HEI for their support of the reanalysis and to the HEI Expert Panel and Review Committee that provided us with many helpful comments during the 2-year course of the reanalysis. The methods presented in this paper represent extensions of our initial attempts to address spatial patterns in the ACS data in the reanalysis.
Received 12 December 2000; accepted 14 March 2001.
In 1997, the U.S. Environmental Protection Agency (U.S. EPA) promulgated new regulations for fine particulate matter in ambient air. This decision was based, in part, on the evidence that Americans had an increased risk of cardiopulmonary mortality if they lived in areas with elevated ambient fine particles compared to individuals who resided in less-polluted areas. Two of the key studies considered by the U.S. EPA in this regard were those of Dockery and colleagues (1), who used data from the Harvard Six Cities Study, and Pope and colleagues (2), who used data obtained from the American Cancer Society (ACS) Cancer Prevention II Study (3). A number of criticisms of these two studies (4-6) have been largely addressed in an extensive reanalysis (7) conducted at the request of the Health Effects Institute in Cambridge, Massachusetts, USA.
In both of these cohort studies (1,2), subjects were enrolled from communities with various levels of outdoor air pollution. Subject-specific information on factors such as age, gender, race, tobacco use, alcohol consumption, occupational exposures, and education was collected by interview and questionnaire. Subjects were followed over time to assess changes in their vital status. Air pollution was measured by fixed-site monitors either prior to enrollment or during follow-up or both. The standard Cox proportional hazards regression survival model (8) was used to assess associations between mortality rates and community-based average ambient air pollution while controlling for individual risk factors such as age, gender, race, tobacco use, education, cigarette smoking, body mass index, and occupational exposures. In both of these studies (1,2), statistically significant associations were found between mortality rates and particulate air pollution, as measured by fine and sulfate particulate concentrations.
The standard Cox proportional hazards model used in these two studies to relate longevity to exposure assumed that event information (time of death or censoring due to end of study or loss to follow-up) was statistically independent among subjects after controlling for available information on subject-specific mortality risk factors. Such an approach raises at least two somewhat related concerns. First, health responses can cluster by location (9). Clustering will induce a positive correlation among the responses of subjects in the same location and thus suggests that there are one or more unmeasured or inadequately modeled risk factors specific to the location itself. Failure to account for this clustering can lead to an understatement of the uncertainty in the effect estimates (10,11).
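The mechanism behind this understatement can be illustrated with a simpler quantity than a Cox coefficient. The expression below is a textbook sketch for a sample mean, not the survival model used in this paper, and it assumes S communities of equal size m, a common variance, a common within-community correlation, and no correlation between communities. All symbols are introduced here for illustration only.

```latex
% Variance of the overall mean \bar{y} of S m observations with variance \sigma^2,
% within-community correlation \rho, and independence between communities:
\operatorname{Var}(\bar{y}) = \frac{\sigma^2}{S m}\,\bigl[\,1 + (m - 1)\,\rho\,\bigr]
% The bracketed factor is the inflation relative to the independence assumption.
% Hypothetical example: m = 3500 and \rho = 0.001 give 1 + 3499 \times 0.001 \approx 4.5.
```

Under these assumptions, even a very small within-community correlation inflates the variance several-fold when communities are large, which is why an analysis that treats subjects as independent can report intervals that are too narrow.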
Second, responses of subjects living in communities close together may be more similar than responses of subjects living in cities farther apart after controlling for subject-specific risk factors. Failure to account for this type of spatial autocorrelation can also lead to an understatement of the uncertainty of the effect estimates (12,13). Furthermore, if this spatial autocorrelation is due to missing or systematically mismeasured risk factors also spatially autocorrelated, then the estimates of the effect of air pollution on mortality could be biased. The direction and size of the bias will depend upon the direction and degree of correlation between the missing risk factors, air pollution, and mortality.
In this article we present a new statistical approach to deal with these two related methodologic concerns. We present a spatial random-effects survival model that links spatial variation in concentrations of ambient air pollution to longevity of cohort subjects after controlling for temporal effects and individual risk factors for mortality. We used data from the original ACS study (2) to demonstrate the impact of modeling random location effects and spatial autocorrelation on the estimated air pollution-mortality association and estimates of uncertainty. These results are compared with those obtained using standard methods of survival analysis assuming statistical independence among subjects.
In this section we review the data used for our analysis, and we present the statistical model used to assess the association between air pollution and mortality.
The American Cancer Society Study of Air Pollution and Mortality
Volunteers of the ACS enrolled over 1.2 million people in September 1982 throughout the United States. Information on history of disease, demographic characteristics, and mortality risk factors was obtained from respondents. Vital status was monitored through the end of 1989.
We obtained information on particulate sulfate levels from the Aerometric Information Retrieval System (AIRS) (http://www.epa.gov/air/data/info.html) and the Inhalable Particle Network (IPN) for 1980 and 1981 for 144 metropolitan statistical areas (MSAs) in which ACS subjects were enrolled. Sulfates are secondarily formed particulate aerosols originating from sulfur dioxide emissions and are a major component of fine particulate matter. The sulfate data from AIRS were collected using glass-fiber filters, which react in the presence of sulfur dioxide and artifactually inflate the sulfate concentration. The sulfate data obtained from the IPN used Teflon filters, which are not subject to this artifact problem. Both monitoring networks were operating in 41 MSAs. We calibrated the AIRS sulfate data to the IPN sulfate data using six linear regression models, with separate calibrations for three regions of the country and two periods (April-September and October-March) (7). We used six calibration equations because sulfur dioxide concentrations vary both regionally and seasonally in the United States. Estimates of exposure were obtained by averaging all available sulfate data from all monitors located in an MSA for the years 1980 and 1981, inclusive.
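The calibration step described above can be sketched as one straight-line fit per region-by-season group. This is a minimal illustration only: the function names, the region label "east", the season labels "warm" and "cool", and every number in the toy data are hypothetical, and the fitted equations from the reanalysis (7) are not reproduced here. Only two of the six groups are shown.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_calibrations(records):
    """Fit one line per (region, season) group.

    records: iterable of (region, season, airs_sulfate, ipn_sulfate) taken
    from areas where both networks operated.
    """
    groups = defaultdict(lambda: ([], []))
    for region, season, airs, ipn in records:
        groups[(region, season)][0].append(airs)
        groups[(region, season)][1].append(ipn)
    return {key: fit_line(xs, ys) for key, (xs, ys) in groups.items()}


def calibrate(models, region, season, airs):
    """Map a glass-fiber (AIRS) reading onto the Teflon (IPN) scale."""
    intercept, slope = models[(region, season)]
    return intercept + slope * airs


# Made-up paired readings in µg/m3 (hypothetical, for illustration only).
records = [
    ("east", "warm", 8.0, 6.0), ("east", "warm", 12.0, 9.0), ("east", "warm", 16.0, 12.0),
    ("east", "cool", 6.0, 5.0), ("east", "cool", 10.0, 8.0), ("east", "cool", 14.0, 11.0),
]
models = fit_calibrations(records)
adjusted = calibrate(models, "east", "warm", 10.0)
print(adjusted)
```

Fitting the groups separately lets both the intercept and the slope of the artifact correction differ by region and season, which matches the stated reason for using six equations.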
We examined the association between concentrations of sulfate particles and longevity in 144 MSAs for white members of the ACS cohort, totaling 509,292 subjects. The mean age at enrollment was 56.7 years. Five percent of the subjects were younger than 40 years of age and 5% were older than 75 years of age; 56.3% of the subjects were women. During the course of the 7 years of follow-up, 39,474 subjects (7.8%) died. The mean concentration of sulfate particles across all 144 cities, corrected for the sulfur dioxide artifact, was 6.4 µg/m3, with a minimum value of 1.4 µg/m3, an interquartile range of 4.2 µg/m3, and a maximum value of 15.6 µg/m3.
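The interquartile range reported above is the exposure contrast behind the relative risks quoted in the abstract. As a rough sketch of how such a figure relates to a per-unit log-hazard coefficient, the snippet below backs out a coefficient and standard error from the Cox-model result in the abstract (1.051, 95% confidence interval 1.036-1.066) and converts it back. It assumes the interval is symmetric on the log scale with a 1.96 multiplier; the function names are illustrative and are not taken from the paper.

```python
import math

IQR = 4.2  # µg/m3, interquartile range of sulfate across the 144 MSAs


def per_unit_from_rr(rr, lower, upper, contrast, z=1.96):
    """Back out a per-unit log-hazard coefficient and its standard error.

    Assumes the confidence interval is symmetric on the log scale.
    """
    beta = math.log(rr) / contrast
    se = (math.log(upper) - math.log(lower)) / (2 * z * contrast)
    return beta, se


def rr_for_contrast(beta, se, contrast, z=1.96):
    """Relative risk and confidence limits for a given exposure contrast."""
    point = math.exp(beta * contrast)
    lower = math.exp((beta - z * se) * contrast)
    upper = math.exp((beta + z * se) * contrast)
    return point, lower, upper


beta, se = per_unit_from_rr(1.051, 1.036, 1.066, IQR)
rr, lo, hi = rr_for_contrast(beta, se, IQR)
print(round(rr, 3), round(lo, 3), round(hi, 3))
```

The same two functions can be used to re-express the estimate for any other contrast, for example a 1 µg/m3 increment, by changing the `contrast` argument.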
Statistical Model
The model is formulated in two stages. In stage one, survival data were modeled by covariates at the individual level and indicator functions for each community, using the Cox proportional hazards model (8). The community-specific indicator functions represent the logarithm of the relative risk of death in a specific community compared to an arbitrarily defined reference community. Sulfate pollution was not included at this stage.
Indicator functions for community are defined with respect to a reference community (in our case Greenville, South Carolina, as it had a sulfate concentration near the mean value). A limitation of this procedure is that the uncertainty of the estimate for the reference community is not defined. Because the community-specific estimates are all based on comparisons with the same reference community, they are correlated. This correlation increases the estimated uncertainty in the community-specific log-relative risks. The induced correlation can be removed by methods developed by Easton and colleagues (14). This procedure eliminates the induced correlation between the estimates of the community-specific log-relative risks and estimates the uncertainty for the reference community. Outputs from stage one are estimates of the community-specific logarithm of the relative risks adjusted for mortality risk factors other than sulfate pollution, denoted by $\{\hat{\delta}(s), s = 1, \ldots, S\}$, where $s$ denotes a point in Cartesian $(x, y)$ space representing the location of one of the $S$ communities under study. Additional output from this stage is the variance-covariance matrix of the $\hat{\delta}(s)$, denoted by $V$, which describes the uncertainty in the adjusted estimates of the community-specific log-relative risks.
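The correlation induced by a shared reference community can be seen in a stripped-down setting. The sketch below is not the stage-one Cox model and not the procedure of Easton and colleagues (14). It simply assumes each community has an independent log-rate estimate with a known variance and computes the covariance of the differences from the reference. The variances are hypothetical.

```python
def contrast_covariance(variances, ref):
    """Covariance matrix of d_s = a_s - a_ref for every community s != ref.

    Assumes the community log-rates a_s are independent with the given
    variances, so Var(d_s) = v_s + v_ref and Cov(d_s, d_t) = v_ref.
    """
    others = [s for s in range(len(variances)) if s != ref]
    v_ref = variances[ref]
    cov = [
        [v_ref + (variances[s] if s == t else 0.0) for t in others]
        for s in others
    ]
    return others, cov


# Hypothetical variances of four community log-rates; community 0 is the reference.
variances = [0.010, 0.020, 0.015, 0.030]
others, cov = contrast_covariance(variances, ref=0)
for row in cov:
    print(row)
```

Every off-diagonal entry equals the variance of the reference estimate, and every diagonal entry is larger by the same amount. Under this simplification, both the correlation and the extra uncertainty come entirely from the choice of reference.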
In stage two, estimates of adjusted community-specific log-relative risks were related to levels of sulfate pollution using a linear random-effects regression model (15). We also included a two-dimensional term to account for spatial trends, denoted by $\xi^*$, and assigned a random effect to each community. These spatial random effects are shared by all individuals within the community and reflect the difference between the observed and predicted values from our statistical model. We assume that the random effects have zero expectation, variance $\theta > 0$, and correlation matrix $\Omega$. Residual variation at the spatial level suggests that there are some unexplained (unmeasured or incorrectly specified) risk factors for mortality.
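A stage-two fit of this kind can be sketched as generalized least squares, assuming the covariance of the stage-one estimates is the random-effects variance times their correlation matrix plus the stage-one matrix V. In the sketch below the variance (`theta`) is treated as known and the random effects as independent (`Omega` is the identity), whereas in practice the variance would be estimated. The names `gls_fit`, `theta`, and `Omega` and all simulated values are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def gls_fit(delta, X, V, theta, Omega):
    """Generalized least squares for delta ~ N(X @ beta, theta * Omega + V).

    Returns the coefficient vector and its covariance matrix.
    """
    Sigma = theta * Omega + V
    Sigma_inv_X = np.linalg.solve(Sigma, X)
    Sigma_inv_delta = np.linalg.solve(Sigma, delta)
    cov_beta = np.linalg.inv(X.T @ Sigma_inv_X)
    beta = cov_beta @ (X.T @ Sigma_inv_delta)
    return beta, cov_beta


# Simulated inputs (hypothetical): 50 communities, intercept plus sulfate.
rng = np.random.default_rng(0)
S = 50
sulfate = rng.uniform(1.4, 15.6, size=S)
X = np.column_stack([np.ones(S), sulfate])
V = np.diag(rng.uniform(0.001, 0.004, size=S))  # stand-in for stage-one uncertainty
Omega = np.eye(S)                               # independent random effects
delta = X @ np.array([-0.05, 0.012])            # noise-free log-relative risks

beta_fixed, cov_fixed = gls_fit(delta, X, V, theta=0.0, Omega=Omega)
beta_random, cov_random = gls_fit(delta, X, V, theta=0.002, Omega=Omega)

se_fixed = np.sqrt(cov_fixed[1, 1])
se_random = np.sqrt(cov_random[1, 1])
print(se_random / se_fixed)
```

Because the simulated log-relative risks are noise-free, both fits return the same sulfate coefficient. Only the standard error changes, and it grows once the between-community variance is added to the covariance.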
Evidence of spatial autocorrelation in the residuals of the model may indicate the need to account for additional risk factors, which may potentially exert a confounding effect on the air pollution-mortality association. An alternative to accounting for autocorrelation by modeling additional risk factor information is to filter out spatially contiguous variation by including a term that represents spatial trends, $\{\xi^*(s)\}$. The total impact of these potentially numerous unmeasured risk factors may vary in a relatively smooth manner over space, and thus spatial detrending can remove autocorrelation between geographic areas. In this approach, location and other covariates such as air pollution, which also vary in space, compete in the regression model to predict mortality. Thus, the regression coefficients give the effect of these variables adjusted for each other. This approach is analogous to that used in time-series studies of mortality and air pollution in which temporal trends in daily mortality rates are jointly modeled with air pollution levels (16).
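The idea of letting location and pollution compete in the same regression can be illustrated with simulated data. The sketch below makes several assumptions that are not from this section: the location surface is a low-order polynomial in the coordinates, the fit is ordinary least squares, and residual autocorrelation is summarized with Moran's I using a simple distance-band weight. All data are simulated, with a smooth unmeasured factor that is correlated with sulfate.

```python
import numpy as np


def morans_i(values, coords, bandwidth):
    """Moran's I with binary weights: 1 for distinct sites within `bandwidth`."""
    z = values - values.mean()
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    W = ((d > 0) & (d <= bandwidth)).astype(float)
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)


def surface_basis(coords, degree):
    """Polynomial surface: monomials x^i * y^j with 1 <= i + j <= degree."""
    x, y = coords[:, 0], coords[:, 1]
    cols = [x**i * y**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i) if i + j >= 1]
    return np.column_stack(cols) if cols else np.empty((len(x), 0))


def fit(delta, sulfate, coords, degree):
    """Regress delta on sulfate plus a location surface; return slope and residuals."""
    X = np.column_stack([np.ones(len(delta)), sulfate, surface_basis(coords, degree)])
    beta, *_ = np.linalg.lstsq(X, delta, rcond=None)
    return beta[1], delta - X @ beta


# Simulated communities (hypothetical): sulfate and an unmeasured factor both trend in x.
rng = np.random.default_rng(1)
n = 144
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x, y = coords[:, 0], coords[:, 1]
sulfate = 5.0 + 4.0 * x + rng.normal(0.0, 1.0, size=n)
unmeasured = 0.10 * x + 0.05 * y
delta = 0.010 * sulfate + unmeasured + rng.normal(0.0, 0.005, size=n)

b_naive, r_naive = fit(delta, sulfate, coords, degree=0)      # no location surface
b_surface, r_surface = fit(delta, sulfate, coords, degree=1)  # planar surface
i_naive = morans_i(r_naive, coords, bandwidth=0.2)
i_surface = morans_i(r_surface, coords, bandwidth=0.2)
print(b_naive, b_surface, i_naive, i_surface)
```

In this simulation the fit without a surface attributes part of the smooth unmeasured gradient to sulfate and leaves spatially patterned residuals. Adding the surface moves the sulfate coefficient back toward the simulated value of 0.010 and reduces the residual autocorrelation. Real data need not behave this cleanly, since the true surface is unknown and may not be polynomial.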
Here, $\hat{\delta}(s)$ has expectation
$\mu(s) = \xi^*(s) + \beta^T Z(s)$
and variance-covariance matrix
$\Sigma = \theta\Omega + V$,
where $Z(s)$ is a vector of covariates defined at the community level, $\theta$ and $\Omega$ are the variance and correlation matrix of the random effects, and $V$ is the stage-one variance-covariance matrix. In our example we restricted the set of these spatial covariates to sulfate pollution. If the number of subjects and deaths in each community is large, as is assumed here, the $\{\hat{\delta}(s)\}$ have approximately a multivariate normal distribution with mean vector