Introduction
Results of a case-control study on air pollution and lung cancer in Trieste, Italy, were reported by Barbone et al. (1). That study confirmed a moderate elevation in risk of lung cancer in polluted areas and showed a variation by histologic type and category of air pollution. Trieste, which had approximately 250,000 inhabitants in the mid-1980s, is a border city located in the northeast of Italy and is characterized by a major port and a high concentration of industries. Air pollution has been monitored since the early 1970s. Higher total particulate deposition levels (i.e., >0.3 g/m2/day) were documented in the center of the city and in the industrial area in the 1970s. Currently, higher levels of carbon monoxide (monthly average 3.6 mg/m3) and nitrogen oxides (218 µg/m3) are found in the center of the city, and higher levels of ozone (32-39 µg/m3) and sulfur dioxide (50-59 µg/m3) are present near an incinerator and an iron foundry. The presence of suspended asbestos fibers was documented near a shipyard. Here we present analyses of the spatial pattern of risk of lung cancer with regard to four sources, shipyard, iron foundry, incinerator, and the city center, while adjusting for known risk factors.
Geographical investigations are hampered by the difficulties in properly accounting for confounders (2). However, methods based on the case-control design have been proposed in the statistical literature that allow the collection of data at the individual level, avoiding the ecologic bias (3). The merit of the analysis presented here is twofold. First, it relaxes the a priori categorization of the subjects' residences into given areas and uses the distance from a source as a proxy for exposure. Second, the method we used allows for directional effects and estimates the risk gradient, so that the specific pattern of risk can be described for each source.
Materials and Methods
The Cancer Registry and the Department of Pathology of the Province of Trieste identify 99% of cancer cases and conduct autopsies on approximately 73% of all the deaths of the region. From these institutions, 938 histologically confirmed cases of lung cancer were identified among males resident in the province of Trieste, who died from 1979 to 1981 or from 1985 to 1986. The two enrollment periods were chosen to cover an extended time span at a reasonable cost. The study had been originally designed to investigate environmental and occupational risk factors for lung cancer. This, together with statistical power considerations, was the reason we restricted the study to male cases only. We excluded 182 cases because we failed to trace the next of kin and 1 case because his residence was outside the Province of Trieste.
For each case, one male control resident in the Province of Trieste, who died within the same 6-month period, at the same age (±2 years), was randomly selected from the same archive at the Department of Pathology. The causes of death of the controls were not chronic lung diseases or cancer of the upper aerodigestive tract, urinary tract, pancreas, liver, or gastrointestinal system. The sampling probabilities for the control series are usually varied according to the proportion of cases by some relevant variable such as age or sex (4,5). The baseline spatial intensity would therefore be distorted, compared to a random sample of death controls. Use of death controls instead of living ones is widely discussed in the epidemiological literature (6). Our choice is justified by the need to minimize selection biases, with special reference to residential history.
The present study was based on 755 case-control pairs, matched by age. Each subject's next of kin was interviewed within 1-3 years of the subject's death by means of a structured questionnaire to obtain information on demographic characteristics, smoking habits, occupational history, and last place of residence. Likelihood of exposure to occupational carcinogens was assessed by expert evaluation based on the type of job; this assessment also covered people working in the iron foundry, shipyard, and incinerator. This summary variable was chosen to increase statistical power, since including several variables for each job would have led to sparse data and results affected by excess random variation.
Length of residence was not individually assessed; we only assessed whether each subject had moved from his place of residence in the last 10 years. A detailed description of data collection procedures and exposure coding has been published elsewhere (1).
Geographical. The boundaries of the Province of Trieste were coded using the geographical coordinates (Mercator projection) as provided by the Italian Army Geographical Institute (Florence, Italy; map 1:10,000). The subject's last residence was identified on the same map, and the geographical coordinates were read directly. The locations of the incinerator, the iron foundry, and the shipyard were identified similarly. The city center corresponded to the location of the central square of the town.
For the analysis, we calculated the distance and the angle from each subject location to each pollution source (north orientation). Maps with point locations were produced using ARC/Info 6.1 (7); contour plots of relative risk gradient were constructed using Gauss 2.2 (8).
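The distance and angle calculation can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes planar map coordinates (easting, northing) in meters and an angle measured clockwise from north, as seen from the source toward the subject. The function name and the coordinates are hypothetical.

```python
import math

def distance_and_bearing(subject_xy, source_xy):
    """Return (distance, bearing) of a subject relative to a source.

    Assumes planar (easting, northing) coordinates; the bearing is in
    degrees, clockwise from north, measured at the source.
    """
    dx = subject_xy[0] - source_xy[0]  # easting difference
    dy = subject_xy[1] - source_xy[1]  # northing difference
    distance = math.hypot(dx, dy)
    # atan2(dx, dy) gives the angle from the north axis, clockwise positive.
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return distance, bearing

# Hypothetical coordinates in meters (not taken from the study).
source = (1000.0, 2000.0)
subject = (1300.0, 2400.0)
print(distance_and_bearing(subject, source))
```

A different angular convention (for example, counterclockwise from east) would only change the `atan2` arguments.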
Point-source analysis. The present analysis focuses on the spatial intensity $\lambda(x)$, i.e., the frequency of events by unit area at location x. This is the spatial counterpart of the usual concept of rate, having substituted unit time with unit area. When we deal with heterogeneous population denominators, the spatial intensity is expressed in terms of intensity of the population (density of inhabitants) instead of person-years. The spatial intensity as a function of the distance from a source is expressed as:
$$\lambda(x) = \lambda_p(x)\,\rho(x - x_0; \theta)$$
where $\lambda_p(x)$ indicates the population intensity at the location x and $\rho(x - x_0; \theta)$ is the risk as a function of the distance x-x0 from the location of the source (x0), modeled by the parameters $\theta$.
The case-control design is used to bypass the task of obtaining valid estimates of the population density at each location x. The spatial intensity for the control series (i.e., non-cases) is:
and for the case series:
where k and c are constants determined by study design (sampling fraction and case-control ratio, respectively). The spatial intensity of disease is therefore a function of the odds of disease (the odds being the probability of being ill over the probability of not being ill). To overcome the difficulty in estimating $\lambda_{CN}(x)$, Diggle and Rowlingson (3) proposed conditioning the analysis on the observed case and control locations [further details are in Lagazio (9)]. We define a logistic regression model in which the odds of disease is:
$$\mathrm{odds}(d) = w\,[1 + f(d)]$$
assuming an additive scale for the relative risk [where w is a proportionality factor, d is the distance from the source, and $f(\cdot)$ is a function to be defined later]. This is plausible because, with a suitable choice of $f(\cdot)$, the risk is unchanged at infinite distance from the source. In the case of multiple sources the model becomes:
$$\mathrm{odds} = w\,\Big[1 + \sum_{s} f(d_s)\Big]$$
and individual risk factors can be modeled in the following way:
$$\mathrm{odds} = w\,\exp\Big(\sum_{j} \gamma_j z_j\Big)\Big[1 + \sum_{s} f(d_s)\Big]$$
where s denotes the sth source and $\gamma_j$ is the log odds ratio for the jth risk factor, $z_j$. The adjusted excess risk gradient for each source has been modeled as follows:
$$f(d_s) = \alpha_s \exp(\beta_s d_s)$$
where the parameter $\alpha_s$ models the excess relative risk at the source location, $d_s$ is the distance (in meters) from the sth source, and the parameter $\beta_s$ models the exponential decrease of the excess relative risk for longer distances. To allow for directional effects, we define the following model for a given source:
where d is the distance and $\varphi$ is the angle between the case or control location and the source location. This is of particular importance when considering a situation like that in Trieste, where the city is located between the coast (southwest) and hills (northeast). Although Trieste is famous for a strong northeast to southwest wind (bora), the moderate winds from the sea toward the hills are more relevant for the spread of air pollution.
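As an illustration of the mixed additive-multiplicative structure described above, the sketch below evaluates a spatial relative risk of the form 1 + Σ α_s exp(β_s d_s), multiplied by exp(Σ γ_j z_j) for individual risk factors. The functional form, the symbol names, and the function names are assumptions made for this example. The parameter values are the appendix estimates for the model with city center and incinerator, and distances must be in whatever units were used for fitting.

```python
import math

def relative_risk(distances, params):
    """Spatial relative risk: 1 + sum over sources of alpha * exp(beta * d).

    distances: {source_name: distance from that source}
    params:    {source_name: (alpha, beta)}, with beta < 0 for a decaying excess
    """
    excess = 0.0
    for source, d in distances.items():
        alpha, beta = params[source]
        excess += alpha * math.exp(beta * d)
    return 1.0 + excess

def odds_of_disease(distances, params, covariates, log_odds_ratios, w=1.0):
    """Odds = w * exp(sum_j gamma_j * z_j) * spatial relative risk."""
    linear = sum(log_odds_ratios[name] * z for name, z in covariates.items())
    return w * math.exp(linear) * relative_risk(distances, params)

# Appendix estimates (model with city center and incinerator).
params = {"center": (1.959, -0.03523), "incinerator": (6.740, -0.1762)}

# Relative risk at the incinerator itself, ignoring the city-center term.
print(relative_risk({"incinerator": 0.0}, params))
```

Because the excess terms are added inside the bracket while covariates multiply the whole expression, the spatial relative risk returns to 1 far from every source regardless of the covariate values.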
The model-based spatial analysis was conducted to allow for the contribution of relevant risk factors. These terms were considered in the multiplicative scale in the model: age, smoking habit (nonsmoker, 1-19, 20-39, and ≥40 cigarettes/day), and exposure to occupational carcinogens (none, possible, likely). Moreover, we included the levels of air particulates as defined in a previous paper (1) (tertiles of distribution, 1972-1977: <0.175; 0.175-0.298; >0.298 g/m2/day). Each subject was assigned the average value measured by the nearest among the 28 stations that covered the city.
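The nearest-station assignment and the tertile coding can be sketched as below. This is only an illustration: the station coordinates and values are hypothetical, the function names are invented for the example, and Euclidean distance on planar coordinates is an assumption. The cut points are the tertile limits quoted above.

```python
import math

def nearest_station_value(subject_xy, stations):
    """Return the mean deposition of the station closest to the subject.

    stations: list of ((x, y), mean_deposition) tuples.
    """
    _, value = min(stations, key=lambda s: math.dist(subject_xy, s[0]))
    return value

def particulate_tertile(value, cuts=(0.175, 0.298)):
    """Code a deposition value (g/m2/day) as tertile 1, 2, or 3."""
    if value < cuts[0]:
        return 1
    if value <= cuts[1]:
        return 2
    return 3

# Hypothetical stations (coordinates in meters, deposition in g/m2/day).
stations = [((0.0, 0.0), 0.12), ((1000.0, 0.0), 0.25), ((0.0, 1500.0), 0.33)]
value = nearest_station_value((900.0, 100.0), stations)
print(value, particulate_tertile(value))
```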
In the appendix, we report point estimates and likelihood ratio tests for the significance of the spatial terms in the model. The likelihood surface for those parameter estimates has an odd shape, and therefore their relative standard errors are poorly estimated. In this situation it is preferable to rely on likelihood ratios (10). These models are known as mixed additive-multiplicative models for excess relative risks and can be fitted using Epicure software (11).
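The p-values attached to the likelihood ratio statistics in the appendix can be checked with a short calculation. All the tests reported there have 2 degrees of freedom, and for a chi-square distribution with 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2). The function name below is hypothetical, and the sketch covers only the 2-df case.

```python
import math

def lr_test_p_value_df2(lr_statistic):
    """Upper-tail chi-square probability for 2 degrees of freedom.

    Valid only for df = 2, where the survival function equals exp(-x / 2).
    """
    return math.exp(-lr_statistic / 2.0)

# Likelihood ratio statistics taken from the appendix.
for label, statistic in [("city center alone", 7.435),
                         ("incinerator given city center", 9.241)]:
    print(label, round(lr_test_p_value_df2(statistic), 4))
```

For other degrees of freedom a general chi-square survival function would be needed instead of this closed form.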
Crude analysis. To describe the observed pattern of relative risk within the study area, we estimated the spatial intensity, $\lambda(x)$, nonparametrically, following the suggestions of Bithell (12) and Lawson and Williams (13). The spatial intensities for the case and control series are estimated separately as follows:
where the kernel function, $G(\cdot)$, has the Epanechnikov functional form (14). The terms $h_i$ are smoothing parameters that allow for local variation of the degree of smoothing. They are obtained as $h_i = \delta_i h$, where h is fixed in advance (500 m for our application), and $\delta_i$ is a previous estimate obtained using the simple nearest-neighbor technique (14).
The ratio of the kernel estimates for cases and non-cases is the odds of being a case, given the observed sample (this quantity differs from the odds of being ill because it also depends on the case-control ratio). To obtain easily interpretable contour plots, we back-transformed it to probability; i.e.,
$$p(x) = \frac{\omega(x)}{1 + \omega(x)}$$
where $\omega(x)$ represents the odds of being a case. Because in our study the case-control ratio is 1, the areas with a probability >0.5 of being a case are characterized by higher risk of disease.
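A minimal sketch of the crude analysis is given below. It assumes an estimator that sums, over the events of a series, a bivariate Epanechnikov kernel scaled by a per-event bandwidth, and then converts the case/control ratio to a probability. The exact estimator used in the study is not reproduced here, so the form of the sum, the function names, and the toy coordinates are all assumptions. The per-event bandwidths are passed in directly rather than derived from a nearest-neighbor pilot estimate.

```python
import math

def epanechnikov_2d(u, v):
    """Bivariate Epanechnikov kernel: (2/pi) * (1 - r^2) inside the unit disc."""
    r2 = u * u + v * v
    return (2.0 / math.pi) * (1.0 - r2) if r2 < 1.0 else 0.0

def adaptive_intensity(x, y, points, bandwidths):
    """Kernel intensity at (x, y): sum of G((x - x_i) / h_i) / h_i^2."""
    total = 0.0
    for (px, py), h in zip(points, bandwidths):
        total += epanechnikov_2d((x - px) / h, (y - py) / h) / (h * h)
    return total

def probability_of_case(x, y, cases, case_h, controls, control_h):
    """Back-transform the case/control odds to a probability, odds / (1 + odds)."""
    lam_cases = adaptive_intensity(x, y, cases, case_h)
    lam_controls = adaptive_intensity(x, y, controls, control_h)
    if lam_cases + lam_controls == 0.0:
        return None  # no event lies within reach of any kernel
    return lam_cases / (lam_cases + lam_controls)

# Toy locations in meters (hypothetical), with a 500-m bandwidth for every event.
cases = [(0.0, 0.0), (100.0, 0.0)]
controls = [(0.0, 0.0)]
print(probability_of_case(0.0, 0.0, cases, [500.0, 500.0], controls, [500.0]))
```

Writing the probability as cases / (cases + controls) avoids dividing by a zero control intensity, which the odds form would require.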
Results
Descriptive statistics and odds ratios for the relevant variables are shown in Table 1. Figures 1 and 2 show the locations of the case and control series. Figure 3 reports the location of the pollution sources and the contour plot of the probability of being a case obtained using adaptive kernel estimators with a 500-m bandwidth. There appears to be a wide risk area in the eastern part of the city with a spot near the city center and two peaks northeast and southeast from the incinerator.
Figure 1. Locations of cases.
Figure 2. Locations of controls.
Figure 3. Locations of pollution sources and contour plot of the probability of being a case.
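The appendix reports the estimates of the spatial parameters $\alpha_s$ and $\beta_s$ for each source. The highest excess relative risk is shown by the city center, along with the most slowly declining gradient. Considered separately, the city center and the shipyard were statistically significant (p = 0.024 and p = 0.020, respectively), whereas the iron foundry and the incinerator were of borderline significance (p = 0.072 and p = 0.094).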
The distances from the four sources are highly correlated. We chose to consider the city center, the most important source from a statistical point of view, as part of the model and assess the significance of the inclusion of each other source in turn.
The appendix reports the estimates of the spatial parameters of the other sources, adjusting for the effect of individual risk factors and for the effect of city center. The effect of shipyard is no longer statistically significant after adjustment. The iron foundry was of borderline significance (p = 0.09), with an excess relative risk of 5.9 at the source location. The incinerator was highly significant (p = 0.0098), with an excess relative risk of 6.7 and a very rapid decay moving away from the source. No other sources reached statistical significance when city center and incinerator had been included in the model.
Finally, we investigated if there were directional effects with regard to the effect of the incinerator. The appendix shows the results of fitting that model. Although not statistically significant, the point estimates for the directional effects suggested a wind effect from southwest to northeast.
Incidentally, we note the estimates for the levels of particulate: the odds ratios were 1.1 (95% CL, 0.8-1.5) for the second tertile and 1.4 (1.1-1.8) for the highest tertile. When we took into account the distance from the city center and the incinerator, the effect of particulate vanished: second tertile, OR = 1.2 (0.9-1.4); highest tertile, OR = 1.0 (0.7-1.4).
Discussion
The present analysis supports and validates the geographical areas defined in a previous study (1). Indeed, the use of the distance between residential location and sources of pollution as a continuous variable provided a more sensitive approach to spatial modeling of risk than the classification of the residences into four areas on the basis of their proximity to each source. Furthermore, the evidence of higher risk in the neighborhood of the incinerator has been confirmed. The excess relative risks estimated at the city center and at the location of the incinerator appear consistent with that study, as do the shallow and steep declines with distance, respectively.
The model adopted is simple, allowing an exponential decrease by distance from the source. Although several alternatives could be specified (15), we chose the model described here because it could be extended to include more than one source. The peculiar spatial location of the four sources complicates the analysis. The distances from the sources are highly correlated, and the geography of the city is heavily affected by its proximity to the coast.
For these reasons we adopted a forward strategy to select the best-fitting model. The final model contains terms for spatial effects of the city center and of the incinerator. The absence of the other sources from the final model could be due to the indistinguishable effects of the shipyard, the city center, and, to a lesser degree, the iron foundry, which lie on the same line along a north-south direction. The incinerator effects retained statistical significance even when adjusting for individual risk factors and spatial effects of the city center.
The previous analysis based on histological subtypes of lung cancer showed higher relative risks for small cell and large cell carcinoma among residents close to the city center, whereas the relative risk for squamous cell carcinoma and adenocarcinoma was elevated among those residents who lived close to the incinerator (1). The presence of a linear trend by level of particulate deposition was significant for small and large cell cancers. In the present study, for all lung cancers there was a significant increase in risk for those resident in areas in the highest tertile of particulate (>0.298 g/m2/day, OR = 1.4; 95% CL, 1.1-1.8). This effect appeared to be fully explained once distance from city center and incinerator had been included in the model.
This study was mainly a geographical investigation with characterization of environmental exposure by adjustment for total particulate deposition and residence location. Although the spatial pattern of the risk was adjusted for relevant confounders, residual confounding due to other unmeasured exposures cannot be excluded. Background radiation should not be a problem in this area because it is known that radiation follows a gradient, with a minimum at the city center and a maximum in the rural area at the boundary of the province. A selection bias due to the chosen frame of cases and controls cannot be excluded in principle; however, it should be noted that the subject list is derived from the Cancer Registry, which guarantees the coverage of the resident population and provides high-quality data, with autopsies performed on approximately 73% of all deaths. It was impossible to obtain a complete residential history for each subject enrolled. Therefore, misclassification bias due to change in residence cannot be excluded. We note that such an error, whether it arises from nondifferential misclassification or from selective migration of cases (e.g., terminally ill people) out of the risk areas, would push the risk estimates toward the null value. The results shown here are coherent with the hypothesis of an independent effect of residing close to the incinerator and the city center. Further investigations should be undertaken to characterize the types and levels of pollutants from the incinerator and the center of the city.
Appendix
Excess risk of lung cancer as a function of distance from city center, shipyard, foundry, and incinerator considered separately
Null model:

Includes terms for age, smoking habits, occupational exposure and levels of air particulate.
Model 1:

$\alpha_{ce}$ = risk excess in the source (city center) = 2.209
$\beta_{ce}$ = risk decay moving away from city center = -0.0151
Likelihood ratio statistic model 1 vs. null model = 7.435, df = 2
p = 0.0243
Model 2:

$\alpha_{sh}$ = risk excess in the source (shipyard) = 2.033
$\beta_{sh}$ = risk decay moving away from the shipyard = -0.01922
Likelihood ratio statistic model 2 vs. null model = 7.868, df = 2
p = 0.0196
Model 3:

$\alpha_{if}$ = risk excess in the source (iron foundry) = 1.702
$\beta_{if}$ = risk decay moving away from the iron foundry = -0.01692
Likelihood ratio statistic model 3 vs. null model = 5.273, df = 2
p = 0.0716
Model 4:

$\alpha_{in}$ = risk excess in the source (incinerator) = 1.484
$\beta_{in}$ = risk decay moving away from the incinerator = -0.01505
Likelihood ratio statistic model 4 vs. null model = 4.736, df = 2
p = 0.0937
Excess risk of lung cancer as a function of distance from city center and from either the shipyard, foundry, or incinerator
Null model:

Includes terms for age, smoking habits, occupational exposure, levels of air particulate and excess risk as function of distance from the city center.
Model 1:

$\alpha_{ce}$ = risk excess in the source (city center) = 0.9091
$\beta_{ce}$ = risk decay moving away from city center = -0.01855
$\alpha_{sh}$ = risk excess in the source (shipyard) = 1.242
$\beta_{sh}$ = risk decay moving away from the shipyard = -0.02208
Likelihood ratio statistic model 1 vs. null model = 1.089, df = 2
p = 0.5803
Model 2:

$\alpha_{ce}$ = risk excess in the source (city center) = 1.857
$\beta_{ce}$ = risk decay moving away from city center = -0.02439
$\alpha_{if}$ = risk excess in the source (iron foundry) = 5.858
$\beta_{if}$ = risk decay moving away from the iron foundry = -0.1615
Likelihood ratio statistic model 2 vs. null model = 4.889, df = 2
p = 0.0868
Model 3:

$\alpha_{ce}$ = risk excess in the source (city center) = 1.959
$\beta_{ce}$ = risk decay moving away from city center = -0.03523
$\alpha_{in}$ = risk excess in the source (incinerator) = 6.740
$\beta_{in}$ = risk decay moving away from the incinerator = -0.1762
Likelihood ratio statistic model 3 vs. null model = 9.241, df = 2
p = 0.0098
Excess risk of lung cancer as a function of distance from city center and incinerator, including an angular component associated with the incinerator
Null model:

Includes terms for age, smoking habits, occupational exposure, levels of air particulate and excess risk as function of distance from the city center and incinerator.
Model 1:

$\alpha_{ce}$ = risk excess in the source (city center) = 1.873
$\beta_{ce}$ = risk decay moving away from city center = -0.03885
$\alpha_{in}$ = risk excess in the source (incinerator) = 4.045
$\beta_{in}$ = risk decay moving away from the incinerator = -0.1661
$\beta_2$ = -0.6621
$\beta_3$ = -0.1669
Likelihood ratio statistic model 1 vs. null model = 0.5005, df = 2
p = 0.7786