Please answer the discussion questions for both articles:
Section 1 – Overview / summary of the reading – this may include:
- What are the key points?
- What was learned?
- What are the most important issues?
- Why is it important (or not)?
Section 2 – Your reaction and wider implications – this may include:
- What do you agree/disagree with and why?
- What critiques do you have?
- What additional things do you want to learn?
- What related examples have you found or observed in the real-world?
- Links to other relevant materials (websites, videos, etc.)
Int. J. Environ. Res. Public Health 2014, 11, 312-336; doi:10.3390/ijerph110100312
OPEN ACCESS
International Journal of
Environmental Research and
Public Health
ISSN 1660-4601
www.mdpi.com/journal/ijerph
Article
Spatio-Temporal Distribution and Hotspots of Hand, Foot and
Mouth Disease (HFMD) in Northern Thailand
Ratchaphon Samphutthanon 1,*, Nitin Kumar Tripathi 1, Sarawut Ninsawat 1 and
Raphael Duboz 2
1 Remote Sensing and Geographic Information Systems Field of Study, School of Engineering and
  Technology, Asian Institute of Technology, P.O. Box 4, Klong Luang, Pathumthani 12120, Thailand;
  E-Mails: nitinkt@ait.ac.th (N.K.T.); sarawutn@ait.ac.th (S.N.)
2 Computer Science and Information Management Field of Study, School of Engineering and
  Technology, Asian Institute of Technology, P.O. Box 4, Klong Luang, Pathumthani 12120,
  Thailand; E-Mail: raphael@ait.ac.th
* Author to whom correspondence should be addressed;
E-Mail: ratchaphon.samphutthanon@ait.ac.th; Tel.:+66-2524-5799; Fax: +66-2524-5597.
Received: 18 October 2013; in revised form: 15 December 2013 / Accepted: 17 December 2013 /
Published: 23 December 2013
Abstract: Hand, Foot and Mouth Disease (HFMD) is an emerging viral disease, and at
present, there are no antiviral drugs or vaccines available to control it. Outbreaks have
persisted for the past 10 years, particularly in northern Thailand. This study aimed to
elucidate the phenomenon of HFMD outbreaks from 2003 to 2012 using general statistics
and spatial-temporal analysis employing a GIS-based method. The spatial analysis
examined data at the village level to create a map representing the distribution pattern,
mean center, standard deviation ellipse and hotspots for each outbreak. A temporal analysis
was used to analyze the correlation between monthly case data and meteorological factors.
The results indicate that the disease can occur at any time of the year, but appears to peak
in the rainy and cold seasons. The distribution of outbreaks exhibited a clustered pattern.
Most mean centers and standard deviation ellipses occurred in similar areas. The linear
directional mean values of the outbreaks were oriented toward the south. When separated
by season, it was found that there was a significant correlation with the direction of the
southwest monsoon at the same time. An autocorrelation analysis revealed that hotspots
tended to increase even when patient cases subsided. In particular, a new hotspot was
found in the recent year in Mae Hong Son province.
Keywords: Hand, Foot and Mouth Disease (HFMD); spatial statistics; spatial distribution
pattern; temporal distribution pattern; hotspot detection; Geographic Information
Systems (GIS); hybrid ring mapping
1. Introduction
Hand, Foot and Mouth Disease (HFMD) is an emerging illness infecting infants and children. It is
characterized by fever, painful sores in the mouth and a rash with blisters on the hands, feet and
buttocks. HFMD is most frequently caused by coxsackievirus A16 (CA16) and enterovirus 71
(EV-71) [1–3]; however, most patients with fatal complications are infected with EV-71 [4]. At
present, no effective chemoprophylaxis or vaccination approaches for dealing with HFMD are
available [5,6]. The transmission of HFMD occurs from person to person through direct contact with
nasal discharge, saliva or fluid from the blisters. Other infection paths include food or water
contaminated with fecal droplets, nasal discharge, fluid or saliva from an infectious person. Weather
variables may affect the transmission of HFMD either directly or indirectly [7]. Globally, HFMD
outbreaks have been documented for more than four decades. It has been reported that in the last
decade the western Pacific region, including countries such as Japan, Malaysia, Singapore, Thailand
and China, was the area most severely affected by HFMD [8–13]. Other countries, such as Taiwan,
Hong Kong, the Republic of Korea, Vietnam, Cambodia, Brunei and Mongolia, were also impacted.
HFMD has also progressed to become a leading cause of suffering and mortality in some developing
countries, with Thailand finding itself among these. The current epidemic situation in Thailand, as
reported by the Bureau of Epidemiology, Ministry of Public Health, includes HFMD outbreaks during
the past 10 years, with the more severe outbreaks occurring in the last 5 to 6 years. Four deaths were caused by HFMD in
2008, and another four were reported in 2009. In 2011 and 2012, six and two deaths were reported,
respectively. The northern region of Thailand has had the highest infection rate every year from 2003
to 2012 (Figure 1).
Statistically, the number of HFMD patients has trended upward for the past 10 years. The number
of HFMD cases per capita reported by the Epidemiology Center of Thailand reveals that the upper
north of Thailand has the highest outbreak rate, with very high infectivity observed in Lampang,
Phayao, Nan, Chiang Rai and Lamphun provinces. MThai News reported HFMD outbreaks in many
schools in July 2012. Prompted by the situation, the Ministry of Public Health cooperated with the
Ministry of Education to establish a war room and to launch a policy to control the outbreak by closing
more than 100 schools in several provinces with immediate cleaning and disinfection of the sites. If
more than five infected students were detected at a school, the administrator had to close the school for
at least 7 to 10 days to prevent further infections and to clean the school. During the same period, there
was an outbreak in Cambodia, where more than 60 people died due to HFMD within 3 months. This
outbreak was also a major concern for Thailand. The deputy director of the Office of Disease
Prevention and Control reported that new species of the virus were found that had not been seen in the
area before. It is assumed that the virus could have mutated.
Figure 1. HFMD outbreaks in number of patients per million inhabitants at the regional
level, annually from 2003 to 2012 and the 10 year average.
In recent years, several studies have been conducted in different countries to understand the
diffusion and transmission patterns of HFMD. However, very few studies have been conducted in
Thailand. Moreover, attempts to understand the disease have focused solely on the study of medicine
and public health or on its demographic distribution. Thus, a full understanding in terms of the spatial
and temporal characteristics of the disease’s transmission pattern has not yet been established. Here,
the application of GIS technology may prove useful by allowing for a spatial analysis concerning
medical and public health. The assessment of spatio-temporal characteristics and disease associations
with weather can provide valuable information for the efficient allocation of public health resources
for disease prevention and treatment [14–16]. The spatio-temporal features of an infectious disease are
usually driven by certain determinants that can provide invaluable information for exploring the risk
factors of the disease and contribute to developing effective measures to control and prevent its
transmission [17,18]. Spatio-temporal analysis is increasingly used in epidemiological research based
on specific or routinely collected data from different sources [19–21]. Therefore, a better
understanding of the spatio-temporal distribution patterns of HFMD would help in identifying areas
and populations at high risk and then formulating and implementing appropriate regional public health
intervention strategies to prevent and control the outbreak [22].
The spatial dimensions of HFMD incidence have been explored in view of several explanatory
determinants, such as average temperature, average relative humidity and monthly precipitation, to
examine possible spatial variations in the assumed relationship between these factors and the incidence
of HFMD. Climatic factors have been taken into consideration because they may directly or indirectly
affect the transmission of HFMD. HFMD occurs during the summer in temperate regions but at any
time in tropical countries [23]. However, there are limited studies that discuss the association between
weather and the dynamics of HFMD in Thailand.
The focus of the present study was to elucidate the spatial and temporal aspects of the incidence of
HFMD. The first research objective was to study the general distribution of cases in terms of patient
age and gender and the timing of infection. The second objective was to study the spatial distribution
of infections, and the third objective was to detect hotspots in the upper north of Thailand from 2003 to
2012. Finally, hybrid ring mapping was applied to better visualize and understand the spatial and
temporal domains of the results.
2. Materials and Methods
2.1. Study Area
Because the upper northern region of Thailand was the area with the highest HFMD infection rate
in Thailand over all 10 years from 2003 to 2012, this area was selected as the study area.
Geographically, the upper north of Thailand is one of six regions of Thailand and is located between
97°19′8′′E and 101°22′18′′E and 17°11′12′′N and 20°29′1′′N. The region is located approximately
600 kilometers north of Bangkok, covers an area of 93,690.85 km² (9.369 million hectares), or 18.25
percent of Thailand, and has a population of 6,133,208. The area consists of nine provinces: Mae Hong
Son, Chiang Mai, Chiang Rai, Lamphun, Lampang, Phayao, Phrae, Nan and Uttaradit. It borders the
Republic of the Union of Myanmar to the west and north, the Lao PDR to the north and east and is
adjacent to the Tak, Sukhothai and Pitsanulok provinces of Thailand to the south (Figure 2).
Figure 2. Study area: Upper northern region of Thailand.
2.2. Data Acquisition
2.2.1. Disease Data
Disease data were obtained from the Bureau of Epidemiology, National Trustworthy and
Competent Authority in Epidemiological Surveillance and Investigation, Ministry of Public Health of
Thailand. HFMD case data as reported at the village level from January 2003 to November 2012 were
used in this study. The databases comprised the monthly numbers of reported, both apparent and
confirmed, HFMD cases in terms of gender, age and the month of the symptoms.
2.2.2. Meteorological Data
Monthly average temperature (Celsius), monthly total rainfall (millimeters) and monthly relative
humidity (percent) for January 2003 to November 2012 were obtained from the nine weather stations
operated by the Thai Meteorological Department in each province in the study area. The climate of the
upper north of Thailand is subtropical and cooler than the other regions of Thailand. As such, it can be
clearly divided into three seasons as follows: summer: February through May; rainy: June through
September; and cold: October through January.
2.2.3. Village Data
The locations and population data for 7,700 villages in the study area were obtained from the
Information and Communication Technology Center, Ministry of the Interior, Thailand. To conduct a
GIS-based analysis of the spatial distribution of HFMD, points and polygons representing villages
from the administrative boundary layer, which were generated based on the administrative boundary
map, were obtained from the Geo-Informatics and Space Technology Development Agency of
Thailand. All HFMD cases were coded and matched to the village layers by administrative code using
ArcGIS 10.0 software. The accuracy of the village point locations was confirmed by overlaying them
on satellite images.
2.3. Methodology
2.3.1. Distribution Analysis
First, in the distribution analysis the incidence of HFMD was analyzed by population, area and time
using general statistics. The population distribution analysis comprised age group and gender. The
temporal pattern of HFMD infection was analyzed at yearly and monthly intervals separately for each
province. In addition, the correlation coefficients between HFMD cases and climatic factors such as air
temperature, relative humidity and total rainfall from 2003 to 2012 were analyzed for all 10 years by
the Pearson method as well as separately for each season. The distribution of HFMD by area was
analyzed separately for each province. The annual incidence rates from 2003 to 2012 are also given.
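
The Pearson correlation step described above can be illustrated with a minimal sketch. It is not the authors' implementation; the function name `pearson_r` and the four monthly values are hypothetical and chosen only to show the calculation for one season.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly values for one four-month season (not the study's data).
cases = [12, 30, 41, 55]
humidity = [60.0, 68.0, 74.0, 80.0]        # percent
temperature = [29.5, 28.0, 27.2, 26.1]     # degrees Celsius

r_humidity = pearson_r(cases, humidity)        # strongly positive here
r_temperature = pearson_r(cases, temperature)  # strongly negative here
```

With only four months per season, a coefficient can be large in magnitude without being statistically reliable, which is worth keeping in mind when reading the seasonal values.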
2.3.2. Spatial Distribution Analysis
A global spatial analysis by spatial autocorrelation was applied to the HFMD incidence rates to
analyze their distribution over the study area and to identify spatial disease clusters of statistical
significance [19]. The Moran’s I statistical method [24,25] was used to evaluate the spatial
autocorrelation in the distribution of HFMD cases and to determine whether villages were clustered,
random or dispersed in the area with reference to the HFMD morbidity rate. Village locations and
morbidity rates were analyzed for each village. The indices were evaluated by simulation considering
the original location of the villages [26] with Moran’s I values ranging from −1 to +1, where a value
close to “0” indicates spatial randomness and a positive value denotes a positive spatial autocorrelation
and vice versa.
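
As a rough illustration of the global statistic, the sketch below computes Moran's I from a list of rates and a binary adjacency matrix. It assumes the common textbook form I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, where z_i is the deviation from the mean and S0 is the sum of all weights; the function name, the four-village chain and the rates are hypothetical, and the simulation-based significance test is omitted.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    total_sq = sum(d * d for d in dev)
    return (n / s0) * (cross / total_sq)


# Hypothetical example: four villages in a row, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered_rates = [10.0, 9.0, 1.0, 0.0]     # similar values sit together
alternating_rates = [10.0, 0.0, 10.0, 0.0]  # neighbors always differ

i_clustered = morans_i(clustered_rates, chain)      # positive
i_alternating = morans_i(alternating_rates, chain)  # negative
```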
The Average Nearest Neighbor (NN) tool calculates the distance between each featured village
point and the nearest neighboring point. The nearest neighbor distances were then averaged. When the
average distance was less than one, the distribution of the features being analyzed was considered
clustered. An average distance between one and two indicated a random pattern. With an average
distance greater than two, the features were considered dispersed.
The Standard Deviation Ellipse (SDE) measures whether a distribution of features displays a
directional trend (whether features are farther from a specified point in one direction than in another).
The standard distance circle shows the spatial spread of a set of point locations.
The Mean Center (MC) is the average location of a set of points. Here, the MCs of the locations of
the villages having HFMD cases were measured. These indicate the area and extent of the incidence of
HFMD cases.
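
The mean center and the standard distance can be sketched in a few lines. This is an unweighted illustration that assumes projected (planar) coordinates; the function names and the four points are hypothetical, and the ellipse itself, which also needs a rotation angle, is not computed here.

```python
import math


def mean_center(points):
    """Average x and y of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points):
    """Root mean squared distance of the points from their mean center."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


# Hypothetical village coordinates in projected units (for example, km).
villages = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

center = mean_center(villages)        # (1.0, 1.0)
spread = standard_distance(villages)  # sqrt(2)
```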
The Linear Directional Mean (LDM) is the line angle representing the mean orientation of all lines
in the dataset. The LDM can also measure the trend of their lengths and geographic centers. The output
of the LDM is a single line located at the calculated mean center with a length equal to the mean length
and an orientation equal to the mean orientation of all input vectors. The mean direction is calculated
for features that move from a starting point to an end point. In this case study, the LDM was calculated
using the monthly MC movement from January 2003 to 2012. For the future, the LDM trend was
calculated as the average LDM of these 10 years. Another LDM was calculated for each season by
separating monthly MC movements into seasons (summer, rainy and cold). These LDMs show the
LDM trend for each month during the 10 years of the analysis. Moreover, the LDM of each season was
explained in relation to the respective monsoon direction.
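
A minimal sketch of a directional mean over successive mean centers follows. It assumes compass bearings measured clockwise from north and averages unit vectors, so step length does not weight the result; a GIS tool may treat length differently. The function names and the four monthly centers are hypothetical.

```python
import math


def bearing_deg(start, end):
    """Compass bearing (degrees clockwise from north) from start to end."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def linear_directional_mean(centers):
    """Mean bearing of the moves between consecutive mean centers."""
    bearings = [bearing_deg(a, b) for a, b in zip(centers, centers[1:])]
    # Average the directions as unit vectors to handle the 0/360 wrap-around.
    sin_sum = sum(math.sin(math.radians(b)) for b in bearings)
    cos_sum = sum(math.cos(math.radians(b)) for b in bearings)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# Hypothetical monthly mean centers: southeast, then south, then southwest.
monthly_centers = [(0.0, 0.0), (1.0, -1.0), (1.0, -2.0), (0.0, -3.0)]

mean_bearing = linear_directional_mean(monthly_centers)  # about 180 (south)
```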
2.3.3. Hotspot Detection and Analysis
Moran’s I has well-established statistical properties to describe global spatial autocorrelations.
However, it is not effective in identifying clustered spatial patterns. These clusters are also described
as hotspots [27]. Local indicators of spatial association (LISA) permit the decomposition of global
indicators of Moran’s I into the contribution of each individual observation [28]. At the local level, the
mean LISA was calculated to identify the pattern as clustered, random or dispersed. This indicator
considered both village locations and their attributes. In this case, the attributes were the yearly HFMD
morbidity rates, which were also analyzed monthly for each village, and the hotspot locations [28].
The local Moran’s I value was checked at the local level of spatial autocorrelation to identify extreme
and geographically homogeneous HFMD morbidity rates [28,29]. The local indicator allowed the
identification of HFMD hotspots where the value of the index was extremely pronounced across
localities, as well as those of spatial outliers. The simulation used a permutation of the values among
neighbors. The significance level was set at 0.05. A Moran scatter plot was created with a standardized
HFMD. In this study, the village location points were converted to Thiessen polygons. This follows a
definition of neighbors based on common boundaries between polygons. A spatial weights matrix was
calculated to define the local neighborhood surrounding each village polygon. This was created to
support the spatial autocorrelation measure.
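
The local statistic can be sketched as follows, assuming row-standardized weights built from neighbor lists, so that I_i = (z_i / m2) * mean of z_j over the neighbors of i, with m2 the mean squared deviation. The function name, the five-village chain and the rates are hypothetical, and the permutation test at the 0.05 level is omitted.

```python
def local_morans(values, neighbors):
    """Local Moran's I and quadrant label for each observation.

    neighbors maps each index to a non-empty list of neighboring indices.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n
    results = []
    for i in range(n):
        # Spatial lag: average deviation of the neighbors (row-standardized).
        lag = sum(dev[j] for j in neighbors[i]) / len(neighbors[i])
        stat = dev[i] * lag / m2
        own = "high" if dev[i] > 0 else "low"
        around = "high" if lag > 0 else "low"
        results.append((stat, own + "-" + around))
    return results


# Hypothetical morbidity rates for five villages whose polygons form a chain.
rates = [9.0, 10.0, 8.0, 1.0, 2.0]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

local = local_morans(rates, adjacency)
labels = [label for _, label in local]
```

In this toy case the first two villages fall in the high-high quadrant, which is the pattern the text calls a hotspot, while the third is a high-low spatial outlier.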
3. Results
3.1. General Distribution
The trend in the number of patients varied over the analyzed decade, first increasing from 2003 to 2007,
then decreasing from 2007 to 2009, and finally reversing with a rapid increase from 2009 to 2012. The
highest HFMD infection rates in 10 years were reported in 2012, with 5,500 cases in the study area.
The disease distribution was analyzed based on the age groups of patients. The highest frequencies
were found among 1- and 2-year-olds, representing 31.81 and 31.05 percent of patients,
respectively (Table 1). Most patients were less than 5 years old, accounting for an average of
95.37 percent of all cases. The number of patients rapidly decreased with age. However, it is notable
that the trend of patient age changed significantly over the past 10 years, with infections of one-year-olds
clearly decreasing while the number of patients aged 2, 4, 5 and 6 years clearly increased.
The ten years of data reveal that the number of male patients was slightly higher than that of
females. When separated by age, 1- and 2-year-old males were more likely to be infected than their
female peers. In contrast, in children aged 10 years and older, the opposite pattern occurred. Because
the majority of patients were aged 0 to 5 years (95.37 percent), the results of only this age group were
compared in this study. Male patients accounted for 56.81 percent and females for 43.19 percent of
cases. Thus, the ratio of male to female patients was 1.3153 compared with the overall sex ratio for
0- to 5-year-olds of 50.68 percent boys to 49.32 percent girls, or 1.0275 boys per girl. This ratio
indicates that boys are more likely to get infected by the virus than girls in the same age group. The
overall mean age of male children with HFMD was 2.6 years, which is slightly lower than the mean
age for females of 2.8 years. The highest mean age, 3.3 years, was recorded in Nan province. Lampang
and Lamphun provinces reported the lowest mean ages of approximately 2.3 years. Patients have been
reported from every province, ranging from infants to the elderly. Chiang Mai recorded the oldest
patient, aged 78. For the period from 2003 to 2012, Chiang Rai and Lampang provinces had the
highest average occurrences of HFMD, with infection rates of 22.82 and 20.30 cases per 100,000
inhabitants, respectively.
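
The sex-ratio comparison above can be reproduced from the reported percentages. The sketch below only restates that arithmetic; the function and variable names are hypothetical.

```python
def sex_ratio(male_share, female_share):
    """Number of males per female, given the two percentage shares."""
    return male_share / female_share


# Shares reported in the text for children aged 0 to 5 years.
patient_ratio = sex_ratio(56.81, 43.19)     # about 1.315 boys per girl
population_ratio = sex_ratio(50.68, 49.32)  # about 1.028 boys per girl

# How over-represented boys are among patients relative to the population.
relative_ratio = patient_ratio / population_ratio  # about 1.28
```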
Looking at the average monthly infections over ten years, the month of highest HFMD infection
appears to be June, with 12.69 percent of all infections, followed by July with 11.92 percent, both
occurring during the rainy season, whereas November, with 11.02 percent, is in the cold season. The
lowest incidence occurred in April, with 2.65 percent, followed by March, with 4.40 percent, both
belonging to the summer period.
Table 1. Distribution characteristics of HFMD by population, area and time, 2003–2012.

                  2003    2004    2005    2006    2007    2008    2009    2010    2011    2012       %

Age (Percentage)
0                10.50    8.26    6.65    7.50    4.95    6.98    4.95    5.79    6.05   11.60    7.55
1                35.71   45.45   34.61   36.29   31.68   31.79   30.22   29.66   30.56   32.25   31.81
2                23.53   19.83   31.51   29.47   33.81   30.65   31.52   32.27   32.41   29.04   31.05
3                15.13   12.40   16.67   12.28   17.09   16.99   17.89   15.32   18.14   13.55   15.93
4                 3.78    4.96    5.37    6.68    6.52    6.76    7.00    7.53    6.41    5.89    6.40
5                 1.26    1.65    1.91    2.32    2.26    2.44    3.22    2.87    2.84    2.78    2.62
6                 1.26    0.83    0.64    0.68    1.30    1.07    1.42    1.43    1.27    1.75    1.35
7                 0.42    0.83    0.36    1.09    0.57    0.85    0.62    1.02    0.73    0.78    0.74
8                 0.84    0.83    0.55    0.14    0.27    0.52    0.74    0.61    0.30    0.53    0.47
9                 0.42    0.83    0.46    0.82    0.23    0.33    0.37    0.15    0.39    0.33    0.34
10                0.00    1.65    0.46    0.27    0.40    0.41    0.37    0.36    0.24    0.31    0.35
>10               7.14    2.48    0.82    2.46    0.93    1.22    1.67    2.97    0.67    1.20    1.39

Gender (Number)
male               140      70     629     414   1,712   1,484     922   1,112   1,909   3,130   56.81
female              98      51     469     319   1,296   1,224     693     840   1,399   2,370   43.19
total              238     121   1,098     733   3,008   2,708   1,615   1,952   3,308   5,500     100

Province (Ratio per 100,000 Inhabitants)
Chiang Mai        0.75    0.12    6.12    3.50   17.78   28.74    9.37   15.24   34.44   80.73   16.01
Chiang Rai        0.49    0.99   10.37   13.22   36.90   56.71   28.37   50.58   59.48  126.39   22.82
Lampang          17.56    8.86   62.96   26.75  115.49   81.81   50.88   35.17   79.60   57.42   20.30
Lamphun          10.02    0.74   11.61   10.11  105.14   42.70   26.93   18.29   37.88   42.58    6.11
Mae Hong Son      5.04    1.23   32.33    9.01   17.66   18.60   17.37   20.19   61.87  194.63    4.58
Nan               0.62    1.05   24.89   23.66   76.67   29.62   54.04   57.52   94.42   81.62   10.44
Phayao            2.19    4.10   16.02   12.13   46.86   64.01   31.61   36.81   65.99  147.59   10.26
Phrae             0.62    0.21    9.33   10.46   29.41   26.32   17.31   31.25   41.20   35.10    4.59
Uttaradit         2.08    1.28    2.34    4.49   36.11   23.48   19.87   23.35   34.92   66.37    4.89

Month (Number)
January              5      46       7      48      33     963      28     309      59     387    9.29
February             3      17      11      56      19     367      53     393      35     380    6.58
March                4      22       7      43      12     157      96     247      40     264    4.40
April                4       4      13      43       8      80      36      90      42     218    2.65
May                  5       9     152      81      31     195      97     179     149     405    6.42
June                11       5     329      78      78     320     268     209     466     809   12.69
July                21       3     196      79     134     173     146     134     345   1,187   11.92
August              13       8      98      74     117     176     138     112     324     620    8.28
September           24       2      94      90     202      96     209      85     506     514    8.98
October             45       2      79      67     386      69     172      59     444     443    8.71
November            39       1      77      47     980      63     198      68     488     273   11.02
December            64       2      35      27   1,008      49     174      67     410       0    9.05
3.2. Temporal Distribution Pattern
The investigation of the temporal distribution of infections in each province from 2003 to 2012
revealed that most maxima in terms of the number of patients occurred in the latter part of the decade,
mostly in 2011 and 2012, with the exception of Lamphun and Lampang provinces, where the highest
patient case counts occurred in 2007. It is notable that some outbreaks in neighboring areas had similar
temporal patterns. In Phrae and Nan, outbreaks occurred during the same period in 2011, and likewise,
Chiang Rai and Phayao saw an outbreak peak in 2012 (Figure 3a).
Figure 3b shows the rates of HFMD cases per capita for each year. The annual disease incidence
rate was highest in Lampang province, located in the central part of the study area, from 2003 to 2008.
The area with the highest incidence rate then moved east to Nan from 2009 to 2011. In 2012, the
highest incidence rate was recorded in Mae Hong Son province, located on the western side of the
study area.
Figure 3. (a) Bar charts of the HFMD infection rate for each year by province, shown as
average case counts for 2003–2012 per 100,000 inhabitants. (b) Yearly Ranking of HFMD
cases per capita.
The monthly reported cases of HFMD are summarized in Figure 4. The highest peaks of infections
per year occurred in different months. However, most outbreaks occurred during the rainy and cold
seasons, except for 2010, when a peak occurred in February. Over the entire investigated period, April
was the month with the fewest detected outbreaks, particularly in 2006 and 2007.
Figure 4. Monthly HFMD cases from 2003 to 2012.
[Figure 4 plot: x-axis: month (JAN to DEC); y-axis: cases (0 to 1,400); one series per year, 2003 to 2012; summer, rainy and cold seasons marked.]
The HFMD outbreaks do not directly correlate to the annual seasonal cycle. The temporal outbreak
pattern can be described as five waves with a wave occurring every two years (Figure 5). However, the
analysis of the correlation between climatic factors and HFMD incidence in terms of Pearson’s
correlation coefficient revealed that the overall annual average temperature had a weak negative
correlation with HFMD incidence (−0.123), whereas relative humidity and total rainfall exhibited
weak positive correlations with HFMD incidence (0.166 and 0.045, respectively).
Figure 5. HFMD case counts per month from 2003 to 2012 related to climatic factors.
[Figure 5 plot: monthly time series, 2003 to 2012; axes: HFMD cases; temp, humid, rainfall; series: HFMD, meanTemp, avgRHumid, totalRAIN.]
When analyzed by season, the summer periods were consistent across the 10 years. Temperature
had a negative correlation with disease incidence in six of the 10 years, humidity had a positive
correlation in 9 of the 10 years and rainfall a highly positive correlation in seven of the 10 years. The
rainy season exhibited a significant difference in relationship, but no clear relationship was detected for
the cold season (Table 2). These results indicate that lower temperatures and higher humidity may
cause an increase in outbreaks in the summer season. In contrast, during the rainy season, increased
outbreaks appear to be associated with higher temperatures and lower humidity.
Table 2. Pearson’s correlations between climatic factors and HFMD infection rates by season.
      Summer                           Rainy Season                     Cold Season
Year  Temperature  Humidity  Rainfall  Temperature  Humidity  Rainfall  Temperature  Humidity  Rainfall
2003  +0.808       +0.015    +0.992    +0.068       +0.601    +0.721    +0.167       −0.688    −0.171
2004  −0.561       −0.433    −0.575    +0.840       −0.532    −0.912    −0.326       −0.834    −0.518
2005  +0.602       +0.942    +0.762    +0.881       −0.705    −0.628    +0.899       +0.990    +0.696
2006  −0.554       +0.905    +0.693    −0.221       +0.069    −0.388    +0.939       +0.891    +0.808
2007  −0.305       +0.925    +0.734    −0.814       +0.740    +0.317    +0.017       +0.350    −0.174
2008  −0.980       +0.241    −0.383    +0.976       −0.940    −0.566    −0.319       −0.765    −0.316
2009  +0.005       +0.302    +0.431    −0.347       −0.084    +0.586    +0.663       +0.439    +0.269
2010  −0.941       +0.333    −0.544    +0.949       −0.940    −0.562    −0.267       −0.838    −0.277
2011  +0.623       +0.887    +0.857    +0.160       −0.003    −0.528    +0.589       +0.673    +0.321
2012  −0.481       +0.742    +0.460    −0.171       −0.286    −0.398    −0.113       −0.193    +0.280
3.3. Spatial Distribution Pattern
The HFMD-affected villages were classified based on the Jenks natural breaks optimization
method. Figure 6 presents the annual incidence rates from 2003 to 2012. In 2003, the villages with
high incidence rates were mostly found in the center of the study area (Lamphun and Lampang
provinces). Later, however, the highest rates spread toward the east, particularly into the north-south
belt (Chiang Rai and Lampang provinces) and Nan province in 2007. After that, a general spread
across the study area can be observed. In recent years, outbreaks were observed in three main groups,
located in Chiang Rai-Phayao and the central parts of Chiang Mai and Mae Hong Son provinces.
The population density map displays the high densities of the larger municipal areas in every
province except Mae Hong Son, which has no clear cluster. The comparison of the annual HFMD
incidence maps and the population density map indicates a correlation. In general, high HFMD
incidence rates were found in areas with a high population density except for Mae Hong Son province,
where many patient cases occurred, particularly in 2012.
Figure 6. Annual HFMD incidence rates (2003–2012) and population density of the
provinces in the study area in Northern Thailand.
[Figure 6 maps: one panel per year, 2003 to 2012, plus a Population Density panel.]
3.4. Moran’s I
The HFMD distribution patterns at the village level were established by analyzing the raw infection
rate (patient cases per unit population) in terms of spatial autocorrelation (Moran’s I). The results of
the Moran’s I analysis presented values higher than zero for almost every year. Table 3 shows that
most years exhibited a clustered pattern except for 2007, which displayed a dispersed pattern. All years
showed different levels of clustering; 2004 had the highest values for both the Moran’s I and Z scores
(0.634 and 8.645, respectively), with most groups located in lower-central Lampang province (Khokha
and MaeTha districts). The second and third highest values were observed in 2005 and 2010, respectively.
Although most annual distribution values were low, when separated by season very prominent
values were detected for the rainy season, particularly in 2006, 2007 and 2009 through 2012. For the
cold season, the majority of cluster patterns were found in 2009, followed by 2012 and 2006. For the
summer season, which generally exhibited only low numbers of affected villages, the most clustered
pattern was found in some areas in 2004.
3.5. Mean Center (MC), Standard Deviation Ellipse (SDE) and Linear Directional Mean (LDM)
The analysis of the annual spatial MC revealed that most clusters occurred in the central portion of
the study area in the north of Lampang province. The MC for 2003 was in a different location and was
far less pronounced compared with the other years. The SDE of 2003 was the smallest, falling within
the main area of Lampang and Lamphun provinces. Most SDEs appeared at the same location with the
same shape, size and northeast-southwest direction during 2004 to 2012. They appeared primarily in
7 of the 9 provinces, namely in Chiang Mai, Chiang Rai, Lamphun, Lampang, Phayao, Phrae and Nan
provinces (Figure 7a).
Before producing the yearly diffusion of the disease as shown in Figure 7b, the movement of the
monthly MC location was examined. It was found that most LDMs were directed toward the south,
except in 2006, 2007 and 2010. However, the LDM trend displayed a northeast-southwest direction
with an orientation of 220 degrees. This result indicates the global trend of HFMD spread in the study area.
Table 3. Annual and seasonal global spatial autocorrelation analysis of HFMD.
      Annual                    Summer                    Rainy Season              Cold Season
Year  Moran   Zscore  Pattern   Moran   Zscore  Pattern   Moran   Zscore  Pattern   Moran   Zscore  Pattern
2003  0.033   1.345   C         −0.12   −0.19   D         1.15    3.03    C         2.66    5.00    C
2004  0.634   8.645   C         8.76    16.06   C         −0.14   −0.71   D         0.14    1.22    C
2005  0.161   0.161   C         5.88    11.23   C         2.11    15.21   C         2.32    9.36    C
2006  0.098   0.189   C         1.20    3.72    C         6.50    34.75   C         4.20    13.58   C
2007  −0.002  −0.006  D         5.26    7.99    C         5.27    31.25   C         3.19    56.88   C
2008  0.032   0.162   C         7.58    64.67   C         1.82    13.61   C         2.50    16.52   C
2009  0.082   0.335   C         0.18    0.75    C         3.82    31.69   C         7.02    35.51   C
2010  0.108   8.073   C         1.77    13.76   C         4.08    25.35   C         0.77    3.74    C
2011  0.040   0.202   C         0.33    1.05    C         3.14    41.27   C         0.55    6.94    C
2012  0.026   0.179   C         6.06    68.35   C         4.59    105.06  C         4.96    40.80   C
Note: C = Clustered pattern; R = Random pattern; D = Dispersed pattern.
Figure 7. (a) HFMD Mean Center and Standard Deviation Ellipse and (b) Linear
Directional Mean from 2003 to 2012.
3.6. Hotspot Detection and Analysis
Hotspot detection means the identification of high-HFMD-incidence villages that are surrounded by
other villages with a high HFMD incidence. The result of the Moran’s I scatter plot illustrates the
spatial autocorrelation of HFMD incidences at the village level with a significant Moran’s I value
(p-value < 0.05). The number of hotspots increased from 2003 to 2012, spreading from their origin in
the lower central region of the study area in 2003 (Lamphun and Lampang provinces, Figure 8).
In 2004, the hotspot-containing area moved to northern Lampang province. Then, 2005 saw the area
move to the border area of Lampang and Chiang Rai provinces. In 2006, hotspots emerged in the
eastern and northeastern margins of Nan and Chiang Rai provinces. In 2007, hotspots appeared to have
scattered across the entire area except Mae Hong Son, Phayao and Uttaradit provinces. In 2008, the
hotspot-containing area was found to be smaller than during the previous year, limited to the central
and northern part of the study area. In 2009, hotspots spread to the margin areas again, and in 2010,
a similar development followed in Chiang Rai, Mae Hong Son and Nan provinces. In 2011, distinct
hotspots were observed clustered in the east and the north of the study area, and in 2012, most hotspots
were located in the north and west of the study area.
The highest number of hotspots was observed in the last year, 2012 (Figure 9), totaling 75 villages,
which can be divided into two groups: those in the northern part of Chiang Rai province and those at
the northern tip of Mae Hong Son province. With 29 hotspots, Chiang Rai had the highest count,
followed by Mae Hong Son with 25. Classifications by month illustrate the hotspots’ development
during this year. No hotspots were found in Uttaradit province. It can be noted that although many
villages were significantly affected in Chiang Mai, Lampang and Uttaradit provinces, the hotspots in those
areas disappeared. Numerous hotspots were only observed in Chiang Rai and Mae Hong Son province.
There was a significant local Moran’s I as classified by association. The red (high-high) and blue
points (low-low) are indicators of spatial clusters, opposite the cyan (low-high) and pink points (high-low),
which are indicators of spatial outliers. The meanings of red, blue, cyan and pink are as follows: high
infection rate surrounded by high rates, low surrounded by low, low surrounded by high and high
surrounded by low, respectively.
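The four association types described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name `lisa_quadrants` and the neighbor-list input are hypothetical, every unit is assumed to have at least one neighbor, and no significance test is applied, whereas the maps in the article show only significant locations.

```python
def lisa_quadrants(values, neighbors):
    """Label each unit HH, LL, LH or HL from its own deviation and the
    average deviation of its neighbors (the spatial lag).

    neighbors[i] is the list of indices adjacent to unit i (assumed
    non-empty). A deviation of exactly zero is treated as low.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    labels = []
    for i in range(n):
        nbrs = neighbors[i]
        lag = sum(dev[j] for j in nbrs) / len(nbrs)
        own = "H" if dev[i] > 0 else "L"
        around = "H" if lag > 0 else "L"
        labels.append(own + around)  # e.g. "HH" is a hotspot candidate
    return labels
```

In this labeling, HH and LL correspond to the red and blue cluster points, and LH and HL to the cyan and pink outlier points.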
Figure 10 shows the changes in the number of affected villages, the number of cases and the
number of hotspots from 2003 to 2012. The number of affected villages followed a similar trend to that
of the number of cases over nearly the entire period. In contrast, the trend in the number of hotspots
differed, especially in the period from 2007 to 2012. After 2007, the number of affected villages and
HFMD cases decreased, whereas the number of hotspots increased continuously until 2012.
The line graph indicates that HFMD outbreaks are trending toward greater severity. In particular, the new
outbreak in the northern part of Mae Hong Son province (Mueang, Pangmapha and Pai districts)
should be closely watched to prevent HFMD outbreaks in the future.
The Hybrid ring maps illustrate the combination of the results of the spatial and temporal analyses.
Ring mapping aids in better understanding HFMD outbreak patterns in both geographical and temporal
terms. In addition, the map of centers, also called a morbidity map, which is derived from the kernel
density method, can illustrate the number of patient cases. The map also displays hotspots derived by
the LISA method, which answers the question of “how to”. Moreover, the rings visually display the
result of the spatial distribution analysis of four values: patient cases per area, patient cases per capita,
NN and Moran’s I value. This was carried out for every year from 2003 to 2012, but for simplicity,
only the ring maps showing the highest (2012) and lowest (2004) HFMD incidences were
included in Figure 11.
The Hybrid Ring Maps for both years represent clear differences in spatial and temporal results.
Although the smallest outbreak was observed in 2004, interestingly, both hotspots and high morbidity
can be observed in the same area, located in the northern part of Lampang province. The inner ring
showed more clustering (81%–100%) in this province than in the surrounding provinces. The monthly
variation in outbreaks (outer ring) was also higher than in the other provinces. However, as represented
by the outer ring of 2004, several other provinces, including Phayao, Uttaradit and Lampang, also saw
many outbreaks in January.
Figure 8. HFMD hotspot development from 2003 to 2012.
[Map panels, one per year: 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012.]
Figure 9. Moran’s I scatter plot matrix and LISA cluster map of HFMD for 2012.
[Legend: High-High (Hotspot), Low-High, Low-Low, High-Low.]
Figure 10. Number of affected villages, number of cases and number of hotspots in the
upper north of Thailand during 2003–2012.
[Line graph. Horizontal axis: year, 2003 to 2012. Series: No. of affected villages (n), No. of outbreak cases (n), No. of hotspots (n). Villages and cases are plotted on one vertical axis (0 to 6000), hotspots on the other (0 to 80).]
Among the 10 years analyzed, 2012 included the highest number of outbreaks, as shown in Figure 11.
The highest morbidity was observed in the central regions of Chiang Mai (Mueang, Sansai and
Doisaket districts) and Chiang Rai (Mueang district) provinces. Hotspots were found clustered in the
northern part of Chiang Rai province and the north of Mae Hong Son province. However, the highest
spatial density was observed in Chiang Rai and Phayao provinces (shown in the inner ring). Although no
high-morbidity area was observed in Mae Hong Son province, many hotspots (appearing in the
Mueang and Pangmapha districts) and high patient case numbers per capita (second inner ring) were
observed. The Moran’s I and NN values for this province were also high. In 2012, the 3rd and 4th
inner rings showed the highest clustering in Phrae province, although fewer hotspots and lower
morbidity were seen in this province. In this year, most provinces, including Mae Hong Son, Chiang
Rai and Phayao, exhibited a clear increase in outbreaks during June and August (rainy season).
Figure 11. Spatio-temporal ring map: The inner ring shows the spatial distribution of
hotspots and morbidity; the outer ring shows the temporal distribution.
4. Discussion
HFMD has recently become a health issue in Thailand. Most outbreaks occur in the northern region,
which was chosen as the study area, and are likely to intensify, as 2012 in particular saw a sharp and
unprecedented increase in reported cases. Knowledge and research on this disease have been limited
regarding epidemiology and public health. The results reported in this article are different from those
limited to statistical distributions and spatial analysis. GIS was employed to enhance the understanding
of distribution patterns in the study area. Although GIS has been used in medical and health
applications in Thailand for more than 10 years, it appears not to have been applied to studies of
HFMD until now, while the number of patients has been increasing rapidly. This report introduced
Ring Mapping to improve the visualization of the results. This may be an advantage not only for
medical and public health organizations but also for the general public in the affected area to further
understanding of the disease and the necessity for its surveillance.
The results revealed that most patients were infants and children under the age of 5 years, which
corresponds to the studies conducted in other countries [4,5,8,9,11,22,30–38]. According to
seroepidemiological studies [39,40], more than 50% of children under the age of 5 years lack
neutralizing antibodies against EV71 and CA16. Additionally, in a previous study, the levels of
circulating antibodies had waned so rapidly only 1 month after birth that none of the infants tested had
maternal antibodies to EV71 [12]. A recent seroepidemiological study showed that the level of
maternal antibody titers declined markedly during the first 7 months after birth, then increased
significantly from month 12 to months 27–38. This explains why the majority of patients belonged to
the age group of 1- to 2-year-olds, with the incidence rate decreasing with increasing age. The results
confirmed that the peak incidence occurs in the 1 year age group [4,5]. However, the infection rate is
still likely to increase in the 2-, 4- and 6-year-old age groups and thus might shift from nurseries to
kindergartens and schools, where children are more exposed to infection risk by intensive contact and
activities with peers. These results reflect those of a study conducted in Hong Kong that found that the
number of older children (>5 years) infected increased from 25.4% in 2001 to 33.0%
in 2009 [37].
Outbreaks were found to affect males more than females, which is in line with previous studies [4,5,38]
that found that boys were more susceptible to enterovirus than girls. Boys may be more
physically active than girls, which might lead to more physical contact, promoting the spread of
HFMD [4]. The results presented here also indicate that the patient sex ratio (male:female) was higher
than the population sex ratio. Remarkably, the number of female patients aged 10 years or older was
higher than the number of males of the same age. The reasons for the higher infection and disease
outbreak risk in children might be that they are less likely to have developed protective antibodies than
adults [18]. Moreover, many children of this age are attending nursery school or kindergarten, where
they stay together with their peers of the same age group.
The temporal distribution over the 10-year study period revealed an outbreak pattern recurring
every two years, with the largest wave observed between 2011 and 2012, whereas in Sarawak, HFMD
outbreaks are reported to occur every three years [23]. Looking at the results by province, the peak
outbreak at the end of the decade, in 2011 and 2012, was observed in all provinces with the exception
of Lampang and Lamphun provinces, which saw a peak outbreak in 2007. The endemic source
analysis investigating the annual outbreak foci found most of them in Lampang province. Both
results match significantly. It is possible that Lampang, after facing the largest and longest outbreak in
the past, is reaching the end of the epidemic cycle.
The distribution classified by season found that HFMD appeared primarily during the rainy season
and in winter, similar to its occurrence in Yunnan in the south of China, where outbreak peaks were
observed in May as well as winter [41]. In contrast, April, during the Thai summer, showed HFMD
incidence rates that clearly decreased to the lowest level almost every year. A correlation analysis
between HFMD cases and temperature over a long-term period showed a negative result. This
contrasts with the results of a study conducted in China, Malaysia, Taiwan, Japan and England that
recorded a trend toward peaks in HFMD cases in April [3,4,8,22,23,42]. A study in Belgium found that
HFMD infections usually occur throughout the year [43]. In Hong Kong, HFMD occurrence is
reported to peak in winter, perhaps due to an increase in winter temperature [37]. Both results showed
positive relationships with humidity and rainfall. The reason for the increase in outbreaks in the rainy
season may be a weakened immune defense system, especially in children. On the other hand, cooler
weather may be associated with behavioral patterns that lead to increased contact among children, e.g.,
sharing toys and other items in child-care centers or kindergartens, thereby contributing to the spread
of EV71 infections [12]. In summer, the negative correlation result may be due to higher temperatures
that may destroy or hinder the growth of the virus. The long school holiday in summer may also
contribute to the low incidence rate. Most children stay at home, so there is less physical interaction
with their peers than at school. Although the exact mechanism is unclear, we can conclude that
weather factors have a significant influence on the HFMD infection rate [7,34]. HFMD does occur in
temperate countries during summer, but in the tropical countries, it occurs throughout the year [7,23,44,45].
In contrast, the results of a previous study conducted in Tokyo, Japan indicated that higher temperature
and humidity may have stimulated the increased HFMD incidence observed between 1999 and 2002 [13].
GIS and a global spatial analysis method with spatial autocorrelation were applied to study the
occurrence of the disease at the village level over the study period from 2003 to 2012. The results
showed that the distribution patterns were clustered throughout all years except 2007, during which the
cases exhibited a dispersed pattern. The most intense clustering occurred in 2004, with a Moran’s I of
0.634. Viewed by season, the most prominent outbreaks were found during the rainy and cold seasons.
The mean center site remained in the northern part of Lampang province over the whole period. The
Standard Deviation Ellipses of the villages affected by the disease covered nearly the entire area in the
middle north and the northeast with seven out of all nine provinces. The trend of the global diffusion
pattern was from northeast to southwest. The Linear Directional Mean classified by season revealed
the influence of the annual monsoon (Figure 12). Significant differences occurred between the
individual rainy seasons, but all LDMs followed a similar path in the northern direction. This
corresponds to the southwest monsoon during this time of the year, and it indicates that the movement
of the MC during this period is influenced by the southwest monsoon. This is consistent with studies
conducted in China reporting that from May to June, intense disease clusters began to move from the
south to the north [46]. The LDMs in the summer months showed different, irregular patterns. The
monthly LDMs of the cold season, even though not clearly associated with the northeast monsoon,
showed significantly similar patterns.
Hotspot detection is useful in finding a local cluster. The results indicated that the starting points lie
in high morbidity areas. However, in the last year of the study period, they were found even in areas of
low morbidity, especially in the west of the study area. Hotspots were found in two main groups in
Chiang Rai and Mae Hong Son provinces, with only sporadic hotspots in other provinces. The hotspots
were not associated with the number of patients because their number does not decrease along with the
number of patients. This means that the probability that a village surrounded by other villages infected
with HFMD is a hotspot increases exponentially. The results indicate that the trend of severe HFMD
outbreaks has sharply increased from 2003 to 2012. Therefore, all agencies are urged to take measures
to prevent outbreaks of this disease.
Figure 12. Linear directional mean for each season and the annual monsoon direction in Thailand.
[Panels: Summer season, Rainy season, Cold season, Monsoon in Thailand.]
5. Conclusions
This study aimed at understanding the phenomenon of HFMD outbreaks and their spatio-temporal
patterns in the upper north of Thailand from 2003 to 2012 using GIS tools. The demographic analysis
of patients revealed some interesting facts. It was found that 95.37% of cases were observed in
children less than 5 years old. Female children in this age group were less affected than males. Studies
in other countries reported similar findings. The male to female ratio of patient cases was 1.32. Cases
in children less than one year of age occur much less frequently. Children in the 4–6 age groups are
most vulnerable. There are clear indications that children attending kindergarten are more susceptible
to HFMD; this may be vital information for public health officers seeking to control the spread of
this disease.
The trends of outbreaks and hotspots were also investigated using maps and graphs. Spatially
distributed clusters were observed in different provinces in every year except 2007. It was also
revealed that the number of hotspots is rising, though the number of cases is slightly declining. This
clearly indicates the severity of this disease in new locations, which is alarming if proper measures are
not taken to control outbreaks. As revealed by the LDM analysis of different seasons, the diffusion of
outbreaks during the rainy season was higher, moving from the southwest toward the northeast, which
coincides with the direction of the annual southwest monsoon.
The temporal analysis on a monthly basis revealed that outbreaks occurred approximately every two
years, mostly during the rainy and cold seasons. This implies a significant correlation of HFMD
incidence and climatic factors. In the future, this finding may be further consolidated by including
more climatic factors and variability. From this research, it can be concluded that GIS proved to be a
powerful tool in monitoring HFMD outbreak patterns both spatially and temporally. The results may
be utilized by public health officials and the general public to spread greater awareness and to take
effective measures to prevent the disease.
Acknowledgments
We would like to thank the Bureau of Epidemiology, Ministry of Public Health, the Meteorological
Department, Ministry of Interior, the Meteorological Department and Geo-Informatics and the Space
Technology Development Agency of Thailand for providing invaluable information.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Melnick, J.L. Enterovirus: Polioviruses, Coxsackieviruses, Echo-Viruses, And Newer Enteroviruses.
In Field’s Virology. Fields, B.N., Knipe, D.M., Howley, P.M., Chanock, R.M., Melnick, J.L.,
Monath, T.P., Eds.; Lippincott-Raven Publishers: Philadelphia, PA, USA, 1996; pp. 655–712.
2. Tunnessen, W.W., Jr. Erythema Infectiosum, Roseola, and Entero-viral Exanthems. In Infectious
Diseases. Gorbach, S.L., Bartlett, J.G., Blacklow, N.R., Eds.; WB Saunders Co: Philadelphia, PA,
USA, 1992; pp. 1120–1125.
3. Chen, K.T.; Chang, H.L.; Wang, S.T.; Cheng, Y.T.; Yang, J.Y. Epidemiologic features of
hand-foot-mouth disease and herpangina caused by enterovirus 71 in Taiwan, 1998–2005.
Pediatrics 2007, 120, e244–e252.
4. Zhu, Q.; Hao, Y.; Ma, J.; Yu, S.; Wang, Y. Surveillance of hand, foot, and mouth disease in
mainland China (2008–2009). Biomed. Environ. Sci. 2011, 24, 349–356.
5. Deng, T.; Huang, Y.; Yu, S.; Gu, J.; Huang, C.; Xiao, G.; Hao, Y. Spatial-temporal clusters and
risk factors of hand, foot, and mouth disease at the district level in Guangdong province, China,
2013. PLoS ONE, 2013, e56943, doi: 10.1371/journal.pone.0056943.
6. Hu, M.; Li, Z.; Wang, J.; Jia, L.; Liao, Y.; Lai, S.; Guo, Y.; Zhao, D.; Yang, W. Determinants of
the incidence of hand, foot and mouth disease in China using geographically weighted regression
models. PLoS ONE. 2012, 7, e38978, doi:10.1371/journal.pone.0038978.
7. Leong, P.F.; Labadin, J.; Rahman, S.B.A.; Juan, S.F.S. Quantifying the relationship between the
climate and hand-foot-mouth disease (HFMD) incidences. In Proceedings of Modeling,
Simulation and Applied Optimization (ICMSAO), 2011 4th International Conference, Universiti
Malaysia Sarawak, Sarawak, Malaysia, April 2011; pp. 19–21.
8. Ho, M.; Chen, E.R.; Hsu, K.H.; Twu, S.J.; Chen, K.T.; Tsai, S.F.; Wang, J.R.; Shih, S.R.
An epidemic of enterovirus 71 infection in Taiwan. N Engl. J. Med. 1999, 341, 929–935.
9. Ang, L.W.; Koh, B.K.; Chan, K.P.; Chua, L.T.; James, L.; Goh, K.T. Epidemiology and control of
hand, foot and mouth disease in Singapore, 2001–2007. Ann. Acad. Med. Singapore. 2009, 38,
106–112.
10. Chatproedprai, S.; Theanboonlers, A.; Korkong, S.; Thongmee, C.; Wananukul, S.; Poovorawan, Y.
Clinical and molecular characterization of hand-foot-and-mouth disease in Thailand, 2008–2009.
Jpn. J. Infect. Dis. 2010, 63, 229–233.
11. Liu, M.Y.; Liu, W.; Luo, J.; Liu, Y.; Zhu, Y.; Berman, H.; Wu, J. Characterization of an outbreak
of hand, foot, and mouth disease in Nanchang, China in 2010. PLoS ONE. 2011, 6, e25287, doi:
10.1371/journal.pone.0025287.
12. Ooi, E.E.; Phoon, M.C.; Ishak, B.; Chan, S.H. Seroepidemiology of human enterovirus 71, Singapore.
Emerg. Infect. Dis. 2002, 8, 995–7.
13. Urashima, M.; Shindo, N.; Okabe, N. Seasonal models of herpangina and hand-foot-mouth
disease to simulate annual fluctuations in urban warming in Tokyo. Jpn. J. Infect. Dis. 2003,
56, 48–53
14. Chen, C.C.; Wu, K.Y.; Chang, M.J.W. A statistical assessment on the stochastic relationship
between biomarker concentrations and environmental exposures. Stoch. Environ. Res. Risk Assess.
2004, 18, 377–385.
15. Hay, S.I.; Snow, R.W. The malaria atlas project: Developing global maps of malaria risk. PLoS Med.
2006, 3, e473, doi:10.1371/journal.pmed.0030473.
16. Tamerius, J.D.; Wise, E.K.; Uejio, C.K.; McCoy, A.L.; Comrie, A.C. Climate and human health:
Synthesizing environmental complexity and uncertainty. Stoch. Environ. Res. Risk Assess. 2007,
21, 601–613.
17. Interdisciplinary Public Health Reasoning and Epidemic Modelling: The Case of Black Death.
Christakos, G., Olea, R.A., Serre, M.L., Yu, H.-L., Wang, L.-L. Eds.; Springer Verlag: New York,
NY, USA, 2005.
18. Wang, J.F.; Guo, Y.S.; Christakos, G.; Yang, W.Z.; Liao, Y.L.; Li, Z.J. Hand, foot and mouth
disease: Spatiotemporal transmission and climate. Int. J. Health Geogr. 2011, 10, 25,
doi:10.1186/1476-072X-10-25.
19. Kulldorff, M.; Feuer, E.J.; Freedman, L.S. Breast cancer clusters in the Northeast United States:
A geographic analysis. Am. J. Epidemiol. 1997,146, 161–170.
20. Spatial Data Analysis: Theory and Practice. Haining, R. Ed.; Cambridge University Press: Cambridge,
London, UK, 2003.
21. Riley, S. A prospective study of spatial clusters gives valuable insights into dengue transmission.
PLoS Med. 2008, 5, 1540–1541.
22. Liu, Y.; Wang, X.; Liu, Y.; Sun, D.; Ding, S.; Zhang, B.; DU, Z.; Xue, F. Detecting
Spatial-temporal clusters of HFMD from 2007 to 2011 in Shandong province, China. PLoS ONE.
2013, 8, e63447, doi:10.1371/journal.pone.0063447.
23. Podin, Y.; Gias, E.L.; Ong, F.; Leong, Y.W.; Yee, S.F.; Yusof, M.A.; Perera, D.; Teo, B.; Wee, T.Y.;
et al. Sentinel surveillance for human enterovirus 71 in Sarawak, Malaysia: Lessons from the first
7 years. BMC Public Health 2006, 6, 180, doi:10.1186/1471-2458-6-180
24. Si, Y.L.; Debba, P.; Skidmore, A.K.; Toxopeus, A.G.; Li, L. Spatial and temporal patterns of
global H5N1 outbreaks. Spatial Inform. Sci. 2008, 1117, 69–74.
25. Point Pattern Analysis Newbury Park. Boots, B.N., Getis, A., Eds.; Sage Publications: Newbury
Park, California, CA, USA, 1998.
26. Fang, L.; Yan, L.; Liang, S.; Vlas, S.J.D.; Feng, D.; Han, X.; Zhao, W.; Xu, B.; Bian, L.; Yang, H.;
et al. Spatial analysis of hemorrhagic fever with renal syndrome in China. BMC Infect. Dis. 2006,
6, 77, doi:10.1186/1471-2334-6-77.
27. Osei, F.B.; Duker, A.A. Spatial and demographic patterns of Cholera in Ashanti region—Ghana.
Inter. J. Health Geogr. 2008, 7, 44, doi:10.1186/1476-072X-7-44.
28. Anselin, L. Local Indicators of Spatial Association—LISA. Geogr. Anal. 1995,
27, 93–116.
29. Jepsen, M.R.; Simonsen, J.; Ethelberg, J.S. Spatio-temporal cluster analysis of the incidence of
Campylobacter cases and patients with general diarrhea in a Danish county, 1995–2004. Inter. J.
Health Geogr. 2009, 8, 11, doi:10.1186/1476-072X-8-11.
30. China CDC, 2009. Available online: http://www.cdcp.org.cn/editor/uploadfile/20090429174659788.ppt
(accessed on 6 January 2013).
31. Lin, T.Y.; Chang, L.Y.; Hsia, S.H.; Huang, Y.C.; Chiu, C.H.; Hsueh, C.; Shih, S.R.; Liu, C.C.;
Wu, M.H. The 1998 enterovirus 71 outbreak in Taiwan: Pathogenesis and management.
Clin. Infect. Dis. 2002; 34, S52– S57.
32. Gan, Z.H.; Zhuo, J.T. Advances of research on hand, foot and mouth disease. China Trop. Med.
2009, 9, 373–375.
33. Cohen, J.I. Enterovirus and reovirus. In Harrison’s Principles of Internal Medicine. Fauci, A.S.,
Braunwald, E., Isselbacher, K.J., Wilson, J.D., Martin, J.B., Kasper, D.L., Eds. 14th ed. McGraw-Hill
Internal Medicion: New York, NY, USA, 1998; pp. 1118–1123.
34. Onozuka, D.; Hashizume, M. The influence of temperature and humidity on the incidence of hand,
foot, and mouth disease in Japan. Sci. Total Environ. 2011, 410–411, 119–125.
35. Fujimoto, T.; Iizuka, S.; Enomoto, M.; Abe, K.; Yamashita, K.; Hanaoka, N.; Okabe, N.; Yoshida, H.;
Yasui, Y. Kobayashi, M.; et al. Hand, foot, and mouth disease caused by coxsackievirus A6,
Japan, 2011. Emerg. Infect. Dis. 2012, 18, 337–339.
36. Mao, L.X.; Wu, B.; Bao, W.X.; Han, F.A.; Xu, L.; Ge, Q.J.; Yang, J.; Yuan, Z.H.; Miao, C.H.;
Huang, X.X.; et al. Epidemiology of hand, foot, and mouth disease and genotype characterization
of enterovirus 71 in Jiangsu, China. J. Clin. Virol. 2010, 49, 100–104.
37. Ma, E.; Lam, T.; Chan, K.C.; Wong, C.; Chuang, S.K. Changing epidemiology of hand, foot, and
mouth disease in Hong Kong, 2001–2009. Jpn. J. Infect. Dis. 2010, 63, 422–426.
38. Momoki, S.T. Surveillance of enterovirus infections in Yokohama city from 2004 to 2008. Jpn. J.
Infect. Dis. 2009, 62, 471–473.
39. Zhu, Z.; Zhu, S.; Guo, X.; Wang, J.; Wang, D.; Yan, D.; Tan, X.; Tang, L.; Zhu, H.; Yang, X.; et al.
Retrospective seroepidemiology indicated that human enterovirus 71 and coxsackievirus A16
circulated wildly in central and southern China before large-scale outbreaks from 2008. Virol. J.
2010, 7, 300–305.
40. Ji, Z.; Wang, X.; Zhang, C.; Miura, T.; Sano, D. Occurrence of hand-foot-and-mouth disease
pathogens in domestic sewage and secondary effluent in Xi’an, China. Microbes Environ. 2012,
27, 288–292.
41. Xu, W.; Jiang, L.; Thammawijaya, P.; Thamthitiwat, S. Hand, foot and mouth disease in Yunnan
province, China, 2008–2010. Asia Pac. J. Public Health 2011, 23, doi:10.1177/1010539511430523.
42. Bending, J.W.; Fleming, D.M. Epidemiological, virological, and clinical features of an epidemic
of hand, foot, and mouth disease in England and Wales. Commun. Dis. Rep. CDR Rev. 1996, 6,
R81–R86.
43. Druyts-Voets, E. Epidemiological features of entero non-poliovirus isolations in Belgium 1980–94.
Epidemiol Infect. 1997, 119, 71–77.
44. Malaysia, H.M. Hand Foot and Mouth Disease (HFMD) Guidelines. Disease Control Division,
Ministry of Health. Available online: http://jknsarawak.moh.gov.my/en/uploads/hfmd_guidelines.pdf
(accessed on 1 January 2007).
45. Fong, T.T.; Lipp, E.K. Enteric viruses of humans and animals in aquatic environments: Health
risks, detection, and potential water quality assessment tools. Microbiol. Mol. Biol. Rev. 2005,
69, 357–371.
46. Bie, Q.Q; Qiu, D.S.; Hu, H.; Ju, B. Spatial and temporal distribution characteristics of
hand-foot-mouth disease in China. J. Geoinf. Sci. 2010, 12, 380–384.
© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article
distributed under the terms and conditions of the Creative Commons Attribution license
(http://creativecommons.org/licenses/by/3.0/).
A.1 Spatial Statistics in ArcGIS
Lauren M. Scott and Mark V. Janikas
A.1.1 Introduction
With over a million software users worldwide, and installations at over 5,000 universities, Environmental Systems Research Institute, Inc. (ESRI), established in
1969, is a world leader for the design and development of Geographic Information
Systems (GIS) software. GIS technology allows the organization, manipulation,
analysis, and visualization of spatial data, often uncovering relationships, patterns,
and trends. It is an important tool for urban planning (Maantay and Ziegler 2006),
public health (Cromley and McLafferty 2002), law enforcement (Chainey and
Ratcliffe 2005), ecology (Johnston 1998), transportation (Thill 2000), demographics (Peters and MacDonald 2004), resource management (Pettit et al. 2008), and
many other industries (see http://www.esri.com/industries.html). Traditional GIS
analysis techniques include spatial queries, map overlay, buffer analysis, interpolation, and proximity calculations (Mitchell 1999). Along with basic cartographic
and data management tools, these analytical techniques have long been a foundation for geographic information software. Tools to perform spatial analysis have
been extended over the years to include geostatistical techniques (Smith et al.
2006), raster analysis (Tomlin 1990), analytical methods for business (Pick 2008),
3D analysis (Abdul-Rahman et al. 2006), network analytics (Okabe et al. 2006),
space-time dynamics (Peuquet 2002), and techniques specific to a variety of industries (e.g., Miller and Shaw 2001). In 2004, a new set of spatial statistics tools
designed to describe feature patterns was added to ArcGIS 9. This chapter focuses
on the methods and models found in the Spatial Statistics toolbox.
Spatial statistics comprises a set of techniques for describing and modeling
spatial data. In many ways they extend what the mind and eyes do, intuitively, to
assess spatial patterns, distributions, trends, processes and relationships. Unlike
traditional (non-spatial) statistical techniques, spatial statistical techniques actually use space – area, length, proximity, orientation, or spatial relationships – directly in their mathematics (Scott and Getis 2008).
M.M. Fischer and A. Getis (eds.), Handbook of Applied Spatial Analysis:
Software Tools, Methods and Applications, DOI 10.1007/978-3-642-03647-7_2,
© Springer-Verlag Berlin Heidelberg 2010
Fig. A.1.1. Right click on a script tool and select Edit to see the Python source code
By 2008 the Spatial Statistics toolbox in ArcGIS contained 25 tools. The majority
of these were written using the Python scripting language. Consequently, ArcGIS users have access not only to the analytical methods for these tools, but also
to their source code (see Fig. A.1.1).
The Spatial Statistics toolbox includes both statistical functions and general-purpose utilities. With the most recent release of ArcGIS 9.3, statistical functions
are grouped into four toolsets: Measuring Geographic Distributions, Analyzing
Patterns, Mapping Clusters, and Modeling Spatial Relationships.
A.1.2 Measuring geographic distributions
The tools in the Measuring Geographic Distributions toolset (Table A.1.1) are descriptive in nature; they help summarize the salient characteristics of a spatial distribution. They are useful for answering questions like:
• Which site is most accessible?
• Is there a directional trend to the spatial distribution of the disease outbreak?
• What is the primary wind direction for this region in the winter?
• Where is the population center?
• Which species has the broadest territory?
Table A.1.1. Tools in the measuring geographic distributions toolset
Tool: Description
Central feature: Identifies the most centrally located feature in a point, line, or polygon feature class
Directional distribution (standard deviational ellipse): Measures how concentrated features are around the geographic mean, and whether or not they exhibit a directional trend
Linear directional mean: Identifies the general (mean) direction and mean length for a set of vectors
Mean center: Identifies the geographic center for a set of features
Standard distance: Measures the degree to which features are concentrated or dispersed around the geographic mean center
Even the simplest tool in the Spatial Statistics toolbox can be a powerful communicator of spatial pattern when used with animation. The mean center tool is a
measure of central tendency; it computes the geometric center – the average X and
average Y coordinate – for a set of geographic features. In Fig. A.1.2, the
weighted mean center of population for the counties of California is computed
every decade from 1910 to 2000. The center of population is initially located in
the northern half of the state near San Francisco. Animation reveals steady
movement of the mean center south, every decade, as population growth in Southern California outpaces population growth in the state’s northern counties.
Fig. A.1.2. Weighted mean center of population, by county, 1910 through 2000
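The weighted mean center described above can be sketched in a few lines of Python. This is an illustration only, not the source code of the ArcGIS tool: the function name `weighted_mean_center`, the centroid coordinates, and the population figures are all hypothetical.

```python
def weighted_mean_center(points, weights=None):
    """Return the (x, y) mean center; every point counts equally if no weights are given."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return x, y


# Hypothetical county centroids (x, y) and populations, for illustration only.
# Larger y means farther north.
centroids = [(0.0, 10.0), (2.0, 6.0), (4.0, 0.0)]
pop_1910 = [300, 100, 100]
pop_2000 = [300, 400, 900]

center_1910 = weighted_mean_center(centroids, pop_1910)
center_2000 = weighted_mean_center(centroids, pop_2000)
```

Because the southern centroids gain weight between the two hypothetical years, the y coordinate of the center falls, which is the kind of southward drift the animation in Fig. A.1.2 reveals.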
Fig. A.1.3. Core areas for five gangs based on graffiti tagging
The Standard Deviational Ellipse and Standard Distance tools measure the spatial
distribution of geographic features around their geometric center, and provide information about feature dispersion and orientation. Gangs often mark their territory with graffiti. In Fig. A.1.3, a standard deviational ellipse is computed, by
gang affiliation, for graffiti incidents in a city. The ellipses provide an estimate of
the core areas associated with each gang’s turf. The potential for increased gang-related conflict and violence is highest in areas where the ellipses overlap. By increasing the presence of uniformed police officers in these overlapping areas and
around nearby schools, the community may be able to curtail gang violence. Mitchell (2005), Scott and Warmerdam (2005), Wong (1999) and Levine (1996) provide additional examples of applications for descriptive statistics like mean center,
standard distance, and standard deviational ellipse.
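A rough sketch of both measures follows. It derives the ellipse from the covariance of the coordinates (principal axes), which is one common formulation; the ArcGIS tool may scale the axes and report the rotation angle differently, so treat the conventions here as assumptions. The function name and the incident coordinates are hypothetical.

```python
import math


def standard_distance_and_ellipse(points):
    """Return (standard distance, major semi-axis, minor semi-axis, angle in degrees).

    The angle is the direction of the major axis, measured counterclockwise
    from the x axis (an assumption of this sketch).
    """
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points) / n
    syy = sum((y - y_bar) ** 2 for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points) / n

    standard_distance = math.sqrt(sxx + syy)
    # Eigenvalues of the 2x2 covariance matrix give the squared semi-axes.
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major = math.sqrt((sxx + syy) / 2 + spread)
    minor = math.sqrt(max(0.0, (sxx + syy) / 2 - spread))
    angle = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))
    return standard_distance, major, minor, angle


# Hypothetical graffiti incident locations for one gang.
graffiti = [(0, 0), (1, 1), (2, 2), (3, 3), (1, 2), (2, 1)]
distance, major, minor, angle = standard_distance_and_ellipse(graffiti)
```

The standard distance summarizes dispersion with a single radius, while the two semi-axes and the angle add the orientation information that makes overlapping ellipses comparable between groups.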
A.1.3 Analyzing patterns
The Analyzing Patterns toolset (Table A.1.2) contains methods that are most appropriate for understanding broad spatial patterns and trends (Mitchell 2005).
With these tools you can answer questions like:
• Which plant species is most concentrated?
• Does the spatial pattern of the disease mirror the spatial pattern of the population at risk?
• Is there an unexpected spike in pharmaceutical purchases?
• Are new AIDS cases remaining geographically fixed?
Consider the difficulty of trying to measure changes in urban manufacturing patterns for the United States over the past few decades. Certainly broad changes
have occurred with globalization and the move from vertical integration to a more
flexible and dispersed pattern of production. One approach might be to map manufacturing employment by census tract for a series of years, and then try to visually discern whether or not spatial patterns are becoming more concentrated or
more dispersed. Most likely a range of scenarios would emerge. The Global
Moran’s I tool computes a single summary value, a z-score, describing the degree
of spatial concentration or dispersion for the measured variable (in this case manufacturing employment). Comparing this summary value, year by year, indicates
whether or not manufacturing is becoming, overall, more dispersed or more concentrated.
Similarly, viewing thematic maps of per capita incomes (PCR)¹ in New York
for a series of years (see Fig. A.1.4), it is difficult to determine whether rich and
poor counties are becoming more or less spatially segregated. Plotting the resultant z-scores from the Spatial Autocorrelation (Global Moran’s I) tool, however,
reveals decreasing values indicating that spatial clustering of rich and poor has
dissipated between 1969 and 2002.
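To make the summary statistic concrete, the sketch below computes Global Moran's I for a small set of values and a set of binary neighbor weights. The ArcGIS tool derives its z-score analytically; here a permutation-based z-score is swapped in because it is short to write, so the two will not match exactly. All names and data are hypothetical.

```python
import random
import statistics


def morans_i(values, weights):
    """Global Moran's I for a list of values and a dict {(i, j): w_ij}."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(weights.values())
    cross = sum(w * z[i] * z[j] for (i, j), w in weights.items())
    return (n / s0) * cross / sum(d * d for d in z)


def permutation_z_score(values, weights, permutations=999, seed=0):
    """Compare the observed I with I for randomly shuffled values (assumed approach)."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    simulated = []
    for _ in range(permutations):
        rng.shuffle(shuffled)
        simulated.append(morans_i(shuffled, weights))
    return (observed - statistics.mean(simulated)) / statistics.pstdev(simulated)


def chain_weights(n):
    """Binary weights for n areas in a row, each neighboring the next one."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights


w = chain_weights(6)
clustered = [1, 1, 1, 5, 5, 5]     # similar values sit side by side
alternating = [1, 5, 1, 5, 1, 5]   # high and low values alternate

i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)
z_clustered = permutation_z_score(clustered, w)
z_alternating = permutation_z_score(alternating, w)
```

Running the same calculation on each year of a series and plotting the z-scores gives the kind of trend line discussed for per capita income. Note that the sketch does not guard against constant input values, for which the statistic is undefined.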
Table A.1.2. A summary of the tools in the analyzing patterns toolset
Tool | Description
Average nearest neighbor | Calculates the average distance from every feature to its nearest neighbor based on feature centroids
High/low clustering (Getis-Ord general G) | Measures concentrations of high or low values for a study area
Spatial autocorrelation (global Moran’s I) | Measures spatial autocorrelation (clustering or dispersion) based on feature locations and attribute values
Multi-distance spatial cluster analysis (Ripley’s K function) | Assesses spatial clustering/dispersion for a set of geographic features over a range of distances
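As an illustration of the first tool in the table, the sketch below computes an average nearest neighbor ratio using the classic Clark and Evans expectation for a random pattern. Whether the ArcGIS tool uses exactly these constants is an assumption here, and the function name and point sets are hypothetical.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (observed mean distance, expected mean distance, ratio, z-score)."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)          # random pattern of the same density
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed, expected, observed / expected, (observed - expected) / std_error


# Sixteen hypothetical points in a study area of 16 square units.
grid = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]   # evenly spaced
clump = [(0.1 * x, 0.1 * y) for x, y in grid]                     # squeezed into a corner

grid_result = average_nearest_neighbor(grid, area=16.0)
clump_result = average_nearest_neighbor(clump, area=16.0)
```

A ratio above 1 suggests dispersion and a ratio below 1 suggests clustering; the result is sensitive to the area value supplied, so the study area should be chosen deliberately.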
The K function is a unique tool in that it looks at the spatial clustering or dispersion of points/features at a series of distances or spatial scales. The output from
the K function is a line graph (see Fig. A.1.5). The dark diagonal line represents
the expected pattern, if the features were randomly distributed within the study
area. The X axis reflects increasing distances. The solid curved line represents the
observed spatial pattern for the features being analyzed. When the curved line
goes above the diagonal line, the pattern is more clustered at that distance than we
would expect with a random pattern; when the curved line goes below the diagonal line, the pattern is more dispersed than expected. Based on a user-specified
number of randomly generated permutations of the input features, the tool also
computes a confidence envelope around the expected line. When the curved line
is outside the confidence envelope, the clustering or dispersion is statistically significant.
¹ PCR is per capita income relative to the national average.
Fig. A.1.4. Relative per capita income for New York, 1969 to 2002
Fig. A.1.5. Components of the K function graphical output
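The components of the graph can be sketched as follows. The code uses the common L(d) transformation of the K function, for which the expected value under a random pattern is roughly d itself (the diagonal line), and builds the envelope from the minimum and maximum of simulated random patterns. It assumes a rectangular study area and applies no edge correction; the function names and the point data are hypothetical.

```python
import math
import random


def l_function(points, distances, area):
    """Observed L(d) for each distance d, without edge correction."""
    n = len(points)
    result = []
    for d in distances:
        pairs = sum(
            1
            for i in range(n)
            for j in range(n)
            if i != j
            and math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1]) <= d
        )
        result.append(math.sqrt(area * pairs / (math.pi * n * (n - 1))))
    return result


def confidence_envelope(n, width, height, distances, permutations=99, seed=0):
    """Lowest and highest L(d) over random patterns of n points in a rectangle."""
    rng = random.Random(seed)
    sims = []
    for _ in range(permutations):
        pts = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        sims.append(l_function(pts, distances, width * height))
    low = [min(s[k] for s in sims) for k in range(len(distances))]
    high = [max(s[k] for s in sims) for k in range(len(distances))]
    return low, high


def tight_cluster(cx, cy):
    """Ten hypothetical incidents packed closely around (cx, cy)."""
    return [(cx + 0.1 * i, cy + 0.1 * j) for i in range(5) for j in range(2)]


incidents = tight_cluster(2.0, 2.0) + tight_cluster(8.0, 8.0)
distances = [1.0, 2.0, 3.0]
observed = l_function(incidents, distances, area=100.0)
low, high = confidence_envelope(len(incidents), 10.0, 10.0, distances)
above_envelope = [o > h for o, h in zip(observed, high)]
```

Where an observed value lies above the upper envelope, the clustering at that distance would be read as statistically significant; a value below the lower envelope would indicate significant dispersion.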
The K function is useful for comparing different sets of features within the same
study area, such as two strains of a disease or disease cases in relation to population at risk. Similar observed spatial patterns suggest similar factors (similar spatial processes) are at work. A researcher might compare the spatial pattern for a
disease outbreak, for example, to the spatial pattern of the population at risk to
help determine if factors other than the spatial distribution of population are promoting disease incidents. Wheeler (2007), Levine (1996), Getis and Ord (1992),
and Illian et al. (2008) provide examples of additional applications for the tools in
the Analyzing Patterns toolset.
A.1.4 Mapping clusters
The tools discussed above in the Analyzing Patterns toolset are global statistics
that answer the question: Is there statistically significant spatial clustering or dispersion? Tools in the Mapping Clusters toolset (Table A.1.3), on the other hand,
identify where spatial clustering occurs, and where spatial outliers are located:
• Where are there sharp boundaries between affluence and poverty in Ecuador?
• Where do we find anomalous spending patterns in Los Angeles?
• Where do we see unexpectedly high rates of diabetes?
In Fig. A.1.6, the Local Moran’s I tool is used to analyze poverty in Ecuador. A
string of outliers separates clusters of high poverty in the north from clusters of low
poverty in the south, indicating a sharp divide in economic status.
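The cluster and outlier logic can be sketched with Anselin's local Moran's I in its basic form. The sketch uses row-standardized neighbor weights and labels each feature by the sign of its own deviation and of its neighbors' average deviation; it omits the significance test that the ArcGIS tool reports, and the data and names are hypothetical.

```python
def local_morans_i(values, neighbors):
    """Return a list of (local I, label) with labels HH, LL, HL or LH."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    results = []
    for i in range(n):
        # Row-standardized spatial lag: the mean deviation of the neighbors.
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        label = ("H" if z[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((z[i] * lag / m2, label))
    return results


# Hypothetical poverty index for six districts in a row.
poverty = [9, 8, 9, 1, 2, 1]
neighbors = {
    i: [j for j in (i - 1, i + 1) if 0 <= j < len(poverty)]
    for i in range(len(poverty))
}
local = local_morans_i(poverty, neighbors)
labels = [label for _, label in local]
```

In this toy series the two middle districts come out as HL and LH: each differs from the average of its neighbors, so they mark the boundary between a high cluster and a low cluster, much like the string of outliers in Fig. A.1.6.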
Table A.1.3. A summary of the tools in the mapping clusters toolset
Tool | Description
Cluster and outlier analysis (Anselin’s local Moran’s I) | Given a set of weighted features, identifies clusters of high or low values as well as spatial outliers
Hot spot analysis (Getis-Ord Gi*) | Given a set of weighted features, identifies clusters of features with high values (hot spots) and clusters of features with low values (cold spots)
The Hot Spot Analysis (Getis-Ord Gi*) tool is applied to vandalism data for Lincoln, Nebraska in Fig. A.1.7. In the first map (left), raw vandalism counts for
each census block are analyzed. The picture that emerges would not surprise local
police officers. Most vandalism is found where most people and most overall
crime are found: downtown and in surrounding high crime areas. Fewer cases of
vandalism are associated with the lower density suburbs. In the second map
(right), however, vandalism is normalized by overall crime incidents prior to analysis. Running the Hot Spot Analysis tool on this normalized data shows that
while Lincoln may have more incidents of vandalism in downtown areas, vandalism represents a larger proportion of total crime in suburban areas. Zhang et al.
(2008), Jacquez and Greiling (2003), Getis and Ord (1992), Ord and Getis (1995),
and Anselin (1995) provide additional applications for the tools in the Mapping
Clusters toolset.
Fig. A.1.6. An analysis of poverty in Ecuador using local Moran’s I
Fig. A.1.7. An analysis of vandalism hot spots in Lincoln, Nebraska using Gi*
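The effect of normalizing before analysis can be illustrated with a small Gi* calculation. The sketch follows the z-score form of the statistic given by Ord and Getis (1995) with binary weights that include the feature itself; the census block counts are invented for illustration and the function name is hypothetical.

```python
import math


def getis_ord_gi_star(values, neighbors):
    """Gi* z-score for each feature, using binary weights that include the feature itself."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        members = set(neighbors[i]) | {i}
        sum_w = float(len(members))  # with binary weights, sum(w) equals sum(w^2)
        weighted_sum = sum(values[j] for j in members)
        denominator = s * math.sqrt((n * sum_w - sum_w ** 2) / (n - 1))
        scores.append((weighted_sum - mean * sum_w) / denominator)
    return scores


# Five hypothetical census blocks in a row, from downtown (0) to suburbs (4).
vandalism = [50, 40, 10, 5, 4]
all_crime = [500, 400, 50, 10, 8]
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}

raw_scores = getis_ord_gi_star(vandalism, neighbors)
rates = [v / c for v, c in zip(vandalism, all_crime)]
rate_scores = getis_ord_gi_star(rates, neighbors)

hottest_raw = raw_scores.index(max(raw_scores))
hottest_rate = rate_scores.index(max(rate_scores))
```

With the raw counts the highest score falls on the downtown end of the row, while with vandalism expressed as a share of all crime it moves to the suburban end, mirroring the contrast between the two maps in Fig. A.1.7.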
A.1.5 Modeling spatial relationships
The tools in the Modeling Spatial Relationships toolset (Table A.1.4) fall into two
categories. The first category includes tools designed to help the user define a
conceptual model of spatial relationships. The conceptual model is an integral
component of spatial modeling and should be selected so that it best represents the
structure of spatial dependence among the features being analyzed (Getis and Aldstadt 2004).
The options available for modeling spatial relationships include inverse distance, fixed distance, polygon contiguity (Rook’s and Queen’s case), k nearest
neighbors, Delaunay triangulation, travel time and travel distance. Figure A.1.8 illustrates how spatial relationships change when they are based on a real road network, rather than on straight line distances.
Fig. A.1.8. Traffic conditions or a barrier in the physical landscape can dramatically change
actual travel distances, impacting results of spatial analysis
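Two of the conceptualizations listed above, inverse distance and k nearest neighbors, can be sketched as row-standardized weights computed from straight-line distances. This is only an illustration of the idea; it does not read or write *.swm files, it does not handle coincident points, and the function names and site coordinates are hypothetical.

```python
import math


def inverse_distance_weights(points, power=1.0):
    """Row-standardized inverse distance weights as {i: {j: w_ij}}."""
    weights = {}
    for i, (xi, yi) in enumerate(points):
        row = {
            j: 1.0 / math.hypot(xi - xj, yi - yj) ** power
            for j, (xj, yj) in enumerate(points) if j != i
        }
        total = sum(row.values())
        weights[i] = {j: w / total for j, w in row.items()}
    return weights


def k_nearest_weights(points, k):
    """Each feature gives equal weight to its k nearest neighbors."""
    weights = {}
    for i, (xi, yi) in enumerate(points):
        by_distance = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.hypot(xi - points[j][0], yi - points[j][1]),
        )
        weights[i] = {j: 1.0 / k for j in by_distance[:k]}
    return weights


# Four hypothetical sites along a road.
sites = [(0, 0), (1, 0), (3, 0), (7, 0)]
idw = inverse_distance_weights(sites)
knn = k_nearest_weights(sites, k=1)
```

Replacing the straight-line distance with a travel time or network distance lookup would change which features count as neighbors, which is the point made by Fig. A.1.8.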
Table A.1.4. A summary of the tools in the modeling spatial relationships toolset
Tool | Description
Generate network spatial weights | Builds a spatial weights matrix file specifying spatial relationships among features in a feature class based on a Network dataset
Generate spatial weights matrix | Builds a spatial weights matrix file specifying spatial relationships among features in a feature class
Geographically weighted regression | A local form of linear regression used to model spatially varying relationships among a set of data variables
Ordinary least squares regression | Performs global linear regression to model the relationships among a set of data variables
Constructing spatial relationships prior to analysis generally results in improved
performance, particularly within the context of larger datasets or when applied to
multiple attribute fields. The spatial weights matrix files (*.swm) are sharable, reusable and can be directly edited within ArcGIS. Furthermore, options are available to facilitate both importing and exporting spatial weights matrix files from/to
other formats (*.gal, *.gwt, or a simple *.dbf table).²
The second category of tools in the Modeling Spatial Relationships toolset includes ordinary least squares (OLS) (Wooldridge 2003), and geographically
weighted regression (GWR) (Fotheringham et al. 2002 and Chapter C.5). These
tools can help answer the following types of questions:
• What is the relationship between educational attainment and income?
• Is there a relationship between income and public transportation usage? Is that
relationship consistent across the study area?
• What are the key factors contributing to excessive residential water usage?
Regression analysis may be used to model, examine, and explore spatial relationships, in order to better understand the factors behind observed spatial patterns or
to predict spatial outcomes. There are a large number of applications for these
techniques (Table A.1.5).
Fig. A.1.9. GWR optionally creates a coefficient surface for each model explanatory variable reflecting variation in modeled relationship
OLS is a global model. It creates a single equation to represent the relationship
between what you are trying to model and each of your explanatory variables.
Global models, like OLS, are based on the assumption that relationships are static
and consistent across the entire study area. When they are not – when the relationships behave differently in separate parts of the study area – the global model
becomes less effective. You might find, for example, that people’s desire to live
and work close, but not too close, to a metro line encourages population growth:
the relationship for being fairly close to a metro line is positive while the relationship for being right up next to a metro line is negative. A global model will compute a single coefficient to represent both of these divergent relationships. The result, an average, may not represent either situation very well.
² See http://resources.esri.com/geoprocessing/ for a description and examples of exporting/importing *.swm files to *.gal and *.gwt formats.
Local models, like GWR, create an equation for every feature in the dataset,
calibrating each one using the target feature and its neighbors. Nearby features
have a higher weight in the calibration than features that are farther away. What
this means is that the relationships you are trying to model are allowed to change
over the study area; this variation is reflected in the coefficient surfaces optionally
created by the GWR tool (see Fig. A.1.9). If you are trying to predict foreclosures, for example, you might find that an income variable is very important in the
northern part of your study area, but very weak or not important at all in the
southern part of your study area. GWR accommodates this kind of regional variation in the regression model.
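The contrast between a global and a local model can be sketched with a single explanatory variable. The code fits one weighted least squares line per feature, with weights from a Gaussian kernel of distance, and compares the local slopes with a single global slope. The kernel, the fixed bandwidth, the one-dimensional locations and the data are all assumptions made for brevity; they are not taken from the GWR tool.

```python
import math


def weighted_slope(x, y, w):
    """Slope of the weighted least squares line of y on x."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def local_slopes(locations, x, y, bandwidth):
    """One slope per feature; nearby features get more weight (Gaussian kernel)."""
    slopes = []
    for target in locations:
        w = [math.exp(-0.5 * ((target - s) / bandwidth) ** 2) for s in locations]
        slopes.append(weighted_slope(x, y, w))
    return slopes


# Ten hypothetical features ordered from north (0) to south (9).
locations = list(range(10))
explanatory = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
# The true effect is strong in the north and absent in the south.
true_effect = [2 if s < 5 else 0 for s in locations]
dependent = [b * x for b, x in zip(true_effect, explanatory)]

global_slope = weighted_slope(explanatory, dependent, [1.0] * len(locations))
gwr_slopes = local_slopes(locations, explanatory, dependent, bandwidth=1.5)
```

The single global slope lands between the two regimes and describes neither well, whereas the local slopes stay close to 2 at the northern end and close to 0 at the southern end. Mapping such local coefficients is what produces a coefficient surface like the one in Fig. A.1.9.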
Table A.1.5. A variety of potential applications for regression analysis
Application Area | Analysis Example
Public health | Why are diabetes rates exceptionally high in particular regions of the United States?
Public safety | What environmental factors are associated with an increase in search and rescue event severity?
Transportation | What demographic characteristics contribute to high rates of public transportation usage?
Education | Why are literacy rates so low in particular regions?
Market analysis | What are the predicted annual sales for a proposed store?
Economics | Why do some communities have so many home foreclosures?
Natural resource management | What are the key variables promoting high forest fire frequency?
Ecology | Which environments should be protected to encourage reintroduction of an endangered species?
The default output for both regression tools is a residual map showing the model
over- and underpredictions (see Fig. A.1.10). The OLS tool automatically checks
for multicollinearity (redundancy among model explanatory variables), and
computes coefficient probabilities, standard errors, and overall model significance
indices that are robust to heteroscedasticity. The online help documentation for
these tools provides a beginner’s guide to regression analysis, suggested step by
step instructions for the model building process, a table outlining and carefully
explaining the challenges and potential pitfalls associated with using regression
analysis with spatial data, and recommendations for how to overcome those potential problems.³
³ See http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=Regression_analysis_basics
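One common diagnostic for redundancy among explanatory variables is the variance inflation factor (VIF). The text above does not say which calculation the OLS tool performs, so the sketch below should be read as a generic illustration: for a model with exactly two explanatory variables, the VIF of each reduces to 1 / (1 - r²), where r is their correlation. The function names and data are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def vif_two_variables(a, b):
    """VIF shared by both variables in a two-variable model."""
    r_squared = pearson_r(a, b) ** 2
    return math.inf if r_squared >= 1.0 else 1.0 / (1.0 - r_squared)


# Hypothetical explanatory variables.
median_income = [1, 2, 3, 4, 5]
home_value = [2, 4, 6, 8, 11]       # nearly a multiple of income: redundant
distance_to_park = [5, 1, 4, 2, 3]  # only weakly related to income

vif_redundant = vif_two_variables(median_income, home_value)
vif_distinct = vif_two_variables(median_income, distance_to_park)
```

A very large value, as for the first pair, signals that the two variables tell nearly the same story and that one of them could be dropped; values near 1 indicate little redundancy. With more than two explanatory variables, each VIF requires regressing that variable on all the others.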
Fig. A.1.10. Default output from the regression tools is a map of model over- and underpredictions
A.1.6 Custom tool development
The tools in the Spatial Statistics toolbox were developed using the same methods
and techniques that an ArcGIS user might adopt to create his/her own custom
tools. They illustrate the extendibility of ArcGIS, and ESRI’s commitment to
providing a framework for custom tool development.
The simplest way to create a new tool in the geoprocessing framework is to
use Model Builder to string existing tools together. The resultant model tool can
then be exported to Python and further extended with custom code. In addition,
any third party software package that can be launched from the DOS command
line is an excellent custom tool candidate. Simply point to the executable for that
software and define the needed tool parameters.
For software developers, the geoprocessing framework offers sophisticated
options for custom tool development. Python script tools can be run ‘in process’,
resulting in a cohesive interface that improves both performance and usability.
Numerical Python (NumPy) provides an avenue to perform complex mathematical
operations (Oliphant 2006), and is currently part of the ArcGIS software installation. Other Python libraries can be added as well. Perhaps the most logical extension is Scientific Python (SciPy),⁴ which provides a host of powerful statistical
techniques and works directly with NumPy. PySAL (a Python Library for Spatial
Analytical Functions, see Chapter A.10), developed in conjunction with GeoDa
(see Chapter A.4) and STARS (see Chapter A.5), is a cross-platform library of spatial analysis functions that may also provide opportunities for extending ArcGIS
functionality.⁵
⁴ http://www.scipy.org/
⁵ See http://www.sal.uiuc.edu/tools/tools-sum/pysal
Python works nicely with other programming languages, and this has resulted in
several hybrid libraries including Rpy and PyMat, giving users access to the methods in R (see Chapter A.3 for spatial econometric functions in R) and in MatLab, respectively.⁶ There are also a number of spatial data analysis add-on packages for R (Bivand and Gebhardt 2000) and a spatial econometrics toolbox for
MatLab (LeSage 1999). Sample scripts demonstrating integration of ArcGIS 9.3
with R are available for download from the Geoprocessing Resource Center⁷ (see
Fig. A.1.11).
Fig. A.1.11. Geoprocessing Resource Center Web page
A.1.7 Concluding remarks
The Spatial Statistics toolbox provides feature pattern analysis and regression
analysis capabilities inside ArcGIS where users can leverage, directly, all of its
powerful database management and cartographic functionalities. The source code
for these tools is provided inside a geoprocessing framework that encourages development and sharing of custom tools and methods. People and organizations
developing custom Python tools can take advantage of existing libraries, documentation, sample scripts, and support from a worldwide community of Python
software developers. The Geoprocessing Resource Center (see Fig. A.1.11),
launched in August of 2008, offers a platform for asking questions and getting answers, for sharing ideas, tools, and methodologies, and for participating in an ongoing conversation about spatial data analysis. The sincere hope is that this conversation will extend beyond the realm of academics, theoreticians, and software
developers – that it will embrace the hundreds of thousands of GIS users grappling
with real world data and problems – and that, as a consequence, this might foster
new tools, new questions, perhaps even new approaches altogether.
⁶ See http://rpy.sourceforge.net/, http://www.r-project.org/, http://www.mathworks.com/,
and http://claymore.engineer.gvsu.edu/~steriana/Python/pymat.html.
⁷ http://resources.esri.com/geoprocessing/
References
Abdul-Rahman A, Zlatanova S, Coors V (2006) Innovations in 3D geo information systems. Springer, Berlin, Heidelberg and New York
Anselin L (1995) Local indicators of spatial association: LISA. Geogr Anal 27(2):93-115
Anselin L, Syabri I, Kho Y (2009) GeoDa: an introduction to spatial data analysis. In
Fischer MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin,
Heidelberg and New York, pp.73-89
Bivand RS (2009) Spatial econometrics functions in R. In Fischer MM, Getis A (eds)
Handbook of applied spatial analysis. Springer, Berlin, Heidelberg and New York,
pp.53-71
Bivand RS, Gebhardt A (2000) Implementing functions for spatial statistical analysis using
the R language. J Geogr Syst 2(3):307-317
Chainey SP, Ratcliffe JH (2005) GIS and crime mapping. Wiley, London
Cromley EK, McLafferty SL (2002) GIS and public health. Guilford, New York
Fotheringham SA, Brunsdon C, Charlton M (2002) Geographically weighted regression:
the analysis of spatially varying relationships. Wiley, New York, Chichester, Toronto
and Brisbane
Getis A, Aldstadt J (2004) Constructing the spatial weights matrix using a local statistic.
Geogr Anal 36(2):90-104
Getis A, Ord JK (1992) The analysis of spatial association by use of distance statistics.
Geogr Anal 24(3):189-206
Illian J, Penttinen A, Stoyan H, Stoyan D (2008) Statistical analysis and modeling of spatial
point patterns. Wiley, London
Jacquez GM, Greiling DA (2003) Local clustering in breast, lung and colorectal cancer in
Long Island, New York. Int J Health Geographics 2:3
Johnston CA (1998) Geographic information systems in ecology. Blackwell Science, Malden [MA]
LeSage JP (1999) Spatial econometrics using MATLAB. www.spatial-econometrics.com
Levine N (1996) Spatial statistics and GIS: software tools to quantify spatial patterns. J
Am Plann Assoc 62(3):381-391
Maantay J, Ziegler J (2006) GIS for the urban environment. ESRI, Redlands [CA]
Miller HJ, Shaw S-L (2001) Geographic information systems for transportation: principles
and applications. Oxford University Press, Oxford and New York
Mitchell A (1999) The ESRI guide to GIS analysis, volume 1: geographic patterns and relationships. ESRI, Redlands [CA]
Mitchell A (2005) The ESRI guide to GIS analysis, volume 2: spatial measurements and
statistics. ESRI, Redlands [CA]
Okabe A, Okunuki K, Shiode S (2006) SANET: a toolbox for spatial analysis on a network.
Geogr Anal 38(1):57-66
Oliphant T (2006) Guide to NumPy, Trelgol [USA]
Ord JK, Getis A (1995) Local spatial autocorrelation statistics: distributional issues and an
application. Geogr Anal 27(4):287-306
Peters A, MacDonald H (2004) Unlocking the census with GIS. ESRI, Redlands [CA]
Pettit C, Cartwright W, Bishop I, Lowell K, Pullar D, Duncan D (eds) (2008) Landscape
analysis and visualization: spatial models for natural resource management and planning. Springer, Berlin, Heidelberg and New York
Pick JB (2008) Geo-Business: GIS in the digital organization. Wiley, New York
Peuquet DJ (2002) Representations of space and time. Guilford, New York
Rey SJ, Anselin L (2009) PySAL: a Python library of spatial analytical methods. In Fischer
MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin, Heidelberg
and New York, pp.175-193
Rey SJ, Janikas MV (2009) STARS: Space-time analysis of regional systems. In Fischer
MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin, Heidelberg
and New York, pp.91-112
Scott L, Getis A (2008) Spatial statistics. In Kemp K (ed) Encyclopedia of geographic information science. Sage, Thousand Oaks, CA
Scott L, Warmerdam N (2005) Extend crime analysis with ArcGIS spatial statistics tools.
ArcUser Magazine, April-June [USA]
Smith MJ, Goodchild MF, Longley PA (2006) Geospatial analysis. Troubador, Leicester
Thill J-C (2000) Geographic information systems in transportation research. Elsevier Science, Oxford
Tomlin DC (1990) Geographic information systems and cartographic modeling. Prentice-Hall, New Jersey
Wheeler D (2007) A comparison of spatial clustering and cluster detection techniques for
childhood leukemia incidence in Ohio, 1996-2003. Int J Health Geographics 6(1):13
Wheeler D, Páez A (2009) Geographically Weighted Regression. In Fischer MM, Getis A
(eds) Handbook of applied spatial analysis. Springer, Berlin, Heidelberg and New
York, pp.461-486
Wong DWS (1999) Geostatistics as measures of spatial segregation. Urban Geogr
20(7):635-647
Wooldridge JM (2003) Introductory econometrics: a modern approach. South-Western, Mason [OH]
Zhang C, Luo L, Xu W, Ledwith V (2008) Use of local Moran’s I and GIS to identify
pollution hotspots of Pb in urban soils of Galway, Ireland. Sci Total Environ 398(1-3):212-221
http://www.springer.com/978-3-642-03646-0