GY7707 Geospatial Analytics
CW Assignment 1
Note: The CW1 assignment below is a CHOICE between two tasks. Task 1, for those who have
access to R on a sufficiently fast computer, involves the analysis of a real data set. Task 2,
for those students who do not have access to a sufficiently fast computer and/or R, is an
essay question. SELECT ONE of these ONLY.
Task 1: Pipeline Accidents in the US
This part of the assignment makes use of real data supplied by the US Department of
Transportation Pipeline and Hazardous Materials Safety Administration on pipeline accidents
involving gas or oil for the period 1986 to 2017, gathered together and edited by (and
supplied courtesy of) Dr. Richard Stover. The data set consists of 8890 incidents and is available
as the file pipeline.csv on Blackboard (along with this file). You may like to view the web page at
http://www.biologicaldiversity.org/campaigns/americas_dangerous_pipelines and the
associated video at https://www.youtube.com/watch?v=3rxqUXqPzog&feature=youtu.be.
Select a single US state (note: NOT North Dakota, and ideally with sufficient accident data) so
that each student has a different state (check the selection of a state with me), and use R to
produce:
(a) A map of pipeline incidents (20%)
The incident data set includes a variable (coded YES or NO) indicating whether or not
each record uses recorded latitude/longitude coordinates. As we want exact coordinates for the
spatial data analysis, you should subset the data to exclude those records which do not (i.e., those coded NO).
It is entirely up to you how to present the data on a map: e.g., you could generate a choropleth
map at county level, but I will be looking for some novelty and effective visualisations
alongside otherwise complete maps (scale/legend/orientation etc).
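The subsetting step described above can be sketched in base R as follows. This is only an illustration on a toy data frame: the column names (state, has_coords, lat, lon) and the values are hypothetical stand-ins, so check the actual variable names in pipeline.csv before adapting it.

```r
# Toy stand-in for pipeline.csv; column names and values are hypothetical.
incidents <- data.frame(
  state      = c("TX", "TX", "OK", "TX"),
  has_coords = c("YES", "NO", "YES", "YES"),
  lat        = c(29.76, NA, 35.47, 31.00),
  lon        = c(-95.37, NA, -97.52, -100.00),
  stringsAsFactors = FALSE
)

# Keep a single state and only the records with recorded coordinates.
located <- subset(incidents, state == "TX" & has_coords == "YES")

# Number of incidents retained for mapping.
nrow(located)
```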
(b) A separate spatial analysis of the data set enabling an evaluation of the potential human
and environmental exposure to contamination within the state (80%)
1. It is *YOUR* choice as to what spatial data and analysis steps you include in addition to the
pipeline incident data and state boundary. How you handle the temporal dimension of the
incident data in your spatial analysis is also for you to decide. Spatial analysis should include the kinds of spatial analysis operations
discussed in some of the lectures/practicals, e.g., buffers, overlays, kernel densities etc. The
course text Lovelace et al. (2019) might well be a source of inspiration, or also the Brunsdon
and Comber (2015) book.
2. You should include population data and environmental data such as land use and rivers (as
well as associated water bodies). Spatial data are available from various sources via, for
example, the R osmar package. Population data can be obtained from a variety of sources, for
example http://zevross.com/blog/2015/10/14/manipulating-and-mapping-us-census-data-in-r-using-the-acs-tigris-and-leaflet-packages/
and the UScensus2010 R library (see lecture 7), but there are many others. Other geospatial
data are available via
https://ckan.geoplatform.gov/. Remember that such data may well be supplied in different
projections, so you will need to transform them to a common CRS before any spatial analysis. You should be
specific about the sources of any data you have included and include URLs to any data you
have identified and downloaded yourself. You should supply sufficient information so I can
download any data (or obtain within an appropriate R library) that you have used. Remember
to include any source/copyright/credit lines for such data in any maps/visual outputs.
3. Provide a write-up of the spatial analysis, including what you did and what the analysis revealed
about the risk to human and physical environments. Include your commented R code as an
appendix, or in line with the text and images if you are creating a document using
RMarkdown. Your write-up should ideally be no more than 2500 words (not including code
and references) and include appropriate results in terms of tables and figures.
4. You may like to refer to Obida et al. (2018) and Park et al. (2016) for some ideas about
similar data and some approaches to spatial analysis using the incident data in North Dakota.
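The reminder in point 2 about changing the CRS, and the buffer operation mentioned in point 1, can be sketched as below. This is a minimal illustration, not a prescribed method: it assumes the sf package (the brief does not mandate a particular package), the incident coordinates are made up, and EPSG:5070 (NAD83 / Conus Albers) is just one possible projected CRS; choose one suited to your state.

```r
library(sf)

# Hypothetical incident locations in longitude/latitude (WGS84, EPSG:4326).
incidents <- data.frame(
  id  = 1:3,
  lon = c(-95.37, -97.74, -96.80),
  lat = c(29.76, 30.27, 32.78)
)
pts_wgs84 <- st_as_sf(incidents, coords = c("lon", "lat"), crs = 4326)

# Reproject to a CRS with metre units before measuring distances.
# EPSG:5070 is an assumption made for this sketch.
pts_albers <- st_transform(pts_wgs84, 5070)

# 1 km buffer around each incident; dist is in the units of the CRS (metres here).
buffers_1km <- st_buffer(pts_albers, dist = 1000)

# Buffer areas in square kilometres; each should be close to pi.
round(as.numeric(st_area(buffers_1km)) / 1e6, 2)
```

Any other layers (population, land use, rivers) would need to be transformed to the same CRS before being overlaid on these buffers.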
References
Brunsdon, C. and Comber, L. 2015. An introduction to R for spatial analysis and mapping.
London: Sage.
Lovelace, R., Nowosad, J. and Muenchow, J. 2019. Geocomputation with R. CRC Press. This is
also available on-line: https://geocompr.robinlovelace.net/
Obida, C.B., Blackburn, G.A., Whyatt, J.D. and Semple, K.T. 2018. Quantifying the exposure of
humans and the environment to oil pollution in the Niger Delta using advanced geostatistical
techniques. Environment International, 111: 32-42.
Park, Y.S., Al-Qublan, H., Lee, E. and Egilmez, G. 2016. Interactive spatiotemporal analysis of
oil spills using Comap in North Dakota. Informatics, 3(2), 4; doi:10.3390/informatics3020004.
Task 2: Essay
“What is spatial autocorrelation and how is it measured? Assess its importance to spatial
analysis as implemented in GIScience.”
Answer the essay above with reference to a single application context and include examples
and illustrations. Essays should be academic with references and figures and ideally no more
than 2500 words (not including references).
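As a pointer only, one widely used global measure of spatial autocorrelation is Moran's I. The base R sketch below computes it directly from its definition for a hypothetical row of four areas with binary neighbour weights; the function name morans_i, the weights matrix and the data values are all invented for illustration, and an essay would be expected to go well beyond this.

```r
# Global Moran's I for values x and a spatial weights matrix w,
# where w[i, j] > 0 if areas i and j are neighbours:
#   I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2)
# with z = x - mean(x) and S0 = sum of all weights.
morans_i <- function(x, w) {
  z  <- x - mean(x)
  n  <- length(x)
  s0 <- sum(w)
  (n / s0) * sum(w * outer(z, z)) / sum(z^2)
}

# Hypothetical example: four areas in a row, each neighbouring the next.
w <- matrix(0, 4, 4)
for (i in 1:3) {
  w[i, i + 1] <- 1
  w[i + 1, i] <- 1
}

morans_i(c(1, 2, 3, 4), w)  # smooth gradient: positive autocorrelation
morans_i(c(1, 4, 1, 4), w)  # alternating values: negative autocorrelation
```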
This CW is due to be submitted (electronically via Turnitin) by May 4 2020.
Nick Tate
March 2020