::p_load(sf, maptools, raster, spatstat, tmap, kableExtra, tidyverse, funModeling, sfdep) pacman
Take Home Exercise 1
1.0 Overview
1.1 Background
This analysis aims to apply appropriate spatial point patterns analysis methods to discover the geographical distribution of functional and non-function water points and their co-locations if any in Osun State, Nigeria.
1.2 Task
Exploratory Spatial Data Analysis (ESDA)
Second-order Spatial Point Pattern Analysis
Spatial Correlation Analysis
2.0 Setup
2.1 Import Packages
sf - Used for handling geospatial data
tidyVerse - Used for data transformation and presentation
tmap, maptools, kableExtra - Used for visualizing dataframes and plots
spatstat - Used for point-pattern analysis
sfdep - Used for functions not in spdep
raster - Used to handle gridded spatial data
3.0 Data Wrangling
3.1 Datasets Used
Code
# initialise a dataframe of our geospatial and aspatial dataset details
<- data.frame(
datasets Type=c("Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Geospatial",
"Aspatial"),
Name=c("geoBoundaries-NGA-ADM2",
"geoBoundaries-NGA-ADM2",
"geoBoundaries-NGA-ADM2",
"geoBoundaries-NGA-ADM2",
"geoBoundaries-NGA-ADM2",
"geoBoundaries-NGA-ADM2",
"nga_admbnda_adm2_osgof_20190417",
"nga_admbnda_adm2_osgof_20190417",
"nga_admbnda_adm2_osgof_20190417",
"nga_admbnda_adm2_osgof_20190417",
"nga_admbnda_adm2_osgof_20190417",
"nga_admbnda_adm2_osgof_20190417",
"nga_admbnda_adm2_osgof_20190417",
"nga_admbnda_adm2_osgof_20190417",
"WPdx"),
Format=c(".dbf",
".geojson",
".prj",
".shp",
".shx",
".topojson",
".CPG",
".dbf",
".prj",
".sbn",
".sbx",
".shp",
".shp",
".shx",
".csv"),
Source=c("[geoBoundaries](https://www.geoboundaries.org/index.html#getdata)",
"[geoBoundaries](https://www.geoboundaries.org/index.html#getdata)",
"[geoBoundaries](https://www.geoboundaries.org/index.html#getdata)",
"[geoBoundaries](https://www.geoboundaries.org/index.html#getdata)",
"[geoBoundaries](https://www.geoboundaries.org/index.html#getdata)",
"[geoBoundaries](https://www.geoboundaries.org/index.html#getdata)",
"[Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga)",
"[Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga)",
"[Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga)",
"[Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga)",
"[Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga)",
"[Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga)",
"[Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga)",
"[Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga)",
"[ WPdx Global Data Repositories](https://www.waterpointdata.org/access-data/)")
)
# with reference to this guide on kableExtra:
# https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html
# kable_material is the name of the kable theme
# 'hover' for to highlight row when hovering, 'scale_down' to adjust table to fit page width
library(knitr)
library(kableExtra)
kable(datasets, caption="Datasets Used") %>%
kable_material("hover", latex_options="scale_down")
Type | Name | Format | Source |
---|---|---|---|
Geospatial | geoBoundaries-NGA-ADM2 | .dbf | [geoBoundaries](https://www.geoboundaries.org/index.html#getdata) |
Geospatial | geoBoundaries-NGA-ADM2 | .geojson | [geoBoundaries](https://www.geoboundaries.org/index.html#getdata) |
Geospatial | geoBoundaries-NGA-ADM2 | .prj | [geoBoundaries](https://www.geoboundaries.org/index.html#getdata) |
Geospatial | geoBoundaries-NGA-ADM2 | .shp | [geoBoundaries](https://www.geoboundaries.org/index.html#getdata) |
Geospatial | geoBoundaries-NGA-ADM2 | .shx | [geoBoundaries](https://www.geoboundaries.org/index.html#getdata) |
Geospatial | geoBoundaries-NGA-ADM2 | .topojson | [geoBoundaries](https://www.geoboundaries.org/index.html#getdata) |
Geospatial | nga_admbnda_adm2_osgof_20190417 | .CPG | [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga) |
Geospatial | nga_admbnda_adm2_osgof_20190417 | .dbf | [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga) |
Geospatial | nga_admbnda_adm2_osgof_20190417 | .prj | [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga) |
Geospatial | nga_admbnda_adm2_osgof_20190417 | .sbn | [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga) |
Geospatial | nga_admbnda_adm2_osgof_20190417 | .sbx | [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga) |
Geospatial | nga_admbnda_adm2_osgof_20190417 | .shp | [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga) |
Geospatial | nga_admbnda_adm2_osgof_20190417 | .shp | [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga) |
Geospatial | nga_admbnda_adm2_osgof_20190417 | .shx | [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-nga) |
Aspatial | WPdx | .csv | [ WPdx Global Data Repositories](https://www.waterpointdata.org/access-data/) |
3.2 Geospatial Data
3.2.1 Load Data
Import Geospatial data and filter out Osun State. Transform WGS 48 coordinate system to projected coordinate system with Nigeria’s ESPG - 26293
<- st_read(dsn = "data/geospatial/",
osun layer = "nga_admbnda_adm2_osgof_20190417") %>%
filter(ADM1_EN == "Osun") %>%
st_transform(crs = 26392)
Reading layer `nga_admbnda_adm2_osgof_20190417' from data source
`/Users/shambhavigoenka/Desktop/School/Geo/IS415-GAA/take_home_ex/take_home_ex01/data/geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 774 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2.668534 ymin: 4.273007 xmax: 14.67882 ymax: 13.89442
Geodetic CRS: WGS 84
glimpse(osun)
Rows: 30
Columns: 17
$ Shape_Leng <dbl> 1.7951405, 0.7101503, 0.9199564, 0.8502782, 0.5212768, 0.60…
$ Shape_Area <dbl> 0.062436080, 0.024818478, 0.038002894, 0.030445804, 0.01221…
$ ADM2_EN <chr> "Aiyedade", "Aiyedire", "Atakumosa East", "Atakumosa West",…
$ ADM2_PCODE <chr> "NG030001", "NG030002", "NG030003", "NG030004", "NG030005",…
$ ADM2_REF <chr> "Aiyedade", "Aiyedire", "Atakumosa East", "Atakumosa West",…
$ ADM2ALT1EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADM2ALT2EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADM1_EN <chr> "Osun", "Osun", "Osun", "Osun", "Osun", "Osun", "Osun", "Os…
$ ADM1_PCODE <chr> "NG030", "NG030", "NG030", "NG030", "NG030", "NG030", "NG03…
$ ADM0_EN <chr> "Nigeria", "Nigeria", "Nigeria", "Nigeria", "Nigeria", "Nig…
$ ADM0_PCODE <chr> "NG", "NG", "NG", "NG", "NG", "NG", "NG", "NG", "NG", "NG",…
$ date <date> 2016-11-29, 2016-11-29, 2016-11-29, 2016-11-29, 2016-11-29…
$ validOn <date> 2019-04-17, 2019-04-17, 2019-04-17, 2019-04-17, 2019-04-17…
$ validTo <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ SD_EN <chr> "Osun West", "Osun West", "Osun East", "Osun East", "Osun C…
$ SD_PCODE <chr> "NG03003", "NG03003", "NG03002", "NG03002", "NG03001", "NG0…
$ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((213526.6 34..., MULTIPOLYGON (…
3.2.2 Data Preprocessing
3.2.2.1 Exclude redundant fields
<- osun %>%
osun select(c(3:4, 8:9))
3.2.2.2 Invalid Geometries
length(which(st_is_valid(osun) == FALSE))
[1] 0
Everything is valid
3.2.2.3 Checking for Duplicate Names
$ADM2_EN[duplicated(osun$ADM2_EN)==TRUE] osun
character(0)
No duplicate Local Government Areas (LGAs)
3.2.2.4 Remove Missing Values
rowSums(is.na(osun))!=0,] osun[
Simple feature collection with 0 features and 4 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: Minna / Nigeria Mid Belt
[1] ADM2_EN ADM2_PCODE ADM1_EN ADM1_PCODE geometry
<0 rows> (or 0-length row.names)
3.2.3 Initial Visualisation
plot(st_geometry(osun), main="Osun State")
3.3 Aspatial Data
3.3.1 Load Data
Import Aspatial data and filter out Osun State.
<- read_csv("data/aspatial/WPdx.csv") %>%
wp filter(`#clean_country_name` == "Nigeria" & `#clean_adm1` == "Osun")
glimpse(wp)
Rows: 5,557
Columns: 70
$ row_id <dbl> 429123, 70566, 70578, 66401, 422190, 422…
$ `#source` <chr> "GRID3", "Federal Ministry of Water Reso…
$ `#lat_deg` <dbl> 8.020000, 7.317741, 7.759448, 8.031187, …
$ `#lon_deg` <dbl> 5.060000, 4.785507, 4.563998, 4.637400, …
$ `#report_date` <chr> "08/29/2018 12:00:00 AM", "05/11/2015 12…
$ `#status_id` <chr> "Unknown", "No", "No", "No", "Unknown", …
$ `#water_source_clean` <chr> NA, "Protected Shallow Well", "Borehole"…
$ `#water_source_category` <chr> NA, "Well", "Well", "Well", NA, NA, "Wel…
$ `#water_tech_clean` <chr> "Tapstand", "Mechanized Pump", "Mechaniz…
$ `#water_tech_category` <chr> "Tapstand", "Mechanized Pump", "Mechaniz…
$ `#facility_type` <chr> "Improved", "Improved", "Improved", "Imp…
$ `#clean_country_name` <chr> "Nigeria", "Nigeria", "Nigeria", "Nigeri…
$ `#clean_adm1` <chr> "Osun", "Osun", "Osun", "Osun", "Osun", …
$ `#clean_adm2` <chr> "Ifedayo", "Atakumosa East", "Osogbo", "…
$ `#clean_adm3` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#clean_adm4` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#install_year` <dbl> NA, NA, NA, 2004, NA, NA, 2006, 2014, 20…
$ `#installer` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#rehab_year` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#rehabilitator` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#management_clean` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, "Com…
$ `#status_clean` <chr> NA, "Abandoned/Decommissioned", "Abandon…
$ `#pay` <chr> NA, "No", "No", "No", NA, NA, "No", "No"…
$ `#fecal_coliform_presence` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#fecal_coliform_value` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#subjective_quality` <chr> NA, "Acceptable quality", "Acceptable qu…
$ `#activity_id` <chr> "6054e946-2573-45bf-ab7c-0ddaa68a61b4", …
$ `#scheme_id` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#wpdx_id` <chr> "6FW723C6+222", "6FV68Q9P+36R", "6FV6QH5…
$ `#notes` <chr> "Rt Hon Kola Oluwawole Water Tap", "Temi…
$ `#orig_lnk` <chr> "https://nigeria.africageoportal.com/dat…
$ `#photo_lnk` <chr> NA, "https://akvoflow-55.s3.amazonaws.co…
$ `#country_id` <chr> "NG", "NG", "NG", "NG", "NG", "NG", "NG"…
$ `#data_lnk` <chr> "https://catalog.waterpointdata.org/data…
$ `#distance_to_primary_road` <dbl> 4474.22234, 10130.42742, 167.82235, 4133…
$ `#distance_to_secondary_road` <dbl> 1883.98780, 17466.35720, 838.91849, 1162…
$ `#distance_to_tertiary_road` <dbl> 1885.874322, 2376.832183, 1181.107236, 9…
$ `#distance_to_city` <dbl> 40100.268, 31549.222, 2449.293, 16704.19…
$ `#distance_to_town` <dbl> 8120.871, 24652.907, 9463.295, 5176.899,…
$ water_point_history <chr> "{\"2018-08-29\": {\"source\": \"GRID3\"…
$ rehab_priority <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ water_point_population <dbl> NA, NA, NA, NA, NA, NA, NA, NA, 0, 508, …
$ local_population_1km <dbl> NA, NA, NA, NA, NA, NA, NA, NA, 70, 647,…
$ crucialness_score <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.78…
$ pressure_score <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 1.69…
$ usage_capacity <dbl> 250, 1000, 1000, 1000, 250, 250, 1000, 3…
$ is_urban <lgl> FALSE, FALSE, TRUE, FALSE, FALSE, FALSE,…
$ days_since_report <dbl> 1483, 2689, 2689, 2700, 1483, 1483, 2688…
$ staleness_score <dbl> 62.65911, 42.84384, 42.84384, 42.69554, …
$ latest_record <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
$ location_id <dbl> 358783, 239555, 239556, 230405, 358062, …
$ cluster_size <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ `#clean_country_id` <chr> "NGA", "NGA", "NGA", "NGA", "NGA", "NGA"…
$ `#country_name` <chr> "Nigeria", "Nigeria", "Nigeria", "Nigeri…
$ `#water_source` <chr> "Tap", "Improved Protected dug well", "I…
$ `#water_tech` <chr> NA, "Motorised", "Motorised", "Motorised…
$ `#status` <chr> NA, "Non-functional Abandoned", "Non-fun…
$ `#adm2` <chr> NA, "Atakumosa East", "Osogbo", "Odo-Oti…
$ `#adm3` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ `#management` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, "Com…
$ `#adm1` <chr> NA, "Osun", "Osun", "Osun", NA, NA, "Osu…
$ `New Georeferenced Column` <chr> "POINT (5.06 8.02)", "POINT (4.7855068 7…
$ lat_deg_original <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ lat_lon_deg <chr> "(8.02°, 5.06°)", "(7.3177411°, 4.785506…
$ lon_deg_original <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ public_data_source <chr> "https://catalog.waterpointdata.org/data…
$ converted <chr> NA, "#status_id, #water_source, #pay, #s…
$ count <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ created_timestamp <chr> "12/06/2021 09:12:57 PM", "06/30/2020 12…
$ updated_timestamp <chr> "12/06/2021 09:12:57 PM", "06/30/2020 12…
3.3.2 Data Preprocessing
3.3.2.1 Create sfc object column
Convert well known text (wkt) field into simple feature column (sfc) field
$Geometry = st_as_sfc(wp$`New Georeferenced Column`)
wp$Geometry wp
Geometry set for 5557 features
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 4.032004 ymin: 7.060309 xmax: 5.06 ymax: 8.061898
CRS: NA
First 5 geometries:
3.3.2.2 Create Simple Feature DataFrame
Store sfc as single feature (sf) object
<- st_sf(wp, crs=4326) wp_sf
3.3.2.3 Transform Projection System
<- wp_sf %>%
wp_sf st_transform(crs = 26392)
3.3.2.4 Handle Missing Data
freq(data = wp_sf,
input = '#status_clean')
#status_clean frequency percentage cumulative_perc
1 Functional 2319 41.73 41.73
2 Non-Functional 2008 36.13 77.86
3 <NA> 748 13.46 91.32
4 Functional but needs repair 248 4.46 95.78
5 Non-Functional due to dry season 151 2.72 98.50
6 Functional but not in use 63 1.13 99.63
7 Abandoned 15 0.27 99.90
8 Abandoned/Decommissioned 5 0.09 100.00
There are eight classes in the #status_clean fields.
To change NA with “unknown”
<- wp_sf %>%
wp_sf rename(status_clean = '#status_clean') %>%
select(status_clean) %>%
mutate(status_clean = replace_na(
"unknown")) status_clean,
3.3.3 Extract Water Points
Functional Points
<- wp_sf %>%
wp_f filter(status_clean %in%
c("Functional",
"Functional but needs repair",
"Functional but not in use"))
Non-Functional Points
<- wp_sf %>%
wp_nf filter(status_clean %in%
c("Non-Functional",
"Non-Functional due to dry season",
"Abandoned",
"Abandoned/Decommissioned"))
Unknown Status Points
<- wp_sf %>%
wp_u filter(status_clean == "unknown")
freq(data = wp_f,
input = 'status_clean')
status_clean frequency percentage cumulative_perc
1 Functional 2319 88.17 88.17
2 Functional but needs repair 248 9.43 97.60
3 Functional but not in use 63 2.40 100.00
freq(data = wp_nf,
input = 'status_clean')
status_clean frequency percentage cumulative_perc
1 Non-Functional 2008 92.15 92.15
2 Non-Functional due to dry season 151 6.93 99.08
3 Abandoned 15 0.69 99.77
4 Abandoned/Decommissioned 5 0.23 100.00
freq(data = wp_u,
input = 'status_clean')
status_clean frequency percentage cumulative_perc
1 unknown 748 100 100
3.3.4 Point-in-Polygon Count
Extract number of total, functional, nonfunctional and unknown water points in each local government area (LGA)
<- osun %>%
wp mutate(`total_wp` = lengths(st_intersects(osun, wp_sf))) %>%
mutate(`wp_f` = lengths(st_intersects(osun, wp_f))) %>%
mutate(`wp_nf` = lengths(st_intersects(osun, wp_nf))) %>%
mutate(`wp_u` = lengths(st_intersects(osun, wp_u)))
To save as rds format
#write_rds(wp, "Data/rds/wp.rds")
ggplot(data = wp,
aes(x = total_wp)) +
geom_histogram(bins=20,
color="black",
fill="light blue") +
geom_vline(aes(xintercept=mean(
na.rm=T)),
total_wp, color="red",
linetype="dashed",
size=0.8) +
ggtitle("Distribution of total water points by LGA") +
xlab("No. of water points") +
ylab("No. of LGAs")
3.4 Combined Data Wrangling
3.4.1 Convert sf dataframes to sp Spatial* class
= as_Spatial(wp_f)
wp_f_spat = as_Spatial(wp_nf)
wp_nf_spat = as_Spatial(osun) osun_spat
Peek
wp_f_spat
class : SpatialPointsDataFrame
features : 2630
extent : 177285.9, 290751, 343128.1, 450859.7 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs
variables : 1
names : status_clean
min values : Functional
max values : Functional but not in use
wp_nf_spat
class : SpatialPointsDataFrame
features : 2179
extent : 180539, 290616, 340054.1, 450780.1 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs
variables : 1
names : status_clean
min values : Abandoned
max values : Non-Functional due to dry season
osun_spat
class : SpatialPolygonsDataFrame
features : 30
extent : 176503.2, 291043.8, 331434.7, 454520.1 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs
variables : 4
names : ADM2_EN, ADM2_PCODE, ADM1_EN, ADM1_PCODE
min values : Aiyedade, NG030001, Osun, NG030
max values : Osogbo, NG030030, Osun, NG030
3.4.2 Convert sp Spatial* class to generic sp format
<- as(wp_f_spat, "SpatialPoints")
wp_f_sp <- as(wp_nf_spat, "SpatialPoints")
wp_nf_sp <-as(osun_spat, "SpatialPolygons") osun_sp
Peek
wp_f_sp
class : SpatialPoints
features : 2630
extent : 177285.9, 290751, 343128.1, 450859.7 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs
wp_nf_sp
class : SpatialPoints
features : 2179
extent : 180539, 290616, 340054.1, 450780.1 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs
osun_sp
class : SpatialPolygons
features : 30
extent : 176503.2, 291043.8, 331434.7, 454520.1 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs
3.4.3 Convert generic sp to spatstat ppp format
# from sp object, convert into ppp format
<- as(wp_f_sp, "ppp")
wp_f_ppp <- as(wp_nf_sp, "ppp") wp_nf_ppp
View
par(mfrow=c(1,2))
plot(wp_f_ppp, main="Functional Water Points")
plot(wp_nf_ppp, main="Non-functional Water Points")
3.4.3.1 Check for Duplication
any(duplicated(wp_f_ppp)); any(duplicated(wp_nf_ppp))
[1] FALSE
[1] FALSE
No duplicates 🤝 no jittering
3.4.4 Create Owin object
<- as(osun_sp, "owin")
osun_owin plot(osun_owin, main="Owin object of Osun State")
3.4.4 Combine point event object with Owin object
= wp_f_ppp[osun_owin]
wp_f_ppp = wp_nf_ppp[osun_owin] wp_nf_ppp
View
Code
par(mfrow=c(1,2))
plot(wp_f_ppp, main="Functional Water Points")
plot(wp_nf_ppp, main="Non-functional Water Points")
4.0 Exploratory Spatial Data Analysis (ESDA)
4.1 Initial Visualization
Code
<- tm_shape(wp) +
wp_f_1 tm_fill("wp_f",
n = 10,
style = "equal",
palette = "Blues") +
tm_borders(lwd = 0.1,
alpha = 1) +
tm_layout(title = "Functional Water Points by LGA",
legend.outside = TRUE)
<- tm_shape(wp) +
wp_f_2 tm_fill("wp_nf",
n = 10,
style = "equal",
palette = "Blues") +
tm_borders(lwd = 0.1,
alpha = 1) +
tm_layout(title = "Non-functional Water Points by LGA",
legend.outside = TRUE)
tmap_arrange(wp_f_1, wp_f_2, nrow = 1)
Code
tmap_mode("view") +
tm_basemap("OpenStreetMap") +
tm_shape(wp_f) +
tm_dots(col = "status_clean",
pal = "blue",
border.col = "black",
title = "Functional") +
tm_shape(wp_nf) +
tm_dots(col = "status_clean",
pal = "purple",
border.col = "black",
title = "Non-Functional") +
tm_view(set.bounds = c(4,7,5,8),
set.zoom.limits = c(8,15))
4.1 Kernel Density Estimation (KDE)
Sigma = bw.ppl()
is used as there are multiple clusters and would be more prominent than bw.diggle()
<- density(wp_f_ppp,
kde_wp_f sigma=bw.ppl,
edge=TRUE,
kernel = "gaussian")
<- density(wp_nf_ppp,
kde_wp_nf sigma=bw.ppl,
edge=TRUE,
kernel = "gaussian")
par(mfrow=c(1,2))
plot(kde_wp_f,
main = "Functional Water Points (in m)",
ribside=c("bottom"))
plot(kde_wp_nf,
main = "Non-Functional Water Points (in m)",
ribside=c("bottom"))
<- adaptive.density(wp_f_ppp,
kde_wp_f_ad method = "kernel")
<- adaptive.density(wp_nf_ppp,
kde_wp_nf_ad method = "kernel")
par(mfrow=c(1,2))
plot(kde_wp_f_ad,
main = "Functional Water Points (in m)",
ribside=c("bottom"))
plot(kde_wp_nf_ad,
main = "Non-Functional Water Points (in m)",
ribside=c("bottom"))
4.2.1 Re-Scale KDE values
From m to km
<- rescale(wp_f_ppp, 1000, "km")
wp_f_ppp.km <- rescale(wp_nf_ppp, 1000, "km") wp_nf_ppp.km
<- density(wp_f_ppp.km,
kde_wp_f_scale sigma = bw.ppl,
edge = TRUE)
<- density(wp_f_ppp.km,
kde_wp_nf_scale sigma = bw.ppl,
edge = TRUE)
par(mfrow=c(1,2))
plot(kde_wp_f_scale,
main = "Functional Water Points (in km)",
ribside=c("bottom"))
plot(kde_wp_nf_scale,
main = "Non-Functional Water Points (in km)",
ribside=c("bottom"))
<- adaptive.density(wp_f_ppp.km,
kde_wp_f_scale_ad method = "kernel")
<- adaptive.density(wp_nf_ppp.km,
kde_wp_nf_scale_ad method = "kernel")
par(mfrow=c(1,2))
plot(kde_wp_f_scale_ad,
main = "Functional Water Points (in km)",
ribside=c("bottom"))
plot(kde_wp_nf_scale_ad,
main = "Non-Functional Water Points (in km)",
ribside=c("bottom"))
4.2.2 Convert KDE to Grid Object
<- as.SpatialGridDataFrame.im(kde_wp_f_scale)
grid_wp_f <- as.SpatialGridDataFrame.im(kde_wp_nf_scale)
grid_wp_nf
spplot(grid_wp_f)
spplot(grid_wp_nf)
<- as.SpatialGridDataFrame.im(kde_wp_f_scale_ad)
grid_wp_f_ad <- as.SpatialGridDataFrame.im(kde_wp_nf_scale_ad)
grid_wp_nf_ad
spplot(grid_wp_f_ad)
spplot(grid_wp_nf_ad)
4.2.3 Convert grid object to Raster
<- raster(grid_wp_f)
kde_wp_f_raster kde_wp_f_raster
class : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.8948485, 0.9616045 (x, y)
extent : 176.5032, 291.0438, 331.4347, 454.5201 (xmin, xmax, ymin, ymax)
crs : NA
source : memory
names : v
values : -3.56348e-16, 10.55944 (min, max)
<- raster(grid_wp_nf)
kde_wp_nf_raster kde_wp_nf_raster
class : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.8948485, 0.9616045 (x, y)
extent : 176.5032, 291.0438, 331.4347, 454.5201 (xmin, xmax, ymin, ymax)
crs : NA
source : memory
names : v
values : -3.56348e-16, 10.55944 (min, max)
4.2.4 Assign Projection System
projection(kde_wp_f_raster) <- CRS("+init=EPSG:26391 +datum:WGS84 +units=km")
kde_wp_f_raster
class : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.8948485, 0.9616045 (x, y)
extent : 176.5032, 291.0438, 331.4347, 454.5201 (xmin, xmax, ymin, ymax)
crs : +init=EPSG:26391 +datum:WGS84 +units=km
source : memory
names : v
values : -3.56348e-16, 10.55944 (min, max)
projection(kde_wp_nf_raster) <- CRS("+init=EPSG:26391 +datum:WGS84 +units=km")
kde_wp_nf_raster
class : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.8948485, 0.9616045 (x, y)
extent : 176.5032, 291.0438, 331.4347, 454.5201 (xmin, xmax, ymin, ymax)
crs : +init=EPSG:26391 +datum:WGS84 +units=km
source : memory
names : v
values : -3.56348e-16, 10.55944 (min, max)
4.2 Visualize KDE
Code
#tmap generation function
<- function(raster_obj, map_title) {
kde tmap_mode("view")
#tm_basemap("OpenStreetMap") +
tm_shape(raster_obj) +
tm_raster("v", alpha=0.9, palette = "YlGnBu") +
tm_view(set.bounds = c(4,7,5,8),
set.zoom.limits = c(8, 13))
}
kde(kde_wp_f_raster)
kde(kde_wp_nf_raster)
4.3 Nearest Neighbor Index
Clarke-Evans Test to check if distribution is random, clustered or dispersed with a 95% confidence interval
4.3.1 Test - Functional Water Points
The test hypotheses for Functional Water Point is :
H0 : The distribution is random
H1 : The distribution is not random
clarkevans.test(wp_f_ppp,
correction="none",
clipregion=NULL,
alternative=c("clustered"),
nsim=99)
Clark-Evans test
No edge correction
Monte Carlo test based on 99 simulations of CSR with fixed n
data: wp_f_ppp
R = 0.44265, p-value = 0.01
alternative hypothesis: clustered (R < 1)
4.3.2 Test - Non-Functional Water Points
The test hypotheses for Non-Functional Water Point is :
H0 : The distribution is random
H1 : The distribution is not random (clustered)
clarkevans.test(wp_nf_ppp,
correction="none",
clipregion=NULL,
alternative=c("clustered"),
nsim=99)
Clark-Evans test
No edge correction
Monte Carlo test based on 99 simulations of CSR with fixed n
data: wp_nf_ppp
R = 0.43223, p-value = 0.01
alternative hypothesis: clustered (R < 1)
4.3.3 Conclusion
As the p-value is less than 0.05 is statistically significant and R<1 for both functional and non-functional water points, we reject the null hypothesis that the distribution is not random and conclude that the water points are likely clustered.
5.0 2nd Order Spatial Point Pattern Analysis
Using G-Function for analysis with 95% confidence
5.1 Test 1 - Functional Water Points
The test hypotheses for Functional Water Point is :
H0 : The distribution is random
H1 : The distribution is not random
= Gest(wp_f_ppp, correction = "border")
G_wp_f plot(G_wp_f)
Complete Spatial Randomness Test
<- envelope(wp_f_ppp, Gest, nsim = 500) G_wp_f.csr
Generating 500 simulations of CSR ...
1, 2, 3, .5....10....15....20....25....30....35....40....45....50....55...
.60....65....70....75....80....85....90....95....100....105....110....115.
...120....125....130....135....140....145....150....155....160....165....170....
175....180....185....190....195....200....205....210....215....220....225....230..
..235....240....245....250....255....260....265....270....275....280....285....290
....295....300....305....310....315....320....325....330....335....340....345...
.350....355....360....365....370....375....380....385....390....395....400....405.
...410....415....420....425....430....435....440....445....450....455....460....
465....470....475....480....485....490....495.... 500.
Done.
plot(G_wp_f.csr, main="G test - Functional")
5.1.1 Conclusion
We reject the null hypothesis that functional water points are randomly distributed at 99% confidence interval. As G^obs(r) rises quickly and is greater than Gtheo(r) and the envelope it means the functional water points are clustered.
5.2 Test 2 - Non-functional Water Points
The test hypotheses for Non-Functional Water Point is :
H0 : The distribution is random
H1 : The distribution is not random
= Gest(wp_nf_ppp, correction = "border")
G_wp_nf plot(G_wp_nf)
Complete Spatial Randomness Test
<- envelope(wp_nf_ppp, Gest, nsim = 500) G_wp_nf.csr
Generating 500 simulations of CSR ...
1, 2, 3, .5....10....15....20....25....30....35....40....45....50....55...
.60....65....70....75....80....85....90....95....100....105....110....115.
...120....125....130....135....140....145....150....155....160....165....170....
175....180....185....190....195....200....205....210....215....220....225....230..
..235....240....245....250....255....260....265....270....275....280....285....290
....295....300....305....310....315....320....325....330....335....340....345...
.350....355....360....365....370....375....380....385....390....395....400....405.
...410....415....420....425....430....435....440....445....450....455....460....
465....470....475....480....485....490....495.... 500.
Done.
plot(G_wp_nf.csr, main = "G test - Non-functional")
5.2.1 Conclusion
We reject the null hypothesis that non-functional water points are randomly distributed at 99% confidence interval. As G^obs(r) rises quickly and is greater than Gtheo(r) and the envelope it means the functional water points are clustered.
5.3 Extra Tests
We see the same observation as mentioned above after zooming into a few neighborhoods such as Ife Central for functional water ponits and Ede North for non-functional water points.
5.3.1 Data Processing
#Functional water points
<- osun[osun$ADM2_EN == "Ife Central",] %>%
IfeCentral as('Spatial') %>%
as('SpatialPolygons') %>%
as('owin')
#non functional points
<- osun[osun$ADM2_EN == "Ede North",] %>%
EdeNorth as('Spatial') %>%
as('SpatialPolygons') %>%
as('owin')
<- wp_f_ppp[IfeCentral]
IfeCentral_ppp <- wp_nf_ppp[EdeNorth] EdeNorth_ppp
View
Code
par(mfrow=c(1,2))
plot(IfeCentral_ppp,
main = "Ife Central - Functional Water Points",
ribside=c("bottom"))
plot(EdeNorth_ppp,
main = "Ede North - Non-Functional Water Points",
ribside=c("bottom"))
5.3.2 Ife Central - Functional Water Points
= Gest(IfeCentral_ppp, correction = "border")
Ifecentral_G.csr plot(Ifecentral_G.csr)
Complete Spatial Randomness Test
<- envelope(IfeCentral_ppp, Gest, nsim=100) Ifecentral_G.csr
Generating 100 simulations of CSR ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100.
Done.
plot(Ifecentral_G.csr, main="Ife Central G-Function Test")
5.3.3 Ede North - Non-Functional Water Points
= Gest(EdeNorth_ppp, correction = "border")
EdeNorth_G.csr plot(EdeNorth_G.csr)
Complete Spatial Randomness Test
<- envelope(EdeNorth_ppp, Gest, nsim=100) EdeNorth_G.csr
Generating 100 simulations of CSR ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100.
Done.
plot(EdeNorth_G.csr, main="Ede North G-Function Test")
5.3.4 Conclusion
We observe the same that G^obs(r) rises quickly and is greater than Gtheo(r) and the envelope it means the functional and non-functional water points are clustered, thereby rejecting the null hypothesis that the water points are randomly distributed at 99% confidence interval.
6.0 Spatial Correlation Analysis
6.1 Density Analysis
Segregate data for plotting. Then convert to spatstat ppp format
<- as_Spatial(wp_sf)
wp_spat @data$status_clean <-as.factor(wp_spat@data$status_clean)
wp_spat<- as(wp_spat, "ppp")
wp_spat_ppp
plot(wp_spat_ppp, which.marks = "status_clean")
6.3 Test Correlation
Use LCross function to test for spatial independence
The test hypotheses for the spatial distribution of functional and non-functional water point are:
H0 : The distribution is spatially independent
H1 : The distribution is not spatially independent
Simulation
<- envelope(wp_spat_ppp,
wp_spat_ppp_Lcross.csr
Lcross, i="Functional",
j="Non-Functional",
correction="border",
nsim=100)
Generating 100 simulations of CSR ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100.
Done.
Plot
plot(wp_spat_ppp_Lcross.csr, xlab="in km")
Conclusion
We reject the null hypothesis that water points are statistically independent at 99% confidence interval. As L^obs(r) is greater than Ltheo(r) and the envelope it means the water points are statistically not independent.
6.4 Local Colocation Analysis
Remove status ‘unknown’ water points
<- wp_sf %>% filter(!status_clean=='unknown') wp_sf_clean
#6 nearbest neighbors
= include_self(st_knn(st_geometry(wp_sf_clean), 6))
nb
#weight matrix
= st_kernel_weights(nb, wp_sf_clean, "gaussian", adaptive = TRUE)
wt
= wp_sf_clean %>%
f filter(status_clean == "Functional")
= f$status_clean
A
= wp_sf_clean %>%
nf filter(status_clean == "Non-Functional")
= nf$status_clean
B
= local_colocation(A, B, nb, wt, 42)
LCLQ = cbind(wp_sf_clean, LCLQ) LCLQ_wp
<- LCLQ_wp %>%
LCLQ_wp mutate(
`p_sim_Non.Functional` = replace(`p_sim_Non.Functional`, `p_sim_Non.Functional` > 0.05, NA),
`Non.Functional` = ifelse(`p_sim_Non.Functional` > 0.05, NA, `Non.Functional`))
<- LCLQ_wp %>% mutate(`size` = ifelse(is.na(`Non.Functional`), 1, 5)) LCLQ_wp
View
Code
tmap_mode('view')
tm_view(set.bounds = c(4,7,5,8),
set.zoom.limits=c(8, 13),
bbox = st_bbox(filter(LCLQ_wp, !is.na(`Non.Functional`)))) +
tm_shape(osun) +
tm_borders() +
tm_shape(LCLQ_wp) +
tm_dots(col = c("Non.Functional"),
size = "size",
scale=0.15,
border.col = "black",
border.lwd = 0.5,
alpha=0.5,
title="LCLQ"
)
Code
tmap_mode('plot')
LCLQ is just less than 1 which is much greater than the p-value of 0.05. Hence it means there are few Non-Functional Water Points in the same neighborhood as Functional Water Points.
7.0 Acknowledgements
I’d like to thank Professor Kam for his insights and resource materials provided under IS415 Geospatial Analytics and Applications.