Once step 2 has produced a single dataframe holding both the spatial data and the data-of-interest, we can manipulate that dataframe before plotting it on a map and during exploratory data-visualisation. There are two types of data-manipulation:

spatial data-manipulations: spatial cropping, union, aggregation, etc.
‘regular’ data-manipulations: combine categories, calculate percentage of population, etc.

… but crucially, in R both kinds of manipulation use the same (tidyverse) syntax and are done within one dataframe, which encourages exploring spatial dimensions together with other data-dimensions (trends, quantities, time-dimensions, etc.).
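As a minimal sketch of this idea, the pipeline below chains a spatial operation (st_crop()) and a 'regular' one (mutate()) on the same object. The toy squares, the column names and the crop window are hypothetical and only serve as illustration; they are not part of the census data.

```r
library(sf)
library(dplyr)

# Hypothetical toy data: three unit squares side by side (British National Grid)
unit_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}
toy <- st_sf(
  name = c("a", "b", "c"),
  geometry = st_sfc(unit_square(0), unit_square(1), unit_square(2), crs = 27700)
)

# Crop window: keeps square "a" whole, half of "b", and nothing of "c"
window <- st_bbox(c(xmin = -1, ymin = -1, xmax = 1.5, ymax = 2), crs = st_crs(toy))

# Spatial manipulation and 'regular' manipulation in one pipeline
cropped <- toy %>%
  st_crop(window) %>%
  mutate(area_m2 = as.numeric(st_area(geometry)))

cropped
```

st_crop() may warn that attributes are assumed to be spatially constant; for this illustration the warning can be ignored.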

FYI, instead of doing the ‘regular’ data-manipulations in R, you can also do them in SAS, Stata, Excel, etc. if you are (currently) more familiar with those, as long as this happens before reading and joining the data (previous step).

library(here)
library(sf)
library(tmap)
library(dplyr)
library(readxl)
library(mapview)

# (1) load spatial data
raillines <- st_read(here('data/source/census_1851_raillines/1851EngWalesScotRail_Lines.shp'))

districts_spatial <- st_read(here('data/source/census_1851_districts/1851EngWalesRegistrationDistrict.shp')) %>%
  mutate(CEN1 = as.numeric(as.character(CEN1))) # make sure identifiers are the same type

# (2) load and add data-of-interest
districts_data <- read_excel(here('data/census1851_districts_count.xlsx'))
districts <- left_join(districts_spatial, districts_data, by = c('CEN1' = 'district_id'))
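The as.numeric(as.character(...)) conversion above matters when the identifier is read in as a factor: calling as.numeric() directly on a factor returns the internal level codes, not the values. A minimal sketch with hypothetical toy data (the names shapes, counts, id and total are made up for illustration):

```r
library(sf)
library(dplyr)

# Hypothetical toy data: three unit squares with factor identifiers
unit_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}
shapes <- st_sf(
  id = factor(c("10", "20", "30")),
  geometry = st_sfc(unit_square(0), unit_square(1), unit_square(2))
)

# Hypothetical data-of-interest with numeric identifiers; id 30 is missing
counts <- data.frame(district_id = c(10, 20), total = c(500, 800))

codes  <- as.numeric(shapes$id)                # level codes, not the identifiers
values <- as.numeric(as.character(shapes$id))  # the identifiers themselves

# Convert first, then join; the result is still an sf object
joined <- shapes %>%
  mutate(id = as.numeric(as.character(id))) %>%
  left_join(counts, by = c("id" = "district_id"))

joined
```

Because this is a left join, every shape is kept; shapes without a match in the data-of-interest get NA in the joined columns.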

Data-manipulation on joint dataset

# data-manipulations such as calculating percentages still work on the single, joint dataset
#  e.g. share of those employed in professions (stored as a fraction between 0 and 1)
districts <- districts %>%
  mutate(pct_prof = tertiary_services_professions / total)

# Nearly 50% employed in professions in central districts of London
mapview(districts, zcol = 'pct_prof')
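Note that the division above yields a fraction, so "nearly 50%" shows up as roughly 0.5 in the map legend. If a 0-100 scale is preferred, one possible variant is sketched below; the toy counts and the guard against a zero total are assumptions for illustration, not part of the original analysis.

```r
library(dplyr)

# Hypothetical counts; district "C" has a total of zero
counts <- data.frame(
  district    = c("A", "B", "C"),
  professions = c(120, 45, 0),
  total       = c(400, 900, 0)
)

counts <- counts %>%
  mutate(
    # fraction between 0 and 1; NA where the total is zero
    share_prof = if_else(total > 0, professions / total, NA_real_),
    # the same value on a 0-100 scale, rounded to one decimal
    pct_prof   = round(100 * share_prof, 1)
  )

counts
```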

Spatial manipulation on joint dataset

Data-exploration: what is R_DIV variable?

# get a tally of the R_DIV variable
divisions <- districts %>%
  group_by(R_DIV) %>%
  tally()

# 11 divisions in the spatial data
divisions

## Simple feature collection with 11 features and 2 fields
## geometry type:  GEOMETRY
## dimension:      XY
## bbox:           xmin: 87019.07 ymin: 7067.26 xmax: 655747.5 ymax: 657473.5
## epsg (SRID):    NA
## proj4string:    +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +datum=OSGB36 +units=m +no_defs
## # A tibble: 11 x 3
##    R_DIV           n                                               geometry
##  * <fct>       <int>                                         <GEOMETRY [m]>
##  1 EASTERN       153 MULTIPOLYGON (((576079.7 182724.1, 575881.1 182595.6,…
##  2 LONDON         50 MULTIPOLYGON (((545304.4 182108.8, 545328.8 182054.2,…
##  3 NORTH MIDL…    81 POLYGON ((538173.2 334349.2, 538357.7 334297.6, 53888…
##  4 NORTH WEST…    49 MULTIPOLYGON (((318709.4 387773.5, 318669.2 387723.4,…
##  5 NORTHERN       64 MULTIPOLYGON (((342348.8 478405.1, 342385.8 478359.1,…
##  6 SOUTH EAST…   196 MULTIPOLYGON (((448535.6 96647.07, 448662.4 96607.89,…
##  7 SOUTH MIDL…   111 MULTIPOLYGON (((516697 173206.1, 516698.8 173165.7, 5…
##  8 SOUTH WEST…   146 MULTIPOLYGON (((87825.76 8836.771, 87870.98 8816.206,…
##  9 WELSH          91 MULTIPOLYGON (((322121.3 165180.4, 322145.6 165022.8,…
## 10 WEST MIDLA…   158 MULTIPOLYGON (((359975 172330.1, 359944 172223.4, 359…
## 11 YORKSHIRE      99 MULTIPOLYGON (((446011.8 382146.7, 446017.4 382109, 4…

Notice: the group-and-tally step has not only aggregated the data by counting districts per division, it has also merged those districts into new geometry-objects at the division-level. These are new spatial boundaries, not present in the original data, demonstrating the interchangeability of spatial and ‘regular’ data-operations.

# view the newly created/aggregated spatial division boundaries
mapview(divisions)
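The merging of geometries during aggregation can be reproduced on a small scale. The sketch below uses hypothetical toy squares and writes the tally as an explicit summarise(n = n()); for sf objects, summarise() unions the geometries within each group by default.

```r
library(sf)
library(dplyr)

# Hypothetical toy data: two adjacent squares in "WEST", one separate square in "EAST"
unit_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}
toy <- st_sf(
  division = c("WEST", "WEST", "EAST"),
  geometry = st_sfc(unit_square(0), unit_square(1), unit_square(5))
)

# Count features per group; the geometries of each group are merged as a side effect
by_division <- toy %>%
  group_by(division) %>%
  summarise(n = n())

by_division

# Area per merged feature (no CRS is set, so these are plain numbers)
merged_area <- as.numeric(st_area(by_division))
```

The two adjacent "WEST" squares end up as a single feature with twice the area, which is the same mechanism that turns the registration districts into division boundaries.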

Spatial / data-manipulation: select districts in Manchester-region

Hovering over the interactive map above, we can see that the division around the Manchester-area is called “North Western”. This information allows us to filter the districts down to those situated in that division.

nwestern <- districts %>%
  filter(R_DIV == 'NORTH WESTERN')

Notice that a ‘regular’ data-operation such as filter() works just as well for subsetting spatial data as for subsetting ‘regular’ data.
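A small sketch of what this means in practice, using hypothetical toy squares: the filtered result is still an sf object, its geometry and bounding box follow the selected rows, and st_drop_geometry() returns a plain table if the geometry is no longer needed.

```r
library(sf)
library(dplyr)

# Hypothetical toy data: two squares in one division, one square in another
unit_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}
toy <- st_sf(
  division = c("NORTH WESTERN", "NORTH WESTERN", "LONDON"),
  geometry = st_sfc(unit_square(0), unit_square(1), unit_square(5))
)

# Attribute filter: rows and their geometries are kept together
nw <- toy %>%
  filter(division == "NORTH WESTERN")

class(nw)    # still includes "sf"
st_bbox(nw)  # the bounding box now covers only the selected squares

# Plain table without the geometry column
nw_table <- st_drop_geometry(nw)
```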

# static plot of Manchester districts
qtm(nwestern, fill = 'pct_secondary')

# interactive plot of Manchester districts
mapview(nwestern, zcol = 'pct_secondary') + raillines

Export interactive map for sharing

map_manchester <- mapview(nwestern, zcol = 'pct_secondary') + raillines
mapshot(map_manchester, url = here('output/map_manchester.html'))

The generated HTML-file ‘map_manchester.html’ and its supporting files (folder ‘map_manchester_files’) are in the folder ‘output’.

Summary: load, manipulate, interactively visualise, and share in 7 lines

The exported HTML-file of the interactive mapview()-generated map can be shared with collaborators, put on a project-site, used during a presentation, etc. to further explore or demonstrate the spatial data in context.

# (1) load spatial data
raillines <- st_read(here('data/source/census_1851_raillines/1851EngWalesScotRail_Lines.shp'))
districts_spatial <- st_read(here('data/source/census_1851_districts/1851EngWalesRegistrationDistrict.shp')) %>%
  mutate(CEN1 = as.numeric(as.character(CEN1)))

# (2) load and add data-of-interest
districts_data <- read_excel(here('data/census1851_districts_count.xlsx'))
districts <- left_join(districts_spatial, districts_data, by = c('CEN1' = 'district_id'))

# (3) manipulate data (select Manchester area)
nwestern <- districts %>% filter(R_DIV == 'NORTH WESTERN')

# export & share with collaborators, etc.
map_manchester <- mapview(nwestern, zcol = 'pct_secondary') + raillines
mapshot(map_manchester, url = here('output/map_manchester.html'))
