After step 2 has produced a single dataframe holding both the spatial data and the data-of-interest, we can manipulate this dataframe before plotting it on a map and during exploratory data-visualisation. There are two types of data-manipulation:

  1. spatial data-manipulations: spatial cropping, union, aggregation, etc.
  2. ‘regular’ data-manipulations: combine categories, calculate percentage of population, etc.

… but crucially, in R both types of manipulation use the same (tidyverse) syntax and are done within one dataframe, which encourages exploring spatial dimensions together with other data-dimensions (trends, quantities, time-dimensions, etc.).
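As a minimal sketch of this idea, the snippet below mixes a 'regular' and a spatial manipulation in one pipeline. It does not use the 1851 census files: it assumes the North Carolina example shapefile that ships with the sf package, and the new column names (sid_rate, area_km2) are made up for illustration.

```r
library(sf)
library(dplyr)

# Example data shipped with sf (North Carolina counties), NOT the 1851 census data
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

nc_explored <- nc %>%
  # 'regular' manipulation: SIDS cases per 1,000 births in 1974
  mutate(sid_rate = 1000 * SID74 / BIR74) %>%
  # spatial manipulation: area of each county, computed from its geometry
  mutate(area_km2 = as.numeric(st_area(geometry)) / 1e6) %>%
  # the geometry column is kept automatically when selecting columns
  select(NAME, sid_rate, area_km2)

head(nc_explored)
```

Both mutate() calls operate on the same dataframe, so the derived rate and the derived area can be explored or mapped side by side.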

FYI, if you are (currently) more familiar with SAS, Stata, Excel, etc., you can also do the ‘regular’ data-manipulations there instead of in R, before reading and joining the data (previous step).

# (1) load spatial data
raillines <- st_read(here('data/source/census_1851_raillines/1851EngWalesScotRail_Lines.shp'))

districts_spatial <- st_read(here('data/source/census_1851_districts/1851EngWalesRegistrationDistrict.shp')) %>%
  mutate(CEN1 = as.numeric(as.character(CEN1))) # make sure identifiers are the same type

# (2) load and add data-of-interest
districts_data <- read_excel(here('data/census1851_districts_count.xlsx'))
districts <- left_join(districts_spatial, districts_data, by = c('CEN1' = 'district_id'))

Data-manipulation on joint dataset

# data-manipulations such as calculating percentages still work on the single, joined dataset
#  e.g. share of those employed in professions (a proportion between 0 and 1)
districts <- districts %>%
  mutate(pct_prof = tertiary_services_professions / total)
# Nearly 50% employed in professions in central districts of London
mapview(districts, zcol = 'pct_prof')

Spatial manipulation on joint dataset

Data-exploration: what is R_DIV variable?

# get a tally of the R_DIV variable
divisions <- districts %>%
  group_by(R_DIV) %>%
  tally()
# 11 divisions in the spatial data
divisions
## Simple feature collection with 11 features and 2 fields
## geometry type:  GEOMETRY
## dimension:      XY
## bbox:           xmin: 87019.07 ymin: 7067.26 xmax: 655747.5 ymax: 657473.5
## epsg (SRID):    NA
## proj4string:    +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +datum=OSGB36 +units=m +no_defs
## # A tibble: 11 x 3
##    R_DIV           n                                               geometry
##  * <fct>       <int>                                         <GEOMETRY [m]>
##  1 EASTERN       153 MULTIPOLYGON (((576079.7 182724.1, 575881.1 182595.6,…
##  2 LONDON         50 MULTIPOLYGON (((545304.4 182108.8, 545328.8 182054.2,…
##  3 NORTH MIDL…    81 POLYGON ((538173.2 334349.2, 538357.7 334297.6, 53888…
##  4 NORTH WEST…    49 MULTIPOLYGON (((318709.4 387773.5, 318669.2 387723.4,…
##  5 NORTHERN       64 MULTIPOLYGON (((342348.8 478405.1, 342385.8 478359.1,…
##  6 SOUTH EAST…   196 MULTIPOLYGON (((448535.6 96647.07, 448662.4 96607.89,…
##  7 SOUTH MIDL…   111 MULTIPOLYGON (((516697 173206.1, 516698.8 173165.7, 5…
##  8 SOUTH WEST…   146 MULTIPOLYGON (((87825.76 8836.771, 87870.98 8816.206,…
##  9 WELSH          91 MULTIPOLYGON (((322121.3 165180.4, 322145.6 165022.8,…
## 10 WEST MIDLA…   158 MULTIPOLYGON (((359975 172330.1, 359944 172223.4, 359…
## 11 YORKSHIRE      99 MULTIPOLYGON (((446011.8 382146.7, 446017.4 382109, 4…

Notice: the group-and-tally step has not only aggregated the data by counting districts per division, it has also merged those districts into new geometry-objects at the division-level. These are new spatial boundaries, not present in the original data, demonstrating the interchangeability of spatial and ‘regular’ data-operations.
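The merging behaviour can be checked on a small scale. The sketch below uses made-up toy data (a 2 x 2 grid of unit squares standing in for districts, with hypothetical division labels "A" and "B") and an explicit summarise(n = n()), which should behave like the tally used on the census data: sf's summarise() method unions the geometries within each group by default.

```r
library(sf)
library(dplyr)

# Toy data, made up for illustration: a 2 x 2 grid of unit squares ('districts')
cells <- st_make_grid(
  st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 2, ymax = 2))),
  n = c(2, 2)
)
toy_districts <- st_sf(division = c("A", "A", "B", "B"), geometry = cells)

# Group and count: summarise() on an sf object also unions the geometries per group
toy_divisions <- toy_districts %>%
  group_by(division) %>%
  summarise(n = n())

nrow(toy_districts)     # 4 district polygons going in
nrow(toy_divisions)     # 2 division polygons coming out
st_area(toy_divisions)  # each merged geometry covers two unit squares
```

If only the counts are needed and not the merged boundaries, dropping the geometry first with st_drop_geometry() avoids the (potentially slow) union step.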

# view the newly created/aggregated spatial division boundaries
mapview(divisions)

Spatial / data-manipulation: select districts in Manchester-region

Hovering over the interactive map above, we can see that the division around the Manchester area is called “North Western”. This information allows us to filter the districts down to those situated in that division.

nwestern <- districts %>%
  filter(R_DIV == 'NORTH WESTERN')

Notice that a ‘regular’ data-operation such as filter() works just as well for subsetting spatial data as for subsetting ‘regular’ data.
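A quick way to convince yourself of this is to check what comes out of filter(). The sketch below again assumes the North Carolina example shapefile shipped with sf rather than the census data; the county name "Ashe" is simply one of the values in its NAME column.

```r
library(sf)
library(dplyr)

# Example data shipped with sf (North Carolina counties), NOT the 1851 census data
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# An ordinary dplyr filter on an attribute column
ashe <- nc %>%
  filter(NAME == "Ashe")

inherits(ashe, "sf")  # the result is still a spatial dataframe
nrow(ashe)            # only the matching row(s) remain
st_bbox(ashe)         # the bounding box now covers only the selected county
```

Because the geometry column travels with the filtered rows, the result can be passed straight to qtm() or mapview().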

# static plot of Manchester districts
qtm(nwestern, fill = 'pct_secondary')

# interactive plot of Manchester districts
mapview(nwestern, zcol = 'pct_secondary') + raillines

Export interactive map for sharing

map_manchester <- mapview(nwestern, zcol = 'pct_secondary') + raillines
mapshot(map_manchester, url = here('output/map_manchester.html'))

The generated HTML-file ‘map_manchester.html’ and its supporting files (folder ‘map_manchester_files’) are in the folder ‘output’.

Summary: load, manipulate, interactively visualise, and share in 7 lines

The exported HTML-file of the interactive mapview()-generated map can be shared with collaborators, put on a project-site, used during a presentation, etc. to further explore or demonstrate the spatial data in context.

# (1) load spatial data
raillines <- st_read(here('data/source/census_1851_raillines/1851EngWalesScotRail_Lines.shp'))
districts_spatial <- st_read(here('data/source/census_1851_districts/1851EngWalesRegistrationDistrict.shp')) %>%
  mutate(CEN1 = as.numeric(as.character(CEN1)))

# (2) load and add data-of-interest
districts_data <- read_excel(here('data/census1851_districts_count.xlsx'))
districts <- left_join(districts_spatial, districts_data, by = c('CEN1' = 'district_id'))

# (3) manipulate data (select Manchester area)
nwestern <- districts %>% filter(R_DIV == 'NORTH WESTERN')

# export & share with collaborators, etc.
map_manchester <- mapview(nwestern, zcol = 'pct_secondary') + raillines
mapshot(map_manchester, url = here('output/map_manchester.html'))