After step 2 has produced a single dataframe that holds both the spatial data and the data-of-interest, we can manipulate this dataframe before plotting it on a map and while exploring the data visually. There are two types of data-manipulation:
… but crucially, in R both types use the same (tidyverse) syntax and work on one dataframe, which encourages exploring the spatial dimension together with other data-dimensions (trends, quantities, time-dimensions, etc.).
FYI: if you are (currently) more familiar with SAS, Stata, Excel, etc., you can also do the ‘regular’ data-manipulations there instead of in R, before reading and joining the data (previous step).
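To make the "same syntax, one dataframe" point concrete, here is a minimal sketch on made-up data. The three square "districts", their counts, and the names `make_square`, `toy`, `share_prof` and `area` are all hypothetical and only serve as illustration; they are not part of the census data used below.

```r
library(sf)
library(dplyr)

# Hypothetical helper: build a square polygon with lower-left corner (x0, y0)
make_square <- function(x0, y0, size = 1) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + size, y0), c(x0 + size, y0 + size),
    c(x0, y0 + size), c(x0, y0)
  )))
}

# Toy dataframe with made-up counts and a geometry column (no CRS set)
toy <- st_sf(
  district_id = 1:3,
  total       = c(1000, 2000, 500),
  professions = c(100, 900, 50),
  geometry    = st_sfc(make_square(0, 0), make_square(1, 0), make_square(2, 0, size = 2))
)

# One pipeline, two kinds of manipulation
toy_result <- toy %>%
  mutate(
    share_prof = professions / total,          # 'regular' manipulation
    area       = as.numeric(st_area(geometry)) # spatial manipulation
  ) %>%
  filter(share_prof > 0.2)                     # 'regular' filter; geometry is kept
```

The result is still a spatial dataframe, so it can be passed straight on to a plotting function.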
library(here)
library(sf)
library(tmap)
library(dplyr)
library(readxl)
library(mapview)
# (1) load spatial data
raillines <- st_read(here('data/source/census_1851_raillines/1851EngWalesScotRail_Lines.shp'))
districts_spatial <- st_read(here('data/source/census_1851_districts/1851EngWalesRegistrationDistrict.shp')) %>%
  mutate(CEN1 = as.numeric(as.character(CEN1))) # make sure identifiers are the same type
# (2) load and add data-of-interest
districts_data <- read_excel(here('data/census1851_districts_count.xlsx'))
districts <- left_join(districts_spatial, districts_data, by = c('CEN1' = 'district_id'))
# data-manipulations such as calculating percentages still work on the single, joined dataset
# e.g. share of those employed in professions (stored as a proportion between 0 and 1)
districts <- districts %>%
  mutate(pct_prof = tertiary_services_professions / total)
# Nearly 50% (pct_prof close to 0.5) employed in professions in central districts of London
mapview(districts, zcol = 'pct_prof')
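Because the join relies on the identifiers matching in both type and value, it can be worth checking which spatial units found no match before reading too much into the map. With `left_join`, unmatched rows get `NA` in the joined columns. The sketch below uses made-up toy data: `spatial_toy`, `data_toy`, `joined_toy` and `unmatched` are hypothetical names, and only the column names `CEN1`, `district_id` and `total` are borrowed from the code above.

```r
library(sf)
library(dplyr)

# Toy spatial data: identifiers stored as a factor (assumed here for illustration)
spatial_toy <- st_sf(
  CEN1     = factor(c("1", "2", "3")),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 0)), st_point(c(2, 0)))
) %>%
  mutate(CEN1 = as.numeric(as.character(CEN1))) # same conversion as above

# Toy data-of-interest: district 3 is deliberately missing
data_toy <- data.frame(district_id = c(1, 2), total = c(1000, 2000))

joined_toy <- left_join(spatial_toy, data_toy, by = c("CEN1" = "district_id"))

# Spatial units without a match have NA in the joined columns
unmatched <- joined_toy %>%
  st_drop_geometry() %>%
  filter(is.na(total))
```

Here `unmatched` holds the single row for district 3. On the real data, a non-empty result would point to districts whose derived values, such as `pct_prof`, are missing.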