Having obtained (in step 2) a single dataframe that holds both the spatial data and the data-of-interest, we can manipulate this dataframe before plotting it on a map and during exploratory data visualisation. There are two types of data manipulation:
… but crucially, in R these manipulations share the same (tidyverse) syntax and are carried out within one dataframe, which encourages exploring the spatial dimensions together with the other dimensions of the data (trends, quantities, time, etc.).
FYI: if you are (currently) more familiar with SAS, Stata, Excel, etc., you can also do the ‘regular’ data manipulations there instead of in R, as long as you do so before reading and joining the data (previous step).
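# packages used below (skip if already loaded in an earlier step)
library(sf)      # st_read
library(dplyr)   # %>%, mutate, left_join
library(readxl)  # read_excel
library(here)    # here
library(mapview) # mapview
# (1) load spatial data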
raillines <- st_read(here('data/source/census_1851_raillines/1851EngWalesScotRail_Lines.shp'))
districts_spatial <- st_read(here('data/source/census_1851_districts/1851EngWalesRegistrationDistrict.shp')) %>%
  mutate(CEN1 = as.numeric(as.character(CEN1))) # make sure identifiers are the same type
# (2) load and add data-of-interest
districts_data <- read_excel(here('data/census1851_districts_count.xlsx'))
districts <- left_join(districts_spatial, districts_data, by = c('CEN1' = 'district_id'))
# data manipulations such as calculating percentages still work on the single, joined dataset
# e.g. the share of those employed in professions
districts <- districts %>%
  mutate(pct_prof = tertiary_services_professions / total) # stored as a proportion (0-1), so 0.5 means 50%
# Nearly 50% employed in professions in central districts of London
mapview(districts, zcol = 'pct_prof')
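
To illustrate how a ‘regular’ and a spatial manipulation can sit side by side in the same tidyverse pipeline, here is a minimal sketch on made-up data. Everything in it is hypothetical and only for illustration: the three toy squares, the `region` column, and the names `make_square`, `toy_districts`, `toy_regions` and `share_prof` do not come from the census data used above. The sketch relies on `summarise()` merging the geometries within each group when it is applied to an `sf` object.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a 1 x 1 square with its lower-left corner at (x0, y0)
make_square <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Hypothetical toy 'districts': two adjacent squares in the north, one in the south
toy_districts <- st_sf(
  district_id = c(1, 2, 3),
  region      = c("north", "north", "south"),
  professions = c(10, 30, 50),
  total       = c(100, 100, 100),
  geometry    = st_sfc(make_square(0, 0), make_square(1, 0), make_square(0, 2))
)

# 'Regular' manipulation: a share per district, exactly as on a plain dataframe
toy_districts <- toy_districts %>%
  mutate(share_prof = professions / total)

# Spatial manipulation in the same syntax: aggregate districts to regions.
# On an sf object, summarise() also unions the geometries within each group.
toy_regions <- toy_districts %>%
  group_by(region) %>%
  summarise(professions = sum(professions), total = sum(total)) %>%
  mutate(
    share_prof = professions / total,
    area       = as.numeric(st_area(geometry)) # planar area, no CRS set
  )

print(toy_regions)
```

The resulting `toy_regions` is itself an `sf` dataframe, so it can be passed to `mapview(toy_regions, zcol = 'share_prof')` in the same way as `districts` above.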