PAF 515: Data Science III Project Management

Lab 03: Infographics & Mapping Tools for Data Presentation - Middle Atlantic Division

Lecture Video



Introduction

As project managers, data experts, and program evaluators, our ability to convey information to stakeholders in a succinct and easy-to-understand format is critical. While formal written reports are sometimes needed to provide in-depth details about data and projects, concise communication tools are typically the most useful. In these instances, information experts must devise a quick and effective way to share their findings with decision makers and support the success of a project or policy (Otten, Cheng, & Drewnowski, 2015).

This often difficult process of selecting methodologies to convey technical knowledge to a non-technical audience is not a new dilemma; humans have been strategizing the best way to share complex data points for centuries, starting with cave drawings. For nearly 400 years, this study of data visualization has become embedded in academia and industry, spanning numerous fields and professions (Mackinlay & Winslow, n.d.).

In 1644, an early version of a line graph was devised by Michael Florent van Langren, an astronomer and cartographer from the Netherlands who was employed by the Spanish courts. This graphic is now considered to be the first statistical visualization (Friendly, 2006).

In 1786, William Playfair, a Scottish engineer, published one of the first books on data visualization, Commercial and Political Atlas, to present trade-related data in England. In the book and his following work, Playfair introduced numerous ways to display data. Many of his strategies, including the Pie Chart, remain prevalent today and Playfair is often considered to be the “father of statistical presentation” (Mackinlay & Winslow, n.d.).

Subsequently in the 1800s, statistical visualizations increased in popularity and usage ranging from the epidemiological visualizations by British physician Dr. Jon Snow in 1855 and British statistician and founder of modern nursing, Florence Nightingale in 1856, the historical representation of Napoleon’s loss of 1812 by French civil engineer Charles Minard in 1869, and the demographic work by American sociologist W.E.B. Du Bois in the late 1800s-early 1900s (Mackinlay & Winslow, n.d., Andrews, 2022, Mansky, 2018).

From the 1900s to the present, researchers across disciplines have continued examining the best ways to present specialized knowledge to general audiences. Experts such as Edward Tufte, Stephanie Evergreen, Joel Best, Cole Nussbaumer Knaflic, David McCandless, Nathan Yau, Talithia Williams, members of the Data Visualization Society and others have dedicated their careers to data visualization and literacy and provide great insights on how to utilize modern technologies to make complex topics easier to understand with numbers and visuals.

One type of data visualization that has particularly grown in popularity over the past few years are infographics. Albers (2014) defines the term infographic as “a web-based image that takes a large amount of information in text or numerical form and condenses it into a combination of images and text with a goal of making the information presentable and digestible to an audience”. Further, “[w]hile not appropriate or useful for all types of content, infographics provide context by using visuals to show relationships in data, anatomy, hierarchy, chronology, and geography. Communicating relationships are at the heart of communicating complex information and infographics excel at communicating that aspect.” (Albers, 2015). The primary types of infographics are data graphics, maps, and diagrams (Otten, Cheng, & Drewnowski, 2015).

To see infographics in action, check out the following video on the 100 People Project as a great example of how data communicators can transform really large amounts of data into simple visuals:



In addition to pictorial infographics, well-designed maps are an excellent visualization tool to present geographic project data. Maps are particularly critical to helping stakeholders identify what areas of their target region are most in need.

The video below provides an introduction to the usage of choropleth maps for this purpose:



This week we will be combining the powers of the R package ggplot2 with patchwork and ggiraph to create our infographics and interactive map.

If you would like an overview of these packages, you can view the following videos:



https://www.infoworld.com/video/110220/how-to-create-interactive-ggplot-graphics-with-the-ggiraph-r-package

Also, to read further about the history of data visualization as well as the specific data visualization types mentioned in the videos above check out the links here:

  • Infographics And Public Policy: Using Data Visualization To Convey Complex Information by Jennifer J. Otten, Karen Cheng, and Adam Drewnowski (2015) download
  • Designing Great Visualizations Whitepaper by Jock D. Mackinlay with Kevin Winslow (n.d.) download
  • A Brief History of Data Visualization by Michael Friendly (2006): download
  • How Florence Nightingale Changed Data Visualization Forever by RJ Andrews (2022): download
  • W.E.B. Du Bois’ Visionary Infographics Come Together for the First Time in Full Color by Jackie Mansky (2018): download
  • Infographics: Horrid Chartjunk or Quality Communication by Michael J. Albers (2014): download
  • Infographics and Communicating Complex Information by Michael J. Albers (2015): download
  • Visualizations That Really Work by Scott Berinato (2016): download
  • Pictorial Fraction Charts: https://datavizproject.com/data-type/fraction-of-pictograms/
  • Waffle Charts: https://datavizproject.com/data-type/percentage-grid/
  • Choropleth Maps: https://datavizproject.com/data-type/choropleth-map-2/
  • Mapping 2020 US Census Data in R by Kyle Walker (creator of tigris & tidycensus!) (2022) : https://walker-data.com/umich-workshop-2022/mapping-census-data/#1

  • Now that we’ve explored pictorial waffle chart infographics and choropleth maps as tools for project data presentation, in this lab we will practice creating our own visuals to analyze our CDC Social Vulnerability Index for our divisions of interest.

    Library

    To begin, we will need to install a few new packages and load some packages that we’ve previously installed:

    Code
    # Install new packages
    utils::install.packages("ggiraph")
    utils::install.packages("showtext")
    utils::install.packages("tigris")
    utils::install.packages("scales")
    utils::install.packages("htmlwidgets")
    utils::install.packages("htmltools")
    utils::install.packages("widgetframe")
    utils::install.packages("patchwork")
    utils::install.packages("magick")
    Code
    # Load packages
    library(here)        # relative filepaths for reproducibility
    library(ggplot2)     # data visualization
    library(patchwork)   # data visualization layout
    library(magick)      # image editing in R
    library(ggiraph)     # interactive data visualization
    library(htmlwidgets) # create widget from interactive visualizations
    library(htmltools)   # encode HTML code in map tooltips
    library(widgetframe) # save widget of interactive visualizations
    library(tidyverse)   # data wrangling
    library(kableExtra)  # table formatting
    library(showtext)    # utilize special fonts in graphs
    library(scales)      # format numbers and colors for graphs
    library(tigris)      # pull US Census shapefiles

    Repo Structure Update

    For this lab we will be storing the output of our analysis in a sub-folder called imgs. If no one on your team has created the labs/wk03/imgs folder, go ahead and add it now.

    Functions and Constant Variables

    Recall in Lab 02 that we created functions to process our data throughout the duration of the project. You should have these four functions in your project_data_steps.R:

    • fips_region_assignment()
    • rank_variables()
    • svi_theme_variables()
    • svi_theme_flags()

    Next, you will want to add the following four (4) functions to your project_data_steps.R file to complete this lab:

    The first function combines our previous functions into one singular function that makes it easier to process our data in one step and the second allows us to merge our 2010 and 2020 dataframes into one:

    Code
    # Process data ----
    
    load_svi_data <- function(df, rank_by="national", location=NULL, state_abbr=NULL, percentile=.90) {
      
      df <- fips_region_assignment(df)
      df <- rank_variables(df, rank_by, location, state_abbr)
      df <- svi_theme_variables(df)
      df <- svi_theme_flags(df, percentile)
      return(df)
    }
    
    
    # Merge SVI data periods ----
    
    merge_svi_data <- function(df10, df20) {
      
      joint_tracts <- dplyr::intersect(df10$GEOID_2010_trt, df20$GEOID_2010_trt) %>% tibble()
      colnames(joint_tracts) <- "GEOID_2010_trt"
      
      svi_merged <- left_join(joint_tracts, df10, join_by("GEOID_2010_trt" == "GEOID_2010_trt" ))
      
      df20 <- df20 %>% select(!c(colnames(df20)[2:11]))
      
      svi_merged <- left_join(svi_merged, df20, join_by("GEOID_2010_trt" == "GEOID_2010_trt" ))
      
      colnames(svi_merged) <- str_replace(colnames(svi_merged), "\\.x", "_10")
      
      colnames(svi_merged) <- str_replace(colnames(svi_merged), "\\.y", "_20") 
      
      return(svi_merged)
    }

    The following two functions will sum up percentages of our SVI variables for our locations of interest:

    pct age 18-64 = (( round(sum(E_TOTPOP_20)) - sum( (round(E_AGE17_20/sum(E_TOTPOP_20)100)), (round(E_AGE65_20/sum(E_TOTPOP_20)100)) )))

    Code
    # Calculate SVI variable percentages ----
    
    svi_percentages10 <- function(df, division_name) {
      df_out <- 
        df %>% 
        filter(!is.na(ET_INSURSTATUS_12)) %>%
        summarise(
          division = division_name,
          year = 2010,
          `pct in poverty` = round(sum(E_POV150_10)/sum(ET_POVSTATUS_10)*100),
          `pct not in poverty` = round(((sum(ET_POVSTATUS_10) - sum(E_POV150_10))/sum(ET_POVSTATUS_10))*100),
          `pct unemployed` = round(sum(E_UNEMP_10)/sum(ET_EMPSTATUS_10)*100),
          `pct employed` = round(((sum(ET_EMPSTATUS_10) - sum(E_UNEMP_10))/sum(ET_EMPSTATUS_10))*100),
          `pct housing cost-burdened` = round(sum(E_HBURD_10)/sum(ET_HOUSINGCOST_10)*100),
          `pct not housing cost-burdened` = round(((sum(ET_HOUSINGCOST_10) - sum(E_HBURD_10))/sum(ET_HOUSINGCOST_10))*100),
          `pct adults without high school diploma` = round(sum(E_NOHSDP_10)/sum(ET_EDSTATUS_10)*100),
          `pct adults with high school diploma` = round(((sum(ET_EDSTATUS_10) - sum(E_NOHSDP_10))/sum(ET_EDSTATUS_10))*100),
          `pct age 17 & under` = round(sum(E_AGE17_10)/sum(E_TOTPOP_10)*100),
          `pct age 18-64` = round( (sum(E_TOTPOP_10)/sum(E_TOTPOP_10)*100) - (sum(round(sum(E_AGE17_10)/sum(E_TOTPOP_10)*100), round(sum(E_AGE65_10)/sum(E_TOTPOP_10)*100)))),
          `pct age 65+` = round(sum(E_AGE65_10)/sum(E_TOTPOP_10)*100),
          `pct single parent families` = round(sum(E_SNGPNT_10)/sum(ET_FAMILIES_10)*100),
          `pct other families` = round(((sum(ET_FAMILIES_10) - sum(E_SNGPNT_10))/sum(ET_FAMILIES_10))*100),
          `pct limited English speakers` = round(sum(E_LIMENG_10)/sum(ET_POPAGE5UP_10)*100),
          `pct proficient English speakers` = round(((sum(ET_POPAGE5UP_10) - sum(E_LIMENG_10))/sum(ET_POPAGE5UP_10))*100),
          `pct Minority race/ethnicity` = round(sum(E_MINRTY_10)/sum(ET_POPETHRACE_10)*100),
          `pct Non-Hispanic White race/ethnicity` = round(((sum(ET_POPETHRACE_10) - sum(E_MINRTY_10))/sum(ET_POPETHRACE_10))*100),
          `pct in multi-unit housing` = round(sum(E_MUNIT_10)/sum(E_STRHU_10)*100),
          `pct in mobile housing` = round(sum(E_MOBILE_10)/sum(E_STRHU_10)*100),
          `pct in other housing` = (100 - sum(round(sum(E_MUNIT_10)/sum(E_STRHU_10)*100), round(sum(E_MOBILE_10)/sum(E_STRHU_10)*100))),
          `pct in crowded living spaces` = round(sum(E_CROWD_10)/sum(ET_OCCUPANTS_10)*100),
          `pct in non-crowded living spaces` = round(((sum(ET_OCCUPANTS_10) - sum(E_CROWD_10))/sum(ET_OCCUPANTS_10))*100),
          `pct with no vehicle access` = round(sum(E_NOVEH_10)/sum(ET_KNOWNVEH_10)*100),
          `pct with vehicle access` = round(((sum(ET_KNOWNVEH_10) - sum(E_NOVEH_10))/sum(ET_KNOWNVEH_10))*100),
          `pct in group living quarters` = round(sum(E_GROUPQ_10)/sum(ET_HHTYPE_10)*100),
          `pct not in group living quarters` = round(((sum(ET_HHTYPE_10) - sum(E_GROUPQ_10))/sum(ET_HHTYPE_10))*100),
          `pct without health insurance` = round(sum(E_UNINSUR_12)/sum(ET_INSURSTATUS_12)*100),
          `pct with health insurance` = round(((sum(ET_INSURSTATUS_12) - sum(E_UNINSUR_12))/sum(ET_INSURSTATUS_12))*100),
          `pct disabled civilians` = round(sum(E_DISABL_12)/sum(ET_DISABLSTATUS_12)*100),
          `pct not disabled civilians` = round(((sum(ET_DISABLSTATUS_12) - sum(E_DISABL_12))/sum(ET_DISABLSTATUS_12))*100)
        )
      return(df_out)
    }
    
    
    svi_percentages20 <- function(df, division_name) {
      df_out <-  df %>% 
        summarise(
          division = division_name,
          year = 2020,
          `pct in poverty` = round(sum(E_POV150_20)/sum(ET_POVSTATUS_20)*100),
          `pct not in poverty` = round(((sum(ET_POVSTATUS_20) - sum(E_POV150_20))/sum(ET_POVSTATUS_20))*100),
          `pct unemployed` = round(sum(E_UNEMP_20)/sum(ET_EMPSTATUS_20)*100),
          `pct employed` = round(((sum(ET_EMPSTATUS_20) - sum(E_UNEMP_20))/sum(ET_EMPSTATUS_20))*100),
          `pct housing cost-burdened` = round(sum(E_HBURD_20)/sum(ET_HOUSINGCOST_20)*100),
          `pct not housing cost-burdened` = round(((sum(ET_HOUSINGCOST_20) - sum(E_HBURD_20))/sum(ET_HOUSINGCOST_20))*100),
          `pct adults without high school diploma` = round(sum(E_NOHSDP_20)/sum(ET_EDSTATUS_20)*100),
          `pct adults with high school diploma` = round(((sum(ET_EDSTATUS_20) - sum(E_NOHSDP_20))/sum(ET_EDSTATUS_20))*100),
          `pct age 17 & under` = round(sum(E_AGE17_20)/sum(E_TOTPOP_20)*100),
          `pct age 18-64` = round( (sum(E_TOTPOP_20)/sum(E_TOTPOP_20)*100) - (sum(round(sum(E_AGE17_20)/sum(E_TOTPOP_20)*100), round(sum(E_AGE65_20)/sum(E_TOTPOP_20)*100)))),
          `pct age 65+` = round(sum(E_AGE65_20)/sum(E_TOTPOP_20)*100),
          `pct single parent families` = round(sum(E_SNGPNT_20)/sum(ET_FAMILIES_20)*100),
          `pct other families` = round(((sum(ET_FAMILIES_20) - sum(E_SNGPNT_20))/sum(ET_FAMILIES_20))*100),
          `pct limited English speakers` = round(sum(E_LIMENG_20)/sum(ET_POPAGE5UP_20)*100),
          `pct proficient English speakers` = round(((sum(ET_POPAGE5UP_20) - sum(E_LIMENG_20))/sum(ET_POPAGE5UP_20))*100),
          `pct Minority race/ethnicity` = round(sum(E_MINRTY_20)/sum(ET_POPETHRACE_20)*100),
          `pct Non-Hispanic White race/ethnicity` = round(((sum(ET_POPETHRACE_20) - sum(E_MINRTY_20))/sum(ET_POPETHRACE_20))*100),
          `pct in multi-unit housing` = round(sum(E_MUNIT_20)/sum(E_STRHU_20)*100),
          `pct in mobile housing` = round(sum(E_MOBILE_20)/sum(E_STRHU_20)*100),
          `pct in other housing` = (100 - sum(round(sum(E_MUNIT_20)/sum(E_STRHU_20)*100), round(sum(E_MOBILE_20)/sum(E_STRHU_20)*100))),
          `pct in crowded living spaces` = round(sum(E_CROWD_20)/sum(ET_OCCUPANTS_20)*100),
          `pct in non-crowded living spaces` = round(((sum(ET_OCCUPANTS_20) - sum(E_CROWD_20))/sum(ET_OCCUPANTS_20))*100),
          `pct with no vehicle access` = round(sum(E_NOVEH_20)/sum(ET_KNOWNVEH_20)*100),
          `pct with vehicle access` = round(((sum(ET_KNOWNVEH_20) - sum(E_NOVEH_20))/sum(ET_KNOWNVEH_20))*100),
          `pct in group living quarters` = round(sum(E_GROUPQ_20)/sum(ET_HHTYPE_20)*100),
          `pct not in group living quarters` = round(((sum(ET_HHTYPE_20) - sum(E_GROUPQ_20))/sum(ET_HHTYPE_20))*100),
          `pct without health insurance` = round(sum(E_UNINSUR_20)/sum(ET_INSURSTATUS_20)*100),
          `pct with health insurance` = round(((sum(ET_INSURSTATUS_20) - sum(E_UNINSUR_20))/sum(ET_INSURSTATUS_20))*100),
          `pct disabled civilians` = round(sum(E_DISABL_20)/sum(ET_DISABLSTATUS_20)*100),
          `pct not disabled civilians` = round(((sum(ET_DISABLSTATUS_20) - sum(E_DISABL_20))/sum(ET_DISABLSTATUS_20))*100)
        )
      return(df_out)
    }

    Next we need to import our functions and constants from our project_data_steps.R file. Remember, your file should have your initials/name at the end to avoid confusion with your teammates. For example, in a shared repo my file name would be project_data_steps_CS.R or project_data_steps_courtney.R

    Recall that we should use the here::here() function for relative file paths. We also can use double colons :: to indicate both the specific library and function name we want to use to avoid any overriding.

    Code
    import::here( "fips_census_regions",
                  "load_svi_data",
                  "merge_svi_data",
                  "census_division",
                  "svi_percentages10",
                  "svi_percentages20",
                 # notice the use of here::here() that points to the .R file
                 # where all these R objects are created
                 .from = here::here("analysis/project_data_steps.R"),
                 .character_only = TRUE)

    Check that your census division variable loaded properly and reflects your division of interest:

    Code
    census_division
    [1] "Middle Atlantic Division"

    Data

    Finally, we can load up our data sets and process them with our functions to create data sets on a national and divisional level for 2010 and 2020.

    Note

    Note that we will want to flag our SVI indicators at or above the 75th percentile. For the divisional data we will also want to utilize the rank_by and location parameters to limit to our division of interest.

    Code
    # Load data from raw folder
    svi_2010 <- readRDS(here::here("data/raw/Census_Data_SVI/svi_2010_trt10.rds"))
    svi_2020 <- readRDS(here::here("data/raw/Census_Data_SVI/svi_2020_trt10.rds"))

    Load 2010 Data

    Code
    # National 2010 Data
    svi_2010_national <- load_svi_data(svi_2010, percentile=.75)
    svi_2010_national %>% head() %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    GEOID_2010_trt FIPS_st FIPS_county FIPS_tract state state_name county region_number region division_number division E_TOTPOP_10 E_HU_10 E_HH_10 E_POV150_10 ET_POVSTATUS_10 EP_POV150_10 EPL_POV150_10 F_POV150_10 E_UNEMP_10 ET_EMPSTATUS_10 EP_UNEMP_10 EPL_UNEMP_10 F_UNEMP_10 E_HBURD_OWN_10 ET_HOUSINGCOST_OWN_10 EP_HBURD_OWN_10 EPL_HBURD_OWN_10 F_HBURD_OWN_10 E_HBURD_RENT_10 ET_HOUSINGCOST_RENT_10 EP_HBURD_RENT_10 EPL_HBURD_RENT_10 F_HBURD_RENT_10 E_HBURD_10 ET_HOUSINGCOST_10 EP_HBURD_10 EPL_HBURD_10 F_HBURD_10 E_NOHSDP_10 ET_EDSTATUS_10 EP_NOHSDP_10 EPL_NOHSDP_10 F_NOHSDP_10 E_UNINSUR_12 ET_INSURSTATUS_12 EP_UNINSUR_12 EPL_UNINSUR_12 F_UNINSUR_12 E_AGE65_10 EP_AGE65_10 EPL_AGE65_10 F_AGE65_10 E_AGE17_10 EP_AGE17_10 EPL_AGE17_10 F_AGE17_10 E_DISABL_12 ET_DISABLSTATUS_12 EP_DISABL_12 EPL_DISABL_12 F_DISABL_12 E_SNGPNT_10 ET_FAMILIES_10 EP_SNGPNT_10 EPL_SNGPNT_10 F_SNGPNT_10 E_LIMENG_10 ET_POPAGE5UP_10 EP_LIMENG_10 EPL_LIMENG_10 F_LIMENG_10 E_MINRTY_10 ET_POPETHRACE_10 EP_MINRTY_10 EPL_MINRTY_10 F_MINRTY_10 E_STRHU_10 E_MUNIT_10 EP_MUNIT_10 EPL_MUNIT_10 F_MUNIT_10 E_MOBILE_10 EP_MOBILE_10 EPL_MOBILE_10 F_MOBILE_10 E_CROWD_10 ET_OCCUPANTS_10 EP_CROWD_10 EPL_CROWD_10 F_CROWD_10 E_NOVEH_10 ET_KNOWNVEH_10 EP_NOVEH_10 EPL_NOVEH_10 F_NOVEH_10 E_GROUPQ_10 ET_HHTYPE_10 EP_GROUPQ_10 EPL_GROUPQ_10 F_GROUPQ_10 SPL_THEME1 RPL_THEME1 F_THEME1 SPL_THEME2 RPL_THEME2 F_THEME2 SPL_THEME3 RPL_THEME3 F_THEME3 SPL_THEME4 RPL_THEME4 F_THEME4 SPL_THEMES RPL_THEMES F_TOTAL
    01001020100 01 001 020100 AL Alabama Autauga County 3 South Region 6 East South Central Division 1809 771 696 297 1809 16.41791 0.3871 0 36 889 4.049494 0.1790 0 127 598 21.23746 0.20770 0 47 98 47.95918 0.5767 0 174 696 25.00000 0.18790 0 196 1242 15.780998 0.6093 0 186 1759 10.574190 0.3790 0 222 12.271973 0.4876 0 445 24.59923 0.5473 0 298 1335 22.32210 0.8454 1 27 545 4.954128 0.09275 0 36 1705 2.1114370 0.59040 0 385 1809 21.282477 0.4524 0 771 0 0.0000000 0.1224 0 92 11.9325551 0.8005 1 0 696 0.0000000 0.1238 0 50 696 7.183908 0.6134 0 0 1809 0 0.364 0 1.74230 0.28200 0 2.56345 0.5296 1 0.4524 0.4482 0 2.0241 0.2519 1 6.78225 0.3278 2
    01001020200 01 001 020200 AL Alabama Autauga County 3 South Region 6 East South Central Division 2020 816 730 495 1992 24.84940 0.5954 0 68 834 8.153477 0.5754 0 49 439 11.16173 0.02067 0 105 291 36.08247 0.3019 0 154 730 21.09589 0.09312 0 339 1265 26.798419 0.8392 1 313 2012 15.556660 0.6000 0 204 10.099010 0.3419 0 597 29.55446 0.8192 1 359 1515 23.69637 0.8791 1 132 456 28.947368 0.83510 1 15 1890 0.7936508 0.40130 0 1243 2020 61.534653 0.7781 1 816 0 0.0000000 0.1224 0 34 4.1666667 0.6664 0 13 730 1.7808219 0.5406 0 115 730 15.753425 0.8382 1 0 2020 0 0.364 0 2.70312 0.56650 1 3.27660 0.8614 3 0.7781 0.7709 1 2.5316 0.5047 1 9.28942 0.6832 6
    01001020300 01 001 020300 AL Alabama Autauga County 3 South Region 6 East South Central Division 3543 1403 1287 656 3533 18.56779 0.4443 0 93 1552 5.992268 0.3724 0 273 957 28.52665 0.45780 0 178 330 53.93939 0.7152 0 451 1287 35.04274 0.49930 0 346 2260 15.309734 0.5950 0 252 3102 8.123791 0.2596 0 487 13.745413 0.5868 0 998 28.16822 0.7606 1 371 2224 16.68165 0.6266 0 126 913 13.800657 0.46350 0 0 3365 0.0000000 0.09298 0 637 3543 17.979114 0.4049 0 1403 10 0.7127584 0.3015 0 2 0.1425517 0.4407 0 0 1287 0.0000000 0.1238 0 101 1287 7.847708 0.6443 0 0 3543 0 0.364 0 2.17060 0.41010 0 2.53048 0.5116 1 0.4049 0.4011 0 1.8743 0.1942 0 6.98028 0.3576 1
    01001020400 01 001 020400 AL Alabama Autauga County 3 South Region 6 East South Central Division 4840 1957 1839 501 4840 10.35124 0.2177 0 101 2129 4.744011 0.2447 0 310 1549 20.01291 0.17080 0 89 290 30.68966 0.2044 0 399 1839 21.69657 0.10540 0 274 3280 8.353658 0.3205 0 399 4293 9.294200 0.3171 0 955 19.731405 0.8643 1 1195 24.69008 0.5530 0 625 3328 18.78005 0.7233 0 152 1374 11.062591 0.34710 0 10 4537 0.2204100 0.22560 0 297 4840 6.136364 0.1647 0 1957 33 1.6862545 0.3843 0 25 1.2774655 0.5516 0 14 1839 0.7612833 0.3564 0 19 1839 1.033170 0.1127 0 0 4840 0 0.364 0 1.20540 0.13470 0 2.71330 0.6129 1 0.1647 0.1632 0 1.7690 0.1591 0 5.85240 0.1954 1
    01001020500 01 001 020500 AL Alabama Autauga County 3 South Region 6 East South Central Division 9938 3969 3741 1096 9938 11.02838 0.2364 0 188 4937 3.807981 0.1577 0 426 2406 17.70574 0.11050 0 528 1335 39.55056 0.3753 0 954 3741 25.50120 0.20140 0 293 5983 4.897209 0.1655 0 740 10110 7.319486 0.2211 0 837 8.422218 0.2408 0 3012 30.30791 0.8455 1 759 7155 10.60797 0.2668 0 476 2529 18.821669 0.63540 0 78 9297 0.8389803 0.41110 0 1970 9938 19.822902 0.4330 0 3969 306 7.7097506 0.6153 0 0 0.0000000 0.2198 0 7 3741 0.1871157 0.2535 0 223 3741 5.960973 0.5483 0 0 9938 0 0.364 0 0.98210 0.08468 0 2.39960 0.4381 1 0.4330 0.4290 0 2.0009 0.2430 0 5.81560 0.1905 1
    01001020600 01 001 020600 AL Alabama Autauga County 3 South Region 6 East South Central Division 3402 1456 1308 735 3402 21.60494 0.5199 0 134 1720 7.790698 0.5436 0 242 1032 23.44961 0.28010 0 62 276 22.46377 0.1035 0 304 1308 23.24159 0.14070 0 301 2151 13.993491 0.5510 0 355 3445 10.304790 0.3656 0 386 11.346267 0.4232 0 931 27.36626 0.7200 0 440 2439 18.04018 0.6912 0 143 924 15.476190 0.52900 0 4 3254 0.1229256 0.19840 0 723 3402 21.252205 0.4519 0 1456 18 1.2362637 0.3507 0 433 29.7390110 0.9468 1 16 1308 1.2232416 0.4493 0 28 1308 2.140673 0.2298 0 0 3402 0 0.364 0 2.12080 0.39510 0 2.56180 0.5288 0 0.4519 0.4477 0 2.3406 0.4048 1 7.47510 0.4314 1
    Code
    # Divisional 2010 Data
    svi_2010_divisional <- load_svi_data(svi_2010, rank_by = "divisional", location = census_division, percentile=.75)
    svi_2010_divisional %>% head() %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    GEOID_2010_trt FIPS_st FIPS_county FIPS_tract state state_name county region_number region division_number division E_TOTPOP_10 E_HU_10 E_HH_10 E_POV150_10 ET_POVSTATUS_10 EP_POV150_10 EPL_POV150_10 F_POV150_10 E_UNEMP_10 ET_EMPSTATUS_10 EP_UNEMP_10 EPL_UNEMP_10 F_UNEMP_10 E_HBURD_OWN_10 ET_HOUSINGCOST_OWN_10 EP_HBURD_OWN_10 EPL_HBURD_OWN_10 F_HBURD_OWN_10 E_HBURD_RENT_10 ET_HOUSINGCOST_RENT_10 EP_HBURD_RENT_10 EPL_HBURD_RENT_10 F_HBURD_RENT_10 E_HBURD_10 ET_HOUSINGCOST_10 EP_HBURD_10 EPL_HBURD_10 F_HBURD_10 E_NOHSDP_10 ET_EDSTATUS_10 EP_NOHSDP_10 EPL_NOHSDP_10 F_NOHSDP_10 E_UNINSUR_12 ET_INSURSTATUS_12 EP_UNINSUR_12 EPL_UNINSUR_12 F_UNINSUR_12 E_AGE65_10 EP_AGE65_10 EPL_AGE65_10 F_AGE65_10 E_AGE17_10 EP_AGE17_10 EPL_AGE17_10 F_AGE17_10 E_DISABL_12 ET_DISABLSTATUS_12 EP_DISABL_12 EPL_DISABL_12 F_DISABL_12 E_SNGPNT_10 ET_FAMILIES_10 EP_SNGPNT_10 EPL_SNGPNT_10 F_SNGPNT_10 E_LIMENG_10 ET_POPAGE5UP_10 EP_LIMENG_10 EPL_LIMENG_10 F_LIMENG_10 E_MINRTY_10 ET_POPETHRACE_10 EP_MINRTY_10 EPL_MINRTY_10 F_MINRTY_10 E_STRHU_10 E_MUNIT_10 EP_MUNIT_10 EPL_MUNIT_10 F_MUNIT_10 E_MOBILE_10 EP_MOBILE_10 EPL_MOBILE_10 F_MOBILE_10 E_CROWD_10 ET_OCCUPANTS_10 EP_CROWD_10 EPL_CROWD_10 F_CROWD_10 E_NOVEH_10 ET_KNOWNVEH_10 EP_NOVEH_10 EPL_NOVEH_10 F_NOVEH_10 E_GROUPQ_10 ET_HHTYPE_10 EP_GROUPQ_10 EPL_GROUPQ_10 F_GROUPQ_10 SPL_THEME1 RPL_THEME1 F_THEME1 SPL_THEME2 RPL_THEME2 F_THEME2 SPL_THEME3 RPL_THEME3 F_THEME3 SPL_THEME4 RPL_THEME4 F_THEME4 SPL_THEMES RPL_THEMES F_TOTAL
    34001000100 34 001 000100 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 2907 1088 983 1127 2907 38.76849 0.8482 1 144 1433 10.048849 0.7544 1 280 435 64.36782 0.9529 1 204 548 37.22628 0.2998 0 484 983 49.23703 0.7813 1 468 1759 26.60603 0.8634 1 532 2543 20.92017 0.8978 1 250 8.599931 0.1777 0 944 32.47334 0.94170 1 186 1851 10.04862 0.2706 0 266 678 39.233038 0.8981 1 177 2611 6.779012 0.7778 1 1928 2907 66.32267 0.7743 1 1088 113 10.386029 0.6229 0 9 0.8272059 0.7223 0 80 983 8.138352 0.8657 1 265 983 26.95829 0.7354 0 0 2907 0.000000 0.3512 0 4.1451 0.8935 5 3.06590 0.7944 3 0.7743 0.7667 1 3.2975 0.8414 1 11.28280 0.8862 10
    34001000200 34 001 000200 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 3189 2217 1473 519 3189 16.27469 0.4806 0 109 1558 6.996149 0.5179 0 573 955 60.00000 0.9323 1 199 518 38.41699 0.3261 0 772 1473 52.41005 0.8418 1 405 2579 15.70376 0.6491 0 484 3547 13.64533 0.7154 0 847 26.560050 0.9629 1 436 13.67200 0.08181 0 608 3005 20.23295 0.8466 1 42 857 4.900817 0.1204 0 422 3072 13.736979 0.8799 1 1792 3189 56.19316 0.7390 0 2217 901 40.640505 0.8693 1 0 0.0000000 0.3251 0 48 1473 3.258656 0.7064 0 250 1473 16.97217 0.6444 0 0 3189 0.000000 0.3512 0 3.2048 0.6963 1 2.89161 0.7231 3 0.7390 0.7317 0 2.8964 0.6887 1 9.73181 0.7340 5
    34001000300 34 001 000300 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 3997 1823 1357 1401 3968 35.30746 0.8164 1 382 2238 17.068811 0.9376 1 176 329 53.49544 0.8855 1 604 1028 58.75486 0.7947 1 780 1357 57.47973 0.9165 1 920 2677 34.36683 0.9346 1 1351 4149 32.56206 0.9811 1 314 7.855892 0.1437 0 937 23.44258 0.55900 0 319 3054 10.44532 0.3000 0 187 782 23.913044 0.7498 0 1080 3671 29.419777 0.9742 1 3357 3997 83.98799 0.8419 1 1823 363 19.912233 0.7535 1 0 0.0000000 0.3251 0 150 1357 11.053795 0.9136 1 651 1357 47.97347 0.8585 1 0 3997 0.000000 0.3512 0 4.5862 0.9691 5 2.72670 0.6360 1 0.8419 0.8336 1 3.2019 0.8054 3 11.35670 0.8920 10
    34001000400 34 001 000400 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 2902 2683 1401 1172 2902 40.38594 0.8615 1 190 1389 13.678906 0.8811 1 364 707 51.48515 0.8627 1 507 694 73.05476 0.9503 1 871 1401 62.16988 0.9572 1 481 1981 24.28067 0.8339 1 674 3204 21.03620 0.8998 1 434 14.955203 0.6083 0 596 20.53756 0.33980 0 426 2607 16.34062 0.6886 0 111 652 17.024540 0.6204 0 215 2736 7.858187 0.8008 1 1792 2902 61.75052 0.7584 1 2683 2049 76.369735 0.9401 1 0 0.0000000 0.3251 0 69 1401 4.925053 0.7847 1 511 1401 36.47395 0.7992 1 72 2902 2.481048 0.8114 1 4.4335 0.9468 5 3.05790 0.7908 1 0.7584 0.7510 1 3.6605 0.9391 4 11.91030 0.9339 11
    34001000500 34 001 000500 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 3483 1241 1027 1938 3483 55.64169 0.9533 1 124 1630 7.607362 0.5830 0 227 446 50.89686 0.8549 1 478 581 82.27194 0.9799 1 705 1027 68.64654 0.9863 1 733 2077 35.29129 0.9396 1 727 3258 22.31430 0.9149 1 377 10.824002 0.3081 0 1055 30.28998 0.90140 1 268 2401 11.16202 0.3549 0 209 763 27.391874 0.7940 1 911 3077 29.606760 0.9746 1 3036 3483 87.16624 0.8550 1 1241 52 4.190169 0.4505 0 4 0.3223207 0.6567 0 113 1027 11.002921 0.9128 1 422 1027 41.09056 0.8250 1 0 3483 0.000000 0.3512 0 4.3771 0.9379 4 3.33300 0.8766 3 0.8550 0.8467 1 3.1962 0.8026 2 11.76130 0.9229 10
    34001001100 34 001 001100 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 2204 1204 1204 1185 2204 53.76588 0.9457 1 219 927 23.624596 0.9830 1 97 172 56.39535 0.9094 1 462 1032 44.76744 0.4746 0 559 1204 46.42857 0.7197 0 346 1440 24.02778 0.8306 1 469 1942 24.15036 0.9360 1 363 16.470054 0.7020 0 578 26.22505 0.74410 0 442 1558 28.36970 0.9675 1 247 396 62.373737 0.9898 1 104 2051 5.070697 0.7260 0 2118 2204 96.09800 0.9204 1 1204 570 47.342193 0.8858 1 0 0.0000000 0.3251 0 14 1204 1.162791 0.4877 0 817 1204 67.85714 0.9413 1 0 2204 0.000000 0.3512 0 4.4150 0.9451 4 4.12940 0.9805 2 0.9204 0.9114 1 2.9911 0.7243 2 12.45590 0.9597 9

    Load 2020 Data

    Code
    # National 2020 Data
    svi_2020_national <- load_svi_data(svi_2020, percentile=.75)
    svi_2020_national %>% head() %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    GEOID_2010_trt FIPS_st FIPS_county FIPS_tract state state_name county region_number region division_number division E_TOTPOP_20 E_HU_20 E_HH_20 E_POV150_20 ET_POVSTATUS_20 EP_POV150_20 EPL_POV150_20 F_POV150_20 E_UNEMP_20 ET_EMPSTATUS_20 EP_UNEMP_20 EPL_UNEMP_20 F_UNEMP_20 E_HBURD_OWN_20 ET_HOUSINGCOST_OWN_20 EP_HBURD_OWN_20 EPL_HBURD_OWN_20 F_HBURD_OWN_20 E_HBURD_RENT_20 ET_HOUSINGCOST_RENT_20 EP_HBURD_RENT_20 EPL_HBURD_RENT_20 F_HBURD_RENT_20 E_HBURD_20 ET_HOUSINGCOST_20 EP_HBURD_20 EPL_HBURD_20 F_HBURD_20 E_NOHSDP_20 ET_EDSTATUS_20 EP_NOHSDP_20 EPL_NOHSDP_20 F_NOHSDP_20 E_UNINSUR_20 ET_INSURSTATUS_20 EP_UNINSUR_20 EPL_UNINSUR_20 F_UNINSUR_20 E_AGE65_20 EP_AGE65_20 EPL_AGE65_20 F_AGE65_20 E_AGE17_20 EP_AGE17_20 EPL_AGE17_20 F_AGE17_20 E_DISABL_20 ET_DISABLSTATUS_20 EP_DISABL_20 EPL_DISABL_20 F_DISABL_20 E_SNGPNT_20 ET_FAMILIES_20 EP_SNGPNT_20 EPL_SNGPNT_20 F_SNGPNT_20 E_LIMENG_20 ET_POPAGE5UP_20 EP_LIMENG_20 EPL_LIMENG_20 F_LIMENG_20 E_MINRTY_20 ET_POPETHRACE_20 EP_MINRTY_20 EPL_MINRTY_20 F_MINRTY_20 E_STRHU_20 E_MUNIT_20 EP_MUNIT_20 EPL_MUNIT_20 F_MUNIT_20 E_MOBILE_20 EP_MOBILE_20 EPL_MOBILE_20 F_MOBILE_20 E_CROWD_20 ET_OCCUPANTS_20 EP_CROWD_20 EPL_CROWD_20 F_CROWD_20 E_NOVEH_20 ET_KNOWNVEH_20 EP_NOVEH_20 EPL_NOVEH_20 F_NOVEH_20 E_GROUPQ_20 ET_HHTYPE_20 EP_GROUPQ_20 EPL_GROUPQ_20 F_GROUPQ_20 SPL_THEME1 RPL_THEME1 F_THEME1 SPL_THEME2 RPL_THEME2 F_THEME2 SPL_THEME3 RPL_THEME3 F_THEME3 SPL_THEME4 RPL_THEME4 F_THEME4 SPL_THEMES RPL_THEMES F_TOTAL
    01001020100 01 001 020100 AL Alabama Autauga County 3 South Region 6 East South Central Division 1941 710 693 352 1941 18.13498 0.4630 0 18 852 2.112676 0.15070 0 81 507 15.976331 0.26320 0 63 186 33.87097 0.2913 0 144 693 20.77922 0.2230 0 187 1309 14.285714 0.6928 0 187 1941 9.634209 0.6617 0 295 15.19835 0.4601 0 415 21.38073 0.4681 0 391 1526 25.62254 0.9011 1 58 555 10.45045 0.3451 0 0 1843 0.0000000 0.09479 0 437 1941 22.51417 0.3902 0 710 0 0.0000000 0.1079 0 88 12.3943662 0.8263 1 0 693 0.0000000 0.09796 0 10 693 1.443001 0.1643 0 0 1941 0.000000 0.1831 0 2.19120 0.4084 0 2.26919 0.3503 1 0.3902 0.3869 0 1.37956 0.07216 1 6.23015 0.2314 2
    01001020200 01 001 020200 AL Alabama Autauga County 3 South Region 6 East South Central Division 1757 720 573 384 1511 25.41363 0.6427 0 29 717 4.044630 0.41320 0 33 392 8.418367 0.03542 0 116 181 64.08840 0.9086 1 149 573 26.00349 0.4041 0 139 1313 10.586443 0.5601 0 91 1533 5.936073 0.4343 0 284 16.16392 0.5169 0 325 18.49744 0.2851 0 164 1208 13.57616 0.4127 0 42 359 11.69916 0.3998 0 0 1651 0.0000000 0.09479 0 1116 1757 63.51736 0.7591 1 720 3 0.4166667 0.2470 0 5 0.6944444 0.5106 0 9 573 1.5706806 0.46880 0 57 573 9.947644 0.7317 0 212 1757 12.066022 0.9549 1 2.45440 0.4888 0 1.70929 0.1025 0 0.7591 0.7527 1 2.91300 0.68620 1 7.83579 0.4802 2
    01001020300 01 001 020300 AL Alabama Autauga County 3 South Region 6 East South Central Division 3694 1464 1351 842 3694 22.79372 0.5833 0 53 1994 2.657974 0.22050 0 117 967 12.099276 0.11370 0 147 384 38.28125 0.3856 0 264 1351 19.54108 0.1827 0 317 2477 12.797739 0.6460 0 127 3673 3.457664 0.2308 0 464 12.56091 0.3088 0 929 25.14889 0.7080 0 473 2744 17.23761 0.6211 0 263 975 26.97436 0.8234 1 128 3586 3.5694367 0.70770 0 1331 3694 36.03140 0.5515 0 1464 26 1.7759563 0.3675 0 14 0.9562842 0.5389 0 35 1351 2.5906736 0.60550 0 42 1351 3.108808 0.3415 0 0 3694 0.000000 0.1831 0 1.86330 0.3063 0 3.16900 0.8380 1 0.5515 0.5468 0 2.03650 0.26830 0 7.62030 0.4460 1
    01001020400 01 001 020400 AL Alabama Autauga County 3 South Region 6 East South Central Division 3539 1741 1636 503 3539 14.21305 0.3472 0 39 1658 2.352232 0.17990 0 219 1290 16.976744 0.30880 0 74 346 21.38728 0.1037 0 293 1636 17.90954 0.1333 0 173 2775 6.234234 0.3351 0 169 3529 4.788892 0.3448 0 969 27.38062 0.9225 1 510 14.41085 0.1208 0 670 3019 22.19278 0.8194 1 148 1137 13.01671 0.4541 0 89 3409 2.6107363 0.64690 0 454 3539 12.82848 0.2364 0 1741 143 8.2136703 0.6028 0 0 0.0000000 0.2186 0 10 1636 0.6112469 0.28340 0 72 1636 4.400978 0.4538 0 0 3539 0.000000 0.1831 0 1.34030 0.1575 0 2.96370 0.7496 2 0.2364 0.2344 0 1.74170 0.16270 0 6.28210 0.2389 2
    01001020500 01 001 020500 AL Alabama Autauga County 3 South Region 6 East South Central Division 10674 4504 4424 1626 10509 15.47245 0.3851 0 81 5048 1.604596 0.09431 0 321 2299 13.962592 0.17970 0 711 2125 33.45882 0.2836 0 1032 4424 23.32731 0.3109 0 531 6816 7.790493 0.4251 0 301 10046 2.996217 0.1894 0 1613 15.11149 0.4553 0 2765 25.90407 0.7494 0 1124 7281 15.43744 0.5253 0 342 2912 11.74451 0.4019 0 52 9920 0.5241935 0.35230 0 2603 10674 24.38636 0.4160 0 4504 703 15.6083481 0.7378 0 29 0.6438721 0.5037 0 37 4424 0.8363472 0.33420 0 207 4424 4.679023 0.4754 0 176 10674 1.648866 0.7598 1 1.40481 0.1743 0 2.48420 0.4802 0 0.4160 0.4125 0 2.81090 0.63730 1 7.11591 0.3654 1
    01001020600 01 001 020600 AL Alabama Autauga County 3 South Region 6 East South Central Division 3536 1464 1330 1279 3523 36.30429 0.8215 1 34 1223 2.780049 0.23780 0 321 1111 28.892889 0.75870 1 67 219 30.59361 0.2305 0 388 1330 29.17293 0.5075 0 306 2380 12.857143 0.6480 0 415 3496 11.870709 0.7535 1 547 15.46946 0.4760 0 982 27.77149 0.8327 1 729 2514 28.99761 0.9488 1 95 880 10.79545 0.3601 0 0 3394 0.0000000 0.09479 0 985 3536 27.85633 0.4608 0 1464 0 0.0000000 0.1079 0 364 24.8633880 0.9300 1 0 1330 0.0000000 0.09796 0 17 1330 1.278196 0.1463 0 0 3536 0.000000 0.1831 0 2.96830 0.6434 2 2.71239 0.6156 2 0.4608 0.4569 0 1.46526 0.08976 1 7.60675 0.4440 5
    Code
    # Divisional 2020 Data
    svi_2020_divisional <- load_svi_data(svi_2020, rank_by = "divisional", location =  census_division, percentile=.75)
    svi_2020_divisional %>% head() %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    GEOID_2010_trt FIPS_st FIPS_county FIPS_tract state state_name county region_number region division_number division E_TOTPOP_20 E_HU_20 E_HH_20 E_POV150_20 ET_POVSTATUS_20 EP_POV150_20 EPL_POV150_20 F_POV150_20 E_UNEMP_20 ET_EMPSTATUS_20 EP_UNEMP_20 EPL_UNEMP_20 F_UNEMP_20 E_HBURD_OWN_20 ET_HOUSINGCOST_OWN_20 EP_HBURD_OWN_20 EPL_HBURD_OWN_20 F_HBURD_OWN_20 E_HBURD_RENT_20 ET_HOUSINGCOST_RENT_20 EP_HBURD_RENT_20 EPL_HBURD_RENT_20 F_HBURD_RENT_20 E_HBURD_20 ET_HOUSINGCOST_20 EP_HBURD_20 EPL_HBURD_20 F_HBURD_20 E_NOHSDP_20 ET_EDSTATUS_20 EP_NOHSDP_20 EPL_NOHSDP_20 F_NOHSDP_20 E_UNINSUR_20 ET_INSURSTATUS_20 EP_UNINSUR_20 EPL_UNINSUR_20 F_UNINSUR_20 E_AGE65_20 EP_AGE65_20 EPL_AGE65_20 F_AGE65_20 E_AGE17_20 EP_AGE17_20 EPL_AGE17_20 F_AGE17_20 E_DISABL_20 ET_DISABLSTATUS_20 EP_DISABL_20 EPL_DISABL_20 F_DISABL_20 E_SNGPNT_20 ET_FAMILIES_20 EP_SNGPNT_20 EPL_SNGPNT_20 F_SNGPNT_20 E_LIMENG_20 ET_POPAGE5UP_20 EP_LIMENG_20 EPL_LIMENG_20 F_LIMENG_20 E_MINRTY_20 ET_POPETHRACE_20 EP_MINRTY_20 EPL_MINRTY_20 F_MINRTY_20 E_STRHU_20 E_MUNIT_20 EP_MUNIT_20 EPL_MUNIT_20 F_MUNIT_20 E_MOBILE_20 EP_MOBILE_20 EPL_MOBILE_20 F_MOBILE_20 E_CROWD_20 ET_OCCUPANTS_20 EP_CROWD_20 EPL_CROWD_20 F_CROWD_20 E_NOVEH_20 ET_KNOWNVEH_20 EP_NOVEH_20 EPL_NOVEH_20 F_NOVEH_20 E_GROUPQ_20 ET_HHTYPE_20 EP_GROUPQ_20 EPL_GROUPQ_20 F_GROUPQ_20 SPL_THEME1 RPL_THEME1 F_THEME1 SPL_THEME2 RPL_THEME2 F_THEME2 SPL_THEME3 RPL_THEME3 F_THEME3 SPL_THEME4 RPL_THEME4 F_THEME4 SPL_THEMES RPL_THEMES F_TOTAL
    34001000100 34 001 000100 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 2157 941 784 1182 2157 54.79833 0.9571 1 242 1058 22.873346 0.9922 1 215 342 62.86550 0.9780 1 316 442 71.49321 0.9481 1 531 784 67.72959 0.9893 1 396 1274 31.08320 0.9497 1 266 2157 12.331943 0.9041 1 185 8.576727 0.09430 0 552 25.59110 0.8128 1 297 1605 18.504673 0.74880 0 83 510 16.27451 0.6090 0 251 2020 12.425743 0.8710 1 1852 2157 85.85999 0.8476 1 941 118 12.5398512 0.6385 0 0 0.0000000 0.3216 0 67 784 8.545918 0.8657 1 212 784 27.04082 0.7502 1 0 2157 0.0000000 0.1517 0 4.7924 0.9850 5 3.13590 0.8217 2 0.8476 0.8400 1 2.7277 0.6085 2 11.50360 0.9104 10
    34001000200 34 001 000200 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 3510 2046 1353 1021 3510 29.08832 0.7682 1 121 1852 6.533477 0.6717 0 343 696 49.28161 0.9273 1 416 657 63.31811 0.8696 1 759 1353 56.09756 0.9321 1 553 2338 23.65269 0.8871 1 354 3510 10.085470 0.8530 1 643 18.319088 0.60310 0 1002 28.54701 0.9055 1 450 2508 17.942584 0.72330 0 237 786 30.15267 0.8539 1 534 3375 15.822222 0.9062 1 2534 3510 72.19373 0.7818 1 2046 906 44.2815249 0.8690 1 0 0.0000000 0.3216 0 119 1353 8.795270 0.8711 1 324 1353 23.94678 0.7255 0 0 3510 0.0000000 0.1517 0 4.1121 0.9003 4 3.99200 0.9781 3 0.7818 0.7747 1 2.9389 0.7011 2 11.82480 0.9310 10
    34001000300 34 001 000300 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 3801 1640 1226 1857 3801 48.85556 0.9333 1 226 1800 12.555556 0.9267 1 111 280 39.64286 0.8339 1 608 946 64.27061 0.8842 1 719 1226 58.64600 0.9528 1 650 2275 28.57143 0.9337 1 1027 3801 27.019206 0.9914 1 380 9.997369 0.14040 0 1223 32.17574 0.9607 1 219 2578 8.494957 0.15680 0 268 909 29.48295 0.8456 1 940 3400 27.647059 0.9728 1 3318 3801 87.29282 0.8579 1 1640 262 15.9756098 0.6917 0 0 0.0000000 0.3216 0 124 1226 10.114192 0.8955 1 477 1226 38.90701 0.8258 1 0 3801 0.0000000 0.1517 0 4.7379 0.9829 5 3.07630 0.8013 3 0.8579 0.8501 1 2.8863 0.6781 2 11.55840 0.9150 11
    34001000400 34 001 000400 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 3178 2264 1390 1508 3176 47.48111 0.9246 1 172 1804 9.534368 0.8460 1 205 468 43.80342 0.8858 1 622 922 67.46204 0.9192 1 827 1390 59.49640 0.9587 1 364 2076 17.53372 0.8013 1 476 3178 14.977974 0.9390 1 483 15.198238 0.41220 0 539 16.96035 0.2484 0 319 2639 12.087912 0.38790 0 101 565 17.87611 0.6539 0 583 3022 19.291860 0.9349 1 2186 3178 68.78540 0.7658 1 2264 1609 71.0689046 0.9266 1 15 0.6625442 0.7078 0 226 1390 16.258993 0.9567 1 599 1390 43.09353 0.8474 1 20 3178 0.6293266 0.6292 0 4.4696 0.9558 5 2.63730 0.5864 1 0.7658 0.7588 1 4.0677 0.9762 3 11.94040 0.9387 10
    34001000500 34 001 000500 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 3385 1185 945 1682 3364 50.00000 0.9391 1 72 1577 4.565631 0.4586 0 185 468 39.52991 0.8332 1 362 477 75.89099 0.9703 1 547 945 57.88360 0.9477 1 592 1983 29.85376 0.9422 1 738 3385 21.802068 0.9817 1 240 7.090103 0.05988 0 1129 33.35303 0.9689 1 135 2256 5.984043 0.04817 0 110 717 15.34170 0.5822 0 721 3076 23.439532 0.9569 1 3029 3385 89.48301 0.8727 1 1185 9 0.7594937 0.2382 0 0 0.0000000 0.3216 0 103 945 10.899471 0.9072 1 263 945 27.83069 0.7560 1 0 3385 0.0000000 0.1517 0 4.2693 0.9283 4 2.61605 0.5709 2 0.8727 0.8648 1 2.3747 0.4357 2 10.13275 0.7921 9
    34001001100 34 001 001100 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 1950 1267 1096 1131 1950 58.00000 0.9678 1 66 706 9.348442 0.8395 1 42 101 41.58416 0.8612 1 309 995 31.05528 0.1959 0 351 1096 32.02555 0.4782 0 510 1379 36.98332 0.9763 1 155 1950 7.948718 0.7660 1 392 20.102564 0.69880 0 447 22.92308 0.6712 0 570 1503 37.924152 0.99200 1 143 374 38.23529 0.9167 1 109 1841 5.920695 0.7464 0 1909 1950 97.89744 0.9529 1 1267 479 37.8058406 0.8464 1 0 0.0000000 0.3216 0 33 1096 3.010949 0.6446 0 743 1096 67.79197 0.9414 1 0 1950 0.0000000 0.1517 0 4.0278 0.8848 4 4.02510 0.9798 2 0.9529 0.9442 1 2.9057 0.6869 2 11.91150 0.9365 9

    Check Data

    Next, we want to check that our functions are behaving as expected and have filtered our data sets to include all divisions and states on the national level and only our division/states of interest on the divisional level:

    2010

    Code
    svi_2010_national$division %>% unique() 
    [1] "East South Central Division" "Pacific Division"           
    [3] "Mountain Division"           "West South Central Division"
    [5] "New England Division"        "South Atlantic Division"    
    [7] "East North Central Division" "West North Central Division"
    [9] "Middle Atlantic Division"   
    Code
    svi_2010_national$state %>% unique() 
     [1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "DC" "FL" "GA" "HI" "ID" "IL" "IN"
    [16] "IA" "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH"
    [31] "NJ" "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT"
    [46] "VT" "VA" "WA" "WV" "WI" "WY"
    Code
    svi_2010_divisional$division %>% unique() 
    [1] "Middle Atlantic Division"
    Code
    svi_2010_divisional$state %>% unique() 
    [1] "NJ" "NY" "PA"

    2020

    Code
    svi_2020_national$division %>% unique()
    [1] "East South Central Division" "Pacific Division"           
    [3] "Mountain Division"           "West South Central Division"
    [5] "New England Division"        "South Atlantic Division"    
    [7] "East North Central Division" "West North Central Division"
    [9] "Middle Atlantic Division"   
    Code
    svi_2020_national$state %>% unique()
     [1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "DC" "FL" "GA" "HI" "ID" "IL" "IN"
    [16] "IA" "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH"
    [31] "NJ" "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT"
    [46] "VT" "VA" "WA" "WV" "WI" "WY"
    Code
    svi_2020_divisional$division %>% unique()
    [1] "Middle Atlantic Division"
    Code
    svi_2020_divisional$state %>% unique()
    [1] "NJ" "NY" "PA"

    Find SVI percentages

    Now that we have our data loaded properly, we need to process it further to find the percentages of our Social Vulnerability Indices nationally and divisionally for 2010 and 2020:

    2010

    Code
    usa_pcts10 <- svi_percentages10(svi_2010_national, "United States")
    usa_pcts10 %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    division year pct in poverty pct not in poverty pct unemployed pct employed pct housing cost-burdened pct not housing cost-burdened pct adults without high school diploma pct adults with high school diploma pct age 17 & under pct age 18-64 pct age 65+ pct single parent families pct other families pct limited English speakers pct proficient English speakers pct Minority race/ethnicity pct Non-Hispanic White race/ethnicity pct in multi-unit housing pct in mobile housing pct in other housing pct in crowded living spaces pct in non-crowded living spaces pct with no vehicle access pct with vehicle access pct in group living quarters pct not in group living quarters pct without health insurance pct with health insurance pct disabled civilians pct not disabled civilians
    United States 2010 23 77 8 92 36 64 15 85 24 63 13 17 83 5 95 35 65 13 7 80 3 97 9 91 3 97 15 85 15 85
    Code
    div_pcts10 <-svi_percentages10(svi_2010_divisional, census_division)
    div_pcts10 %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    division year pct in poverty pct not in poverty pct unemployed pct employed pct housing cost-burdened pct not housing cost-burdened pct adults without high school diploma pct adults with high school diploma pct age 17 & under pct age 18-64 pct age 65+ pct single parent families pct other families pct limited English speakers pct proficient English speakers pct Minority race/ethnicity pct Non-Hispanic White race/ethnicity pct in multi-unit housing pct in mobile housing pct in other housing pct in crowded living spaces pct in non-crowded living spaces pct with no vehicle access pct with vehicle access pct in group living quarters pct not in group living quarters pct without health insurance pct with health insurance pct disabled civilians pct not disabled civilians
    Middle Atlantic Division 2010 20 80 8 92 39 61 14 86 23 63 14 16 84 5 95 34 66 18 3 79 3 97 19 81 3 97 11 89 14 86

    2020

    Code
    usa_pcts20 <- svi_percentages20(svi_2020_national, "United States")
    usa_pcts20 %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    division year pct in poverty pct not in poverty pct unemployed pct employed pct housing cost-burdened pct not housing cost-burdened pct adults without high school diploma pct adults with high school diploma pct age 17 & under pct age 18-64 pct age 65+ pct single parent families pct other families pct limited English speakers pct proficient English speakers pct Minority race/ethnicity pct Non-Hispanic White race/ethnicity pct in multi-unit housing pct in mobile housing pct in other housing pct in crowded living spaces pct in non-crowded living spaces pct with no vehicle access pct with vehicle access pct in group living quarters pct not in group living quarters pct without health insurance pct with health insurance pct disabled civilians pct not disabled civilians
    United States 2020 21 79 5 95 30 70 11 89 22 62 16 16 84 4 96 40 60 14 6 80 3 97 8 92 2 98 9 91 15 85
    Code
    div_pcts20 <- svi_percentages20(svi_2020_divisional, census_division)
    div_pcts20 %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    division year pct in poverty pct not in poverty pct unemployed pct employed pct housing cost-burdened pct not housing cost-burdened pct adults without high school diploma pct adults with high school diploma pct age 17 & under pct age 18-64 pct age 65+ pct single parent families pct other families pct limited English speakers pct proficient English speakers pct Minority race/ethnicity pct Non-Hispanic White race/ethnicity pct in multi-unit housing pct in mobile housing pct in other housing pct in crowded living spaces pct in non-crowded living spaces pct with no vehicle access pct with vehicle access pct in group living quarters pct not in group living quarters pct without health insurance pct with health insurance pct disabled civilians pct not disabled civilians
    Middle Atlantic Division 2020 19 81 6 94 34 66 11 89 21 62 17 15 85 5 95 39 61 20 2 78 4 96 19 81 3 97 6 94 14 86

    Join 2010 and 2020 data sets

    Code
    svi_pcts <- union(union(usa_pcts10, div_pcts10), union(usa_pcts20, div_pcts20))
    svi_pcts %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    division year pct in poverty pct not in poverty pct unemployed pct employed pct housing cost-burdened pct not housing cost-burdened pct adults without high school diploma pct adults with high school diploma pct age 17 & under pct age 18-64 pct age 65+ pct single parent families pct other families pct limited English speakers pct proficient English speakers pct Minority race/ethnicity pct Non-Hispanic White race/ethnicity pct in multi-unit housing pct in mobile housing pct in other housing pct in crowded living spaces pct in non-crowded living spaces pct with no vehicle access pct with vehicle access pct in group living quarters pct not in group living quarters pct without health insurance pct with health insurance pct disabled civilians pct not disabled civilians
    United States 2010 23 77 8 92 36 64 15 85 24 63 13 17 83 5 95 35 65 13 7 80 3 97 9 91 3 97 15 85 15 85
    Middle Atlantic Division 2010 20 80 8 92 39 61 14 86 23 63 14 16 84 5 95 34 66 18 3 79 3 97 19 81 3 97 11 89 14 86
    United States 2020 21 79 5 95 30 70 11 89 22 62 16 16 84 4 96 40 60 14 6 80 3 97 8 92 2 98 9 91 15 85
    Middle Atlantic Division 2020 19 81 6 94 34 66 11 89 21 62 17 15 85 5 95 39 61 20 2 78 4 96 19 81 3 97 6 94 14 86

    View all SVI percentage columns

    Code
    colnames(svi_pcts)
     [1] "division"                              
     [2] "year"                                  
     [3] "pct in poverty"                        
     [4] "pct not in poverty"                    
     [5] "pct unemployed"                        
     [6] "pct employed"                          
     [7] "pct housing cost-burdened"             
     [8] "pct not housing cost-burdened"         
     [9] "pct adults without high school diploma"
    [10] "pct adults with high school diploma"   
    [11] "pct age 17 & under"                    
    [12] "pct age 18-64"                         
    [13] "pct age 65+"                           
    [14] "pct single parent families"            
    [15] "pct other families"                    
    [16] "pct limited English speakers"          
    [17] "pct proficient English speakers"       
    [18] "pct Minority race/ethnicity"           
    [19] "pct Non-Hispanic White race/ethnicity" 
    [20] "pct in multi-unit housing"             
    [21] "pct in mobile housing"                 
    [22] "pct in other housing"                  
    [23] "pct in crowded living spaces"          
    [24] "pct in non-crowded living spaces"      
    [25] "pct with no vehicle access"            
    [26] "pct with vehicle access"               
    [27] "pct in group living quarters"          
    [28] "pct not in group living quarters"      
    [29] "pct without health insurance"          
    [30] "pct with health insurance"             
    [31] "pct disabled civilians"                
    [32] "pct not disabled civilians"            

    Note about Group Quarters definition:

    The Census Bureau classifies all people not living in housing units as living in group quarters. A group quarters is a place where people live or stay, in a group living arrangement, that is owned or managed by an entity or organization providing housing and/or services for the residents.

    This is not a typical household-type living arrangement. These services may include custodial or medical care as well as other types of assistance, and residency is commonly restricted to those receiving these services. People living in group quarters are usually not related to each other.

    Group quarters include such places as college residence halls, residential treatment centers, skilled nursing facilities, group homes, military barracks, correctional facilities, and workers’ dormitories.

    Source: Census Glossary

    Infographics

    Now that we have our data loaded, we can begin creating our visualizations. Note that while all steps are presented here for the tutorial, in your actual report files, you will just want to display the infographics and NOT all the steps to create them. An example final file will be linked at the end of the report.

    Waffle Charts

    First, we will create our waffle charts. While waffle charts are not a visualization that comes pre-loaded with ggplot2, several R programmers have worked to devise methodologies to create these charts with icons from FontAwesome.

    We will be modifying code from the following resources to create our charts:

    Create Color Palettes

    Before we can plot our graphs, we first need to create color palettes for our two, three, and four category variables. The following colors were selected from Coolors: https://coolors.co/colors. Feel free to select your own colors of interest, but note that you will want to have one color in each group as a shade of gray/tan to contrast with your categories of interest.

    Code
    # Palette for SVI variables with only two categories
    one_color_palette <- c("#960018", "#91A3B0")
    scales::show_col(one_color_palette)

    Code
    # Palette for SVI variables with three categories
    two_color_palette <- c("#7BA05B", "#E23D28", "#91A3B0")
    scales::show_col(two_color_palette)

    Code
    # Palette for SVI variables with four categories
    three_color_palette <- c("#003262", "#91A3B0" , "#960018", "#ED9121")
    scales::show_col(three_color_palette)

    Select Font Awesome icons for pictorial charts

    Now that we have our color palettes created, we need to select the Font Awesome icons we want to use.

    In order to do this we need to follow three steps:

    1. Search the FontAwesome website (https://fontawesome.com/icons?d=gallery) for the icons that we would like to select.

    2. Identify the unicode value for the icons. We can find this on the top right corner of the icon page on the FontAwesome website. See screenshot below:

    Alt-text: Screenshot of GitHub icon on FontAwesome page with unicode enclosed in red box. Screenshot source
    1. Add \u to the font of the unicode value and save it to a variable for R in your project_data_steps.R file.

    2. If you do not already have the FontAwesome icons downloaded, you will need to download the Free for Web icons from the download section of the FontAwesome website https://fontawesome.com/download. Note: for our class, you already have the fonts downloaded in your team repo in the resources folder so you do not need to download anything.

    Selected icons:

    • Person icon f183:

    • House icon f015:

    • Car icon f1b9:

    • Parent icon e53a:

    • Health icon f80d :

    • Group Quarter icon f1ad:

    Code
    # Don't forget to save FontAwesome variables to project_data_steps.R, remember to add \u to indicate this is unicode for ggplot2.
    person_icon = "\uf183"
    house_icon = "\uf015"
    car_icon = "\uf1b9"
    parent_icon = "\ue53a"
    health_icon = "\uf80d"
    groupqtr_icon = "\uf1ad"

    Create Waffle Charts graphing function

    Next, we will create a function to actually plot our waffle charts. The function will create four (4) visuals: a 2010 divisional waffle chart, a 2010 national waffle chart, a 2020 divisional waffle chart, and a 2020 national waffle chart.

    For our function we’ll have the following inputs:

    • svi_pcts: data frame that contains SVI pcts nationally and divisionally for 2010 and 2020
    • var_search: keyword to search data frame columns for all columns relevant to the topic of interest
    • fa_icon: the selected icon to represent the waffle chart for our variable of interest
    • filter_year1: the first year of the two time periods, for our study it’s 2010
    • title_label_div_year1: the title for year1 (2010) division infographic
    • title_label_usa_year1: the title for year1 (2010) national infographic
    • filter_year2: the second year of the two time periods, for our study it’s 2020
    • title_label_div_year2: the title for year2 (2020) division infographic
    • title_label_usa_year2: the title for year2 (2020) national infographic
    • census_division: the name of the census division we want to filter our data to select

    Now we can construct our function (REMEMBER: We will want to place this custom function with our others in the project_data_steps.R file, not directly in our .RMD file):

    Code
    waffle_charts <- function(svi_pcts, var_search, fa_icon, filter_year1, title_label_div_year1, title_label_usa_year1, filter_year2, title_label_div_year2, title_label_usa_year2, census_division) {
    
    # Filter data  to find columns by var_search keyword, select all columns with keyword, count number of columns
    cols <- tolower(colnames(svi_pcts)) %>% str_detect(tolower(var_search))
    cols_select <- colnames(svi_pcts)[cols]
    n_cols_select <- length(cols_select)
    
    # Set color palette for visualizations
    one_color_palette <- c("#960018", "#91A3B0")
    two_color_palette <- c("#7BA05B", "#E23D28", "#91A3B0")
    three_color_palette <- c("#003262", "#91A3B0" , "#960018", "#ED9121")
    
    
    if (n_cols_select == 2) {
      color_palette <- one_color_palette
    } else if (n_cols_select == 3 & var_search != "age") {
      color_palette <- two_color_palette
    } else {
      color_palette <- three_color_palette
    }
    
    # Division Year1
    svidf <- svi_pcts %>% filter(division == census_division & year == filter_year1) %>% select(all_of(cols_select))
    
    # Create 10x10 grid and then display variables of interests' categories the number of times they appear in data to total to 100
    # For example, for group quarters, the code repeats group quarters 2 times and non-group quarters 98 times
    # for 2020 national data for plotting
    df <- expand.grid(y = 1:10, x = 1:10)
    df$category<-factor(rep(names(svidf),svidf), levels=names(svidf))
    
    # Pull in downloaded fontawesome font to display font awesome icons in graphic
    font_add(family = "FontAwesome", regular = here::here("resources/fontawesome-free-6.5.1-web/webfonts/fa-solid-900.ttf"))
    showtext_auto(T)
    p1 <- ggplot(df, aes(x = y, y = x, color=category)) +
        geom_text(label = fa_icon,
                  family = 'FontAwesome',
                  size = 6) +
      scale_color_manual(values = color_palette) +
      coord_fixed(ratio = 1) + 
      scale_x_continuous(expand = c(0.1, 0.1)) +
      scale_y_continuous(expand = c(0.1, 0.1),trans = 'reverse') +
      theme(
        panel.background = element_blank(),
        plot.title = element_text(size = 22, hjust = 0),
        axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        legend.title = element_blank(),
        legend.position = "right",
        legend.key = element_rect(colour = NA, fill = NA),
        legend.key.size = unit(1.5, "cm"),
        legend.text = element_text(size = rel(1.2))) +
      guides(fill = guide_legend(byrow = TRUE)) +
      labs(title=title_label_div_year1) 
    
    
    # Division Year2
    svidf <- svi_pcts %>% filter(division == census_division & year == filter_year2) %>% select(all_of(cols_select))
    
    # Create 10x10 grid and then display variables of interests' categories the number of times they appear in data to total to 100
    # For example, for group quarters, the code repeats group quarters 2 times and non-group quarters 98 times
    # for 2020 national data for plotting
    df <- expand.grid(y = 1:10, x = 1:10)
    df$category<-factor(rep(names(svidf),svidf), levels=names(svidf))
    
    # Pull in downloaded fontawesome font to display font awesome icons in graphic
    font_add(family = "FontAwesome", regular = here::here("resources/fontawesome-free-6.5.1-web/webfonts/fa-solid-900.ttf"))
    showtext_auto(T)
    p2 <- ggplot(df, aes(x = y, y = x, color=category)) +
        geom_text(label = fa_icon,
                  family = 'FontAwesome',
                  size = 6) +
      scale_color_manual(values = color_palette) +
      coord_fixed(ratio = 1) + 
      scale_x_continuous(expand = c(0.1, 0.1)) +
      scale_y_continuous(expand = c(0.1, 0.1),trans = 'reverse') +
      theme(
        panel.background = element_blank(),
        plot.title = element_text(size = 22, hjust = 0),
        axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        legend.title = element_blank(),
        legend.position = "right",
        legend.key = element_rect(colour = NA, fill = NA),
        legend.key.size = unit(1.5, "cm"),
        legend.text = element_text(size = rel(1.2)) ) +
      guides(fill = guide_legend(byrow = TRUE)) +
      labs(title=title_label_div_year2) 
    
    
    # National Year1
    svidf <- svi_pcts %>% filter(division != census_division & year == filter_year1) %>% select(all_of(cols_select))
    
    # Create 10x10 grid and then display variables of interests' categories the number of times they appear in data to total to 100
    # For example, for group quarters, the code repeats group quarters 2 times and non-group quarters 98 times
    # for 2020 national data for plotting
    df <- expand.grid(y = 1:10, x = 1:10)
    df$category<-factor(rep(names(svidf),svidf), levels=names(svidf))
    
    # Pull in downloaded fontawesome font to display font awesome icons in graphic
    font_add(family = "FontAwesome", regular = here::here("resources/fontawesome-free-6.5.1-web/webfonts/fa-solid-900.ttf"))
    showtext_auto(T)
    p3 <- ggplot(df, aes(x = y, y = x, color=category)) +
        geom_text(label = fa_icon,
                  family = 'FontAwesome',
                  size = 6) +
      scale_color_manual(values = color_palette) +
      coord_fixed(ratio = 1) + 
      scale_x_continuous(expand = c(0.1, 0.1)) +
      scale_y_continuous(expand = c(0.1, 0.1),trans = 'reverse') +
      theme(
        panel.background = element_blank(),
        plot.title = element_text(size = 22, hjust = 0),
        axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        legend.title = element_blank(),
        legend.position = "right",
        legend.key = element_rect(colour = NA, fill = NA),
        legend.key.size = unit(1.5, "cm"),
        legend.text = element_text(size = rel(1.2)) ) +
      guides(fill = guide_legend(byrow = TRUE)) +
      labs(title=title_label_usa_year1) 
    
    
    # National Year2
    svidf <- svi_pcts %>% filter(division != census_division & year == filter_year2) %>% select(all_of(cols_select))
    
    # Create 10x10 grid and then display variables of interests' categories the number of times they appear in data to total to 100
    # For example, for group quarters, the code repeats group quarters 2 times and non-group quarters 98 times
    # for 2020 national data for plotting
    df <- expand.grid(y = 1:10, x = 1:10)
    df$category<-factor(rep(names(svidf),svidf), levels=names(svidf))
    
    # Pull in downloaded fontawesome font to display font awesome icons in graphic
    font_add(family = "FontAwesome", regular = here::here("resources/fontawesome-free-6.5.1-web/webfonts/fa-solid-900.ttf"))
    showtext_auto(T)
    p4 <- ggplot(df, aes(x = y, y = x, color=category)) +
        geom_text(label = fa_icon,
                  family = 'FontAwesome',
                  size = 6) +
      scale_color_manual(values = color_palette) +
      coord_fixed(ratio = 1) + 
      scale_x_continuous(expand = c(0.1, 0.1)) +
      scale_y_continuous(expand = c(0.1, 0.1),trans = 'reverse') +
      theme(
        panel.background = element_blank(),
        plot.title = element_text(size = 22, hjust = 0),
        axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        legend.title = element_blank(),
        legend.position = "right",
        legend.key = element_rect(colour = NA, fill = NA),
        legend.key.size = unit(1.5, "cm"),
        legend.text = element_text(size = rel(1.2)) ) +
      guides(fill = guide_legend(byrow = TRUE)) +
      labs(title=title_label_usa_year2) 
    
    # Create list of all plots
    plotlist = list(p1, p2, p3, p4)
    
    return(plotlist)
    }

    To get a visual of the grid the code is creating, let’s take a peek:

    Code
    # For our example we'll search for our health variables
    var_search <- "health"
    cols <- tolower(colnames(svi_pcts)) %>% str_detect(tolower(var_search))
    cols_select <- colnames(svi_pcts)[cols]
    n_cols_select <- length(cols_select)
    Code
    # View selected columns for health:
    cols_select
    [1] "pct without health insurance" "pct with health insurance"   
    Code
    # View number of selected columns for health:
    n_cols_select
    [1] 2
    Code
    # Filter data by division and year, select columns of interest:
    svidf <- svi_pcts %>% filter(division == census_division & year == 2010) %>% select(all_of(cols_select))
    
    # View data
    svidf
    # A tibble: 1 × 2
      `pct without health insurance` `pct with health insurance`
                               <dbl>                       <dbl>
    1                             11                          89
    Code
    # Create 10x10 grid where Y represents columns and X represents rows
    df <- expand.grid(y = 1:10, x = 1:10)
    df %>% head(20) %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    y x
    1 1
    2 1
    3 1
    4 1
    5 1
    6 1
    7 1
    8 1
    9 1
    10 1
    1 2
    2 2
    3 2
    4 2
    5 2
    6 2
    7 2
    8 2
    9 2
    10 2
    Code
    # View names of df that will serve as labels
    names(svidf)
    [1] "pct without health insurance" "pct with health insurance"   
    Code
    # Repeat the names by the number if times they're in the df
    rep(names(svidf),svidf)
      [1] "pct without health insurance" "pct without health insurance"
      [3] "pct without health insurance" "pct without health insurance"
      [5] "pct without health insurance" "pct without health insurance"
      [7] "pct without health insurance" "pct without health insurance"
      [9] "pct without health insurance" "pct without health insurance"
     [11] "pct without health insurance" "pct with health insurance"   
     [13] "pct with health insurance"    "pct with health insurance"   
     [15] "pct with health insurance"    "pct with health insurance"   
     [17] "pct with health insurance"    "pct with health insurance"   
     [19] "pct with health insurance"    "pct with health insurance"   
     [21] "pct with health insurance"    "pct with health insurance"   
     [23] "pct with health insurance"    "pct with health insurance"   
     [25] "pct with health insurance"    "pct with health insurance"   
     [27] "pct with health insurance"    "pct with health insurance"   
     [29] "pct with health insurance"    "pct with health insurance"   
     [31] "pct with health insurance"    "pct with health insurance"   
     [33] "pct with health insurance"    "pct with health insurance"   
     [35] "pct with health insurance"    "pct with health insurance"   
     [37] "pct with health insurance"    "pct with health insurance"   
     [39] "pct with health insurance"    "pct with health insurance"   
     [41] "pct with health insurance"    "pct with health insurance"   
     [43] "pct with health insurance"    "pct with health insurance"   
     [45] "pct with health insurance"    "pct with health insurance"   
     [47] "pct with health insurance"    "pct with health insurance"   
     [49] "pct with health insurance"    "pct with health insurance"   
     [51] "pct with health insurance"    "pct with health insurance"   
     [53] "pct with health insurance"    "pct with health insurance"   
     [55] "pct with health insurance"    "pct with health insurance"   
     [57] "pct with health insurance"    "pct with health insurance"   
     [59] "pct with health insurance"    "pct with health insurance"   
     [61] "pct with health insurance"    "pct with health insurance"   
     [63] "pct with health insurance"    "pct with health insurance"   
     [65] "pct with health insurance"    "pct with health insurance"   
     [67] "pct with health insurance"    "pct with health insurance"   
     [69] "pct with health insurance"    "pct with health insurance"   
     [71] "pct with health insurance"    "pct with health insurance"   
     [73] "pct with health insurance"    "pct with health insurance"   
     [75] "pct with health insurance"    "pct with health insurance"   
     [77] "pct with health insurance"    "pct with health insurance"   
     [79] "pct with health insurance"    "pct with health insurance"   
     [81] "pct with health insurance"    "pct with health insurance"   
     [83] "pct with health insurance"    "pct with health insurance"   
     [85] "pct with health insurance"    "pct with health insurance"   
     [87] "pct with health insurance"    "pct with health insurance"   
     [89] "pct with health insurance"    "pct with health insurance"   
     [91] "pct with health insurance"    "pct with health insurance"   
     [93] "pct with health insurance"    "pct with health insurance"   
     [95] "pct with health insurance"    "pct with health insurance"   
     [97] "pct with health insurance"    "pct with health insurance"   
     [99] "pct with health insurance"    "pct with health insurance"   
    Code
    # Add category to df
    df$category<-factor(rep(names(svidf),svidf), levels=names(svidf))

    As we can see, there are 11 columns listed as ‘pct without health insurance’ across 2 rows (one full row of 10, and 1 column in the next row). This is as we expect looking at our svi_pcts df where 11% of the population in the Middle Atlantic Division in 2010 were uninsured:

    Code
    df %>% head(20) %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    y x category
    1 1 pct without health insurance
    2 1 pct without health insurance
    3 1 pct without health insurance
    4 1 pct without health insurance
    5 1 pct without health insurance
    6 1 pct without health insurance
    7 1 pct without health insurance
    8 1 pct without health insurance
    9 1 pct without health insurance
    10 1 pct without health insurance
    1 2 pct without health insurance
    2 2 pct with health insurance
    3 2 pct with health insurance
    4 2 pct with health insurance
    5 2 pct with health insurance
    6 2 pct with health insurance
    7 2 pct with health insurance
    8 2 pct with health insurance
    9 2 pct with health insurance
    10 2 pct with health insurance
    Code
    df %>% 
      pivot_wider(names_from = y, values_from = category) %>% head(10) %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    x 1 2 3 4 5 6 7 8 9 10
    1 pct without health insurance pct without health insurance pct without health insurance pct without health insurance pct without health insurance pct without health insurance pct without health insurance pct without health insurance pct without health insurance pct without health insurance
    2 pct without health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance
    3 pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance
    4 pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance
    5 pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance
    6 pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance
    7 pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance
    8 pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance
    9 pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance
    10 pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance pct with health insurance

    Now we can pause here to go save our newly created waffle_charts() function and FontAwesome icon variables to our project_data_steps.R file.

    Plot Waffle Charts

    Once our function is all set to go in our project_data_steps.R file, we can begin using the function to create our infographics:

    var_search Keywords & icons

    First, we can make a list of the var_search keywords and FA icons we need to plot our waffle charts sets:

    1. poverty, person_icon
    2. employ, person_icon
    3. cost-burdened, house_icon
    4. adults, person_icon
    5. age, person_icon
    6. families, parent_icon
    7. English, person_icon
    8. race, person_icon
    9. housing$, house_icon
    10. crowded, house_icon
    11. vehicle, car_icon
    12. group, groupqtr_icon
    13. health, health_icon
    14. disabled, person_icon

    Note that for housing we use the grepl endswith indicator $ to avoid pulling the housing cost-burdened variables in with our housing type percentages.

    Create infographics

    Now that we have our 14 keyword and icon combinations, we can utilize our new function to create all of our infographics.

    We will group them into our four SVI categories:

    • Socioeconomic Status
    • Household Characteristics
    • Racial and Ethnic Minority Status
    • Housing Type and Transportation

    Socieconomic Status Infographics

    Poverty
    Code
    pov <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "poverty", 
                  fa_icon = person_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Population living below 150% poverty in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Population living below 150% poverty in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Population living below 150% poverty in ", census_division, ", 2020"), 
                  title_label_usa_year2 = "Population living below 150% poverty in United States, 2020", 
                  census_division = census_division)
    Unemployed
    Code
    unemploy <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "employ", 
                  fa_icon = person_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Population unemployed in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Population unemployed in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Population unemployed in ", census_division, ", 2020"), 
                  title_label_usa_year2 = "Population unemployed in United States, 2020", 
                  census_division = census_division)
    Housing cost-burdened
    Code
    house_cost <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "cost-burdened", 
                  fa_icon = house_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Population housing cost-burdened in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Population housing cost-burdened in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Population housing cost-burdened in ", census_division, ", 2020"), 
                  title_label_usa_year2 = "Population housing cost-burdened in United States, 2020", 
                  census_division = census_division)
    High School Education
    Code
    hsed <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "adults", 
                  fa_icon = person_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Adult education status in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Adult education status in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Adult education status in ", census_division, ", 2020"), 
                  title_label_usa_year2 = "Adult education status in United States, 2020", 
                  census_division = census_division)
    Health Insurance
    Code
    healthins <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "health", 
                  fa_icon = health_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Population health insurance status in ", census_division, ", 2010"), 
                  title_label_usa_year1 = "Population health insurance status in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Population health insurance status in ", census_division, ", 2020"), 
                  title_label_usa_year2 = "Population health insurance status in United States, 2020", 
                  census_division = census_division)
    Create SES infographic, edit image
    Code
    # Join SES plots with patchwork package syntax
    pov[[1]] / pov[[2]] / pov[[3]] / pov[[4]] / unemploy[[1]] / unemploy[[2]] / unemploy[[3]] / unemploy[[4]] / house_cost[[1]] / house_cost[[2]] / house_cost[[3]] / house_cost[[4]] / hsed[[1]] / hsed[[2]] / hsed[[3]] / hsed[[4]] / healthins[[1]] / healthins[[2]] / healthins[[3]] / healthins[[4]]
    Code
    # Set output location
    out <- here::here(paste0("labs/wk03/imgs/infographic", "_SES", "_", str_replace_all(census_division, " ", "_"), ".png"))
    
    # Save plot as image
    ggsave(out, device = "png", width = 1050, height=520*20, units = "px")
    
    # Load image to magick package
    magick_image = magick::image_read(out)
    
    # Trim to remove excess margins
    cropped_out = magick::image_trim(magick_image)
    
    # Save final cropped infographic
    magick::image_write(cropped_out, path = out, format = "png")

    Household Characteristics Infographics

    Age

    Code
    age <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "age", 
                  fa_icon = person_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Age distribution in ", census_division, ", 2010"), 
                  title_label_usa_year1 = "Age distribution in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Age distribution in ", census_division, ", 2020"),
                  title_label_usa_year2 = "Age distribution in United States, 2020", 
                  census_division = census_division)
    age

    Disability status

    Code
    disabled <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "disabled", 
                  fa_icon = person_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Civilian population with disability in ", census_division, ", 2010"), 
                  title_label_usa_year1 = "Civilian population with disability in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Civilian population with disability in ", census_division, ", 2020"),
                  title_label_usa_year2 = "Civilian population with disability in United States, 2020", 
                  census_division = census_division)
    Family Types
    Code
    family <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "families", 
                  fa_icon = parent_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Family types in ", census_division, ", 2010"),  
                  title_label_usa_year1 = "Family types in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Family types in ", census_division, ", 2020"), 
                  title_label_usa_year2 = "Family types in United States, 2020", 
                  census_division = census_division)
    Language
    Code
    lang <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "English", 
                  fa_icon = person_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("English language proficiency in ", census_division, ", 2010"),  
                  title_label_usa_year1 = "English language proficiency in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("English language proficiency in ", census_division, ", 2020"), 
                  title_label_usa_year2 = "English language proficiency in United States, 2020", 
                  census_division = census_division)
    Create Household Characteristics infographic, edit image
    Code
    # Join household characteristics plots with patchwork package syntax
    age[[1]] / age[[2]] / age[[3]] / age[[4]] / disabled[[1]] / disabled[[2]] / disabled[[3]] / disabled[[4]] / family[[1]] / family[[2]] / family[[3]] / family[[4]] / lang[[1]] / lang[[2]] / lang[[3]] / lang[[4]]
    Code
    # Set output location
    out <- here::here(paste0("labs/wk03/imgs/infographic", "_HHChar", "_", str_replace_all(census_division, " ", "_"), ".png"))
    
    # Save plot as image
    ggsave(out, device = "png", width = 1050, height=520*16, units = "px")
    
    # Load image to magick package
    magick_image = magick::image_read(out)
    
    # Trim to remove excess margins
    cropped_out = magick::image_trim(magick_image)
    
    # Save final cropped infographic
    magick::image_write(cropped_out, path = out, format = "png")

    Racial and Ethnic Minority Status Infographics

    Code
    race <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "race", 
                  fa_icon = person_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 = paste0("Racial and Ethnic Diversity in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Racial and Ethnic Diversity in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Racial and Ethnic Diversity in ", census_division, ", 2020"), 
                  title_label_usa_year2 = "Racial and Ethnic Diversity in United States, 2020", 
                  census_division = census_division)
    Create Racial and Ethnic Minority infographic, edit image
    Code
    # Join racial and ethnic minority plots with patchwork package syntax
    race[[1]] / race[[2]] / race[[3]] / race[[4]]
    Code
    # Set output location
    out <- here::here(paste0("labs/wk03/imgs/infographic", "_REM", "_", str_replace_all(census_division, " ", "_"), ".png"))
    
    # Save plot as image
    ggsave(out, device = "png", width = 1050, height=520*4, units = "px")
    
    # Load image to magick package
    magick_image = magick::image_read(out)
    
    # Trim to remove excess margins
    cropped_out = magick::image_trim(magick_image)
    
    # Save final cropped infographic
    magick::image_write(cropped_out, path = out, format = "png")

    Housing Type and Transportation Infographics

    Housing Types
    Code
    house_type <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "housing$", 
                  fa_icon = house_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 =  paste0("Population housing types in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Population housing types in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Population housing types in ", census_division, ", 2020"),
                  title_label_usa_year2 = "Population housing types in United States, 2020", 
                  census_division = census_division)
    Crowded living
    Code
    crowded <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "crowded", 
                  fa_icon = house_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 =  paste0("Population living in crowded housing in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Population living in crowded housing in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Population living in crowded housing in ", census_division, ", 2020"),
                  title_label_usa_year2 = "Population living in crowded housing in United States, 2020", 
                  census_division = census_division)
    Transportation access
    Code
    car <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "vehicle", 
                  fa_icon = car_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 =  paste0("Population vehicle access in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Population vehicle access in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Population vehicle access in ", census_division, ", 2020"),
                  title_label_usa_year2 = "Population vehicle access in United States, 2020", 
                  census_division = census_division)
    Group Quarters
    Code
    group <- waffle_charts(svi_pcts = svi_pcts, 
                  var_search = "group", 
                  fa_icon = groupqtr_icon, 
                  filter_year1 = 2010, 
                  title_label_div_year1 =  paste0("Population living in group quarters in ", census_division, ", 2010"),
                  title_label_usa_year1 = "Population living in group quarters in United States, 2010", 
                  filter_year2 = 2020, 
                  title_label_div_year2 = paste0("Population living in group quarters in ", census_division, ", 2020"),
                  title_label_usa_year2 = "Population living in group quarters in United States, 2020", 
                  census_division = census_division)
    Create Housing and Transportation infographic, edit image
    Code
    # Join racial and ethnic minority plots with patchwork package syntax
    house_type[[1]] / house_type[[2]] / house_type[[3]] / house_type[[4]] / crowded[[1]] / crowded[[2]] / crowded[[3]] / crowded[[4]] / car[[1]] / car[[2]] / car[[3]] / car[[4]] / group[[1]] / group[[2]] / group[[3]] / group[[4]]
    Code
    # Set output location
    out <- here::here(paste0("labs/wk03/imgs/infographic", "_HAT", "_", str_replace_all(census_division, " ", "_"), ".png"))
    
    # Save plot as image
    ggsave(out, device = "png", width = 1050, height=520*16, units = "px")
    
    # Load image to magick package
    magick_image = magick::image_read(out)
    
    # Trim to remove excess margins
    cropped_out = magick::image_trim(magick_image)
    
    # Save final cropped infographic
    magick::image_write(cropped_out, path = out, format = "png")

    Check all infographics saved

    We can use the following code to check all of our files saved correctly to the imgs folder (there should be 4 for your division)

    Code
    images <- sprintf("imgs/%s", list.files(here::here("labs/wk03/imgs")))
    images
    [1] "imgs/infographic_HAT_Middle_Atlantic_Division.png"   
    [2] "imgs/infographic_HHChar_Middle_Atlantic_Division.png"
    [3] "imgs/infographic_REM_Middle_Atlantic_Division.png"   
    [4] "imgs/infographic_SES_Middle_Atlantic_Division.png"   

    View example infographic

    Alt-text: Infographic of Racial and Ethnic Diversity

    Choropleth Maps

    Now that we have learned how to create Waffle Charts, let’s explore another type of infographic: choropleth maps.

    As explained in the video at the beginning of this lab, choropleth maps are useful to explore variations in a topic of interest across geographical regions.

    For this lab we will examine how counties across our division of interest vary in social vulnerability as measured by the total number of SVI flags per 1000 people in the population.

    Download Shapefiles

    In order to create our map, we will need to download shapefiles from the R library tigris. However, before let’s first define shapefiles:

    What is a Shapefile?

    A shapefile is a vector data file format commonly used for geospatial analysis. Shapefiles store the location, geometry, and attribution of point, line, and polygon features.

    Why is a Shapefile Important?

    Shapefiles are one of the most common file formats for geospatial data. They store data as points, lines, or polygons. These three feature types form the basis of geospatial vector data analysis. Points can be used to represent addresses, points of interest, and parcel or ZIP Code centroids. Lines are often used to depict road networks or waterways. Polygonal data can represent anything with a boundary, such as a neighborhood, census block, or geofence.

    Source: Precisely glossary

    As explained in the Precisely glossary, shapefiles provide us with the data we need to represent geographic locations and features on a map.

    The US Census Bureau provides TIGER/Line shapefiles of the United States at various geographic levels. The tigris library was created to allow for R users to easily download these files for analysis.

    To begin, let’s identify the unique states in our division:

    Code
    # Find states in division
    states <- svi_2020_divisional %>% select(state) %>% unique()
    states <- states$state
    states
    [1] "NJ" "NY" "PA"
    Code
    # Find state FIPS codes
    state_fips <- fips_census_regions %>% filter(division == census_division) %>% select(state_code) %>% unique()
    state_fips <- state_fips$state_code
    state_fips
    [1] "34" "36" "42"

    Next, we can pull the state-level shapefiles from the tigris package:

    Code
    # Recall that we are working with 2010 census tracts, thus we need to pull the 2010 shapefiles
    st_sf <- tigris::states(year = 2010, cb = TRUE) %>% filter(STATE %in% state_fips)
    
    # Shift geometric locations of AK and HI if Pacific division
    if (census_division == "Pacific Division"){
        st_sf <- shift_geometry(
                      st_sf,
                      geoid_column = NULL,
                      preserve_area = FALSE,
                      position = c("below", "outside")
                    )
    } else {
      st_sf <- st_sf
    }
    Code
    # View data
    st_sf
    Simple feature collection with 3 features and 5 fields
    Geometry type: MULTIPOLYGON
    Dimension:     XY
    Bounding box:  xmin: -80.51989 ymin: 38.92852 xmax: -71.85621 ymax: 45.01585
    Geodetic CRS:  NAD83
           GEO_ID STATE         NAME LSAD CENSUSAREA                       geometry
    1 0400000US34    34   New Jersey <NA>    7354.22 MULTIPOLYGON (((-75.52684 3...
    2 0400000US36    36     New York <NA>   47126.40 MULTIPOLYGON (((-71.94356 4...
    3 0400000US42    42 Pennsylvania <NA>   44742.70 MULTIPOLYGON (((-75.41504 3...

    Finally we need to pull county-level shapefiles from the tigris package:

    Code
    # Recall that we are working with 2010 census tracts, thus we need to pull the 2010 shapefiles
    county_sf = tigris::counties(year = 2010, cb = TRUE) %>% filter(STATE %in% state_fips)
    
    # Shift geometric locations of AK and HI if Pacific division
    if (census_division == "Pacific Division"){
        county_sf <- shift_geometry(
                      county_sf,
                      geoid_column = NULL,
                      preserve_area = FALSE,
                      position = c("below", "outside")
                    )
    } else {
      county_sf <- county_sf
    }
    Code
    # View data
    county_sf
    Simple feature collection with 150 features and 8 fields
    Geometry type: MULTIPOLYGON
    Dimension:     XY
    Bounding box:  xmin: -80.51989 ymin: 38.92852 xmax: -71.85621 ymax: 45.01585
    Geodetic CRS:  NAD83
    First 10 features:
               GEO_ID STATE COUNTY       NAME   LSAD CENSUSAREA
    1  0500000US34001    34    001   Atlantic County    555.704
    2  0500000US34003    34    003     Bergen County    233.009
    3  0500000US34005    34    005 Burlington County    798.576
    4  0500000US34007    34    007     Camden County    221.263
    5  0500000US34009    34    009   Cape May County    251.425
    6  0500000US34011    34    011 Cumberland County    483.703
    7  0500000US34013    34    013      Essex County    126.212
    8  0500000US34015    34    015 Gloucester County    322.005
    9  0500000US34017    34    017     Hudson County     46.191
    10 0500000US34019    34    019  Hunterdon County    427.819
                             geometry COUNTYFP STATEFP
    1  MULTIPOLYGON (((-74.42314 3...      001      34
    2  MULTIPOLYGON (((-73.92676 4...      003      34
    3  MULTIPOLYGON (((-74.99056 4...      005      34
    4  MULTIPOLYGON (((-75.14001 3...      007      34
    5  MULTIPOLYGON (((-74.94545 3...      009      34
    6  MULTIPOLYGON (((-75.1145 39...      011      34
    7  MULTIPOLYGON (((-74.13892 4...      013      34
    8  MULTIPOLYGON (((-75.20008 3...      015      34
    9  MULTIPOLYGON (((-74.02039 4...      017      34
    10 MULTIPOLYGON (((-75.19491 4...      019      34

    View shapefiles

    Before we add our data, let’s take a look at our downloaded shapefiles as a map. We can use the ggplot2 package and its interactive extension ggiraph for mapping:

    State-level shapefile

    Code
    ggplot() +
      theme_void() +
      geom_sf(data=st_sf, aes(geometry=geometry))

    County-level shapefile

    Code
    ggplot() +
      theme_void() +
      geom_sf(data=county_sf, aes(geometry=geometry))

    As we can see, the state-level shapefile provide the outer boundary lines for our states while the county-level shapefile breaks down the boundaries to individual states. The tigris package has a variety of boundary levels that you can explore by visiting the package’s Github repo here: https://github.com/walkerke/tigris

    Prepare map data

    Now we need to summarize our Social Vulnerability Index (SVI) flag data by county within our states for 2010 and 2020.

    We can create a function to do this (and place it in our project_data_steps.R file):

    Code
    flag_summarize <- function (df, pop_var) {
     df_flags <- df %>% 
       filter(!is.na(F_TOTAL)) %>% 
       group_by(FIPS_st, FIPS_county) %>% 
       mutate(flag_count = sum(F_TOTAL),
            pop = sum(!!as.name(pop_var)),
            flag_by_pop = flag_count/pop) %>% 
       select(FIPS_st, FIPS_county, state, state_name, county, region_number, region, division_number, division, flag_count, pop, flag_by_pop) %>% 
       unique()
     
     # Assign counts to quantiles
     df_flags <- df_flags %>% 
       mutate(
         flag_count_quantile = cut(flag_count,quantile(df_flags$flag_count, probs = seq(0, 1, .2)),include.lowest=TRUE,labels=FALSE),
         flag_pop_quantile = cut(flag_by_pop,quantile(df_flags$flag_by_pop, probs = seq(0, 1, .2)),include.lowest=TRUE,labels=FALSE) ) %>%
       # Convert quantiles to ratios from .20 - 1
       mutate(flag_count_quantile = case_when(
                    flag_count_quantile == 1 ~ .20,
                    flag_count_quantile == 2 ~ .40,
                    flag_count_quantile == 3 ~ .60,
                    flag_count_quantile == 4 ~ .80,
                    flag_count_quantile == 5 ~ 1
                 )
       ) %>%
       # Convert quantiles to ratios from .20 - 1
       mutate(flag_pop_quantile = case_when(
                    flag_pop_quantile == 1 ~ .20,
                    flag_pop_quantile == 2 ~ .40,
                    flag_pop_quantile == 3 ~ .60,
                    flag_pop_quantile == 4 ~ .80,
                    flag_pop_quantile == 5 ~ 1
                 )
       )
    
     return(df_flags)
     }

    2010 SVI Flag data

    Code
    # Summarize total count of flags by state and county
    
    county_svi_flags10 <- flag_summarize(svi_2010_divisional, "E_TOTPOP_10")
    
    # View data
    county_svi_flags10 %>% head() %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    FIPS_st FIPS_county state state_name county region_number region division_number division flag_count pop flag_by_pop flag_count_quantile flag_pop_quantile
    34 001 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 317 264511 0.0011984 1.0 1.0
    34 003 NJ New Jersey Bergen County 1 Northeast Region 2 Middle Atlantic Division 436 896482 0.0004863 1.0 0.2
    34 005 NJ New Jersey Burlington County 1 Northeast Region 2 Middle Atlantic Division 241 447861 0.0005381 0.8 0.2
    34 007 NJ New Jersey Camden County 1 Northeast Region 2 Middle Atlantic Division 463 513574 0.0009015 1.0 0.8
    34 009 NJ New Jersey Cape May County 1 Northeast Region 2 Middle Atlantic Division 99 97684 0.0010135 0.6 1.0
    34 011 NJ New Jersey Cumberland County 1 Northeast Region 2 Middle Atlantic Division 179 148233 0.0012076 0.8 1.0
    Code
    # View flag count quartiles
    county_svi_flags10$flag_count_quantile %>% unique() %>% sort()
    [1] 0.2 0.4 0.6 0.8 1.0

    2020 SVI Flag data

    Code
    # Summarize total count of flags by state and county
    
    county_svi_flags20 <- flag_summarize(svi_2020_divisional, "E_TOTPOP_20")
    
    # View data
    county_svi_flags20 %>% head() %>% kbl() %>% kable_styling() %>% scroll_box(width = "100%")
    FIPS_st FIPS_county state state_name county region_number region division_number division flag_count pop flag_by_pop flag_count_quantile flag_pop_quantile
    34 001 NJ New Jersey Atlantic County 1 Northeast Region 2 Middle Atlantic Division 335 264649 0.0012658 1.0 1.0
    34 003 NJ New Jersey Bergen County 1 Northeast Region 2 Middle Atlantic Division 454 931264 0.0004875 1.0 0.2
    34 005 NJ New Jersey Burlington County 1 Northeast Region 2 Middle Atlantic Division 237 446295 0.0005310 0.8 0.2
    34 007 NJ New Jersey Camden County 1 Northeast Region 2 Middle Atlantic Division 463 506715 0.0009137 1.0 0.8
    34 009 NJ New Jersey Cape May County 1 Northeast Region 2 Middle Atlantic Division 84 92701 0.0009061 0.6 0.8
    34 011 NJ New Jersey Cumberland County 1 Northeast Region 2 Middle Atlantic Division 181 140953 0.0012841 0.8 1.0

    Create Interactive Tooltips

    As mentioned above, we can use the ggiraph package to create an interactive HTML map. This means that when we generate our map, we will be able to hover over our counties and have a popup appear with specific information about each segment of the map. This popup is called a tooltip. Thus, we need to add a column for the tooltip to our data frame.

    We can create a new function for this (and add it it our project_data_steps.R file):

    Code
    # Note that "\n" indicates a line break in HTML and will place the elements on different lines of our tooltip
    # htmltools::htmlEscape(df$county, TRUE) encodes any special characters county names for HTML
    map_tooltip <- function(df) {
      df$flag_pop_quantile_pct <- scales::percent(df$flag_pop_quantile, accuracy = 1)
        df$tooltip <- paste0(htmltools::htmlEscape(df$county, TRUE), ", ", df$state, "\n", 
                           "SVI Flags: ", prettyNum(df$flag_count, big.mark = ",", scientific = FALSE), "\n", 
                           "County Population: ", prettyNum(df$pop, big.mark = ",", scientific = FALSE), "\n", 
                           "SVI-Flag-to-Population Ratio (per 1,000): ", (round(df$flag_by_pop, 6)*1000), "\n",
                           "Quintile: ", df$flag_pop_quantile_pct)
      
    return(df)
    }
    2010 Tooltip
    Code
    county_svi_flags10 <- map_tooltip(county_svi_flags10)
    county_svi_flags10$tooltip[1]
    [1] "Atlantic County, NJ\nSVI Flags: 317\nCounty Population: 264,511\nSVI-Flag-to-Population Ratio (per 1,000): 1.198\nQuintile: 100%"
    2020 Tooltip
    Code
    county_svi_flags20 <- map_tooltip(county_svi_flags20)
    county_svi_flags20$tooltip[1]
    [1] "Atlantic County, NJ\nSVI Flags: 335\nCounty Population: 264,649\nSVI-Flag-to-Population Ratio (per 1,000): 1.266\nQuintile: 100%"

    Merge County Data and Shapefile

    Now that we have our county SVI flags summarized, we can merge this data with our county shapefile for mapping:

    2010 Merge

    Code
    county_svi_flags10 <- left_join(county_sf, county_svi_flags10, join_by(STATEFP == FIPS_st, COUNTYFP == FIPS_county)) 
    
    county_svi_flags10
    Simple feature collection with 150 features and 22 fields
    Geometry type: MULTIPOLYGON
    Dimension:     XY
    Bounding box:  xmin: -80.51989 ymin: 38.92852 xmax: -71.85621 ymax: 45.01585
    Geodetic CRS:  NAD83
    First 10 features:
               GEO_ID STATE COUNTY       NAME   LSAD CENSUSAREA COUNTYFP STATEFP
    1  0500000US34001    34    001   Atlantic County    555.704      001      34
    2  0500000US34003    34    003     Bergen County    233.009      003      34
    3  0500000US34005    34    005 Burlington County    798.576      005      34
    4  0500000US34007    34    007     Camden County    221.263      007      34
    5  0500000US34009    34    009   Cape May County    251.425      009      34
    6  0500000US34011    34    011 Cumberland County    483.703      011      34
    7  0500000US34013    34    013      Essex County    126.212      013      34
    8  0500000US34015    34    015 Gloucester County    322.005      015      34
    9  0500000US34017    34    017     Hudson County     46.191      017      34
    10 0500000US34019    34    019  Hunterdon County    427.819      019      34
       state state_name            county region_number           region
    1     NJ New Jersey   Atlantic County             1 Northeast Region
    2     NJ New Jersey     Bergen County             1 Northeast Region
    3     NJ New Jersey Burlington County             1 Northeast Region
    4     NJ New Jersey     Camden County             1 Northeast Region
    5     NJ New Jersey   Cape May County             1 Northeast Region
    6     NJ New Jersey Cumberland County             1 Northeast Region
    7     NJ New Jersey      Essex County             1 Northeast Region
    8     NJ New Jersey Gloucester County             1 Northeast Region
    9     NJ New Jersey     Hudson County             1 Northeast Region
    10    NJ New Jersey  Hunterdon County             1 Northeast Region
       division_number                 division flag_count    pop  flag_by_pop
    1                2 Middle Atlantic Division        317 264511 0.0011984379
    2                2 Middle Atlantic Division        436 896482 0.0004863455
    3                2 Middle Atlantic Division        241 447861 0.0005381134
    4                2 Middle Atlantic Division        463 513574 0.0009015254
    5                2 Middle Atlantic Division         99  97684 0.0010134720
    6                2 Middle Atlantic Division        179 148233 0.0012075584
    7                2 Middle Atlantic Division       1371 778923 0.0017601226
    8                2 Middle Atlantic Division        122 285223 0.0004277355
    9                2 Middle Atlantic Division       1090 622123 0.0017520651
    10               2 Middle Atlantic Division         24 128458 0.0001868315
       flag_count_quantile flag_pop_quantile flag_pop_quantile_pct
    1                  1.0               1.0                  100%
    2                  1.0               0.2                   20%
    3                  0.8               0.2                   20%
    4                  1.0               0.8                   80%
    5                  0.6               1.0                  100%
    6                  0.8               1.0                  100%
    7                  1.0               1.0                  100%
    8                  0.8               0.2                   20%
    9                  1.0               1.0                  100%
    10                 0.2               0.2                   20%
                                                                                                                                  tooltip
    1    Atlantic County, NJ\nSVI Flags: 317\nCounty Population: 264,511\nSVI-Flag-to-Population Ratio (per 1,000): 1.198\nQuintile: 100%
    2       Bergen County, NJ\nSVI Flags: 436\nCounty Population: 896,482\nSVI-Flag-to-Population Ratio (per 1,000): 0.486\nQuintile: 20%
    3   Burlington County, NJ\nSVI Flags: 241\nCounty Population: 447,861\nSVI-Flag-to-Population Ratio (per 1,000): 0.538\nQuintile: 20%
    4       Camden County, NJ\nSVI Flags: 463\nCounty Population: 513,574\nSVI-Flag-to-Population Ratio (per 1,000): 0.902\nQuintile: 80%
    5      Cape May County, NJ\nSVI Flags: 99\nCounty Population: 97,684\nSVI-Flag-to-Population Ratio (per 1,000): 1.013\nQuintile: 100%
    6  Cumberland County, NJ\nSVI Flags: 179\nCounty Population: 148,233\nSVI-Flag-to-Population Ratio (per 1,000): 1.208\nQuintile: 100%
    7      Essex County, NJ\nSVI Flags: 1,371\nCounty Population: 778,923\nSVI-Flag-to-Population Ratio (per 1,000): 1.76\nQuintile: 100%
    8   Gloucester County, NJ\nSVI Flags: 122\nCounty Population: 285,223\nSVI-Flag-to-Population Ratio (per 1,000): 0.428\nQuintile: 20%
    9    Hudson County, NJ\nSVI Flags: 1,090\nCounty Population: 622,123\nSVI-Flag-to-Population Ratio (per 1,000): 1.752\nQuintile: 100%
    10    Hunterdon County, NJ\nSVI Flags: 24\nCounty Population: 128,458\nSVI-Flag-to-Population Ratio (per 1,000): 0.187\nQuintile: 20%
                             geometry
    1  MULTIPOLYGON (((-74.42314 3...
    2  MULTIPOLYGON (((-73.92676 4...
    3  MULTIPOLYGON (((-74.99056 4...
    4  MULTIPOLYGON (((-75.14001 3...
    5  MULTIPOLYGON (((-74.94545 3...
    6  MULTIPOLYGON (((-75.1145 39...
    7  MULTIPOLYGON (((-74.13892 4...
    8  MULTIPOLYGON (((-75.20008 3...
    9  MULTIPOLYGON (((-74.02039 4...
    10 MULTIPOLYGON (((-75.19491 4...

    2020 Merge

    Code
    county_svi_flags20 <- left_join(county_sf, county_svi_flags20, join_by(STATEFP == FIPS_st, COUNTYFP == FIPS_county)) 
    
    county_svi_flags20
    Simple feature collection with 150 features and 22 fields
    Geometry type: MULTIPOLYGON
    Dimension:     XY
    Bounding box:  xmin: -80.51989 ymin: 38.92852 xmax: -71.85621 ymax: 45.01585
    Geodetic CRS:  NAD83
    First 10 features:
               GEO_ID STATE COUNTY       NAME   LSAD CENSUSAREA COUNTYFP STATEFP
    1  0500000US34001    34    001   Atlantic County    555.704      001      34
    2  0500000US34003    34    003     Bergen County    233.009      003      34
    3  0500000US34005    34    005 Burlington County    798.576      005      34
    4  0500000US34007    34    007     Camden County    221.263      007      34
    5  0500000US34009    34    009   Cape May County    251.425      009      34
    6  0500000US34011    34    011 Cumberland County    483.703      011      34
    7  0500000US34013    34    013      Essex County    126.212      013      34
    8  0500000US34015    34    015 Gloucester County    322.005      015      34
    9  0500000US34017    34    017     Hudson County     46.191      017      34
    10 0500000US34019    34    019  Hunterdon County    427.819      019      34
       state state_name            county region_number           region
    1     NJ New Jersey   Atlantic County             1 Northeast Region
    2     NJ New Jersey     Bergen County             1 Northeast Region
    3     NJ New Jersey Burlington County             1 Northeast Region
    4     NJ New Jersey     Camden County             1 Northeast Region
    5     NJ New Jersey   Cape May County             1 Northeast Region
    6     NJ New Jersey Cumberland County             1 Northeast Region
    7     NJ New Jersey      Essex County             1 Northeast Region
    8     NJ New Jersey Gloucester County             1 Northeast Region
    9     NJ New Jersey     Hudson County             1 Northeast Region
    10    NJ New Jersey  Hunterdon County             1 Northeast Region
       division_number                 division flag_count    pop  flag_by_pop
    1                2 Middle Atlantic Division        335 264649 0.0012658276
    2                2 Middle Atlantic Division        454 931264 0.0004875094
    3                2 Middle Atlantic Division        237 446295 0.0005310389
    4                2 Middle Atlantic Division        463 506715 0.0009137286
    5                2 Middle Atlantic Division         84  92701 0.0009061391
    6                2 Middle Atlantic Division        181 140953 0.0012841160
    7                2 Middle Atlantic Division       1392 796608 0.0017474090
    8                2 Middle Atlantic Division        110 291749 0.0003770364
    9                2 Middle Atlantic Division       1044 671928 0.0015537379
    10               2 Middle Atlantic Division         32 125063 0.0002558710
       flag_count_quantile flag_pop_quantile flag_pop_quantile_pct
    1                  1.0               1.0                  100%
    2                  1.0               0.2                   20%
    3                  0.8               0.2                   20%
    4                  1.0               0.8                   80%
    5                  0.6               0.8                   80%
    6                  0.8               1.0                  100%
    7                  1.0               1.0                  100%
    8                  0.6               0.2                   20%
    9                  1.0               1.0                  100%
    10                 0.2               0.2                   20%
                                                                                                                                  tooltip
    1    Atlantic County, NJ\nSVI Flags: 335\nCounty Population: 264,649\nSVI-Flag-to-Population Ratio (per 1,000): 1.266\nQuintile: 100%
    2       Bergen County, NJ\nSVI Flags: 454\nCounty Population: 931,264\nSVI-Flag-to-Population Ratio (per 1,000): 0.488\nQuintile: 20%
    3   Burlington County, NJ\nSVI Flags: 237\nCounty Population: 446,295\nSVI-Flag-to-Population Ratio (per 1,000): 0.531\nQuintile: 20%
    4       Camden County, NJ\nSVI Flags: 463\nCounty Population: 506,715\nSVI-Flag-to-Population Ratio (per 1,000): 0.914\nQuintile: 80%
    5       Cape May County, NJ\nSVI Flags: 84\nCounty Population: 92,701\nSVI-Flag-to-Population Ratio (per 1,000): 0.906\nQuintile: 80%
    6  Cumberland County, NJ\nSVI Flags: 181\nCounty Population: 140,953\nSVI-Flag-to-Population Ratio (per 1,000): 1.284\nQuintile: 100%
    7     Essex County, NJ\nSVI Flags: 1,392\nCounty Population: 796,608\nSVI-Flag-to-Population Ratio (per 1,000): 1.747\nQuintile: 100%
    8   Gloucester County, NJ\nSVI Flags: 110\nCounty Population: 291,749\nSVI-Flag-to-Population Ratio (per 1,000): 0.377\nQuintile: 20%
    9    Hudson County, NJ\nSVI Flags: 1,044\nCounty Population: 671,928\nSVI-Flag-to-Population Ratio (per 1,000): 1.554\nQuintile: 100%
    10    Hunterdon County, NJ\nSVI Flags: 32\nCounty Population: 125,063\nSVI-Flag-to-Population Ratio (per 1,000): 0.256\nQuintile: 20%
                             geometry
    1  MULTIPOLYGON (((-74.42314 3...
    2  MULTIPOLYGON (((-73.92676 4...
    3  MULTIPOLYGON (((-74.99056 4...
    4  MULTIPOLYGON (((-75.14001 3...
    5  MULTIPOLYGON (((-74.94545 3...
    6  MULTIPOLYGON (((-75.1145 39...
    7  MULTIPOLYGON (((-74.13892 4...
    8  MULTIPOLYGON (((-75.20008 3...
    9  MULTIPOLYGON (((-74.02039 4...
    10 MULTIPOLYGON (((-75.19491 4...

    Plot Choropleth Maps

    Finally, we can create a function to plot our maps (REMEMBER you will want this function in your project_data_steps.R file):

    Code
    choropleth_map <- function(df, fill_var, year_var) {
    gg_int <- ggplot() +
      # Make theme empty for customization
      theme_void() +
      # Set color palette
      scale_fill_gradient2(low = "#B9D9EB", high="#003262", labels = scales::label_percent()) +
      # Load interactive shapefile layer with county data
      geom_sf_interactive(data=df , aes(geometry=geometry, fill=!!as.name(fill_var), tooltip = tooltip, data_id = tooltip), size = 0.1) +
        # Load overall state outlines from state shapefiles, can update linewidth thickness to highlight outlines depending on size/spacing of states, .5 - 1.5 is usually a good range
      geom_sf(data=st_sf, color="white", fill=NA, linewidth=.5, aes(geometry=geometry)) +
      labs(title= paste0(year_var, " SVI Flag to Population Ratio", "\U2014", census_division),
           fill="Quintile")
    
    return(gg_int)
    }

    2010 SVI Flag to Population Ratio Map

    Code
    gg_int10 <- choropleth_map(county_svi_flags10, "flag_pop_quantile", "2010")
    Code
    # View map
    girafe(ggobj = gg_int10)
    Code
    # Save map to HTML file for embedding in final project 
    save_int <- girafe(ggobj = gg_int10)
    setwd(here::here("labs/wk03/imgs"))
    htmlwidgets::saveWidget(widgetframe::frameableWidget(save_int), here::here(paste0("labs/wk03/imgs/", "flag_pop_quantile", "2010", "_", str_replace_all(census_division, " ", "_"), "map.html")))
    setwd(here::here("labs/wk03/"))
    # Example embed code
    <iframe align = "center" width = "1000" height = "1000" src="./imgs/flag_pop_quantile2010_Middle_Atlantic_Divisionmap.html"></iframe>

    2020 SVI Flag to Population Ratio Map

    Code
    gg_int20 <- choropleth_map(county_svi_flags20, "flag_pop_quantile", "2020")
    Code
    # View map
    girafe(ggobj = gg_int20)
    Code
    # Save map to HTML file for embedding in final project 
    save_int <- girafe(ggobj = gg_int20)
    setwd(here::here("labs/wk03/imgs"))
    htmlwidgets::saveWidget(widgetframe::frameableWidget(save_int),here::here(paste0("labs/wk03/imgs/", "flag_pop_quantile", "2020",  "_", str_replace_all(census_division, " ", "_"), "map.html")))
    setwd(here::here("labs/wk03/"))
    # Example embed code
    <iframe align = "center" width = "1000" height = "1000" src="./imgs/flag_pop_quantile2020_Middle_Atlantic_Divisionmap.html"></iframe>

    Why map the flag-to-population ratio?

    As a final observation, you may be curious why we’re mapping the flag-to-population ratio instead of the flag count.

    This is because our SVI flags are summed to the county level from census tracts. Census tracts can vary in size and larger counties have more census tracts. Thus when counties are larger they are more likely to have more SVI flags than smaller counties and can falsely appear to be at higher risk. To adjust for this, we can divide our flag count by the county population and then multiply this by 1,000 to get a ratio of flag-to-population per 1,000 people. This provides us with an equitable comparison across counties of varying sizes.

    To visualize this, let’s look at a map of the 2020 data for flag_count:

    Code
    gg_int_cnt20 <- ggplot() +
      # Make theme empty for customization
      theme_void() +
      # Set color palette
      scale_fill_gradient2(low = "#B9D9EB", high="#003262", labels = scales::label_percent()) +
      # Load interactive shapefile layer with county data
      geom_sf_interactive(data=county_svi_flags20 , aes(geometry=geometry, fill=flag_count_quantile, tooltip = tooltip, data_id = tooltip), size = 0.1) +
      # Load overall state outlines from state shapefiles
      geom_sf(data=st_sf, color="white", fill=NA, linewidth=.5, aes(geometry=geometry)) +
      labs(title= paste0("2020", " SVI Flag Count—", census_division),
           fill="Quintile")
    
    girafe(ggobj = gg_int_cnt20)

    Here we can see that all of the areas that are highlighted as most vulnerable are large population centers consisting of the counties that make up cities such as Pittsburgh, PA (Allegheny County), Philadelphia, PA (Philadelphia County), Newark, NJ (Essex County), and the counties associated with the five boroughs of New York City (Bronx County, Kings County (Brooklyn), New York County (Manhattan), Queens County, Richmond County (Staten Island)).

    While there is some overlap with our population-adjusted maps (largely due to the fact that urban areas often are subject to social vulnerability), we can also see that there are several smaller counties on the map that are not properly identified as being vulnerable for their population size.

    Therefore, we will omit graphing the raw flags count.

    Check all visuals saved

    Finally we can see all of the infographics we’ve saved:

    Code
    images <- sprintf("imgs/%s", list.files(here::here("labs/wk03/imgs")))
    images
    [1] "imgs/flag_pop_quantile2010_Middle_Atlantic_Divisionmap.html"
    [2] "imgs/flag_pop_quantile2020_Middle_Atlantic_Divisionmap.html"
    [3] "imgs/infographic_HAT_Middle_Atlantic_Division.png"          
    [4] "imgs/infographic_HHChar_Middle_Atlantic_Division.png"       
    [5] "imgs/infographic_REM_Middle_Atlantic_Division.png"          
    [6] "imgs/infographic_SES_Middle_Atlantic_Division.png"          

    Save mapping data sets to re-use later

    Code
    saveRDS(county_svi_flags10, file = here::here(paste0("data/wrangling/", str_replace_all(census_division, " ", "_"), "_county_svi_flags10.rds")))
    
    saveRDS(county_svi_flags20, file = here::here(paste0("data/wrangling/", str_replace_all(census_division, " ", "_"), "_county_svi_flags20.rds")))
    
    saveRDS(st_sf, file = here::here(paste0("data/wrangling/", str_replace_all(census_division, " ", "_"), "_st_sf.rds")))
    
    saveRDS(county_sf, file = here::here(paste0("data/wrangling/", str_replace_all(census_division, " ", "_"), "_county_sf.rds")))

    Lab Submission Instructions

    Congratulations! You’ve reached the end of the Lab-03 Tutorial!

    You are now ready to complete your lab and submit it on Canvas.

    Be sure to add the following code to your .RMD file to create a border for your infographics:

    
    ::: {.cell}
    <style type="text/css">
    img {
      border: 50px solid white;
    }
    </style>
    :::
    

    You can also use the following code to embed your infographic images in your final file (Replace Middle_Atlantic_Division with your Census Division:

    ![Alt-text: Socioeconomic Status Infographic](./imgs/infographic_SES_Middle_Atlantic_Division.png)
    ![Alt-text: Household Characteristics Infographic](./imgs/infographic_HHChar_Middle_Atlantic_Division.png)
    ![Alt-text: Racial and Ethnic Minority Infographic](./imgs/infographic_REM_Middle_Atlantic_Division.png)
    ![Alt-text: Housing and Transportation Infographic](./imgs/infographic_HAT_Middle_Atlantic_Division.png)
    
    Note

    Your import step in your .RMD file should look similar to the code chunk below, but your project_data_steps.R file should have your initials on the end (i.e. project_data_steps_CS.R).

    Code
    import::here( "fips_census_regions",
                  "load_svi_data",
                  "merge_svi_data",
                  "census_division",
                  "svi_percentages10",
                  "svi_percentages20",
                  "waffle_charts",
                  "person_icon",
                  "house_icon",
                  "car_icon",
                  "parent_icon",
                  "health_icon",
                  "groupqtr_icon",
                  "flag_summarize",
                  "map_tooltip",
                  "choropleth_map",
                 # notice the use of here::here() that points to the .R file
                 # where all these R objects are created
                 .from = here::here("analysis/project_data_steps.R"),
                 .character_only = TRUE)

    The following checklist will ensure that you’re on track:

    Add newly created graphing and data wrangling functions to project_data_steps.R file
    Create new .RMD file and load data and functions from project_data_steps.R file
    Create Waffle Charts for all 14 SVI categories identified above for 2010 and 2020 for the United States nationally and your Census Division. Combine into 4 infographics (Socioeconomic Status, Household Characteristics, Racial and Ethnic Minority Status, Housing and Transportation Infographics) and write summaries of visualizations
    Create 2 Choropleth Maps (2010 and 2020) of SVI-Flags-to-Population Ratio for your Census Division and write summaries of spatial distribution
    Apply code chunk options to hide any unwanted warnings/messages
    Knit .RMD file to create .HTML and .md (Github Flavored) for output
    Review example report to get a general idea of what your output should look like
    Submit project_data_steps.R, .RMD, .HTML, and .MD files to Canvas
    Treat yourself for a job well-done!