
Evaluating Federal Programs

CPP 528 Final Project Spring 2020

Metrics


We are going to look at how distressed urban communities change between 2000 and 2010. Median home value will be one of the primary variables we will use in the model since it captures a lot of information about the ‘neighborhood’ (census tract). We will also utilize some neighborhood health metrics as predictors and outcomes.

Gentrification Metrics

Gentrification, the physical and demographic change of a neighborhood that brings in wealthier residents, new businesses, investment, and development, is an important topic in social science. It also raises social-justice concerns because low-income residents can be displaced from the neighborhood. Across the United States, the neighborhoods that gentrify have traditionally been low-income ones. Identifying neighborhood changes related to gentrification helps cities plan for the negative by-products of gentrification and urban development; for example, a city can plan for low-income amenities and rent-controlled housing in recently developed and developing neighborhoods. Gentrification metrics are constructed to capture these neighborhood changes. Gentrification is not caused by a single variable; it results from a pool of gentrifiers with a cultural preference for urban living, better amenities, disposable income, and urban housing. The metrics can track trends in latent constructs such as economic strength, human capital, community vulnerability, and community distress, among many other factors, allowing the city to intervene before low-income populations are severely affected.

The objective of this chapter is to capture an initial picture of neighborhoods in the year 2000 using the Longitudinal Tract Data Base (LTDB), which is built from census data and covers almost 72,000 unique census tracts. It contains almost 70 variables describing the race, socio-economic status, housing, age, and marital status of the population. The key to observing change is converting census tract data into useful ‘features’, or variables, that help predict gentrification. Theories of neighborhood change suggest that low-priced neighborhoods adjacent to wealthy ones have the highest probability of gentrifying in the face of new housing demand.

This paper uses median home value as one of the main variables for constructing gentrification metrics. To measure general dimensions of community strength and vulnerability in the year 2000, three instruments are created, each measuring a different latent construct. For the purpose of this paper, neighborhood health is constructed from the entire available LTDB data.

Set-Up

Library

# load necessary packages ----
library(here)        # project-relative file paths
library(tidyverse)   # data wrangling and graphics (dplyr, tidyr, ggplot2, readr, stringr)
library(psych)       # calculate instrument reliability (alpha)
library(xtable)      # nice tables
library(kableExtra)  # nice tables

Functions

# Load helper functions for graphing purposes
source(here("functions/lab-05_helper_functions_cs.R"))

Data

Load the LTDB data in raw form.

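# load necessary data
census_dat2000 <- read_csv(here("data/raw/Harmonized_Census_Tracts/ltdb_std_all_sample/ltdb_std_2000_sample.csv")) %>%
    # make all column names lower case
    rename_all(.funs = str_to_lower) %>%
    # replace the -999 missing-value code with NA in every column;
    # the result is assigned so that the recoding is actually kept
    mutate_all(function(i) ifelse(i == -999, NA, i))
# rows containing NA are kept: na.omit() would drop every tract that is
# missing any single variable, and the summary table below still reports NA counts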

Note that we are comparing neighborhoods (census tracts) within cities, so comparisons with rural areas are not very meaningful. Before you begin your analysis drop all of the rural tracts located outside of metropolitan areas.

Add metro attributes and urban / rural status by merging the crosswalk data using county FIPS codes.

URL <- "https://data.nber.org/cbsa-msa-fips-ssa-county-crosswalk/cbsatocountycrosswalk.csv"
crosswalk <- read.csv(URL, stringsAsFactors = FALSE)
# all metro areas in the country
sort(unique(crosswalk$cbsaname))
# counties without a CBSA name lie outside metropolitan areas
crosswalk$urban <- ifelse(crosswalk$cbsaname == "", "rural", "urban")
keep.these <- c("countyname", "state", "fipscounty", "msa", "msaname", "cbsa", "cbsaname",
    "urban")
cw <- dplyr::select(crosswalk, all_of(keep.these))
# merge into census data by county FIPS; watch the leading zeros problem
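The merge itself is only described in the comment above. The following is a minimal sketch of one way to handle the leading-zeros problem, assuming the county FIPS code is the first five characters of the 11-character tract ID. The data frames `census_toy` and `crosswalk_toy`, their values, and the helper columns `tract_chr` and `fips_chr` are made up for illustration and are not part of the project data.

```r
# Toy stand-ins for the census and crosswalk tables (illustrative values only).
census_toy <- data.frame(
    trtid10 = c(1001020100, 12101030404, 56025001800),  # first ID has lost its leading zero
    mhmval00 = c(74100, 100900, 142900)
)
crosswalk_toy <- data.frame(
    fipscounty = c(1001L, 12101L, 56025L),  # read.csv() would also drop the leading zero here
    urban = c("urban", "urban", "rural"),   # hypothetical labels
    stringsAsFactors = FALSE
)

# Restore the 11-character tract ID, then keep the first 5 characters
# (2-digit state code + 3-digit county code).
census_toy$tract_chr <- sprintf("%011.0f", census_toy$trtid10)
census_toy$fips_chr <- substr(census_toy$tract_chr, 1, 5)

# Pad the crosswalk county code to 5 characters so both keys share one format.
crosswalk_toy$fips_chr <- sprintf("%05d", crosswalk_toy$fipscounty)

# Left join: keep every tract, attach the urban / rural label where available.
merged_toy <- merge(census_toy, crosswalk_toy, by = "fips_chr", all.x = TRUE)
merged_toy[, c("tract_chr", "fips_chr", "urban")]
```

Comparing the keys as zero-padded character strings avoids silent mismatches for states whose FIPS code starts with 0 (for example, Alabama's 01).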

Neighborhood Health Indices

Construction of three neighborhood indices and their reliability.

Three instruments (metrics) are constructed to measure neighborhood health. The variables in each instrument are converted to z-scores, which puts them on a common scale and avoids overweighting any one variable. After standardizing, Cronbach's alpha is calculated for each instrument; it indicates how closely the variables are related to one another as a group. An alpha of 0.70 or higher is generally considered acceptable internal reliability.
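To make the two steps concrete, here is a minimal sketch on simulated data that standardizes three variables and then computes the reliability coefficient by hand from its usual formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of the summed score). The data frame `toy`, its column names, and the helper `cronbach_alpha()` are hypothetical and not part of the project code, which relies on psych::alpha() instead.

```r
set.seed(528)
n <- 200
latent <- rnorm(n)  # simulated shared "neighborhood strength" factor

# Three simulated indicators on very different scales (illustrative only).
toy <- data.frame(
    pcol_toy = 0.30 + 0.05 * latent + rnorm(n, sd = 0.04),
    hinc_toy = 50000 + 8000 * latent + rnorm(n, sd = 6000),
    mhmval_toy = 120000 + 30000 * latent + rnorm(n, sd = 20000)
)

# Step 1: z-scores, so each column has mean 0 and standard deviation 1.
z <- as.data.frame(scale(toy))

# Step 2: Cronbach's alpha from item variances and the variance of the row sums.
cronbach_alpha <- function(items) {
    items <- as.matrix(items)
    k <- ncol(items)
    item_var <- apply(items, 2, var)
    total_var <- var(rowSums(items))
    (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

alpha_toy <- cronbach_alpha(z)
alpha_toy
```

Because the three simulated indicators share one underlying factor, the resulting alpha is well above zero; indicators that moved independently would push it toward zero.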

Instrument 1

This instrument measures economic strength in the neighborhood. The variables included are the percent of the population with a college degree, median household income, and median home value. The percent with a college degree or higher is calculated by dividing the number of people with a college degree by the population aged 25 and over.

Instrument 1 has a Cronbach's alpha of 0.87.

economic_strength <- census_dat2000 %>%
    # share of residents aged 25+ with a college degree
    mutate(pcol00 = col00 / ag25up00) %>%
    select(pcol00, hinc00, mhmval00) %>%
    # convert each variable to a z-score
    mutate(
        pcol00zscore = scale(pcol00, center = TRUE, scale = TRUE),
        hinc00zscore = scale(hinc00, center = TRUE, scale = TRUE),
        mhmval00zscore = scale(mhmval00, center = TRUE, scale = TRUE)
    ) %>%
    select(pcol00zscore, hinc00zscore, mhmval00zscore) %>%
    rename(
        `percent with college degree` = pcol00zscore,
        `median household income` = hinc00zscore,
        `median home value` = mhmval00zscore
    ) %>%
    data.frame()
# pairs(economic_strength, lower.panel = panel.smooth, upper.panel = panel.cor)
alpha1 <- psych::alpha(economic_strength, check.keys = TRUE)$total$raw_alpha
alpha1
[1] 0.8699186

Instrument 2

This instrument measures community vulnerability in the neighborhood. The variables included are the percent of the population that is unemployed, the percent of the population with a high school degree or less, and per capita income.

Instrument 2 has a Cronbach's alpha of 0.77.

community_vulnerability <- census_dat2000 %>%
    # unemployment rate (unemployed / civilian labor force) and
    # share of residents aged 25+ with a high school degree or less
    mutate(punemp00 = unemp00 / clf00, phs00 = hs00 / ag25up00) %>%
    select(punemp00, phs00, incpc00) %>%
    # convert each variable to a z-score
    mutate(
        punemp00zscore = scale(punemp00, center = TRUE, scale = TRUE),
        phs00zscore = scale(phs00, center = TRUE, scale = TRUE),
        incpc00zscore = scale(incpc00, center = TRUE, scale = TRUE)
    ) %>%
    select(punemp00zscore, phs00zscore, incpc00zscore) %>%
    rename(
        `percent unemployed` = punemp00zscore,
        `percent with high school degree or less` = phs00zscore,
        `per capita income` = incpc00zscore
    ) %>%
    data.frame()
# pairs(community_vulnerability, lower.panel = panel.smooth, upper.panel = panel.cor)
alpha2 <- psych::alpha(community_vulnerability, check.keys = TRUE)$total$raw_alpha
alpha2
[1] 0.7732793

Instrument 3

This instrument measures community distress in the neighborhood. The variables included are median home value, median household income, percent unemployed, and percent widowed, divorced, or separated.

Instrument 3 has a Cronbach's alpha of 0.75.

community_distress <- census_dat2000 %>%
    # unemployment rate and share of residents aged 15+ who are
    # widowed, divorced, or separated
    mutate(punemp00 = unemp00 / clf00, pwds00 = wds00 / ag15up00) %>%
    select(mhmval00, hinc00, punemp00, pwds00) %>%
    # convert each variable to a z-score
    mutate(
        mhmval00zscore = scale(mhmval00, center = TRUE, scale = TRUE),
        hinc00zscore = scale(hinc00, center = TRUE, scale = TRUE),
        punemp00zscore = scale(punemp00, center = TRUE, scale = TRUE),
        pwds00zscore = scale(pwds00, center = TRUE, scale = TRUE)
    ) %>%
    select(mhmval00zscore, hinc00zscore, punemp00zscore, pwds00zscore) %>%
    rename(
        `median home value` = mhmval00zscore,
        `median household income` = hinc00zscore,
        `percent unemployed` = punemp00zscore,
        `percent widowed, divorced and separated` = pwds00zscore
    ) %>%
    data.frame()
# pairs(community_distress, lower.panel = panel.smooth, upper.panel = panel.cor)
alpha3 <- psych::alpha(community_distress, check.keys = TRUE)$total$raw_alpha
alpha3
[1] 0.7468594

Descriptive Statistics

Filter the crosswalk data to urban counties and keep one entry per unique CBSA. Then look up those CBSAs in the LTDB 2000 data and keep only the tracts that match.

Present descriptive statistics on all of the metrics for all urban census tracts.

cbsa <- cw %>% filter(urban == "urban") %>% select(cbsa, urban)
cbsa <- unique(cbsa)
options(scipen = 999)

cbsa.id <- cbsa$cbsa
keep.these <- census_dat2000$cbsa10 %in% cbsa.id
dat_2000_urban <- filter(census_dat2000, keep.these)
summary(dat_2000_urban) %>% kable() %>% kable_styling() %>% scroll_box(height = "400px",
    width = "100%", fixed_thead = TRUE)
trtid10 state county tract placefp10 cbsa10 metdiv10 ccflag10 pop00sf3 ruanc00 itanc00 geanc00 iranc00 scanc00 rufb00 itfb00 gefb00 irfb00 scfb00 fb00 nat00 n10imm00 ag5up00 olang00 lep00 ag25up00 hs00 col00 ag15up00 mar-00 wds00 clf00 unemp00 dflabf00 flabf00 empclf00 prof00 manuf00 semp00 ag18cv00 vet00 cni16u00 dis00 dpov00 npov00 n65pov00 dfmpov00 nfmpov00 dwpov00 nwpov00 dbpov00 nbpov00 dnapov00 nnapov00 dhpov00 nhpov00 dapov00 napov00 incpc00 hu00sp h30old00 ohu00sp h10yrs00 dmulti00 multi00 hinc00 hincw00 hincb00 hinch00 hinca00 mhmval00 mrent00 hh00 hhw00 hhb00 hhh00 hha00
Min. : 1001020100 Length:40351 Length:40351 Length:40351 Min. : 100 Min. :10180 Min. :99999 Min. :0.0000 Min. : 0 Min. : 0.00 Min. : 0.0 Min. : 0.0 Min. : 0.0 Min. : 0.00 Min. : 0.000 Min. : 0.000 Min. : 0.000 Min. : 0.000 Min. : 0.000 Min. : 0.0 Min. : 0.0 Min. : 0.0 Min. : 0 Min. : 0.0 Min. : 0.0 Min. : 0 Min. : 0 Min. : 0.0 Min. : 0 Min. : 0 Min. : 0.0 Min. : 0 Min. : 0.00 Min. : 0 Min. : 0.0 Min. : 0 Min. : 0.0 Min. : 0.0 Min. : 0.0 Min. : 0 Min. : 0.0 Min. : 0 Min. : 0.0 Min. : 0 Min. : 0.0 Min. : 0.00 Min. : 0.0 Min. : 0.00 Min. : 0 Min. : 0.00 Min. : 0.0 Min. : 0.0 Min. : 0.00 Min. : 0.000 Min. : 0.0 Min. : 0.00 Min. : 0.000 Min. : 0.00 Min. : 0 Min. : 0 Min. : 0 Min. : 0 Min. : 0.0 Min. : 0 Min. : 0.0 Min. : 2499 Min. : 2499 Min. : 2499 Min. : 2499 Min. : 5000 Min. : 0 Min. : 0 Min. : 0 Min. : 0 Min. : 0.000 Min. : 0.00 Min. : 0.0000
1st Qu.:12101030404 Class :character Class :character Class :character 1st Qu.:28968 1st Qu.:19740 1st Qu.:99999 1st Qu.:0.0000 1st Qu.: 2697 1st Qu.: 0.00 1st Qu.: 36.0 1st Qu.: 162.0 1st Qu.: 124.0 1st Qu.: 15.00 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 62.0 1st Qu.: 27.0 1st Qu.: 16.0 1st Qu.: 2517 1st Qu.: 134.8 1st Qu.: 14.0 1st Qu.:1718 1st Qu.: 671 1st Qu.: 229.0 1st Qu.: 2124 1st Qu.: 1039 1st Qu.: 362.0 1st Qu.:1279 1st Qu.: 50.00 1st Qu.:1072 1st Qu.: 594.0 1st Qu.:1186 1st Qu.: 292.9 1st Qu.: 110.0 1st Qu.: 86.0 1st Qu.: 1992 1st Qu.: 238.0 1st Qu.:1679 1st Qu.: 262.0 1st Qu.: 2602 1st Qu.: 155.5 1st Qu.: 11.63 1st Qu.: 673.3 1st Qu.: 18.00 1st Qu.: 1481 1st Qu.: 78.05 1st Qu.: 23.0 1st Qu.: 0.0 1st Qu.: 0.00 1st Qu.: 0.000 1st Qu.: 38.0 1st Qu.: 0.00 1st Qu.: 8.008 1st Qu.: 0.00 1st Qu.: 15583 1st Qu.:1111 1st Qu.: 248 1st Qu.:1015 1st Qu.: 633.0 1st Qu.:1111 1st Qu.: 58.0 1st Qu.: 32068 1st Qu.: 34125 1st Qu.: 22688 1st Qu.: 26585 1st Qu.: 30000 1st Qu.: 74100 1st Qu.: 387 1st Qu.:1018 1st Qu.: 639 1st Qu.: 8.145 1st Qu.: 10.08 1st Qu.: 0.3678
Median :27145010102 Mode :character Mode :character Mode :character Median :55000 Median :31340 Median :99999 Median :0.0000 Median : 3743 Median : 10.33 Median : 89.0 Median : 341.0 Median : 232.0 Median : 45.00 Median : 0.000 Median : 0.000 Median : 5.842 Median : 0.000 Median : 0.000 Median : 157.5 Median : 69.0 Median : 55.0 Median : 3489 Median : 269.0 Median : 36.0 Median :2397 Median :1072 Median : 453.0 Median : 2936 Median : 1576 Median : 535.0 Median :1843 Median : 84.04 Median :1490 Median : 862.7 Median :1738 Median : 515.0 Median : 202.0 Median :149.0 Median : 2756 Median : 358.0 Median :2366 Median : 402.0 Median : 3645 Median : 312.0 Median : 28.68 Median : 962.4 Median : 43.00 Median : 2606 Median : 157.00 Median : 108.0 Median : 12.0 Median : 10.00 Median : 0.000 Median : 109.7 Median : 12.00 Median : 38.000 Median : 0.00 Median : 19696 Median :1532 Median : 627 Median :1412 Median : 912.1 Median :1532 Median : 239.0 Median : 41580 Median : 43249 Median : 34316 Median : 37680 Median : 48750 Median : 100900 Median : 489 Median :1413 Median :1098 Median : 41.000 Median : 31.86 Median : 11.3812
Mean :27767098651 NA NA NA Mean :56261 Mean :30268 Mean :99999 Mean :0.4274 Mean : 3844 Mean : 24.43 Mean : 156.8 Mean : 454.4 Mean : 261.5 Mean : 98.92 Mean : 2.921 Mean : 3.868 Mean : 9.364 Mean : 1.177 Mean : 1.661 Mean : 316.6 Mean : 120.7 Mean : 140.4 Mean : 3581 Mean : 528.8 Mean : 117.2 Mean :2464 Mean :1152 Mean : 604.5 Mean : 3013 Mean : 1643 Mean : 561.6 Mean :1905 Mean : 105.29 Mean :1524 Mean : 893.7 Mean :1799 Mean : 602.9 Mean : 248.8 Mean :170.5 Mean : 2828 Mean : 383.6 Mean :2443 Mean : 439.8 Mean : 3738 Mean : 442.7 Mean : 39.49 Mean : 991.0 Mean : 67.65 Mean : 2681 Mean : 206.81 Mean : 433.3 Mean : 107.7 Mean : 28.35 Mean : 6.461 Mean : 424.8 Mean : 98.03 Mean : 111.896 Mean : 14.95 Mean : 21240 Mean :1575 Mean : 699 Mean :1448 Mean : 965.9 Mean :1575 Mean : 388.3 Mean : 44766 Mean : 46447 Mean : 41561 Mean : 43713 Mean : 57489 Mean : 120462 Mean : 533 Mean :1449 Mean :1114 Mean : 160.742 Mean : 113.52 Mean : 34.6014
3rd Qu.:42003564300 NA NA NA 3rd Qu.:84250 3rd Qu.:40140 3rd Qu.:99999 3rd Qu.:1.0000 3rd Qu.: 4924 3rd Qu.: 30.71 3rd Qu.: 189.0 3rd Qu.: 611.2 3rd Qu.: 360.0 3rd Qu.: 109.00 3rd Qu.: 0.000 3rd Qu.: 2.439 3rd Qu.: 13.000 3rd Qu.: 0.000 3rd Qu.: 0.000 3rd Qu.: 368.0 3rd Qu.: 153.0 3rd Qu.: 154.0 3rd Qu.: 4585 3rd Qu.: 587.8 3rd Qu.: 100.5 3rd Qu.:3174 3rd Qu.:1546 3rd Qu.: 835.1 3rd Qu.: 3858 3rd Qu.: 2173 3rd Qu.: 733.0 3rd Qu.:2482 3rd Qu.: 133.01 3rd Qu.:1955 3rd Qu.:1164.0 3rd Qu.:2356 3rd Qu.: 824.8 3rd Qu.: 337.8 3rd Qu.:230.8 3rd Qu.: 3623 3rd Qu.: 502.0 3rd Qu.:3152 3rd Qu.: 577.0 3rd Qu.: 4812 3rd Qu.: 590.0 3rd Qu.: 55.00 3rd Qu.:1289.2 3rd Qu.: 90.00 3rd Qu.: 3753 3rd Qu.: 276.00 3rd Qu.: 427.1 3rd Qu.: 83.0 3rd Qu.: 29.00 3rd Qu.: 3.284 3rd Qu.: 372.0 3rd Qu.: 63.08 3rd Qu.: 109.693 3rd Qu.: 9.00 3rd Qu.: 24913 3rd Qu.:2017 3rd Qu.:1040 3rd Qu.:1864 3rd Qu.:1251.2 3rd Qu.:2017 3rd Qu.: 558.0 3rd Qu.: 54047 3rd Qu.: 55670 3rd Qu.: 52625 3rd Qu.: 53985 3rd Qu.: 72398 3rd Qu.: 142900 3rd Qu.: 623 3rd Qu.:1865 3rd Qu.:1555 3rd Qu.: 161.000 3rd Qu.: 106.53 3rd Qu.: 34.6893
Max. :56025001800 NA NA NA Max. :99999 Max. :49740 Max. :99999 Max. :1.0000 Max. :36206 Max. :1108.03 Max. :3841.4 Max. :4251.5 Max. :2958.1 Max. :2815.00 Max. :753.008 Max. :355.951 Max. :258.000 Max. :114.000 Max. :185.510 Max. :5827.0 Max. :2346.0 Max. :4061.0 Max. :31960 Max. :8121.0 Max. :4529.0 Max. :9448 Max. :6994 Max. :4644.7 Max. :28171 Max. :12555 Max. :4450.0 Max. :7139 Max. :6405.33 Max. :6162 Max. :3057.4 Max. :5358 Max. :3524.7 Max. :1934.0 Max. :983.5 Max. :11543 Max. :8128.5 Max. :9793 Max. :2149.0 Max. :20188 Max. :5157.0 Max. :438.00 Max. :5722.3 Max. :949.00 Max. :10726 Max. :4208.00 Max. :7523.0 Max. :4150.0 Max. :5837.29 Max. :2665.000 Max. :9450.0 Max. :4326.00 Max. :6423.000 Max. :1293.00 Max. :147633 Max. :7717 Max. :4975 Max. :5850 Max. :5834.4 Max. :7717 Max. :4768.0 Max. :200001 Max. :200001 Max. :200001 Max. :200001 Max. :250000 Max. :1000001 Max. :2001 Max. :5869 Max. :3744 Max. :2706.000 Max. :2196.00 Max. :1799.0000
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA's :137 NA's :319 NA's :4455 NA's :2847 NA's :8480 NA's :136 NA's :115 NA NA NA NA NA
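The full summary above covers every raw variable, which makes it hard to scan. As a rough sketch of a more compact alternative, the same statistics can be computed for a handful of columns and arranged with one row per variable. The data frame `toy_tracts`, its values, and the helper `describe_column()` are hypothetical; in practice the urban tract data restricted to the variables of interest would take the place of the toy data.

```r
# Toy stand-in for a few tract-level variables (illustrative values only).
toy_tracts <- data.frame(
    hinc00 = c(32068, 41580, NA, 54047),
    mhmval00 = c(74100, 100900, 142900, NA),
    punemp00 = c(0.04, 0.06, 0.05, 0.10)
)

# Summary statistics for one numeric column, ignoring missing values.
describe_column <- function(x) {
    c(
        min = min(x, na.rm = TRUE),
        median = median(x, na.rm = TRUE),
        mean = mean(x, na.rm = TRUE),
        max = max(x, na.rm = TRUE),
        n_missing = sum(is.na(x))
    )
}

# sapply() returns one column per variable; transpose to one row per variable.
toy_summary <- t(sapply(toy_tracts, describe_column))
round(toy_summary, 2)
```

A matrix in this shape can be passed to kable() in the same way as the summary() output, and it keeps the missing-value count next to the other statistics for each variable.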