CPP 528 Project Group 4

Part II - Evaluation of Tax Credits

Tax Credit Programs and Predictive Analysis

Quick Insights:

We begin by describing what the NMTC and LIHTC Programs are and how they work. We also revisit our prior model, which uses several independent variables to predict MHV growth, and discuss how those variables can be added to our program models as controls. With that established, we present a model for each program using a log-linear diff-in-diff approach. We find that both the NMTC and LIHTC Programs are marginally effective, and that the NMTC can be considered the “better” program.

Introduction

This chapter analyzes the program effects of the New Markets Tax Credit (NMTC) Program and the Low-Income Housing Tax Credit (LIHTC) Program from 2000 to 2010. We control for outside variables in order to accurately model median home value (MHV) growth.

Note: This chapter imports multiple functions defined in the utilitiesChapter3Chapter4.R file within the respective folder. Please refer to this file for more detail. The chapter will contain plots and tabular views with descriptions on the insight gathered.

Data Setup

# load necessary packages
library( dplyr )
library( here )
library( knitr )
library( pander )
library( stargazer )
library( scales )
library( gridExtra )

# load necessary functions and objects ----
# note: all of these are R objects that will be used through this .rmd file

import::here("S_TYPE",
             "d",
             "d1",
             "d2",
             "df",
             "d3",
             "d6",
             "PLOTS",
             "%>%",
             "nmtc",
             "lihtc",
# notice the use of here::here() that points to the .R file where all of these R objects are created
              .from = here::here("labs/utilitiesChapter3Chapter4.R"),
              .character_only = TRUE )

Overview of Programs

There are two social programs in place whose main goals are to improve low-income neighborhoods: the New Markets Tax Credit (NMTC) Program and the Low-Income Housing Tax Credit (LIHTC) Program. Both of these programs use federal tax credits to encourage private investors to fund infrastructure projects for distressed neighborhoods.

New Markets Tax Credits

The New Markets Tax Credit program addresses a key issue in distressed communities: the lack of overall investment. The credits incentivize investment in community development and economic growth, with the aim of a net positive outcome for those in need of low-income housing and for the neighborhood as a whole. Proponents of the NMTC program emphasize that it advocates for the communities being revitalized by including residents in the construction and application of the improvements, so the benefits reach residents directly. With the community at the forefront of every stakeholder’s mind and a continuous flow of funds for projects, these tax credits can be very beneficial.

Low-Income Housing Tax Credits

Low-Income Housing Tax Credits also promote neighborhood revitalization. The LIHTC program can help create more affordable housing for community residents, and developers can use the tax credits to raise the amount needed to fund their projects. Research has shown that projects such as building or rehabilitating low-income housing can demonstrably improve the immediate surroundings by raising property values. Including private developers ensures that parties other than the government and policymakers have skin in the game. More stakeholders means wider interest and more accountability for seeing a project through to completion. The program also has the potential to draw the attention of large investors from major corporations.


The NMTC Program has specific requirements regarding which areas can be developed: low-income communities with a poverty rate of at least 20% or a median family income at or below 80% of the area median. The LIHTC Program has no such requirements, so developers can choose to invest in either low-income or middle-income communities. That said, the return on investment is typically greater in low-income communities, so that is where most LIHTC credits end up. Although both programs serve similar neighborhoods, their reach is not identical. For example:

### POVERTY RATES
gridExtra::grid.arrange( PLOTS$pov_rate_2000$nmtc, 
                         PLOTS$pov_rate_2000$lihtc, 
                         nrow = 1 )

The NMTC tends to reach more high-poverty neighborhoods than the LIHTC does. Because the two programs serve different mixes of neighborhoods, we cannot compare them directly. We will instead analyze their effects in separate models.

Data Sources

We have access to records on New Markets Tax Credits (NMTC) that originate from this data source. They include information on each project, the census tract and city where it took place, the origination year, and more.

Our data on Low-Income Housing Tax Credits (LIHTC) derive from this data source. They include which projects took place, as well as when and where they occurred.

As part of our data cleaning, we have grouped all projects that originated after 2000 and before 2010 into one record per tract.
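The grouping step described above can be sketched as follows. This is a minimal illustration on made-up project records, not the code in utilitiesChapter3Chapter4.R: the data frame `projects`, its columns (`tractid`, `year`, `amount`), and the strict inequalities on the year bounds are all assumptions made for the example.

```r
library( dplyr )

# Hypothetical project-level records (names and values are illustrative only)
projects <- data.frame(
  tractid = c( "A", "A", "B", "C", "C" ),
  year    = c( 2003, 2007, 2011, 2001, 1999 ),
  amount  = c( 1e6, 2e6, 5e5, 3e6, 4e6 )
)

# Keep projects that originated after 2000 and before 2010,
# then collapse them to one record per tract
tract_totals <- projects %>%
  filter( year > 2000, year < 2010 ) %>%
  group_by( tractid ) %>%
  summarize( num.projects = n(),
             total.amount = sum( amount ),
             .groups = "drop" )

print( tract_totals )
```

Tract B drops out because its only project falls outside the window, while tract A's two projects are combined into a single row.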

Descriptive Statistics

The change in overall median home value from 2000 to 2010 adjusted for inflation gives us a basic understanding of the economy and structural changes in society and the housing market. We can see that on average, homes grew in value over this time period.

#Change in MHV 2000-2010 adjusted for inflation

hist( df$MHV.Change.00.to.10/1000, breaks=500, 
      xlim=c(-100,500), yaxt="n", xaxt="n",
      xlab="Thousands of US Dollars (adjusted to 2010)", cex.lab=1.0,
      ylab="", main="Change in Median Home Value 2000 to 2010",
      col="indianred4", border="white" )

axis( side=1, at=seq( from=-100, to=500, by=100 ), 
      labels=paste0( "$", seq( from=-100, to=500, by=100 ), "k" ) )
        
mean.x <- mean( df$MHV.Change.00.to.10/1000, na.rm=T )
abline( v=mean.x, col="yellow3", lwd=2, lty=2 )
text( x=200, y=1500, 
      labels=paste0( "Mean = ", dollar( round(1000*mean.x,0)) ), 
      col="yellow3", cex=1.8, pos=3 )

median.x <- median( df$MHV.Change.00.to.10/1000, na.rm=T )
abline( v=median.x, col="yellow4", lwd=2, lty=2 )
text( x=200, y=2000, 
      labels=paste0( "Median = ", dollar( round(1000*median.x,0)) ), 
      col="yellow4", cex=1.8, pos=3 )

Dollars Distributed by Program

par( mfrow=c(1,2) )


# Amounts given for NMTC

# amounts are plotted in units of $10,000, so the axis title and the
# mean/median labels are scaled to match
hist( nmtc$amount/10000, breaks=200, 
      xlim=c(0,3000), yaxt="n", xaxt="n",
      xlab="Tens of Thousands of Dollars", cex.lab=1.0,
      ylab="", main="Dollars Distributed via NMTC",
      col="indianred4", border="white" )

axis( side=1, at=seq( from=0, to=2500, by=500 ), 
      labels=paste0( "$", seq( from=0, to=2500, by=500 ) ) )
        
mean.nmtc.amt <- mean( nmtc$amount/10000, na.rm=T )
abline( v=mean.nmtc.amt, col="yellow3", lwd=2, lty=2 )
text( x=1700, y=1500, 
      labels=paste0( "Mean = ", dollar( round(10000*mean.nmtc.amt,0)) ), 
      col="yellow3", cex=1.3, pos=3 )

median.nmtc.amt <- median( nmtc$amount/10000, na.rm=T )
abline( v=median.nmtc.amt, col="yellow4", lwd=2, lty=2 )
text( x=1740, y=2000, 
      labels=paste0( "Median = ", dollar( round(10000*median.nmtc.amt,0)) ), 
      col="yellow4", cex=1.3, pos=1 )




# Amounts given for LIHTC

hist( lihtc$allocamt/1000, breaks=1000, 
      xlim=c(0,3000), yaxt="n", xaxt="n",
      xlab="Thousands of Dollars", cex.lab=1.0,
      ylab="", main="Dollars Distributed via LIHTC",
      col="indianred4", border="white" )

axis( side=1, at=seq( from=0, to=2500, by=500 ), 
      labels=paste0( "$", seq( from=0, to=2500, by=500 ), "" ) )
        
# use the same units as the histogram (thousands of dollars)
mean.lihtc.amt <- mean( lihtc$allocamt/1000, na.rm=T )
abline( v=mean.lihtc.amt, col="yellow3", lwd=2, lty=2 )
text( x=1700, y=5100, 
      labels=paste0( "Mean = ", dollar( round(1000*mean.lihtc.amt,0)) ), 
      col="yellow3", cex=1.3, pos=3 )

median.lihtc.amt <- median( lihtc$allocamt/1000, na.rm=T )
abline( v=median.lihtc.amt, col="yellow4", lwd=2, lty=2 )
text( x=1740, y=7000, 
      labels=paste0( "Median = ", dollar( round(1000*median.lihtc.amt,0)) ), 
      col="yellow4", cex=1.3, pos=1 )

Characteristics of Tracts That Received Them

# Granted 
pairs(d2[1:4], main = "Makeup of those granted either an NMTC or LIHTC tax credit",
      labels = c("unemployment" , "Household Income" , "Highschool Education", "Granted"),
      pch = 16, 
      bg = c("red")[unclass(d2$post)])

# Not granted 
pairs(d1[1:4], 
      main = "Makeup of those not granted either NMTC or LIHTC tax credits",
      labels = c("unemployment" , "Household Income" , "Highschool Education", "Granted"),
      pch = 16, 
      bg = c("red")[unclass(d1$post)])

 gridExtra::grid.arrange( PLOTS$mhv_growth$nmtc, 
                             PLOTS$mhv_growth$lihtc,
                             nrow = 2 )

Predictive Analysis

Background Information

The Baseline Model

Prior to this chapter, we explored how various census variables can be utilized in a model to predict median home value (MHV) growth. We determined that Median Household Income, Percent of High School Education, and Percent of Unemployment are all valuable in predicting median home value growth.

Following is the model:

reg.data <- d

reg.data$mhv.growth[ reg.data$mhv.growth > 200 ] <- NA
#reg.data$p.prof <- regdata$p.prof
#reg.data$p.vacant <- log10( reg.data$p.vacant + 1 )
#reg.data$p.white <- regdata$p.white

m1 <- lm( mhv.growth ~ p.unemp.00, data=reg.data )
m2 <- lm( mhv.growth ~ hinc00, data=reg.data )
m3 <- lm( mhv.growth ~ p.hs.edu.00, data=reg.data )
m4 <- lm( mhv.growth ~ p.unemp.00 + hinc00 + p.hs.edu.00, data=reg.data )

stargazer( m1, m2, m3, m4,
           type = S_TYPE, digits = 3,   # three decimals, as in the table shown
           dep.var.labels = ("MHV Growth"),
           covariate.labels = c("Unemployment", "Household Income", "HS Education", "Constant"),
           omit.stat = c("rsq", "f") )
Dependent variable: MHV Growth (standard errors in parentheses)

| | (1) | (2) | (3) | (4) |
|:--|:--|:--|:--|:--|
| Unemployment | 0.706\*\*\* (0.027) | | | 0.500\*\*\* (0.033) |
| Household Income | | -0.0001\*\*\* (0.00001) | | 0.00000 (0.00001) |
| HS Education | | | 0.627\*\*\* (0.019) | 0.533\*\*\* (0.020) |
| Constant | 25.060\*\*\* (0.221) | 33.494\*\*\* (0.354) | -15.991\*\*\* (1.378) | -12.343\*\*\* (1.467) |
| Observations | 58,557 | 58,557 | 58,557 | 58,557 |
| Adjusted R2 | 0.011 | 0.003 | 0.018 | 0.023 |
| Residual Std. Error | 34.786 (df = 58555) | 34.934 (df = 58555) | 34.659 (df = 58555) | 34.569 (df = 58553) |

Note: \*p\<0.1; \*\*p\<0.05; \*\*\*p\<0.01

The Controls

This model shows how these three variables can be used to predict MHV growth. As program analysts, we are interested in seeing how our specific programs can affect MHV growth. Because of this, we want to control for any variables that are naturally occurring outside of the scope of our programs, such as secular trends or natural maturation. We’ve shown that Median Household Income, Percent of High School Education, and Percent of Unemployment are all contributing factors to MHV growth, and consequently, we want to ensure that we do not include their effects as contributions to our programs. We can make sure to control for these variables as we analyze whether or not our social programs are effective at catalyzing neighborhood improvement.

Log-Linear Diff-in-Diff Models

We model the effects of these programs with a log-linear difference-in-differences (diff-in-diff) approach. Because the outcome is logged, the coefficients approximate growth rates, so we can compare the growth of the treatment and control groups to see whether or not our programs made a difference.
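As a sketch of the specification this approach implies, the models in this chapter can be written as below. The notation is ours and is an assumption for illustration: $y_{it}$ is the logged median home value of tract $i$ in period $t$, $\text{Treat}_i$ marks tracts that received program funding, $\text{Post}_t$ marks the 2010 observations, and $X_i$ collects the three control variables (unemployment, household income, high school education).

```latex
\log(\text{MHV}_{it}) = \beta_0
  + \beta_1 \,\text{Treat}_i
  + \beta_2 \,\text{Post}_t
  + \beta_3 \,(\text{Treat}_i \times \text{Post}_t)
  + \gamma^{\top} X_i
  + \varepsilon_{it}
```

Under this reading, $\beta_1$ is the pre-existing gap between treated and untreated tracts in 2000, $\beta_2$ is the secular trend shared by both groups, and $\beta_3$ is the additional growth of the treatment group, which is the diff-in-diff estimate of the program effect.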

NMTC Program Model

We created the difference-in-difference dataset in our data processing steps. We logged both the 2000 and the 2010 median home values, and we made sure to include all of the tracts that received NMTC funding (d$num.nmtc > 0). From there, we stored the 2000 and 2010 data in separate data frames, defined our treat and post variables, and stacked the two time periods together into one data source: d3.

We also need to include our control variables in this model. In order to do so, we added them into our data processing steps as well.
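The stacking steps described above can be illustrated with a small sketch. It is not the processing code from utilitiesChapter3Chapter4.R: the toy data frame `d_toy`, its columns (`mhv.00`, `mhv.10`) and its values are hypothetical, and the controls are left out to keep the example short.

```r
library( dplyr )

# Toy tract-level data (hypothetical names and values)
d_toy <- data.frame(
  tractid  = c( "A", "B", "C", "D" ),
  mhv.00   = c( 100000, 150000,  80000, 120000 ),
  mhv.10   = c( 140000, 190000, 125000, 150000 ),
  num.nmtc = c( 0, 2, 1, 0 )
)

# 2000 observations: logged outcome, treatment flag, post = 0
d_2000 <- d_toy %>%
  transmute( tractid,
             y     = log( mhv.00 ),
             treat = as.numeric( num.nmtc > 0 ),
             post  = 0 )

# 2010 observations: same tracts, post = 1
d_2010 <- d_toy %>%
  transmute( tractid,
             y     = log( mhv.10 ),
             treat = as.numeric( num.nmtc > 0 ),
             post  = 1 )

# Stack the two periods into one long dataset
d_stacked <- bind_rows( d_2000, d_2010 )

# Diff-in-diff on the stacked data: treat:post is the program effect
m_toy <- lm( y ~ treat + post + treat:post, data = d_stacked )
print( coef( m_toy ) )
```

Without controls, the `treat:post` coefficient equals the treated group's average log growth minus the control group's average log growth, which is a useful sanity check on the stacked data.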

# create the difference in difference model
# note: treat = B1, post = B2, treat*post = B3

m <- lm( y ~ treat.a + x1 + x2 + x3 + post + treat.a*post, data=d3 )

# display model results
stargazer::stargazer(m,
                     type = S_TYPE,
                     dep.var.labels = ("MHV"),
                     covariate.labels = c("NMTC", "Unemployment", "Household Income", "HS Education", "Post Treatment", "Diff-in-Diff", "Constant"),
                     digits = 2)
Dependent variable: MHV (standard errors in parentheses)

| | Estimate |
|:--|:--|
| NMTC | 0.13\*\*\* (0.01) |
| Unemployment | -0.005\*\*\* (0.0003) |
| Household Income | 0.0000\*\*\* (0.0000) |
| HS Education | 0.01\*\*\* (0.0002) |
| Post Treatment | 0.28\*\*\* (0.003) |
| Diff-in-Diff | 0.07\*\*\* (0.02) |
| Constant | 10.54\*\*\* (0.01) |
| Observations | 118,112 |
| R2 | 0.50 |
| Adjusted R2 | 0.50 |
| Residual Std. Error | 0.46 (df = 118105) |
| F Statistic | 19,547.44\*\*\* (df = 6; 118105) |

Note: \*p\<0.1; \*\*p\<0.05; \*\*\*p\<0.01

LIHTC Program Model

We also need to create the difference in difference dataset for the LIHTC Program. We included our control variables in the data processing. We made sure to include all of the tracts that received LIHTC funding (d$num.lihtc > 0). We stacked the two time periods together into one data source: d6.

# create the difference in difference model
# note: treat = B1, post = B2, treat*post = B3

m <- lm( y ~ treat.b + x1 + x2 + x3 + post + treat.b*post, data=d6 )

# display model results
stargazer::stargazer(m,
                     type = S_TYPE,
                     dep.var.labels = ("MHV"),
                     covariate.labels = c("LIHTC", "Unemployment", "Household Income", "HS Education", "Post Treatment", "Diff-in-Diff", "Constant"),
                     digits = 2)
Dependent variable: MHV (standard errors in parentheses)

| | Estimate |
|:--|:--|
| LIHTC | 0.04\*\*\* (0.01) |
| Unemployment | -0.005\*\*\* (0.0003) |
| Household Income | 0.0000\*\*\* (0.0000) |
| HS Education | 0.01\*\*\* (0.0002) |
| Post Treatment | 0.28\*\*\* (0.003) |
| Diff-in-Diff | 0.04\*\*\* (0.01) |
| Constant | 10.52\*\*\* (0.01) |
| Observations | 118,112 |
| R2 | 0.50 |
| Adjusted R2 | 0.50 |
| Residual Std. Error | 0.46 (df = 118105) |
| F Statistic | 19,496.83\*\*\* (df = 6; 118105) |

Note: \*p\<0.1; \*\*p\<0.05; \*\*\*p\<0.01

Interpretation of Results

In the NMTC Program model, the secular trend is roughly 28% growth (the coefficient on “post”). As mentioned before, this is the growth that would have occurred naturally had our program not taken place (the counterfactual). The treatment group grew about 7 percentage points more than the baseline/control group (the coefficient on “treat.a:post”).

The LIHTC Program model comes back with a secular trend of 28% as well, which makes sense because, again, this number represents the growth over time had neither program occurred. The treatment group in this program grew only 4 percentage points more than the baseline/control group (the coefficient on “treat.b:post”).
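Reading log-linear coefficients directly as percentages is an approximation that works well for small values. As a rough check, the sketch below converts the reported coefficients to exact percent changes with exp(b) - 1. The coefficient values are copied from the two full models above; the helper `pct_change` and the object names are ours, introduced only for this illustration.

```r
# Coefficients copied from the full NMTC and LIHTC models above
b_post      <- 0.28   # secular trend
b_did_nmtc  <- 0.07   # NMTC diff-in-diff
b_did_lihtc <- 0.04   # LIHTC diff-in-diff

# Exact percent change implied by a coefficient in a log-linear model
pct_change <- function( b ) 100 * ( exp( b ) - 1 )

growth <- data.frame(
  group      = c( "Control (secular trend)", "NMTC tracts", "LIHTC tracts" ),
  pct_growth = round( c( pct_change( b_post ),
                         pct_change( b_post + b_did_nmtc ),
                         pct_change( b_post + b_did_lihtc ) ), 1 )
)

print( growth )
```

The exact figures are somewhat larger than the coefficients suggest (about 32% rather than 28% for the secular trend), but the ordering of the two programs is unchanged.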

It appears that the added control variables had some impact on our models. Following are the models without any controls added:

NMTC Program Without Controls

m <- lm( y ~ treat.a + post + treat.a*post, data=d3 )

stargazer::stargazer( m,
                      type = S_TYPE,
                      dep.var.labels = ("MHV"),
                      covariate.labels = c("NMTC", "Post Treatment", "Diff-in-Diff", "Constant"),
                      digits = 2 )
Dependent variable: MHV (standard errors in parentheses)

| | Estimate |
|:--|:--|
| NMTC | -0.26\*\*\* (0.02) |
| Post Treatment | 0.23\*\*\* (0.004) |
| Diff-in-Diff | 0.10\*\*\* (0.02) |
| Constant | 11.96\*\*\* (0.003) |
| Observations | 118,112 |
| R2 | 0.04 |
| Adjusted R2 | 0.04 |
| Residual Std. Error | 0.64 (df = 118108) |
| F Statistic | 1,431.79\*\*\* (df = 3; 118108) |

Note: \*p\<0.1; \*\*p\<0.05; \*\*\*p\<0.01

LIHTC Program Without Controls

m <- lm( y ~ treat.b + post + treat.b*post, data=d6 )

stargazer::stargazer( m,
                      type = S_TYPE,
                      dep.var.labels = ("MHV"),
                      covariate.labels = c("LIHTC", "Post Treatment", "Diff-in-Diff", "Constant"),
                      digits = 2 )
Dependent variable: MHV (standard errors in parentheses)

| | Estimate |
|:--|:--|
| LIHTC | -0.21\*\*\* (0.01) |
| Post Treatment | 0.23\*\*\* (0.004) |
| Diff-in-Diff | 0.01 (0.01) |
| Constant | 11.98\*\*\* (0.003) |
| Observations | 118,112 |
| R2 | 0.04 |
| Adjusted R2 | 0.04 |
| Residual Std. Error | 0.64 (df = 118108) |
| F Statistic | 1,798.71\*\*\* (df = 3; 118108) |

Note: \*p\<0.1; \*\*p\<0.05; \*\*\*p\<0.01

In regard to the NMTC Program, without control variables, the treatment group grew 10 percentage points more than the baseline. This is larger than the 7 percentage points our full model presents, which is what we would expect: once the control variables are accounted for, the estimated program impact shrinks as we zero in on the program’s specific contribution. The LIHTC Program behaves differently. Without the control variables, its treatment group grew only 1 percentage point more than the baseline, an estimate that is not statistically significant and is smaller than the 4 percentage points in the full model.

Based on these results, we conclude that the programs are effective at catalyzing neighborhood improvement. The NMTC Program can be considered more effective, as its treatment group grew 7 percentage points more than tracts that did not receive the program. In contrast, the LIHTC Program’s treatment group grew 4 percentage points more. Home values in the areas where heavy investment took place improved more than in neighborhoods that did not receive the same funding.


References

*Census Geography: Bridging Data for Census Tracts Across Time*. n.d. Spatial Structures in the Social Sciences, Brown University. <https://s4.ad.brown.edu/Projects/Diversity/Researcher/Bridging.htm>.
*Low-Income Housing Tax Credit (LIHTC)*. n.d. Washington, DC: U.S. Department of Housing and Urban Development. <https://www.huduser.gov/portal/datasets/lihtc.html>.
*New Markets Tax Credit Program*. n.d. U.S. Department of the Treasury, Community Development Financial Institutions Fund. <https://www.cdfifund.gov/programs-training/programs/new-markets-tax-credit>.