Tax Credit Programs and Predictive Analysis
Quick Insights:
We begin by describing what the NMTC Program and the LIHTC Program are
and how they work. We then revisit our prior model, which uses several
independent variables to predict MHV growth, and show how those
variables can be added to our program models as control variables. With
that established, we present a log-linear diff-in-diff model for each
program. We find that both the NMTC Program and the LIHTC Program are
marginally effective, and that the NMTC can be considered the “better”
program.
Introduction
This chapter analyzes the program effects of the New Markets Tax Credit
(NMTC) Program and the Low-Income Housing Tax Credit (LIHTC) Program
from 2000 to 2010. We control for outside variables in order to
accurately model median home value (MHV) growth.
Note: This chapter imports multiple functions and objects defined in the
utilitiesChapter3Chapter4.R file in the labs folder. Please refer to
that file for more detail. The chapter presents plots and tables, each
with a description of the insight gathered.
Data Setup
# load necessary packages
library( dplyr )
library( here )
library( knitr )
library( pander )
library( stargazer )
library( scales )
library( gridExtra )
# load necessary functions and objects ----
# note: all of these are R objects that will be used throughout this .rmd file
import::here( "S_TYPE",
              "d",
              "d1",
              "d2",
              "df",
              "d3",
              "d6",
              "PLOTS",
              "%>%",
              "nmtc",
              "lihtc",
              # notice the use of here::here() that points to the .R file where all of these R objects are created
              .from = here::here("labs/utilitiesChapter3Chapter4.R"),
              .character_only = TRUE )
Overview of Programs
There are two social programs in place whose main goals are to improve
low-income neighborhoods: the New Markets Tax Credit (NMTC) Program
and the Low-Income Housing Tax Credit (LIHTC) Program. Both of these
programs use federal tax credits to encourage private investors to fund
infrastructure projects for distressed neighborhoods.
New Markets Tax Credits
The New Markets Tax Credits program addresses a key issue in distressed
communities: the lack of overall investment. The credits incentivize
investment in community development and economic growth, with the aim of
benefiting both residents in need of low-income housing and the
neighborhood as a whole. New Markets Tax Credits can promote
revitalization that directly benefits community residents. A key point
that proponents of the NMTC program cite is that it advocates for the
communities being revitalized by including residents in the planning and
construction of the improvements. With the community at the forefront of
every stakeholder’s mind and a continuous flow of funds for the
projects, these tax credits can be very beneficial.
Low-Income Housing Tax Credits
Low-Income Housing Tax Credits also promote neighborhood revitalization.
The LIHTC program can help create more affordable housing for community
residents. Developers can use these tax credits to raise the funding
their projects need. Research has shown that neighborhood improvements
that come from projects such as building or rehabilitating low-income
houses can improve the immediate surroundings by raising property
values. Including private developers ensures that parties other than the
government and policymakers have a stake in the outcome. More
stakeholders means wider interest and more accountability for seeing a
project through to completion. The program also has the potential to
attract large investors from major corporations.
The NMTC Program has specific requirements regarding which areas can be
developed: low-income communities with at least a 20% poverty rate or a
median family income at or below 80% of the area median. The LIHTC
program does not have these specific requirements, and developers can
choose to invest in either low-income or middle-income communities. That
said, there is typically a greater return on investment when developing
in low-income communities, so that is typically where the LIHTC tax
credits end up going. Although both programs serve similar
neighborhoods, their reach is not identical. For example:
### POVERTY RATES
gridExtra::grid.arrange( PLOTS$pov_rate_2000$nmtc,
PLOTS$pov_rate_2000$lihtc,
nrow = 1 )

The NMTC tends to cover a greater number of poverty-stricken
neighborhoods than the LIHTC does. Because of this, we cannot make a
direct comparison between the two programs. We will instead analyze
their effects in separate models.
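The NMTC eligibility rule described above can be sketched in a few lines
of R. This is only an illustration on made-up tracts: the column names
pov.rate and mfi.ratio are hypothetical, and the sketch assumes the 80%
threshold is measured against the area median family income.

```r
# Toy tracts; pov.rate and mfi.ratio are hypothetical column names.
# mfi.ratio = tract median family income / area median family income (assumed).
tracts <- data.frame(
  tractid   = c( "A", "B", "C", "D" ),
  pov.rate  = c( 0.25, 0.10, 0.18, 0.30 ),
  mfi.ratio = c( 0.95, 0.75, 0.85, 0.60 )
)

# A tract qualifies if EITHER condition holds:
# poverty rate of at least 20%, or family income at or below 80% of the area median
tracts$nmtc.eligible <- tracts$pov.rate >= 0.20 | tracts$mfi.ratio <= 0.80

print( tracts )
```

In this toy data only tract C fails both conditions. No comparable flag
is sketched for the LIHTC because, as noted above, that program does not
impose these requirements.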
Data Sources
Our records on New Markets Tax Credits (NMTC) originate from the CDFI
Fund’s New Markets Tax Credit Program (see References). They include
information on each project, the census tract and city where it took
place, the origination year, and more.
Our data on Low-Income Housing Tax Credits (LIHTC) derive from the U.S.
Department of Housing and Urban Development’s LIHTC data set (see
References). They include which projects took place, as well as when and
where they occurred.
As part of our data cleaning, we have grouped all projects that
originated after 2000 and before 2010 into one record per tract.
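The grouping step described above can be sketched as follows. This is a
minimal base R illustration on made-up project records, not the actual
cleaning code; the column names tractid, year, and amount are
hypothetical.

```r
# Toy project-level records; all names and values are illustrative assumptions
projects <- data.frame(
  tractid = c( "A", "A", "B", "C", "C" ),
  year    = c( 2001, 2005, 2003, 1999, 2008 ),
  amount  = c( 100, 250, 400, 50, 75 )
)

# Keep projects that originated after 2000 and before 2010
in.window <- projects[ projects$year > 2000 & projects$year < 2010, ]

# Collapse to one record per tract: project count and total dollars
num.projects <- tapply( in.window$amount, in.window$tractid, length )
total.amount <- tapply( in.window$amount, in.window$tractid, sum )

per.tract <- data.frame(
  tractid      = names( num.projects ),
  num.projects = as.vector( num.projects ),
  total.amount = as.vector( total.amount )
)

print( per.tract )
```

A per-tract count like num.projects is the kind of field that a filter
such as d$num.nmtc > 0 would rely on to identify treated tracts.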
Descriptive Statistics
The change in overall median home value from 2000 to 2010 adjusted for
inflation gives us a basic understanding of the economy and structural
changes in society and the housing market. We can see that on average,
homes grew in value over this time period.
#Change in MHV 2000-2010 adjusted for inflation
hist( df$MHV.Change.00.to.10/1000, breaks=500,
xlim=c(-100,500), yaxt="n", xaxt="n",
xlab="Thousands of US Dollars (adjusted to 2010)", cex.lab=1.0,
ylab="", main="Change in Median Home Value 2000 to 2010",
col="indianred4", border="white" )
axis( side=1, at=seq( from=-100, to=500, by=100 ),
labels=paste0( "$", seq( from=-100, to=500, by=100 ), "k" ) )
mean.x <- mean( df$MHV.Change.00.to.10/1000, na.rm=T )
abline( v=mean.x, col="yellow3", lwd=2, lty=2 )
text( x=200, y=1500,
labels=paste0( "Mean = ", dollar( round(1000*mean.x,0)) ),
col="yellow3", cex=1.8, pos=3 )
median.x <- median( df$MHV.Change.00.to.10/1000, na.rm=T )
abline( v=median.x, col="yellow4", lwd=2, lty=2 )
text( x=200, y=2000,
labels=paste0( "Median = ", dollar( round(1000*median.x,0)) ),
col="yellow4", cex=1.8, pos=3 )

Dollars Distributed by Program
par( mfrow=c(1,2) )
# Amounts given for NMTC (plotted in units of $10,000)
hist( nmtc$amount/10000, breaks=200,
      xlim=c(0,3000), yaxt="n", xaxt="n",
      xlab="Tens of Thousands of Dollars", cex.lab=1.0,
      ylab="", main="Dollars Distributed via NMTC",
      col="indianred4", border="white" )
axis( side=1, at=seq( from=0, to=2500, by=500 ),
      labels=paste0( "$", seq( from=0, to=2500, by=500 ), "" ) )
mean.nmtc.amt <- mean( nmtc$amount/10000, na.rm=T )
abline( v=mean.nmtc.amt, col="yellow3", lwd=2, lty=2 )
# multiply by 10000 so the label reports the mean in dollars
text( x=1700, y=1500,
      labels=paste0( "Mean = ", dollar( round(10000*mean.nmtc.amt,0)) ),
      col="yellow3", cex=1.3, pos=3 )
median.nmtc.amt <- median( nmtc$amount/10000, na.rm=T )
abline( v=median.nmtc.amt, col="yellow4", lwd=2, lty=2 )
text( x=1740, y=2000,
      labels=paste0( "Median = ", dollar( round(10000*median.nmtc.amt,0)) ),
      col="yellow4", cex=1.3, pos=1 )
# Amounts given for LIHTC
hist( lihtc$allocamt/1000, breaks=1000,
xlim=c(0,3000), yaxt="n", xaxt="n",
xlab="Thousands of Dollars", cex.lab=1.0,
ylab="", main="Dollars Distributed via LIHTC",
col="indianred4", border="white" )
axis( side=1, at=seq( from=0, to=2500, by=500 ),
labels=paste0( "$", seq( from=0, to=2500, by=500 ), "" ) )
# use the same units as the histogram above (thousands of dollars)
mean.lihtc.amt <- mean( lihtc$allocamt/1000, na.rm=T )
abline( v=mean.lihtc.amt, col="yellow3", lwd=2, lty=2 )
text( x=1700, y=5100,
labels=paste0( "Mean = ", dollar( round(1000*mean.lihtc.amt,0)) ),
col="yellow3", cex=1.3, pos=3 )
median.lihtc.amt <- median( lihtc$allocamt/1000, na.rm=T )
abline( v=median.lihtc.amt, col="yellow4", lwd=2, lty=2 )
text( x=1740, y=7000,
labels=paste0( "Median = ", dollar( round(1000*median.lihtc.amt,0)) ),
col="yellow4", cex=1.3, pos=1 )

Characteristics of Tracts That Received Them
# Granted
pairs(d2[1:4], main = "Makeup of tracts granted either an NMTC or LIHTC tax credit",
labels = c("Unemployment" , "Household Income" , "High School Education", "Granted"),
pch = 16,
bg = c("red")[unclass(d2$post)])

# Not granted
pairs(d1[1:4],
main = "Makeup of tracts not granted either NMTC or LIHTC tax credits",
labels = c("Unemployment" , "Household Income" , "High School Education", "Granted"),
pch = 16,
bg = c("red")[unclass(d1$post)])

gridExtra::grid.arrange( PLOTS$mhv_growth$nmtc,
PLOTS$mhv_growth$lihtc,
nrow = 2 )

Predictive Analysis
The Baseline Model
Prior to this chapter, we explored how various census variables can be
utilized in a model to predict median home value (MHV) growth. We
determined that Median Household Income, Percent of High School
Education, and Percent of Unemployment are all valuable in predicting
median home value growth.
Following is the model:
reg.data <- d
reg.data$mhv.growth[ reg.data$mhv.growth > 200 ] <- NA
#reg.data$p.prof <- regdata$p.prof
#reg.data$p.vacant <- log10( reg.data$p.vacant + 1 )
#reg.data$p.white <- regdata$p.white
m1 <- lm( mhv.growth ~ p.unemp.00, data=reg.data )
m2 <- lm( mhv.growth ~ hinc00, data=reg.data )
m3 <- lm( mhv.growth ~ p.hs.edu.00, data=reg.data )
m4 <- lm( mhv.growth ~ p.unemp.00 + hinc00 + p.hs.edu.00, data=reg.data )
stargazer( m1, m2, m3, m4,
           type = S_TYPE, digits = 3,  # three decimals, as in the table below
           dep.var.labels = ("MHV Growth"),
           covariate.labels = c("Unemployment", "Household Income", "HS Education", "Constant"),
           omit.stat = c("rsq", "f") )

Dependent variable: MHV Growth

|                     | (1)                 | (2)                 | (3)                 | (4)                 |
|:--------------------|:-------------------:|:-------------------:|:-------------------:|:-------------------:|
| Unemployment        | 0.706\*\*\*         |                     |                     | 0.500\*\*\*         |
|                     | (0.027)             |                     |                     | (0.033)             |
| Household Income    |                     | -0.0001\*\*\*       |                     | 0.00000             |
|                     |                     | (0.00001)           |                     | (0.00001)           |
| HS Education        |                     |                     | 0.627\*\*\*         | 0.533\*\*\*         |
|                     |                     |                     | (0.019)             | (0.020)             |
| Constant            | 25.060\*\*\*        | 33.494\*\*\*        | -15.991\*\*\*       | -12.343\*\*\*       |
|                     | (0.221)             | (0.354)             | (1.378)             | (1.467)             |
| Observations        | 58,557              | 58,557              | 58,557              | 58,557              |
| Adjusted R²         | 0.011               | 0.003               | 0.018               | 0.023               |
| Residual Std. Error | 34.786 (df = 58555) | 34.934 (df = 58555) | 34.659 (df = 58555) | 34.569 (df = 58553) |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01

The Controls
This model shows how these three variables can be used to predict MHV
growth. As program analysts, we are interested in seeing how our
specific programs can affect MHV growth. Because of this, we want to
control for any variables that are naturally occurring outside of the
scope of our programs, such as secular trends or natural maturation.
We’ve shown that Median Household Income, Percent of High School
Education, and Percent of Unemployment are all contributing factors to
MHV growth, and consequently, we want to ensure that we do not attribute
their effects to our programs. We therefore control for these variables
as we analyze whether or not our social programs are effective at
catalyzing neighborhood improvement.
Log-Linear Diff-in-Diff Models
The best way to model the effects of these programs is a log-linear
diff-in-diff model. We can compare the growth rates of the treatment and
control groups to see whether or not our programs made a difference.
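For reference, a log-linear diff-in-diff specification with controls can
be written as below. This is a sketch of the standard form; the symbols
X and gamma for the control variables are notation introduced here for
illustration, not taken from the original analysis.

```latex
\log(\mathrm{MHV}_{it}) = \beta_0
  + \beta_1\,\mathrm{treat}_i
  + \beta_2\,\mathrm{post}_t
  + \beta_3\,(\mathrm{treat}_i \times \mathrm{post}_t)
  + \gamma^{\top} X_i
  + \varepsilon_{it}
```

Under this form, the coefficient on post (B2) captures the growth shared
by all tracts between the two periods, and the coefficient on the
interaction (B3) is the diff-in-diff estimate: the additional growth of
the treated tracts relative to the control tracts. Because the outcome
is logged, both are read as approximate growth rates.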
NMTC Program Model
We created the difference-in-differences dataset in our data processing
steps. We logged both the 2000 and the 2010 median home values. We also
made sure to include all of the tracts that received NMTC funding
(d$num.nmtc > 0). From there, we stored the 2000 and 2010 data as
separate data frames, defining our treat and post variables. We then
stacked the two time periods together into one data source: d3.
We also need to include our control variables in this model, so we added
them in our data processing steps as well.
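The stacking steps described above can be sketched on toy data. This is
an illustration under assumed names and made-up values, not the
processing code in the utilities file; controls are left out to keep it
short.

```r
# Toy tract-level data; column names and values are illustrative assumptions
toy <- data.frame(
  mhv.00   = c( 100000, 120000,  80000,  90000 ),
  mhv.10   = c( 130000, 150000, 115000, 128000 ),
  num.nmtc = c( 0, 0, 2, 1 )
)

# Log the outcome in each period and flag tracts that received funding
y.00  <- log( toy$mhv.00 )
y.10  <- log( toy$mhv.10 )
treat <- as.numeric( toy$num.nmtc > 0 )

# One data frame per period: post = 0 for 2000, post = 1 for 2010
d.00 <- data.frame( y = y.00, treat = treat, post = 0 )
d.10 <- data.frame( y = y.10, treat = treat, post = 1 )

# Stack the two periods into one long data set
dd <- rbind( d.00, d.10 )

# Diff-in-diff regression; the interaction term is the program effect
m.toy <- lm( y ~ treat + post + treat*post, data = dd )
did   <- coef( m.toy )[ "treat:post" ]
print( did )
```

Without controls, the interaction coefficient equals the change in mean
log MHV for treated tracts minus the change for untreated tracts, which
is a useful check on any stacked data set.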
# create the difference in difference model
# note: treat = B1, post = B2, treat*post = B3
m <- lm( y ~ treat.a + x1 + x2 + x3 + post + treat.a*post, data=d3 )
# display model results
stargazer::stargazer(m,
type = S_TYPE,
dep.var.labels = ("MHV"),
covariate.labels = c("NMTC", "Unemployment", "Household Income", "HS Education", "Post Treatment", "Diff-in-Diff", "Constant"),
digits = 2)

|                     | Dependent variable: MHV         |
|:--------------------|:-------------------------------:|
| NMTC                | 0.13\*\*\*                      |
|                     | (0.01)                          |
| Unemployment        | -0.005\*\*\*                    |
|                     | (0.0003)                        |
| Household Income    | 0.0000\*\*\*                    |
|                     | (0.0000)                        |
| HS Education        | 0.01\*\*\*                      |
|                     | (0.0002)                        |
| Post Treatment      | 0.28\*\*\*                      |
|                     | (0.003)                         |
| Diff-in-Diff        | 0.07\*\*\*                      |
|                     | (0.02)                          |
| Constant            | 10.54\*\*\*                     |
|                     | (0.01)                          |
| Observations        | 118,112                         |
| R²                  | 0.50                            |
| Adjusted R²         | 0.50                            |
| Residual Std. Error | 0.46 (df = 118105)              |
| F Statistic         | 19,547.44\*\*\* (df = 6; 118105) |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01

LIHTC Program Model
We also need to create the difference in difference dataset for the
LIHTC Program. We included our control variables in the data processing.
We made sure to include all of the tracts that received LIHTC funding
(d$num.lihtc > 0). We stacked the two time periods together into
one data source: d6.
# create the difference in difference model
# note: treat = B1, post = B2, treat*post = B3
m <- lm( y ~ treat.b + x1 + x2 + x3 + post + treat.b*post, data=d6 )
# display model results
stargazer::stargazer(m,
type = S_TYPE,
dep.var.labels = ("MHV"),
covariate.labels = c("LIHTC", "Unemployment", "Household Income", "HS Education", "Post Treatment", "Diff-in-Diff", "Constant"),
digits = 2)

|                     | Dependent variable: MHV         |
|:--------------------|:-------------------------------:|
| LIHTC               | 0.04\*\*\*                      |
|                     | (0.01)                          |
| Unemployment        | -0.005\*\*\*                    |
|                     | (0.0003)                        |
| Household Income    | 0.0000\*\*\*                    |
|                     | (0.0000)                        |
| HS Education        | 0.01\*\*\*                      |
|                     | (0.0002)                        |
| Post Treatment      | 0.28\*\*\*                      |
|                     | (0.003)                         |
| Diff-in-Diff        | 0.04\*\*\*                      |
|                     | (0.01)                          |
| Constant            | 10.52\*\*\*                     |
|                     | (0.01)                          |
| Observations        | 118,112                         |
| R²                  | 0.50                            |
| Adjusted R²         | 0.50                            |
| Residual Std. Error | 0.46 (df = 118105)              |
| F Statistic         | 19,496.83\*\*\* (df = 6; 118105) |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01

Interpretation of Results
We see that in the NMTC Program model, the secular trend was roughly 28%
growth (the “post” coefficient of 0.28, measured in log points). As
mentioned before, this is the growth that would have occurred naturally
had our program not taken place (the counterfactual). We also see that
the treatment group grew about 7 percentage points more than the
baseline/control group (represented by “treat.a:post”).
The LIHTC Program model returns a secular trend of 28% as well, which
makes sense because, again, this number represents the growth over time
had neither program occurred. The treatment group in this program grew
only 4 percentage points more than the baseline/control group
(represented by “treat.b:post”).
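Because the outcome is logged, reading a coefficient directly as a
percentage is an approximation that works best for small values. The
sketch below converts the coefficients reported in the two full models
into exact percent changes using exp(b) - 1; the vector names are chosen
here for illustration.

```r
# Coefficients as reported in the two full models above (log points)
b <- c( post = 0.28, did.nmtc = 0.07, did.lihtc = 0.04 )

# Exact percent change implied by each log-point coefficient
pct <- 100 * ( exp( b ) - 1 )

print( round( pct, 1 ) )
```

The conversion gives roughly 32.3% for the secular trend, 7.3% for the
NMTC effect, and 4.1% for the LIHTC effect. The two program effects
barely change, while the larger secular trend is noticeably understated
by the direct reading.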
It appears that the added control variables had some impact on our
models. Following are the models without any controls added:
NMTC Program Without Controls
m <- lm( y ~ treat.a + post + treat.a*post, data=d3 )
stargazer::stargazer( m,
type = S_TYPE,
dep.var.labels = ("MHV"),
covariate.labels = c("NMTC", "Post Treatment", "Diff-in-Diff", "Constant"),
digits = 2 )

|                     | Dependent variable: MHV        |
|:--------------------|:------------------------------:|
| NMTC                | -0.26\*\*\*                    |
|                     | (0.02)                         |
| Post Treatment      | 0.23\*\*\*                     |
|                     | (0.004)                        |
| Diff-in-Diff        | 0.10\*\*\*                     |
|                     | (0.02)                         |
| Constant            | 11.96\*\*\*                    |
|                     | (0.003)                        |
| Observations        | 118,112                        |
| R²                  | 0.04                           |
| Adjusted R²         | 0.04                           |
| Residual Std. Error | 0.64 (df = 118108)             |
| F Statistic         | 1,431.79\*\*\* (df = 3; 118108) |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01

LIHTC Program Without Controls
m <- lm( y ~ treat.b + post + treat.b*post, data=d6 )
stargazer::stargazer( m,
type = S_TYPE,
dep.var.labels = ("MHV"),
covariate.labels = c("LIHTC", "Post Treatment", "Diff-in-Diff", "Constant"),
digits = 2 )

|                     | Dependent variable: MHV        |
|:--------------------|:------------------------------:|
| LIHTC               | -0.21\*\*\*                    |
|                     | (0.01)                         |
| Post Treatment      | 0.23\*\*\*                     |
|                     | (0.004)                        |
| Diff-in-Diff        | 0.01                           |
|                     | (0.01)                         |
| Constant            | 11.98\*\*\*                    |
|                     | (0.003)                        |
| Observations        | 118,112                        |
| R²                  | 0.04                           |
| Adjusted R²         | 0.04                           |
| Residual Std. Error | 0.64 (df = 118108)             |
| F Statistic         | 1,798.71\*\*\* (df = 3; 118108) |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01

In regard to the NMTC Program, without control variables, the treatment
group grew 10 percentage points more than the baseline. This is 3
percentage points larger than the 7 percentage points our full model
presents. Again, we would expect that with the control variables
accounted for, the program impact decreases as we home in on our
program’s specific contributions. The LIHTC Program does not follow this
pattern: without the control variables, its treatment group grew only 1
percentage point more than the baseline, an estimate that is not
statistically significant, compared with 4 percentage points in the full
model.
Based on these interpretations, we can conclude that the programs are
effective at catalyzing neighborhood improvement. The NMTC Program can
be considered more effective, as its treatment group grew 7 percentage
points more than tracts that did not receive the program. In contrast,
the LIHTC Program’s treatment group grew 4 percentage points more. Home
values in the areas where heavy investments took place saw greater
improvement than those in neighborhoods that did not receive the same
funding.
References
*Census Geography: Bridging Data for Census Tracts Across Time*. n.d.
Spatial Structures in the Social Sciences, Brown University.
<https://s4.ad.brown.edu/Projects/Diversity/Researcher/Bridging.htm>.
*Low-Income Housing Tax Credit (LIHTC)*. n.d. Washington, DC: U.S.
Department of Housing and Urban Development.
<https://www.huduser.gov/portal/datasets/lihtc.html>.
*New Markets Tax Credit Program*. n.d. U.S. Department of the Treasury,
Community Development Financial Institutions Fund.
<https://www.cdfifund.gov/programs-training/programs/new-markets-tax-credit>.