vignettes/chapter-04.Rmd
chapter-04.Rmd
Load the hayashir
package
library(hayashir)
And other libraries we need for this chapter
library(skimr) # summary stats
library(dplyr) # data manipulation
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(systemfit) # for fitting system of regressions
#> Loading required package: Matrix
#> Loading required package: car
#> Loading required package: carData
#>
#> Attaching package: 'car'
#> The following object is masked from 'package:dplyr':
#>
#> recode
#> Loading required package: lmtest
#> Loading required package: zoo
#>
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#>
#> as.Date, as.Date.numeric
#>
#> Attaching package: 'lmtest'
#> The following object is masked from 'package:hayashir':
#>
#> moneydemand
#>
#> Please cite the 'systemfit' package as:
#> Arne Henningsen and Jeff D. Hamann (2007). systemfit: A Package for Estimating Systems of Simultaneous Equations in R. Journal of Statistical Software 23(4), 1-40. http://www.jstatsoft.org/v23/i04/.
#>
#> If you have questions, suggestions, or comments regarding the 'systemfit' package, please use a forum or 'tracker' at systemfit's R-Forge site:
#> https://r-forge.r-project.org/projects/systemfit/
Let’s get a quick look at our data by looking at the first 10 rows:
head(greene, 10)
#> # A tibble: 10 x 9
#> firm_id total_cost output price_labor price_capital price_fuel
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0.213 8 6869. 64.9 18
#> 2 20 0.489 14 5439. 86.1 34.2
#> 3 17 0.616 50 9204. 90.5 32.1
#> 4 14 0.761 65 8972. 41.2 28.5
#> 5 28 0.636 67 6696. 58.3 25.4
#> 6 22 1.15 90 7190. 79.1 21.5
#> 7 16 1.34 183 5063. 74.4 35.5
#> 8 15 2.26 295 8218. 71.9 39.2
#> 9 27 2.05 374 7885. 82.5 26.3
#> 10 30 3.15 378 7895. 60.3 42.5
#> # … with 3 more variables: labor_share <dbl>, capital_share <dbl>,
#> # fuel_share <dbl>
# assign the greene data to the object 'df'
df <- greene
df %>%
mutate(output = output / 1e3) %>%
select(output, labor_share, capital_share, fuel_share) %>%
skim()
#> Skim summary statistics
#> n obs: 99
#> n variables: 4
#>
#> Variable type: numeric
#> variable missing complete n mean sd p0 p25 median p75
#> capital_share 0 99 99 0.23 0.062 0.098 0.2 0.22 0.25
#> fuel_share 0 99 99 0.63 0.092 0.24 0.6 0.65 0.69
#> labor_share 0 99 99 0.14 0.059 0.053 0.1 0.12 0.17
#> output 0 99 99 9 10.32 0.008 1.95 5.71 11.75
#> p100 hist
#> 0.46 ▁▃▇▆▁▁▁▁
#> 0.81 ▁▁▁▁▃▇▆▁
#> 0.33 ▃▇▃▂▂▁▁▁
#> 53.92 ▇▅▁▁▁▁▁▁
Following the text, the two equations we want to estimate are:
First, specify the equations we want to estimate:
eq1 <- labor_share ~ log(price_labor / price_fuel) + log(price_capital / price_fuel) + log(output)
eq2 <- capital_share ~ log(price_labor / price_fuel) + log(price_capital / price_fuel) + log(output)
As Hayashi tells us, the unique symmetry restriction is then
restrict <- c(
"labor_log(price_capital/price_fuel) = capital_log(price_labor/price_fuel)"
)
We estimate with the systemfit
function from the package with the same name:
sur_results <- systemfit(list(labor = eq1, capital = eq2),
method = "SUR",
restrict.matrix = restrict,
data= df)
summary(sur_results)
#>
#> systemfit results
#> method: SUR
#>
#> N DF SSR detRCov OLS-R2 McElroy-R2
#> system 198 191 0.422566 5e-06 0.410299 0.435865
#>
#> N DF SSR MSE RMSE R2 Adj R2
#> labor 99 95 0.171289 0.001803 0.042462 0.491163 0.475094
#> capital 99 95 0.251276 0.002645 0.051430 0.338655 0.317770
#>
#> The covariance matrix of the residuals used for estimation
#> labor capital
#> labor 0.001805434 -0.000173544
#> capital -0.000173544 0.002642036
#>
#> The covariance matrix of the residuals
#> labor capital
#> labor 0.001803045 -0.000174508
#> capital -0.000174508 0.002645015
#>
#> The correlations of the residuals
#> labor capital
#> labor 1.0000000 -0.0799093
#> capital -0.0799093 1.0000000
#>
#>
#> SUR estimates for 'labor' (equation 1)
#> Model Formula: labor_share ~ log(price_labor/price_fuel) + log(price_capital/price_fuel) +
#> log(output)
#>
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.13137378 0.10792322 -1.21729 0.22500
#> log(price_labor/price_fuel) 0.08359897 0.02041109 4.09576 6.2106e-05
#> log(price_capital/price_fuel) -0.02319930 0.01626401 -1.42642 0.15538
#> log(output) -0.02115268 0.00253067 -8.35852 1.2879e-14
#>
#> (Intercept)
#> log(price_labor/price_fuel) ***
#> log(price_capital/price_fuel)
#> log(output) ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.042462 on 95 degrees of freedom
#> Number of observations: 99 Degrees of Freedom: 95
#> SSR: 0.171289 MSE: 0.001803 Root MSE: 0.042462
#> Multiple R-Squared: 0.491163 Adjusted R-Squared: 0.475094
#>
#>
#> SUR estimates for 'capital' (equation 2)
#> Model Formula: capital_share ~ log(price_labor/price_fuel) + log(price_capital/price_fuel) +
#> log(output)
#>
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.31807238 0.08676358 3.66597 0.00031927
#> log(price_labor/price_fuel) -0.02319930 0.01626401 -1.42642 0.15537969
#> log(price_capital/price_fuel) 0.12218712 0.02017031 6.05777 7.2001e-09
#> log(output) -0.00858598 0.00306088 -2.80507 0.00555094
#>
#> (Intercept) ***
#> log(price_labor/price_fuel)
#> log(price_capital/price_fuel) ***
#> log(output) **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.05143 on 95 degrees of freedom
#> Number of observations: 99 Degrees of Freedom: 95
#> SSR: 0.251276 MSE: 0.002645 Root MSE: 0.05143
#> Multiple R-Squared: 0.338655 Adjusted R-Squared: 0.31777
Which yields approximately the same coefficients as in the text.
Then we can calculate estimates of the remaining coefficients using adding up restrictions, homogeneity and symmetry. For example: \[ \hat{\gamma}_{33} = \hat{\gamma}_{11} + 2 \hat{\gamma}_{12} + \hat{\gamma}_{22} \]
So that
gamma_11 <- sur_results$coefficients["labor_log(price_labor/price_fuel)"]
gamma_12 <- sur_results$coefficients["labor_log(price_capital/price_fuel)"]
gamma_22 <- sur_results$coefficients["capital_log(price_capital/price_fuel)"]
gamma_33 <- gamma_11 + 2 * gamma_12 + gamma_22
print(paste("gamma_33 is: ", round(gamma_33,3)))
#> [1] "gamma_33 is: 0.159"
Run the three regressions via pooled OLS. The equations are:
eq1a <- labor_share ~ log(price_labor) + log(price_capital) + log(price_fuel) + log(output)
eq2a <- capital_share ~ log(price_labor) + log(price_capital) + log(price_fuel) + log(output)
eq3a <- fuel_share ~ log(price_labor) + log(price_capital) + log(price_fuel) + log(output)
Then estimate:
pooled_ols <- systemfit(list(labor = eq1a, capital = eq2a, fuel = eq3a),
method = "OLS",
data= df)
summary(pooled_ols)
#>
#> systemfit results
#> method: OLS
#>
#> N DF SSR detRCov OLS-R2 McElroy-R2
#> system 297 282 0.77849 0 0.49798 0.261905
#>
#> N DF SSR MSE RMSE R2 Adj R2
#> labor 99 94 0.159682 0.001699 0.041216 0.525643 0.505458
#> capital 99 94 0.232271 0.002471 0.049709 0.388674 0.362661
#> fuel 99 94 0.386536 0.004112 0.064126 0.536604 0.516885
#>
#> The covariance matrix of the residuals
#> labor capital fuel
#> labor 1.69875e-03 -2.88179e-05 -0.00166993
#> capital -2.88179e-05 2.47097e-03 -0.00244216
#> fuel -1.66993e-03 -2.44216e-03 0.00411208
#>
#> The correlations of the residuals
#> labor capital fuel
#> labor 1.0000000 -0.0140658 -0.631834
#> capital -0.0140658 1.0000000 -0.766140
#> fuel -0.6318335 -0.7661402 1.000000
#>
#>
#> OLS estimates for 'labor' (equation 1)
#> Model Formula: labor_share ~ log(price_labor) + log(price_capital) + log(price_fuel) +
#> log(output)
#>
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.7966316 0.2711681 -2.93778 0.0041579 **
#> log(price_labor) 0.1355578 0.0277201 4.89024 4.1473e-06 ***
#> log(price_capital) 0.0210676 0.0281112 0.74944 0.4554641
#> log(price_fuel) -0.0532138 0.0156931 -3.39092 0.0010202 **
#> log(output) -0.0228663 0.0025443 -8.98726 2.6201e-14 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.041216 on 94 degrees of freedom
#> Number of observations: 99 Degrees of Freedom: 94
#> SSR: 0.159682 MSE: 0.001699 Root MSE: 0.041216
#> Multiple R-Squared: 0.525643 Adjusted R-Squared: 0.505458
#>
#>
#> OLS estimates for 'capital' (equation 2)
#> Model Formula: capital_share ~ log(price_labor) + log(price_capital) + log(price_fuel) +
#> log(output)
#>
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 1.04192372 0.32704559 3.18587 0.001959 **
#> log(price_labor) -0.06462876 0.03343214 -1.93313 0.056229 .
#> log(price_capital) 0.04524629 0.03390380 1.33455 0.185248
#> log(price_fuel) -0.11141694 0.01892681 -5.88673 6.0495e-08 ***
#> log(output) -0.00636463 0.00306858 -2.07413 0.040802 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.049709 on 94 degrees of freedom
#> Number of observations: 99 Degrees of Freedom: 94
#> SSR: 0.232271 MSE: 0.002471 Root MSE: 0.049709
#> Multiple R-Squared: 0.388674 Adjusted R-Squared: 0.362661
#>
#>
#> OLS estimates for 'fuel' (equation 3)
#> Model Formula: fuel_share ~ log(price_labor) + log(price_capital) + log(price_fuel) +
#> log(output)
#>
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.75470792 0.42189588 1.78885 0.07686 .
#> log(price_labor) -0.07092899 0.04312819 -1.64461 0.10339
#> log(price_capital) -0.06631388 0.04373663 -1.51621 0.13282
#> log(price_fuel) 0.16463077 0.02441599 6.74274 1.2471e-09 ***
#> log(output) 0.02923092 0.00395854 7.38427 6.1154e-11 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.064126 on 94 degrees of freedom
#> Number of observations: 99 Degrees of Freedom: 94
#> SSR: 0.386536 MSE: 0.004112 Root MSE: 0.064126
#> Multiple R-Squared: 0.536604 Adjusted R-Squared: 0.516885
And extract the residual covariance matrix:
pooled_ols$residCov
#> labor capital fuel
#> labor 0.0016987469 -0.0000288179 -0.001669929
#> capital -0.0000288179 0.0024709732 -0.002442155
#> fuel -0.0016699289 -0.0024421553 0.004112084
The cross elasticity (from eq 4.7.9) is:
\[ \eta_{jk} = \frac{\gamma_{jk} + s_j s_k}{s_j s_k} \]
So we calculate these for each data point, and then find the mean in our data. Let’s do this for the labor-capital elasticity, \(\eta_{12}\):
# the coefficient we are after
gamma_12 <- sur_results$coefficients["labor_log(price_capital/price_fuel)"]
# the fitted values
fitted_vals <- fitted( sur_results)
elasticity <- fitted_vals %>%
mutate(eta_12 = (gamma_12 + (labor * capital)) / (labor * capital) ) %>%
summarise(eta_12 = mean(eta_12))
print(paste0("Capital-Labor elasticity is: ", round(elasticity$eta_12, 2)))
#> [1] "Capital-Labor elasticity is: 0.17"