R Assignment Help: Econometric Analysis (R Programming Assignment Help, Age Stratification Exercise)

Published: 2021-08-03 23:32:18

age: mean age.

enforce: indicates the seat belt law enforcement level (no, primary, secondary). The definition of the different enforcement levels is given on the Governors Highway Safety website. Basically, primary enforcement means that officers can issue a ticket for not wearing a seat belt even if there is no other traffic infraction. For secondary enforcement, there must be another traffic infraction before officers can issue a ticket for not wearing a seat belt.

Part I

In the first part of the project, we want to estimate a model year by year. It is not the best way when we have a panel because it is more efficient to use all the data in a single model, but since we have not covered how to estimate panel data models, it is a good way to start.

As you can see, the proportion of states that adopted a seat belt law went from 0% in 1983 to 100% in 1997. Not all States, however, chose the same level of enforcement.

    law <- matrix(data$enforce, nrow=15) ## create a matrix num-years x num-states
    no <- rowSums(law=="no")
    prim <- rowSums(law=="primary")
    sec <- rowSums(law=="secondary")
    ylim <- range(c(no, sec, prim))
    plot(1983:1997, no, xlab="year", ylab="states", type="l", col=1,
         ylim=ylim, main="number of states with seat belt laws")
    lines(1983:1997, sec, type="l", col=2)
    lines(1983:1997, prim, type="l", col=3)
    legend("topright", c("no", "secondary", "primary"), col=1:3, lty=1)

[Figure: number of states with seat belt laws; x-axis: year, y-axis: states; legend: no, secondary, primary]

It may be difficult to base the model selection on the first few years because the number of states with a seat belt law was too small. For this reason, use only the year 1987 to select your model.

We want to compare the effect of the law on the number of fatalities. For now, just distinguish States with and without a seat belt law.
For that, create a dummy variable equal to 1 if there is a seat belt law and 0 otherwise:

    data$law <- as.numeric(data$enforce != "no")

The dependent variable is fatalities and we want to measure the effect of law on it by controlling for the right variables and by using the appropriate functional form. The selection process should include the following (not necessarily in that order).

- Discussion on which variables should be included and why.
- Discussion on how each variable should enter the model (in log, with interactions, squared, etc.). It may not be obvious for all variables, but try your best.
- Estimate the model (or models if you have more than one in mind).
- Test for correct specification (Chapter 9) and homoscedasticity (Chapter 8).
- Any other things to look for before going to the interpretation part?
- Interpret the result and discuss the possible weaknesses of the model.

Once your model is selected, estimate the effect of the law on fatalities for all years. To present the results, produce one graph on which the estimated effect of law and its confidence interval are presented in a time series format. If your model has interactions between law and other variables, you need to compute the average partial effect of law for each year and its confidence interval. Discuss the results.

Hint: Here is an example of how to do it for the simplest possible model:

    form <- fatalities~law
    res <- vector()
    for (y in 1983:1997)
    {
        ## estimate the model using the observations of year y only
        reg <- lm(form, subset(data, year==y))
        ## confidence interval for the second coefficient (law)
        conf <- confint(reg, 2)
        ## one row per year: lower bound, estimate, upper bound
        ans <- c(conf[1], coef(reg)[2], conf[2])
        res <- rbind(res, ans)
    }
    matplot(1983:1997, res, lty=c(2,1,2), col=c(2,1,2), lwd=2, type="l",
            xlab="year", ylab=expression(beta[1]),
            main="Effect of Seat Belt law on traffic fatalities")
    abline(h=0)

[Figure: Effect of Seat Belt law on traffic fatalities; x-axis: year, y-axis: beta 1]

Part II

We have not learned how to estimate models with panel data, but we will ignore that in this part and treat the data as if they were cross-sectional.
You should have realized in Part I that the sample size may be too small to identify the effect of the law on fatalities (it is not too late to add that to your previous discussion). One benefit of using panel data is the sample size. Since we have 51 states and 15 years, the sample size is equal to 765 when all years are used. There are, however, issues to take into consideration.

The main problem with panel data is that the year and state dimensions may hide unobserved heterogeneity that is relevant to the analysis. If we do not control for these unobserved characteristics, we may obtain biased estimators. We can control for unobserved year and state heterogeneity by controlling for year and state indicators (or dummies). In R, it simply means that we have to add year and state in the regression. Dummy variables for years and states will automatically be created. Adding such dummy variables is called adding year and state fixed effects to the model. Notice that a model that incorporates these fixed effects will have 64 more coefficients (can you guess why it is not 66?). However, we are not interested in their values, so we do not print them in the final report. We only print the important coefficients, and add a comment that says that year and/or state fixed effects are included.

Another issue with panel data is the computation of the coefficient standard errors (covered in Chapter 8) and testing. It is very likely that you will need to compute robust standard errors and perform robust tests.

Use the same model you selected in Part I, and estimate it using all observations. Compare the effect of the seat belt law on the number of traffic fatalities when (i) neither year nor state fixed effects are included, (ii) only year fixed effects are included, and (iii) both year and state fixed effects are included.
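To make the three specifications concrete, here is a minimal sketch for the simplest possible model. It runs on simulated data, not on the project data: the data frame sim, the random adoption years and the numbers used to generate fatalities are all assumptions made for illustration. It also assumes that year and state are stored as factors, which is what makes lm() create the dummies automatically.

```r
# Simulated stand-in for the project data (NOT the real data set).
# Assumption: year and state are factors, so lm() expands them into dummies.
set.seed(1)
sim <- expand.grid(year = factor(1983:1997),
                   state = factor(sprintf("S%02d", 1:51)))

# Hypothetical adoption pattern: each state adopts a law in a random year.
adopt <- sample(1984:1997, 51, replace = TRUE)
yr <- as.numeric(as.character(sim$year))
sim$law <- as.numeric(yr >= adopt[as.integer(sim$state)])

# Made-up outcome, used only so that the regressions can be run.
sim$fatalities <- 0.03 - 0.001 * sim$law + rnorm(nrow(sim), sd = 0.005)

m1 <- lm(fatalities ~ law, data = sim)                 # (i) no fixed effects
m2 <- lm(fatalities ~ law + year, data = sim)          # (ii) year fixed effects
m3 <- lm(fatalities ~ law + year + state, data = sim)  # (iii) year and state fixed effects

# Coefficient on law in each specification
law_effect <- sapply(list(none = m1, year = m2, both = m3),
                     function(m) coef(m)[["law"]])
print(law_effect)

# Number of coefficients added by the year and state fixed effects
extra <- length(coef(m3)) - length(coef(m1))
print(extra)
```

With your own model, only the formulas change: keep the right-hand side you selected in Part I and add year, or year and state, to it.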
You can test if the model is correctly specified and test for heteroscedasticity once more, as the conclusion may differ when all years are used, but do not change your model (for simplicity, but if you want to try other things, go ahead, it is your project). Present and interpret the results. Which of the three models do you trust the most and why? Conclude with a discussion on the main finding of your study. Do you think it is a valid result? Can you think of a way to improve your model?

Hint: Here is how you print your results without the year and state fixed effects, using stargazer:

    library(stargazer)
    res <- lm(fatalities~law+year+state, data=data)
    stargazer(res, type="text", omit=c("year", "state"), digits=5)

    ===============================================
                        Dependent variable:
                            fatalities
    law                      -0.00059*
                             (0.00031)
    Constant                 0.03181***
                             (0.00066)
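If you need robust standard errors for the same kind of table, one possible approach is sketched below. It is only an illustration: the data are simulated (the data frame sim and its adoption years are assumptions, not the project data), and the covariance matrix is the heteroscedasticity-robust HC1 form computed by hand in base R. In practice you would probably rely on a package that provides robust covariance matrices, and you should check which type of robust standard error Chapter 8 asks for.

```r
# Simulated stand-in data (NOT the project data); year and state are factors.
set.seed(2)
sim <- expand.grid(year = factor(1983:1997),
                   state = factor(sprintf("S%02d", 1:51)))
adopt <- sample(1984:1997, 51, replace = TRUE)   # hypothetical adoption years
yr <- as.numeric(as.character(sim$year))
sim$law <- as.numeric(yr >= adopt[as.integer(sim$state)])

# The error variance depends on law, so the errors are heteroscedastic by construction.
sim$fatalities <- 0.03 - 0.001 * sim$law +
  rnorm(nrow(sim), sd = 0.003 * (1 + sim$law))

res <- lm(fatalities ~ law + year + state, data = sim)

# HC1 covariance: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1} * n / (n - k)
X <- model.matrix(res)
e <- residuals(res)
n <- nrow(X)
k <- ncol(X)
bread <- solve(crossprod(X))
meat <- crossprod(X * e)   # multiplies row i of X by e[i], then forms X' diag(e^2) X
V_hc1 <- bread %*% meat %*% bread * n / (n - k)

# Keep only the coefficients of interest, as omit= does in stargazer
keep <- !grepl("^(year|state)", names(coef(res)))
est <- coef(res)[keep]
se <- sqrt(diag(V_hc1))[keep]
tab <- cbind(Estimate = est, "Robust SE" = se, "t value" = est / se)
print(round(tab, 5))
```

The filter on the coefficient names plays the same role as omit=c("year", "state"): the fixed effects are estimated but not reported, so remember to state in your report that they are included.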