Published: 2021-08-03 23:36:24
BTRY 4030 Fall 2018 Homework 4 Q

Put Your Name and NetID Here

Due Tuesday, December 4, 2018

Instructions: Create your homework solution file by editing the hw5-2018_q2.Rmd Rmarkdown file provided. Your solution to this homework assignment should include the relevant R code and output (fit summaries, ANOVA tables and computed statistics, as well as requested plots) in addition to written comments where requested. Do not include output that is not relevant to the question. You should turn in a .pdf version of your compiled code.

You may discuss the homework problems and computing issues with other students in the class. However, you must write up your homework solution on your own. In particular, do not share your homework RMarkdown file with other students.

Here we will illustrate the results from Question 1 with a real-world data set. We will use the study of mortality in 55 US cities as it is influenced by the pollutants NOX (nitrogen oxides) and SO2 (sulfur dioxide), while controlling for weather (PRECIP) and sociological variables (EDUC and NONWHITE), that appeared on homework 4. You can find the data in airpollution.csv on CMS.

a. Delete each of the first four observations in turn, fit a model with the remaining observations (i.e., each model should be fit based on $n - 1$ observations) and use this to predict MORT in the left-out sample. Verify that your answer in Question 1d returns the same error.

b. Using your identity in Question 1e, compute the cross-validation score for a model using all covariates.

c. Calculate the cross-validation score in the sequence of models obtained by starting from the intercept and adding each column in the order given in the data (so every model should have one more covariate than the previous one). Which model has the lowest score?

d. What happens if you add them in reverse order? Plot both sequences of scores versus the number of covariates in the model.

e. An alternative (fairly classical) means of selecting models in linear regression is Mallows' $C_p$ score. This can be expressed as

$$C_j = \frac{y^T (I - H_j) y}{y^T (I - H) y / (n - p - 1)} + 2j,$$

often also written as $SSE_j / \hat{\sigma}^2 + 2j$, where $SSE_j$ is the SSE for a model with $j$ covariates, and $\hat{\sigma}^2$ is calculated from a model with all covariates. Obtain $C_j$ for each of your models in part c. How does this compare with cross-validation?

bonus: When there is no natural ordering of the covariates, one way to create one is to first choose the covariate that produces the smallest SSE among all models with 1 covariate. Then, keeping that covariate in the model, look for the best covariate to add to it. Continue this process until all covariates are in the model. If you do this, what ordering do you get? What is your optimal model?

bonus: Simulate data from the model that you get before SO2 and NOX are entered (that is, fit a model with just PRECIP, EDUC and NONWHITE and simulate data with the estimated coefficients and residual variance). Carry out the model-selection step in part c for each of 100 simulations. How frequently does cross-validation choose the right model?
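As a rough illustration of the computations in parts a, c and e, the sketch below uses simulated data rather than the real airpollution.csv, which is not reproduced here. The column order, the coefficients used to generate MORT, and the helper name `cv_score` are all assumptions made for the example. It also assumes that the identity from Question 1e is the standard leave-one-out shortcut, in which the prediction error for a left-out observation equals $e_i / (1 - h_{ii})$; the assignment text does not state the identity, so treat that as a guess.

```r
# Simulated stand-in for airpollution.csv; all values here are made up.
set.seed(1)
n <- 55
dat <- data.frame(
  PRECIP   = rnorm(n, 38, 10),
  EDUC     = rnorm(n, 11, 1),
  NONWHITE = rnorm(n, 12, 9),
  NOX      = rexp(n, 1 / 20),
  SO2      = rexp(n, 1 / 50)
)
dat$MORT <- 1000 + 2 * dat$PRECIP - 15 * dat$EDUC + 3 * dat$NONWHITE +
  rnorm(n, sd = 35)

# Assumed column order; use the order in the real file instead.
covs <- c("PRECIP", "EDUC", "NONWHITE", "NOX", "SO2")

# Leave-one-out CV score from a single fit: sum of (e_i / (1 - h_ii))^2.
cv_score <- function(fit) sum((resid(fit) / (1 - hatvalues(fit)))^2)

# Part a style check: refit without observation i and compare the
# brute-force prediction error with the hat-matrix shortcut.
full <- lm(MORT ~ ., data = dat)
for (i in 1:4) {
  fit_i    <- lm(MORT ~ ., data = dat[-i, ])
  brute    <- dat$MORT[i] - predict(fit_i, newdata = dat[i, ])
  shortcut <- resid(full)[i] / (1 - hatvalues(full)[i])
  stopifnot(isTRUE(all.equal(unname(brute), unname(shortcut))))
}

# Residual variance estimate from the model with all p covariates.
sigma2_hat <- sum(resid(full)^2) / (n - length(covs) - 1)

# Nested sequence: intercept only, then add one covariate at a time.
scores <- t(sapply(0:length(covs), function(j) {
  rhs <- if (j == 0) "1" else paste(covs[1:j], collapse = " + ")
  fit <- lm(as.formula(paste("MORT ~", rhs)), data = dat)
  c(j = j, CV = cv_score(fit), Cp = sum(resid(fit)^2) / sigma2_hat + 2 * j)
}))
print(scores)
```

Replacing `covs` with `rev(covs)` gives the reverse-order sequence asked for in part d, and the `CV` and `Cp` columns of `scores` can be plotted against `j` to compare the two criteria.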