Robust Mixed Model Analysis

Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data
Names: Jiang, Jiming, author.
Title: Robust mixed model analysis / by Jiming Jiang (University of California, Davis, USA).
Description: New Jersey : World Scientific, 2019. | Includes bibliographical references and index.
Identifiers: LCCN 2019004891 | ISBN 9789814733830 (hardcover : alk. paper)
Subjects: LCSH: Multilevel models (Statistics)--Problems, exercises, etc. | Linear models (Statistics)--Problems, exercises, etc. | Mathematical models--Problems, exercises, etc.
Classification: LCC QA278 .J53 2019 | DDC 519.5/36--dc23
LC record available at https://lccn.loc.gov/2019004891

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Copyright © 2019 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

For any available supplementary material, please visit https://www.worldscientific.com/worldscibooks/10.1142/9888#t=suppl

Printed in Singapore

To my mother-in-law, Hoàng Thị Vui, and the memory of my father-in-law, Nguyễn Thư, with love

Preface

Over the past decade or so the current author has published a book and a monograph covering topics closely related to mixed effects models. The book, Linear and Generalized Linear Mixed Models and Their Applications (Springer 2007), covers two important classes of mixed effects models, namely, linear mixed models (LMMs) and generalized linear mixed models (GLMMs). The monograph, Asymptotic Analysis of Mixed Effects Models: Theory, Application, and Open Problems (Chapman & Hall 2017), focuses on asymptotic techniques used in mixed effects models, which include LMMs, GLMMs as well as nonlinear mixed effects models. Having had extensive coverage on mixed effects models in these documents, a question that naturally arises is whether there is necessity for a third one that is, once again, related to the same topics.

The answer to the question is clearly yes. These models are broadly used in practice, especially in situations where the observations are correlated (e.g., medical studies), or where the primary interest is at subject level (e.g., precision medicine), or where variance and covariance are of main interest (e.g., genetics). On the other hand, most mixed effects models are highly parametric, relying on specific assumptions not only about the mean functions but also about distributions of the random effects, which are unobservable, adding to the difficulty of checking these assumptions. Furthermore, standard methods of inference about mixed effects models may be sensitive to the impact of unusual data points, sometimes referred to as outliers. It is important to know whether a method of mixed model analysis is robust; if not, what can be done to modify the procedure, or if there are other options, in order for the method to be robust. Here, for the most part, robustness means that a procedure of analysis is not sensitive, or less sensitive, to the impact of something unexpected, if it happens. The current focus is thus on the robustness aspects of the current state-of-the-art mixed model analysis.

The 2007 book is more of a method-application approach; the 2017 monograph is more leaning toward theory. In a way, the current approach is similar to the 2007 book, but with an increased emphasis on applications. Each chapter is supported by a number of real-data applications, or illustrations, some of which would be revisited multiple times in different contexts. Furthermore, each chapter is supplemented by a number of exercises. The exercises are mostly related to the materials covered in the chapter, and used to help with understanding. In this regard, the current approach is also similar to the 2007 book. Overall, the layout is between a book and a monograph. It may be used as a text for a graduate-level course covering mixed effects models and, in this regard, it has all of the main elements of LMMs and GLMMs. Alternatively, it should also serve as a comprehensive reference on robust mixed model analysis, covering methods, theory and applications.

The author thanks his collaborators, Partha Lahiri, Yihui Luan, Thuan Nguyen, J. Sunil Rao, Hanmei Sun, Mahmoud Torabi, and You-Gan Wang for allowing him to include parts of their research work in this book/monograph. Finally, the author is grateful to his former and current Ph.D. students, Cecilia Dao, Rohosen Bandyopadhyay, and Xiaoyan (Lucia) Liu for their technical support during the three-year period when the author was working on the book project.

Jiming Jiang
Davis, California
November 2018

Contents

Preface

1. Introduction
   1.1 Illustrative example
   1.2 Outline of approaches to robust mixed model analysis
   1.3 A road map
   1.4 Exercises

2. Generalized Estimating Equations
   2.1 Method of moments
   2.2 Generalized estimating equations (GEE)
   2.3 Iterative estimating equations
   2.4 Robust estimation in GLMM
   2.5 Robust GEE
   2.6 Real-data examples
       2.6.1 Hip replacement data revisited
       2.6.2 Salamander-mating data
       2.6.3 Epileptic seizure data
   2.7 Exercises

3. Non-Gaussian Linear Mixed Models
   3.1 Types of models
   3.2 Quasi-likelihood method
   3.3 Measure of uncertainty
       3.3.1 Empirical method of moments
       3.3.2 Partially observed information
       3.3.3 Hypothesis testing: A simulated example
       3.3.4 Consistency of POI
   3.4 Real-data example
   3.5 Exercises

4. Robust Tests
   4.1 Robust dispersion tests
   4.2 Robust versions of classical tests
       4.2.1 Basic idea, assumptions, and examples
       4.2.2 The W-, S-, and L-test statistics
       4.2.3 Asymptotic theory
   4.3 Robust classical tests for mixed ANOVA model
   4.4 A unified robust goodness-of-fit test
       4.4.1 Tailoring
       4.4.2 Application to SAE
       4.4.3 Empirical results
   4.5 Real-data examples
       4.5.1 TVSFP data
       4.5.2 Median income data
   4.6 Exercises

5. Observed Best Prediction
   5.1 Best linear unbiased prediction
   5.2 Observed best prediction: Another look at the BLUP
   5.3 Example
   5.4 OBP for two classes of SAE models
       5.4.1 Fay-Herriot model
       5.4.2 Nested-error regression model
   5.5 OBP for small area counts
       5.5.1 Best prediction under a two-stage model
       5.5.2 Derivation of OBP
       5.5.3 Extensions
   5.6 Asymptotic property of BPE
   5.7 Examples with simulations
       5.7.1 A simple example
       5.7.2 Cases when the assumed model is correct, or partially correct
       5.7.3 Comparison of OBP and EBP under an LN model
   5.8 Estimation of area-specific MSPE
   5.9 Real-data examples
       5.9.1 Hospital data
       5.9.2 TVSFP data revisited
       5.9.3 Minnesota county data
   5.10 Exercises

6. Model Selection
   6.1 Generalized information criteria
       6.1.1 Selecting the fixed covariates only
       6.1.2 Selecting fixed covariates and random effect factors
       6.1.3 A robust approximation to BIC
       6.1.4 A robust conditional AIC for LMM
   6.2 The fence methods
       6.2.1 Adaptive fence
       6.2.2 Invisible fence
       6.2.3 Model selection with incomplete data
       6.2.4 Examples
   6.3 Shrinkage mixed model selection
       6.3.1 An E-M based approach
       6.3.2 Predictive shrinkage selection
       6.3.3 Real-data example: Analysis of high-speed network
   6.4 Exercises

7. Other Topics
   7.1 Mixed model diagnostics
   7.2 Nonparametric/semiparametric methods
       7.2.1 A P-spline nonparametric model
       7.2.2 Nonparametric model selection
       7.2.3 Examples
       7.2.4 Functional and semiparametric mixed models
       7.2.5 Nonparametric bootstrapping
   7.3 Bayesian analysis
       7.3.1 A robust hierarchical Bayes method
       7.3.2 Real-data example: Iowa crops data
       7.3.3 Bayesian empirical likelihood
       7.3.4 Bayesian model diagnostics
   7.4 More about outliers
   7.5 Benchmarking
   7.6 More about prediction
       7.6.1 Prediction of future observation
       7.6.2 Classified mixed model prediction
   7.7 Exercises

Bibliography

Index

Chapter 1

Introduction

Mixed effects models, or mixed models, have had a wide-ranging impact in modern applied statistics. See, for example, Jiang (2007), McCulloch et al. (2008), Demidenko (2013). These models are characterized by random effects that are involved in a certain way. Depending on how the random effects are involved, a mixed model may be classified as a linear mixed model (LMM), generalized linear mixed model (GLMM), non-linear mixed model (NLMM), semi-parametric mixed model (SPMM), or non-parametric mixed model (NPMM). Due to their broad applications, there has been a growing interest in learning about these models. On the other hand, some of these models, such as LMM and GLMM, are highly parametric, involving distributional assumptions that may not be satisfied in real-life problems. It is important to make sure that a procedure of statistical analysis is robust against violation of the assumptions, or some "bad cases".

Here the word robustness is formally introduced, for the first time, and we need to make clear what it means. Generally speaking, it means the ability of a procedure to survive some unexpected situations. Life is full of surprises. The unexpected situation could be a violation of an assumption, an outlying observation, or something else. Luckily for the practitioners, there is a rich collection of methods that are currently available and robust, in certain ways. As an introductory example, consider the following.

1.1 Illustrative example

A special class of LMM is the mixed ANOVA model [e.g., Jiang (2007)], which can be expressed as

$$y = X\beta + Z_1\alpha_1 + \cdots + Z_s\alpha_s + \epsilon, \tag{1.1}$$
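where $X$ is a known matrix of covariates, $\beta$ is a vector of unknown parameters (the fixed effects), $Z_1, \dots, Z_s$ are known matrices, $\alpha_1, \dots, \alpha_s$ are vectors of random effects, and $\epsilon$ is a vector of errors. The standard assumption assumes that $\alpha_r \sim N(0, \sigma_r^2 I_{m_r})$, where $\sigma_r^2$ is an unknown variance, $1 \le r \le s$, and $I_n$ denotes the $n \times n$ identity matrix, and $\epsilon \sim N(0, \tau^2 I_N)$, and that $\alpha_1, \dots, \alpha_s, \epsilon$ are independent. Under such assumptions, restricted maximum likelihood (REML) estimators of the fixed effects, $\beta$, and variance components, $\sigma_r^2$, $1 \le r \le s$ and $\tau^2$, can be derived. Namely, let $A$ be an $N \times (N-p)$ matrix, where $p = \mathrm{rank}(X)$, such that

$$\mathrm{rank}(A) = N - p \quad \text{and} \quad A'X = 0. \tag{1.2}$$

The REML estimator of $\psi = (\sigma_1^2, \dots, \sigma_s^2, \tau^2)'$ is the maximum likelihood (ML) estimator of $\psi$ based on $z = A'y$, whose distribution does not depend on $\beta$. Once the REML estimator of $\psi$ is obtained, say, $\hat\psi$, the REML estimator of $\beta$ is given by

$$\hat\beta = (X'\hat V^{-1}X)^{-1}X'\hat V^{-1}y, \tag{1.3}$$

where $\hat V = \hat\tau^2 I_N + \sum_{r=1}^{s} \hat\sigma_r^2 Z_r Z_r'$. It can be shown (Exercise 1.1) that the REML estimator, $\hat\psi$, is a solution to the following REML equations:

$$\begin{aligned} y'P^2y &= \mathrm{tr}(P), \\ y'PZ_rZ_r'Py &= \mathrm{tr}(PZ_rZ_r'), \quad 1 \le r \le s, \end{aligned} \tag{1.4}$$

where $P = V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}$ and $V$ is $\hat V$ with $\hat\sigma_r^2$, $1 \le r \le s$ and $\hat\tau^2$ replaced by $\sigma_r^2$, $1 \le r \le s$ and $\tau^2$, respectively. Here we assume, for simplicity, that $X$ is full rank.

The point to be made is that the REML equations (1.4), which are derived under the normality assumption, can be derived under a completely different distribution. In fact, exactly the same equations will arise if the distribution of $y$ is assumed to be multivariate-t with the probability density function (pdf)

$$p(y) = \frac{\Gamma\{(n+d)/2\}}{(d\pi)^{n/2}\,\Gamma(d/2)\,|V|^{1/2}} \times \left\{1 + \frac{1}{d}(y - X\beta)'V^{-1}(y - X\beta)\right\}^{-(n+d)/2}, \tag{1.5}$$

where $d$ is the degree of freedom of the multivariate t-distribution and $|V|$ denotes the determinant of $V$. Note that the multivariate normal distribution may be viewed as the limiting distribution of the multivariate-t as $d \to \infty$ (Exercise 1.2). An implication is that the Gaussian REML estimator, that is, REML estimator derived under the normality assumption, is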
still valid even if the normality assumption is violated in that the actual distribution is multivariate-t. The latter is known to have heavier tails than the multivariate normal distribution. A further question is how far can one go in violating the normality assumption. For example, what if the actual distribution is unknown? We shall address the issue more systematically later.

1.2 Outline of approaches to robust mixed model analysis

The simplest way of doing something is not to do anything at all. Not exactly. Here by not doing anything it merely means that there is no need to make any changes in the existing procedure to make it robust. Still, one has to justify the robustness of the existing procedure. The justification can be by exact derivation, as in Section 1.1 (also Exercise 1.2), by asymptotic arguments, or by empirical studies, such as Monte-Carlo simulations and real-data applications. This is what we call the first approach.

The second approach is to build a procedure of mixed model analysis on weaker assumptions. For example, quasi-likelihood methods avoid full specification of the likelihood function by using estimating functions, or estimating equations. The latter rely only on specification of moments, typically the first two moments. Generally speaking, the more assumptions one makes, the more likely some of these assumptions will be violated. Conversely, a procedure built on weaker assumptions is likely to be more robust than one built on stronger assumptions. Another well-known example is the least squares (LS) method. Under the standard linear regression model, which can be viewed as (1.1) without the term $Z_1\alpha_1 + \cdots + Z_s\alpha_s$, the ML estimator for $\beta$ is given by $\hat\beta = (X'X)^{-1}X'y$, assuming, again, that $X$ is full rank. However, the same estimator can be derived from a seemingly different principle, the least squares (LS), which does not use the normality assumption at all. In fact, the Gaussian ML estimator remains consistent even if the normality fails [e.g., Jiang (2010), sec. 6.7] and, in this sense, the ML estimation is robust. However, the ML, or LS, estimator has another problem: It is not robust to outliers. This brings up the next approach.

The third approach, which is often considered when dealing with outliers, is to robustify an existing method to make it more robust. Here by robustification it means to modify some part of the current procedure with the intention of robustness. For example, generalized estimating equations
(GEE) have been used in the analysis of longitudinal data [e.g., Diggle et al. (2002); see Chapter 2 below]. However, there have been concerns that the GEE may not be robust to outlying observations. One way to robustify the GEE is to modify the definition of the residuals so that they have a bounded range, thus reducing the influence of an outlier.

The next approach is to go semi-parametric, or non-parametric. These models are more flexible, or less restrictive, in some ways so that the chance of model misspecification is (greatly) reduced. For example, instead of assuming a linear mixed-effects function, as in LMM, one may assume that the mean function, conditional on the random effects, is an unknown, smooth function. The latter, of course, includes the linear function as a special case so, when the LMM holds, the non-parametric approach would lose some efficiency. However, a trade-off is robustness: the non-parametric mean function would still be valid when the linear function fails. In other words, the mean function would remain correctly specified under the non-parametric approach, even if it is misspecified under the LMM.

Finally, it is always a good practice to carefully choose, and check, the proposed statistical model. This way, the chance, or extent, of model misspecification is likely to be reduced. It is in this sense that mixed model selection and diagnostics are important techniques of robust mixed model analysis. It should be noted that model selection in the context of mixed effects models is not a conventional model selection problem in that the effective sample size is not clear due to correlations among the observations. Thus, for example, a standard information criterion, such as BIC [Schwarz (1978)], would encounter some difficulties due to its dependence on the effective sample size.

1.3 A road map

We have outlined a number of main approaches to robust mixed model analysis. However, when it comes to covering these approaches, we have our own strategies. What we are going to do is to first focus on special types of analyses, such as estimation (Chapters 2 and 3), tests (Chapter 4), and mixed model prediction (Chapter 5), where methods of robust mixed model analysis have been developed. In particular, mixed model prediction is one of the characteristics of mixed model analysis, because estimation and testing problems, of course, also appear in other fields of statistics. After these special topics, the next chapter is devoted to model selection.

Here, following Müller et al. (2013), we consider three classes of strategies for mixed model selection: generalized information criteria, the fence methods, and shrinkage mixed model selection.

There are many more topics, methods, or aspects of robust mixed model analysis. As a final chapter, we have put together a collection of such topics, including mixed model diagnostics, nonparametric and semiparametric methods, Bayesian analysis, outliers, benchmarking, and further topics on prediction.

It should be noted that there are so many intersections among these topics, no matter how one would classify them, that it turns out to be a difficult task to determine which chapter, and section, should be the (main) home of each topic. We have tried our best to make sure that everyone has a home and, hopefully, is happy about its home. Nevertheless, the coverage of relevant topics is not inclusive and there are, unavoidably, some missing topics that are potentially important. The author apologizes for overlooking such work, and would strongly encourage relevant authors to contact him so that the missing work can be included in the next edition.

In each one of these chapters, one or more of the approaches summarized in Section 1.2 will be discussed in detail. It would be helpful to keep this in mind. Of course, the approaches outlined in Section 1.2 only give the "big pictures". Every problem has its specialty, as is often the case.

Note on notation: In probability theory or mathematical statistics, it is customary to use capital letters, such as $Y_i$, for a random variable, and lowercase letters, such as $y_i$, for an observed, or realized, value of $Y_i$. Such a distinction is rarely important, however, in applied statistics. We follow the latter convention for notation simplicity, which usually causes no confusion. So, for example, throughout this monograph, $y_i$ represents both a random variable and the observed value of the random variable, depending on the context or occasion.

1.4 Exercises

1.1. Derive the REML equations (1.4). [Hint: First show that the expression for $P$ given below (1.4) is identical to $A(A'VA)^{-1}A'$.] Also show that the REML equations do not depend on the choice of $A$. In other words, any matrix $A$ satisfying (1.2) results in the same REML equations.

1.2. This exercise is related to the multivariate t-distribution whose pdf is given by (1.5).

a. Show that the multivariate normal distribution is the limit of the multivariate-t as $d \to \infty$.
b. Show that the multivariate t-distribution (1.5) leads to the same REML equation (1.4). More specifically, if one derives the REML estimator following the same procedure as in Exercise 1.1 except with the multivariate normal distribution of $y$ replaced by the multivariate t-distribution, the resulting REML equations are equivalent to (1.4).
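
Before attempting the proof in Exercise 1.1, the hint can be checked numerically. The Python sketch below is not part of the original text. It assumes a small hypothetical one-way layout, and the dimensions, variance components, seed, and the helper name `p_from_a` are all illustrative choices. It builds two different matrices $A$ satisfying (1.2) and compares $A(A'VA)^{-1}A'$ with the expression for $P$ given below (1.4). It also checks that $PX = 0$ and $PVP = P$. These two identities give $E(y'P^2y) = \mathrm{tr}(P)$ and $E(y'PZ_rZ_r'Py) = \mathrm{tr}(PZ_rZ_r')$ whenever $y$ has mean $X\beta$ and covariance matrix $V$, which is one way to see why the equations (1.4) do not hinge on normality.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: N observations, p fixed effects, one random factor with m levels.
N, p, m = 12, 2, 4
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # full-rank N x p covariate matrix
Z = np.kron(np.eye(m), np.ones((N // m, 1)))           # N x m standard design matrix
sigma2, tau2 = 1.5, 0.7                                # illustrative variance components
V = tau2 * np.eye(N) + sigma2 * Z @ Z.T

# P as defined below (1.4).
Vinv = np.linalg.inv(V)
P = Vinv - Vinv @ X @ np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv)


def p_from_a(A, V):
    """Return A (A'VA)^{-1} A' for a matrix A satisfying (1.2)."""
    return A @ np.linalg.solve(A.T @ V @ A, A.T)


# First choice of A: the last N - p columns of the complete QR factor of X.
Q, _ = np.linalg.qr(X, mode="complete")
A1 = Q[:, p:]

# Second choice: A1 times a strictly diagonally dominant (hence nonsingular) matrix.
T = 10.0 * np.eye(N - p) + rng.uniform(-0.5, 0.5, size=(N - p, N - p))
A2 = A1 @ T

# Both matrices satisfy A'X = 0.
print(np.allclose(A1.T @ X, 0.0), np.allclose(A2.T @ X, 0.0))
# Both give the same P, as the hint in Exercise 1.1 claims.
print(np.allclose(P, p_from_a(A1, V)), np.allclose(P, p_from_a(A2, V)))
# PX = 0 and PVP = P.
print(np.allclose(P @ X, 0.0), np.allclose(P @ V @ P, P))
```

A numerical check of this kind does not replace the derivation, but it can catch a misplaced transpose or inverse early.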

Chapter 2

Generalized Estimating Equations

2.1 Method of moments

The method of moments (MM) is thought to be the oldest method of finding point estimators [Casella and Berger (2002), p. 312]. The method, in its original form, is based on an assumption that the distribution of an observation, $y_i$, depends on a vector $\theta$ of unknown parameters. In order to estimate $\theta$, certain moment conditions are required. Namely, let $p$ be the dimension of $\theta$. It is assumed that $y_1, \dots, y_n$ are independent such that $E(|y_i|^p) < \infty$. Furthermore, it is assumed that $E(y_i^k) = \psi_k(\theta)$, $1 \le i \le n$, where $\psi_k(\cdot)$ is a known function, $1 \le k \le p$. Note that the ($p$th absolute) moment condition imposed implies that all of the moments involved are finite. Also, the $k$th moment of $y_i$ ($1 \le k \le p$) does not depend on $i$. The idea is to equate the first $p$ sample moments of the observations to their expected values, that is, $n^{-1}\sum_{i=1}^{n} y_i^k = E(n^{-1}\sum_{i=1}^{n} y_i^k)$, that is,

$$\frac{1}{n}\sum_{i=1}^{n} y_i^k = \psi_k(\theta), \quad 1 \le k \le p. \tag{2.1}$$

The $\theta$ on the right side of (2.1) is supposed to be the true parameter vector. Because $\theta$ is unknown, (2.1) is treated as an equation system for solving $\theta$, in which case the latter is treated as a (vector-valued) variable. Note that there are as many equations in (2.1) as the dimension of $\theta$; therefore, one expects to find a solution to (2.1), which is called the MM estimator (MME). Typically, regularity conditions are needed to ensure the existence, and uniqueness, of the MME [e.g., Jiang (1998a)].

The basic idea can be generalized, in several ways. For example, in econometrics, a procedure called generalized method of moments (GMM) is often used [Hansen (1982)]. Let $g(\cdot,\cdot)$ be a vector-valued function satisfying $E\{g(y_i, \theta)\} = 0$ for every $i$ when $\theta$ is the true parameter vector. The
generalized sample moment is defined as $m(y;\theta) = n^{-1}\sum_{i=1}^{n} g(y_i, \theta)$. The GMM estimator of $\theta$ is obtained by equating $m(y;\theta)$ to its expected value when $\theta$ is the true parameter vector, that is, by solving

$$m(y;\theta) = 0. \tag{2.2}$$

It is clear that the MM is a special case of GMM (Exercise 2.1), hence justifying the term GMM.

Another variation of the MM is the method of simulated moments [MSM; e.g., McFadden (1989)]. In some cases, such as under a generalized linear mixed model [GLMM; e.g., Jiang (2007)], the moments have no analytic expressions in terms of $\theta$. We illustrate with an example.

Example 2.1. Suppose that, given the random effects, $\alpha_1, \dots, \alpha_m$, $y_{ij}$, $1 \le i \le m$, $1 \le j \le k$ are binary outcomes that are (conditionally) independent with $\mathrm{logit}\{P(y_{ij} = 1 \mid \alpha)\} = \mu + \alpha_i$, where $\alpha = (\alpha_i)_{1 \le i \le m}$, and $\mathrm{logit}(p) = \log\{p/(1-p)\}$. Furthermore, suppose that the random effects are independent and distributed as $N(0, \sigma^2)$. Here, $\mu$, $\sigma$ are unknown parameters with $\sigma \ge 0$, so let $\theta = (\mu, \sigma)'$. It is more convenient to use the following expression: $\alpha_i = \sigma\xi_i$, $1 \le i \le m$, where $\xi_1, \dots, \xi_m$ are i.i.d. $N(0,1)$ random variables. Consider the following MM equations for estimating $\theta$,

$$\frac{1}{m}\sum_{i=1}^{m} y_{i\cdot} = E(y_{1\cdot}), \tag{2.3}$$

$$\frac{1}{m}\sum_{i=1}^{m} y_{i\cdot}^2 = E(y_{1\cdot}^2), \tag{2.4}$$

where $y_{i\cdot} = \sum_{j=1}^{k} y_{ij}$. Note that $y_{i\cdot}$, $1 \le i \le m$ are i.i.d. Furthermore, it is easy to show that $E(y_{1\cdot}) = kE\{h_\theta(\xi)\}$ and $E(y_{1\cdot}^2) = kE\{h_\theta(\xi)\} + k(k-1)E\{h_\theta^2(\xi)\}$, where $h_\theta(x) = \exp(\mu + \sigma x)/\{1 + \exp(\mu + \sigma x)\}$ and $\xi \sim N(0,1)$ (Exercise 2.2).

Situations like Example 2.1 are, actually, typical in GLMM. It is well known that computation is a major issue in inference about GLMMs [e.g., Torabi (2012)]. More specifically, under a GLMM, the likelihood function typically does not have an analytic expression; even more, it may involve high-dimensional integrals, which makes numerical evaluations difficult. For example, consider the following.

Example 2.2. Suppose that, given the random effects $u_1, \dots, u_{m_1}$ and $v_1, \dots, v_{m_2}$, binary responses $y_{ij}$, $i = 1, \dots, m_1$, $j = 1, \dots, m_2$ are conditionally independent such that, with $p_{ij} = P(y_{ij} = 1 \mid u, v)$, we have
$\mathrm{logit}(p_{ij}) = \mu + u_i + v_j$, where $\mu$ is an unknown parameter, $u = (u_i)_{1 \le i \le m_1}$, and $v = (v_j)_{1 \le j \le m_2}$. Furthermore, the random effects $u_1, \dots, u_{m_1}$ and $v_1, \dots, v_{m_2}$ are independent such that $u_i \sim N(0, \sigma_1^2)$, $v_j \sim N(0, \sigma_2^2)$, where the variances $\sigma_1^2$ and $\sigma_2^2$ are unknown. Thus, the unknown parameters involved are $\theta = (\mu, \sigma_1, \sigma_2)'$ with $\sigma_1, \sigma_2 \ge 0$. It can be shown (Exercise 2.3) that the log-likelihood function for estimating $\theta$ can be expressed as

$$\begin{aligned} & c - \frac{m_1}{2}\log(\sigma_1^2) - \frac{m_2}{2}\log(\sigma_2^2) + \mu y_{\cdot\cdot} \\ & + \log \int \cdots \int \left[\prod_{i=1}^{m_1}\prod_{j=1}^{m_2} \{1 + \exp(\mu + u_i + v_j)\}^{-1}\right] \\ & \times \exp\left(\sum_{i=1}^{m_1} u_i y_{i\cdot} + \sum_{j=1}^{m_2} v_j y_{\cdot j} - \frac{1}{2\sigma_1^2}\sum_{i=1}^{m_1} u_i^2 - \frac{1}{2\sigma_2^2}\sum_{j=1}^{m_2} v_j^2\right) du_1 \cdots du_{m_1}\, dv_1 \cdots dv_{m_2}, \end{aligned} \tag{2.5}$$

where $c$ is a constant, $y_{\cdot\cdot} = \sum_{i=1}^{m_1}\sum_{j=1}^{m_2} y_{ij}$, $y_{i\cdot} = \sum_{j=1}^{m_2} y_{ij}$, and $y_{\cdot j} = \sum_{i=1}^{m_1} y_{ij}$. The multidimensional integral involved in (2.5) has no closed-form expression, and it cannot be further simplified. Furthermore, such an integral is difficult to evaluate even numerically. For example, if $m_1 = m_2 = 40$, the dimension of the integral will be 80. To make it even worse, the integrand involves a product of 1600 terms with each term less than one. This makes it almost impossible to evaluate the integral using a naive Monte Carlo method. To see this, suppose that $u_1, \dots, u_{40}$ and $v_1, \dots, v_{40}$ are simulated random effects (from the normal distributions). Then, the product in the integrand (with $m_1 = m_2 = 40$) is numerically zero. Therefore, numerically, the law of large numbers, which is the basis of the (naive) Monte Carlo method, will not yield anything but zero without a huge Monte Carlo sample size.

The situation is quite different, however, if one considers, instead of the likelihood, moments of the $y_{i\cdot}$ and $y_{\cdot j}$. For example, it is easy to show that $E(y_{i\cdot}) = m_2 E\{h(\mu + \sigma_1\xi + \sigma_2\eta)\}$, where $h(x) = e^x/(1 + e^x)$, and $\xi$, $\eta$ are independent $N(0,1)$ (Exercise 2.3). The latter expression only involves a two-dimensional integral, and the product issue mentioned above does not occur, regardless of the size of $m_1$ and $m_2$.

If the moments only involve low-dimensional integrals, such as in Examples 2.1 and 2.2, one can approximate these moments by the corresponding sample moments of generated random variables. The approximation is justified by the law of large numbers [e.g., Jiang (2010), sec. 6.2]. This is the idea of MSM. The method had previously found applications in, for example, econometrics [e.g., McFadden (1989), Lee (1992)].
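
As a rough numerical illustration of the two points made in Example 2.2, the following Python sketch may help. It is not part of the original text. The values $\mu = 0.2$ and $\sigma_1 = \sigma_2 = 1$, the seed, and the simulation size `L` are assumptions chosen only for illustration. For one draw of simulated random effects, it evaluates the log of the 1600-term product appearing in the integrand of (2.5). It then approximates $E(y_{i\cdot}) = m_2 E\{h(\mu + \sigma_1\xi + \sigma_2\eta)\}$ by a sample mean over generated pairs of independent $N(0,1)$ variables.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Hypothetical parameter values, chosen only for illustration.
mu, sigma1, sigma2 = 0.2, 1.0, 1.0
m1 = m2 = 40


def h(x):
    """Inverse logit, h(x) = e^x / (1 + e^x)."""
    return 1.0 / (1.0 + np.exp(-x))


# (a) One naive Monte Carlo draw of the product in the integrand of (2.5).
u = rng.normal(0.0, sigma1, size=m1)
v = rng.normal(0.0, sigma2, size=m2)
eta = mu + u[:, None] + v[None, :]       # m1 x m2 grid of mu + u_i + v_j
log_terms = -np.logaddexp(0.0, eta)      # log of {1 + exp(mu + u_i + v_j)}^{-1}
log_product = log_terms.sum()            # log of the product of m1 * m2 = 1600 terms
product = np.exp(log_product)            # the product itself, on the original scale

# (b) Simulated moment: E(y_i.) = m2 * E{h(mu + sigma1 * xi + sigma2 * zeta)}.
L = 100_000
xi = rng.standard_normal(L)
zeta = rng.standard_normal(L)
moment = m2 * h(mu + sigma1 * xi + sigma2 * zeta).mean()

print(f"log of the 1600-term product: {log_product:.1f}")
print(f"product on the original scale: {product:.3e}")
print(f"simulated E(y_i.): {moment:.2f} (out of m2 = {m2})")
```

Working on the log scale (via `np.logaddexp`) shows how small the product is before it is exponentiated. The simulated moment, by contrast, is an average of quantities between 0 and 1, so the product issue does not arise.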

Table 2.1 Simulated mean and SE

              Estimator of μ     Estimator of σ²
   m    k     Mean     SE        Mean     SE
   20   2     0.31     0.52      2.90     3.42
   20   6     0.24     0.30      1.12     0.84
   80   2     0.18     0.22      1.08     0.83
   80   6     0.18     0.14      1.03     0.34

Example 2.1 (continued). To illustrate the MSM, consider, again, Example 2.1. It is more convenient to consider the following equations that are equivalent to (2.3) and (2.4) (verify):

$$\frac{y_{\cdot\cdot}}{mk} = E\{h_\theta(\xi)\}, \tag{2.6}$$

$$\frac{1}{mk(k-1)}\sum_{i=1}^{m}(y_{i\cdot}^2 - y_{i\cdot}) = E\{h_\theta^2(\xi)\}, \tag{2.7}$$

where $\xi \sim N(0,1)$. The next thing we do is to replace the right sides of (2.6) and (2.7) by the corresponding sample moments of random variables generated under the standard normal distribution, resulting in

$$\frac{y_{\cdot\cdot}}{mk} = \frac{1}{L}\sum_{l=1}^{L} h_\theta(\xi_l), \tag{2.8}$$

$$\frac{1}{mk(k-1)}\sum_{i=1}^{m}(y_{i\cdot}^2 - y_{i\cdot}) = \frac{1}{L}\sum_{l=1}^{L} h_\theta^2(\xi_l), \tag{2.9}$$

where $\xi_l$, $1 \le l \le L$ are independent $N(0,1)$ random variables generated by a computer. The solution to (2.8) and (2.9) is called the MSM estimator.

Jiang (1998a) presented results of a simulation study, in which the true $\mu$ and $\sigma$ are 0.2 and 1.0, respectively. The results, based on 1000 simulation runs, are presented in Table 2.1. Here the MSM estimator of $\sigma^2$ is simply the square of the MSM estimator of $\sigma$. It is seen that the results improve as either $m$ or $k$ increases.

To describe a general procedure for MSM, we assume that the conditional density of $y_i$ given the vector of random effects, $\alpha$, has the following form of an exponential family,

$$f(y_i \mid \alpha) = \exp[(w_i/\phi)\{y_i\eta_i - b(\eta_i)\} + c_i(y_i, \phi)], \tag{2.10}$$

where $\eta_i$ is a linear predictor that can be expressed as

$$\eta_i = x_i'\beta + z_i'\alpha, \tag{2.11}$$

with $x_i$, $z_i$ being known vector of covariates and design vector, respectively, and $\beta$ being a vector of unknown parameters (the fixed effects); $\phi$ is a dispersion parameter, and $w_i$ a known weight. Typically, $w_i = 1$ for ungrouped data; $w_i = k_i$ for grouped data if the response is an average, where $k_i$ is the group size; and $w_i = 1/k_i$ if the response is a group sum. Here $b(\cdot)$ and $c_i(\cdot,\cdot)$ are known functions associated with the generalized linear models [GLM; McCullagh and Nelder (1989); also see Appendix]. Furthermore, we assume that $\alpha = (\alpha_1', \dots, \alpha_q')'$, where $\alpha_r$ is a random vector whose components are independent and distributed as $N(0, \sigma_r^2)$, $1 \le r \le q$. Suppose that $y_i$, $1 \le i \le n$ are observed. We assume that $Z = (z_i')_{1 \le i \le n} = (Z_1, \dots, Z_q)$ so that $Z\alpha = Z_1\alpha_1 + \cdots + Z_q\alpha_q$. The following expression of $\alpha$ is sometimes more convenient,

$$\alpha = D\xi, \tag{2.12}$$

where $D$ is block diagonal with the diagonal blocks $\sigma_r I_{m_r}$, $1 \le r \le q$, and $\xi \sim N(0, I_m)$ with $m = m_1 + \cdots + m_q$.

First assume that $\phi$ is known. Let $\theta = (\beta', \sigma_1, \dots, \sigma_q)'$. Consider an unrestricted parameter space $\theta \in \Theta = R^{p+q}$. This allows computational convenience for using the MSM because, otherwise, there will be constraints on the parameter space. Of course, this raises the issue of identifiability, because both $(\beta', \sigma_1, \dots, \sigma_q)'$ and $(\beta', -\sigma_1, \dots, -\sigma_q)'$ correspond to the same model. Nevertheless, it suffices to make sure that $\beta$ and $\sigma^2 = (\sigma_1^2, \dots, \sigma_q^2)'$ are identifiable. In fact, Jiang (1998a) showed that, under suitable conditions, the MSM estimators of $\beta$ and $\sigma^2$ are consistent; therefore, the latter parameters are, at least, asymptotically identifiable.

We first derive a set of sufficient statistics for $\theta$. It can be shown (Exercise 2.4) that the marginal density function of $y = (y_i)_{1 \le i \le n}$ can be expressed as

$$L = \int \exp\left\{c + a(y, \phi) + \frac{b(u, \theta)}{\phi} - \frac{|u|^2}{2} + \left(\frac{\beta}{\phi}\right)'\sum_{i=1}^{n} w_i x_i y_i + \left(\frac{D}{\phi}\sum_{i=1}^{n} w_i z_i y_i\right)' u\right\} du, \tag{2.13}$$

where $c$ is a constant, $a(y, \phi)$ depends only on $y$ and $\phi$, and $b(u, \theta)$ depends only on $u$ and $\theta$. It follows that a set of sufficient statistics for $\theta$ is given by

$$\begin{aligned} S_j &= \sum_{i=1}^{n} w_i x_{ij} y_i, \quad 1 \le j \le p, \\ S_{p+l} &= \sum_{i=1}^{n} w_i z_{i1l} y_i, \quad 1 \le l \le m_1, \\ &\;\;\vdots \\ S_{p+m_1+\cdots+m_{q-1}+l} &= \sum_{i=1}^{n} w_i z_{iql} y_i, \quad 1 \le l \le m_q, \end{aligned}$$
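where $Z_r = (z_{irl})_{1 \le i \le n, 1 \le l \le m_r}$, $1 \le r \le q$. Thus, a natural set of MM equations can be formulated as

$$\sum_{i=1}^{n} w_i x_{ij} y_i = \sum_{i=1}^{n} w_i x_{ij} E_\theta(y_i), \quad 1 \le j \le p, \tag{2.14}$$

$$\sum_{l=1}^{m_r}\left(\sum_{i=1}^{n} w_i z_{irl} y_i\right)^2 = E_\theta\left\{\sum_{l=1}^{m_r}\left(\sum_{i=1}^{n} w_i z_{irl} y_i\right)^2\right\}, \quad 1 \le r \le q. \tag{2.15}$$

Although the $S_j$s are sufficient statistics for the model parameters only when $\phi$ is known (which includes the special cases of binomial and Poisson distributions), one may still use Equations (2.14) and (2.15) to estimate $\theta$ even if $\phi$ is unknown, provided that the right-hand sides of these equations do not involve $\phi$. Note that the number of equations in (2.14) and (2.15) is identical to the dimension of $\theta$.

However, in order for the right sides of (2.14), (2.15) not to depend on $\phi$, some changes have to be made. For simplicity, in the following we assume that $Z_r$, $1 \le r \le q$ are standard design matrices in the sense that each $Z_r$ consists only of 0s and 1s, and there is exactly one 1 in each row, and at least one 1 in each column. Then, if we denote the $i$th row of $Z_r$ by $z_{ir}' = (z_{irl})_{1 \le l \le m_r}$, we have $|z_{ir}|^2 = 1$ and, for $s \ne t$, $z_{sr}'z_{tr} = 0$ or 1. Let $I_r = \{(s,t): 1 \le s \ne t \le n, z_{sr}'z_{tr} = 1\} = \{(s,t): 1 \le s \ne t \le n, z_{sr} = z_{tr}\}$. Then, it can be shown (Exercise 2.5) that

$$E_\theta\left\{\sum_{l=1}^{m_r}\left(\sum_{i=1}^{n} w_i z_{irl} y_i\right)^2\right\} = \sum_{i=1}^{n} w_i^2 E_\theta(y_i^2) + \sum_{(s,t) \in I_r} w_s w_t E_\theta(y_s y_t). \tag{2.16}$$

It is seen that the first term on the right side of (2.16) depends on $\phi$, and the second term does not depend on $\phi$ (Exercise 2.5). Therefore, a simple modification of the earlier MM equations that will eliminate $\phi$ would be to replace (2.15) by the following equations,

$$\sum_{(s,t) \in I_r} w_s w_t y_s y_t = \sum_{(s,t) \in I_r} w_s w_t E_\theta(y_s y_t), \quad 1 \le r \le q. \tag{2.17}$$

Furthermore, write $\xi = (\xi_1', \dots, \xi_q')'$ with $\xi_r = (\xi_{rl})_{1 \le l \le m_r}$. Note that $\xi_r \sim N(0, I_{m_r})$. Then, the right side of (2.14) can be expressed as

$$X_j'WE\{e(\theta, \xi)\},$$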
where $X_j$ is the $j$th column of $X$, $W = \mathrm{diag}(w_i, 1 \le i \le n)$, and $e(\theta, \xi) = \{b'(\eta_i)\}_{1 \le i \le n}$ with $\eta_i = \sum_{j=1}^{p} x_{ij}\beta_j + \sum_{r=1}^{q} \sigma_r z_{ir}'\xi_r$ (Exercise 2.5). Similarly, the right side of (2.17) can be expressed as

$$E\{e(\theta, \xi)'WH_rWe(\theta, \xi)\},$$

where $H_r$ is the $n \times n$ symmetric matrix whose $(s,t)$ entry is $1_{\{(s,t) \in I_r\}}$. Thus, the final MM equations that do not involve $\phi$ are given by

$$\sum_{i=1}^{n} w_i x_{ij} y_i = X_j'WE\{e(\theta, \xi)\}, \quad 1 \le j \le p, \tag{2.18}$$

$$\sum_{(s,t) \in I_r} w_s w_t y_s y_t = E\{e(\theta, \xi)'WH_rWe(\theta, \xi)\}, \quad 1 \le r \le q, \tag{2.19}$$

where the expectations on the right sides are with respect to $\xi \sim N(0, I_m)$. The next thing we do is to approximate the right sides of (2.18) and (2.19) by Monte Carlo simulation. Let $\xi^{(1)}, \dots, \xi^{(L)}$ be i.i.d. copies of $\xi$ generated by a computer. Then, the right sides of (2.18), (2.19) are approximated by

$$\frac{1}{L}\sum_{l=1}^{L} X_j'We\{\theta, \xi^{(l)}\}, \quad 1 \le j \le p, \tag{2.20}$$

$$\frac{1}{L}\sum_{l=1}^{L} e\{\theta, \xi^{(l)}\}'WH_rWe\{\theta, \xi^{(l)}\}, \quad 1 \le r \le q, \tag{2.21}$$

respectively. In conclusion, the MSM equations for estimating $\theta$ are (2.18) and (2.19) with the right sides approximated by (2.20) and (2.21), respectively.

We have shown how the MSM works in estimation of the parameters under a GLMM, but, so far, we have not discussed the connection between MSM and robustness. Note that, unlike the maximum likelihood (ML) method, the MM only involves assumptions about the moments. For example, in (2.14) and (2.17), only the conditional first moments of the data, $y_i$, given the random effects, need to be specified (Exercise 2.6). It is possible that the conditional distribution of the data given the random effects does not satisfy (2.10) and (2.11), but the conditional first moments are correctly specified. It is in this sense that the MM (or MSM) method is potentially more robust than the ML method. We consider an example.

Example 2.3. (Beta-binomial) If $Y_1, \dots, Y_k$ are correlated Bernoulli random variables, the distribution of $Y = Y_1 + \cdots + Y_k$ is not binomial; in fact, it may not belong to the exponential family. Here we consider a special case. Let $p$ be a random variable with a beta($\pi$, $1-\pi$) distribution, where $0 < \pi < 1$. Suppose that, given $p$, $Y_1, \dots, Y_k$ are independent Bernoulli($p$)
random variables, so that $Y|p \sim$ binomial$(k, p)$. Then, it can be shown (Exercise 2.7) that the marginal distribution of $Y$ is given by

\[ P(Y = j) = \frac{\Gamma(j+\pi)\,\Gamma(k-j+1-\pi)}{j!\,(k-j)!\,\Gamma(\pi)\,\Gamma(1-\pi)}, \quad 0\le j\le k. \tag{2.22} \]

This distribution is called beta-binomial$(k, \pi)$. It follows that $E(Y) = k\pi$ (Exercise 2.7). It is seen that the mean under beta-binomial$(k, \pi)$ is the same as that under the binomial$(k, \pi)$ distribution. Now suppose that, given the random effects $\alpha_1, \dots, \alpha_m$, the $y_{ij}$'s are not conditionally independent; hence, one does not have a GLMM. More specifically, the conditional distribution of $y_{i\cdot}$ is beta-binomial$(k, \pi_i)$, with $\mathrm{logit}(\pi_i) = \mu + \alpha_i$, instead of binomial$(k, \pi_i)$, as in Example 2.1. It follows that the mean on the right side of (2.3) is still given by $kE\{h_\theta(\xi)\}$, as shown in Exercise 2.2.

As long as the mean functions on the right sides of (2.18) and (2.19) are correctly specified, the MSM estimators are consistent under regularity conditions [Jiang (1998a)], even if a key part of the GLMM assumption, that is, the conditional distribution (2.10), fails. We shall return to this later.

2.2 Generalized estimating equations (GEE)

We have discussed MM and MSM in GLMMs. These methods are robust in the sense that even if part of the distributional assumption fails, the MM or MSM estimators may still be consistent. On the other hand, the MM or MSM estimators may be inefficient in that the variance of the estimator may be large, especially under small or moderate sample size. This can be seen, for example, from Table 2.1. The SE for the MSM estimator of $\sigma^2$ is quite large when $m = 20$, $k = 2$, although it drops quickly when either $m$ or $k$ increases.

One way to improve the efficiency of MM (MSM) is to explore the estimating equations, say, (2.14) and (2.15). Note that the latter equations are constructed by simply adding some of the sufficient statistics, and the squares of the others, derived below (2.13). This seems a bit arbitrary. To look at the issue more clearly, suppose that there is a vector of what we call base statistics, say, $S$, which typically is of higher dimension than $\theta$. Let the dimension of $S$ and $\theta$ be $N$ and $r$, respectively. One can construct a set of MM equations by letting

\[ BS = Bu(\theta), \tag{2.23} \]

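where $u(\theta) = E_\theta(S)$, the expectation of $S$ when $\theta$ is the true parameter vector, and $B$ is any $r\times N$ constant matrix. The dimension of $B$ is chosen so that there are exactly as many equations in (2.23) as the dimension of $\theta$ so that, hopefully, there is a unique solution to the MM equation. But there are (infinitely) many choices of $B$. For example, equations (2.14) and (2.15) correspond to one particular choice of $B$ (Exercise 2.8). The question is why $B$ has to be chosen this way. Is there a better choice, or, perhaps, the best choice of $B$?

To explore the issue about the optimality of $B$, let us first carry out a heuristic derivation of the asymptotic covariance matrix (ACVM) of $\hat\theta$, the MM estimator defined as a solution to (2.23). Let $M(\theta)$ denote the difference between the two sides of (2.23), that is, $M(\theta) = B\{S - u(\theta)\}$. By the Taylor series expansion, and recalling that $\hat\theta$ satisfies (2.23), we have

\[ 0 = M(\hat\theta) \approx M(\theta) + \frac{\partial M}{\partial\theta'}(\hat\theta - \theta) \approx M(\theta) + E\left(\frac{\partial M}{\partial\theta'}\right)(\hat\theta - \theta), \tag{2.24} \]

where for $M = (m_j)_{1\le j\le r}$ and $\theta = (\theta_k)_{1\le k\le r}$, $\partial M/\partial\theta'$ is the matrix whose $(j,k)$ entry is $\partial m_j/\partial\theta_k$, and $\theta$ is the true parameter vector, at which $\partial M/\partial\theta'$ is evaluated. The first approximation in (2.24) is due to the Taylor expansion; the second approximation holds under regularity conditions [e.g., Jiang (2010), sec. 4.7]. If we denote $M(\theta)$ by $M$, for notational simplicity, then from (2.24) we arrive at the approximation

\[ \hat\theta - \theta \approx -\left\{E\left(\frac{\partial M}{\partial\theta'}\right)\right\}^{-1} M; \tag{2.25} \]

hence, one would expect

\[ \mathrm{Var}(\hat\theta) \approx \left\{E\left(\frac{\partial M}{\partial\theta'}\right)\right\}^{-1} \mathrm{Var}(M) \left\{E\left(\frac{\partial M'}{\partial\theta}\right)\right\}^{-1}, \tag{2.26} \]

where $\partial M'/\partial\theta = (\partial M/\partial\theta')'$, and Var denotes the covariance matrix. It should be noted that the above derivation is not completely rigorous, because some regularity conditions are needed for the result to hold; nevertheless, it points the way to the optimal estimating equation. To see this, all we have to do is to specify the terms involved in (2.26), and note that $\partial M/\partial\theta' = -B(\partial u/\partial\theta')$ and $\mathrm{Var}(M) = B\,\mathrm{Var}(S)B'$. It follows that

\[ \mathrm{Var}(\hat\theta) \approx \{(BU)^{-1}\}\,BVB'\,\{(BU)^{-1}\}', \tag{2.27} \]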

where $V = \mathrm{Var}(S)$ and $U = \partial u/\partial\theta'$. It can be shown (Exercise 2.9) that the optimal $B$ in the sense of minimizing the right side of (2.27), according to the partial ordering of symmetric matrices, is $B = U'V^{-1}$. Here for two symmetric matrices, $A$ and $B$, $A \ge B$ if $A - B$ is nonnegative definite (or positive semidefinite), denoted by $A - B \ge 0$. Thus, at least theoretically, the optimal estimating equation, or the best MM equation, is (2.23) with $B = U'V^{-1}$, that is

\[ U'V^{-1}\{S - u(\theta)\} = 0. \tag{2.28} \]

A special form of this optimal estimating equation has been used by Liang and Zeger (1986) for the analysis of longitudinal data. Suppose that data are collected from a number of subjects (e.g., patients) over time. Let $y_{it}$ denote the response collected from the $i$th subject at time $t$, $i = 1, \dots, m$, $t \in T_i$, where $T_i$ is the set of observational times for the $i$th subject, which may be subject-dependent. Let $y_i = (y_{it})_{t\in T_i}$ denote the vector of responses from the $i$th subject, and $\mu_i = (\mu_{it})_{t\in T_i}$ denote the vector of means, for the $i$th subject. It is assumed that the mean is associated with a vector of unknown parameters, $\theta$, that is, $\mu_i = \mu_i(\theta)$. It is also assumed that the responses from different subjects, $y_1, \dots, y_m$, are independent. Consider $S = y = (y_i)_{1\le i\le m}$ in (2.28), then $u = \mu$. The optimal estimating equation (2.28) can now be expressed as (Exercise 2.10)

\[ \sum_{i=1}^{m} \frac{\partial\mu_i'}{\partial\theta} V_i^{-1}(y_i - \mu_i) = 0, \tag{2.29} \]

where $V_i = \mathrm{Var}(y_i)$. (2.29) is widely known as the generalized estimating equation [GEE; e.g., Diggle et al. (2002)]. We consider some examples.

Example 2.4. (Best linear unbiased estimation) The simplest model for the mean function would be a linear model, in which case $\mu_i = X_i\beta$, where $X_i$ is a matrix of known covariates associated with the $i$th subject, and $\beta$ is a vector of unknown regression coefficients. Note that some of the covariates may be time-varying. Thus, in this case, we have $\theta = \beta$, and $\partial\mu_i/\partial\theta' = X_i$. The GEE (2.29) can be expressed as $\left(\sum_{i=1}^{m} X_i'V_i^{-1}X_i\right)\beta = \sum_{i=1}^{m} X_i'V_i^{-1}y_i$, which leads to a closed-form solution,

\[ \tilde\beta = \left(\sum_{i=1}^{m} X_i'V_i^{-1}X_i\right)^{-1}\sum_{i=1}^{m} X_i'V_i^{-1}y_i. \tag{2.30} \]

Like the GEE, (2.30) also has a well-known name: it is called the best linear unbiased estimator, or BLUE [e.g., Jiang (2007), sec. 2.3.1]. Note that the BLUE is not computable unless the $V_i$'s are known. Also note that, if the

distribution of $y_i$ is, actually, normal, that is, $y_i \sim N(\mu_i, V_i)$, then the BLUE is the same as the ML estimator.

Example 2.5. Consider a simple mixed logistic model in a matched-pair setting. Let $y_{it}$, $i = 1, \dots, m$, $t = 1, 2$ be binary responses that are (conditionally) independent given the subject-specific random effects, $\alpha_1, \dots, \alpha_m$, such that

\[ \mathrm{logit}(p_{i1}) = \beta + \alpha_i, \qquad \mathrm{logit}(p_{i2}) = \beta + \delta + \alpha_i, \]

where $p_{it} = P(y_{it} = 1|\alpha_i)$, and $\beta$, $\delta$ are unknown parameters corresponding to the baseline and increment, respectively. For example, $y_{i1}$ and $y_{i2}$ may correspond to the responses from the $i$th patient before and after a treatment. Suppose that the random effects are independent and normal, with mean 0 and unknown variance $\sigma^2$. It is clear that this is a special case of GLMM. The mean function, $\mu_i$, has components $\mu_{it}$, $t = 1, 2$, where $\mu_{it} = E\{h(\beta + \delta 1_{(t=2)} + \sigma\xi)\}$ with $h(u) = e^u/(1+e^u)$ and $\xi \sim N(0,1)$. Let $\theta = (\beta, \delta, \sigma)'$. It is easy to see that

\[ \frac{\partial\mu_{it}}{\partial\beta} = E\{h'(\beta + \delta 1_{(t=2)} + \sigma\xi)\}, \]

\[ \frac{\partial\mu_{it}}{\partial\delta} = E\{h'(\beta + \delta 1_{(t=2)} + \sigma\xi)\}\,1_{(t=2)}, \]

\[ \frac{\partial\mu_{it}}{\partial\sigma} = E\{h'(\beta + \delta 1_{(t=2)} + \sigma\xi)\,\xi\}, \]

where $h'(u) = e^u(1+e^u)^{-2}$. Furthermore, we have (Exercise 2.11) $\mathrm{var}(y_{it}) = \mu_{it}(1-\mu_{it})$, $t = 1, 2$ and $\mathrm{cov}(y_{i1}, y_{i2}) = E\{h(\beta+\sigma\xi)h(\beta+\delta+\sigma\xi)\} - \mu_{i1}\mu_{i2}$ with $\mu_{it}$ given above. With these expressions, the GEE (2.29) can be specified (Exercise 2.11).

Although (2.29) is, theoretically, the optimal estimating equation, it is not available for computation in most practical situations. The reason is that the covariance matrices, $V_i$, $1\le i\le m$, typically are unknown, or involve some unknown parameters. To deal with this complication, Liang and Zeger (1986) noted that one may replace $V_i$ by a "working covariance matrix", $\tilde V_i$, that is available, $1\le i\le m$. The resulting estimating equation is no longer optimal. However, the corresponding estimator is still consistent, under some mild conditions, even though it may not be as efficient as the GEE estimator that one would obtain from (2.29) if the true $V_i$'s were known. For example, the working covariance matrix may be chosen as the identity matrix, or a matrix based on the best guess of the true covariance matrix. We consider a simple example.

Example 2.6. In case $\tilde V_i = I_{n_i}$, where $n_i = |T_i|$ ($|\cdot|$ denotes cardinality), $1\le i\le m$, and $\mu_i = X_i\beta$, (2.29) becomes $\sum_{i=1}^{m} X_i'(y_i - X_i\beta) = 0$,

with the solution

\[ \hat\beta = \left(\sum_{i=1}^{m} X_i'X_i\right)^{-1}\sum_{i=1}^{m} X_i'y_i. \tag{2.31} \]

(2.31) is known as the ordinary least squares (OLS) estimator. It is the same as the LS estimator in linear regression except that now, because the data are correlated with an unknown variance-covariance structure (VCS), the OLS estimator does not have some of the nice properties that the LS estimator has in the case of linear regression. For example, the LS estimator is the same as the BLUE (see Example 2.4) under the linear regression model; however, the OLS estimator is typically not the BLUE when the data are correlated.

Another complication concerns the variation of the estimator. For simplicity, let us focus on the case of linear models, that is, $\mu_i = X_i\beta$, $1\le i\le m$, although the idea can be generalized along the same line to nonlinear models. Note that $\theta = \beta$ in this case. By (2.27), the asymptotic covariance of the BLUE is

\[ (U'V^{-1}U)^{-1} = \left(\sum_{i=1}^{m} X_i'V_i^{-1}X_i\right)^{-1}. \tag{2.32} \]

(2.32) can also be derived directly from (2.30), of course. However, when the $V_i$'s are replaced by the working covariance matrices, (2.32) no longer holds. In fact, denote by $\hat\beta$ the GEE estimator with working covariance matrices, that is, the solution to (2.29) with $V_i$ replaced by $\tilde V_i$, $1\le i\le m$. Then, once again, it follows from (2.27), with $B = U'\tilde V^{-1}$, where $\tilde V = \mathrm{diag}(\tilde V_i, 1\le i\le m)$, that

\[ \mathrm{Var}(\hat\beta) \approx (U'\tilde V^{-1}U)^{-1}U'\tilde V^{-1}V\tilde V^{-1}U(U'\tilde V^{-1}U)^{-1} = \left(\sum_{i=1}^{m} X_i'\tilde V_i^{-1}X_i\right)^{-1}\left(\sum_{i=1}^{m} X_i'\tilde V_i^{-1}V_i\tilde V_i^{-1}X_i\right)\left(\sum_{i=1}^{m} X_i'\tilde V_i^{-1}X_i\right)^{-1}. \tag{2.33} \]

Clearly, (2.33) reduces to (2.32) when $\tilde V_i = V_i$, $1\le i\le m$. We consider another example.

Example 2.6 (continued). In the case of OLS, (2.33) gives

\[ \mathrm{Var}(\hat\beta) \approx \left(\sum_{i=1}^{m} X_i'X_i\right)^{-1}\left(\sum_{i=1}^{m} X_i'V_iX_i\right)\left(\sum_{i=1}^{m} X_i'X_i\right)^{-1}. \tag{2.34} \]

There is an interesting three-way comparison of the covariance matrices. It is known [e.g., Sen and Srivastava (1990), p. 36] that, under linear regression, the covariance matrix of the LS estimator is

\[ \sigma^2(X'X)^{-1} = \sigma^2\left(\sum_{i=1}^{m} X_i'X_i\right)^{-1}, \tag{2.35} \]

where $X = (X_i)_{1\le i\le m}$ and $\sigma^2$ is the variance of the regression errors, which are assumed to be independent with mean zero. Now we know that neither (2.35), the covariance matrix of the LS estimator, nor (2.32), the covariance matrix of the BLUE, is the correct covariance matrix of the OLS estimator; the correct one is (2.34). More specifically, (2.35) is correct if $V_i = \sigma^2 I_{n_i}$, $1\le i\le m$; (2.32) is correct if $\tilde V_i = V_i$, $1\le i\le m$; and (2.34) is correct, no matter what.

In practice, the covariance matrix, or ACVM, is used in obtaining measures of uncertainty of the estimator. For example, the standard error (s.e.) of an estimator is defined as an estimated standard deviation (s.d.) of the estimator. However, even though (2.33) is the correct ACVM of $\hat\beta$, it is not ready to be used, because the $V_i$'s are unknown. To obtain an estimated ACVM, we use the following useful technique. Note that $V_i = E\{(y_i - X_i\beta)(y_i - X_i\beta)'\}$, where $\beta$ is the true parameter vector. If we bring this expression to the middle factor in (2.33), and move the expectation sign to the outside of the summation, the middle factor becomes

\[ E\left\{\sum_{i=1}^{m} X_i'\tilde V_i^{-1}(y_i - X_i\beta)(y_i - X_i\beta)'\tilde V_i^{-1}X_i\right\}. \tag{2.36} \]

The idea is to estimate (2.36) by the expression inside the expectation, with $\beta$ replaced by $\hat\beta$. This leads to the following estimator of $\mathrm{Var}(\hat\beta)$:

\[ \widehat{\mathrm{Var}}(\hat\beta) = \left(\sum_{i=1}^{m} X_i'\tilde V_i^{-1}X_i\right)^{-1}\left\{\sum_{i=1}^{m} X_i'\tilde V_i^{-1}(y_i - X_i\hat\beta)(y_i - X_i\hat\beta)'\tilde V_i^{-1}X_i\right\}\left(\sum_{i=1}^{m} X_i'\tilde V_i^{-1}X_i\right)^{-1}. \tag{2.37} \]

(2.37) is widely known as the "sandwich estimator" for a visually intuitive reason. We illustrate the method with a specific example.

Example 2.7. (Growth curve) For simplicity, suppose that for each of the $m$ individuals, the observations are collected over a common set of times $t_1, \dots, t_k$. Suppose that $y_{ij}$, the observation collected at time $t_j$ from the $i$th individual, satisfies $y_{ij} = \xi_i + \eta_i x_{ij} + \zeta_{ij} + \epsilon_{ij}$, where $\xi_i$ and $\eta_i$ represent, respectively, a random intercept and a random slope; $x_{ij}$ is a known covariate; $\zeta_{ij}$ corresponds to a serial correlation; and $\epsilon_{ij}$ is an error. For each $i$, it is assumed that $\xi_i$ and $\eta_i$ are jointly normally distributed with means $\beta_0$, $\beta_1$, variances $\sigma_0^2$, $\sigma_1^2$, respectively, and correlation coefficient $\rho$; and the $\epsilon_{ij}$'s are independent and distributed as $N(0, \tau^2)$. As for the $\zeta_{ij}$'s, it is assumed that they satisfy the following relation of the first-order autoregressive process,

or AR(1): $\zeta_{ij} = \phi\zeta_{ij-1} + \omega_{ij}$, where $\phi$ is a constant such that $0 < \phi < 1$, and the $\omega_{ij}$'s are independent and distributed as $N\{0, \sigma_2^2(1-\phi^2)\}$. Furthermore, the three random components $(\xi, \eta)$, $\zeta$, and $\epsilon$ are independent, and observations from different individuals are independent. This is sometimes called a growth curve model in the sense that both the intercept and the slope for the mean response, as a function of time, depend on the individual [Diggle et al. (2002)].

To express the model in terms of a more standard LMM, write $\xi_i = \beta_0 + v_{0i}$, $\eta_i = \beta_1 + v_{1i}$, and $e_{ij} = \zeta_{ij} + \epsilon_{ij}$. Then, the growth curve model can be expressed as

\[ y_{ij} = \beta_0 + \beta_1 x_{ij} + v_{0i} + v_{1i}x_{ij} + e_{ij} = X_{ij}'\beta + X_{ij}'v_i + e_{ij}, \]

where $X_{ij} = (1, x_{ij})'$, $\beta = (\beta_0, \beta_1)'$, and $v_i = (v_{0i}, v_{1i})'$, or,

\[ y_i = X_i\beta + X_iv_i + e_i, \quad i = 1, \dots, m, \tag{2.38} \]

where $y_i = (y_{ij})_{1\le j\le k}$ and $X_i = (X_{ij}')_{1\le j\le k}$. (2.38) is a special case of the so-called longitudinal LMM [e.g., Jiang (2007), p. 6].

A feature of the longitudinal LMM is that the $y_i$'s are independent as vectors; however, there may be correlations within the vectors. Furthermore, so far as the current model is concerned, the correlation structure is fairly complicated, involving the (vector-valued) random effects, $v_i$, and errors, $e_i$. In particular, the correlation structure within $e_i$ (Exercise 2.12) may not be known, in practice. Of course, one may try to model the correlation structure, but what if the model is wrong?

As a more robust strategy, one may consider estimating the $\beta$ parameters using the GEE with a working covariance matrix. Note that, in most cases, the main interest of longitudinal data analysis is inference about the mean function [e.g., Diggle et al. (2002)]. Here, because the $\beta$'s are the only parameters involved in the mean function, they are of main interest. The simplest working covariance matrix is the identity matrix, $I_k$, which leads to the OLS estimator (2.31), and the estimated $\mathrm{Var}(\hat\beta)$ (2.37) with $\tilde V_i = I_k$, $1\le i\le m$.

Alternatively, one may consider the so-called equicorrelation working covariance matrix, that is, the entries of $\tilde V_i$ are $\sigma^2$ on the diagonal, and $\sigma^2\gamma$ on the off-diagonal, where $|\gamma| < 1$. If $\sigma^2$ and $\gamma$ are known, the resulting GEE estimator for $\beta$ is given by (2.30) with $V_i$ replaced by $\tilde V_i$, $1\le i\le m$ (verify this). If $\sigma^2$ and $\gamma$ are unknown, but their estimators, $\hat\sigma^2$ and $\hat\gamma$, are available (see the next section), an extended GEE estimator is obtained by (2.30) with $V_i$ replaced by $\hat V_i$, $1\le i\le m$, where $\hat V_i$ is $\tilde V_i$ with $\sigma^2$, $\gamma$ replaced by $\hat\sigma^2$, $\hat\gamma$, respectively. Similarly, an estimated covariance matrix of the estimator is given by (2.37) with $\tilde V_i$ replaced by $\hat V_i$, $1\le i\le m$.
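The working-covariance GEE estimator and its sandwich covariance can be sketched numerically. The block below is only an illustration under stated assumptions: the function name `gee_linear`, the simulated data (a random intercept plus independent noise), and the choices $\sigma^2 = 1$, $\gamma = 0.5$ for the equicorrelation matrix are hypothetical and do not come from the text. It computes (2.30) with $V_i$ replaced by a working matrix, and then (2.37).

```python
import numpy as np

def gee_linear(X_list, y_list, W_list):
    """GEE for a linear mean with working covariances W_i.

    Returns the estimator (2.30) with V_i replaced by W_i, together
    with the sandwich covariance estimate (2.37).
    """
    p = X_list[0].shape[1]
    bread = np.zeros((p, p))            # sum_i X_i' W_i^{-1} X_i
    rhs = np.zeros(p)                   # sum_i X_i' W_i^{-1} y_i
    for X, y, W in zip(X_list, y_list, W_list):
        WinvX = np.linalg.solve(W, X)   # W_i^{-1} X_i (W_i is symmetric)
        bread += X.T @ WinvX
        rhs += WinvX.T @ y
    beta = np.linalg.solve(bread, rhs)

    meat = np.zeros((p, p))             # middle factor of (2.37)
    for X, y, W in zip(X_list, y_list, W_list):
        score = np.linalg.solve(W, X).T @ (y - X @ beta)
        meat += np.outer(score, score)
    bread_inv = np.linalg.inv(bread)
    return beta, bread_inv @ meat @ bread_inv

# Hypothetical simulated data: m subjects, k observations each.
rng = np.random.default_rng(0)
m, k = 200, 4
beta_true = np.array([1.0, 0.5])
X_list = [np.column_stack([np.ones(k), rng.uniform(0.0, 3.0, size=k)])
          for _ in range(m)]
y_list = [X @ beta_true + rng.normal() + 0.5 * rng.normal(size=k)
          for X in X_list]

# Two working covariance choices: identity (OLS) and equicorrelation.
identity = [np.eye(k)] * m
sigma2, gamma = 1.0, 0.5                # assumed known, for illustration only
equi = [sigma2 * ((1 - gamma) * np.eye(k) + gamma * np.ones((k, k)))] * m

beta_ols, cov_ols = gee_linear(X_list, y_list, identity)
beta_equi, cov_equi = gee_linear(X_list, y_list, equi)
```

With the identity working matrix the point estimate coincides with OLS on the stacked data, while the sandwich covariance remains usable whether or not the working matrix matches the true covariance.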

2.3 Iterative estimating equations

The concluding example of the previous section has a part that is a bit vague. Namely, if the working covariance matrix involves some unknown parameters, how would one estimate these parameters? Moreover, as noted, there is potentially a loss of efficiency in GEE with the working covariance matrices, because the latter are not the true covariance matrices, or may not be even close to the true covariance matrices. In this section, we discuss a strategy that allows one to eventually "get it right".

We describe the method under a semiparametric regression model and then discuss its application to longitudinal GLMMs. Consider a follow-up study conducted over a set of prespecified visit times $t_1, \dots, t_b$. Suppose that the responses are collected from subject $i$ at the visit times $t_j$, $j \in J_i \subset J = \{1, \dots, b\}$. Let $y_i = (y_{ij})_{j\in J_i}$. Here we allow the visit times to be dependent on the subject. This enables us to include some cases with missing responses. Let $X_{ij} = (X_{ijl})_{1\le l\le p}$ represent a vector of explanatory variables associated with $y_{ij}$ so that $X_{ij1} = 1$. Write $X_i = (X_{ij})_{j\in J_i} = (X_{ijl})_{j\in J_i,\,1\le l\le p}$. Note that $X_i$ may include both time-dependent and independent covariates so that, without loss of generality, it may be expressed as $X_i = (X_{i1}, X_{i2})$, where $X_{i1}$ does not depend on $j$ (i.e., time) whereas $X_{i2}$ does. We assume that $(X_i, Y_i)$, $i = 1, \dots, m$ are independent. Furthermore, it is assumed that

\[ E(Y_{ij}|X_i) = g_j(X_i, \beta), \tag{2.39} \]

where $\beta$ is a $p\times 1$ vector of unknown regression coefficients and $g_j(\cdot,\cdot)$ are fixed functions. We use the notation $\mu_{ij} = E(Y_{ij}|X_i)$ and $\mu_i = (\mu_{ij})_{j\in J_i}$. Note that $\mu_i = E(Y_i|X_i)$. In addition, denote the (conditional) covariance matrix of $Y_i$ given $X_i$ as

\[ V_i = \mathrm{Var}(Y_i|X_i), \tag{2.40} \]

whose $(j,k)$th element is $v_{ijk} = \mathrm{cov}(Y_{ij}, Y_{ik}|X_i) = E\{(Y_{ij} - \mu_{ij})(Y_{ik} - \mu_{ik})|X_i\}$, $j, k \in J_i$. Note that the dimension of $V_i$ may depend on $i$. Let $D = \{(j,k): j, k \in J_i \text{ for some } 1\le i\le m\}$.

Our main interest is to estimate $\beta$, the vector of regression coefficients. According to the earlier discussion, if the $V_i$'s are known, $\beta$ may be estimated by the GEE (2.29). On the other hand, if $\beta$ is known, the covariance matrices $V_i$ can be estimated by the MM method, as follows. Note that for any $(j,k) \in D$, some of the $v_{ijk}$'s may be the same, either by the nature of the data or by the assumptions. Let $L_{jk}$ denote the number

of different $v_{ijk}$'s. Suppose that $v_{ijk} = v(j,k,l)$, $i \in I(j,k,l)$, where $I(j,k,l)$ is a subset of $\{1, \dots, m\}$, $1\le l\le L_{jk}$. For any $(j,k) \in D$, $1\le l\le L_{jk}$, define

\[ \hat v(j,k,l) = \frac{1}{n(j,k,l)}\sum_{i\in I(j,k,l)} \{Y_{ij} - g_j(X_i,\beta)\}\{Y_{ik} - g_k(X_i,\beta)\}, \tag{2.41} \]

where $n(j,k,l) = |I(j,k,l)|$, the cardinality. Then, define $\hat V_i = (\hat v_{ijk})_{j,k\in J_i}$, where $\hat v_{ijk} = \hat v(j,k,l)$, if $i \in I(j,k,l)$.

The main points of the above may be summarized as follows. If the $V_i$'s were known, one could estimate $\beta$ by the GEE; on the other hand, if $\beta$ were known, one could estimate the $V_i$'s by the MM. It is clear that there is a cycle here, which motivates the following iterative procedure. Starting with an initial estimator of $\beta$, use (2.41), with $\beta$ replaced by the initial estimator, to obtain the estimators of the $V_i$'s; then use (2.29) to update the estimator of $\beta$, and repeat the process. We call such a procedure iterative estimating equations, or IEE. If the procedure converges, the limiting estimator is called the IEE estimator, or IEEE. We consider an example.

Example 2.8. A special case of the semiparametric regression model is the linear model with $E(y_i) = X_i\beta$ and $\mathrm{Var}(y_i) = V_0$, where $X_i$ is a matrix of known covariates, and $V_0 = (v_{qr})_{1\le q,r\le k}$ is an unknown covariance matrix. Let $y = (y_i)_{1\le i\le m}$ and assume that $y_i$, $1\le i\le m$ are independent. Then, $V = \mathrm{Var}(y) = \mathrm{diag}(V_0, \dots, V_0)$. For this special case, the IEE can be formulated as follows. If $\beta$ were known, a simple consistent estimator of $V$ would be $\hat V = \mathrm{diag}(\hat V_0, \dots, \hat V_0)$ with

\[ \hat V_0 = \frac{1}{m}\sum_{i=1}^{m} (y_i - X_i\beta)(y_i - X_i\beta)'. \tag{2.42} \]

On the other hand, if $V$ were known, the BLUE of $\beta$ would be given by (2.30). When both $\beta$ and $V$ are unknown, one iterates between these two steps, starting with $V_0 = I_k$. This procedure is called iterative WLS, or I-WLS (Jiang et al. (2007)).

To apply IEE to a longitudinal GLMM, let us denote the responses by $y_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n_i$, and let $y_i = (y_{ij})_{1\le j\le n_i}$. We assume that each $y_i$ is associated with a vector of random effects, $\alpha_i$, that has dimension $d$ such that

\[ g(\mu_{ij}) = x_{ij}'\beta + z_{ij}'\alpha_i, \tag{2.43} \]

where $\mu_{ij} = E(y_{ij}|\alpha_i)$, $g$ is the link function, and $x_{ij}$, $z_{ij}$ are known vectors. Furthermore, we assume that the responses from different clusters

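$y_1, \dots, y_m$ are independent. Finally, suppose that

\[ \alpha_i \sim f(u|\theta), \tag{2.44} \]

where $f(\cdot|\theta)$ is a $d$-variate pdf known up to a vector of dispersion parameters $\theta$ such that $E_\theta(\alpha_i) = 0$. Let $\psi = (\beta', \theta')'$. Then, we have

\[ E(y_{ij}) = E\{E(y_{ij}|\alpha_i)\} = E\{h(x_{ij}'\beta + z_{ij}'\alpha_i)\} = \int h(x_{ij}'\beta + z_{ij}'u) f(u|\theta)\,du, \]

where $h = g^{-1}$. Let $W_i = (X_i\ Z_i)$, where $X_i = (x_{ij}')_{1\le j\le n_i}$, $Z_i = (z_{ij}')_{1\le j\le n_i}$. For any vectors $a \in R^p$, $b \in R^d$, define

\[ \mu_1(a, b, \psi) = \int h(a'\beta + b'u) f(u|\theta)\,du. \]

Furthermore, for any $n_i\times p$ matrix $A$ and $n_i\times d$ matrix $B$, let $C = (A\ B)$, and $g_j(C, \psi) = \mu_1(a_j, b_j, \psi)$, where $a_j'$ and $b_j'$ are the $j$th rows of $A$ and $B$, respectively. Then, it is easy to see that

\[ E(y_{ij}) = g_j(W_i, \psi). \tag{2.45} \]

It is clear that (2.45) is simply (2.39) with $X_i$ replaced by $W_i$, and $\beta$ replaced by $\psi$. Note that, because $W_i$ is a fixed matrix of covariates, we have $E(y_{ij}|W_i) = E(y_{ij})$. In other words, the longitudinal GLMM satisfies the semiparametric regression model introduced above, hence IEE applies. Again, we consider an example.

Example 2.9. Consider a random-intercept model with binary responses. Let $y_{ij}$ be the response for subject $i$ collected at time $t_j$. We assume that given a subject-specific random effect $\alpha_i$, binary responses $y_{ij}$, $j = 1, \dots, k$ are conditionally independent with conditional probability $p_{ij} = P(y_{ij} = 1|\alpha_i)$, which satisfies $\mathrm{logit}(p_{ij}) = \beta_0 + \beta_1 t_j + \alpha_i$, where $\beta_0$, $\beta_1$ are unknown coefficients. Furthermore, we assume that $\alpha_i \sim N(0, \sigma^2)$, where $\sigma > 0$ and is unknown. Let $y_i = (y_{ij})_{1\le j\le k}$. It is assumed that $y_1, \dots, y_m$ are independent, where $m$ is the number of subjects. It is easy to show that, under the assumed model, one has

\[ E(y_{ij}) = \int_{-\infty}^{\infty} h(\beta_0 + \beta_1 t_j + \sigma u) f(u)\,du \equiv \mu(t_j, \psi), \tag{2.46} \]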

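where $h(x) = e^x/(1+e^x)$, $f(u) = (1/\sqrt{2\pi})e^{-u^2/2}$, and $\psi = (\beta_0, \beta_1, \sigma)'$. Write $\mu_j = \mu(t_j, \psi)$, and $\mu = (\mu_j)_{1\le j\le k}$. We have

\[ \frac{\partial\mu_j}{\partial\beta_0} = \int_{-\infty}^{\infty} h'(\beta_0 + \beta_1 t_j + \sigma u) f(u)\,du, \tag{2.47} \]

\[ \frac{\partial\mu_j}{\partial\beta_1} = t_j\int_{-\infty}^{\infty} h'(\beta_0 + \beta_1 t_j + \sigma u) f(u)\,du, \tag{2.48} \]

\[ \frac{\partial\mu_j}{\partial\sigma} = \int_{-\infty}^{\infty} h'(\beta_0 + \beta_1 t_j + \sigma u)\,u f(u)\,du. \tag{2.49} \]

Also, it is easy to see that the $y_i$'s have the same (joint) distribution, hence $V_i = \mathrm{Var}(y_i) = V_0$, an unspecified $k\times k$ covariance matrix, $1\le i\le m$. Thus, the GEE equation for estimating $\psi$ is given by

\[ \sum_{i=1}^{m} \dot\mu' V_0^{-1}(y_i - \mu) = 0, \tag{2.50} \]

provided that $V_0$ is known, where $\dot\mu = \partial\mu/\partial\psi'$ has the entries given by (2.47)–(2.49). On the other hand, if $\psi$ is known, $V_0$ can be estimated by the method of moments as follows,

\[ \hat V_0 = \frac{1}{m}\sum_{i=1}^{m} (y_i - \mu)(y_i - \mu)'. \tag{2.51} \]

The IEE procedure then iterates between the two steps when both $V_0$ and $\psi$ are unknown, starting with $V_0 = I$, the $k$-dimensional identity matrix. Note that the mean function, $\mu_j$, in (2.50) and (2.51), is a one-dimensional integral, which can be approximated by a simple Monte Carlo method, as in MSM (see Section 2.1).

Two questions, both of theoretical and practical interest, are: (i) Does the IEE algorithm converge (numerically)? and (ii) if the IEE converges, how does the limit of the convergence behave asymptotically, as an estimator? The short answers to these questions are: (i) Yes, not only does the IEE converge, it converges linearly (see below); (ii) not only is the limit of the convergence a consistent estimator, it is asymptotically as efficient as the solution to the GEE (2.29), as if the true $V_i$'s were known. We discuss these results below, and refer further details to Jiang et al. (2007).

We adapt a term from numerical analysis. An iterative algorithm that results in a sequence $x^{(k)}$, $k = 1, 2, \dots$ converges linearly to a limit $x^*$, if there is $0 < \rho < 1$ such that $\sup_{k\ge 1}\{|x^{(k)} - x^*|/\rho^k\} < \infty$ (e.g., Press et al. (1997)). Let $L_1 = \max_{1\le i\le m}\max_{j\in J_i} s_{ij}$, where

\[ s_{ij} = \sup_{|\tilde\beta - \beta|\le\epsilon_1}\left\|\frac{\partial}{\partial\beta} g_j(X_i, \tilde\beta)\right\|, \]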

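with $\beta$ denoting the true parameter vector, $\epsilon_1$ any positive constant, and

\[ \frac{\partial}{\partial\beta} f(\tilde\beta) = \left.\frac{\partial f}{\partial\beta}\right|_{\beta=\tilde\beta}. \]

Similarly, let $L_2 = \max_{1\le i\le m}\max_{j\in J_i} w_{ij}$, where

\[ w_{ij} = \sup_{|\tilde\beta - \beta|\le\epsilon_1}\left\|\frac{\partial^2}{\partial\beta\,\partial\beta'} g_j(X_i, \tilde\beta)\right\|. \]

Also, let $\mathcal{V} = \{v: \lambda_{\min}(V_i) \ge \lambda_0,\ \lambda_{\max}(V_i) \le M_0,\ 1\le i\le m\}$, where $\lambda_{\min}$ and $\lambda_{\max}$ represent the smallest and largest eigenvalues, respectively, and $\lambda_0$ and $M_0$ are given positive constants. Note that $\mathcal{V}$ is a nonrandom set.

An array of nonnegative definite matrices $\{A_{m,i}\}$ is bounded from above if $\|A_{m,i}\| \le c$ for some constant $c$; the array is bounded from below if $A_{m,i}^{-1}$ exists and $\|A_{m,i}^{-1}\| \le c$ for some constant $c$. We also refer the notion of $O_P$, including that for random vectors and matrices, to Jiang (2010) (sec. 3.4). Let $p$ and $R$ be the dimensions of $\beta$ and $v$, respectively. We assume the following.

A1. For any $(j,k) \in D$, the number of different $v_{ijk}$'s is bounded, that is, for each $(j,k) \in D$, there is a set of numbers $\mathcal{V}_{jk} = \{v(j,k,l), 1\le l\le L_{jk}\}$, where $L_{jk}$ is bounded, such that $v_{ijk} \in \mathcal{V}_{jk}$ for any $1\le i\le m$ with $j, k \in J_i$.

A2. The functions $g_j(X_i, \beta)$ are twice continuously differentiable with respect to $\beta$; $E(|Y_i|^4)$, $1\le i\le m$ are bounded; and $L_1$, $L_2$, $\max_{1\le i\le m}(\|V_i\| \vee \|V_i^{-1}\|)$ are $O_P(1)$.

A3 (Consistency of GEE estimator). For any given $V_i$, $1\le i\le m$ bounded from above and below, the GEE equation (2.29) has a unique solution $\hat\beta$ that is consistent.

A4 (Differentiability of GEE solution). For any $v$, the solution to (2.29), $\beta(v)$, is continuously differentiable with respect to $v$, and $\sup_{v\in\mathcal{V}}\|\partial\beta/\partial v\| = O_P(1)$.

A5. $n(j,k,l) \to \infty$ for any $1\le l\le L_{jk}$, $(j,k) \in D$, as $m \to \infty$.

First consider the numerical convergence of IEE. Let $\hat\beta^{(k)}$, $\hat v^{(k)}$ denote the updates of $\beta$ and $v$, respectively, at the $k$th iteration.

Theorem 2.1. Under assumptions A1–A5, we have $P(\text{IEE converges}) \to 1$ as $m \to \infty$. Furthermore, we have

\[ P\left[\sup_{k\ge 1}\left\{|\hat\beta^{(k)} - \hat\beta^*|/(p\eta)^{k/2}\right\} < \infty\right] \to 1, \]

\[ P\left[\sup_{k\ge 1}\left\{|\hat v^{(k)} - \hat v^*|/(R\eta)^{k/2}\right\} < \infty\right] \to 1 \]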

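as $m \to \infty$ for any $0 < \eta < (p\vee R)^{-1}$, where $(\hat\beta^*, \hat v^*)$ is the (limiting) IEEE.

Note 1. It is clear that the restriction $\eta < (p\vee R)^{-1}$ is unnecessary (because, for example, $(p\eta_1)^{-k/2} < (p\eta_2)^{-k/2}$ for any $\eta_1 \ge (p\vee R)^{-1} > \eta_2$), but linear convergence would only make sense when $\rho < 1$ (see the definition above).

Note 2. The proof of Theorem 2.1 (see Jiang et al. (2007)), in fact, demonstrated that for any $\delta > 0$, there are constants $M_{1,\delta}$, $M_{2,\delta}$, and integer $m_\delta$ such that, for all $m \ge m_\delta$,

\[ P\left[\sup_{k\ge 1}\left\{\frac{|\hat\beta^{(k)} - \hat\beta^*|}{(p\eta)^{k/2}}\right\} \le M_{1,\delta}\right] > 1 - \delta, \]

\[ P\left[\sup_{k\ge 1}\left\{\frac{|\hat v^{(k)} - \hat v^*|}{(R\eta)^{k/2}}\right\} \le M_{2,\delta}\right] > 1 - \delta. \]

Next, we consider asymptotic behavior of the limiting IEEE. For simplicity, the latter is simply called IEEE. The first result is about consistency.

Theorem 2.2. Under the assumptions of Theorem 2.1, the IEEE is consistent.

To establish the asymptotic efficiency of IEEE, we need to strengthen assumptions A2 and A5 a little. Define

\[ L_{2,0} = \max_{1\le i\le m}\max_{j\in J_i}\left\|\partial^2\mu_{ij}/\partial\beta\,\partial\beta'\right\|, \]

and $L_3 = \max_{1\le i\le m}\max_{j\in J_i} d_{ij}$, where

\[ d_{ij} = \max_{1\le a,b,c\le p}\ \sup_{|\tilde\beta - \beta|\le\epsilon_1}\left|\frac{\partial^3}{\partial\beta_a\,\partial\beta_b\,\partial\beta_c} g_j(X_i, \tilde\beta)\right|. \]

A2′. Same as A2 except that $g_j(X_i, \beta)$ are three-times continuously differentiable with respect to $\beta$, and that $L_2 = O_P(1)$ is replaced by $L_{2,0}\vee L_3 = O_P(1)$.

A5′. There is a positive integer $\gamma$ such that $m/\{n(j,k,l)\}^\gamma \to 0$ for any $1\le l\le L_{jk}$, $(j,k) \in D$, as $m \to \infty$.

We also need the following additional assumption.

A6. $m^{-1}\sum_{i=1}^{m} \dot\mu_i' V_i^{-1}\dot\mu_i$ is bounded away from zero in probability.

Let $\tilde\beta$ be the solution to (2.29), where the $V_i$'s are the true covariance matrices. Note that $\tilde\beta$ is efficient, or optimal, in the sense discussed earlier (see Section 2.2) but is not computable, unless the true $V_i$'s are known.

Theorem 2.3. Under assumptions A1, A2′, A3, A4, A5′ and A6, we have $\sqrt{m}(\hat\beta^* - \tilde\beta) \to 0$ in probability. Thus, asymptotically, $\hat\beta^*$ is as efficient as $\tilde\beta$.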

Note. The proof of Theorem 2.3 also reveals the following asymptotic expansion,

\[ \hat\beta^* - \beta = \left(\sum_{i=1}^{m} \dot\mu_i' V_i^{-1}\dot\mu_i\right)^{-1}\sum_{i=1}^{m} \dot\mu_i' V_i^{-1}(Y_i - \mu_i) + \frac{o_P(1)}{\sqrt{m}}, \tag{2.52} \]

where $o_P(1)$ represents a term that converges to zero (vector) in probability (e.g., Jiang (2010), sec. 3.4). By Theorem 2.3, (2.52) also holds with $\hat\beta^*$ replaced by $\tilde\beta$, even though the latter is typically not computable.

Example 2.10 (Real-data example). We use a real-data example to illustrate the convergence of IEE, or I-WLS (see Example 2.8) on this occasion. Jiang et al. (2007) analyzed a data set from Hand and Crowder (1996) regarding hip replacements of thirty patients (also see Jiang (2007), sec. 1.7.2). Each patient was measured four times, once before the operation and three times after, for hematocrit, TPP, vitamin E, vitamin A, urinary zinc, plasma zinc, hydroxyprolene (in milligrams), hydroxyprolene (index), ascorbic acid, carotine, calcium, and plasma phosphate (12 variables). An important feature of the data is that there is a considerable amount of missing observations. In fact, most of the patients have at least one missing observation for all 12 measured variables. As a result, the observational times are very different for different patients.

Two of the variables are considered: hematocrit and calcium. The first variable was considered by Hand and Crowder (1996), who used the data to assess age, sex, and time differences. The authors assumed an equicorrelated model and obtained Gaussian estimates of regression coefficients and variance components (i.e., MLE under normality). Here we take a robust approach without assuming a parametric covariance structure. The covariates consist of the same variables as suggested by Hand and Crowder (1996). The variables include an intercept, sex, occasion dummy variables (three), sex by occasion interaction dummy variables (three), age, and age by sex interaction. For the hematocrit data, the I-WLS algorithm converged in seven (7) iterations. Here the criterion for the convergence is that the Euclidean distance between consecutive updates of the parameters is less than $10^{-5}$. For the calcium data, the I-WLS converged in thirteen (13) iterations. We shall revisit this example in Subsection 2.6.1.
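The I-WLS iteration of Example 2.8 can be sketched on simulated data. This is a minimal illustration and not the analysis reported in Example 2.10: it assumes balanced, complete data (no missing observations), and the function name `iwls`, the simulated covariates, and the true parameter values are hypothetical. The stopping rule is the Euclidean-distance criterion with tolerance $10^{-5}$ quoted above.

```python
import numpy as np

def iwls(X_list, y_list, tol=1e-5, max_iter=100):
    """Iterative WLS: alternate the BLUE step (2.30) with a common V_0
    and the moment step (2.42), starting from V_0 = I_k."""
    k = X_list[0].shape[0]
    V0 = np.eye(k)
    beta = None
    for n_iter in range(1, max_iter + 1):
        V0_inv = np.linalg.inv(V0)
        A = sum(X.T @ V0_inv @ X for X in X_list)
        b = sum(X.T @ V0_inv @ y for X, y in zip(X_list, y_list))
        beta_new = np.linalg.solve(A, b)              # update of beta
        resid = np.stack([y - X @ beta_new
                          for X, y in zip(X_list, y_list)])
        V0 = resid.T @ resid / len(y_list)            # update of V_0
        if beta is not None and np.linalg.norm(beta_new - beta) < tol:
            return beta_new, V0, n_iter
        beta = beta_new
    raise RuntimeError("I-WLS did not converge within max_iter iterations")

# Hypothetical simulated data: intercept, time, and a binary subject covariate.
rng = np.random.default_rng(1)
m, k = 400, 4
beta_true = np.array([1.0, 0.5, -0.3])
t = np.arange(k, dtype=float)
X_list, y_list = [], []
for _ in range(m):
    g = float(rng.integers(0, 2))
    X = np.column_stack([np.ones(k), t, np.full(k, g)])
    y = X @ beta_true + rng.normal() + rng.normal(size=k)
    X_list.append(X)
    y_list.append(y)

beta_hat, V0_hat, n_iter = iwls(X_list, y_list)
```

Handling subject-dependent observation times, as in the hip-replacement data, would require estimating each $v(j,k,l)$ from the subjects that contribute to it, as in (2.41); that extension is not attempted here.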

2.4 Robust estimation in GLMM

So far the main objective of GEE, or IEE, has been the analysis of longitudinal data, in which estimation of the mean function is of primary interest. In many cases, however, there are interests in estimating the variance components. Furthermore, there are situations of correlated data of which the covariance structure is more complicated than that of the longitudinal data, such as Example 2.2. We need to extend the GEE idea so that the method can be applied to such cases.

In Section 2.1, we considered the MM method which requires minimum distributional assumptions about the data, and produces consistent estimators. However, the MM estimator is inefficient (e.g., Jiang (1998a)). Our goal is therefore two-fold: On the one hand, we need to improve the efficiency of MM; on the other hand, we wish to maintain weak distributional assumptions so that the method is relatively robust to violations of these assumptions.

To do so, let us first consider an extension of GLMM. Recall that it is assumed in a GLM [McCullagh and Nelder (1989)] that the distribution of the response is a member of a known exponential family. Thus, for a linear model to fit within the GLM, one needs to assume that the distribution of the response is normal. However, the definition of a linear model does not have to require normality, and many of the methods, such as the least squares, developed in linear models do not require the normality assumption. Thus, in a way, GLM has not fully extended the linear model. In view of this, we consider a broader class of models than the GLMM, in which the form of the conditional distribution, such as the exponential family, is not required.

The method can be described under an even broader framework. Suppose that, given a vector $\alpha = (\alpha_k)_{1\le k\le m}$ of random effects, responses $y_1, \dots, y_N$ are conditionally independent such that

\[ E(y_i|\alpha) = h(\xi_i), \tag{2.53} \]

\[ \mathrm{var}(y_i|\alpha) = a_i(\phi)v(\eta_i), \tag{2.54} \]

where $h(\cdot)$, $v(\cdot)$, and $a_i(\cdot)$ are known functions, $\phi$ is a dispersion parameter,

\[ \xi_i = x_i'\beta + z_i'\alpha, \tag{2.55} \]

where $\beta$ is a vector of unknown fixed effects, and $x_i = (x_{ij})_{1\le j\le p}$, $z_i = (z_{ik})_{1\le k\le m}$ are known vectors. Finally, assume that

\[ \alpha \sim F_\theta, \tag{2.56} \]

where $F_\theta$ is a multivariate distribution known up to a vector $\theta = (\theta_r)_{1\le r\le q}$ of dispersion parameters, or variance components. Note that we do not

require that the conditional density of $y_i$ given $\alpha$ is a member of the exponential family; instead, only up to second conditional moments are specified, by (2.53) and (2.54). In fact, as will be seen, to obtain consistent estimators, only (2.53) is needed.

We now consider estimation under the extended GLMM by extending the idea described at the beginning of Section 2.2, where the vector of base statistics, $S$, satisfies the following:

(i) The mean of $S$ is a known function of $\psi$.
(ii) The covariance matrix of $S$ is a known function of $\psi$, or at least is consistently estimable.
(iii) Certain smoothness and regularity conditions hold.

Condition (iii) of the above requirement is a bit vague at this point, but it will be specified later when we discuss asymptotic theory. Let the dimension of $\theta$ and $S$ be $r$ and $N$, respectively. If only (i) is assumed, an estimator of $\theta$ may be obtained by solving equation (2.23). In fact, this is what we were doing in Section 2.1, where the base statistics are chosen as

\[ S_j = \sum_{i=1}^{N} w_i x_{ij} y_i, \quad 1\le j\le p, \]

\[ S_{p+k} = \sum_{s\ne t} w_s w_t z_{sk} z_{tk} y_s y_t, \quad 1\le k\le m. \tag{2.57} \]

In fact, suppose that $Z = (z_{ik})_{1\le i\le N,\,1\le k\le m} = (Z_1 \cdots Z_q)$, where each $Z_r$ is an $N\times m_r$ standard design matrix in that it consists of zeros and ones; there is exactly one 1 in each row, and at least one 1 in each column. Then, by choosing $B = \mathrm{diag}(I_p, 1_{m_1}', \dots, 1_{m_q}')$, one obtains the MM equations of Section 2.1 that lead to the MSM estimator. Note that, here, $B$ is a constant matrix. In general, we call the solution to (2.23) with a given constant matrix $B$ a first-step estimator.

On the other hand, according to the discussion in the early part of Section 2.2, the optimal $B$, in the sense of minimizing the asymptotic covariance matrix, (2.27), is $U'V^{-1}$. Unfortunately, the optimal $B$ depends on $\theta$, which is exactly what we wish to estimate. Our approach is to replace the $\theta$ involved in the optimal $B$ by $\tilde\theta$, a first-step estimator. This leads to the second-step estimator, denoted by $\hat\theta$, obtained by solving

\[ \tilde BS = \tilde Bu(\theta), \tag{2.58} \]

where $\tilde B = U'V^{-1}|_{\theta=\tilde\theta}$. It can be shown (Jiang and Zhang (2001)) that, under suitable conditions, the second-step estimator is consistent and asymptotically efficient in the sense that its asymptotic covariance matrix is the

Table 2.2 Simulation results: mixed logistic model

Method of     Estimator of μ        Estimator of σ        Overall
Estimation    Mean   Bias   SD      Mean   Bias   SD      MSE
1st-step      .21    .01    .16     .98    −.02   .34     .15
2nd-step      .19    −.01   .16     .98    −.02   .24     .08

same as that of the solution to the optimal estimating equation, that is, (2.28). The following examples show that the second-step estimators can be considerably more efficient than the first-step ones.

Example 2.11. (Mixed logistic model) Consider an extension of Example 2.1 by allowing the number of binary responses to be different for different subjects, that is, replacing $k$ by $k_i$. The rest of the assumptions remain the same. It is easy to see that (2.57) reduce to $y_{\cdot\cdot}$ and $y_{i\cdot}^2 - y_{i\cdot}$, $1\le i\le m$, where $y_{i\cdot} = \sum_{j=1}^{n_i} y_{ij}$ and $y_{\cdot\cdot} = \sum_{i=1}^{m} y_{i\cdot}$. If $k_i = k$, $1\le i\le m$, that is, we are in the situation of Example 2.1, also known as balanced data, the first-step estimating equations can be shown to be equivalent to the second-step ones (Exercise 2.14). In other words, in the case of balanced data, there is no gain by doing the second-step, and the first-step estimators are already optimal. However, when the data are unbalanced, the first- and second-step estimators are no longer equivalent, and there is a real gain by doing the second-step. To see this, a simulation was carried out (Jiang and Zhang (2001)), in which $m = 100$, $n_i = 2$, $1\le i\le 50$, and $n_i = 6$, $51\le i\le 100$. The true parameters were chosen as $\mu = 0.2$ and $\sigma = 1.0$. The results based on 1000 simulations are summarized in Table 2.2, where SD represents the simulated standard deviation, and the overall MSE is the MSE of the estimator of $\mu$ plus that of the estimator of $\sigma$. Overall, there is about a 43% reduction of the MSE of the second-step estimators over the first-step ones.

Because the first- and second-step estimators are developed under the assumption of the extended GLMM, the methods apply to some situations beyond (the classical) GLMM. The following is an example.

Example 2.3 (continued). Consider an extension of the beta-binomial example of Example 2.3. It can be shown (Exercise 2.15) that $E(Y) = k\pi$ and $\mathrm{Var}(Y) = \phi k\pi(1-\pi)$, where $\phi = (k+1)/2$. It is seen that the mean function under beta-binomial$(k, \pi)$ is the same as that of binomial$(k, \pi)$, but the variance function is different. In other words, there is an overdispersion.

Now, suppose that, given the random effects $\alpha_i$, $1\le i\le m$, which are independent and distributed as $N(0, \sigma^2)$, responses $y_{ij}$, $1\le i\le m$,

Table 2.3 Simulation results: beta-binomial mixed model

Method of     Estimation of μ       Estimation of σ       Overall
Estimation    Mean   Bias   SD      Mean   Bias   SD      MSE
1st-step      .25    .05    .25     1.13   .13    .37     .22
2nd-step      .25    .05    .26     1.09   .09    .25     .14

$1\le j\le n_i$ are independent and distributed as beta-binomial$(k, \pi_i)$, where $\pi_i = h(\mu + \alpha_i)$ with $h(x) = e^x/(1+e^x)$. Note that this is not a GLMM under the classical definition, because the conditional distribution of $y_{ij}$ is not a member of the exponential family. However, the model falls within the extended GLMM, because

\[ E(y_{ij}|\alpha) = k\pi_i, \tag{2.59} \]

\[ \mathrm{var}(y_{ij}|\alpha) = \phi k\pi_i(1-\pi_i). \tag{2.60} \]

If only (2.59) is assumed, one may obtain the first-step estimator of $(\mu, \sigma)$, for example, by choosing $B = \mathrm{diag}(1, 1_m')$. If, in addition, (2.60) is assumed, one may obtain the second-step estimator. To see how much difference there is between the two, a simulation study was carried out with $m = 40$ (Jiang and Zhang (2001)). Again, an unbalanced situation was considered: $n_i = 4$, $1\le i\le 20$ and $n_i = 8$, $21\le i\le 40$. We took $k = 2$, and the true parameters $\mu = 0.2$ and $\sigma = 1.0$. The results based on 1000 simulations are summarized in Table 2.3. Overall, we see about 36% improvement in MSE of the second-step estimators over the first-step ones.

The improvements of the second-step estimators over the first-step ones in the preceding examples are not incidental. We now discuss an asymptotic theory related to this improvement. First, we specify condition (iii) of the requirements for the base statistics stated above (2.57). The results established here are actually more general than extended GLMM.

Let the responses be $y_1, \dots, y_N$, whose distribution depends on a parameter vector, $\theta$. Let $\Theta$ be the parameter space. First, note that $B$, $S$, and $u(\theta)$ in (2.58) may depend on $N$, the sample size; hence in this subsection we use the notation $B_N$, $S_N$, and $u_N(\theta)$. Also, the solution to (2.58) is unchanged when $B_N$ is replaced by $C_N^{-1}B_N$, where $C_N = \mathrm{diag}(c_{N,1}, \dots, c_{N,r})$, $c_{N,j}$ is a sequence of positive constants, $1\le j\le r$, and $r$ is the dimension of $\theta$. Write $M_N = C_N^{-1}B_NS_N$, and $M_N(\theta) = C_N^{-1}B_Nu_N(\theta)$. Then the first-step estimator, $\tilde\theta = \tilde\theta_N$, is the solution to the equation

\[ M_N(\theta) = M_N. \tag{2.61} \]

Consider $M_N(\cdot)$ as a map from $\Theta$ to a subset of $R^r$. Let $\theta$ denote the true $\theta$ everywhere except when defining a function of $\theta$, and $M_N(\Theta)$ be the image

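of $\Theta$ under $M_N(\cdot)$. For $x\in R^r$ and $A\subset R^r$, define $d(x,A)=\inf_{y\in A}|x-y|$. Obviously, $M_N(\theta)\in M_N(\Theta)$. Furthermore, if $M_N(\theta)$ is in the interior of $M_N(\Theta)$, we have $d(M_N(\theta),M_N^c(\Theta))>0$. In fact, the latter essentially ensures the existence of the solution to (2.61), as shown by the following theorem. The proof is given in Jiang and Zhang (2001).

Theorem 2.4. Suppose that, as $N\to\infty$,

$$M_N-M_N(\theta)\longrightarrow 0 \tag{2.62}$$

in probability, and

$$\liminf d\{M_N(\theta),M_N^c(\Theta)\}>0. \tag{2.63}$$

Then, with probability tending to one, the solution to (2.61) exists and is in $\Theta$. If, in addition, there is a sequence $\Theta_N\subset\Theta$ such that

$$\liminf\ \inf_{\theta_*\notin\Theta_N}|M_N(\theta_*)-M_N(\theta)|>0, \tag{2.64}$$
$$\liminf\ \inf_{\theta_*\in\Theta_N,\ \theta_*\ne\theta}\frac{|M_N(\theta_*)-M_N(\theta)|}{|\theta_*-\theta|}>0, \tag{2.65}$$

then any solution $\tilde\theta_N$ to (2.61) is consistent.

The lemmas below give sufficient conditions for (2.62)–(2.65). The proofs are fairly straightforward. Let $V_N$ denote the covariance matrix of $S_N$.

Lemma 2.1. (2.62) holds provided that, as $N\to\infty$, $\mathrm{tr}(C_N^{-1}B_NV_NB_N'C_N^{-1})\longrightarrow 0$.

Lemma 2.2. Suppose that there is a vector-valued function $M_0(\theta)$ such that $M_N(\theta)\to M_0(\theta)$ as $N\to\infty$. Furthermore, suppose that there exist $\epsilon>0$ and $N_\epsilon\ge 1$ such that $y\in M_N(\Theta)$ whenever $|y-M_0(\theta)|<\epsilon$ and $N\ge N_\epsilon$. Then (2.63) holds. In particular, if $M_N(\theta)$ does not depend on $N$, that is, $M_N(\theta)=M(\theta)$, say, then (2.63) holds provided that $M(\theta)$ is in the interior of $M(\Theta)$, the image of $M(\cdot)$.

Lemma 2.3. Suppose that there are continuous functions $f_j(\cdot)$, $g_j(\cdot)$, $1\le j\le r$, such that $f_j\{M_N(\theta)\}\to 0$ if $\theta\in\Theta$ and $\theta_j\to-\infty$, $g_j\{M_N(\theta)\}\to 0$ if $\theta\in\Theta$ and $\theta_j\to\infty$, $1\le j\le r$, uniformly in $N$. If, as $N\to\infty$, $\limsup|M_N(\theta)|<\infty$ and $\liminf\min[|f_j\{M_N(\theta)\}|,|g_j\{M_N(\theta)\}|]>0$, $1\le j\le r$, then (2.64) holds with $\Theta_N=\Theta_0$, a compact subset of $\Theta$.

Write $U_N=\partial u_N/\partial\theta'$. Let $H_{N,j}(\theta)=\partial^2u_{N,j}/\partial\theta\,\partial\theta'$, where $u_{N,j}$ is the $j$th component of $u_N(\theta)$, and $H_{N,j,\epsilon}=\sup_{|\theta_*-\theta|\le\epsilon}\|H_{N,j}(\theta_*)\|$, $1\le j\le L_N$, where $L_N$ is the dimension of $u_N$.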

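Lemma 2.4. Suppose that $M_N(\cdot)$ is twice continuously differentiable, and that

$$\liminf\ \lambda_{\min}(U_N'B_N'C_N^{-2}B_NU_N)>0,$$

and there is $\epsilon>0$ such that

$$\limsup\ \frac{\max_{1\le i\le r}c_{N,i}^{-2}\left(\sum_{j=1}^{L_N}|b_{N,ij}|H_{N,j,\epsilon}\right)^2}{\lambda_{\min}(U_N'B_N'C_N^{-2}B_NU_N)}<\infty,$$

where $b_{N,ij}$ is the $(i,j)$ element of $B_N$. Furthermore, suppose that, for any compact subset $\Theta_1\subset\Theta$ such that $d(\theta,\Theta_1)>0$, we have

$$\liminf\ \inf_{\theta_*\in\Theta_1}|M_N(\theta_*)-M_N(\theta)|>0.$$

Then (2.65) holds for $\Theta_N=\Theta_0$, where $\Theta_0$ is any compact subset of $\Theta$ that includes $\theta$ as an interior point.

Once again, we consider a specific example.

Example 2.1 (continued). As noted (see Example 2.11; also Exercise 2.14), in this case, the first- and second-step estimators of $\theta=(\mu,\sigma)'$ are the same, and they both correspond to $B_N=\mathrm{diag}(1,1_m')$. It can be shown, by choosing $C_N=\mathrm{diag}\{mk,mk(k-1)\}$, that all of the conditions of Lemmas 2.1–2.4 are satisfied.

We now consider the asymptotic normality of the first-step estimator. We say that an estimator $\tilde\theta_N$ is asymptotically normal with mean $\theta$ and asymptotic covariance matrix $(\Gamma_N'\Gamma_N)^{-1}$ if $\Gamma_N(\tilde\theta_N-\theta)\longrightarrow N(0,I_r)$ in distribution, where $r=\dim(\theta)$. Here it is understood that $\Gamma_N$ is $r\times r$ and non-singular. Let

$$\lambda_{N,1}=\lambda_{\min}(C_N^{-1}B_NV_NB_N'C_N^{-1}),\qquad \lambda_{N,2}=\lambda_{\min}\{U_N'B_N'(B_NV_NB_N')^{-1}B_NU_N\}.$$

Theorem 2.5. Suppose that (a) the components of $u_N(\theta)$ are twice continuously differentiable; (b) $\tilde\theta_N$ satisfies (2.61) with probability tending to one and is consistent; (c) there exists $\epsilon>0$ such that

$$\frac{|\tilde\theta_N-\theta|}{(\lambda_{N,1}\lambda_{N,2})^{1/2}}\max_{1\le i\le r}c_{N,i}^{-1}\left(\sum_{j=1}^{L_N}|b_{N,ij}|H_{N,j,\epsilon}\right)\longrightarrow 0$$

in probability; and (d)

$$\{C_N^{-1}B_NV_NB_N'C_N^{-1}\}^{-1/2}[M_N-M_N(\theta)]\longrightarrow N(0,I_r)$$

in distribution. Then $\tilde\theta_N$ is asymptotically normal with mean $\theta$ and asymptotic covariance matrix

$$(B_NU_N)^{-1}B_NV_NB_N'(U_N'B_N')^{-1}. \tag{2.66}$$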
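To make the sandwich form (2.66) concrete, the following is a minimal sketch for the simplest case of a single parameter ($r=1$), where $B_N$ is a row vector of weights, $U_N$ a column vector of derivatives, and $V_N$ is assumed diagonal. The function name `sandwich_variance` and all numerical values are hypothetical, chosen only for illustration. The sketch compares equal weights with the weights $B_N=U_N'V_N^{-1}$.

```python
def sandwich_variance(b, u, v):
    """Scalar version of (2.66): (BU)^{-1} B V B' (U'B')^{-1}.

    b: weights (the 1 x L row vector B)
    u: derivatives (the L x 1 vector U)
    v: variances of the base statistics (diagonal of V; independence assumed)
    """
    bu = sum(bj * uj for bj, uj in zip(b, u))          # B U
    bvb = sum(bj * bj * vj for bj, vj in zip(b, v))    # B V B'
    return bvb / (bu * bu)


# Hypothetical values, for illustration only.
u = [1.0, 2.0, 0.5]
v = [1.0, 4.0, 0.25]

equal_weights = [1.0, 1.0, 1.0]
optimal_weights = [uj / vj for uj, vj in zip(u, v)]    # B = U'V^{-1}

var_equal = sandwich_variance(equal_weights, u, v)
var_optimal = sandwich_variance(optimal_weights, u, v)
# (U'V^{-1}U)^{-1}, the value (2.66) takes when B = U'V^{-1}
lower_bound = 1.0 / sum(uj * uj / vj for uj, vj in zip(u, v))

print(var_equal, var_optimal, lower_bound)
```

In this toy setting the weighted choice attains $(U'V^{-1}U)^{-1}$, while the equal-weight choice gives a larger variance; by the Cauchy–Schwarz inequality no other weights can do better in the scalar case.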

Sufficient conditions for existence, consistency, and asymptotic normality of the second-step estimators can be obtained by replacing the conditions of Theorems 2.4 and 2.5 by the corresponding conditions with a probability statement. For example, let $\xi_N$ be a sequence of nonnegative random variables. We say that $\liminf\xi_N>0$ with probability tending to one if for any $\epsilon>0$ there is $\delta>0$ such that $P(\xi_N>\delta)\ge 1-\epsilon$ for all sufficiently large $N$. Note that this is equivalent to $\xi_N^{-1}=O_P(1)$. Then, (2.64) is replaced by (2.64) with probability tending to one. Note that the asymptotic covariance matrix of the second-step estimator is given by (2.66) with $B_N=U_N'V_N^{-1}$, which is $(U_N'V_N^{-1}U_N)^{-1}$. This is the same as the asymptotic covariance matrix of the solution to (2.58), or (2.61), with the optimal $B$ ($B_N$). In other words, the second-step estimator is asymptotically optimal. See Jiang and Zhang (2001) for details.

2.5 Robust GEE

So far the focus has been on robustness to violations of distributional assumptions. An issue that has not been addressed is robustness to outliers. As a motivating example, Thall and Vail (1990) analyzed data from a clinical trial involving 59 epileptics. These patients were randomized to a new drug (treatment) or a placebo (control). The number of epileptic seizures was recorded for each patient during an eight-week period, namely, one seizure count during the two-week period before each of four clinic visits. Baseline seizures and the patient's age were available and treated as covariates. As noted by Thall and Vail (1990) [also see Diggle et al. (2002), pp. 188-189], patients #112, 207, 225, and 227 are possible "outliers"; however, there was no clinical basis for excluding these patients from the analysis. It is therefore of interest to carry out the analysis using a method that is robust to potential outliers.

For the $i$th subject, let $y_{ij}$ denote the response collected at time $t_{ij}$, $1\le j\le n_i$. In a way, the GEE method is closely related to the LS in that the left side of (2.29) may be viewed as a weighted sum of residuals, $y_i-\mu_i$, $1\le i\le m$. However, the residuals are known to be sensitive to heavy-tailed distributions, contaminated data, and outliers. In the case of independent observations, such as linear regression, a standard approach to obtain estimators that are robust to outliers is to use M-estimation that relies on a dispersion function that varies more slowly than the square
function, which corresponds to the LS (Huber (1981)). Suppose that

$$y_{ij}=\mu_{ij}(\beta)+\sigma e_{ij}, \tag{2.67}$$

where $\mu_{ij}$ is a mean function that depends on a vector $\beta$ of unknown parameters, $\sigma$ is a scale parameter, and $e_{ij}$ is a random error. In the context of semiparametric models, He et al. (2002) proposed to estimate the parameters by minimizing

$$\sum_{i=1}^m\sum_{j=1}^{n_i}\rho(u_{ij}), \tag{2.68}$$

where $u_{ij}=(y_{ij}-\mu_{ij})/\sigma$, which depends on the parameters, and $\rho(\cdot)$ is a convex function such that $\rho(u)\ge\rho(0)=0$. If $\rho(u)=u^2$, the procedure is the same as LS; if $\rho(u)=|u|$, the procedure corresponds to median regression [e.g., Huber (1981)]. The expression (2.68) may be viewed as a loss function, which is the sum of the dispersion functions for the individuals. This is reasonable if the $y_{ij}$'s are independent. When the responses are correlated, one would need a multivariate dispersion function to take the correlations into account. For example, with $u_i=(u_{ij})_{1\le j\le n_i}$, one may consider $\rho(u_i)=\sum_{j=1}^{n_i}u_{ij}^2$. The latter is appropriate if the within-subject correlations are the same; otherwise, weights should be considered. Theoretically speaking, the most efficient multivariate dispersion function can be obtained by considering the negative log-likelihood, which intrinsically incorporates the possible correlations. But such an approach requires specification of the joint likelihood function, which may be difficult in non-Gaussian situations, and is not robust to misspecification of the distribution. Below we take a different approach.

A fundamental idea in M-estimation is that the central tendency is defined possibly differently than traditionally. Namely, we assume that $E\{\psi(e_{ij})\}=0$ instead of $E(e_{ij})=0$, where $\psi(u)=\rho'(u)$, and a subderivative will be used when the derivative does not exist. For example, in the case of $\rho(u)=|u|$, we have $\psi(u)=\mathrm{sign}(u)$. The latter corresponds to median regression [e.g., Jung (1996), He et al. (2002)]. Another example is when $e_{ij}$ has the standard Cauchy distribution. In this case, the loss function $\rho(u)=\log(1+u^2)$ actually corresponds to the maximum likelihood estimation for independent data (Exercise 2.16). Let $\psi(u_i)$ denote the vector $[\psi(u_{ij})]_{1\le j\le n_i}$. A robust version of the GEE equation, (2.29), is

$$\sum_{i=1}^m\frac{\partial\mu_i'}{\partial\beta}Q_i^{-1}\psi(u_i)=0, \tag{2.69}$$
where $Q_i=\mathrm{Var}\{\psi(e_i)\}$ and $e_i$ is defined the same way as $u_i$ but with $u_{ij}$ replaced by $e_{ij}$. Equation (2.69) is motivated by the same optimality theory discussed at the beginning of Section 2.2. As the true $Q_i$ is typically unknown, one may replace it by a working covariance matrix, as in GEE (Liang and Zeger (1986)). It is more convenient to use a working inverse covariance matrix (WICM), $W_i$, for $Q_i^{-1}$. In particular, if $Q_i=\tau^2I_{n_i}$, $1\le i\le m$, for some positive constant $\tau$, (2.69) follows from the usual practice of minimizing (2.68), that is, by differentiation and setting the derivatives equal to zero.

The next simplest working covariance matrix is, perhaps, the so-called equicorrelated (EQC) structure, which is closely related to a mixed effects model. Suppose that $\psi(e_{ij})$ can be expressed as

$$\psi(e_{ij})=\alpha_i+\epsilon_{ij}, \tag{2.70}$$

where $\alpha_i$ is a random effect that has mean 0 and variance $\sigma_\alpha^2$, and $\epsilon_{ij}$ is an additional error that has mean 0 and variance $\sigma_\epsilon^2$. Assume that the random effects and errors are uncorrelated. Then, it is easy to show that $\mathrm{var}\{\psi(e_{ij})\}=\sigma_\alpha^2+\sigma_\epsilon^2$ and $\mathrm{cov}\{\psi(e_{ij}),\psi(e_{ik})\}=\sigma_\alpha^2$, if $j\ne k$. Thus, we have $\mathrm{cor}\{\psi(e_{ij}),\psi(e_{ik})\}=\sigma_\alpha^2/(\sigma_\alpha^2+\sigma_\epsilon^2)$, which is a constant across all of the subjects and time points. It follows that $Q_i=\sigma_\epsilon^2I_{n_i}+\sigma_\alpha^2J_{n_i}$, where $I_n$ and $J_n$ denote the $n\times n$ identity matrix and matrix of 1's. Thus, the WICM has the expression (Exercise 2.17)

$$W_i=Q_i^{-1}=\frac{1}{\sigma_\epsilon^2}\left(I_{n_i}-\frac{\sigma_\alpha^2}{\sigma_\epsilon^2+n_i\sigma_\alpha^2}J_{n_i}\right). \tag{2.71}$$

Given the WICM, $W_i$, $1\le i\le m$, one can solve

$$\sum_{i=1}^m\frac{\partial\mu_i'}{\partial\beta}W_i\psi(u_i)=0 \tag{2.72}$$

to obtain an estimator of $\beta$. Under suitable conditions, the solution to (2.72) is consistent (Wang et al. (2015)). However, the WICM may be very different from the optimal choice, which is the true $Q_i^{-1}$. As a result, the solution to (2.72) may be inefficient. In addition, the scale parameter, $\sigma$ in (2.67), can sometimes cause a problem, in that one cannot simply solve (2.72) without knowing $\sigma$. To solve these problems, Wang et al. (2015) suggested supplementing (2.72) with a second set of estimating equations to jointly estimate $\beta$ and a vector $\gamma$ of dispersion parameters including those involved in $Q_i$ and possibly $\sigma$. More specifically, the authors suggest that the second set of equations has the form

$$U(\gamma;\beta)=\sum_{i=1}^mf_i(y_i,\beta)-g(\gamma)=0, \tag{2.73}$$
where $f_i(y_i,\beta)=[f_{i,s}]_{1\le s\le q}$, $q=\dim(\gamma)$, and $g(\gamma)=[g_s(\gamma)]_{1\le s\le q}$ with

$$g_s(\gamma)=\inf_\beta\sum_{i=1}^mE\{f_{i,s}(y_i,\beta)\}.$$

The idea is to iterate between (2.72) and (2.73); namely, starting with some initial WICM, one solves (2.72) to update $\beta$; then, one replaces the $\beta$ in (2.73) by its update, and solves the latter equation for $\gamma$; one then uses the latest $\gamma$ to obtain the estimated $Q_i$, $1\le i\le m$, and solves (2.72) again to update $\beta$, and so on. Wang et al. (2015) showed that, under some regularity conditions, the iterative algorithm has a convergence property similar to that of the IEE (see Section 2.3). Furthermore, the limiting estimator of $\beta$ has an asymptotic efficiency similar to that of the IEEE, in that its asymptotic covariance matrix is the same as that of the solution to (2.69) with the true $Q_i$'s.

We now give a few examples of special cases of (2.73).

Example 2.12. For any $i,s$, let $a_{i,s}\in R^{n_i}$, and

$$f_{i,s}(y_i,\beta)=a_{i,s}'\{y_i-\mu_i(\beta)\}\{y_i-\mu_i(\beta)\}'a_{i,s}.$$

Then, it is easy to show that

$$E\{f_{i,s}(y_i,\beta)\}=a_{i,s}'\Sigma_ia_{i,s}+[a_{i,s}'\{\mu_i(\beta)-\mu_i(\beta_0)\}]^2, \tag{2.74}$$

where $\Sigma_i=\mathrm{Var}(y_i)$, the (true) covariance matrix of $y_i$. Obviously, (2.74) is minimized when $\beta=\beta_0$; hence, in (2.73), one has $g_s(\gamma)$ equal to the first term on the right side of (2.74), summed over $1\le i\le m$, where $\gamma$ is the vector of whatever parameters are involved in $\Sigma_i$, $1\le i\le m$.

To consider the next example, let us first introduce a lemma. The proof is left as an exercise (Exercise 2.18). Let $\mathcal F$ denote the set of continuous functions $f$ that are nonnegative, even and unimodal, and satisfy $f(x)\to 0$ as $|x|\to\infty$.

Lemma 2.5. For any $f,g\in\mathcal F$ and $\mu\in(-\infty,\infty)$, we have

$$\int f(x-\mu)g(x)\,dx\le\int f(x)g(x)\,dx.$$

Example 2.13. Suppose that there is a constant $c>0$ such that $c-|\psi(u)|\in\mathcal F$. One example of such a $\psi$ is Huber's function, defined as $\psi(u)=u$ if $|u|\le c$; $\psi(u)=-c$ if $u\le-c$, and $\psi(u)=c$ if $u\ge c$. Furthermore, suppose that the pdf of $y_{ij}$, $f_{ij}$, is continuous, unimodal, and symmetric about $\mu_{ij}=\mu_{ij}(\beta_0)$. For any positive integer $s$, let $f(u)=c^s-|\psi(u)|^s$, and $g(u)=f_{ij}(u+\mu_{ij})$. It is easy to show that $f,g\in\mathcal F$
(Exercise 2.19). Thus, by Lemma 2.5, we have

$$c^s-E\{|\psi(y_{ij}-\mu_{ij}(\beta))|^s\}=E\{f(y_{ij}-\mu_{ij}(\beta))\}=\int f(x-\mu_{ij}(\beta))f_{ij}(x)\,dx$$
$$=\int f(x-\mu)g(x)\,dx\quad[\mu=\mu_{ij}(\beta)-\mu_{ij}(\beta_0)]$$
$$\le\int f(x)g(x)\,dx=c^s-E\{|\psi(y_{ij}-\mu_{ij}(\beta_0))|^s\},$$

implying $E\{|\psi(y_{ij}-\mu_{ij}(\beta))|^s\}\ge E\{|\psi(y_{ij}-\mu_{ij}(\beta_0))|^s\}$ for any $\beta$. Thus, if one defines $f_{i,s}(y_i,\beta)=\sum_{j=1}^{n_i}|\psi(y_{ij}-\mu_{ij}(\beta))|^s$, one has

$$g_s(\gamma)=\sum_{i=1}^m\sum_{j=1}^{n_i}E\{|\psi(y_{ij}-\mu_{ij})|^s\}, \tag{2.75}$$

where $\gamma$ is the vector of whatever parameters one needs to know in order to compute the right side of (2.75).

2.6 Real-data examples

We use three real-data examples to illustrate the methods discussed in this chapter. The first example is a continuation of an example discussed earlier on hip replacement. This example is used to further illustrate the IEE method. The second example concerns a well-known data set from a salamander-mating experiment. This example is used to demonstrate the robust estimation method for GLMM. The last example involves longitudinal data from an epileptic seizure study. This data set is used to illustrate the robust GEE method.

2.6.1 Hip replacement data revisited

In Example 2.10, we reported the convergence results for IEE in analyzing the hip replacement data. The results of the analyses for the hematocrit and calcium data are presented in Tables 2.4 and 2.5, respectively. The hematocrit data were also analyzed by Hand and Crowder (1996), whose Gaussian estimates are reported for comparison. The parameters correspond to, from left to right, intercept, sex, occasions (three), sex by occasion interaction (three), age, and age by sex interaction; the second row is estimated standard errors corresponding to the IEE, or I-WLS in this case, estimates.
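Returning briefly to Example 2.13, the inequality $E\{|\psi(y_{ij}-\mu_{ij}(\beta))|^s\}\ge E\{|\psi(y_{ij}-\mu_{ij}(\beta_0))|^s\}$ can be checked numerically. The sketch below is only an illustration under stated assumptions: $y$ is taken to be standard normal (so that $\mu_{ij}(\beta_0)=0$), $s=2$, the tuning constant $c=1.345$ is a conventional choice not taken from the text, and the expectation is approximated by the trapezoidal rule. The function names are hypothetical.

```python
import math


def huber_psi(u, c=1.345):
    """Huber's function: identity on [-c, c], clipped to -c or c outside."""
    return max(-c, min(c, u))


def normal_pdf(x):
    """Density of N(0, 1); symmetric and unimodal about 0."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def expected_abs_psi_power(shift, s=2, c=1.345, lo=-12.0, hi=12.0, n=24000):
    """Trapezoidal approximation of E|psi(y - shift)|^s for y ~ N(0, 1).

    `shift` plays the role of mu = mu_ij(beta) - mu_ij(beta_0).
    """
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0   # end points get half weight
        total += weight * abs(huber_psi(x - shift, c)) ** s * normal_pdf(x)
    return total * h


shifts = [0.0, 0.25, 0.5, 1.0, 2.0]
values = [expected_abs_psi_power(mu) for mu in shifts]
for mu, val in zip(shifts, values):
    print(f"shift = {mu:4.2f}   E|psi|^2 ~ {val:.4f}")
```

Under these assumptions the smallest value occurs at zero shift, in line with the conclusion that the expectation is minimized at $\beta=\beta_0$.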

Table 2.4  Estimates for hematocrit data

Coef.       β1       β2       β3       β4       β5
I-WLS       3.19     0.08     0.65     -0.34    -0.21
s.e.        0.39     0.14     0.06     0.06     0.07
Gaussian    3.28     0.21     0.65     -0.34    -0.21

Coef.       β6       β7       β8       β9       β10
I-WLS       0.12     -0.051   -0.051   0.033    -0.001
s.e.        0.06     0.061    0.066    0.058    0.021
Gaussian    0.12     -0.050   -0.048   0.019    -0.020

It is seen that the I-WLS estimates are similar to the Gaussian ones, especially for the parameters that are found significant. This is, of course, not surprising, because the Gaussian and I-WLS estimators should both be close to the BLUE, provided that the covariance model suggested by Hand and Crowder is correct (the authors believed that their method was valid in this case). Taking into account the estimated standard errors, we found the coefficients β1, β3, β4, β5, and β6 to be significant and the rest of the coefficients insignificant. This suggests that, for example, the recovery of hematocrit improves over time, at least for the period of measurement times. The findings are consistent with those of Hand and Crowder with the only exception of β6. Hand and Crowder considered jointly testing the hypothesis that β6 = β7 = β8 = 0 and found an insignificant result. In our case, the coefficients are considered separately, and we found β7 and β8 to be insignificant and β6 to be barely significant at the 5% level. However, because Hand and Crowder did not publish the individual standard errors, this does not necessarily imply a difference. The interpretation of the significance of β6, which corresponds to the interaction between sex and the first occasion, appears to be less straightforward.

As for the calcium data, the covariate variables are the same and listed in the same order. It is seen that, except for β1, β3, and β4, all the coefficients are not significant (at the 5% level). In particular, there seems to be no difference in terms of sex and age. Also, the recovery of calcium after the operation seems to be a little quicker than that of hematocrit, because β5 is no longer significant. Hand and Crowder (1996) did not analyze this data set.

2.6.2 Salamander-mating data

McCullagh and Nelder (1989, §14.5) presented data from mating experiments regarding two populations of salamanders, Rough Butt and Whiteside. These populations, which are geographically isolated from each other, are found in the southern Appalachian mountains of the eastern United
States. The question whether the geographic isolation had created barriers to the animals' interbreeding was thus of great interest to biologists studying speciation.

Table 2.5  Estimates for calcium data

Coef.       β1       β2       β3       β4       β5
I-WLS       20.1     0.93     1.32     -1.89    -0.13
s.e.        1.3      0.57     0.16     0.13     0.16

Coef.       β6       β7       β8       β9       β10
I-WLS       0.09     0.17     -0.15    0.19     -0.12
s.e.        0.16     0.13     0.16     0.19     0.09

Three experiments were conducted during 1986, one in the summer and two in the autumn. In each experiment there were 10 males and 10 females from each population. They were paired according to the design given by Table 14.3 in McCullagh and Nelder (1989). The same 40 salamanders were used for the summer and first autumn experiments. A new set of 40 animals was used in the second autumn experiment. For each pair, it was recorded whether a mating occurred, 1, or not, 0.

The responses are binary and clearly correlated. McCullagh and Nelder (1989) proposed the following mixed logistic model with crossed random effects, which was among the earliest, and arguably the most influential, examples in the literature of GLMM. For each experiment, let $u_i$ and $v_j$ be the random effects corresponding to the $i$th female and $j$th male involved in the experiment. Then, on the logistic scale, the probability of successful mating is modeled in terms of fixed effects $+\,u_i+v_j$. It was further assumed that the random effects are independent and normally distributed with means 0 and variances $\sigma^2$ for the females and $\tau^2$ for the males, respectively. Under these assumptions, a GLMM may be formulated as follows. Note that there are 40 different animals of each sex. Suppose that, given the random effects $u_1,\dots,u_{40}$ for the females, and $v_1,\dots,v_{40}$ for the males, the binary responses, $y_{ijk}$, are conditionally independent such that $\mathrm{logit}\{P(y_{ijk}=1\mid u,v)\}=x_{ij}'\beta+u_i+v_j$. Here $y_{ijk}$ represents the $k$th binary response corresponding to the same pair of $i$th female and $j$th male, $x_{ij}$ is a vector of fixed covariates, and $\beta$ is an unknown vector of regression coefficients. More specifically, $x_{ij}$ consists of an intercept; an indicator of Whiteside female, $\mathrm{WS}_f$; an indicator of Whiteside male, $\mathrm{WS}_m$; and the product $\mathrm{WS}_f\cdot\mathrm{WS}_m$, representing an interaction.

It should be pointed out that McCullagh and Nelder (1989) made a simplification by assuming that different animals are used in different experiments, and this assumption was followed by subsequent studies [e.g., Karim
and Zeger (1992), Lin and Breslow (1996), Jiang (1998a), Booth and Hobert (1999)]. Of course, this is not true in reality, because a group of 40 animals was used twice, in a summer and an autumn experiment. However, the situation gets more complicated if this assumption is dropped. This is because there may be serial correlations not explained by the animal-specific random effects. More specifically, the binary responses $y_{ijk}$ may not be conditionally independent given the random effects; as a result, the distribution of $y_{ijk}$ no longer follows a GLMM. Alternatively, one could pool the responses from the two experiments involving the same group of animals, as suggested by McCullagh and Nelder (1989, §4.1), so let $y_{ij\cdot}=y_{ij1}+y_{ij2}$, where $y_{ij1}$ and $y_{ij2}$ represent the responses from the summer and first autumn experiments, respectively, that involved the same group of salamanders. This may avoid the issue of conditional independence; however, a new problem emerges. This is because, given the female and male random effects, the conditional distribution of $y_{ij\cdot}$ is not an exponential family. Note that $y_{ij\cdot}$ is not necessarily binomial given the random effects, because of the potential serial correlation between $y_{ij1}$ and $y_{ij2}$ (see Example 2.3).

Due to such considerations, Jiang and Zhang (2001) considered an extended GLMM for the pooled responses. More specifically, let $y_{ij1}$ be the observed proportion of successful matings between the $i$th female and $j$th male in the summer and fall experiments that involved the same group of animals (so $y_{ij1}=0$, 0.5 or 1), and $y_{ij2}$ be the indicator of successful mating between the $i$th female and $j$th male in the last fall experiment that involved a new group of animals. It is assumed that, conditional on the random effects, $u_{k,i},v_{k,j}$, $k=1,2$, $i,j=1,\dots,20$, which are independent and normally distributed with mean 0 and variances $\sigma_f^2$ and $\sigma_m^2$, respectively, $y_{ijk}$, $(i,j)\in P$, $k=1,2$ are conditionally independent, where $P$ represents the set of pairs $(i,j)$ determined by the design; $u$ and $v$ represent the female and male, respectively; $1,\dots,10$ correspond to RB, and $11,\dots,20$ to WS. Furthermore, it was assumed that the conditional mean of the response given the random effects satisfies one of the two models below:

(i) (logit model) $E(y_{ijk}\mid u,v)=h_1(x_{ij}'\beta+u_{k,i}+v_{k,j})$, $(i,j)\in P$, $k=1,2$, where

$$x_{ij}'\beta=\beta_0+\beta_1\mathrm{WS}_f+\beta_2\mathrm{WS}_m+\beta_3\mathrm{WS}_f\times\mathrm{WS}_m, \tag{2.76}$$

and $h_1(x)=e^x/(1+e^x)$;

(ii) (probit model) same as (i) with $h_1(x)$ replaced by $h_2(x)=\Phi(x)$, where $\Phi(\cdot)$ is the cdf of $N(0,1)$.

Note that it is not assumed that the conditional distribution of $y_{ijk}$ given the random effects is a member of the exponential family. The authors then obtained the first-step estimators (see Section 2.4) of the parameters under both models.

Table 2.6  First-step estimates with standard errors

Mean Function    β0        β1        β2        β3        σf        σm
Logit            0.95      −2.92     −0.69     3.62      0.99      1.28
                 (0.55)    (0.87)    (0.60)    (1.02)    (0.59)    (0.57)
Probit           0.56      −1.70     −0.40     2.11      0.57      0.75
                 (0.31)    (0.48)    (0.35)    (0.55)    (0.33)    (0.32)

The results are given in Table 2.6. The numbers in parentheses are the estimated standard errors, obtained from Theorem 2.5 in Section 2.4 under the assumption that the binomial conditional variance is correct. If the latter assumption fails, the standard error estimates are not reliable but the point estimates are still valid.

2.6.3 Epileptic seizure data

As an illustration of the robust GEE method, discussed in Section 2.5, we consider the epileptic seizure data from Thall and Vail (1990). Diggle et al. (1994, pp. 186–188) suggested Poisson-gamma and Poisson-Gaussian random-effect models. The mean vector for subject $i$ is $\mu_i=\exp(X_i\beta)$, with $X_i$ being the matrix of covariates for the $i$th patient. We consider five covariates including the intercept, treatment, baseline seizure rate, age of subject, and the interaction between treatment and baseline seizure rate. As in Thall and Vail (1990), the baseline is computed as the logarithm of 1/4 of the 8-week pre-randomization seizure count, and the age is also log-transformed. The treatment variable is a binary indicator for the progabide group. Assuming the log link function, we have $\mu_{ij}=\exp(x_{ij}'\beta)$, where $x_{ij}'$ is the $j$th row of $X_i$. Because of the high degree of extra-Poisson variation, it is reasonable to consider a quadratic function $\sigma_{ij}^2=\gamma_1\mu_{ij}+\gamma_2\mu_{ij}^2$ [Bartlett (1936); Morton (1987)]. Note that the negative binomial model restricts $\gamma_1$ to be 1.

First consider the Poisson model with overdispersion, which corresponds to fixing $\gamma_2$ at 0. Denote $u_{ij}=(y_{ij}-\mu_{ij})/\sigma_{ij}$, where $\sigma_{ij}^2=\gamma_1\mu_{ij}$. We first ignore possible within-subject correlations and use the independence model. The estimates are $\hat\beta=(-2.795,-1.341,0.897,0.949,0.562)'$. The overdispersion parameter $\gamma_1$ is estimated as 3.80 using the mean deviance.

As noted earlier, there are some unusual observations. Diggle et al. (2002) presented results with and without patient #207. In fact, the residuals indicate that possible outliers also include patients #227, 225, 112, 135. We apply the $L_p$ norm estimation with $\rho(u)=|u|^p$. Also recall that the meaning of $\mu_{ij}$ depends on the chosen $\rho(u)$. Recent work on $L_p$ can be
found in Lai and Lee (2005). In our case of count data, $\rho(u)=|u|^p$ ($p>1$) seems to be a legitimate robust approach for assessing the treatment effect together with other covariates. The case of $p=2$ corresponds to the GLM or GEE approach depending on whether a correlation structure is incorporated. For illustration purposes, we used two different values for $p$, namely, 2 and 1.5. The estimating functions for $(\gamma_1,\gamma_2)$ are

$$U(\gamma_1;\beta)=\sum_{i=1}^m\sum_{j=1}^{n_i}\sigma_{ij}^{-1}\{u_{ij}\psi(u_{ij})-1\},$$
$$U(\gamma_2;\beta)=\sum_{i=1}^m\sum_{j=1}^{n_i}\mu_{ij}\sigma_{ij}^{-1}\{u_{ij}\psi(u_{ij})-1\}.$$

In the case when $\gamma_2$ is equal to 0 (Poisson with overdispersion), $\gamma_1$ is the overdispersion parameter, which can be estimated by a closed-form expression of the form $\hat\gamma_1=\{\cdots\}^{2/p}$ in the standardized residuals $(y_{ij}-\mu_{ij})/\sqrt{\mu_{ij}}$ [the displayed formula is not legible in this copy].

As reported by Thall and Vail (1990), there are also strong within-subject correlations. For each value of $p$ and each of the two variance functions, we therefore applied two working models, independence and the unstructured, for the correlation matrix of the M-residuals. Given the fact that the observation times are the same for all of the subjects, the unstructured correlation matrix may be a reasonable choice. For the M-estimation method, we first obtained estimates of $\gamma$, including the correlation parameters, using the M-residuals from the independence model, and then obtained estimates of $\beta$ by solving $U(\beta,\hat\alpha)=0$. We iterated between the correlation matrix, the variance function and estimates of $\beta$ until there were no noticeable changes in any of the estimates.

The asymptotic covariance matrix of $\hat\beta$ is estimated by the modified sandwich estimator. The modification lies in the covariance of the M-residuals for each subject. Instead of using individual subject residuals, we pool all of the subject residual products, $\hat Q=\sum_{i=1}^m\{\psi(u_i)\psi(u_i)'\}$. This modified sandwich method is very similar to Pan (2001).

Table 2.7 shows the results for the case of $\sigma_{ij}^2=\gamma_1\mu_{ij}+\gamma_2\mu_{ij}^2$. The parameter estimates and their ratio to the corresponding standard error (z-values) are given. The estimates by the robust methods, especially in terms of the z-values, are quite different from those by the GEE method.
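The pooling of M-residual products described above can be sketched as follows. This is only an illustration: the residual values are made up, the function names are hypothetical, all subjects are assumed to have the same number of visits, and $\psi$ is taken literally as the derivative of $\rho(u)=|u|^p$, so that for $p=2$ it equals $2u$ (the Pearson residual up to a constant factor).

```python
def lp_psi(u, p):
    """Derivative of rho(u) = |u|**p for p > 1: p * |u|**(p - 1) * sign(u)."""
    if u == 0.0:
        return 0.0
    sign = 1.0 if u > 0 else -1.0
    return p * abs(u) ** (p - 1.0) * sign


def pooled_psi_products(residuals, p):
    """Sum over subjects of psi(u_i) psi(u_i)' (equal-length residual vectors)."""
    n = len(residuals[0])
    q = [[0.0] * n for _ in range(n)]
    for u_i in residuals:
        psi_i = [lp_psi(u, p) for u in u_i]
        for j in range(n):
            for k in range(n):
                q[j][k] += psi_i[j] * psi_i[k]
    return q


# Made-up standardized residuals: three subjects, four visits each.
# The last subject has one large value at the second visit to mimic an outlier.
residuals = [[0.5, -0.3, 0.1, 0.4],
             [-0.8, -0.2, -0.5, 0.0],
             [1.0, 6.0, 0.7, 0.9]]

q_p2 = pooled_psi_products(residuals, 2.0)    # least-squares-type residuals
q_p15 = pooled_psi_products(residuals, 1.5)   # L_p residuals with p = 1.5

print("second-visit diagonal entry, p = 2.0:", q_p2[1][1])
print("second-visit diagonal entry, p = 1.5:", q_p15[1][1])
```

In this toy example the single large residual dominates the second-visit entry far less when $p=1.5$ than when $p=2$, which is the sense in which the smaller exponent downweights potential outliers.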

Table 2.7  Parameter Estimates and z-values for Epileptic Data

             Intercept   Treatment   log(age)   log(baseline)   Interaction

Lp norm, p = 2
GLM (Rw = I), γ̂ = (0.866, 0.481)
  β̂           -1.477      -0.900      0.540      0.892           0.349
  z-value      -1.210      -2.114      1.515      6.542           1.645
GEE, γ̂ = (1.307, 0.419)
  β̂           -1.988      -1.039      0.684      0.908           0.404
  z-value      -1.668      -2.430      1.973      6.917           1.949

Lp norm, p = 1.5
Rw = I, γ̂ = (0.997, 0.322)
  β̂           -1.427      -0.833      0.476      0.941           0.291
  z-value      -1.295      -2.103      1.483      7.701           1.516
Rw = Unstructured, γ̂ = (1.203, 0.289)
  β̂           -1.536      -0.900      0.510      0.939           0.318
  z-value      -1.418      -2.281      1.616      7.880           1.683

In all cases, the standard errors are reduced after incorporating correlations among the M-residuals. Given the possible outliers, we believe that the results using $p=1.5$ are more reliable. In the case of $p=2$, the M-residuals are known as the Pearson residuals. The plots of M-residuals from the two $L_p$ models (with unstructured correlation matrix and Bartlett variance function) also support this (see Fig. 2.1).

2.7 Exercises

2.1. Show that the MM is a special case of GMM (see Section 2.1).

2.2. Show that in Example 2.1, $y_{i\cdot}$, $1\le i\le m$ are i.i.d., and we have the following expressions: $E_\theta(y_{1\cdot})=kE\{h_\theta(\xi)\}$, $E_\theta(y_{1\cdot}^2)=kE\{h_\theta(\xi)\}+k(k-1)E\{h_\theta^2(\xi)\}$, where $h_\theta(x)=\exp(\mu+\sigma x)/\{1+\exp(\mu+\sigma x)\}$ and $\xi\sim N(0,1)$.

2.3. This exercise is related to Example 2.2.
  a. Show that the log-likelihood function can be expressed as (2.5).
  b. Let $\mu=2.0$, and $m_1=m_2=40$. Generate $u_1,\dots,u_{40}$ and $v_1,\dots,v_{40}$ independently from the $N(0,1)$ distribution, say, using the R software, and compute the product, $\prod_{i=1}^{40}\prod_{j=1}^{40}\{1+\exp(\mu+u_i+v_j)\}^{-1}$, that is involved in (2.5). What do you get?
  c. Show that $E(y_{i\cdot})=m_2E\{h(\mu+\sigma_1\xi+\sigma_2\eta)\}$, where $h(x)=e^x/(1+e^x)$, and $\xi,\eta$ are independent $N(0,1)$, which only involves a two-dimensional integration.

2.4. Verify expression (2.13) for the marginal density.

2.5. This exercise is related to some derivations in Section 2.1.

  a. Verify (2.16).
  b. Show that the first term on the right side of (2.16) depends on $\phi$, while the second term does not depend on $\phi$.
  c. Verify that the right side of (2.14) can be expressed as the right side of (2.18), while the right side of (2.17) can be expressed as the right side of (2.19).

2.6. This exercise has two parts.
  a. Show that only the conditional first moments of the $y_i$'s given the random effects are involved in (2.14) and (2.17).

[Fig. 2.1  Residual Plots for Epileptic Seizure Data]

2.7. Show that, under the beta-binomial distribution of Example 2.3, (2.22) holds, with $E(Y)=k\pi$.

2.8. Show that equations (2.14) and (2.15) are a special case of (2.23). Determine the base statistics $S$ as well as the dimensions $N$ and $r$ in this case.

2.9. Show that the right side of (2.27) is minimized, according to the partial ordering of symmetric matrices [defined below (2.27)], by $B=U'V^{-1}$.

2.10. In (2.28), let $S=y=(y_i)_{1\le i\le m}$, where $y_i=(y_{it})_{t\in T_i}$, as defined below (2.28). Show that, under the assumption that $y_1,\dots,y_m$ are independent with finite second moments, the optimal estimating equation (2.28) can be expressed as (2.29).

2.11. Show that, in Example 2.5, we have $\mathrm{var}(y_{it})=\mu_{it}(1-\mu_{it})$, $t=1,2$ and $\mathrm{cov}(y_{i1},y_{i2})=E\{h(\beta+\sigma\xi)h(\beta+\delta+\sigma\xi)\}-\mu_{i1}\mu_{i2}$. With these expressions, specify the GEE (2.29).

2.12. Derive the variance-covariance structure of the errors, $e_i$, in (2.38).

2.13. Verify expression (2.46) and also the partial derivatives (2.47)–(2.49).

2.14. Show that, in Example 2.11, if $k_i=k$, $1\le i\le m$, the first-step estimating equation is equivalent to the second-step one. Thus, in this special case, the first-step estimator is the same as the second-step estimator.

2.15. Verify the mean and variance expressions under the beta-binomial distribution in Example 2.3 (continued) in Section 2.4.

2.16. Suppose that $y_{ij}$, $1\le i\le m$, $1\le j\le n_i$ are independent following a Cauchy$(\mu_{ij},\sigma)$ distribution, where the location parameter, $\mu_{ij}$, depends on a vector $\beta$ of unknown parameters, i.e., $\mu_{ij}=\mu_{ij}(\beta)$; $\sigma$ is the scale parameter, and a Cauchy$(\mu,\sigma)$ distribution has the pdf

$$f(x\mid\mu,\sigma)=\left[\pi\sigma\left\{1+\left(\frac{x-\mu}{\sigma}\right)^2\right\}\right]^{-1},\qquad -\infty<x<\infty.$$

Chapter 3

Non-Gaussian Linear Mixed Models

The opening illustrative example of Section 1.1 is a special case of inference about a non-Gaussian linear mixed model. Under such a model, the random effects and errors are assumed to be independent, or simply uncorrelated, but their distributions are not assumed to be normal. As a result, the (joint) distribution of the data may not be fully specified (up to a number of parameters). Linear mixed models without the normality assumption are practical because, in practice, the normality assumption rarely holds exactly, or even approximately. It would be useful to develop methods that are robust to violation of such an assumption. Thus, throughout this section, normality is not assumed, unless specially noted.

3.1 Types of models

For the most part, there are two types of non-Gaussian linear mixed models that are not mutually exclusive.

1. Mixed ANOVA model. A non-Gaussian mixed ANOVA model is defined by (1.1), where the components of $\alpha_r$ are i.i.d. with mean 0 and variance $\sigma_r^2$, $1\le r\le s$; the components of $\epsilon$ are i.i.d. with mean 0 and variance $\tau^2$; and $\alpha_1,\dots,\alpha_s,\epsilon$ are independent. All of the other assumptions are the same as in Section 1.1. Denote the common distribution of the components of $\alpha_r$ by $F_r$ ($1\le r\le s$) and that of the components of $\epsilon$ by $G$. If the parametric forms of $F_1,\dots,F_s,G$ are not assumed, the distribution of $y$ is not fully specified, up to a set of parameters. In fact, even if the parametric forms of the $F_r$'s and $G$ are known, as long as they are not normal, the (joint) distribution of $y$ may not have an analytic expression. The vector of variance components, $\psi$, is defined the same way as in Section 1.1. Alternatively, one may define the Hartley–Rao form of
variance components [Hartley and Rao (1967)], $\theta=(\tau^2,\gamma_1,\dots,\gamma_s)'$, where $\gamma_r=\sigma_r^2/\tau^2$, $1\le r\le s$.

A special case of the above model is the so-called balanced mixed ANOVA model. A mixed ANOVA model is balanced if $X$ and $Z_r$, $1\le r\le s$ can be expressed as $X=\bigotimes_{l=1}^{w+1}1_{n_l}^{a_l}$, $Z_r=\bigotimes_{l=1}^{w+1}1_{n_l}^{b_{r,l}}$, where $(a_1,\dots,a_{w+1})\in S_{w+1}=\{0,1\}^{w+1}$, $(b_{r,1},\dots,b_{r,w+1})\in S\subset S_{w+1}$. In other words, there are $w$ factors in the model; $n_l$ represents the number of levels for factor $l$ ($1\le l\le w$); and the $(w+1)$st factor corresponds to "repetition within cells." Thus, we have $a_{w+1}=1$ and $b_{r,w+1}=1$ for all $r$.

We consider some examples.

Example 3.1 (One-way random effects model). A mixed model is called a random effects model if the only fixed effect is an unknown mean. Suppose that the observations $y_{ij}$, $i=1,\dots,a$, $j=1,\dots,b_i$ satisfy $y_{ij}=\mu+\alpha_i+\epsilon_{ij}$ for all $i$ and $j$, where $\mu$ is an unknown mean; $\alpha_i$, $i=1,\dots,a$ are i.i.d. random effects with mean 0 and variance $\sigma^2$; the $\epsilon_{ij}$'s are i.i.d. errors with mean 0 and variance $\tau^2$; and the random effects and errors are independent. It is easy to show that the model is a special case of the mixed ANOVA model; it is balanced if $b_i=b$ for all $i$. In the latter case, one has $w=1$, $n_1=a$, $n_2=b$, and $S=\{(0,1)\}$ (Exercise 3.1).

Example 3.2 (Balanced two-way random effects model). For simplicity, let us consider the case of one observation per cell. In this case, the observations $y_{ij}$, $i=1,\dots,a$, $j=1,\dots,b$ satisfy $y_{ij}=\mu+u_i+v_j+e_{ij}$ for all $i,j$, where $\mu$ is as in Example 3.1; $u_i$, $i=1,\dots,a$, are i.i.d. random effects with mean 0 and variance $\sigma_u^2$; $v_j$, $j=1,\dots,b$ are i.i.d. random effects with mean 0 and variance $\sigma_v^2$; the $e_{ij}$'s are i.i.d. errors with mean 0 and variance $\sigma_e^2$; and $u,v,e$ are independent. Note that this is a special case of the balanced mixed ANOVA model with $w=2$, $n_1=a$, $n_2=b$, $n_3=1$, and $S=\{(0,1,1),(1,0,1)\}$ (Exercise 3.1).

2. Longitudinal model. The name "longitudinal" refers to the fact that these models are often used in the analysis of longitudinal data [e.g., Diggle et al. (2002)], although such models have been extensively used in other fields as well, such as small area estimation [e.g., Rao and Molina (2015)]. A defining feature of these models is that the observations may be divided into independent groups with one random effect (or vector of random effects) associated with each group. In practice, these groups may correspond to different individuals involved in the longitudinal study, or different small areas. Furthermore, there may be serial correlations within each group, which are in addition to the random effect. When the model is used in a longitudinal study, there are often time-dependent covariates, which may
appear either in $X$ or in $Z$ (see below). Following Datta and Lahiri (2000), a longitudinal model may be expressed as

$$y_i=X_i\beta+Z_i\alpha_i+\epsilon_i,\qquad i=1,\dots,m, \tag{3.1}$$

where $y_i$ represents the vector of observations from the $i$th individual; $X_i$ and $Z_i$ are known matrices; $\beta$ is an unknown vector of regression coefficients; $\alpha_i$ is a vector of random effects; and $\epsilon_i$ is a vector of errors. It is assumed that $\alpha_i,\epsilon_i$, $i=1,\dots,m$ are independent with $E(\alpha_i)=0$, $\mathrm{Var}(\alpha_i)=G_i$, $E(\epsilon_i)=0$, and $\mathrm{Var}(\epsilon_i)=R_i$, where the covariance matrices $G_i$ and $R_i$ are known up to a vector $\theta$ of variance components. We consider an example.

Example 3.3 (Growth curve). Consider, again, Example 2.7, but without the normality assumption. Instead, assume that, for different $i$'s, all of the random variables are independent. Furthermore, for each $i$, we have $E(\xi_i)=\mu_1$, $E(\eta_i)=\mu_2$, $\mathrm{var}(\xi_i)=\sigma_1^2$, $\mathrm{var}(\eta_i)=\sigma_2^2$, and $\mathrm{cor}(\xi_i,\eta_i)=\rho$; also $E(\epsilon_{ij})=0$, $\mathrm{var}(\epsilon_{ij})=\tau^2$. As for the $\zeta_{ij}$'s, it is assumed that they satisfy the following first order autoregressive process, or AR(1): $\zeta_{ij}=\phi\zeta_{i,j-1}+\omega_{ij}$, where $\phi$ is a constant such that $0<\phi<1$, and the $\omega_{ij}$'s are independent with mean 0 and variance $\sigma_3^2(1-\phi^2)$. Finally, the three random components $(\xi,\eta)$, $\zeta$, and $\epsilon$ are independent. There is a slight departure of this model from the standard linear mixed model in that the random intercept and slope may have nonzero means. However, by subtracting the means and thus defining new random effects, the model can be expressed in the standard form of a longitudinal model. In particular, the fixed effects are $\mu_1$ and $\mu_2$, and the unknown variance components are $\sigma_j^2$, $j=1,2,3$, $\tau^2$, $\rho$, and $\phi$.

3.2 Quasi-likelihood method

In this section we discuss estimation in non-Gaussian linear mixed models. We shall focus on mixed ANOVA models. Some remarks are made at the end of the section on possible extension of the method to longitudinal models.

As noted, when normality is not assumed, likelihood-based inference is difficult, or even impossible. To see this, first note that if the distributions of the random effects and errors are not specified, the likelihood function is simply not available. Furthermore, even if the (nonnormal) distributions of the random effects and errors are specified (up to some unknown parameters), the likelihood function is usually complicated. In particular, such a likelihood may not have an analytic expression. Finally, like
normality, any other specific distributional assumptions may not hold in practice. These difficulties have led to consideration of methods other than maximum likelihood. One such method is Gaussian likelihood, or, more generally, quasi-likelihood.

The idea is to use the normality-based estimators, even if the data are not really normal. For the mixed ANOVA models, the REML estimator of the vector of variance components, say, $\theta$, is defined as a root of the (Gaussian) REML equations, provided that the root belongs to the parameter space. Similarly, the ML estimators of $\beta$ and $\theta$ are defined as a root of the (Gaussian) ML equations, provided that they, too, stay in the parameter space. More specifically, under the mixed ANOVA model with the variance components, $\psi$ (see §3.1.1), the REML equations are given by (1.4). With the same model and variance components, the ML equations are

$$\begin{cases} X'V^{-1}X\beta = X'V^{-1}y, \\ y'P^2y = \mathrm{tr}(V^{-1}), \\ y'PZ_iZ_i'Py = \mathrm{tr}(Z_i'V^{-1}Z_i), \quad 1 \le i \le s. \end{cases} \tag{3.2}$$

Similarly, the REML equations under the ANOVA model with the Hartley–Rao form of variance components, $\theta$ (see §3.1.1), are

$$\begin{cases} y'Py = n - p, \\ y'PZ_iZ_i'Py = \mathrm{tr}(Z_i'PZ_i), \quad 1 \le i \le s. \end{cases} \tag{3.3}$$

The ML equations under the ANOVA model and the Hartley–Rao form are

$$\begin{cases} X'V^{-1}X\beta = X'V^{-1}y, \\ y'Py = n, \\ y'PZ_iZ_i'Py = \mathrm{tr}(Z_i'V^{-1}Z_i), \quad 1 \le i \le s. \end{cases} \tag{3.4}$$

At first, it might sound a bit unintuitive: If the normality assumption is wrong, one has the wrong likelihood function; if so, how can the method still work? To answer the question, let us first point out that, although the REML estimator is derived under the normality assumption, Gaussian likelihood is not the only one that can lead to the REML equations. For example, Jiang (1996) noted that exactly the same equations will arise if one starts with a multivariate $t$-distribution, that is, $y \sim t_n(X\beta, V, d)$, which has a joint pdf

$$p(y) = \frac{\Gamma\{(n+d)/2\}}{(d\pi)^{n/2}\,\Gamma(d/2)\,|V|^{1/2}} \left\{1 + \frac{1}{d}(y - X\beta)'V^{-1}(y - X\beta)\right\}^{-(n+d)/2}$$

(Exercise 3.2). Here $d$ is the degrees of freedom of the multivariate $t$-distribution.

More generally, Heyde (1994, 1997) showed that the REML equations can be derived from a quasi-likelihood. Consider a general setting in which $y$ is a vector of responses that is associated with a vector $x$ of explanatory variables. Here we allow $x$ to be random as well. Suppose that the (conditional) mean of $y$ given $x$ is associated with $\psi$, a vector of unknown parameters. For notational simplicity, write $\mu = E_\psi(y|x) = \mu(x, \psi)$, and $V = \mathrm{Var}(y|x)$. Here, Var denotes the covariance matrix, and Var or E without subscript $\psi$ means to be taken at the true $\psi$. Let $\dot\mu$ denote the matrix of partial derivatives; that is, $\dot\mu = \partial\mu/\partial\psi'$. Consider the following class of vector-valued estimating functions $\mathcal{H} = \{G = B(y - \mu)\}$, where $B = B(x, \psi)$, such that $E(\dot G)$ is nonsingular. An estimating equation corresponds to the equation $G = 0$, with $G \in \mathcal{H}$, to solve for $\psi$. The following theorem can be derived from Theorem 2.1 of Heyde (1997) [see Jiang (2007), §4.5.1].

Theorem 3.1. Suppose that $V$ is known, and that $E(\dot\mu'V^{-1}\dot\mu)$ is nonsingular. Then, the optimal estimating function within $\mathcal{H}$ is given by $G^* = \dot\mu'V^{-1}(y - \mu)$, that is, with $B = B^* = \dot\mu'V^{-1}$.

As it turns out, the REML equations derived under normality and under multivariate-$t$ are equivalent (Exercise 3.2), and they are a special case of the quasi-likelihood estimating equation. For this reason, the (Gaussian) REML estimation may be regarded as a method of quasi-likelihood. Similarly, the (Gaussian) ML estimation may be justified from a quasi-likelihood point of view. For simplicity, the corresponding estimators are still called REML or ML estimators.

Another way to justify the quasi-likelihood method is through asymptotic analysis. It has been shown [Richardson and Welsh (1994), Jiang (1996), Jiang (1997)] that the REML estimator is consistent and asymptotically normal even if normality does not hold, provided that some moment conditions are satisfied. Furthermore, Jiang (1996) showed that the ML estimator has similar asymptotic properties, provided that the number of fixed effects, $p$, remains bounded or increases at a slower rate than the sample size. Again, the latter result does not require the normality assumption. Therefore, the quasi-likelihood approach is, at least, well-justified from an asymptotic point of view for point estimation. Nevertheless, to make inference about a model, one needs more than just a point estimator. Standard errors of estimators and interval estimators are often needed. We discuss such problems in the next section.
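The consistency claim can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes the balanced one-way random effects model $y_{ij} = \mu + \alpha_i + e_{ij}$, for which the Gaussian REML estimators have the standard closed form $\hat\tau^2 = \mathrm{MSE}$, $\hat\sigma^2 = (\mathrm{MSA} - \mathrm{MSE})/b$ (with the usual boundary adjustment). The function names, the seed, and the choice of double exponential random effects with centered exponential errors are illustrative assumptions.

```python
import random

def simulate_one_way(a, b, mu, sigma2, tau2, rng):
    """Balanced one-way data y_ij = mu + alpha_i + e_ij with non-normal parts.

    alpha_i: double exponential (Laplace), rescaled to variance sigma2.
    e_ij:    centered exponential, Exp(1) - 1, rescaled to variance tau2.
    """
    y = []
    for _ in range(a):
        # A random sign times Exp(1) is Laplace with variance 2.
        alpha = rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) * (sigma2 / 2.0) ** 0.5
        y.append([mu + alpha + (rng.expovariate(1.0) - 1.0) * tau2 ** 0.5
                  for _ in range(b)])
    return y

def gaussian_reml_one_way(y):
    """Closed-form Gaussian REML estimates (sigma2_hat, tau2_hat), balanced case."""
    a, b = len(y), len(y[0])
    row_means = [sum(row) / b for row in y]
    grand = sum(row_means) / a
    sse = sum((v - m) ** 2 for row, m in zip(y, row_means) for v in row)
    ssa = b * sum((m - grand) ** 2 for m in row_means)
    mse, msa = sse / (a * (b - 1)), ssa / (a - 1)
    if msa >= mse:
        return (msa - mse) / b, mse
    # Boundary case: the interior root would give a negative sigma2.
    return 0.0, (sse + ssa) / (a * b - 1)

# Estimates should settle near the true values (2.0, 1.0) as a grows.
rng = random.Random(2019)
for a in (50, 500, 5000):
    s2, t2 = gaussian_reml_one_way(simulate_one_way(a, 3, 1.0, 2.0, 1.0, rng))
    print(a, round(s2, 3), round(t2, 3))
```

The sketch only speaks to point estimation; it says nothing about whether the normality-based standard errors remain valid, which is the question taken up next.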

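3.3 Measure of uncertainty

In order to carry out the inference, one needs a measure of uncertainty for the estimator. Under the normality assumption, the uncertainty is typically evaluated using the asymptotic covariance matrix (ACVM) associated with the estimator. The ACVM is equal to the inverse of the corresponding Fisher information matrix. The problem is complicated, however, when normality does not hold.

According to the previous section, it is known that even without the normality assumption, the REML estimator is still consistent and asymptotically normal under mild conditions; the same can be said about the MLE, under more restrictive conditions. But, what can we say about their ACVMs? To be specific, let us focus on REML under the mixed ANOVA model with the Hartley–Rao form of variance components, as the idea can be easily extended to other cases. It can be shown that [e.g., Jiang (1996)], without the normality assumption, the ACVM of the REML estimator, $\hat\theta$, of the variance components, has the following "sandwich" expression [compare, e.g., with (2.26)]:

$$\Sigma_R = \left\{E\left(\frac{\partial^2 l_R}{\partial\theta\,\partial\theta'}\right)\right\}^{-1} \mathrm{Var}\left(\frac{\partial l_R}{\partial\theta}\right) \left\{E\left(\frac{\partial^2 l_R}{\partial\theta\,\partial\theta'}\right)\right\}^{-1}, \tag{3.5}$$

where $l_R$ is the Gaussian restricted log-likelihood given by

$$l_R(\theta) = -\frac{1}{2}\left\{(n - p)\log(2\pi) + (n - p)\log(\tau^2) + \log(|A'V_\gamma A|) + \frac{1}{\tau^2}\,y'V(\gamma)y\right\} \tag{3.6}$$

under the mixed ANOVA model. When normality indeed holds, one has

$$E\left(\frac{\partial^2 l_R}{\partial\theta\,\partial\theta'}\right) = -\mathrm{Var}\left(\frac{\partial l_R}{\partial\theta}\right), \tag{3.7}$$

so (3.5) reduces to the inverse Fisher information matrix:

$$I_R^{-1}(\theta) = \left\{\mathrm{Var}\left(\frac{\partial l_R}{\partial\theta}\right)\right\}^{-1} = -\left\{E\left(\frac{\partial^2 l_R}{\partial\theta\,\partial\theta'}\right)\right\}^{-1}. \tag{3.8}$$

An attractive feature of (3.8) is that it only depends on the variance components, $\theta$ [this is more easily seen using the second expression in (3.8)], of which one already has the (REML) estimator. There may not be such a luxury, however, if (3.7) does not hold.

To see where exactly the problem is, note that $I_2 = E(\partial^2 l_R/\partial\theta\,\partial\theta')$ depends only on $\theta$, even without normality, so this part does not cause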

any "trouble". What may be troublesome is $I_1 = \mathrm{Var}(\partial l_R/\partial\theta)$, which may involve higher (than 2nd) moments of the random effects and errors, which are not part of $\theta$. We illustrate with an example.

Example 3.2 (continued). Recall that $I_a$, $1_a$ denote the $a \times a$ identity matrix and $a \times 1$ vector of 1's, respectively, and $J_a = 1_a1_a'$. It can be shown that, in this case, one has

$$\frac{\partial l_R}{\partial\sigma_e^2} = \frac{\xi'H\xi - (ab - 1)\sigma_e^2}{2\sigma_e^4}, \tag{3.9}$$

where $H = I_a \otimes I_b + \lambda_1 I_a \otimes J_b + \lambda_2 J_a \otimes I_b + \lambda_3 J_a \otimes J_b$, where $\otimes$ denotes the Kronecker product, $\lambda_1 = -b^{-1}\{1 - (1 + \gamma_1 b)^{-1}\}$, $\lambda_2 = -a^{-1}\{1 - (1 + \gamma_2 a)^{-1}\}$, and $\lambda_3 = (ab)^{-1}\{1 - (1 + \gamma_1 b)^{-1} - (1 + \gamma_2 a)^{-1}\}$ with $\gamma_1 = \sigma_u^2/\sigma_e^2$ and $\gamma_2 = \sigma_v^2/\sigma_e^2$, and $\xi = y - \mu 1_a \otimes 1_b$. It is clear that, by differentiating (3.9) again with respect to the variance components, one still ends up with quadratic forms in $\xi$. Thus, it is easy to see that $E(\partial^2 l_R/\partial\sigma_e^4)$, etc. are functions of $\theta = (\sigma_e^2, \sigma_u^2, \sigma_v^2)'$ regardless of normality. Similarly, it can be shown that other elements of $I_2$ are functions of $\theta$. On the other hand, it is easy to see from (3.9) that $\mathrm{var}(\partial l_R/\partial\sigma_e^2)$ involves fourth moments of the random effects and errors (but no third moments; and this has nothing to do with normality, or symmetry; Exercise 3.3). Such quantities are not functions of $\theta$, unless normality holds [in which case one has, for example, $E(e_{ij}^4) = 3\sigma_e^4$]. In conclusion, in case of non-Gaussian random effects and errors, $I_1$ involves $E(e_{ij}^4)$, $E(u_i^4)$, $E(v_j^4)$, in addition to $\theta$.

Therefore, in order to make use of the ACVM, one has to find a way to evaluate $I_1$. Below we consider two approaches.

3.3.1 Empirical method of moments

A simple-minded approach would be to estimate the higher moments involved in $I_1$. Note that estimates of the higher moments are usually not provided in standard packages of mixed model analysis, such as those in SAS or R. Jiang (2003) used an approach, called empirical method of moments (EMM), to estimate the higher moments. Let $\theta$ be a vector of parameters. Suppose that a consistent estimator of $\theta$, $\hat\theta$, is available. Let $\phi$ be a vector of additional parameters about which knowledge is needed. Let $\psi = (\theta', \phi')'$, and $M(\psi, y) = M(\theta, \phi, y)$ be a vector-valued function of the same dimension as $\phi$ that depends on $\psi$ and $y$, a vector of observations. Suppose that $E\{M(\psi, y)\} = 0$ when $\psi$ is the true parameter vector. Then, if $\theta$ were known, a method of moments estimator of $\phi$ would be obtained

by solving

$$M(\theta, \phi, y) = 0 \tag{3.10}$$

for $\phi$. Note that this is more general than the classical method of moments, in which the function $M$ is a vector of sample moments minus their expected values. In the econometric literature, this is referred to as generalized method of moments [e.g., Hansen (1982)]. Because $\theta$ is unknown, we replace it in (3.10) by $\hat\theta$. The result is an EMM estimator of $\phi$, denoted by $\hat\phi$, which is obtained by solving

$$M(\hat\theta, \phi, y) = 0. \tag{3.11}$$

Note that here we use the words "an EMM estimator" instead of "the EMM estimator", because sometimes there may be more than one consistent estimator of $\theta$, and each may result in a different EMM estimator of $\phi$. Jiang (2003) shows that, under suitable conditions, the EMM estimator is consistent for estimating $\phi$. A self-contained consistency result, which is not covered by Jiang (2003), can be found in Section 4.3.

To apply the EMM to the non-Gaussian mixed ANOVA model, Jiang (2003) assumed that the third moments of the random effects and errors are zero, that is,

$$E(\epsilon_1^3) = 0, \quad E(\alpha_{r1}^3) = 0, \quad 1 \le r \le s, \tag{3.12}$$

where $\epsilon_1$ ($\alpha_{r1}$) is the first component of $\epsilon$ ($\alpha_r$) [recall the components of $\epsilon$ ($\alpha_r$) are i.i.d.]. (3.12) is satisfied if, in particular, the distributions of the random effects and errors are symmetric. Under such an assumption, it can be shown that the ACVM of the REML estimator of $(\beta', \psi')'$, where $\psi = (\tau^2, \sigma_1^2, \dots, \sigma_s^2)'$, depends on $\psi$ as well as the kurtoses:

$$\kappa_0 = E(\epsilon_1^4) - 3\tau^4, \qquad \kappa_r = E(\alpha_{r1}^4) - 3\sigma_r^4, \quad 1 \le r \le s.$$

For any matrix $H = (h_{ij})$, define $\|H\|_4 = (\sum_{i,j} h_{ij}^4)^{1/4}$; similarly, if $h = (h_i)$ is a vector, define $\|h\|_4 = (\sum_i h_i^4)^{1/4}$. Let $L$ be a linear space. Define $L^\perp$ as the linear space $\{v : v'u = 0, \forall u \in L\}$. If $L_j$, $j = 1, 2$ are linear spaces such that $L_1 \subset L_2$, then $L_2 \ominus L_1$ represents the linear space $\{v : v \in L_2, v'u = 0, \forall u \in L_1\}$. If $M_1, \dots, M_k$ are matrices with the same number of rows, then $L(M_1, \dots, M_k)$ represents the linear space spanned by the columns of $M_1, \dots, M_k$. Suppose that the matrices $Z_1, \dots, Z_s$ in (1.1) have been suitably ordered such that

$$L_r \ne \{0\}, \quad 0 \le r \le s, \tag{3.13}$$

where $L_0 = L(Z_1, \dots, Z_s)^\perp$, $L_r = L(Z_r, \dots, Z_s) \ominus L(Z_{r+1}, \dots, Z_s)$, $1 \le r \le s - 1$, and $L_s = L(Z_s)$. Let $C_r$ be a matrix whose columns constitute

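a basis of $L_r$, $0 \le r \le s$. Define $a_{rq} = \|Z_q'C_r\|_4^4$, $0 \le q \le r \le s$. It is easy to see that, under (3.13), one has $a_{rr} > 0$, $0 \le r \le s$ (Exercise 3.4). Let $n_r$ be the number of columns of $C_r$, and $c_{rk}$ the $k$th column of $C_r$, $1 \le k \le n_r$, $0 \le r \le s$. Define

$$b_r(\varphi) = 3\sum_{k=1}^{n_r}\left(\sum_{q=0}^{r}|Z_q'c_{rk}|^2\varphi_q\right)^2, \quad 0 \le r \le s,$$

where $\varphi = (\varphi_r)_{0 \le r \le s}$ with $\varphi_0 = \tau^2$ and $\varphi_r = \sigma_r^2$, $1 \le r \le s$. Let $\kappa = (\kappa_r)_{0 \le r \le s}$. Consider $M(\beta, \varphi, \kappa, y) = [M_r(\beta, \varphi, \kappa, y)]_{0 \le r \le s}$, where

$$M_r(\beta, \varphi, \kappa, y) = \|C_r'(y - X\beta)\|_4^4 - \sum_{q=0}^{r}a_{rq}\kappa_q - b_r(\varphi), \quad 0 \le r \le s.$$

Then, by Lemma 3.1 below and the definition of the $C_r$'s, it is easy to show that $E\{M(\beta, \varphi, \kappa, y)\} = 0$ when $\beta, \varphi, \kappa$ are the true parameter vectors (Exercise 3.4). Thus, a set of EMM estimators can be obtained by solving $M(\hat\beta, \hat\varphi, \kappa, y) = 0$, where $\hat\beta, \hat\varphi$ are the REML estimators of $\beta, \varphi$, respectively [see (1.3)]. The EMM estimators can be computed recursively, as follows:

$$\hat\kappa_0 = a_{00}^{-1}\hat d_0, \qquad \hat\kappa_r = a_{rr}^{-1}\hat d_r - \sum_{q=0}^{r-1}\frac{a_{rq}}{a_{rr}}\hat\kappa_q, \quad 1 \le r \le s, \tag{3.14}$$

where $\hat d_r = \|C_r'(y - X\hat\beta)\|_4^4 - b_r(\hat\varphi)$, $0 \le r \le s$.

Lemma 3.1. Let $\xi_1, \dots, \xi_n$ be independent random variables such that $E(\xi_i) = 0$ and $E(\xi_i^4) < \infty$, and $\lambda_1, \dots, \lambda_n$ be constants. Then, we have

$$E\left(\sum_{i=1}^{n}\lambda_i\xi_i\right)^4 = 3\left\{\sum_{i=1}^{n}\lambda_i^2\,\mathrm{var}(\xi_i)\right\}^2 + \sum_{i=1}^{n}\lambda_i^4\left[E(\xi_i^4) - 3\{\mathrm{var}(\xi_i)\}^2\right].$$

We illustrate the EMM with an example.

Example 3.1 (continued). Consider the balanced case. It is easy to show that $C_0 = I_a \otimes K_b$ and $C_1 = Z_1 = I_a \otimes 1_b$, where

$$K_b = \begin{pmatrix} 1 & \cdots & 1 \\ -1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & -1 \end{pmatrix}_{b \times (b-1)}.$$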
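For this balanced one-way case, the recursion (3.14) can be carried out directly, because the columns of $C_0$ produce the contrasts $y_{i1} - y_{ij}$ and the columns of $C_1$ produce the group totals. The following is a minimal sketch under that assumption; the function names are hypothetical, and the REML step uses the standard closed form for the balanced one-way model rather than anything specific to the text.

```python
import random

def reml_one_way(y):
    """Gaussian REML estimates (mu, sigma2, tau2) for balanced one-way data."""
    a, b = len(y), len(y[0])
    means = [sum(row) / b for row in y]
    grand = sum(means) / a
    sse = sum((v - m) ** 2 for row, m in zip(y, means) for v in row)
    ssa = b * sum((m - grand) ** 2 for m in means)
    mse, msa = sse / (a * (b - 1)), ssa / (a - 1)
    if msa >= mse:
        return grand, (msa - mse) / b, mse
    return grand, 0.0, (sse + ssa) / (a * b - 1)

def emm_kurtoses(y):
    """EMM estimates (kappa0_hat, kappa1_hat) via the recursion (3.14)."""
    a, b = len(y), len(y[0])
    mu, sigma2, tau2 = reml_one_way(y)
    # r = 0: each column of C_0 = I_a (x) K_b holds one +1 and one -1.
    a00 = 2.0 * a * (b - 1)                       # ||C_0||_4^4
    b0 = 3.0 * a * (b - 1) * (2.0 * tau2) ** 2    # b_0(phi_hat)
    d0 = sum((row[0] - v) ** 4 for row in y for v in row[1:]) - b0
    kappa0 = d0 / a00
    # r = 1: each column of C_1 = I_a (x) 1_b sums one group.
    a10 = float(a * b)                            # ||C_1||_4^4
    a11 = float(a * b ** 4)                       # ||Z_1' C_1||_4^4
    b1 = 3.0 * a * (b * tau2 + b * b * sigma2) ** 2
    d1 = sum((sum(row) - b * mu) ** 4 for row in y) - b1
    kappa1 = d1 / a11 - (a10 / a11) * kappa0
    return kappa0, kappa1

# Illustrative run on simulated normal data, for which both kurtoses are 0.
rng = random.Random(11)
data = [[1.0 + al + rng.gauss(0.0, 1.0) for _ in range(3)]
        for al in (rng.gauss(0.0, 1.0) for _ in range(2000))]
print(emm_kurtoses(data))
```

Because the estimators are built from fourth powers, they are noisy unless the number of groups is large, which matches the requirement $a \to \infty$ for consistency.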

It follows from (3.14) that, in closed form,

$$\hat\kappa_0 = \frac{1}{2a(b-1)}\sum_{i=1}^{a}\sum_{j=2}^{b}(y_{i1} - y_{ij})^4 - 6\hat\tau^4, \tag{3.15}$$

$$\hat\kappa_1 = \frac{1}{ab^4}\sum_{i=1}^{a}(y_{i\cdot} - b\hat\mu)^4 - \frac{1}{2ab^3(b-1)}\sum_{i=1}^{a}\sum_{j=2}^{b}(y_{i1} - y_{ij})^4 - \frac{3}{b^2}\left(1 - \frac{2}{b}\right)\hat\tau^4 - \frac{6}{b}\hat\tau^2\hat\sigma^2 - 3\hat\sigma^4, \tag{3.16}$$

where $y_{i\cdot} = \sum_{j=1}^{b}y_{ij}$, and $\hat\mu$, $\hat\tau^2$, and $\hat\sigma^2$ are the REML estimators of $\mu$, $\tau^2$, and $\sigma^2$, respectively (Exercise 3.5). It is easy to show, either by verifying the conditions of Theorem 1 in Jiang (2003), or by arguing directly, that the EMM estimators are consistent provided only that $a \to \infty$. The only requirement for $b$ is that $b \ge 2$. The latter is reasonable because, otherwise, one cannot separate $\alpha$ and $\epsilon$ (Exercise 3.5).

3.3.2 Partially observed information

Another idea of estimating the ACVM has to do with the GEE method of estimating the covariance matrix of $\hat\beta$, discussed in Subsection 2.2. Recall the derivation of (2.37) from (2.36). The idea is to express a quantity of interest as the expectation of sums of random variables. We then dropped the expectation sign and used the expression inside the expectation, with any unknown parameters replaced by their consistent estimators, to approximate the quantity of interest. This is a useful technique, although it should not be "abused". Jiang (2010) (Example 2 in Preface) used the following example to illustrate when the strategy works, and when it does not.

Example 3.4. It might be thought that, with a large sample, one could always approximate the mean of a random quantity by the quantity itself. Before we get into this, let us first be clear on what is meant by a good approximation. An approximation is good if the error of the approximation is of lower order than the approximation itself. In notation, this means that, suppose one wishes to approximate $A$ by $B$; the approximation is good if $A - B = o(B)$, so that $A = B + A - B = B + o(B) = B\{1 + o(1)\}$. In other words, $B$ is the "main part" of $A$. Once the concept is clear, it turns out that, in some cases, the approximation technique mentioned above works, while in some other cases, it does not work. For example, suppose that $Y_1, \dots, Y_n$ are i.i.d. observations such that $\mu = E(Y_1) \ne 0$. In this case, one can approximate $E(\sum_{i=1}^{n}Y_i) = n\mu$ by simply removing the expectation

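sign, that is, by $\sum_{i=1}^{n}Y_i$. This is because the difference $\sum_{i=1}^{n}Y_i - n\mu = \sum_{i=1}^{n}(Y_i - \mu)$ is of the order $O_P(\sqrt{n})$ [provided that $\mathrm{var}(Y_1) < \infty$], which is lower than the order of $\sum_{i=1}^{n}Y_i$, which is $O_P(n)$. However, the technique completely fails if one considers, for example, approximating $E(\sum_{i=1}^{n}Y_i)^2$, where the $Y_i$'s are i.i.d. with mean 0. For simplicity, assume that $Y_i \sim N(0, 1)$. Then, we have $E(\sum_{i=1}^{n}Y_i)^2 = n$. On the other hand, $(\sum_{i=1}^{n}Y_i)^2 = n\chi_1^2$, where $\chi_1^2$ is a random variable with the $\chi^2$ distribution with one degree of freedom. Therefore, $(\sum_{i=1}^{n}Y_i)^2 - E(\sum_{i=1}^{n}Y_i)^2 = n(\chi_1^2 - 1)$, which is of the same order as $(\sum_{i=1}^{n}Y_i)^2$. Thus, $(\sum_{i=1}^{n}Y_i)^2$ is not a good approximation to its mean.

It can be seen that a "key to success" for this approximation technique is that the random quantity inside the expectation has to be a sum of random variables with "good behavior" (e.g., independence). In the context of likelihood inference, the technique is associated with a term called observed information. Consider the case of i.i.d. observations. In this case, the information contained in the entire data, $Y_1, \dots, Y_n$, about the parameter vector, $\theta$, is $\mathcal{I}(\theta) = nI(\theta)$, where $I(\theta)$ is the information contained in a single observation and, under regularity conditions, one has

$$\mathcal{I}(\theta) = -\sum_{i=1}^{n}E\left\{\frac{\partial^2}{\partial\theta\,\partial\theta'}\log f(Y_i|\theta)\right\} = -E\left\{\sum_{i=1}^{n}\frac{\partial^2}{\partial\theta\,\partial\theta'}\log f(Y_i|\theta)\right\}, \tag{3.17}$$

with $f(\cdot|\theta)$ denoting the pdf of $Y_i$. Using the approximation technique, one removes the expectation sign in (3.17) to get an approximation to $\mathcal{I}(\theta)$. The approximation is not yet an estimate, because $\theta$ is unknown. Thus, as a last step, one replaces $\theta$ in the latest approximation by $\hat\theta$, the MLE, to get

$$\hat{\mathcal{I}}(\theta) = -\sum_{i=1}^{n}\frac{\partial^2}{\partial\theta\,\partial\theta'}\log f(Y_i|\hat\theta). \tag{3.18}$$

The estimator (3.18) is called observed information (matrix). Alternatively, especially if $I(\theta)$ has an analytic expression, one can replace $\theta$ in this expression by, say, the MLE $\hat\theta$ to get

$$\tilde{\mathcal{I}}(\theta) = nI(\hat\theta). \tag{3.19}$$

The estimator (3.19) is called estimated information (matrix). See, for example, Efron and Hinkley (1978), for a discussion and comparison of the two estimators in the i.i.d. case.

However, neither the estimated nor the observed information methods apply to REML estimation. First, as noted, $I_R(\theta)$ is not a function of $\theta$, the vector of variance components, but involves additional parameters, namely the higher moments. Secondly, as it turns out, one cannot express $I_R(\theta)$ as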

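expectation of a sum of random variables that are observable and have good behavior. Note that, under a LMM, the observations are not independent, so the above derivation for the i.i.d. case does not carry through. Nevertheless, it was found that a combination of these two techniques would work just fine. We first use an example to illustrate.

Example 3.2 (continued). It can be shown that the variance of (3.9) can be expressed as $\mathrm{var}(\partial l_R/\partial\sigma_e^2) = S_1 + S_2$, where

$$S_1 = E\left\{a_\cdot\sum_{i,j}\xi_{ij}^4 - a_1\sum_i\left(\sum_j\xi_{ij}\right)^4 - a_2\sum_j\left(\sum_i\xi_{ij}\right)^4\right\}, \tag{3.20}$$

where $\xi_{ij} = y_{ij} - \mu$, $a_\cdot = a_0 + a_1 + a_2$, and $a_j$, $j = 0, 1, 2$, $S_2$ are functions of $\theta$ (Exercise 3.6). Thus, $S_2$ can be estimated by replacing $\theta$ by $\hat\theta$, the REML estimator. As for $S_1$, it can be estimated using the observed information idea, that is, by removing the expectation sign, and replacing $\mu$ and $\theta$ involved in the expression inside the expectation by their REML estimators.

To describe the method in general, denote the vector of variance components under the mixed ANOVA model by $\theta = (\theta_r)_{0 \le r \le s}$, where $\theta_0 = \tau^2$ and $\theta_r = \gamma_r$, $1 \le r \le s$. Then, it can be shown that $\partial l_R/\partial\theta_r = \xi'H_r\xi - h_r$, $0 \le r \le s$, where $\xi = y - X\beta$, $H_0 = P/2\theta_0$, with $P$ given below (1.4), and $H_r = (\theta_0/2)PZ_rZ_r'P$, $1 \le r \le s$; $h_0 = (n - p)/2\theta_0$, and $h_r = (\theta_0/2)\mathrm{tr}(PZ_rZ_r')$, $1 \le r \le s$. Also, with a slight abuse of the notation, let $z_{ir}'$ and $z_{rl}$ be the $i$th row and $l$th column of $Z_r$, respectively, $0 \le r \le s$, where $Z_0 = I_n$. Define $\Gamma(i_1, i_2) = \sum_{r=0}^{s}\gamma_r(z_{i_1r} \cdot z_{i_2r})$. Here, the dot product of vectors $a_1, \dots, a_k$ of the same dimension is defined as $a_1 \cdot a_2 \cdots a_k = \sum_l a_{1l}a_{2l}\cdots a_{kl}$. Also let $\alpha_0 = \epsilon$ and recall that $m_r$ is the dimension of $\alpha_r$, $0 \le r \le s$ (so, in particular, $m_0 = n$), and that $V = \mathrm{Var}(y) = \theta_0(I_n + \sum_{r=1}^{s}\theta_rZ_rZ_r')$. We begin with some expressions for $\mathrm{cov}(\xi_{i_1}\xi_{i_2}, \xi_{i_3}\xi_{i_4})$, where $\xi_i$ is the $i$th component of $\xi$, and $\mathrm{cov}(\partial l_R/\partial\theta_q, \partial l_R/\partial\theta_r)$, the $(q, r)$ element of $I_1$ [see Jiang (2005)].

Lemma 3.2. We have, for any $1 \le i_j \le n$, $j = 1, 2, 3, 4$,

$$\mathrm{cov}(\xi_{i_1}\xi_{i_2}, \xi_{i_3}\xi_{i_4}) = \theta_0^2\{\Gamma(i_1, i_3)\Gamma(i_2, i_4) + \Gamma(i_1, i_4)\Gamma(i_2, i_3)\} + \sum_{r=0}^{s}\kappa_r\,z_{i_1r}\cdots z_{i_4r}, \tag{3.21}$$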

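where $z_{i_1r}\cdots z_{i_4r} = z_{i_1r} \cdot z_{i_2r} \cdot z_{i_3r} \cdot z_{i_4r}$; and, for any $0 \le q, r \le s$,

$$\mathrm{cov}\left(\frac{\partial l_R}{\partial\theta_q}, \frac{\partial l_R}{\partial\theta_r}\right) = 2\,\mathrm{tr}(H_qVH_rV) + \sum_{t=0}^{s}\kappa_t\sum_{l=1}^{m_t}(z_{tl}'H_qz_{tl})(z_{tl}'H_rz_{tl}). \tag{3.22}$$

Let $f_1, \dots, f_L$ be the different nonzero functional values of

$$f(i_1, \dots, i_4) = \sum_{r=0}^{s}\kappa_r\,z_{i_1r}\cdots z_{i_4r}. \tag{3.23}$$

Note that this is the second term on the right side of (3.21). Here, functional value means $f(i_1, \dots, i_4)$ viewed as a function of $\kappa = (\kappa_r)_{0 \le r \le s}$. For example, $\kappa_0 + \kappa_1$ and $\kappa_2 + \kappa_3$ are different functions (even if their values may be the same for some $\kappa$). Also, let 0 denote the zero function (of $\kappa$), and let $H_{r,i,j}$ denote the $(i, j)$ element of $H_r$. Define $A_l = \{(i_1, \dots, i_4) : f(i_1, \dots, i_4) = f_l\}$, $1 \le l \le L$, and

$$c_{q,r,l} = \frac{1}{|A_l|}\sum_{(i_1, \dots, i_4) \in A_l}H_{q,i_1,i_2}H_{r,i_3,i_4}, \tag{3.24}$$

where $|\cdot|$ denotes cardinality. Note that $c_{q,r,l}$ depends only on $\theta$. Define $c_{q,r}(i_1, \dots, i_4) = c_{q,r,l}$, if $f(i_1, \dots, i_4) = f_l$, $1 \le l \le L$. The following result was proved in Jiang (2005). Let $I_{1,qr}$ denote the $(q, r)$ element of $I_1$.

Theorem 3.2. For any non-Gaussian mixed ANOVA model, we have, for any $0 \le q, r \le s$, $I_{1,qr} = I_{1,1,qr} + I_{1,2,qr}$, where

$$I_{1,1,qr} = E\left\{\sum_{f(i_1, \dots, i_4) \ne 0}c_{q,r}(i_1, \dots, i_4)\,\xi_{i_1}\cdots\xi_{i_4}\right\},$$

$$I_{1,2,qr} = 2\,\mathrm{tr}(H_qVH_rV) - 3\theta_0^2\sum_{f(i_1, \dots, i_4) \ne 0}c_{q,r}(i_1, \dots, i_4)\,\Gamma(i_1, i_3)\Gamma(i_2, i_4).$$

Theorem 3.2 shows that any element of $I_1$ can be expressed as the sum of two terms. The first term is expressed as the expectation of a sum with the summands being products of four $\xi_i$'s and a function of $\theta$; the second term is a function of $\theta$ only. The first term, $I_{1,1,qr}$, can be estimated the same way as the observed information, that is, by removing the expectation sign, and replacing $\beta$ (in the $\xi$'s) and $\theta$ by $\hat\beta$ and $\hat\theta$, the REML estimators, respectively. Denote the estimator by $\hat I_{1,1,qr}$. The second term, $I_{1,2,qr}$, can be estimated by replacing $\theta$ by $\hat\theta$. Denote this estimator by $\hat I_{1,2,qr}$. An estimator of $I_{1,qr}$ is then given by $\hat I_{1,qr} = \hat I_{1,1,qr} + \hat I_{1,2,qr}$. Because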

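the estimator consists partially of an "observed" term and partially of an estimated term, it is called partially observed information, or POI. Jiang (2005) gives sufficient conditions under which the POI matrix, $\hat I_1 = (\hat I_{1,qr})_{0 \le q, r \le s}$, consistently estimates $I_1$; hence $\hat\Sigma_R = \hat I_2^{-1}\hat I_1\hat I_2^{-1}$ consistently estimates $\Sigma_R$ of (3.5), where $\hat I_2$ is $I_2$ with $\theta$ replaced by $\hat\theta$, the REML estimator. We consider an example.

Example 3.1 (continued). It is easy to show that $f(i_1j_1, \dots, i_4j_4) = 0$, if not $i_1 = \cdots = i_4$; $\kappa_1$, if $i_1 = \cdots = i_4$ but not $j_1 = \cdots = j_4$; and $\kappa_0 + \kappa_1$, if $i_1 = \cdots = i_4$ and $j_1 = \cdots = j_4$. Thus, $L = 2$ [note that $L$ is the number of different nonzero functional values of $f(i_1j_1, \dots, i_4j_4)$]. Define the following functions of $\theta = (\tau^2, \gamma)'$, where $\gamma = \sigma^2/\tau^2$: $t_0 = 1 - \gamma/(1 + \gamma b) - 1/\{(1 + \gamma b)ab\}$, $t_1 = (a - 1)b/\{a(1 + \gamma b)\}$, and $t_3 = \{b(1 + \gamma b)^2 - (1 + \gamma)^2\}/(b^3 - 1)$. Then, the POIs are given by $\hat I_{1,qr} = \hat I_{1,1,qr} + \hat I_{1,2,qr}$, $q, r = 0, 1$, where

$$\hat I_{1,1,00} = \frac{\hat t_1^2 - \hat t_0^2 b}{4\hat\tau^8 b(b^3 - 1)}\left\{\sum_i\left(\sum_j\hat\xi_{ij}\right)^4 - \sum_{i,j}\hat\xi_{ij}^4\right\} + \frac{\hat t_0^2}{4\hat\tau^8}\sum_{i,j}\hat\xi_{ij}^4,$$

$$\hat I_{1,1,01} = \frac{(a - 1)(\hat t_1 b - \hat t_0)}{4\hat\tau^6(1 + \hat\gamma b)^2 a(b^3 - 1)}\left\{\sum_i\left(\sum_j\hat\xi_{ij}\right)^4 - \sum_{i,j}\hat\xi_{ij}^4\right\} + \frac{(a - 1)\hat t_0}{4\hat\tau^6(1 + \hat\gamma b)^2 a}\sum_{i,j}\hat\xi_{ij}^4,$$

$$\hat I_{1,1,11} = \frac{(a - 1)^2}{4\hat\tau^4(1 + \hat\gamma b)^4 a^2}\sum_i\left(\sum_j\hat\xi_{ij}\right)^4;$$

$$\hat I_{1,2,00} = \frac{1}{2\hat\tau^4}\left[ab - 1 - \frac{3}{2}ab\,\hat t_0^2\{(1 + \hat\gamma)^2 - \hat t_3\} - \frac{3}{2}a\,\hat t_1^2\hat t_3\right],$$

$$\hat I_{1,2,01} = \frac{(a - 1)b}{2\hat\tau^2(1 + \hat\gamma b)}\left[1 - \frac{3}{2}\cdot\frac{(\hat t_1 b - \hat t_0)\hat t_3 + (1 + \hat\gamma)^2\hat t_0}{1 + \hat\gamma b}\right],$$

$$\hat I_{1,2,11} = -\frac{(a - 1)(a - 3)b^2}{4a(1 + \hat\gamma b)^2},$$

$\hat\xi_{ij} = y_{ij} - \bar y_{\cdot\cdot}$, and $\hat t_j$, $j = 0, 1, 3$ are $t_j$, $j = 0, 1, 3$, respectively, with $\theta$ replaced by $\hat\theta$, the REML estimator.

Note on application: It is seen from Theorem 3.2 that POI or, more specifically, $I_{1,1}$, is based on first considering $\xi = y - X\beta$ and then replacing the $\beta$ in $\xi$ by $\hat\beta$. On the other hand, $I_{1,2}$ and $I_2$ are functions of $\theta$, which are estimated by replacing $\theta$ by $\hat\theta$. Thus, practically, one can derive the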

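POI formulae by the following A-B-C steps:

(A) Obtain the REML estimators of $\theta$ and $\beta$.
(B) Replace $y$ by $\hat\xi = y - X\hat\beta$, then derive the POI formulae based on the same LMM but without the fixed effects.
(C) Once the formulae in (B) are derived, make sure to use the REML estimator $\hat\theta$ from (A) [not the one based on the LMM without the fixed effects that is used in (B)].

It can be seen that this procedure was followed in Example 3.1 (continued) discussed above.

3.3.3 Hypothesis testing: A simulated example

As an application, we consider using the POI in testing a hypothesis regarding the variance components, or dispersion parameters. The example concerns Example 3.1 (continued), discussed above. The test is considered robust in the sense that it does not require normality. Suppose that one wishes to test the hypothesis

$$H_0: \gamma = 1, \tag{3.25}$$

that is, the variance contribution due to the random effects is the same as that due to the errors. For example, the null hypothesis is equivalent to $H_0: h^2 = 2$, where $h^2 = 4\gamma/(1 + \gamma)$ is a quantity of genetic interest, called heritability. The null hypothesis can be expressed as $H_0: K'\theta = 1$ with $K = (0, 1)'$. Furthermore, we have $K'\Sigma_RK = \Sigma_{R,11}$, which is the asymptotic variance of $\hat\gamma$, the REML estimator of $\gamma$. Thus, a test statistic is $\hat\chi^2 = (\hat\gamma - 1)^2/\hat\Sigma_{R,11}$, where $\hat\Sigma_{R,11}$ is the POI of $\Sigma_{R,11}$. It is easy to show that

$$\hat\Sigma_{R,11} = \frac{\hat I_{1,11}\hat I_{2,00}^2 - 2\hat I_{1,01}\hat I_{2,00}\hat I_{2,01} + \hat I_{1,00}\hat I_{2,01}^2}{(\hat I_{2,00}\hat I_{2,11} - \hat I_{2,01}^2)^2}, \tag{3.26}$$

where $\hat I_{1,qr} = \hat I_{1,1,qr} + \hat I_{1,2,qr}$, $q, r = 0, 1$, and $\hat I_{1,j,qr}$, $j = 1, 2$ are given earlier, but with $\hat\gamma$ replaced by 1, its value under $H_0$. Furthermore, we have $\hat I_{2,00} = -(ab - 1)/2\hat\tau^4$, $\hat I_{2,01} = -(a - 1)b/2\hat\tau^2(1 + \hat\gamma b)$, $\hat I_{2,11} = -(a - 1)b^2/2(1 + \hat\gamma b)^2$, again with $\hat\gamma$ replaced by 1, where $\hat\tau^2$ is the REML estimator of $\tau^2$ under the null, given by

$$\hat\tau^2 = \frac{1}{ab - 1}\left(\mathrm{SSE} + \frac{\mathrm{SSA}}{b + 1}\right), \tag{3.27}$$

where $\mathrm{SSE} = \sum_{i=1}^{a}\sum_{j=1}^{b}(y_{ij} - \bar y_{i\cdot})^2$, $\mathrm{SSA} = b\sum_{i=1}^{a}(\bar y_{i\cdot} - \bar y_{\cdot\cdot})^2$ with $\bar y_{i\cdot} = b^{-1}\sum_{j=1}^{b}y_{ij}$ and $\bar y_{\cdot\cdot} = (ab)^{-1}\sum_{i=1}^{a}\sum_{j=1}^{b}y_{ij}$. The asymptotic null distribution is $\chi_1^2$.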
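The test of (3.25) can be assembled from the closed-form POI expressions of Example 3.1 (continued) together with (3.26) and (3.27). The following is a minimal sketch, assuming balanced one-way data stored as a list of equal-length rows; the function name `rdt_one_way` is hypothetical, the unrestricted $\hat\gamma$ uses the standard closed-form REML estimators for the balanced case, and the p-value relies on the identity $P(\chi_1^2 > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math
import random

def rdt_one_way(y):
    """POI-based test of H0: gamma = 1 for balanced one-way data.

    Returns (chi2_stat, p_value), with the p-value from chi-square(1).
    """
    a, b = len(y), len(y[0])
    means = [sum(r) / b for r in y]
    grand = sum(means) / a
    sse = sum((v - m) ** 2 for r, m in zip(y, means) for v in r)
    ssa = b * sum((m - grand) ** 2 for m in means)
    # Unrestricted REML estimate of gamma = sigma^2 / tau^2 (zero on the boundary).
    mse, msa = sse / (a * (b - 1)), ssa / (a - 1)
    gamma_hat = max(msa - mse, 0.0) / (b * mse)
    # Null values: gamma = 1 and tau^2 from (3.27).
    g = 1.0
    tau2 = (sse + ssa / (b + 1)) / (a * b - 1)
    c = 1 + g * b
    t0 = 1 - g / c - 1 / (c * a * b)
    t1 = (a - 1) * b / (a * c)
    t3 = (b * c ** 2 - (1 + g) ** 2) / (b ** 3 - 1)
    # Fourth-order sums of the residuals xi_ij = y_ij - grand mean.
    s_cell = sum((v - grand) ** 4 for r in y for v in r)
    s_grp = sum(sum(v - grand for v in r) ** 4 for r in y)
    # "Observed" parts I_{1,1,qr}.
    i11_00 = ((t1 ** 2 - t0 ** 2 * b) / (4 * tau2 ** 4 * b * (b ** 3 - 1)) * (s_grp - s_cell)
              + t0 ** 2 / (4 * tau2 ** 4) * s_cell)
    i11_01 = ((a - 1) * (t1 * b - t0) / (4 * tau2 ** 3 * c ** 2 * a * (b ** 3 - 1))
              * (s_grp - s_cell)
              + (a - 1) * t0 / (4 * tau2 ** 3 * c ** 2 * a) * s_cell)
    i11_11 = (a - 1) ** 2 / (4 * tau2 ** 2 * c ** 4 * a ** 2) * s_grp
    # "Estimated" parts I_{1,2,qr}.
    i12_00 = (a * b - 1 - 1.5 * a * b * t0 ** 2 * ((1 + g) ** 2 - t3)
              - 1.5 * a * t1 ** 2 * t3) / (2 * tau2 ** 2)
    i12_01 = (a - 1) * b / (2 * tau2 * c) * (
        1 - 1.5 * ((t1 * b - t0) * t3 + (1 + g) ** 2 * t0) / c)
    i12_11 = -(a - 1) * (a - 3) * b ** 2 / (4 * a * c ** 2)
    i1_00, i1_01, i1_11 = i11_00 + i12_00, i11_01 + i12_01, i11_11 + i12_11
    # Elements of I_2 under the null.
    i2_00 = -(a * b - 1) / (2 * tau2 ** 2)
    i2_01 = -(a - 1) * b / (2 * tau2 * c)
    i2_11 = -(a - 1) * b ** 2 / (2 * c ** 2)
    # Sandwich element (3.26).
    sigma_r11 = ((i1_11 * i2_00 ** 2 - 2 * i1_01 * i2_00 * i2_01 + i1_00 * i2_01 ** 2)
                 / (i2_00 * i2_11 - i2_01 ** 2) ** 2)
    if sigma_r11 <= 0:
        raise ValueError("POI variance estimate is not positive")
    stat = (gamma_hat - 1.0) ** 2 / sigma_r11
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Illustrative run on simulated normal data with gamma = 1 (H0 true).
rng = random.Random(3)
sample = [[1.0 + al + rng.gauss(0.0, 1.0) for _ in range(2)]
          for al in (rng.gauss(0.0, 1.0) for _ in range(400))]
print(rdt_one_way(sample))
```

The guard on `sigma_r11` is a design choice of this sketch: the POI estimate is not guaranteed to be positive in small samples, and the text does not say how such cases were handled.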

Table 3.1 RDT versus Jackknife - Size

                          Simulated Size
Nominal Level  Method     I-i    I-ii   I-iii  II-i   II-ii  II-iii
0.01           RDT        0.022  0.026  0.028  0.011  0.013  0.015
               Jackknife  0.010  0.014  0.020  0.009  0.011  0.013
0.05           RDT        0.070  0.078  0.091  0.054  0.057  0.063
               Jackknife  0.052  0.053  0.068  0.053  0.053  0.060
0.10           RDT        0.123  0.132  0.151  0.106  0.108  0.114
               Jackknife  0.099  0.103  0.122  0.104  0.103  0.109

Jiang (2005) carried out a simulation study on the performance of the dispersion test proposed above, which we call robust dispersion test (RDT), and compared it with an alternative delete-group jackknife method proposed by Arvesen (1969). The latter applies to cases where data can be divided into i.i.d. groups, such as the current situation. We refer to this test as jackknife. More specifically, Arvesen and Schmitz (1970) proposed to use the jackknife estimator with a logarithm transformation, and this is the method that was compared.

We are interested in the situation when $a$ is increasing while $b$ remains fixed. Therefore, the following sample size configurations are considered: (I) $a = 50$, $b = 2$; (II) $a = 400$, $b = 2$. (I) represents a case of moderate sample size while (II), a case of large sample. In addition, we would like to investigate different cases in which normality and symmetry may or may not hold. Therefore, the following combinations of distributions for the random effects and errors are considered: (i) Normal-Normal; (ii) DE-NM($-2$, 2, 0.5), where DE represents the double exponential distribution and NM($\mu_1$, $\mu_2$, $\rho$) the mixture of two normal distributions with means $\mu_1$, $\mu_2$, variance one, and mixing probability $\rho$ [i.e., the probabilities $1 - \rho$ and $\rho$ correspond to $N(\mu_1, 1)$ and $N(\mu_2, 1)$, respectively]; and (iii) CE-NM($-4$, 1, 0.2), where CE represents the centralized exponential distribution, i.e., the distribution of $X - 1$, where $X \sim$ Exponential(1). Note that in case (ii) the distributions are not normal but symmetric, while in case (iii) the distributions are not even symmetric, a further departure from normality. Also note that all these distributions have mean zero. They are standardized so that the distributions of the random effects and errors have variances $\sigma^2$ and $\tau^2$, respectively. The true value of $\mu$ is set to 1.0. The true value of $\tau^2$ is also chosen as 1.0. Tables 3.1–3.4 are taken from Jiang (2005), in which the simulated size and power, at the nominal levels $\alpha = 0.01$, 0.05, and 0.10, are reported for testing the null hypothesis against the alternative $H_1: \gamma \ne 1$.

Table 3.2 RDT versus Jackknife - Power (Nominal Level 0.01)

                        Simulated Power
Alternative  Method     I-i    I-ii   I-iii  II-i   II-ii  II-iii
γ1 = 0.2     RDT        0.506  0.616  0.468  1.000  1.000  1.000
             Jackknife  0.487  0.463  0.454  1.000  1.000  1.000
γ1 = 0.5     RDT        0.112  0.164  0.122  0.914  0.891  0.793
             Jackknife  0.108  0.121  0.137  0.921  0.866  0.787
γ1 = 2.0     RDT        0.354  0.256  0.221  0.995  0.971  0.913
             Jackknife  0.196  0.118  0.072  0.993  0.968  0.887
γ1 = 5.0     RDT        0.991  0.954  0.900  1.000  1.000  1.000
             Jackknife  0.954  0.876  0.715  1.000  1.000  1.000

Table 3.3 RDT versus Jackknife - Power (Nominal Level 0.05)

                        Simulated Power
Alternative  Method     I-i    I-ii   I-iii  II-i   II-ii  II-iii
γ = 0.2      RDT        0.747  0.807  0.745  1.000  1.000  1.000
             Jackknife  0.728  0.709  0.668  1.000  1.000  1.000
γ = 0.5      RDT        0.283  0.336  0.286  0.980  0.966  0.917
             Jackknife  0.277  0.271  0.275  0.981  0.958  0.912
γ = 2.0      RDT        0.532  0.424  0.369  0.999  0.993  0.973
             Jackknife  0.411  0.317  0.223  0.999  0.993  0.970
γ = 5.0      RDT        0.997  0.984  0.956  1.000  1.000  1.000
             Jackknife  0.991  0.971  0.903  1.000  1.000  1.000

Table 3.4 RDT versus Jackknife - Power (Nominal Level 0.10)

                        Simulated Power
Alternative  Method     I-i    I-ii   I-iii  II-i   II-ii  II-iii
γ1 = 0.2     RDT        0.844  0.875  0.807  1.000  1.000  1.000
             Jackknife  0.829  0.810  0.776  1.000  1.000  1.000
γ1 = 0.5     RDT        0.405  0.442  0.396  0.991  0.983  0.954
             Jackknife  0.398  0.382  0.372  0.991  0.979  0.950
γ1 = 2.0     RDT        0.633  0.564  0.462  1.000  0.997  0.987
             Jackknife  0.540  0.453  0.350  1.000  0.997  0.986
γ1 = 5.0     RDT        0.999  0.992  0.975  1.000  1.000  1.000
             Jackknife  0.998  0.988  0.954  1.000  1.000  1.000

Overall, the jackknife appears to be more accurate in terms of the size, especially when $a$ is relatively small (Case I). On the other hand, the simulated powers for RDT are higher at most alternatives, especially when $a$ is relatively small (Case I). Also note that the jackknife with the logarithmic transformation is specifically designed for this kind of model where the observations are divided into independent groups, while the POI method is for a much richer class of LMM where the observations may not be divided into independent groups. There are examples of the latter kind in which tests about the dispersion parameters are of interest, but we shall leave this

to the next chapter.

3.3.4 Consistency of POI

In this subsection, we discuss consistency of the estimated ACVM obtained via the POI method. It should be noted that the definition of REML estimator in non-Gaussian LMM differs slightly according to several authors. In Richardson and Welsh (1994), REML estimator is defined as the solution to the (Gaussian) REML equation; in Jiang (1996), REML estimator is defined as the solution to the REML equation plus the requirement that it belongs to the parameter space; in Jiang (1997), REML estimator is defined as the maximizer of the Gaussian restricted likelihood function. In fact, the latter shows that, for a balanced mixed ANOVA model, such a maximizer is a consistent estimator of $\theta$; for a general LMM, it shows that a sieved maximizer is consistent. Note that, from a practical point of view, a sieve puts no restriction on the maximization, because the maximizer is always within a sieve that satisfies the conditions [of Jiang (1997), with a suitable constant]. Therefore, in the following theorem, the REML estimator is understood as the maximizer of the Gaussian restricted likelihood in the sense of Jiang (1997) (with the sieves in the unbalanced case). This eliminates any possible confusion on which solution, or root, to the REML equation to use when there are multiple roots. See, for example, Searle et al. (1992, sec. 8.1) for discussion regarding the multiple-root problem. The following theorem is proved in Jiang (2005).

Theorem 3.3. Suppose that (i) $\sigma_r^2 > 0$, $0 \le r \le s$; … $> 0$, $0 \le j \le s$ such that $G^{-1}GG^{-1}$ is bounded from above as well as from below; and $\lambda_{\min}(X'V^{-1}X) \to \infty$; (iv) $(g_jg_k)^{-1}\sum_{l=1}^{L}h_l|c_{j,k,l}|$, $0 \le j, k \le s$ are bounded, and $(g_jg_k)^{-2}\sum_{l_1,l_2=1}^{L}h_{l_1,l_2}|c_{j,k,l_1}c_{j,k,l_2}| \to 0$, $0 \le j, k \le s$; (v) $(g_jg_k)^{-1}g_{j,k}(\delta) \to 0$, and $(g_jg_k)^{-1}\sum_{l=1}^{L}h_ld_{j,k,l}(\delta) \to 0$, $0 \le j, k \le s$, uniformly in $N$ as $\delta \to 0$. Then, the POI $\hat I_1$ and POI estimator $\hat\Sigma_R$ are both consistent.

Remarks. The first part of condition (iii) (regarding $G$) is equivalent to the AI$^4$ condition of Jiang (1996, 1997), which, together with $\sigma_r^2 > 0$, $0 \le r \le s$, guarantees the consistency of the REML estimator $\hat\theta$. Furthermore, condition (iii) ensures the consistency of $\hat\beta = (X'\hat V^{-1}X)^{-1}X'\hat V^{-1}y$, where $\hat V$ is $V$ with $\theta$ replaced by $\hat\theta$. Finally, it can be shown (Exercise 3.7) that the first part of condition (v) [regarding $g_{j,k}(\delta)$] is equivalent to that $\tilde G$ is

uniformly continuous at $\theta$.

3.4 Real-data example

We use a real-data example to illustrate the potential difference in measure of uncertainty with or without taking into account the non-normality. The data were collected from a Statistics class taught at the University of California, Davis. Partial data are shown in Table 3.5. The first two columns are gender (F for female; M for male) and midterm (MD) scores, in percentage (out of a total of 50 points). The rest of the columns are HW scores, also in percentage. There were 112 students enrolled in the class. Each student had a midterm score and six HW scores. The NAs in the data set correspond to HWs that were either missing (e.g., not turned in), or intentionally dropped when computing the overall HW scores. According to the grading policy, the lowest two HW scores were dropped when computing the overall HW scores. The midterm took place near the midpoint of the quarter after the 4th HW (so HW 1, 2, 3, 4 were before the midterm and the rest of the HWs after the midterm). Initially, there was interest in knowing the potential association between the HWs and midterm, and the impact of the midterm on how seriously the students completed the HWs. There was also interest in knowing whether there was a difference between male and female students in completing the HWs. To address such concerns, the following simple LMM is considered. Let $y_{ij}$ denote the $j$th HW score (skip the NAs) of the $i$th student. It is assumed that

$$y_{ij} = \beta_0 + \beta_1x_{i,1} + \beta_2x_{i,2} + \beta_3x_{ij,3} + v_i + e_{ij}, \tag{3.28}$$

$i = 1, \dots, 112$, $j = 1, \dots, 6$, where $x_{i,1}$ is the indicator for gender (1 for F; 0 for M); $x_{i,2}$ is the midterm score; $x_{ij,3}$ is an indicator of whether the HW score is before or after the midterm (0 for before; 1 for after). Note that $x_{i,1}$, $x_{i,2}$ do not depend on $j$, but $x_{ij,3}$ depends on both $i$ and $j$. Furthermore, the $\beta$'s are unknown fixed effects; $v_i$ is a student-specific random effect, and $e_{ij}$ is a random error. It is assumed that the random effects are i.i.d. with mean 0 and variance $\sigma_v^2$; the errors are i.i.d. with mean 0 and variance $\sigma_e^2$; and the random effects and errors are independent.

Note that the data are clearly not normal. For example, the HW scores are percentages out of a total of 100 points for each HW. Such numbers are between 0 and 1, and only up to two decimals. Also, it is seen from Table 3.5 that many numbers are exactly 1, and some exactly 0. Thus, the proposed LMM is a (truly) non-Gaussian LMM.
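To make the indexing in (3.28) concrete, the following minimal sketch turns wide records (gender, midterm, HW1 to HW8) into one row per non-missing HW score, with the covariates $x_{i,1}$, $x_{i,2}$, $x_{ij,3}$ attached. The two records are the first and third rows of Table 3.5; the function name, the dictionary keys, and the use of `None` for NA are illustrative assumptions.

```python
NA = None  # marker for a missing or dropped HW score

# (gender, midterm score, [HW1, ..., HW8]) for two students from Table 3.5.
records = [
    ("M", 0.66, [1.00, 0.99, 0.96, 0.96, NA, 1.00, 1.00, NA]),
    ("F", 0.82, [0.95, 0.94, 0.98, NA, 1.00, 1.00, NA, 1.00]),
]

def long_format(records, n_before=4):
    """Return one dict per observed HW score with the covariates of (3.28).

    n_before is the number of HWs assigned before the midterm (HW 1-4 here).
    """
    rows = []
    for i, (gender, midterm, hws) in enumerate(records, start=1):
        x1 = 1 if gender == "F" else 0          # gender indicator
        for k, score in enumerate(hws, start=1):
            if score is NA:
                continue                        # skip the NAs
            x3 = 0 if k <= n_before else 1      # before / after the midterm
            rows.append({"student": i, "hw": k, "y": score,
                         "x1": x1, "x2": midterm, "x3": x3})
    return rows

for row in long_format(records):
    print(row)
```

Each student contributes six rows, so the index $j = 1, \dots, 6$ in (3.28) runs over the retained HWs rather than over the original HW numbers.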

Table 3.5 Partial Data

Gender  MD    HW1   HW2   HW3   HW4   HW5   HW6   HW7   HW8
M       0.66  1.00  0.99  0.96  0.96  NA    1.00  1.00  NA
M       0.68  NA    0.00  0.92  NA    0.78  0.94  0.85  1.00
F       0.82  0.95  0.94  0.98  NA    1.00  1.00  NA    1.00
F       0.82  0.85  NA    0.90  1.00  NA    0.88  0.95  0.96
F       0.72  NA    NA    1.00  1.00  1.00  1.00  1.00  1.00
F       0.80  0.90  0.98  0.96  1.00  0.97  1.00  NA    NA

A question of interest is whether it is necessary to include the student-specific random effects, $v_i$, in (3.28). The Gaussian REML estimator of the ratio, $\gamma = \sigma_v^2/\sigma_e^2$, is $\hat\gamma = 0.193$. On the other hand, the diagonal element of the normality-based ACVM, (3.8), is $2.79 \times 10^{-3}$, leading to a standard error (s.e.) for $\hat\gamma$ equal to 0.053. Thus, the ratio $\hat\gamma/\mathrm{s.e.}(\hat\gamma) = 3.65$, which is statistically significant (for testing the hypothesis that $\gamma$ is zero), say, at 10% level. However, due to the non-normality concern noted above, one would use the POI method to estimate the ACVM. The resulting diagonal element corresponding to $\gamma$ is $2.60 \times 10^{-2}$, leading to a much larger s.e. for $\hat\gamma$ equal to 0.161, and the ratio $\hat\gamma/\mathrm{s.e.}(\hat\gamma) = 1.19$. The result is no longer significant at 10% level. Based on the latter result, one would exclude the random effects from (3.28), and thus simplify the LMM to a linear regression (LR) model.

Note that the analysis results based on the LMM and LR models would lead to quite different conclusions. For example, both analyses find that the coefficient, $\beta_1$, corresponding to gender, is insignificant (p-value close to 1 for both). On the other hand, the LMM analysis finds that $\beta_2$ is insignificant (p-value ≈ 0.211) while $\beta_3$ is significant at 10% level (p-value ≈ 0.085); the LR analysis finds the opposite: $\beta_2$ is significant at 10% level (p-value ≈ 0.090) while $\beta_3$ is insignificant (p-value ≈ 0.144).

3.5 Exercises

3.1. Show that, in the balanced case of Example 3.1, one has a special case of the balanced mixed ANOVA model defined in §3.1.1. Also show that Example 3.2 is a special case of the balanced mixed ANOVA model.

3.2. Show that the REML equations derived under the multivariate $t$-distribution (see Section 3.2) are equivalent to those derived under the multivariate normal distribution.

3.3. This exercise has two parts.

a. Show that, in Example 3.2 (continued) in Section 3.3, $E(\partial^2 l_R/\partial\sigma_e^4)$, $E(\partial^2 l_R/\partial\sigma_e^2\partial\sigma_u^2)$ and $E(\partial^2 l_R/\partial\sigma_e^2\partial\sigma_v^2)$ are functions of $\theta = (\sigma_e^2, \sigma_u^2, \sigma_v^2)'$ regardless of normality.
b. Show that $\mathrm{var}(\partial l_R/\partial\sigma_e^2)$ involves the fourth moments of the random effects and errors, but not the third moments.

3.4. This exercise has two parts.
a. Show that, under (3.13), one has $a_{rr} > 0$, $0 \le r \le s$.
b. Show that, in Subsection 3.3.1, one has $E\{M(\beta, \varphi, \kappa, y)\} = 0$, if $\beta, \varphi, \kappa$ are the true parameter vectors.

3.5. This exercise is associated with Example 3.1 (continued) in Subsection 3.3.1.
a. Verify (3.15) and (3.16).
b. Show that the EMM estimators given by (3.15) and (3.16) are consistent provided that $a \to \infty$ and $b \ge 2$.
c. Explain why the condition $b \ge 2$ in part b above is necessary.

3.6. Consider Example 3.2 (continued) in Subsection 3.3.2. Show that the variance of (3.9) can be expressed as $\mathrm{var}(\partial l_R/\partial\sigma_e^2) = S_1 + S_2$, where $S_1$ is given by (3.20), $\xi_{ij} = y_{ij} - \mu$, $a_\cdot = a_0 + a_1 + a_2$, and $a_j$, $j = 0, 1, 2$, $S_2$ are functions of $\theta$.

3.7. This exercise has two parts.
a. Prove the following lemma: Let $A$, $G$ be sequences of positive definite matrices such that $G^{-1}AG^{-1} \to B > 0$. Let $\tilde A$ be another sequence of matrices. Then, $A^{-1/2}\tilde AA^{-1/2} \to I$, the identity matrix, if and only if $G^{-1}(\tilde A - A)G^{-1} \to 0$.
b. Use the above lemma to show that condition (v) of Theorem 3.3 [regarding $g_{j,k}(\delta)$] is equivalent to that $\tilde G$ is uniformly continuous at $\theta$.


Chapter 4

Robust Tests

Part of the previous chapter (see Subsection 3.3.3) has led to the consideration of robust tests in LMM. In this chapter, we continue the coverage on this topic in much greater generality.

4.1 Robust dispersion tests

To be more specific, here in this section, we consider tests regarding only the variance components. Such tests are called dispersion tests. Also, we shall focus on tests based on the REML estimators. Consider the following general hypothesis:

$$H_0:\ K'\theta=\phi,\tag{4.1}$$

where $\phi$ is a specified vector, and $K$ is a known $(s+1)\times r$ matrix with $\mathrm{rank}(K)=r$. We assume that the REML estimator $\hat\theta$ is asymptotically normal with mean 0 and ACVM $\Sigma_R$, i.e.,

$$\Sigma_R^{-1/2}(\hat\theta-\theta)\longrightarrow N(0,I_{s+1})\ \text{in distribution}.\tag{4.2}$$

Sufficient conditions for (4.2) can be found in, e.g., Jiang (1996). It is then easy to show that, under the null hypothesis (4.1),

$$(K'\hat\theta-\phi)'(K'\Sigma_RK)^{-1}(K'\hat\theta-\phi)\longrightarrow\chi_r^2\ \text{in distribution}.\tag{4.3}$$

We then replace $\Sigma_R$ by its POI estimator $\hat\Sigma_R$, discussed in Subsection 3.3.2, or its EMM estimator, discussed in Subsection 3.3.1, to obtain the test statistic

$$\hat\chi^2=(K'\hat\theta-\phi)'(K'\hat\Sigma_RK)^{-1}(K'\hat\theta-\phi).\tag{4.4}$$

The following theorem states that $\hat\chi^2$ has the same asymptotic null distribution as (4.3).
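As a rough numerical sketch of the statistic (4.4) in the single-restriction case $r=1$, where $K$ is a vector and $K'\hat\Sigma_RK$ is a scalar, the Python snippet below computes $\hat\chi^2$ together with its asymptotic $\chi_1^2$ p-value. The function name `dispersion_wald_1df`, the contrast $K=(0,1,-1)'$, and all numerical inputs are hypothetical choices made for illustration; they are not taken from the text.

```python
import math

def dispersion_wald_1df(theta_hat, k, phi, sigma_hat):
    """Wald-type statistic (K'theta_hat - phi)^2 / (K' Sigma_hat K), one restriction.

    theta_hat : list, estimated variance components
    k         : list, the single column of K (a hypothetical contrast)
    phi       : float, hypothesized value of K'theta
    sigma_hat : list of lists, an estimate of the ACVM of theta_hat
    """
    n = len(theta_hat)
    diff = sum(k[i] * theta_hat[i] for i in range(n)) - phi
    quad = sum(k[i] * sigma_hat[i][j] * k[j] for i in range(n) for j in range(n))
    stat = diff * diff / quad
    # For one degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical inputs, for illustration only.
theta_hat = [1.02, 0.95, 1.30]
sigma_hat = [[0.010, 0.001, 0.001],
             [0.001, 0.040, 0.005],
             [0.001, 0.005, 0.050]]
stat, p_value = dispersion_wald_1df(theta_hat, [0.0, 1.0, -1.0], 0.0, sigma_hat)
```

With a contrast of this form the denominator is the variance of the difference of the last two components, that is, the sum of their estimated variances minus twice their estimated covariance. For $r>1$ a matrix inverse of $K'\hat\Sigma_RK$ would be needed in place of the scalar division.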

Theorem 4.1. Suppose that the conditions of Theorem 3.3 are satisfied. Furthermore, suppose that (4.2) holds. Then, under the null hypothesis, $\hat\chi^2\to\chi_r^2$ in distribution.

In cases that some components of $\theta$ are specified under the null hypothesis, it is customary to use these specified values, instead of the estimators, in the POI estimator. Under the null hypothesis, this may improve the accuracy of the POI estimator, although the difference is expected to be small in large sample [because of the consistency of $\hat\theta$; e.g., Jiang (1996)]. It can be shown [see Jiang (2005)] that the same conclusion holds after such a modification. We consider some examples.

Example 4.1. Let us revisit the example discussed in Subsection 3.3.3. It is easy to show that, in this case, the test statistic (4.4) reduces to the expression above (3.26), that is, $\hat\chi^2=(\hat\gamma_1-1)^2/\hat\Sigma_{R,11}$, where $\hat\Sigma_{R,11}$ is given by (3.26) (Exercise 4.1). According to Theorem 4.1, the asymptotic null distribution is $\chi_1^2$.

Example 4.2. Consider the balanced two-way random effects model of Example 3.2. Suppose that one is interested in testing the following hypothesis $H_0:\sigma_u^2=\sigma_v^2$, or, equivalently, $H_0:\gamma_1=\gamma_2$, which means that the two random effect factors contribute equally to the total variation. It is easy to show that the test statistic (4.4) reduces to

$$\hat\chi^2=\frac{(\hat\gamma_1-\hat\gamma_2)^2}{\hat\Sigma_{R,11}-2\hat\Sigma_{R,12}+\hat\Sigma_{R,22}},\tag{4.5}$$

where $\hat\gamma_1$, $\hat\gamma_2$ are the REML estimators of $\gamma_1$ and $\gamma_2$, respectively, and $\hat\Sigma_{R,jk}$ is the $j,k$ element of the POI estimator $\hat\Sigma_R$ of the ACVM of $\hat\theta=(\hat\sigma_e^2,\hat\gamma_1,\hat\gamma_2)$, the REML estimator. Note that in this case there are no (fully) specified values of the parameters under the null hypothesis, although the latter may still be used in some way (but the difference is expected to be small in large sample; see the remark below Theorem 4.1). On the other hand, it is interesting to see how the test performs when the straight POI estimator is used in the denominator of (4.5).

A simulation study was carried out in Jiang (2005) to investigate this. More specifically, it considers performance of the test under both moderate and large sample sizes, as well as departures from normality. The following sample size configurations are considered: (I) $a=40$, $b=40$; (II) $a=200$, $b=200$. Furthermore, the following combinations of distributions for the random effects and errors are considered: (i) $u,v\sim$ Normal; (ii) $u,v\sim$ DE; (iii) $u\sim$ DE, $v\sim$ CE; and (iv) $u,v\sim$ CE. In all cases, $e\sim$ Normal. Note that the jackknife method discussed in Subsection 3.3.3 does not apply to this case, because the observations cannot be divided into i.i.d., or even independent, groups.

The true values of parameters are $\mu=\sigma_e^2=\sigma_u^2=1.0$. The value of $\sigma_v^2$ varies. First consider the size of the test, so we take $\sigma_v^2=1.0$. The simulated sizes corresponding to the nominal levels 0.01, 0.05 and 0.10 are reported in Table 4.1. Next consider the power of the test at the following alternatives: $\sigma_v^2=0.2,0.5,2,5$, which correspond to $\gamma_2/\gamma_1=0.2,0.5,2,5$, respectively. The simulated powers are reported in Tables 4.2–4.4. All results were based on 10,000 simulation runs.

Table 4.1 Simulated Size

Nominal  I-i    I-ii   I-iii  I-iv   II-i   II-ii  II-iii  II-iv
0.01     0.014  0.011  0.014  0.011  0.011  0.008  0.011   0.008
0.05     0.071  0.061  0.070  0.066  0.053  0.051  0.055   0.048
0.10     0.135  0.126  0.139  0.136  0.108  0.108  0.109   0.102

Table 4.2 Simulated Power (Nominal Level 0.01)

Alternative    I-i    I-ii   I-iii  I-iv   II-i   II-ii  II-iii  II-iv
γ2/γ1 = 0.2    0.955  0.568  0.551  0.398  1.000  1.000  0.999   0.986
γ2/γ1 = 0.5    0.313  0.100  0.118  0.073  0.988  0.684  0.619   0.439
γ2/γ1 = 2.0    0.324  0.100  0.070  0.088  0.988  0.685  0.459   0.443
γ2/γ1 = 5.0    0.969  0.649  0.491  0.497  1.000  0.999  0.989   0.992

Table 4.3 Simulated Power (Nominal Level 0.05)

Alternative    I-i    I-ii   I-iii  I-iv   II-i   II-ii  II-iii  II-iv
γ2/γ1 = 0.2    0.994  0.864  0.839  0.713  1.000  1.000  1.000   0.999
γ2/γ1 = 0.5    0.579  0.308  0.321  0.232  0.998  0.874  0.819   0.713
γ2/γ1 = 2.0    0.595  0.305  0.227  0.256  0.998  0.879  0.764   0.717
γ2/γ1 = 5.0    0.997  0.901  0.799  0.779  1.000  1.000  1.000   0.999

Table 4.4 Simulated Power (Nominal Level 0.10)

Alternative    I-i    I-ii   I-iii  I-iv   II-i   II-ii  II-iii  II-iv
γ2/γ1 = 0.2    0.999  0.946  0.923  0.846  1.000  1.000  1.000   1.000
γ2/γ1 = 0.5    0.702  0.456  0.451  0.364  0.999  0.931  0.887   0.818
γ2/γ1 = 2.0    0.719  0.448  0.359  0.382  0.999  0.936  0.864   0.818
γ2/γ1 = 5.0    0.999  0.955  0.904  0.879  1.000  1.000  1.000   1.000

The numbers seem to follow the same pattern. As the sample size increases, the simulated sizes get closer to the nominal levels, and the simulated powers increase significantly. There does not seem to be a difference, in terms of the size, across different distributions. However, the simulated powers appear significantly higher when all the distributions are normal as

compared to other cases where the distributions of the random effects are non-normal. Also, the powers are relatively low when the alternatives are close to the null ($\gamma_2/\gamma_1=0.5$ or 2.0) but much improved when the alternatives are further away ($\gamma_2/\gamma_1=0.2$ and 5.0). Overall, the simulation results are consistent with the theoretical findings of Theorem 4.1.

4.2 Robust versions of classical tests

Robust testing procedures have been studied extensively in the literature. In particular, robust versions of the classical tests, that is, the Wald, score, and likelihood-ratio tests [e.g., Lehmann (1999), §7] have been considered. In the case of i.i.d. observations, see Foutz and Srivastava (1977), Kent (1982), Hampel et al. (1986), Heritier and Ronchetti (1994), among others. In the case of independent but not identically distributed observations, see, for example, Schrader and Hettmansperger (1980), Chen (1985), Silvapulle (1992), and Kim and Cai (1993). In many cases, such as in biomedical research and surveys, the observations are dependent. One important class of models for dependent observations is LMM. Under such a model, the observations are correlated due to the presence of the random effects. Tests with dependent data have been developed in the case of LMM [e.g., Khuri et al. (1998)]. Other areas in which dependent data are frequently encountered include time series and stochastic processes, in which tests with dependent data have also been developed [e.g., Basawa and Rao (1980), Dzhaparidze (1986)]. Note that, in these tests with dependent data, it is assumed that the distribution of the data is known up to a set of parameters (therefore the likelihood function is available). In contrast to the independent cases, the literature on robust testing with dependent observations is not extensive. For example, in the case of LMM, Richardson and Welsh (1996) proposed a robust likelihood-ratio test based on restricted maximum likelihood (REML) estimation. A purpose of this section is to provide a unified treatment of these robust versions of classical tests for the case of dependent observations with rigorous proofs. In particular, we apply the results to LMM to obtain robust dispersion tests without the normality assumption.

In many cases, one is interested in testing a hypothesis regarding some parameters associated with the population, but the entire population distribution is unknown given these, and possibly other parameters. In such cases, the likelihood function, based on which the classical tests are defined,

is not available, because it relies on full specification of the distribution of the observations given the parameters. However, one may replace the likelihood function by a known objective function, and thus obtain robust versions of the classical tests. In this section, the objective function is assumed to be a function for M-estimation [Huber (1981)]. It should be noted that there has been a quite unified extension of the classical tests in terms of the quasi-likelihood method (see Subsection 3.2). The latter is a special case of M-estimation, in which the estimating functions are closer to a (log-)likelihood in the sense that the quasi-score function maintains some properties of the first two moments of a score function. However, it seems that the class of quasi-likelihood functions is not rich enough to include some useful methods. Furthermore, it is not always easy to construct a quasi-likelihood function, especially when the parameter is multivariate, and the observations are dependent [e.g., McCullagh and Nelder (1989), §9.3.2].

The discussions of the current section have the following features: (i) they are for dependent data; (ii) they are based on a general objective function (which is not necessarily a likelihood); and (iii) the hypotheses to be tested are in a general form. The robustness of these tests is in the sense of (ii), that is, the tests require weak distributional assumptions, and therefore are robust to violation of a specific distributional assumption, such as normality. Although many practitioners would apply such tests to dependent data anyway, and assume that the asymptotic results hold, at least under certain conditions, the exact conditions seem to be unclear for a testing problem that has the characteristics of (i)–(iii).

4.2.1 Basic idea, assumptions, and examples

The likelihood-ratio test is known to be applicable to a broad range of testing problems. Let $L(\theta,y)$ be the likelihood function, where $y$ is a vector of observations and $\theta$ a vector of parameters. Then, the likelihood-ratio test of the hypothesis

$$H_0:\ \theta\in\Theta_0\quad\text{versus}\quad H_1:\ \theta\notin\Theta_0,\tag{4.6}$$

where $\Theta_0\subset\Theta$ and $\Theta$ is the parameter space, is based on the likelihood-ratio statistic

$$\frac{\sup_{\theta\in\Theta_0}L(\theta,y)}{\sup_{\theta\in\Theta}L(\theta,y)}=\frac{L(\hat\theta_0,y)}{L(\hat\theta,y)},\tag{4.7}$$

where $\hat\theta$ is the ML estimator (MLE) of $\theta$, and $\hat\theta_0$ the MLE of $\theta$ under $H_0$.

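The Wald test, on the other hand, is known to apply to some more restricted testing problems. The original Wald test is for the hypothesis $H_0:\theta=\theta_0$ versus $H_1:\theta\ne\theta_0$, where $\theta_0$ is a known parameter vector. The test statistic can be expressed as $(\hat\theta-\theta_0)'I(\theta_0)(\hat\theta-\theta_0)$, where $I(\theta)$ is the Fisher information matrix. The method has been extended to test (4.6), provided that $H_0$ can be expressed as $R(\theta)=0$, where $R(\cdot)$ is a known (vector valued) function. In such a case, the test statistic has the form

$$R(\hat\theta)'\left[\left.\frac{\partial R}{\partial\theta}\right|_{\hat\theta}I(\hat\theta)^{-1}\left(\left.\frac{\partial R}{\partial\theta}\right|_{\hat\theta}\right)'\right]^{-1}R(\hat\theta).\tag{4.8}$$

However, the extension still seems a little restrictive, because it rules out hypotheses that, for example, involve inequalities. Therefore, we consider a Wald test of (4.6) based on the following test statistic

$$(\hat\theta-\hat\theta_0)'\hat W^{-1}(\hat\theta-\hat\theta_0),\tag{4.9}$$

where $\hat\theta$ is an M-estimator of $\theta$, $\hat\theta_0$ is an M-estimator of $\theta$ under the null hypothesis, and $\hat W$ is a "consistent estimator" of the asymptotic covariance matrix of $\hat\theta-\hat\theta_0$ under the null hypothesis. Such an idea has been used by, e.g., Gourieroux and Monfort (1995) to construct tests that take into account inequality constraints. There is another interesting feature of the proposed Wald test. It is known that, while the likelihood-ratio test involves both $\hat\theta$ and $\hat\theta_0$, the Wald test based on (4.8) only involves $\hat\theta$. However, this is not the case for (4.9).

Finally, the score test is based on the test statistic

$$\left(\left.\frac{\partial l}{\partial\theta}\right|_{\hat\theta_0}\right)'I(\hat\theta_0)^{-1}\left(\left.\frac{\partial l}{\partial\theta}\right|_{\hat\theta_0}\right),\tag{4.10}$$

where $l=l(\theta,y)$ is the log-likelihood. Note that the score test involves only $\hat\theta_0$. Nevertheless, all of the three tests may be thought of as involving both $\hat\theta$ and $\hat\theta_0$, because (4.10) can be written as

$$\left(\left.\frac{\partial l}{\partial\theta}\right|_{\hat\theta}-\left.\frac{\partial l}{\partial\theta}\right|_{\hat\theta_0}\right)'I(\hat\theta_0)^{-1}\left(\left.\frac{\partial l}{\partial\theta}\right|_{\hat\theta}-\left.\frac{\partial l}{\partial\theta}\right|_{\hat\theta_0}\right).$$

Note that $(\partial l/\partial\theta)|_{\hat\theta}=0$, under regularity conditions, because $\hat\theta$ is the MLE.

We now consider robust versions of Wald, score, and likelihood-ratio tests, which we call W-, S-, and L-tests, respectively. Let $y=(y_k)_{1\le k\le n}$ be a vector of observations. Let $\theta$ be a vector of unknown parameters that are associated with the joint distribution of $y$, but the entire distribution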

of $y$ may not be known given $\theta$ (and possibly other parameters). We are interested in testing the hypothesis:

$$H_0:\ \theta\in\Theta_0\tag{4.11}$$

versus $H_1:\theta\notin\Theta_0$, where $\Theta_0\subset\Theta$, and $\Theta$ is the parameter space.

Suppose that there is a new parametrization $\phi$ such that, under the null hypothesis (4.11), $\theta=\theta(\phi)$ for some $\phi$. Here $\theta(\cdot)$ is a map from $\Phi$, the parameter space of $\phi$, to $\Theta$. Note that such a reparametrization is almost always possible, but the key is to try to make $\phi$ unrestricted (unless completely specified, such as in Example 4.3 below). We consider some examples (Exercise 4.2).

Example 4.3. Suppose that, under the null hypothesis, $\theta$ is completely specified, i.e., $H_0:\theta=\theta_0$. Then, under $H_0$, one has $\theta=\phi=\theta_0$.

Example 4.4. Let $\theta=(\theta_1,\dots,\theta_p,\theta_{p+1},\dots,\theta_q)$, and suppose that one wishes to test the hypothesis $H_0:\theta_j=\theta_{0j}$, $p+1\le j\le q$, where $\theta_{0j}$, $p+1\le j\le q$ are known constants. Then, under the null hypothesis, one has $\theta_j=\phi_j$, $1\le j\le p$, and $\theta_j=\theta_{0j}$, $p+1\le j\le q$ for some (unrestricted) $\phi=(\phi_j)_{1\le j\le p}$.

Example 4.5. Let $y_1,\dots,y_n$ be independent with distribution $N(\mu,\sigma^2)$. Let $\theta=(\mu,\sigma^2)$. Suppose that one wishes to test the hypothesis $H_0:\sigma^2=\mu^2$. Then, under the null hypothesis, one has $\theta_1=\phi$, $\theta_2=\phi^2$ for some unrestricted $\phi$.

Example 4.6. Suppose that the null hypothesis includes inequality constraints: $H_0:\theta_j>\theta_{0j}$, $p_1+1\le j\le p$, and $\theta_j=\theta_{0j}$, $p+1\le j\le q$, where $p_1$

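where we use the notation

$$\frac{\partial l_0}{\partial\phi}=\left(\frac{\partial l_0}{\partial\phi_j}\right)_{1\le j\le p},\quad\frac{\partial l}{\partial\theta}=\left(\frac{\partial l}{\partial\theta_i}\right)_{1\le i\le q},\quad\frac{\partial\theta}{\partial\phi}=\left(\frac{\partial\theta_i}{\partial\phi_j}\right)_{1\le i\le q,\,1\le j\le p};$$

$$\frac{\partial^2l_0}{\partial\phi^2}=\left(\frac{\partial^2l_0}{\partial\phi_j\partial\phi_{j'}}\right)_{1\le j,j'\le p},\quad\frac{\partial^2l}{\partial\theta^2}=\left(\frac{\partial^2l}{\partial\theta_i\partial\theta_{i'}}\right)_{1\le i,i'\le q},\quad\text{and}\quad\frac{\partial^2\theta_i}{\partial\phi^2}=\left(\frac{\partial^2\theta_i}{\partial\phi_j\partial\phi_{j'}}\right)_{1\le j,j'\le p}.$$

Let $\hat\theta$ be an estimator of $\theta$, and $\hat\phi$ an estimator of $\phi$. Note that here we do not require that $\hat\theta$ and $\hat\phi$ are the (global) maximizers of $l(\theta,y)$ and $l_0(\phi,y)$, respectively. However, we shall require that $\hat\theta$ is a solution to $\partial l/\partial\theta=0$, and $\hat\phi$ a solution to $\partial l_0/\partial\phi=0$. We use $\theta_0$ to represent the true $\theta$, and $\phi_0$ the true $\phi$ under (4.11).

4.2.2 The W-, S-, and L-test statistics

The following quantity is closely related to the W-test:

$$\tilde\chi_w^2=\{\hat\theta-\theta(\hat\phi)\}'GQ_w^-G\{\hat\theta-\theta(\hat\phi)\},\tag{4.14}$$

where $Q_w^-$ represents the unique Moore-Penrose inverse [e.g., Searle (1971), §1.3] of

$$Q_w=\{A^{-1}-C(C'AC)^{-1}C'\}\Sigma\{A^{-1}-C(C'AC)^{-1}C'\}.\tag{4.15}$$

Note that the expression inside $\{\cdots\}$ in (4.15) can be written in a familiar form: Write $C=A^{-1}D$. Then, we have

$$A^{-1}-C(C'AC)^{-1}C'=A^{-1}-A^{-1}D(D'A^{-1}D)^{-1}D'A^{-1}=K(K'AK)^{-1}K'\equiv P,$$

where $K$ is any matrix of maximum column rank such that $K'D=0$ [e.g., Searle et al. (1992), pp. 451]. The matrix $P$ is used extensively in linear statistical inference, especially in the context of LMM [e.g., Searle et al. (1992), §6].

The definitions of the matrices $G$, $A$, $C$, and $\Sigma$ will be given in Theorem 4.2 in the sequel, but here we first offer some interpretations: $A$ is the limit of the matrix of second derivatives of $l$ with respect to $\theta$; $B$ is the limit of the matrix of second derivatives of $l_0$ with respect to $\phi$; $C$ is the limit of the matrix of first derivatives of $\theta$ with respect to $\phi$; and $\Sigma$ is the asymptotic covariance matrix of $\partial l/\partial\theta$, all after suitable normalizations.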
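The matrix identity stated after (4.15) can be checked numerically in a small case. The sketch below, written in plain Python for $q=2$ and $p=1$, uses a hypothetical positive definite $A$ and a hypothetical column $C$; the helper names (`inv2`, `matvec`, `dot`, `outer_over`) are introduced here only for illustration. It sets $D=AC$, picks $K$ orthogonal to $D$, and compares $A^{-1}-C(C'AC)^{-1}C'$ with $K(K'AK)^{-1}K'$.

```python
def inv2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def dot(u, v):
    """Inner product of two 2-vectors."""
    return u[0] * v[0] + u[1] * v[1]

def outer_over(u, scale):
    """The 2x2 matrix u u' / scale."""
    return [[u[i] * u[j] / scale for j in range(2)] for i in range(2)]

# Hypothetical inputs with q = 2 and p = 1.
A = [[2.0, 1.0], [1.0, 3.0]]   # positive definite
C = [1.0, 2.0]                 # the single column of C
D = matvec(A, C)               # D = A C, equivalently C = A^{-1} D
K = [-D[1], D[0]]              # orthogonal to D, so K'D = 0

A_inv = inv2(A)
proj_c = outer_over(C, dot(C, matvec(A, C)))   # C (C'AC)^{-1} C'
left = [[A_inv[i][j] - proj_c[i][j] for j in range(2)] for i in range(2)]
right = outer_over(K, dot(K, matvec(A, K)))    # K (K'AK)^{-1} K'

# Largest entrywise discrepancy between the two sides of the identity.
max_gap = max(abs(left[i][j] - right[i][j]) for i in range(2) for j in range(2))
```

In this example both sides have rank $q-p=1$, which is the dimension that reappears as the degrees of freedom when $\Sigma$ is nonsingular.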

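Theorem 4.2 in the sequel shows that, as the symbol indicates, $\tilde\chi_w^2$ has an asymptotic $\chi^2$ distribution. Typically, the normalizing matrix $G$ is completely specified. In fact, in many cases $G$ is a diagonal matrix, where the diagonal elements are normalizing constants corresponding to the components of $(\partial l/\partial\theta)|_{\theta_0}$. On the other hand, the matrix $Q_w$ may involve a vector of parameters, say, $\vartheta$. In some cases such as generalized linear models [e.g., McCullagh and Nelder (1989)], $\vartheta$ depends only on $\theta$. In other cases, $\vartheta$ may involve some additional parameters, which may also need to be estimated. In Section 4.3 we consider a case of estimating the additional parameters in the context of mixed linear models. Let $\hat Q_w^-$ be a consistent estimator of $Q_w^-$ in the sense that $\|\hat Q_w^--Q_w^-\|\to0$ in probability, where for any matrix $M$, $\|M\|=\{\lambda_{\max}(M'M)\}^{1/2}$ with $\lambda_{\max}$ representing the largest eigenvalue (and similarly $\lambda_{\min}$ the smallest). Note that this also covers the case in which $Q_w^-$ may represent a sequence of matrices (see the Extension of Theorem 4.2). We define the W-test statistic as

$$\hat\chi_w^2=\{\hat\theta-\theta(\hat\phi)\}'G\hat Q_w^-G\{\hat\theta-\theta(\hat\phi)\}.\tag{4.16}$$

Similarly, we consider the following:

$$\tilde\chi_s^2=\left(\left.\frac{\partial l}{\partial\theta}\right|_{\theta(\hat\phi)}\right)'G^{-1}A^{-1/2}Q_s^-A^{-1/2}G^{-1}\left(\left.\frac{\partial l}{\partial\theta}\right|_{\theta(\hat\phi)}\right),\tag{4.17}$$

where $Q_s^-$ is the unique Moore-Penrose inverse of

$$Q_s=(I-P)A^{-1/2}\Sigma A^{-1/2}(I-P),\tag{4.18}$$

$P=A^{1/2}C(C'AC)^{-1}C'A^{1/2}$, and the matrices $G$, $A$, $C$ are the same as above. Let $\hat A$ and $\hat Q_s^-$ be consistent estimators of $A$ and $Q_s^-$, respectively. Note that, quite often, $A$ only depends on $\theta$ so that a consistent estimator of $A$ is already available. We define the S-test statistic as

$$\hat\chi_s^2=\left(\left.\frac{\partial l}{\partial\theta}\right|_{\theta(\hat\phi)}\right)'G^{-1}\hat A^{-1/2}\hat Q_s^-\hat A^{-1/2}G^{-1}\left(\left.\frac{\partial l}{\partial\theta}\right|_{\theta(\hat\phi)}\right).\tag{4.19}$$

Finally, the L-ratio for testing the hypothesis (4.11) is defined as

$$R=\frac{L_0(\hat\phi,y)}{L(\hat\theta,y)}.\tag{4.20}$$

As noted before, here we do not require that $\hat\theta$ and $\hat\phi$ are, respectively, (global) maximizers of $l(\theta,y)$ and $l_0(\phi,y)$. In particular, if $\hat\theta$ and $\hat\phi$ are, indeed, the global maximizers, $R$ is equal to $\sup_{\Theta_0}L(\theta,y)/\sup_\Theta L(\theta,y)$, which is the original definition of the likelihood-ratio when $L(\theta,y)$ is a likelihood function [see (4.7)]. The L-test statistic is then $-2\log R$.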
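To make (4.20) concrete, the following sketch evaluates $-2\log R$ in a deliberately simple setup that is not taken from the text: the objective function is the Gaussian log-likelihood for i.i.d. observations with $\theta=(\mu,\sigma^2)$, and the null hypothesis is $\mu=0$, so that $\phi=\sigma^2$ and $\theta(\phi)=(0,\phi)$. The data vector and the function names are hypothetical. The objective is used only as a working function, which is the spirit of the L-test; nothing here says what the limiting null distribution is in a given problem.

```python
import math

def gaussian_objective(mu, sigma2, y):
    """Gaussian log-likelihood, used here as a working objective l(theta, y)."""
    n = len(y)
    rss = sum((v - mu) ** 2 for v in y)
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - 0.5 * rss / sigma2

def l_test_statistic(y):
    """-2 log R for H0: mu = 0, with theta = (mu, sigma2) and phi = sigma2."""
    n = len(y)
    mu_hat = sum(y) / n
    # theta_hat solves dl/dtheta = 0 (unrestricted fit).
    sigma2_hat = sum((v - mu_hat) ** 2 for v in y) / n
    # phi_hat solves dl0/dphi = 0 (fit under the null, mu fixed at 0).
    phi_hat = sum(v * v for v in y) / n
    log_r = gaussian_objective(0.0, phi_hat, y) - gaussian_objective(mu_hat, sigma2_hat, y)
    return -2.0 * log_r

# Hypothetical data, for illustration only.
y = [0.3, -1.2, 0.8, 1.5, -0.4, 0.9, 0.1, -0.7]
stat = l_test_statistic(y)
```

In this particular setup the statistic simplifies to $n\log(\hat\phi/\hat\sigma^2)$, which is nonnegative because the restricted variance estimate is never smaller than the unrestricted one.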

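4.2.3 Asymptotic theory

The main results of this subsection are the asymptotic distributions of the W-, S-, and L-test statistics. The following notation will be used: if $a=(a_1,\dots,a_k)$ is a vector, then $|a|_v=(|a_1|,\dots,|a_k|)$. Here the index $v$ refers to "vector". Also, if $b=(b_1,\dots,b_k)$, then $a\le b$ iff $a_i\le b_i$, $1\le i\le k$; and $a\vee b=(a_1\vee b_1,\dots,a_k\vee b_k)$, where $a\vee b=\max(a,b)$ [and similarly $a\wedge b=\min(a,b)$]. For any matrix $M$, $M>0$ means that $M$ is positive definite; and $\mathrm{diag}(x_i,1\le i\le k)$ represents a diagonal matrix with $x_i$, $1\le i\le k$ on its diagonal.

Theorem 4.2. Suppose that the following hold:

i) $l(\cdot,y)$ is twice continuously differentiable for fixed $y$, and $\theta(\cdot)$ is twice continuously differentiable;

ii) with probability $\to1$, $\hat\theta$, $\hat\phi$ satisfy $\partial l/\partial\theta=0$, $\partial l_0/\partial\phi=0$, respectively;

iii) there are sequences of nonsingular symmetric matrices $\{G\}$ and $\{H\}$ and matrices $A$, $B$, $C$ with $A,B>0$ such that the following $\to0$ in probability:

$$\sup_{|\theta^{(i)}-\theta_0|_v\le|\hat\theta-\theta_0|_v\vee|\theta(\hat\phi)-\theta(\phi_0)|_v,\,1\le i\le q}\left\|G^{-1}\left(\left.\frac{\partial^2l}{\partial\theta_i\partial\theta_j}\right|_{\theta^{(i)}}\right)_{1\le i,j\le q}G^{-1}+A\right\|,$$

$$\sup_{|\phi^{(i)}-\phi_0|_v\le|\hat\phi-\phi_0|_v,\,1\le i\le p}\left\|H^{-1}\left(\left.\frac{\partial^2l_0}{\partial\phi_i\partial\phi_j}\right|_{\phi^{(i)}}\right)_{1\le i,j\le p}H^{-1}+B\right\|,$$

$$\sup_{|\phi^{(i)}-\phi_0|_v\le|\hat\phi-\phi_0|_v,\,1\le i\le q}\left\|G\left(\left.\frac{\partial\theta_i}{\partial\phi_j}\right|_{\phi^{(i)}}\right)_{1\le i\le q,\,1\le j\le p}H^{-1}-C\right\|;$$

iv) $D(\partial l/\partial\theta)|_{\theta_0}\to0$ in probability, where $D=\mathrm{diag}(d_i,1\le i\le s)$ with $d_i=\|H^{-1}(\partial^2\theta_i/\partial\phi^2)|_{\phi_0}H^{-1}\|$, and

$$G^{-1}\left.\frac{\partial l}{\partial\theta}\right|_{\theta_0}\longrightarrow N(0,\Sigma)\ \text{in distribution}.\tag{4.21}$$

Then, under the null hypothesis, the asymptotic distribution of $\tilde\chi_w^2$ is $\chi_r^2$, where $r$ is the rank of $\Sigma^{1/2}A^{-1/2}(I-P)$ with $P$ given below (4.18). In particular, if $\Sigma$ is nonsingular, then $r=q-p$.

The theorem may be extended to allow the matrices $A$, $B$, etc. to be replaced by sequences of matrices. Such an extension may be useful. For example, suppose $G$ is a diagonal normalizing matrix, then, in many cases, $A$ can be chosen as $-G^{-1}\{E(\partial^2l/\partial\theta^2)|_{\theta_0}\}G^{-1}$, but the latter may not have a limit as $n\to\infty$.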

Extension of Theorem 4.2. Suppose that, in Theorem 4.2, $A$, $B$, $C$ are replaced by sequences of matrices $\{A\}$, $\{B\}$, and $\{C\}$, such that $A$, $B$ are symmetric satisfying 0

It is seen that the asymptotic null distributions for W- and S-tests are both $\chi^2$. However, the following theorem states that the asymptotic distribution for L-test is not $\chi^2$ but a "weighted" $\chi^2$ [e.g., Chernoff and Lehmann (1954)]. Let

$$Q_l=[A^{-1}-C(C'AC)^{-1}C']^{1/2}\,\Sigma\,[A^{-1}-C(C'AC)^{-1}C']^{1/2}.\tag{4.23}$$

Theorem 4.4. Suppose that the conditions of Theorem 4.2 are satisfied except that the third quantity in iii) (involving $C$) $\to0$ in probability is replaced by $G[(\partial\theta/\partial\phi)|_{\phi_0}]H^{-1}\to C$. Then, under the null hypothesis, the asymptotic distribution of $-2\log R$ is the same as $\lambda_1\xi_1^2+\cdots+\lambda_r\xi_r^2$, where $r$ is the same as in Theorem 4.2; $\lambda_1,\dots,\lambda_r$ are the positive eigenvalues of $Q_l$; and $\xi_1,\dots,\xi_r$ are independent $N(0,1)$ random variables. In particular, if $\Sigma$ is nonsingular, then $r=q-p$.

It should be pointed out that if $L(\theta,y)$ is, indeed, the likelihood function, in which case L-test is the likelihood-ratio test, the asymptotic null distribution of $-2\log R$ reduces to $\chi^2$ [see Weiss (1975)].

Let $\hat Q_l$ be a consistent estimator of $Q_l$. Then, by Weyl's eigenvalue perturbation theorem [e.g., Bhatia (1997), p. 63], the eigenvalues of $\hat Q_l$ are consistent estimators of those of $Q_l$, and therefore can be used to obtain the asymptotic critical values for the L-test.

In Theorem 4.2, the consistency of $\hat\theta$ and $\hat\phi$ is not explicitly required. However, condition iii) does imply some consistency properties of the estimators. In the following, we give sufficient conditions under which $\hat\theta$, $\hat\phi$ are consistent, and conditions ii), iii) of Theorem 4.2 are satisfied. In the theorem below, the assumption of $G$ and $H$ being diagonal is not essential, but it much simplifies the result.

Theorem 4.5. Suppose that $G=\mathrm{diag}(g_j,1\le j\le q)$ and $H=\mathrm{diag}(h_j,1\le j\le p)$, where $g_j$'s and $h_j$'s are sequences of positive numbers that $\to\infty$ as sample size increases. Furthermore, suppose that the following conditions are satisfied:

i) $l(\cdot,y)$ is three-times continuously differentiable for fixed $y$, and $\theta(\cdot)$ is three-times continuously differentiable;

ii) $G^{-1}(\partial l/\partial\theta)|_{\theta_0}=O_P(1)$;

iii) there are matrices $A,B>0$ and $C$ such that

$$G^{-1}\left.\frac{\partial^2l}{\partial\theta^2}\right|_{\theta_0}G^{-1}+A\longrightarrow0,\qquad H^{-1}\left.\frac{\partial^2l_0}{\partial\phi^2}\right|_{\phi_0}H^{-1}+B\longrightarrow0$$

in probability, and $G[(\partial\theta/\partial\phi)|_{\phi_0}]H^{-1}\to C$;

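iv) there is $\delta>0$ such that

$$\frac{1}{g_ig_j}\sup_{|\theta-\theta_0|\le\delta}\left|\frac{\partial^3l}{\partial\theta_i\partial\theta_j\partial\theta_k}\right|,\ 1\le i,j,k\le q,\qquad\frac{1}{h_ih_j}\sup_{|\phi-\phi_0|\le\delta}\left|\frac{\partial^3l_0}{\partial\phi_i\partial\phi_j\partial\phi_k}\right|,\ 1\le i,j,k\le p$$

are bounded in probability, and

$$g_ih_j^{-1}\sup_{|\phi-\phi_0|\le\delta}\left|\frac{\partial^2\theta_i}{\partial\phi_j\partial\phi_k}\right|,\ 1\le i\le q,\ 1\le j,k\le p$$

are bounded.

Then, there exist $\hat\theta$ and $\hat\phi$ such that 1) condition ii) of Theorem 4.2 holds; 2) $G(\hat\theta-\theta_0)=O_P(1)$ and $H(\hat\phi-\phi_0)=O_P(1)$, therefore, $\hat\theta$ and $\hat\phi$ are consistent; and 3) condition iii) of Theorem 4.2 holds for the matrices given above.

We illustrate Theorem 4.5 with an example.

Example 4.4 (Continued). For any positive constants $g_j$, $1\le j\le q$, let $h_j=g_j$, $1\le j\le p$. Then,

$$\frac{\partial\theta}{\partial\phi}=\begin{pmatrix}I\\0\end{pmatrix},\quad\text{hence}\quad G\frac{\partial\theta}{\partial\phi}H^{-1}=\begin{pmatrix}I\\0\end{pmatrix},$$

where $I$ is the $p\times p$ identity matrix. Also, $g_ih_j^{-1}\sup_\phi|\partial^2\theta_i/\partial\phi_j\partial\phi_k|=0$, $\forall i,j,k$. Furthermore, let $\theta_{(1)}=(\theta_1,\dots,\theta_p)$. Then, by (4.13), we have

$$\frac{\partial^2l_0}{\partial\phi^2}=\left(\frac{\partial\theta}{\partial\phi}\right)'\frac{\partial^2l}{\partial\theta^2}\left(\frac{\partial\theta}{\partial\phi}\right)=\frac{\partial^2l}{\partial\theta_{(1)}^2}.$$

Therefore,

$$-G^{-1}\frac{\partial^2l}{\partial\theta^2}G^{-1}\longrightarrow A=\begin{pmatrix}B&*\\ *&*\end{pmatrix}>0\ \text{in probability}$$

implies that

$$-H^{-1}\frac{\partial^2l_0}{\partial\phi^2}H^{-1}=-H^{-1}\frac{\partial^2l}{\partial\theta_{(1)}^2}H^{-1}\longrightarrow B>0\ \text{in probability}.$$

Theorem 4.5 can be easily extended to allow $A$, $B$, $C$ to be replaced by sequences of matrices, as in Extensions of Theorem 4.2 and Theorem 4.3. The extension is left as an exercise (Exercise 4.3).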

4.3 Robust classical tests for mixed ANOVA model

In this section, we consider the mixed ANOVA model introduced in Section 1.1 [see (1.1)]. However, it is not assumed that the random effects and errors are normally distributed. In fact, the only assumptions regarding the distributions of the random effects and errors are that $\alpha_r$ is an $m_r\times1$ vector of i.i.d. random variables with mean 0 and variance $\sigma_r^2$, $1\le r\le s$; $\epsilon$ is an $N\times1$ vector of i.i.d. random variables with mean 0 and variance $\tau^2$; and $\alpha_1,\dots,\alpha_s,\epsilon$ are independent. We consider the Hartley-Rao form of variance components, $\theta$ (see §3.1.1). Let $\gamma=(\gamma_r)_{1\le r\le s}$, and $\vartheta=(\beta,\theta)=(\beta,\tau^2,\gamma)$. Then, without the normality assumption, $\vartheta$ is a vector of parameters, which alone may not completely determine the distribution of $y$. Nevertheless, in many cases, there is still interest in testing the following type of hypotheses:

$$H_0:\ \theta\in\Theta_0,\tag{4.24}$$

where $\Theta_0\subset\Theta=\{\theta:\tau^2>0,\gamma_r\ge0,1\le r\le s\}$, versus $H_1:\theta\notin\Theta_0$.

When normality is assumed, the use of the likelihood-ratio test for potentially complex hypotheses and unbalanced data was first proposed by Hartley and Rao (1967), although a rigorous justification was not given. Welham and Thompson (1997) showed the equivalence of the likelihood ratio, score, and Wald tests under normality. For discussions about other testing procedures in LMM under normality, see Khuri et al. (1998). On the other hand, Richardson and Welsh (1996) considered a likelihood-ratio test without assuming normality, whose approach is similar to our L-test, but their goal was to select the (fixed) covariates.

Under the normality assumption, the log-likelihood function for estimating $\vartheta$ is given by

$$l(\vartheta,y)=\text{constant}-\frac12\left\{N\log(\tau^2)+\log(|V|)+\frac{1}{\tau^2}(y-X\beta)'V^{-1}(y-X\beta)\right\},\tag{4.25}$$

where $V=V_\gamma=I+\sum_{r=1}^s\gamma_rV_r$ with $I$ being the $N$-dimensional identity matrix, $V_r=Z_rZ_r'$, $1\le r\le s$, and $|V|$ the determinant of $V$. The restricted log-likelihood for estimating $\theta$ is given by

$$l_R(\theta,y)=\text{constant}-\frac12\left\{(N-p)\log(\tau^2)+\log(|K'VK|)+\frac{y'Py}{\tau^2}\right\},\tag{4.26}$$

where $K$ is any $N\times(N-p)$ matrix such that $\mathrm{rank}(K)=N-p$ and $K'X=0$, and $P=P_\gamma=K(K'VK)^{-1}K'=V^{-1}-V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}$ (see Section 1.1, but we change the notation $A$ to $K$ to avoid confusion with the $A$ used in the previous section). The restricted log-likelihood is only for estimating the variance components. It is then customary to estimate $\beta$ by (1.3), where $\hat V=V_{\hat\gamma}$, and $\hat\gamma=(\hat\gamma_r)_{1\le r\le s}$ is the REML estimator of $\gamma$. Alternatively, one may define the following "restricted log-likelihood":

$$l_R(\vartheta,y)=\text{constant}-\frac12\left\{(N-p)\log(\tau^2)+\log|K'VK|+\frac{1}{\tau^2}(y-X\beta)'V^{-1}(y-X\beta)\right\}.\tag{4.27}$$

It can be shown (Exercise 4.4) that the maximizer of (4.27) is $\hat\vartheta=(\hat\beta,\hat\tau^2,\hat\gamma)$, where $\hat\tau^2$ and $\hat\gamma$ are the REML estimators, and $\hat\beta$ is given above. The difference is that, unlike $l(\vartheta,y)$, $l_R(\vartheta,y)$ is not a log-likelihood, even if normality holds (Exercise 4.4). Nevertheless, we show that both $l(\vartheta,y)$ and $l_R(\vartheta,y)$ can be used to test (4.24) without the normality assumption.

For LMMs, conditions of Theorems 4.2–4.4 and their extensions correspond to mild restrictions. However, as discussed in Section 3.3, the ACVM of the REML estimator involves higher (i.e., third or fourth) moments of the random effects and errors. More specifically, if the test is only about the fixed effects, no higher moments are involved in $\Sigma$; if the test is only about the variance components, the fourth moments, or kurtoses, will appear in $\Sigma$; if the test is about both the fixed effects and the variance components, the third moments will be involved, in addition to the kurtoses. The matrices $A$, $B$ and $C$ in the previous section do not involve the higher moments. To simplify the results, we make the following additional assumption:

$$E(\epsilon_1^3)=0,\quad\text{and}\quad E(\alpha_{r1}^3)=0,\ 1\le r\le s.\tag{4.28}$$

Such conditions hold if, in particular, the distributions of the random effects and errors are symmetric.

Another action of simplification is that we shall consider testing a special class of hypotheses (4.24). Let $\vartheta(\cdot)=(\vartheta_j(\cdot))_{1\le j\le q}$, where $q=p+s+1$, be the map such that, under (4.24), $\vartheta=\vartheta(\phi)$ for some $\phi=(\phi_k)_{1\le k\le a}$, where $a\le q$. We assume that there is a subset of indexes $1\le j_1<\cdots$

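of this section, $\vartheta$, $\phi$, etc. denote the true vectors of parameters. We need the following notation: Write $A_1=(\mathrm{tr}(V^{-1}V_r)/2\lambda\sqrt{Nm_r})_{1\le r\le s}$, $A_2=(\mathrm{tr}(V^{-1}V_rV^{-1}V_t)/2\sqrt{m_rm_t})_{1\le r,t\le s}$, and

$$A=\begin{pmatrix}X'V^{-1}X/\tau^2N&0&0\\0&1/2\tau^4&A_1'\\0&A_1&A_2\end{pmatrix}.\tag{4.30}$$

Let $b=(I\ \sqrt{\gamma_1}Z_1\ \cdots\ \sqrt{\gamma_s}Z_s)$, $B_0=b'V^{-1}b$, $B_r=b'V^{-1}V_rV^{-1}b$, $1\le r\le s$. We define

$$D_{0,rt}=\sum_{l=1}^{N}B_{r,ll}B_{t,ll},\quad D_{1,rt}=\sum_{l=N+1}^{N+m_1}B_{r,ll}B_{t,ll},\ \dots,\quad D_{s,rt}=\sum_{l=N+m_1+\cdots+m_{s-1}+1}^{N+m_1+\cdots+m_s}B_{r,ll}B_{t,ll},$$

where $B_{r,kl}$ is the $(k,l)$ element of $B_r$, $0\le r\le s$. The kurtoses of the errors and random effects are defined by $\kappa_0=\{E(\epsilon_1^4)/\tau^4\}-3$, and $\kappa_r=\{E(\alpha_{r1}^4)/\sigma_r^4\}-3$, $1\le r\le s$. Let $\Delta_1=(\Delta_{0r}/\sqrt{Nm_r})_{1\le r\le s}$, $\Delta_2=(\Delta_{rt}/\sqrt{m_rm_t})_{1\le r,t\le s}$, and

$$\Delta=\begin{pmatrix}0&0&0\\0&\Delta_{00}/N&\Delta_1'\\0&\Delta_1&\Delta_2\end{pmatrix},\tag{4.31}$$

where $\Delta_{rt}=[4\lambda^{1_{(r=0)}+1_{(t=0)}}]^{-1}\sum_{u=0}^s\kappa_uD_{u,rt}$, $0\le r,t\le s$. Let $W=b'V^{-1}X(X'V^{-1}X)^{-1/2}$, and $W_l$ be the $l$th row of $W$, $1\le l\le N+m$, where $m=m_1+\cdots+m_s$.

Theorem 4.6. Suppose that i) $\vartheta(\cdot)$ is three-times continuously differentiable and satisfies (4.29), and $\partial\vartheta_{j_k}/\partial\phi_k\ne0$, $1\le k\le a$; ii) $E(\epsilon_1^4)<\infty$, $\mathrm{var}(\epsilon_1^2)>0$, $E(\alpha_{r1}^4)<\infty$, $\mathrm{var}(\alpha_{r1}^2)>0$, $1\le r\le s$, and (4.28) holds; and iii) $N\to\infty$, $m_r\to\infty$, $1\le r\le s$, 0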

and $\Sigma=A+\Delta$, where $\Delta$ is given by (4.31). The same conclusion holds for $l_R(\vartheta,y)$ as well.

Note that the $j$th row of $\partial\vartheta/\partial\phi$ is $\partial\vartheta_j/\partial\phi$, which, under (4.29), is $(0,\dots,0)$ if $j\notin\{j_1,\dots,j_a\}$, and is $(0,\dots,0,\partial\vartheta_{j_k}/\partial\phi_k,0,\dots,0)$ ($k$th component nonzero) if $j=j_k$, $1\le k\le a$.

Theorem 4.7. Suppose that the conditions of Theorem 4.6 are satisfied except that, in iii), the condition about $A$ is strengthened to that $A\to A_0$, where $A_0>0$, and $\Sigma\to\Sigma_0$. Then, the conditions of Theorem 4.4 are satisfied with $A=A_0$, $\Sigma=\Sigma_0$, and everything else given by Theorem 4.6.

It is seen from (4.31) that $\Delta$, and hence $\Sigma$, depends on the kurtoses $\kappa_r$, $0\le r\le s$, in addition to the variance components. One already has consistent estimators of the variance components (e.g., REML estimators). As for $\kappa_r$, $0\le r\le s$, they can be estimated by the EMM estimators, introduced in Subsection 3.3.1. However, the asymptotic property of the EMM estimator has not yet been studied. Below we give conditions for the consistency of the EMM estimator.

Recall the notation in Subsection 3.3.1. In addition, write $h_r=|C_r(y-X\beta)|_4^4$, $0\le r\le s$, where $C_r$ is defined below (3.13). The following result shows that, under mild conditions, the EMM estimators are consistent. Note that such a result is not available in Jiang (2003).

Theorem 4.8. Let the $C_r$'s be given below (3.13), and the following hold: i) $E(\epsilon_1^4)<\infty$, $E(\alpha_{r1}^4)<\infty$, $1\le r\le s$; ii) $\hat\beta$, $\hat\theta$ are consistent; iii) $a_{rr}^{-1}\{h_r-E(h_r)\}=o_P(1)$, $0\le r\le s$; and iv) the following are bounded: $a_{rt}/a_{rr}$, $r>t\ge0$, $a_{rr}^{-1}\sum_{k=1}^{n_r}\left(\sum_{t=0}^{r}|Z_t'c_{rk}|^2\right)^2$, $a_{rr}^{-1}\sum_{k=1}^{n_r}(\max_{0\le t\le r}|Z_t'c_{rk}|)^w|X'c_{rk}|^{4-w}$, $w=0,1,2,3$, $0\le r\le s$. Then, the EMM estimators $\hat\kappa_r$, $0\le r\le s$ given by (3.14) are consistent.

Proofs of all of the results in this section can be found in Jiang (2012).

4.4 A unified robust goodness-of-fit test

Mixed model diagnostics has been a topic of research and applications of mixed effects models. The topic is of considerable practical interest. For example, mixed effects models have played key roles in small area estimation [SAE; e.g., Rao and Molina (2015)]. It is known that, in case of model misspecification, the traditional empirical best linear unbiased prediction (EBLUP) method may lose efficiency. In fact, in such a case, an alternative method known as observed best prediction (OBP) is likely to be more accurate than the EBLUP. On the other hand, when the underlying model

is correctly specified, EBLUP is known to be more efficient than OBP (see the next chapter for details). Therefore, it is important, in practice, to know whether or not the assumed model is appropriate in order to come up with a more efficient SAE strategy.

Another standard assumption in mixed effects models is the normality assumption. For example, in a generalized linear mixed model (GLMM), it is typically assumed that the random effects are normally distributed. If the normality assumption fails, estimators of the fixed effects and variance components derived under the normality assumption may be inconsistent [e.g., Jiang and Nguyen (2009)]. Note that this is very different from the case of a linear mixed model; see Chapter 3.

The literature on mixed model diagnostics is not very extensive. See, for example, Pierce (1982), sec. 2.4.1 of Jiang (2007), Claeskens and Hart (2009). Jiang (2001) proposed a $\chi^2$-type goodness-of-fit test for linear mixed model diagnostics, whose asymptotic null distribution is a weighted $\chi^2$, where the weights are eigenvalues of some nonnegative definite matrix. Claeskens and Hart (2009) proposed an alternative approach to the $\chi^2$ test for checking the normality assumption in LMM. The authors considered a class of distributions that include the normal distribution as a reduced, special case. The test is based on the likelihood-ratio test (LRT) that compares the "estimated distribution" and the null distribution (i.e., normal). A model selection procedure via the information criteria is used to determine the larger class of distributions for the LRT. In particular, the asymptotic null distribution is in the form of the distribution of $\sup_{l\ge1}\{2Q_l/l(l+3)\}$, where $Q_l=\sum_{q=1}^l\chi_{q+1}^2$, and $\chi_2^2,\chi_3^2,\dots$ are independent such that $\chi_j^2$ has a $\chi^2$ distribution with $j$ degrees of freedom, $j\ge2$.

The $\chi^2$-type tests depend on the choice of cells, based on which the observed and expected cell frequencies are evaluated. As noted by Jiang and Nguyen (2009), performance of the $\chi^2$ test is sensitive to the choice of the cells, and there is no "optimal choice" of such cells known in the literature. On the other hand, the Claeskens-Hart test depends on the choice of the information criterion. As is well known, there are different versions of the information criteria, such as AIC [Akaike (1973)], BIC [Schwarz (1978)], HQ [Hannan and Quinn (1979)]. The difference in the performance of the test by different information criteria is unclear.

Furthermore, the weighted-$\chi^2$ asymptotic null distribution of Jiang (2001) depends on eigenvalues of some matrices, whose expressions are complicated, and involve unknown parameters. These parameters need to be estimated in order to obtain the

critical values of the tests. Due to such a complication, Jiang (2001) suggests using a Monte Carlo method to compute the critical value; but, by doing so, the usefulness of the asymptotic result may be undermined. Similarly, the asymptotic distribution of the Claeskens-Hart test is not simple and involves a supremum of $\chi^2$ distributions.

It might be argued that, in today's computer era, having a simple asymptotic distribution such as $\chi^2$ is, perhaps, not as important as in the past. However, there are, still, attractive features of the $\chi^2$ limiting distribution that are worth pursuing. First, the $\chi^2$ distribution corresponds to the right standardization: it is the "square" of the multivariate standard normal distribution. In this regard, anything other than $\chi^2$ leaves, at least, some room for improvement. Note that, while there is only one way of a complete standardization, there are many, if not infinitely many, ways of incomplete standardization, so it may not be convincing why one way is chosen over the others. Second, having a computer-determined, non-analytic asymptotic distribution makes it difficult to study properties of the limiting distribution. For example, how does the reduction of complexity of the model under the null hypothesis play a role? It may not be easy to tell if all one gets is a bunch of numbers. A related issue is regarding direction of improvement. This may not be easy to see without a simple analytic expression for the asymptotic distribution.

In what follows, we generalize a method initiated by Fisher (1922) in deriving goodness-of-fit tests (GoFTs) that are guaranteed to have asymptotic $\chi^2$ null distributions. In addition, the proposed test has a robustness feature in that it can test a certain type of model assumption while another aspect of the model may be misspecified. A special case of the method has a connection with the generalized method of moments (GMM), as we shall point out. We then consider application of the test to SAE and present some empirical results, especially in terms of its robustness property. Technical proofs can be found in Jiang and Torabi (2018).

4.4.1 Tailoring

In this section, we describe a general approach to obtaining a test statistic that has an asymptotic $\chi^2$ distribution under the null hypothesis. The original idea can be traced back to R. A. Fisher [Fisher (1922)], who used the method to obtain an asymptotic $\chi^2$ distribution for Pearson's $\chi^2$-test, when the so-called minimum chi-square estimator is used. However, Fisher did not put forward the method that he originated under a general framework, as we do here.

Suppose that there is a sequence of $s$-dimensional random vectors, $B(\vartheta)$, which depend on a vector $\vartheta$ of unknown parameters such that, when $\vartheta$ is the true parameter vector, one has $E\{B(\vartheta)\}=0$, $\mathrm{Var}\{B(\vartheta)\}=I_s$, and, as the sample size increases,

$$|B(\vartheta)|^2\stackrel{d}{\longrightarrow}\chi_s^2,\tag{4.32}$$

where $|\cdot|$ denotes the Euclidean norm. However, because $\vartheta$ is unknown, one cannot use (4.32) for GoFT. What is typically done, such as in Pearson's $\chi^2$-test, is to replace $\vartheta$ by an estimator, $\hat\vartheta$. The question is: what is $\hat\vartheta$? The ideal scenario would be that, after replacing $\vartheta$ by $\hat\vartheta$ in (4.32), one has a reduction of degrees of freedom (d.f.), which leads to

$$|B(\hat\vartheta)|^2\stackrel{d}{\longrightarrow}\chi_\nu^2,\tag{4.33}$$

where $\nu=s-r>0$ and $r=\dim(\vartheta)$. This is the famous "subtract one degree of freedom for each parameter estimated" rule taught in many elementary statistics books [e.g., Rice (1995), p. 242]. However, as is well known [e.g., Moore (1978)], depending on what $\hat\vartheta$ is used, (4.33) may or may not hold, regardless of what d.f. is actually involved. In fact, the only method that is known to achieve (4.33) without restriction on the distribution of the data is Fisher's minimum $\chi^2$ method. In a way, the method allows one to "cut down" the d.f. of (4.32) by $r$, and thus convert an asymptotic $\chi_s^2$ to an asymptotic $\chi_\nu^2$. For such a reason, we have coined the method, under the more general setting below, tailoring. We develop the method with a heuristic derivation, with the rigorous justification given in Jiang and Torabi (2018).

The "right" estimator of $\vartheta$ for tailoring is supposed to be the solution to an estimating equation of the following form:

$$C(\vartheta)\equiv A(\vartheta)B(\vartheta)=0,\tag{4.34}$$

where $A(\vartheta)$ is an $r\times s$ non-random matrix that plays the role of tailoring the $s$-dimensional vector, $B(\vartheta)$, to the $r$-dimensional vector, $C(\vartheta)$. The specification of $A$ will become clear at the end of the derivation. Throughout the derivation, $\vartheta$ denotes the true parameter vector. For notation simplicity, we use $A$ for $A(\vartheta)$, $\hat A$ for $A(\hat\vartheta)$, etc. Under regularity conditions, one has the following expansions, which can be derived from the Taylor series expansion and large-sample theory [e.g., Jiang (2010)]:

$$\hat\vartheta-\vartheta\approx-\left\{E_\vartheta\left(\frac{\partial C}{\partial\vartheta'}\right)\right\}^{-1}C,\tag{4.35}$$

$$\hat B\approx B-E_\vartheta\left(\frac{\partial B}{\partial\vartheta'}\right)\left\{E_\vartheta\left(\frac{\partial C}{\partial\vartheta'}\right)\right\}^{-1}C.\tag{4.36}$$

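Because $E_\vartheta\{B(\vartheta)\}=0$ [see above (4.32)], one has

$$E_\vartheta\left(\frac{\partial C}{\partial\vartheta'}\right)=A\,E_\vartheta\left(\frac{\partial B}{\partial\vartheta'}\right). \tag{4.37}$$

Combining (4.36), (4.37), we get

$$\hat B\approx\{I_s-U(AU)^{-1}A\}B, \tag{4.38}$$

where $U=E_\vartheta(\partial B/\partial\vartheta')$. We assume that $A$ is chosen such that

$$U(AU)^{-1}A \text{ is symmetric.} \tag{4.39}$$

Then, it is easy to verify that $I_s-U(AU)^{-1}A$ is symmetric and idempotent. If we further assume that the following limit exists:

$$I_s-U(AU)^{-1}A\longrightarrow P, \tag{4.40}$$

then $P$ is also symmetric and idempotent. Thus, assuming that $B\xrightarrow{d}N(0,I_s)$, which is typically the argument leading to (4.32), one has, by (4.38), $\hat B\xrightarrow{d}N(0,P)$, hence [e.g., Searle (1971), p. 58] $|\hat B|^2\xrightarrow{d}\chi^2_\nu$, where $\nu=\mathrm{tr}(P)=s-r$. This is exactly (4.33).

It remains to answer one last question: Is there such a non-random matrix $A=A(\vartheta)$ that satisfies (4.39) and (4.40)? We show that not only is the answer yes, but there is an optimal one. Let $A=N^{-1}U'W$, where $W$ is a symmetric, non-random matrix to be determined, and $N$ is a normalizing constant that depends on the sample size. By (4.35) and the fact that $\mathrm{Var}_\vartheta(B)=I_s$ [see above (4.32)], we have

$$\mathrm{var}_\vartheta(\hat\vartheta)\approx(U'WU)^{-1}U'W^2U(U'WU)^{-1}\ge(U'U)^{-1}, \tag{4.41}$$

by, for example, Lemma 5.1 of Jiang (2010). The equality on the right side of (4.41) holds when $W=I_s$, giving the optimal $A$:

$$A=A(\vartheta)=\frac{U'}{N}=\frac{1}{N}E_\vartheta\left(\frac{\partial B'}{\partial\vartheta}\right). \tag{4.42}$$

The $A$ given by (4.42) clearly satisfies (4.39) [the matrix in (4.39) is then equal to $U(U'U)^{-1}U'$]. It will be seen in the next section that, with $N=m$, (4.40) is expected to be satisfied. It should be noted that the solution to (4.34), $\hat\vartheta$, does not depend on the choice of $N$.

A basic assumption for the tailoring method to work is that $E\{B(\vartheta)\}=0$ when $\vartheta$ is the true parameter vector. However, from the proof of the result [see Jiang and Torabi (2018)] it is seen that the condition "$\vartheta$ is the true parameter vector" is not critical. For example, in case there is a model misspecification, a "true parameter vector" may not exist. Nevertheless,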

what is important is that there is some parameter vector, $\vartheta$, which is not necessarily the true parameter vector, such that the equation

$$A(\vartheta)E\{B(\vartheta)\}=0 \tag{4.43}$$

holds. This equation holds, of course, when $\vartheta$ is the true parameter vector, but it can also hold when the true parameter vector does not exist, such as under model misspecification. In fact, in the latter case, one may define the "true parameter vector" as the unique $\vartheta$, assumed to exist, that satisfies (4.43). Note that the number of equations in (4.43) is the same as the dimension of $\vartheta$; thus, one expects that a solution exists and is unique, under some regularity conditions. To see that (4.43) is the key, note that under (4.43), (4.34) is equivalent to $A(\vartheta)[B(\vartheta)-E\{B(\vartheta)\}]=0$, where the expectation is with respect to the true underlying distribution. It follows that one can replace $B(\vartheta)$ by $B(\vartheta)-E\{B(\vartheta)\}$, which has mean zero, and all of the arguments in the proof go through. This property gives tailoring an unexpected robustness feature, that is, it can work correctly in spite of some model misspecification. We illustrate this point more specifically in the next section.

It should be noted that the special case of (4.42) is closely related to the specification test (ST) in GMM [e.g., Newey (1985)], where the estimator $\hat\vartheta$ is defined as the minimizer of the left side of (4.32). This is the same as Fisher's minimum chi-square estimator. Thus, the ST may be viewed as an extension of Fisher's idea, too. Although the minimizer of $|B(\vartheta)|^2$ and the solution to (4.34), with $A$ given by (4.42), are not the same, they are asymptotically equivalent under some regularity conditions. Nevertheless, tailoring is more general than ST in that $A$ does not have to be given by (4.42); it is not exactly the same as ST even if (4.42) holds. We also prefer the name tailoring over ST as it is more intuitive.

4.4.2 Application to SAE

A standard assumption in SAE models, including the Fay-Herriot model [Fay and Herriot (1979)], the nested-error regression (NER) model [Battese et al. (1988)], and the mixed logistic model [Jiang and Lahiri (2001)], is that the random effects are normally distributed. This assumption has had substantial impact on many aspects of the inference. For example, estimation of the mean squared prediction error (MSPE) is an important issue in SAE [e.g., Rao and Molina (2015)]. The well-known Prasad-Rao method [Prasad and Rao (1990)] depends on the normality assumption and

may not be accurate if the assumption fails. Also, prediction intervals obtained via parametric bootstrap methods [e.g., Chatterjee et al. (2008)] depend heavily on the normality assumption. Although there are strategies that are less dependent on the normality assumption, those strategies are often less efficient than the normality-based method when the normality assumption holds, even approximately. Thus, it is important to check the validity of the normality assumption so that an appropriate method can be used for the inference.

In this section, we develop two goodness-of-fit tests that are applicable to SAE using the tailoring method. Although the focus is testing for the normality assumption of the random effects, the method can be easily extended to testing other aspects of the assumed model. Also, we shall focus on the Fay-Herriot model; extension of the method to other types of SAE models, such as the NER model, is fairly straightforward. The first test is based on the existing ML method, while the second test is developed based on a new inference method, which may be of interest on its own.

The Fay-Herriot model may be expressed as follows: (i) $(y_i,\theta_i)$, $i=1,\dots,m$ are independent; (ii) $y_i|\theta_i\sim N(\theta_i,D_i)$; and (iii) $\theta_i\sim N(x_i'\beta,A)$. Here, $y_i$ is the direct estimate from the $i$th area, $\theta_i$ is the small area mean, $x_i$ is a vector of observed covariates, $\beta$ is a vector of unknown parameters, $A$ is an unknown variance, and $D_i$ is a sampling variance that is assumed known. The normality assumption has to do with (iii). The reason that this is not an issue with (ii) is that, in practice, $y_i$ is typically a sample summary such as a sample mean or proportion; as a result, the normality assumption in (ii) often holds approximately due to the central limit theorem (CLT). However, there is no obvious reason to believe that the CLT should hold for (iii). Thus, we consider a broader class of distributions, namely, the skewed normal distribution [SN; e.g., Azzalini and Capitanio (2014)], which includes the normal distribution as a special case. Under the SN distribution, (iii) is replaced by (iii)$'$ $\theta_i\sim\mathrm{SN}(x_i'\beta,A,\alpha)$, which denotes the SN distribution with mean $x_i'\beta$, variance $A$, and skewness parameter $\alpha$ (see below). Note that $\alpha=0$ leads to the normal distribution. We denote the model parameters as $\psi=(\beta',A,\alpha)'$.

Suppose that, under the null hypothesis, there is a reduction in the dimension of the parameter vector such that $\gamma=\gamma_0$, where $\gamma$ is a subvector of $\psi$ and $\gamma_0$ is known. Let $\vartheta$ denote the vector of parameters in $\psi$ other than $\gamma$. In this section, notation such as $E_\vartheta$, etc. will be understood as expectation, etc. under the null hypothesis.

1. Maximum likelihood. Under the Fay-Herriot model, one can show

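that $y_i\sim\mathrm{SN}(x_i'\beta,A+D_i,\alpha_{1i})$ where $\alpha_{1i}=\alpha\sqrt{A}/\sqrt{A+D_i+\alpha^2D_i}$. We can then write the pdf of $y_i$ as:

$$f(y_i,\psi)=\frac{2}{\sqrt{A+D_i}}\,\phi\left(\frac{y_i-x_i'\beta}{\sqrt{A+D_i}}\right)\Phi\left(\alpha_{1i}\,\frac{y_i-x_i'\beta}{\sqrt{A+D_i}}\right),\quad i=1,\dots,m,$$

where $\psi=(\beta',A,\alpha)'$, and $\phi(\cdot)$ and $\Phi(\cdot)$ are the pdf and cdf of $N(0,1)$, respectively. The likelihood function is $L=\prod_{i=1}^m f(y_i,\psi)$. Under this setting, testing normality is equivalent to testing $H_0:\alpha=0$. Let $z_i(y_i,\vartheta)=\{(\partial/\partial\psi)\log f(y_i,\psi)\}|_{\alpha=0}$, where $\vartheta=(\beta',A)'$,

$$\left.\frac{\partial\log f(y_i,\psi)}{\partial\beta}\right|_{\alpha=0}=\frac{x_i(y_i-x_i'\beta)}{A+D_i},$$

$$\left.\frac{\partial\log f(y_i,\psi)}{\partial A}\right|_{\alpha=0}=\frac{1}{2}\left\{\left(\frac{y_i-x_i'\beta}{A+D_i}\right)^2-\frac{1}{A+D_i}\right\},$$

$$\left.\frac{\partial\log f(y_i,\psi)}{\partial\alpha}\right|_{\alpha=0}=\sqrt{\frac{2A}{\pi}}\,\frac{y_i-x_i'\beta}{A+D_i}.$$

If the model is correctly specified, let $\vartheta$ denote the true $\vartheta$. Then, under the null hypothesis, $\sum_{i=1}^m z_i(y_i,\vartheta)$ is a sum of independent random vectors with mean zero. If, somehow, the model is misspecified in its mean function so that there is no true $\beta$, hence no true $\vartheta$, we define the "true $\vartheta$" as the unique solution to (4.43). Then, in the proof of the asymptotic null distribution [see Jiang and Torabi (2018)], one can replace $\sum_{i=1}^m z_i(y_i,\vartheta)$ by $\sum_{i=1}^m[z_i(y_i,\vartheta)-E\{z_i(y_i,\vartheta)\}]$, which is a sum of independent random vectors with mean zero, and all of the arguments go through. Furthermore, it can be shown that

$$V_z(\vartheta)=\mathrm{Var}_\vartheta\left\{\sum_{i=1}^m z_i(y_i,\vartheta)\right\}=\sum_{i=1}^m\mathrm{Var}_\vartheta\{z_i(y_i,\vartheta)\},$$

and this is true regardless of the definition of the true $\vartheta$, where

$$\mathrm{Var}_\vartheta\{z_i(y_i,\vartheta)\}=\begin{bmatrix}x_ix_i'/(A+D_i) & 0_p & x_i\sqrt{2A}/\{\sqrt{\pi}(A+D_i)\}\\ 0_p' & 1/\{2(A+D_i)^2\} & 0\\ x_i'\sqrt{2A}/\{\sqrt{\pi}(A+D_i)\} & 0 & 2A/\{\pi(A+D_i)\}\end{bmatrix}$$

(Exercise 4.6).

Therefore, if we let $B(\vartheta)=V_z(\vartheta)^{-1/2}\sum_{i=1}^m z_i(y_i,\vartheta)$, we have $B(\vartheta)\xrightarrow{d}N(0,I_s)$, where $s=\dim(\psi)$; it follows that (4.32) holds. Because $r=\dim(\vartheta)$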
105January14,201915:20ws-book9x6RobustMixedModelAnalysisbook4page93RobustTests93dim(ϑ)

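We call the estimator of $\psi$ obtained by solving the adjusted PL score equation, $s_a(\psi)=0$, or, equivalently, the following equation:

$$\frac{\partial}{\partial\psi}\log f(y|y,\psi)=E_\psi\left\{\frac{\partial}{\partial\psi}\log f(y|y,\psi)\right\} \tag{4.47}$$

the maximum adjusted PL estimator, or Maple, in view of its analogy to MLE.

Under the Fay-Herriot model, it is easy to show that $f(\theta|y)=\prod_{i=1}^m f(\theta_i|y_i)$. Thus, we have

$$f(y|y)=\int\prod_{i=1}^m f(y_i|\theta_i)f(\theta_i|y_i)\,d\theta=\prod_{i=1}^m\int f(y_i|\theta_i)f(\theta_i|y_i)\,d\theta_i=\prod_{i=1}^m f(y_i|y_i), \tag{4.48}$$

where $f(y_i|y_i)=\int f(y_i|\theta_i)f(\theta_i|y_i)\,d\theta_i$. Under the Fay-Herriot model, we have $y_i|\theta_i\sim N(\theta_i,D_i)$ and $\theta_i\sim\mathrm{SN}(x_i'\beta,A,\alpha)$. It is then shown in Jiang and Torabi (2018) that (Exercise 4.7)

$$f(y_i|y_i)=\frac{1}{\sqrt{D_i(1+B_i)}}\,\phi\left(\frac{y_i-x_i'\beta}{\sqrt{D_i(1+B_i)}/(1-B_i)}\right)\frac{\Phi[\alpha_{2i}(y_i-x_i'\beta)]}{\Phi[\alpha_{3i}(y_i-x_i'\beta)]}, \tag{4.49}$$

where $\alpha_{si}=(4-s)\sqrt{A}\,\alpha/\sqrt{\{(4-s)A+D_i\}\{(4-s)A+(1+\alpha^2)D_i\}}$, $s=2,3$. Note that, when $\alpha=0$, (4.49) reduces to that under normality. Also note that $f(y_i|y_i)\ne f(y_i)$.

By (4.48), the PL can be expressed as $\prod_{i=1}^m f(y_i|y_i,\psi)$. To test $H_0:\alpha=0$, let $b_i(y_i,\vartheta)=\{(\partial/\partial\psi)\log f(y_i|y_i,\psi)\}|_{\alpha=0}-E[\{(\partial/\partial\psi)\log f(y_i|y_i,\psi)\}|_{\alpha=0}]$. One can derive the adjusted PL equation, (4.47), as follows:

$$\left.\frac{\partial\log f(y_i|y_i,\psi)}{\partial\beta}\right|_{\alpha=0}-E\left\{\left.\frac{\partial\log f(y_i|y_i,\psi)}{\partial\beta}\right|_{\alpha=0}\right\}=a_i(A)\,x_i(y_i-x_i'\beta),$$

$$\left.\frac{\partial\log f(y_i|y_i,\psi)}{\partial A}\right|_{\alpha=0}-E\left\{\left.\frac{\partial\log f(y_i|y_i,\psi)}{\partial A}\right|_{\alpha=0}\right\}=b_i(A)(y_i-x_i'\beta)^2-c_i(A),$$

$$\left.\frac{\partial\log f(y_i|y_i,\psi)}{\partial\alpha}\right|_{\alpha=0}-E\left\{\left.\frac{\partial\log f(y_i|y_i,\psi)}{\partial\alpha}\right|_{\alpha=0}\right\}=d_i(A)(y_i-x_i'\beta),$$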

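where $a_i(A)=(1-B_i)^2/\{D_i(1+B_i)\}$, $b_i(A)=(1-B_i)^3(3+B_i)/\{2D_i^2(1+B_i)^2\}$, $c_i(A)=(1-B_i)^2(3+B_i)/\{2D_i(1+B_i)^2\}$, and $d_i(A)=\sqrt{2A/\pi}\,(1-B_i)^2/\{D_i(1+B_i)\}$.

Let $\vartheta$ denote the true $\vartheta$. If the model is correctly specified under the null hypothesis, then, under the null hypothesis, $\sum_{i=1}^m b_i(y_i,\vartheta)$ is a sum of independent random vectors with mean zero. On the other hand, if there is some misspecification in the mean function so that the true $\beta$, hence the true $\vartheta$, does not exist (under the null hypothesis), we again define the "true $\vartheta$" as the unique solution to (4.43). Then, similar to the ML case, all of the arguments in the proof go through by replacing $\sum_{i=1}^m b_i(y_i,\vartheta)$ with $\sum_{i=1}^m[b_i(y_i,\vartheta)-E\{b_i(y_i,\vartheta)\}]$. Furthermore, we have $V_b(\vartheta)=\mathrm{Var}_\vartheta\{\sum_{i=1}^m b_i(y_i,\vartheta)\}=\sum_{i=1}^m\mathrm{Var}_\vartheta\{b_i(y_i,\vartheta)\}$, where

$$\mathrm{Var}_\vartheta\{b_i(y_i,\vartheta)\}=\begin{bmatrix}g_i(A)\,x_ix_i' & 0_p & g_i(A)\,x_i\sqrt{2A/\pi}\\ 0_p' & h_i(A) & 0\\ g_i(A)\,x_i'\sqrt{2A/\pi} & 0 & g_i(A)(2A/\pi)\end{bmatrix}$$

with $g_i(A)=(1-B_i)^3/\{D_i(1+B_i)^2\}$ and $h_i(A)=(1-B_i)^4(3+B_i)^2/\{2D_i^2(1+B_i)^4\}$. Thus, if we let $B(\vartheta)=V_b(\vartheta)^{-1/2}\sum_{i=1}^m b_i(y_i,\vartheta)$, we have $B(\vartheta)\xrightarrow{d}N(0,I_s)$, where $s=\dim(\psi)$. It follows that (4.32) holds. Because $r=\dim(\vartheta)$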

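with $P=\lim E(\partial T_m/\partial\psi)$. The asymptotic null distribution of the test statistic is $\chi^2_1$. In the current case, it can be shown that

$$P=-\frac{1}{m}\begin{pmatrix}\sum_{i=1}^m\sqrt{D_i}\,x_i/(D_i+A)\\ 0\end{pmatrix}.$$

In the case of Jiang (2001), one has the test statistic

$$\hat\chi^2_J=\frac{1}{m}\sum_{k=1}^K\{N_k-p_k(\hat\psi)\}^2,$$

where $N_k=\sum_{i=1}^m 1(y_i\in C_k)=\#\{1\le i\le m: y_i\in C_k\}$, and $p_k(\psi)=\sum_{i=1}^m P_\psi(y_i\in C_k)=\sum_{i=1}^m p_{ik}(\psi)$. More specifically, the cells, $C_k$, $1\le k\le K$, are defined as follows: $C_1=(-\infty,c_1]$, $C_k=(c_{k-1},c_k]$, $2\le k\le K-1$, and $C_K=(c_{K-1},\infty)$. Regarding the choice of $K$ and the $c_k$'s, by Jiang (2001), we may choose $K=\max(p+2,[m^{1/5}])$, where $p$ is the dimension of $\beta$. Once $K$ is chosen, the $c_k$'s are chosen so that there are equal numbers of $y_i$'s within each $C_k$, $1\le k\le K$. It then follows that $N_k=m/K$, $1\le k\le K$. Finally, the $p_{ik}(\psi)$ have the following expressions:

$$p_{i1}(\psi)=\Phi\left(\frac{c_1-x_i'\beta}{\sqrt{A+D_i}}\right),$$

$$p_{ik}(\psi)=\Phi\left(\frac{c_k-x_i'\beta}{\sqrt{A+D_i}}\right)-\Phi\left(\frac{c_{k-1}-x_i'\beta}{\sqrt{A+D_i}}\right),\quad 2\le k\le K-1,$$

$$p_{iK}(\psi)=1-\Phi\left(\frac{c_{K-1}-x_i'\beta}{\sqrt{A+D_i}}\right),$$

where $\Phi(\cdot)$ is the cdf of $N(0,1)$. We then use a Monte Carlo method (e.g., bootstrapping) to compute the critical values, as suggested by Jiang (2001).

In the case of Claeskens and Hart (2009), one uses the test statistic

$$\hat\chi^2_{CH}=\max_{1\le l\le M}\frac{2\{\log L_l-\log L_{M=0}\}}{l(l+3)/2},$$

where $M$ is the order of the polynomial, which plays the role of a smoothing parameter. The test is based on the LRT, which compares the estimated distribution ($M>0$) and the null distribution ($M=0$; i.e., normal). Similar to Jiang (2001), one needs to use replications from the test statistic above to approximate the critical values. Here we consider $M=1$.

To evaluate and compare the performance of the aforementioned methods, let $\hat B^2_{ML}$, $\hat B^2_{PL}$, $\hat F$, $\hat\chi^2_J$, and $\hat\chi^2_{CH}$ represent the test statistics for the tailoring tests with ML, Maple, and the tests of Pierce, Jiang, and Claeskens-Hart,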

respectively [for notational simplicity we write $|B(\hat\vartheta)|^2$ as $\hat B^2$]. We consider the following Fay-Herriot model: $y_i=\beta_1x_i+v_i+e_i$, $1\le i\le n$; $y_i=\beta_2x_i+v_i+e_i$, $n+1\le i\le m$, where $m=2n$, $D_i=D_{i1}$ for $1\le i\le n$ and $D_i=D_{i2}$ for $n+1\le i\le m$. We choose $A=10$. Recall that $v_i\sim\mathrm{SN}(0,A,\alpha)$ and $e_i\sim N(0,D_i)$. The $D_{i1}$ are generated from the uniform distribution between 3.5 and 4.5. There are two scenarios for $D_{i2}$: one generated from (i) Uniform(3.5, 4.5) and the other from (ii) Uniform(0.5, 1.5). The $x_i$'s are generated from the uniform distribution between 0 and 1. The $x_i$'s and $D_i$'s are fixed throughout the simulation study.

The actual model that we are fitting is the Fay-Herriot model with $\beta_1=\beta_2\equiv\beta$, that is, $y_i=\beta x_i+v_i+e_i$, $1\le i\le m$, with the same assumptions for $v_i$ and $e_i$. Two scenarios are considered. In the first scenario, we let $\beta_1=\beta_2=1$; in the second scenario, we let $\beta_1=1$ and $\beta_2=3$. Thus, under the first scenario, the mean function is correctly specified with a true $\beta=1$; under the second scenario, the mean function is misspecified and, therefore, there is no true $\beta$. However, it is easy to see that, even under the second scenario, there is a unique $\vartheta=(\beta,A)$ that satisfies (4.43). Therefore, according to the theory developed in the previous sections, the tailoring methods apply under both scenarios.

To test $H_0:\alpha=0$, we consider four different sample sizes, $m=50$, 100, 200, and 500. We carry out $R=5{,}000$ simulation runs to calculate $\hat B^2_{ML}$, $\hat B^2_{PL}$, $\hat F$, $\hat\chi^2_J$, and $\hat\chi^2_{CH}$. Note that for the tailoring methods, $A$ is estimated by solving (4.34), while for $\hat F$ and $\hat\chi^2_J$ we use the Prasad-Rao method [Prasad and Rao (1990)] to estimate the parameters, and for $\hat\chi^2_{CH}$ we use ML for parameter estimation. Also, we use 1000 replications to obtain the critical values (in each simulation run) for $\hat\chi^2_J$, and 100 replications for $\hat\chi^2_{CH}$.

To compare the powers of $\hat B^2_{ML}$, $\hat B^2_{PL}$, $\hat F$, $\hat\chi^2_J$, and $\hat\chi^2_{CH}$, we repeat similar steps as above but now generate data in each simulation run $r$ under the alternative model: $y_i^{(r)}=\beta_1x_i+v_i^{(r)}+e_i^{(r)}$ $(1\le i\le n)$; and $y_i^{(r)}=\beta_2x_i+v_i^{(r)}+e_i^{(r)}$ $(n+1\le i\le m)$ for $r=1,\dots,R$, where $v_i^{(r)}\sim\mathrm{SN}(0,A,\alpha=0.5)$ and $e_i^{(r)}$ are simulated the same way as under $H_0$.

The simulated size and power are reported in Table 4.5. It seems that with the increasing sample size ($m$), we get the correct size and high power for $\hat B^2_{ML}$ and $\hat B^2_{PL}$ under different scenarios. However, in the cases of $\hat F$ and $\hat\chi^2_{CH}$ we do not get the right size, if we have both misspecification in the mean and different sampling variances for the small areas; we also do not have the right size for $\hat\chi^2_{CH}$ with the increasing sample size. Furthermore, we observe poor performance in terms of power for $\hat\chi^2_J$ and $\hat\chi^2_{CH}$ compared to $\hat B^2_{ML}$ and $\hat B^2_{PL}$. Overall, the simulation results demonstrate a robustness feature of tailoring, as suggested by the theoretical result, that the other methods do not possess.

Table 4.5 Size (Power) under Different Sample Sizes ($m$) and Scenarios

$D_{i2}$  $\beta_2$  $m$   $\hat B^2_{ML}$   $\hat B^2_{PL}$   $\hat F$          $\hat\chi^2_J$     $\hat\chi^2_{CH}$
(i)       1          50    0.048 (0.853)     0.048 (0.861)     0.046 (0.898)     0.049 (0.023)      0.054 (0.389)
                     100   0.053 (0.966)     0.052 (0.966)     0.052 (0.981)     0.051 (0.017)      0.044 (0.398)
                     200   0.051 (0.999)     0.051 (0.999)     0.049 (0.999)     0.045 (0.004)      0.007 (0.108)
                     500   0.055 (1.000)     0.055 (1.000)     0.056 (1.000)     0.048 (0.0002)     0.007 (0.096)
          3          50    0.044 (0.840)     0.044 (0.851)     0.040 (0.894)     0.053 (0.021)      0.044 (0.377)
                     100   0.050 (0.962)     0.050 (0.961)     0.049 (0.980)     0.060 (0.013)      0.046 (0.332)
                     200   0.046 (0.999)     0.046 (0.999)     0.046 (0.999)     0.053 (0.003)      0.003 (0.149)
                     500   0.053 (1.000)     0.052 (1.000)     0.052 (1.000)     0.049 (0.000)      0.000 (0.155)
(ii)      1          50    0.046 (0.848)     0.045 (0.810)     0.049 (0.855)     0.051 (0.025)      0.060 (0.461)
                     100   0.053 (0.964)     0.048 (0.934)     0.048 (0.970)     0.052 (0.017)      0.034 (0.372)
                     200   0.050 (0.999)     0.047 (0.994)     0.050 (0.998)     0.048 (0.006)      0.008 (0.112)
                     500   0.054 (1.000)     0.054 (1.000)     0.054 (1.000)     0.048 (0.0002)     0.004 (0.159)
          3          50    0.042 (0.838)     0.043 (0.799)     0.092 (0.765)     0.061 (0.021)      0.062 (0.467)
                     100   0.050 (0.958)     0.048 (0.922)     0.129 (0.932)     0.074 (0.011)      0.047 (0.329)
                     200   0.046 (0.999)     0.044 (0.994)     0.212 (0.991)     0.070 (0.003)      0.009 (0.114)
                     500   0.052 (1.000)     0.053 (1.000)     0.406 (1.000)     0.081 (0.000)      0.004 (0.370)

4.5 Real-data examples

4.5.1 TVSFP data

We use data from the Television School and Family Smoking Prevention and Cessation Project (TVSFP) to illustrate the robust likelihood-ratio test discussed in the previous sections. For a complete description of the TVSFP study, see Hedeker et al. (1994). The original study was designed to test independent and combined effects of a school-based social-resistance curriculum and a television-based program in terms of tobacco use prevention and cessation. The subjects were seventh-grade students from Los Angeles (LA) and San Diego in the State of California in the United States. The students were pretested in January 1986 in an initial study. The same students completed an immediate postintervention questionnaire in April 1986, a one-year follow-up questionnaire (in April 1987), and a two-year follow-up (in April 1988). In this analysis, we consider a subset of the TVSFP data involving students from 28 LA schools, where the schools were randomized to one of four study conditions: (a) a social-resistance classroom curriculum (CC); (b) a media (television) intervention (TV); (c) a combination

of CC and TV conditions; and (d) a no-treatment control. A tobacco and health knowledge scale (THKS) score was one of the primary study outcome variables, and the one used for this analysis. The THKS consisted of seven questionnaire items used to assess student tobacco and health knowledge. A student's THKS score was defined as the sum of the items that the student answered correctly. Only data from the pretest and immediate postintervention are available for the current analysis. More specifically, the data only involved subjects who had completed the THKS at both of these time points. In all, there were 1,600 students from the 28 schools, with the number of students from each school ranging from 18 to 137.

Hedeker et al. (1994) carried out a mixed-model analysis based on a number of NER models to illustrate maximum likelihood estimation for the analysis of clustered data. Jiang et al. (2015b) considered the same data for small area estimation [e.g., Rao and Molina (2015)]. The following NER model [Battese et al. (1988)] was proposed, which is a special case of the linear mixed model:

$$y_{ij}=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\beta_3x_{i1}x_{i2}+v_i+e_{ij}, \tag{4.50}$$

$i=1,\dots,28$, $j=1,\dots,n_i$, where $y_{ij}$ is the difference between the immediate postintervention and pretest THKS scores; $x_{i1}$, $x_{i2}$ are indicators (1 or 0) of CC and TV programs, respectively; $\beta_j$, $j=0,1,2,3$ are unknown fixed effects, with $\beta_3$ corresponding to the interaction between CC and TV; $v_i$ is a school-specific random effect, and $e_{ij}$ is an additional error. It is assumed that the random effects and errors are independent, with $v_i\sim N(0,\sigma_v^2)$ and $e_{ij}\sim N(0,\sigma_e^2)$, where $\sigma_v^2$, $\sigma_e^2$ are unknown variances (Exercise 4.5).

The analysis of Jiang et al. (2015b) suggested that the CC program is more effective than the TV program. Also, the sign of the estimated $\beta_3$ was negative, indicating that the interaction may have a negative effect. These findings have led to consideration of a testing problem, with the following null hypothesis:

$$H_0:\ \beta_2=0,\ \beta_1>0,\ \beta_3<0. \tag{4.51}$$

In a way, the null hypothesis is complex, especially because of the inequality constraints. Therefore, it is natural to consider the likelihood-ratio test. However, the data are clearly not normal in this case. Note that the possible values of $y_{ij}$ are integers between $-7$ and $7$. Therefore, the Gaussian likelihood function is not a likelihood function. However, we can still use it for the L-test, discussed in the previous sections. The maximizer of the objective function, which is the negative Gaussian log-likelihood, is given by $\hat\beta_0=0.211$, $\hat\beta_1=0.707$, $\hat\beta_2=0.240$, $\hat\beta_3=-0.319$,

$\hat\sigma_v=0.069$, and $\hat\sigma_e=1.550$. The maximizer under the null hypothesis is $\hat\beta_{00}=0.222$, $\hat\beta_{01}=0.584$, $\hat\beta_{03}=-0.082$, $\hat\sigma_{0v}=0.120$, and $\hat\sigma_{0e}=1.550$. The L-test statistic is $-2\log R=3.923$. According to Theorem 4.4, the asymptotic null distribution of the L-test is $\chi^2_1$, which, for the level of significance 0.05, has a critical value of 3.841. Thus, the null hypothesis is rejected. The conclusion is that, in spite of being less effective than the CC program, the TV program still has a positive impact on improving the schools' THKS scores (whether the improved THKS score means improved tobacco use prevention and cessation is a different matter though).

4.5.2 Median income data

It is known that income data are typically not normal. In this application, our goal is to check the normality assumption for median incomes of four-person families at the state level in the USA [Ghosh et al. (1996)]. The data have been analyzed by several researchers using different set-ups. In this analysis, the response variable $y_i$ is the sample survey estimate of four-person family median income at state $i$ in year 1989, and $x_i$ is the census four-person family median income at state $i$ in year 1979 ($i=1,\dots,51$). An inspection of the scatterplot (see Fig. 4.1) suggests that a quadratic model would fit the data. Thus, the following model is considered:

$$y_i=\beta_1x_i+\beta_2x_i^2+v_i+e_i,\quad i=1,\dots,m=51, \tag{4.52}$$

where the $v_i$'s are state-specific random effects and the $e_i$'s are sampling errors. It is assumed that $v_i$ and $e_i$ are independent with $v_i\sim\mathrm{SN}(0,A,\alpha)$ and $e_i\sim N(0,D_i)$ with known $D_i$. We consider testing $H_0:\alpha=0$ vs $H_1:\alpha\ne0$.

First, we apply the Maple and ML approaches in conjunction with tailoring. In the case of tailoring/Maple, the parameter estimates are $\hat\beta_1=2.07$, $\hat\beta_2=-1.2\mathrm{e}{-5}$, $\hat A=1.9\times10^7$, which result in rejecting $H_0$ at the 10% significance level [$\hat B^2_{PL}=4.67>2.70=\chi^2_1(0.90)$]. In the case of tailoring/ML, the parameter estimates are $\hat\beta_1=2.07$, $\hat\beta_2=-1.3\mathrm{e}{-5}$, $\hat A=1.9\times10^7$, which lead to the same conclusion ($\hat B^2_{ML}=3.68>2.70$). Thus, the tailoring methods suggest that the normality assumption is not valid.

We also applied the methods of Pierce (1982), Jiang (2001), and Claeskens and Hart (2009) to test the hypothesis. None of these tests were able to reject the normality assumption. This seems to be consistent with the pattern observed in our simulation study in Section 4.4.3 that these tests appear to be less powerful than our tests.
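The comparisons with $\chi^2_1$ critical values quoted in Sections 4.5.1 and 4.5.2 can be checked with a short script. The following is only a sketch: it uses the fact that a $\chi^2_1$ variable is the square of a standard normal one, and the function names `chi2_1_critical` and `chi2_1_pvalue` are illustrative, not taken from the book. Only the reported statistics (3.923, 4.67, 3.68) and levels come from the text.

```python
import math
from statistics import NormalDist

def chi2_1_critical(level):
    """Upper critical value of chi-square(1): the square of the
    standard normal quantile at 1 - level/2."""
    z = NormalDist().inv_cdf(1.0 - level / 2.0)
    return z * z

def chi2_1_pvalue(stat):
    """P(chi-square(1) > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat/2))."""
    return math.erfc(math.sqrt(stat / 2.0))

# Test statistics and significance levels as reported in the text.
reported = {
    "L-test, TVSFP data": (3.923, 0.05),
    "tailoring/Maple, income data": (4.67, 0.10),
    "tailoring/ML, income data": (3.68, 0.10),
}

for name, (stat, level) in reported.items():
    crit = chi2_1_critical(level)
    print(f"{name}: statistic={stat}, level={level}, "
          f"critical={crit:.3f}, p-value={chi2_1_pvalue(stat):.4f}, "
          f"reject={stat > crit}")
```

The same two functions apply to any statistic whose asymptotic null distribution is $\chi^2_1$; for $\nu>1$ degrees of freedom a general chi-square routine would be needed instead.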

Fig. 4.1 Plot of Census Four-person Family Median Income in 1979 ($x$) vs Sample Survey of Four-person Family Median Income in 1989 ($y$) for Different States.

4.6 Exercises

4.1. Show that, for the test considered in Subsection 3.3.3, the test statistic (4.4) reduces to $\hat\chi^2=(\hat\gamma_1-1)^2/\hat\Sigma_{R,11}$, where $\hat\Sigma_{R,11}$ is given by (3.26), and the asymptotic null distribution is $\chi^2_1$.

4.2. Write out specifically the parameter space $\Theta_0$ in the null hypothesis (4.11) for the cases of Examples 4.3–4.6.

4.3. Extend Theorem 4.5 in the same way as the Extensions of Theorem 4.2 and Theorem 4.3 to allow $A$, $B$, $C$ to be replaced by sequences of matrices.

4.4. Show that the maximizer of (4.27) is $\hat\vartheta=(\hat\beta,\hat\tau^2,\hat\gamma)$, where $\hat\tau^2$ and $\hat\gamma$ are the REML estimators, and $\hat\beta$ is given above (4.27). However, $l_R(\vartheta,y)$ is not a log-likelihood function, even if normality holds. Please explain why.

4.5. Derive an expression of the Gaussian log-likelihood under the nested-error regression model of Section 4.4. The expression should not involve any matrix products.

4.6. Verify the expressions for $\mathrm{Var}_\vartheta\{z_i(y_i,\vartheta)\}$ and $E_\vartheta(\partial z_i/\partial\vartheta')$ given in §4.4.2.1.

4.7. Verify expression (4.49). Also verify the expressions for $\mathrm{Var}_\vartheta\{b_i(y_i,\vartheta)\}$ and $E_\vartheta(\partial b_i/\partial\vartheta')$ given in §4.4.2.2.

Chapter 5

Observed Best Prediction

So far, we have been considering types of inference such as estimation and testing. There is another type of statistical inference that has not been touched: prediction. As noted in Jiang (2007, sec. 2.3), there are two kinds of prediction problems associated with mixed effects models. The first kind, which is encountered more often in practice, is prediction of mixed effects; the second kind is prediction of a future observation. In this chapter, we shall focus on prediction of mixed effects. In this regard, the best known method is best linear unbiased prediction, or BLUP. So, before we present a new method, let us first review basic elements of BLUP.

5.1 Best linear unbiased prediction

A (linear) mixed effect is a linear combination of fixed and random effects, expressed as $\eta=b'\beta+a'\alpha$, where $b$, $a$ are known vectors, and $\beta$ and $\alpha$ are the vectors of fixed and random effects, respectively. Prediction of a mixed effect has a long history, dating back to C. R. Henderson in his early work in the field of animal breeding [e.g., Henderson (1948)]. Robinson (1991) gives a wide-ranging account of prediction of mixed effects using BLUP with examples and applications. Jiang and Lahiri (2006) offer another review of prediction of mixed effects, in which the authors used the term mixed model prediction for such prediction problems, with a focus on small area estimation.

When the fixed effects and variance components are known, the best predictor (BP) of $\eta$, in the sense of minimum mean squared prediction error (MSPE), is its conditional expectation given the data:

$$\tilde\eta=E(\eta|y)=b'\beta+a'E(\alpha|y). \tag{5.1}$$

Now assume that $y$ satisfies the following LMM:

$$y=X\beta+Z\alpha+\epsilon, \tag{5.2}$$

where $X$, $Z$ are known matrices; $\beta$ is a vector of fixed effects; $\alpha$, $\epsilon$ are vectors of random effects and errors, respectively, such that $\alpha\sim N(0,G)$, $\epsilon\sim N(0,R)$, and $\alpha$, $\epsilon$ are independent. Under such LMM assumptions, one has the well-known expression for the conditional expectation in (5.1):

$$E(\alpha|y)=GZ'V^{-1}(y-X\beta), \tag{5.3}$$

where $V=\mathrm{Var}(y)=ZGZ'+R$. Thus, if $\beta$, $G$, $R$ are known, the BP of $\eta$ is given by (5.1) and (5.3).

In practice, the $\beta$, $G$, $R$ involved in the BP are usually unknown. Before we handle the most practical situation, let us first make a small step, by assuming that $\beta$ is unknown but $G$, $R$ are known. In such a case, it is customary to estimate $\beta$ by its MLE, which is given by (1.3) with $\hat V$ replaced by the $V$ given below (5.3). This leads to the BLUP. In other words, the BLUP of $\eta$ is $\tilde\eta=b'\tilde\beta+a'\tilde\alpha$, where

$$\tilde\beta=(X'V^{-1}X)^{-1}X'V^{-1}y, \tag{5.4}$$

$$\tilde\alpha=GZ'V^{-1}(y-X\tilde\beta). \tag{5.5}$$

Note that (5.4) is the same as the BLUE introduced earlier [see (2.30)] while (5.5) is the BLUP of $\alpha$.

As for the most general situation, where $G$, $R$ are also unknown, one needs to replace $G$, $R$ in (5.4), (5.5) by their estimators. For example, it is often assumed that $G$, $R$ are known functions of some vector of variance components, $\theta$, that is, $G=G(\theta)$ and $R=R(\theta)$. Then, if $\hat\theta$ is a consistent estimator of $\theta$, $G$, $R$ are estimated by $\hat G=G(\hat\theta)$, $\hat R=R(\hat\theta)$, respectively. For example, $\hat\theta$ may be the REML estimator, discussed in Section 1.1 (also see Section 3.2). Once the $G$, $R$ in BLUP are replaced by their estimators, the result is called empirical BLUP, or EBLUP. In other words, the EBLUP of $\eta$ is $\hat\eta=b'\hat\beta+a'\hat\alpha$, where $\hat\beta$ is given by (1.3),

$$\hat\alpha=\hat GZ'\hat V^{-1}(y-X\hat\beta), \tag{5.6}$$

and $\hat V=Z\hat GZ'+\hat R$.

5.2 Observed best prediction: Another look at the BLUP

If one thinks more carefully, the BLUP may be viewed as a hybrid of optimal prediction, that is, BP, and optimal estimation, that is, ML. More

specifically, up to (5.3) one has the BP, but after that there is, perhaps unnoticeably, a shift of focus to best estimation, that is, estimating $\beta$ by its MLE, assuming that $G$, $R$ are known. However, if prediction of the mixed effect is of main interest, it may be wondered why one has to hybridize.

To make it simpler, let us stay, for now, with the assumption that $G$, $R$ are known. Can we estimate $\beta$ in a way that is also in the best interest of prediction (of the mixed effect)? Note that the MLE is known to be (asymptotically) optimal in terms of estimation, but not necessarily in terms of prediction (in fact, it is not, as we shall show). This idea leads to a new approach for mixed model prediction, proposed by Jiang et al. (2011a). The approach has a surprising bonus, as it turns out: it leads to a predictor that is more robust to model misspecification than the BLUP, or EBLUP.

A key to the new approach is to entertain two models. One is called the assumed model, which is (5.2); the other is called the broader model, or true model. For simplicity, let us consider, for now, the case that the potential model misspecification is only in terms of the mean function; in other words, the true model is

$$y=\mu+Z\alpha+\epsilon, \tag{5.7}$$

where $\mu=E(y)$. Here, $E$ denotes expectation under the true distribution of $y$, which may be unknown, but is not model-dependent. Our interest is prediction of a vector of mixed effects, expressed as

$$\zeta=F'\mu+C'\alpha, \tag{5.8}$$

where $F$, $C$ are known matrices. We illustrate with an example.

Example 5.1 (Fay–Herriot model). Fay and Herriot [Fay and Herriot (1979)] proposed the following model to estimate per-capita income of small places with population size less than 1,000: $y_i=x_i'\beta+v_i+e_i$, $i=1,\dots,m$, where $y_i$ is a direct estimate (sample mean) for the $i$th area, $x_i$ is a vector of known covariates, $\beta$ is a vector of unknown regression coefficients, the $v_i$'s are area-specific random effects and the $e_i$'s are sampling errors. It is assumed that the $v_i$'s, $e_i$'s are independent with $v_i\sim N(0,A)$ and $e_i\sim N(0,D_i)$. The variance $A$ is unknown, but the sampling variances $D_i$'s are assumed known (in practice, the $D_i$'s can be estimated with accuracy, hence can be assumed known, at least approximately). Here, the interest is to estimate the small area mean $\zeta=(\zeta_i)_{1\le i\le m}=\mu+v$, where $\mu=(\mu_i)_{1\le i\le m}$ with $\mu_i=E(y_i)$, and $v=(v_i)_{1\le i\le m}$. In other words, $\zeta=E(y|v)$, which can be expressed as (5.8) with $F=C=I_m$.
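To make the ingredients of Example 5.1 concrete, the model can be simulated directly. The sketch below is illustrative only: it assumes a scalar covariate and a correctly specified mean $\mu_i=x_i\beta$, and the function name `simulate_fay_herriot`, the seed, and the numerical inputs are hypothetical choices rather than anything prescribed by the text.

```python
import random

def simulate_fay_herriot(x, D, beta, A, seed=0):
    """Draw one data set from y_i = x_i*beta + v_i + e_i with
    v_i ~ N(0, A) and e_i ~ N(0, D_i), scalar covariate x_i.

    Returns (y, zeta), where zeta_i = x_i*beta + v_i is the small
    area mean, i.e. zeta = mu + v with mu_i = x_i*beta.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, A ** 0.5) for _ in D]       # area-specific random effects
    e = [rng.gauss(0.0, d ** 0.5) for d in D]       # sampling errors, known variances
    zeta = [xi * beta + vi for xi, vi in zip(x, v)]
    y = [zi + ei for zi, ei in zip(zeta, e)]        # direct estimates
    return y, zeta

# Hypothetical inputs: four areas, two with large and two with small D_i.
x = [0.2, 0.4, 0.6, 0.8]
D = [4.0, 4.0, 0.5, 0.5]
y, zeta = simulate_fay_herriot(x, D, beta=1.0, A=10.0, seed=1)
for yi, zi, di in zip(y, zeta, D):
    print(f"D_i={di}: direct estimate y_i={yi:.3f}, small area mean zeta_i={zi:.3f}")
```

In a real application the $\zeta_i$ are of course unobserved; the simulation exposes them only so that predictors can be compared against the truth.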

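Similar to (5.1), (5.3), the BP of $\zeta$ is given by

$$E_M(\zeta|y)=F'\mu+C'E_M(\alpha|y)=F'X\beta+C'GZ'V^{-1}(y-X\beta)=F'y-\Gamma(y-X\beta), \tag{5.9}$$

where $\Gamma=F'-B$ with $B=C'GZ'V^{-1}$. Here, unlike $E$, $E_M$ denotes conditional expectation under the assumed model, (5.2), denoted by $M$. Thus, the BP is model-based in the sense that its expression depends on the assumed LMM, (5.2), including the normality assumption.

Let $\check\zeta$ denote the right side of (5.9) with a fixed, arbitrary $\beta$. Then, by (5.7), (5.8), we have $\check\zeta-\zeta=H'\alpha+F'\epsilon-\Gamma(y-X\beta)$, where $H=Z'F-C$. Thus, we have

$$\mathrm{MSPE}(\check\zeta)=E(|\check\zeta-\zeta|^2)=E(|H'\alpha+F'\epsilon|^2)-2E\{(\alpha'H+\epsilon'F)\Gamma(y-X\beta)\}+E\{(y-X\beta)'\Gamma'\Gamma(y-X\beta)\}=I_1-2I_2+I_3.$$

It is easy to see that $I_1$ does not depend on $\beta$. In fact, $I_2$ does not depend on $\beta$ either, because

$$I_2=E\{(\alpha'H+\epsilon'F)\Gamma(y-\mu)\}+\{E(\alpha'H+\epsilon'F)\}\Gamma(\mu-X\beta)=E\{(\alpha'H+\epsilon'F)\Gamma(Z\alpha+\epsilon)\},$$

by (5.7). Thus, we can express the MSPE as

$$\mathrm{MSPE}(\check\zeta)=E\{(y-X\beta)'\Gamma'\Gamma(y-X\beta)+\cdots\}, \tag{5.10}$$

where $\cdots$ does not depend on $\beta$. The idea is to estimate $\beta$ by minimizing the observed MSPE, which is the expression inside the expectation in (5.10). Then, because $\cdots$ does not depend on $\beta$, this is equivalent to minimizing $(y-X\beta)'\Gamma'\Gamma(y-X\beta)$. The solution is what we call the best predictive estimator, or BPE, given by

$$\hat\beta=(X'\Gamma'\Gamma X)^{-1}X'\Gamma'\Gamma y, \tag{5.11}$$

assuming nonsingularity of $\Gamma'\Gamma$ and that $X$ is of full rank. The resulting predictor of $\zeta$ is called the observed best predictor, or OBP, given by the right side of (5.9) with $\beta$ replaced by $\hat\beta$. The term OBP is due to the fact that the BPE is obtained by minimizing the observed MSPE (if the observed MSPE were the true MSPE, the same procedure would lead to the BP). We consider an example.

Example 5.1 (continued). It is easy to verify (Exercise 5.1) that the BPE of $\beta$ is given by

$$\hat\beta=\left\{\sum_{i=1}^m\left(\frac{D_i}{A+D_i}\right)^2x_ix_i'\right\}^{-1}\sum_{i=1}^m\left(\frac{D_i}{A+D_i}\right)^2x_iy_i. \tag{5.12}$$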

Also, the BP of $\zeta_i$, the $i$th component of $\zeta$, is given by

$$\tilde\zeta_i=\frac{A}{A+D_i}y_i+\frac{D_i}{A+D_i}x_i'\beta, \tag{5.13}$$

where $\beta$, $A$ are the true parameters. Thus, assuming that $A$ is known, the OBP of $\zeta_i$ is given by (5.13) with $\beta$ replaced by $\hat\beta$ of (5.12). On the other hand, it can be shown that the MLE of $\beta$, assuming that $A$ is known, is

$$\tilde\beta=\left(\sum_{i=1}^m\frac{x_ix_i'}{A+D_i}\right)^{-1}\sum_{i=1}^m\frac{x_iy_i}{A+D_i}. \tag{5.14}$$

The BLUP of $\zeta_i$ is given by (5.13) with $\beta$ replaced by $\tilde\beta$ of (5.14). It is interesting to note that both the MLE, (5.14), and BPE, (5.12), are in the forms of weighted averages, the only difference being the weights. In particular, the BPE gives more weight to areas with larger sampling variances, $D_i$, while the MLE does just the opposite, assigning more weight to areas with smaller sampling variances. A question is: Who is right?

To answer this question, first note that, from the expression of the BP, (5.13), it is evident that the assumed model is involved only through $x_i'\beta$, whose corresponding weight, $D_i/(A+D_i)$, is increasing with $D_i$. In other words, the model-based BP is more relevant to those areas with larger $D_i$. Imagine that there is a meeting of representatives from the different small areas to discuss what estimate of $\beta$ is to be used in the BP. The areas with larger $D_i$ think that their "voice" should be heard more (i.e., they should receive more weight), because the BP is more relevant to their "business". Their request is reasonable (although, politically, this may not work out within a democratic voting system).

Not only does the OBP have an intuitive explanation, it also has the attractive property that it is more robust to model misspecification than BLUP, in terms of MSPE. This is not surprising, given the way that OBP is derived, but the following theorem, proved in Jiang et al. (2011a), makes this story precise. Consider an empirical best predictor (EBP), $\check\zeta$, which is the BP [i.e., right side of (5.9)] with $\beta$ replaced by a weighted least squares (WLS) estimator (recall that $G$, $R$ are assumed known),

$$\check\beta=(X'WX)^{-1}X'Wy, \tag{5.15}$$

where $W$ is a positive definite weighting matrix. The BPE and MLE are special cases of the WLS estimator, with $W=\Gamma'\Gamma$ for the former and $W=V^{-1}$ for the latter. Thus, the OBP and BLUP are special cases of the EBP. Also recall that $\mu=E(y)$.
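The contrast between the weights in (5.12) and (5.14) is easy to see numerically. The sketch below assumes a scalar covariate and a known $A$, so that both estimators reduce to ratios of weighted sums; the function names and the toy data are hypothetical and serve only to illustrate the direction of the weighting.

```python
def wls_beta(x, y, w):
    """Weighted least squares for a scalar covariate:
    sum(w*x*y) / sum(w*x^2), the scalar case of (5.15)."""
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den

def bpe_beta(x, y, D, A):
    """BPE of (5.12): weights (D_i / (A + D_i))^2."""
    return wls_beta(x, y, [(d / (A + d)) ** 2 for d in D])

def mle_beta(x, y, D, A):
    """MLE of (5.14): weights 1 / (A + D_i)."""
    return wls_beta(x, y, [1.0 / (A + d) for d in D])

def bp(x, y, D, A, beta):
    """BP of (5.13) evaluated at a given beta, area by area."""
    return [A / (A + d) * yi + d / (A + d) * xi * beta
            for xi, yi, d in zip(x, y, D)]

# Hypothetical toy data: areas 1-2 have large D_i and low y_i,
# areas 3-4 have small D_i and high y_i.
x = [1.0, 1.0, 1.0, 1.0]
y = [2.0, 2.5, 6.0, 5.5]
D = [4.0, 4.0, 0.5, 0.5]
A = 1.0

beta_bpe = bpe_beta(x, y, D, A)   # pulled toward the large-D_i areas
beta_mle = mle_beta(x, y, D, A)   # pulled toward the small-D_i areas
obp = bp(x, y, D, A, beta_bpe)    # OBP: (5.13) with the BPE plugged in
blup = bp(x, y, D, A, beta_mle)   # BLUP: (5.13) with the MLE plugged in

print("BPE:", round(beta_bpe, 4), " MLE:", round(beta_mle, 4))
print("OBP: ", [round(z, 3) for z in obp])
print("BLUP:", [round(z, 3) for z in blup])
```

With these inputs the BPE lands closer to the mean of the two large-$D_i$ areas and the MLE closer to the mean of the two small-$D_i$ areas, which is the behavior described in the text. For a vector covariate the same weights would enter a matrix version of the normal equations.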

Theorem 5.1. The MSPE of EBP can be expressed as

$$\mathrm{MSPE}(\check\zeta)=a_0+\mu'A_1(W)\mu+\mathrm{tr}\{A_2(W)\}, \tag{5.16}$$

where $a_0$ is a nonnegative constant that does not depend on $W$; $A_j(W)$, $j=1,2$ are nonnegative definite matrices that depend on $W$. Furthermore, $\mu'A_1(W)\mu$ is minimized by the OBP, with $W=\Gamma'\Gamma$, and $\mathrm{tr}\{A_2(W)\}$ is minimized by the BLUP, with $W=V^{-1}$.

The expressions of $a_0$, $A_j(W)$, $j=1,2$ are given below. Define $L=F'-\Gamma\{I-X(X'WX)^{-1}X'W\}$, where $I$ is the identity matrix, and recall $B=C'GZ'V^{-1}$. Then, we have

$$a_0=\mathrm{tr}\{C'(G-GZ'V^{-1}ZG)C\},$$

$$A_1(W)=(F'-L)'(F'-L),$$

$$A_2(W)=(L-B)V(L-B)'.$$

Here are some of the important implications of Theorem 5.1. First note that the second term disappears when the mean of $y$ is correctly specified, that is, $\mu=X\beta$ for some $\beta$. So, when the underlying model is correct, BLUP usually wins the battle of MSPE, because it minimizes the only (remaining) term that depends on $W$. On the other hand, when the mean is misspecified, the second term is usually of higher order than the third term. Thus, in this situation, OBP is likely to win the battle, because it minimizes a higher order term. We use an example to illustrate.

5.3 Example

We use a very simple example to show that the potential gain of OBP over BLUP can be substantial, if the underlying model is misspecified. Consider a special case of Example 5.1, in which $x_i'\beta=\beta$, an unknown mean. To make it even simpler, suppose that $A$ is known, so that one can actually compute the BLUP. Furthermore, suppose that $m=2n$, $D_i=a$, $1\le i\le n$, and $D_i=b$, $n+1\le i\le m$, where $a$, $b$ are positive known constants. Now suppose that, actually, the underlying model is

$$y_i=\begin{cases}c+v_i+e_i, & 1\le i\le n,\\ d+v_i+e_i, & n+1\le i\le m,\end{cases}$$

where $c\ne d$; in other words, we have a model misspecification by assuming $c=d$. Consider $\zeta=(\zeta_i)_{1\le i\le m}$, where

$$\zeta_i=\begin{cases}c+v_i, & 1\le i\le n,\\ d+v_i, & n+1\le i\le m.\end{cases}$$

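For this special case, we can actually derive the exact expressions of the MSPEs. It can be shown (Exercise 5.2) that

$$\mathrm{MSPE}(\hat\zeta^{\mathrm{OBP}})=\left[\left(\frac{a}{A+a}+\frac{b}{A+b}\right)A+\frac{a^2b^2(c-d)^2}{a^2(A+b)^2+b^2(A+a)^2}\right]n+\frac{a^4(A+b)^3+b^4(A+a)^3}{(A+a)(A+b)\{a^2(A+b)^2+b^2(A+a)^2\}}=g_1n+h_1, \tag{5.17}$$

$$\mathrm{MSPE}(\hat\zeta^{\mathrm{BLUP}})=\left[\left(\frac{a}{A+a}+\frac{b}{A+b}\right)A+\frac{(a^2+b^2)(c-d)^2}{(2A+a+b)^2}\right]n+\frac{a^2(A+b)^2+b^2(A+a)^2}{(A+a)(A+b)(2A+a+b)}=g_2n+h_2. \tag{5.18}$$

If $c\ne d$, then $g_1\le g_2$ and $h_1\ge h_2$, with equality holding in both cases if and only if the following identity holds:

$$a^2(A+b)=b^2(A+a). \tag{5.19}$$

Now suppose that (5.19) does not hold; then we have $g_1<g_2$ and $h_1>h_2$, but the latter is not important when $n$ is large. In fact, we have

$$\lim_{n\to\infty}\frac{\mathrm{MSPE}(\hat\zeta^{\mathrm{OBP}})}{\mathrm{MSPE}(\hat\zeta^{\mathrm{BLUP}})}=\frac{g_1}{g_2}<1.$$

For example, suppose that $A/(c-d)^2\approx0$, $A/b\approx0$ and $b/a\approx0$. Then it is easy to show that $g_1/g_2\approx0.5$ (Exercise 5.2). Thus, in this case, the MSPE of the OBP is asymptotically about half of that of the BLUP.

On the other hand, if $c=d$, that is, if the underlying model is correctly specified, then we have $g_1=g_2$ while, still, $h_1\ge h_2$. Therefore, in this case, $\mathrm{MSPE}(\hat\zeta^{\mathrm{OBP}})\ge\mathrm{MSPE}(\hat\zeta^{\mathrm{BLUP}})$; however, we have

$$\lim_{n\to\infty}\frac{\mathrm{MSPE}(\hat\zeta^{\mathrm{OBP}})}{\mathrm{MSPE}(\hat\zeta^{\mathrm{BLUP}})}=1;$$

hence, the MSPEs of the OBP and BLUP are asymptotically the same.

5.4 OBP for two classes of SAE models

In this section, we derive the OBP for two important classes of LMMs that are frequently used in small area estimation (SAE).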

5.4.1 Fay-Herriot model

Let us now refer back to the Fay-Herriot model, introduced in Example 5.1, but with $A$ unknown. Again, we begin with the left side of (5.10), and note that the expectations involved are with respect to the true underlying distribution that is unknown, but not model-dependent. By (5.13), the BP of $\zeta$ can be expressed, in matrix form, as $\tilde{\zeta} = y - \Gamma(y - X\beta)$, where $\Gamma$ is defined below (5.9). Thus, by (5.7) and (5.8) (specialized to this particular case), it can be shown (Exercise 5.3) that

$$\mathrm{MSPE}(\tilde{\zeta}) = E\{(y - X\beta)'\Gamma^2(y - X\beta) + 2A\,\mathrm{tr}(\Gamma) - \mathrm{tr}(D)\}. \tag{5.20}$$

The BPE of $\psi = (\beta', A)'$ is obtained by minimizing the expression inside the expectation on the right side of (5.20), which is equivalent to minimizing

$$Q(\psi) = (y - X\beta)'\Gamma^2(y - X\beta) + 2A\,\mathrm{tr}(\Gamma). \tag{5.21}$$

Note that, in (5.20), $\psi$ is treated as an unknown parameter vector, rather than the true parameter vector. Let $\tilde{Q}(A)$ be $Q(\psi)$ with $\beta = \tilde{\beta}$ given by the right side of (5.12), considered as a function of $A$. It can be shown (Exercise 5.3) that

$$\tilde{Q}(A) = y'\Gamma P_{(\Gamma X)^{\perp}}\Gamma y + 2A\,\mathrm{tr}(\Gamma), \tag{5.22}$$

where for any matrix $M$, $P_{M^{\perp}} = I - P_M$ with $P_M = M(M'M)^{-1}M'$ (assuming nonsingularity of $M'M$); hence, we have $P_{(\Gamma X)^{\perp}} = I_m - \Gamma X(X'\Gamma^2 X)^{-1}X'\Gamma$, and $I_m$ is the $m$-dimensional identity matrix. The BPE of $A$ is the minimizer of $\tilde{Q}(A)$ with respect to $A \ge 0$, denoted by $\hat{A}$. Once $\hat{A}$ is obtained, the BPE of $\beta$, $\hat{\beta}$, is given by (5.12) with $A$ replaced by $\hat{A}$. Given the BPE of $\psi$, $\hat{\psi} = (\hat{\beta}', \hat{A})'$, the OBP of $\zeta$ is given by the BP (5.13) with $\psi = \hat{\psi}$.

5.4.2 Nested-error regression model

Consider sampling from a finite population. To link the population to a nested-error regression (NER) model, consider a super-population NER model. Suppose that the subpopulations of responses $\{Y_{ik}, k = 1, \dots, N_i\}$ and auxiliary data $\{X_{ikl}, k = 1, \dots, N_i\}$, $l = 1, \dots, p$, are realizations from corresponding super-populations that are assumed to satisfy

$$Y_{ik} = X_{ik}'\beta + v_i + e_{ik}, \quad i = 1, \dots, m, \; k = 1, \dots, N_i, \tag{5.23}$$

where $\beta$ is a vector of unknown regression coefficients, $v_i$ is a subpopulation-specific random effect, and $e_{ik}$ is an error. It is assumed that the $v_i$'s

and $e_{ik}$'s are independent with $v_i \sim N(0, \sigma_v^2)$ and $e_{ik} \sim N(0, \sigma_e^2)$. Under the finite-population setting, the true small area mean is $\theta_i = \bar{Y}_i = N_i^{-1}\sum_{k=1}^{N_i} Y_{ik}$ (as opposed to $\theta_i = \bar{X}_i'\beta + v_i$ under the infinite-population setting) for $1 \le i \le m$. Furthermore, write $r_i = n_i/N_i$. Now suppose that simple random samples [e.g., Fuller (2009)] $(y_{ij}, x_{ij})$, $1 \le j \le n_i$, are drawn from the finite population, $\{(Y_{ik}, X_{ik}), 1 \le k \le N_i\}$, $1 \le i \le m$. Then, the finite-population version of the BP has the expression (Exercise 5.4)

$$\tilde{\theta}_i = E_M(\theta_i | y_i) = \bar{X}_i'\beta + \left\{r_i + (1 - r_i)\frac{n_i\gamma}{1 + n_i\gamma}\right\}(\bar{y}_{i\cdot} - \bar{x}_{i\cdot}'\beta), \tag{5.24}$$

where $y_i = (y_{ij})_{1 \le j \le n_i}$, $\bar{y}_{i\cdot} = n_i^{-1}\sum_{j=1}^{n_i} y_{ij}$, $\bar{x}_{i\cdot} = n_i^{-1}\sum_{j=1}^{n_i} x_{ij}$, $E_M$ denotes (conditional) expectation under the assumed super-population NER model, and $\beta$ and $\gamma = \sigma_v^2/\sigma_e^2$ are the true parameters. Again, note that the BP is model-dependent.

In practice, an assumed model is subject to misspecification. Jiang et al. (2011a) considered misspecification of the mean function, while assuming that the variance-covariance structure of the data is correctly specified. However, the latter, too, may be misspecified in practice. In this subsection, we extend the potential model misspecification to both the mean function and the variance-covariance structure. One possible misspecification of the variance-covariance structure is heteroscedasticity, defined in terms of $\mathrm{var}(e_{ij}) = \sigma_i^2$ for area $i$, $1 \le i \le m$, where the $\sigma_i^2$'s are unknown and possibly different. For example, Jiang and Nguyen (2012) used the well-known Iowa crops data [Battese et al. (1988)] to show that the homoscedastic variance assumption of the NER model may not hold in practice.

On the other hand, in spite of the potential model misspecification, there are reasons that one cannot “abandon” the assumed model, and the model-based BP. First, the assumed model and BP are relatively simple to use, and therefore attractive to practitioners. In particular, they explore a simple (linear) relationship between the response and auxiliary variables. For example, in contrast to (5.23), one may assume $Y_{ik} = \mu_{ik} + v_i + e_{ik}$, where the $\mu_{ik}$ are completely unspecified, unknown constants. The latter model is almost always correct, but is nevertheless useless, because it does not utilize any relationship between $Y$ and $X$ at all. In fact, in practice, if auxiliary data are available, it is often “politically incorrect” not to use them. Secondly, even though there is a concern about the model misspecification, there is often a lack of (statistical) evidence on why something else is more reasonable, or whether a complication is necessary. For example, sometimes

there is a concern about the normality assumption, as discussed in Section 4.4.2, but no indication on why an alternative distribution, say, $t_5$, is more reasonable. As another example, suppose that one fits a quadratic model and finds that the coefficient of the quadratic term is insignificant. Then, one is not sure whether the complication of quadratic modeling is necessary as opposed to linear modeling.

Thus, as far as this section is concerned, we are not attempting to change the assumed model, or the BP based on the assumed model. In particular, we assume a single parameter, $\gamma$, for the ratio $\sigma_v^2/\sigma_e^2$, rather than considering a heteroscedastic NER model such as in Jiang and Nguyen (2012). Our goal is to find a better way to estimate the parameters, $\psi = (\beta', \gamma)'$, under the assumed model, so that the resulting BP, (5.24), is more robust against model misspecifications. We do so by considering an objective MSPE that is not model-dependent, defined as follows. Let $\theta = (\theta_i)_{1 \le i \le m}$ denote the vector of small area means, and $\tilde{\theta} = (\tilde{\theta}_i)_{1 \le i \le m}$ the vector of BPs. Note that $\tilde{\theta}_i$ depends on $\psi$, that is, $\tilde{\theta}_i = \tilde{\theta}_i(\psi)$. The design-based [e.g., Fuller (2009)] MSPE is

$$\mathrm{MSPE}(\tilde{\theta}) = E(|\tilde{\theta} - \theta|^2) = \sum_{i=1}^{m} E\{\tilde{\theta}_i(\psi) - \theta_i\}^2. \tag{5.25}$$

Note that the $E$ in (5.25) is different from the $E_M$ in (5.24) in that $E$ is completely model-free; namely, the expectation in (5.25) is with respect to the simple random sampling from the finite populations, which has nothing to do with the assumed model.

It can be shown (Exercise 5.5) that the MSPE in (5.25) has an alternative expression. Namely, we have $\mathrm{MSPE}(\tilde{\theta}) = E\{Q(\psi) + \cdots\}$, where $\cdots$ does not depend on $\psi$, and

$$Q(\psi) = \sum_{i=1}^{m}\left\{\tilde{\theta}_i^2(\psi) - 2\,\frac{1 - r_i}{1 + n_i\gamma}\,\bar{y}_{i\cdot}\bar{X}_i'\beta + b_i(\gamma)\hat{\mu}_i^2\right\} = \sum_{i=1}^{m} Q_i. \tag{5.26}$$

In (5.26), $\psi$ is considered as a parameter vector, rather than the true parameter vector, $b_i(\gamma) = 1 - 2a_i(\gamma)$ with $a_i(\gamma) = r_i + (1 - r_i)n_i\gamma(1 + n_i\gamma)^{-1}$. Furthermore, $\hat{\mu}_i^2$ is a design-unbiased estimator of $\bar{Y}_i^2$ that has the following expression (Exercise 5.6):

$$\hat{\mu}_i^2 = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{N_i - 1}{N_i(n_i - 1)}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_{i\cdot})^2, \tag{5.27}$$

assuming that $n_i > 1$. The BPE of $\psi$, $\hat{\psi}$, is the minimizer of $Q(\psi)$ with respect to $\psi$. Also note that the BP is based on the (model-based) area-specific MSPE (so it is optimal for every small area, if the assumed model is

correct), while the BPE is based on the (design-based) overall MSPE. This is because, here, we do not want the estimator of $\psi$ to be area-dependent. One reason is that area-dependent estimators are often unstable due to the small sample size from the area, while an estimator obtained by utilizing all of the areas, such as the BPE defined above, tends to be much more stable. However, area-dependent estimators could be explored, if one can solve the stability problem.

Once the BPE is obtained, the OBP of $\theta_i$ is given by (5.24) with $\psi$ replaced by its BPE.

5.5 OBP for small area counts

So far OBP has been applied to linear models, which may be appropriate when the response variable is continuous. However, in many SAE problems, the data for the response variables are counts. See, for example, Münnich et al. (2009), Ferrante and Trivisano (2010). Such cases are often treated under the framework of GLMM (see, e.g., Section 2.1). Such models have been studied in the context of SAE. See, for example, Malec et al. (1997), Ghosh et al. (1998), Jiang et al. (2001). In particular, Poisson mixed models, a special class of GLMM that is considered in this section, have appeared in recent literature in SAE. For example, Münnich et al. (2009) used both the binomial and Poisson mixed models for estimating population counts in the 2011 German census; Ferrante and Trivisano (2010) proposed a Poisson log-normal model, which is similar to the Poisson mixed model except formulated as a hierarchical model, for estimation of the number of recruits by firms; in a similar development, Hajarisman (2013) proposed a two-level hierarchical Poisson model for estimation of infant mortality rates using a Bayesian approach. In this section, we extend the OBP method to SAE with count data.

5.5.1 Best prediction under a two-stage model

For simplicity, suppose that responses are counts at the area-level, denoted by $y_i$, and that, in addition, a vector of covariates, $x_i$, is also available at the area-level. The model of interest has two stages. In the first stage, it assumes that, given the area-specific means, $\mu_i$, $1 \le i \le m$, where $m$ is the number of small areas, $y_i$, $1 \le i \le m$, are independent such that $y_i \sim \mathrm{Poisson}(\mu_i)$. In the second stage, a model is assumed for the distribution

of the $\mu_i$'s. In this section, we mainly focus on two of these models. The first is a mixed-effect log-normal (LN) model, expressed as

$$\log(\mu_i) = x_i'\beta + v_i, \tag{5.28}$$

where $\beta$ is a vector of fixed effects, and $v_i$ is an area-specific random effect that is assumed to be independent, for different $i$'s, and distributed as $N(0, \sigma^2)$ for some variance $\sigma^2$. The second is a Gamma (GM) model, which assumes that the $\mu_i$'s are independent with

$$\mu_i \sim \mathrm{Gamma}(x_i'\beta/\phi, \phi), \tag{5.29}$$

where $\mathrm{Gamma}(\alpha, \phi)$ is the Gamma distribution with shape parameter $\alpha$, scale parameter $\phi$, and probability density function (pdf) given by

$$f(u | \alpha, \phi) = \frac{1}{\Gamma(\alpha)\phi^{\alpha}}\,u^{\alpha - 1}e^{-u/\phi}, \quad u > 0.$$

Under either model, the BP of $\mu_i$ can be expressed as

$$E_{M,\psi}(\mu_i | y) = g_i(\psi, y_i), \tag{5.30}$$

where $E_{M,\psi}$ denotes conditional expectation under the assumed model, $M$, and parameter vector, $\psi$, under $M$, and $g_i(\cdot, \cdot)$ is a function that depends on the assumed model. Note that, from a Bayesian point of view, the BP is simply the posterior mean of $\mu_i$ under the model $y_i | \mu_i \sim \mathrm{Poisson}(\mu_i)$ and treating the distribution of $\mu_i$ as a prior. However, the latter is potentially misspecified, which is a main concern here. Throughout this section, we assume that the conditional model of $y_i$ given $\mu_i$ is correctly specified.

The LN model, also known as the log-linear model, is, perhaps, the most popular in practice. It is a special case of GLMM [e.g., Breslow and Clayton (1993), Jiang (2007)]. In particular, the random effect $v_i$ often has an interpretation, which is attractive to practitioners. Of course, it may also be expressed without using the random effects, in a way similar to the GM model, that is

$$\mu_i \sim \mathrm{LN}(x_i'\beta, \sigma^2), \tag{5.31}$$

where LN stands for the log-normal distribution [$\xi \sim \mathrm{LN}(\mu, \sigma^2)$ iff $\log(\xi) \sim N(\mu, \sigma^2)$]. On the other hand, the BP under the LN model does not have an analytic expression. In fact, if we use the expression $v_i = \sigma\xi_i$, where $\xi_i \sim N(0, 1)$, and let $\psi = (\beta', \sigma)'$, then it is easy to derive that, in this case,

$$g_i(\psi, y_i) = e^{x_i'\beta}\,\frac{\int \exp\{(y_i + 1)\sigma u - \mu_i(u) - u^2/2\}\,du}{\int \exp\{y_i\sigma u - \mu_i(u) - u^2/2\}\,du}, \tag{5.32}$$

where $\mu_i(u) = \exp(x_i'\beta + \sigma u)$. The one-dimensional integrals in (5.32) can be approximated by numerical integration. Alternatively, if $y_i$ is large, the integrals in (5.32) can be approximated by Laplace approximation [e.g., Jiang (2007), sec. 3.5.1], leading to the following approximation: $g_i(\psi, y_i) = a_i$ if $\sigma = 0$ (this result is exact rather than approximate);

$$g_i(\psi, y_i) \approx a_i\sqrt{\frac{\sigma^2 y_i + 1}{\sigma^2(y_i + 1) + 1}}\,\exp\left[(y_i + 1)\log\left(\frac{y_i + 1}{a_i}\right) - y_i\log\left(\frac{y_i}{a_i}\right) - 1 + \frac{y_i}{2(\sigma^2 y_i + 1)}\log^2\left(\frac{y_i}{a_i}\right) - \frac{y_i + 1}{2\{\sigma^2(y_i + 1) + 1\}}\log^2\left(\frac{y_i + 1}{a_i}\right)\right] \tag{5.33}$$

if $\sigma > 0$, where $a_i = e^{x_i'\beta}$ (Exercise 5.7). The Laplace approximation is computationally much more efficient.

As for the GM model, it has a major advantage in that, in this case, the BP has a simple analytic expression as a result of Gamma being the conjugate prior. In fact, if we let $\psi = (\beta', \phi)'$, then we have

$$g_i(\psi, y_i) = \frac{\phi y_i + x_i'\beta}{\phi + 1}. \tag{5.34}$$

The derivation of (5.34) is left as an exercise (Exercise 5.8). The conjugate Gamma prior also has its attractiveness from a Bayesian perspective.

5.5.2 Derivation of OBP

As noted in the previous sections, the difference between OBP and BLUP or EBLUP is not the BP, but how the parameters involved in the BP are estimated. The same is true here. Following the earlier approach, we evaluate the performance of the BP under a broader model, which is simply the first-stage model, but the second-stage model about the distribution of the $\mu_i$ is not assumed. In other words, under the broader model, the $\mu_i$'s are completely unspecified. Write $\mu = (\mu_i)_{1 \le i \le m}$, and $\tilde{\mu} = (\tilde{\mu}_i)_{1 \le i \le m}$, where $\tilde{\mu}_i$ is the right side of (5.30), where $\psi$ is considered as an unknown

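parameter vector. Consider

$$\mathrm{MSPE} = E(|\tilde{\mu} - \mu|^2) = E\left[\sum_{i=1}^{m}\{g_i(\psi, y_i) - \mu_i\}^2\right] = E\left\{\sum_{i=1}^{m} g_i^2(\psi, y_i)\right\} - 2\sum_{i=1}^{m} E\{g_i(\psi, y_i)\mu_i\} + \sum_{i=1}^{m} E(\mu_i^2) = I_1 - 2I_2 + I_3, \tag{5.35}$$

where $E$ denotes expectation under the broader model. Note that $I_3$ does not involve $\psi$, even though it may be completely unknown. Also note that

$$E\{g_i(\psi, y_i)\mu_i\} = E[\mu_i E\{g_i(\psi, y_i) | \mu\}] = \sum_{k=0}^{\infty} g_i(\psi, k)\,E\left(\frac{e^{-\mu_i}\mu_i^{k+1}}{k!}\right) = \sum_{k=0}^{\infty} g_i(\psi, k)(k + 1)\,E\left\{\frac{e^{-\mu_i}\mu_i^{k+1}}{(k + 1)!}\right\} = \sum_{k=0}^{\infty} g_i(\psi, k)(k + 1)\,E\{1_{(y_i = k + 1)}\},$$

where $1_A$ is the indicator of event $A$ ($= 1$ if $A$ occurs, and $0$ otherwise). Thus, if we define $g_i(\psi, -1) = 0$, we have

$$E\{g_i(\psi, y_i)\mu_i\} = E\left\{\sum_{k=0}^{\infty} g_i(\psi, k)(k + 1)1_{(y_i = k + 1)}\right\} = E\{g_i(\psi, y_i - 1)y_i\}.$$

Therefore, combined with (5.35), we have

$$\mathrm{MSPE} = E\left\{\sum_{i=1}^{m} g_i^2(\psi, y_i) - 2\sum_{i=1}^{m} g_i(\psi, y_i - 1)y_i + \cdots\right\}, \tag{5.36}$$

where $\cdots$ does not depend on $\psi$, which leads to the BPE of $\psi$, $\hat{\psi}$, by minimizing the expression inside the expectation on the right side of (5.36) without $\cdots$, that is,

$$Q(\psi, y) = \sum_{i=1}^{m} g_i^2(\psi, y_i) - 2\sum_{i=1}^{m} g_i(\psi, y_i - 1)y_i. \tag{5.37}$$

Once the BPE of $\psi$ is obtained, the OBP of the small area mean count, $\mu_i$, is obtained by (5.30) with $\psi$ replaced by $\hat{\psi}$, that is,

$$\hat{\mu}_i = g_i(\hat{\psi}, y_i), \quad 1 \le i \le m. \tag{5.38}$$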
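The key step in the derivation above is the identity $E\{g_i(\psi, y_i)\mu_i\} = E\{g_i(\psi, y_i - 1)y_i\}$, which holds for any fixed $\mu_i$ under the Poisson first-stage model. The sketch below checks it numerically for one fixed mean by summing over the Poisson probabilities. The helper names (`poisson_pmf`, `identity_sides`), the truncation point `kmax`, and the choice of `g` (the GM-model form (5.34) with made-up parameter values) are assumptions for illustration only.

```python
import math


def poisson_pmf(k, mu):
    """Poisson(mu) probability of the count k, computed on the log scale."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def identity_sides(g, mu, kmax=200):
    """Return (E{g(y) mu}, E{g(y - 1) y}) for y ~ Poisson(mu).

    The infinite sums are truncated at kmax (an assumption; the tail is
    negligible for moderate mu). The convention g(-1) = 0 is applied.
    """
    lhs = sum(g(k) * mu * poisson_pmf(k, mu) for k in range(kmax))
    rhs = sum((g(k - 1) if k >= 1 else 0.0) * k * poisson_pmf(k, mu)
              for k in range(kmax))
    return lhs, rhs


# Hypothetical predictor of the form (5.34) with phi = 0.5 and x'beta = 2.
phi, xb = 0.5, 2.0
lhs, rhs = identity_sides(lambda k: (phi * k + xb) / (phi + 1.0), mu=3.7)
print(lhs, rhs)
```

The two printed numbers should agree up to truncation and rounding error, which is what allows the unknown $\mu_i$ to be eliminated from the cross term in (5.35).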

It is easy to show (Exercise 5.9) that, under the GM model, the BPE has a closed-form expression:

$$\hat{\beta} = \left(\sum_{i=1}^{m} x_i x_i'\right)^{-1}\sum_{i=1}^{m} x_i y_i = (X'X)^{-1}X'y, \tag{5.39}$$

$$\hat{\phi} = \frac{\sum_{i=1}^{m}(y_i - x_i'\hat{\beta})^2}{\sum_{i=1}^{m} y_i} - 1 = \frac{\mathrm{RSS}}{y_{\cdot}} - 1. \tag{5.40}$$

Here (5.39) is the well-known least squares (LS) estimator, where $X = (x_i')_{1 \le i \le m}$ and $y = (y_i)_{1 \le i \le m}$, while RSS is the residual sum of squares, and $y_{\cdot} = \sum_{i=1}^{m} y_i$. Under the LN model, the BPE does not have an analytic expression, so a numerical procedure is needed to compute the BPE.

It is fairly straightforward to extend the above derivation to more complicated area-level models [see Chen (2012)]. For example, in Subsection 5.7.3, we consider a case where the interest is in demographic groups within the areas, but the random effect is at the area-level. Some further extensions are considered in the next subsection.

5.5.3 Extensions

A further extension would be to consider the one-parameter exponential family [e.g., McCullagh and Nelder (1989), p. 28] with the pdf

$$\mu_i \sim \exp\left\{\frac{u\theta_i - b(\theta_i)}{a_i(\phi)} + h_i(u, \phi)\right\}, \tag{5.41}$$

where $u$ is the variable for the pdf, $\theta_i$ is the natural parameter for the exponential family, $\phi$ is an additional dispersion parameter, and $b(\cdot)$, $a_i(\cdot)$, and $h_i(\cdot, \cdot)$ are known functions. Furthermore, the natural parameter, $\theta_i$, is associated with the linear predictor, $\eta_i = x_i'\beta$, through a link function. It can be shown that, under (5.41), the model-based conditional expectation, (5.30), is given by

$$E_{M,\psi}(\mu_i | y) = a_i(\phi)\frac{\partial}{\partial\theta_i}\log d_i(\theta_i, y_i, \phi), \tag{5.42}$$

where $\psi = (\beta', \phi)'$, and

$$d_i(\theta_i, y_i, \phi) = \int \exp\left\{\frac{\mu_i\theta_i}{a_i(\phi)} + c_i(\mu_i, y_i, \phi)\right\}d\mu_i \tag{5.43}$$

with $c_i(\mu_i, y_i, \phi) = y_i\log\mu_i - \mu_i + h_i(\mu_i, \phi)$. However, the right side of (5.42) does not necessarily have an analytic expression; it reduces to (5.34), of course, in the Gamma case.
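Returning to the closed-form BPE (5.39) and (5.40) under the GM model, the computation amounts to one least squares fit followed by one ratio. The following sketch, assuming NumPy is available, illustrates this; the function names `gamma_bpe` and `gamma_obp` and the toy data are hypothetical. The text does not say what to do if (5.40) yields a nonpositive value, so the sketch returns it unchanged.

```python
import numpy as np


def gamma_bpe(X, y):
    """BPE of (beta, phi) under the GM model, following (5.39) and (5.40)."""
    # (5.39): ordinary least squares estimator of beta
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    # (5.40): residual sum of squares over the total count, minus one
    rss = float(np.sum((y - X @ beta_hat) ** 2))
    phi_hat = rss / float(np.sum(y)) - 1.0
    return beta_hat, phi_hat


def gamma_obp(X, y, beta, phi):
    """OBP of the area means: the BP (5.34) evaluated at the BPE."""
    return (phi * y + X @ beta) / (phi + 1.0)


# Toy data (made up): intercept plus one covariate, four areas.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 6.0, 1.0, 9.0])
beta_hat, phi_hat = gamma_bpe(X, y)
print(beta_hat, phi_hat)
print(gamma_obp(X, y, beta_hat, phi_hat))
```

Each OBP value is a weighted average of the observed count and the fitted value $x_i'\hat{\beta}$, with the weight on the observed count equal to $\hat{\phi}/(\hat{\phi} + 1)$.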

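One may also consider adding a (sampling) weight in the first-stage model by assuming $y_i | \mu_i \sim \mathrm{Poisson}(w_i\mu_i)$, where $w_i$ is a known weight, while maintaining the second-stage model for $\mu_i$. Again, denote the model-based BP of $\mu_i$ by (5.30). A very similar derivation to that in Subsection 5.5.2 leads to

$$E\{g_i(\psi, y_i)\mu_i\} = E\left\{\frac{g_i(\psi, y_i - 1)}{w_i}\,y_i\right\}.$$

Thus, similar to (5.36), we have

$$\mathrm{MSPE} = E\left\{\sum_{i=1}^{m} g_i^2(\psi, y_i) - 2\sum_{i=1}^{m}\frac{g_i(\psi, y_i - 1)}{w_i}\,y_i + \cdots\right\}, \tag{5.44}$$

where $\cdots$ does not depend on $\psi$. The BPE of $\psi$ is then obtained by minimizing the expression inside the expectation on the right side of (5.44) without $\cdots$, that is,

$$Q(\psi, y, w) = \sum_{i=1}^{m} g_i^2(\psi, y_i) - 2\sum_{i=1}^{m}\frac{g_i(\psi, y_i - 1)}{w_i}\,y_i, \tag{5.45}$$

where $w = (w_i)_{1 \le i \le m}$. It is clear that, when $w_i = 1$, $1 \le i \le m$, $Q(\psi, y, w)$ reduces to $Q(\psi, y)$ of (5.37), leading to the BPE without the weights.

It is easy to show that, under the GM model, the BP has a closed-form expression:

$$g_i(\psi, y_i) = \frac{\phi y_i + x_i'\beta}{1 + \phi w_i}. \tag{5.46}$$

The BPE of $\psi$ does not have a closed-form expression, in general. However, it can be computed via the following simple numerical algorithm: First solve

$$\sum_{i=1}^{m}\frac{y_i/w_i}{(1 + \phi w_i)^2} - \sum_{i=1}^{m} w_i\,\frac{(y_i/w_i - x_i'\tilde{\beta})^2}{(1 + \phi w_i)^3} = 0, \tag{5.47}$$

where

$$\tilde{\beta} = \left\{\sum_{i=1}^{m}\frac{x_i x_i'}{(1 + \phi w_i)^2}\right\}^{-1}\sum_{i=1}^{m}\frac{x_i(y_i/w_i)}{(1 + \phi w_i)^2}. \tag{5.48}$$

Let the solution to (5.47) be $\hat{\phi}$, which is the BPE of $\phi$. The BPE of $\beta$ is (5.48) with $\phi$ replaced by $\hat{\phi}$. It is easy to verify that, when $w_i = 1$, $1 \le i \le m$, this leads to (5.39) and (5.40). On the other hand, under the LN model, neither the BP nor the BPE have closed-form expressions, and one does not have a simple algorithm like (5.47) and (5.48) for the computation.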
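The two-step algorithm based on (5.47) and (5.48) can be sketched as follows, assuming NumPy is available. The text does not prescribe a root-finding method; plain bisection is used here as one simple choice, and the function names, the default bracket `[lo, hi]`, and the iteration count are all assumptions of this sketch. The bracket must contain a sign change of the left side of (5.47), which is checked before iterating.

```python
import numpy as np


def beta_tilde(phi, X, y, w):
    """Weighted least squares form (5.48) for a given phi."""
    lam2 = 1.0 / (1.0 + phi * w) ** 2          # weights 1 / (1 + phi w_i)^2
    lhs = X.T @ (lam2[:, None] * X)            # sum of x_i x_i' lam2_i
    rhs = X.T @ (lam2 * y / w)                 # sum of x_i (y_i / w_i) lam2_i
    return np.linalg.solve(lhs, rhs)


def phi_equation(phi, X, y, w):
    """Left side of (5.47), with beta profiled out through (5.48)."""
    lam = 1.0 / (1.0 + phi * w)
    resid = y / w - X @ beta_tilde(phi, X, y, w)
    return float(np.sum((y / w) * lam**2) - np.sum(w * resid**2 * lam**3))


def weighted_gamma_bpe(X, y, w, lo=1e-8, hi=1e3, iters=200):
    """Solve (5.47) for phi by bisection on [lo, hi], then plug into (5.48)."""
    f_lo, f_hi = phi_equation(lo, X, y, w), phi_equation(hi, X, y, w)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change of (5.47) on the chosen bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = phi_equation(mid, X, y, w)
        if f_lo * f_mid <= 0:
            hi = mid                           # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid              # root lies in [mid, hi]
    phi_hat = 0.5 * (lo + hi)
    return beta_tilde(phi_hat, X, y, w), phi_hat
```

As a sanity check, with all weights equal to one the routine should reproduce the closed-form values from (5.39) and (5.40) whenever those give a $\hat{\phi}$ inside the bracket.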

5.6 Asymptotic property of BPE

There have been studies examining the asymptotic behaviors of estimators under model misspecifications. For example, in the context of maximum likelihood estimation with i.i.d. observations, White (1982) showed that, when the underlying model is misspecified (therefore the true parameter vector does not exist), the MLE under the assumed model still converges almost surely to “something”, and the “something” is the (unique) maximizer of the expected “log-likelihood” (considered as a function of the parameters), under some regularity conditions. Here “log-likelihood” refers to the logarithm of the misspecified likelihood function, which White called quasi-likelihood function.

A similar approach can be used to study the asymptotic behavior of the BPE under a misspecified model. We consider a general setting which includes the cases considered in the previous sections as special cases. Suppose that the BP of $\theta$, a vector of mixed effects, under the assumed model, $\theta_{\mathrm{BP}}$, depends on $\psi$, a vector of parameters that may include $\beta$ as well as some variance components. Also, suppose that $\mathrm{MSPE}(\theta_{\mathrm{BP}})$ can be expressed as $E\{Q(y, \psi)\} \equiv M(\psi)$, where $E$ denotes expectation under the true distribution of $y$, either with respect to a true model (e.g., the case of Fay-Herriot model), or with respect to the sampling design (e.g., the case of nested-error regression model). Throughout this section, $E$, and later $P$, are understood in the same sense. We call $M(\psi)$ the MSPE function. The BPE of $\psi$, denoted by $\tilde{\psi}$, is the minimizer of $Q(y, \psi)$ over $\psi \in \Psi$, the parameter space for $\psi$. The OBP of $\theta$, $\tilde{\theta}$, is $\theta_{\mathrm{BP}}$ with $\psi$ replaced by $\tilde{\psi}$. Furthermore, we assume

$$Q(y, \psi) = \sum_{i=1}^{m} Q_i(y_i, \psi), \tag{5.49}$$

where $m$ is the number of small areas; $y_i$ is the subvector of $y$ corresponding to data collected from the $i$th small area such that $y_1, \dots, y_m$ are independent; and $Q_i(y_i, \psi)$ is three-times continuously differentiable with respect to $\psi$. In addition, the following regularity conditions are assumed, where $\lambda_{\min}$ denotes the smallest eigenvalue.

A1. There exists a unique $\psi_* \in \Psi^o$, the interior of $\Psi$, such that $M(\psi_*) = \inf_{\psi \in \Psi} M(\psi)$. Note that $\psi_*$ may depend on the (joint) distribution of $y_1, \dots, y_m$ as well as other quantities that may be involved in the definition of $M(\psi)$ (such as the $x_i$'s in the Fay-Herriot model).

A2. One can differentiate $E\{Q(y, \psi)\}$ with respect to $\psi$ under the expectation, that is, $(\partial/\partial\psi)E\{Q(y, \psi)\} = E\{(\partial/\partial\psi)Q(y, \psi)\}$.

A3. As $m \to \infty$, we have
(i) $\liminf \lambda_{\min}[m^{-1}\sum_{i=1}^{m}(\partial^2/\partial\psi\,\partial\psi')E\{Q_i(y_i, \psi_*)\}] > 0$;
(ii) $\limsup m^{-1}\sum_{i=1}^{m} E\{(\partial/\partial\psi_r)Q_i(y_i, \psi_*)\}^2 < \infty$;
(iii) $m^{-2}\sum_{i=1}^{m} E\{(\partial^2/\partial\psi_r\,\partial\psi_s)Q_i(y_i, \psi_*)\}^2 \to 0$; and
(iv) $m^{-3/2}\sum_{i=1}^{m} E\{\sup_{\psi \in \bar{S}_\rho(\psi_*)}|\partial^3 Q_i(y_i, \psi)/\partial\psi_r\,\partial\psi_s\,\partial\psi_t|\} \to 0$
for some $\rho > 0$ and any $1 \le r, s, t \le q = \dim(\psi)$, where $\bar{S}_\rho(\psi_*) = \{\psi \in \Psi : |\psi - \psi_*| \le \rho\}$.

Condition A1 requires identifiability of the parameter vector, $\psi_*$, which assumes the role of the “true parameter vector”. Condition A2 is a regularity condition that is often required in asymptotic theory [e.g., Lehmann and Casella (1998), p. 441]. Condition (i) of A3 corresponds to an “information assumption”, which is an extension of the requirement (in the i.i.d. case) that the sample size goes to infinity. The rest of the conditions in A3 are essentially moment conditions. These conditions are fairly mild in that they are satisfied in typical situations. We use an example to illustrate how to verify these conditions.

Example 5.1 (continued). Consider the Fay-Herriot model with $A$ known, for simplicity. In this case, the minimizer of $M(\psi)$ has an explicit expression (Exercise 5.10),

$$\psi_* = (X'\Gamma^2 X)^{-1}X'\Gamma^2 E(y). \tag{5.50}$$

Clearly, $\psi_*$ lies in the interior of the parameter space, $R^p$, where $p$ is the dimension of $\beta$. Thus, condition A1 is satisfied. Note that, in this case, $Q(y, \psi)$ is given by the expression inside the expectation in (5.10). Thus, it is easy to verify that condition A2 is satisfied. Similarly,

$$Q_i(y_i, \psi) = E(B_i y_i - \theta_i)^2 + (1 - B_i)^2(x_i'\beta)^2 - 2(1 - B_i)^2 x_i'\beta\,y_i,$$

where $B_i = A/(A + D_i)$. It follows that

$$\frac{1}{m}\sum_{i=1}^{m}\frac{\partial^2}{\partial\psi\,\partial\psi'}E\{Q_i(y_i, \psi_*)\} = \frac{2}{m}\sum_{i=1}^{m}(1 - B_i)^2 x_i x_i'.$$

Suppose that the $D_i$'s are bounded away from zero. Then, it is easy to show that condition (i) of A3 is satisfied if

$$\liminf \lambda_{\min}(m^{-1}X'X) > 0.$$

The latter is a regularity condition often required in studying large sample properties of the least squares estimator [e.g., sec. 6.7 of Jiang (2010) and the references therein]. For example, if $X_i = (1, x_i)$, where $x_i$ is a scalar, then the latter condition is equivalent to

$$\liminf \frac{1}{m}\sum_{i=1}^{m}(x_i - \bar{x})^2 > 0,$$

where $\bar{x} = m^{-1}\sum_{i=1}^{m} x_i$. The rest of the conditions in A3 are easy to verify; in particular, we have $\partial^3 Q_i/\partial\beta_j\,\partial\beta_k\,\partial\beta_l = 0$ for all $j, k, l$.

The following theorem, proved in Jiang et al. (2011a), states that, under a possibly misspecified model, the BPE is “$\sqrt{m}$-consistent” with respect to $\psi_*$, the minimizer of the MSPE function.

Theorem 5.2. Under the assumptions A1-A3, there exists with probability tending to one, as $m \to \infty$, a local minimizer, $\tilde{\psi}$, of $Q(y, \psi)$ in a neighborhood of $\psi_*$, such that $\sqrt{m}(\tilde{\psi} - \psi_*) = O_P(1)$. Thus, in particular, we have $\tilde{\psi} - \psi_* \xrightarrow{P} 0$ as $m \to \infty$.

Note that we do not write $\tilde{\psi} \xrightarrow{P} \psi_*$, because $\psi_*$, although nonrandom, may depend on $m$ (see assumption A1).

5.7 Examples with simulations

In this section, we present three examples, each supported by a simulation study, to demonstrate the properties of OBP and its comparison to the standard methods in SAE. The first is a simple example used to demonstrate the theoretical properties of OBP established in Sections 5.2 and 5.3. The second is a more practical example used to illustrate the design-based predictive performance of OBP. The last example is used to illustrate the performance of OBP for estimating small area counts.

5.7.1 A simple example

The example is the same as the simple Fay-Herriot model considered in Section 5.3 except that $A$ is unknown. Note that the assumed model can be written as $y_i = \beta + v_i + e_i$, $i = 1, \dots, m$, where the $v_i$'s and $e_i$'s are independent such that $v_i \sim N(0, A)$ with $A$ unknown, and $e_i \sim N(0, D_i)$ with $D_i$ given in Section 5.3 and further specified below.

We consider four different estimators of $A$ that have been traditionally used. These are the ML, REML, Fay-Herriot [F-H; Fay and Herriot (1979); also see Datta et al. (2005)] and Prasad-Rao [P-R; Prasad and Rao (1990)] estimators. Given one of the estimators of $A$, denoted by $\hat{A}$, the EBLUP is obtained by the following steps: (i) computing $\hat{\beta}$ of (5.14) with $x_i = 1$ and $A$ replaced by $\hat{A}$; and (ii) computing the EBLUP $\hat{\theta} = (\hat{\theta}_i)_{1 \le i \le m}$ of

$\theta = (\theta_i)_{1 \le i \le m}$, where

$$\hat{\theta}_i = \frac{\hat{A}}{\hat{A} + a}\,y_i + \frac{a}{\hat{A} + a}\,\hat{\beta}, \quad 1 \le i \le n; \qquad \hat{\theta}_i = \frac{\hat{A}}{\hat{A} + b}\,y_i + \frac{b}{\hat{A} + b}\,\hat{\beta}, \quad n + 1 \le i \le m \tag{5.51}$$

[note that the $\hat{A}$ in (5.51) is the same $\hat{A}$ used to compute $\hat{\beta}$]. Depending on whether $\hat{A}$ is the ML, REML, F-H, or P-R estimator, the corresponding EBLUPs are denoted by EBLUP-1, EBLUP-2, EBLUP-3, and EBLUP-4, respectively. The OBP of $\theta$, $\tilde{\theta} = (\tilde{\theta}_i)_{1 \le i \le m}$, is given by the right sides of (5.51) with $\hat{A}$ and $\hat{\beta}$ replaced by $\tilde{A}$ and $\tilde{\beta}$, respectively, where $\tilde{\psi} = (\tilde{\beta}, \tilde{A})'$ is the BPE of $\psi = (\beta, A)'$.

The empirical MSPEs are reported in Table 5.1, taken from Jiang et al. (2011a). The % in the parentheses is the percentage increase in MSPE by the corresponding EBLUP over the OBP. It is seen that all of the EBLUPs perform very similarly, while the OBP is a distance away from the EBLUPs. The MSPEs of the EBLUPs are between 11% and 44% higher than that of the OBP. The percentage increase in MSPE by the EBLUPs is more substantial for $d = 5$ than for $d = 1$. This makes sense because $d = 5$ features a more serious model misspecification than $d = 1$ does, and the OBP shines when the underlying model is misspecified. A more explicit explanation of this pattern may be seen from the comparison of the formulae for the exact MSPEs, that is, (5.17) and (5.18) (with $c = 0$), although the latter are derived for the case that $A$ is known. The formulae show that the difference in MSPE between the OBP and any of the EBLUPs gets larger, even proportionally, as $d$ increases.

Table 5.1 Empirical MSPE (% increase over OBP)

  m    d   EBLUP-1        EBLUP-2        EBLUP-3        EBLUP-4        OBP
  50   1   28.76 (28%)    27.94 (25%)    25.00 (11%)    25.87 (15%)    22.43
  100  1   51.05 (26%)    50.20 (24%)    47.74 (18%)    49.02 (21%)    40.42
  200  1   94.22 (26%)    93.95 (26%)    92.52 (24%)    93.87 (25%)    74.86
  50   5   95.83 (42%)    95.05 (41%)    93.55 (39%)    93.12 (38%)    67.41
  100  5   189.93 (44%)   189.22 (43%)   186.51 (41%)   185.61 (41%)   132.01
  200  5   372.59 (44%)   371.92 (44%)   366.96 (42%)   365.21 (41%)   258.60

We next compare the OBP and EBLUPs in terms of area-specific MSPEs. Although the OBP is defined by minimizing the overall (observed) MSPE, there is no guarantee that its area-specific MSPEs are minimal. On the other hand, area-specific MSPEs are often of main interest in SAE. Therefore, a comparison of the area-specific MSPEs of the OBP with those

of the EBLUP is important, especially from a practical point of view. Such a comparison may also provide further details for the overall MSPEs reported in Table 5.1. The empirical area-specific MSPE is evaluated by

$$\mathrm{MSPE}_i^* = \frac{1}{K}\sum_{k=1}^{K}\{\check{\theta}_i^{(k)} - \theta_i^{(k)}\}^2$$

with $K = 500$, where $\check{\theta}_i^{(k)}$ and $\theta_i^{(k)}$ are the predictor and true small area mean, respectively, for the $i$th small area in the $k$th simulation run, $1 \le i \le m$, $1 \le k \le K$. Due to the fairly large number of small areas involved ($m = 50$, 100 or 200 in our simulations), we summarize the results using boxplots and histograms, as shown in Figures 5.1 and 5.2.

The figures reveal some stories untold by the overall MSPEs. First, the boxplots show a significant difference in the distributions of the empirical MSPEs between the OBP and EBLUPs. Not only does the OBP have smaller median empirical MSPE in each case; what is more apparent is that the range, or variation, of the empirical MSPEs is overwhelmingly in favor of the OBP. Second, the histograms exhibit quite different shapes between the OBP and EBLUPs. A closer look at the numbers shows that the empirical MSPEs of the EBLUPs are slightly to moderately smaller than those of the OBP for half of the small areas; but for the other half, the empirical MSPEs of the EBLUPs are much larger than those of the OBP. The pattern can also be seen from the histograms. Recall the assumed model has a common mean for all of the small areas, while the true model has one mean for half of the areas and another mean for the other half. Apparently, what the EBLUP does is to “side with” one mean while “abandoning” the other. The OBP, on the other hand, uses a rather different strategy, by “staying in the middle” or “balancing” between the two means. This explains the bimodal histograms for the EBLUPs, compared to the fairly normal-looking histograms for the OBP (with much narrower spreads). Overall, the simulation results show a much more robust performance of the OBP in terms of the area-specific MSPE as compared to the EBLUPs.

5.7.2 Cases when the assumed model is correct, or partially correct

The situation considered in Subsection 5.7.1 might be a little extreme. In practice, the assumed model may not be completely wrong, or may be close to being correct. In this subsection we first consider a case where the assumed model is “partially correct”. It is a case of finite-population setting

considered in Subsection 5.4.2.

Fig. 5.1 Boxplots of $\mathrm{MSPE}_i^*$, $i = 1, \dots, m$. Upper Left: $m = 50$, $d = 1$; Upper Right: $m = 50$, $d = 5$; Middle Left: $m = 100$, $d = 1$; Middle Right: $m = 100$, $d = 5$; Lower Left: $m = 200$, $d = 1$; Lower Right: $m = 200$, $d = 5$. Within each plot from left to right: EBLUP-1, EBLUP-2, EBLUP-3, EBLUP-4, OBP.

A single covariate, $x_{ij}$, is thought to be linearly associated with the response, $y_{ij}$, through the NER model

$$y_{ij} = \beta x_{ij} + v_i + e_{ij}, \quad i = 1, \dots, m, \; j = 1, \dots, 5 \tag{5.52}$$

(so we have $n_i = 5$, $1 \le i \le m$ in this case), where $\beta$ is an unknown coefficient, and $v_i$, $e_{ij}$ are the same as in (5.23). Thus, in particular, there is a belief that the mean response should be zero when the value of the

covariate is zero. The truth is that the slope in (5.52) is nonzero (so the assumed model is correct in this regard); but, there is a nonzero intercept, although its value is much smaller compared to, say, those considered in Subsection 5.7.1 (so the assumed model is wrong, but not “terribly wrong”). More specifically, the true underlying model is

$$Y_{ik} = b_0 + b_1 X_{ik} + v_i + e_{ik}, \quad i = 1, \dots, m, \; k = 1, \dots, 1000, \tag{5.53}$$

Fig. 5.2 Histograms of $\mathrm{MSPE}_i^*$, $i = 1, \dots, m$. The rows, from top to bottom, correspond to $m = 50$, $d = 1$; $m = 50$, $d = 5$; $m = 100$, $d = 1$; $m = 100$, $d = 5$; $m = 200$, $d = 1$; and $m = 200$, $d = 5$, respectively. Within each row from left to right: EBLUP-1, EBLUP-2, EBLUP-3, EBLUP-4, OBP. Within each row the x-axes have the same range.

as opposed to (5.52), where the $X$ subpopulation is generated from the normal distribution with mean equal to 1 and standard deviation equal to $\sqrt{0.1} \approx 0.32$; $b_0 = 0.2$, $b_1 = 0.1$; the $v_i$ are generated independently from the normal distribution with mean 0 and standard deviation 0.1; the $e_{ik}$ are generated from the heteroscedastic normal distribution so that $e_{ik} \sim N(0, \sigma_i^2)$, where $\sigma_i^2$ are generated independently from the Uniform[0.05, 0.15] distribution (so that the range for $\sigma_i$ is approximately from 0.22 to 0.39); and the $v_i$'s and $e_{ik}$'s are generated independently.

In addition to the overall MSPE, we also report the contribution to the MSPE due to “bias” and “variance”. Let $d_i = \hat{\theta}_i - \theta_i$, and $d_i^{(k)}$ be $d_i$ based on the $k$th simulated data set, $1 \le k \le K$. We define the empirical bias and variance for the $i$th small area as

$$\bar{d}_i = \frac{1}{K}\sum_{k=1}^{K} d_i^{(k)} \quad \text{and} \quad v_i^2 = \frac{1}{K - 1}\sum_{k=1}^{K}\{d_i^{(k)} - \bar{d}_i\}^2,$$

respectively. Let $\mathrm{MSPE}_i$ denote the empirical MSPE for the $i$th small area. It is easy to show that the overall empirical MSPE is

$$\sum_{i=1}^{m}\mathrm{MSPE}_i = \frac{K - 1}{K}\sum_{i=1}^{m} v_i^2 + \sum_{i=1}^{m}(\bar{d}_i)^2.$$

Thus, the bias and variance contribution to the overall MSPE are defined as $\sum_{i=1}^{m}(\bar{d}_i)^2$ and $\sum_{i=1}^{m} v_i^2$, respectively. Results based on $K = 1000$ simulation runs are presented in Table 5.2. As we can see, for the smaller $m$, $m = 50$, OBP performs (slightly) worse than the EBLUP, but for the larger $m$, $m = 100$ and $m = 400$, OBP performs (slightly) better, and its advantage increases with $m$. As for the bias, variance contribution, OBP seems to have smaller bias, and smaller variance for larger $m$ ($m = 100, 400$).

Table 5.2 Overall Empirical MSPE (bias, variance contribution): Assumed model is partially correct; % Increase is MSPE of EBLUP over MSPE of OBP (negative number means decrease).

  m     OBP                    EBLUP                  % Increase
  50    0.421 (0.224, 0.197)   0.405 (0.238, 0.167)   -4.0
  100   0.733 (0.448, 0.285)   0.748 (0.457, 0.291)   2.1
  400   2.745 (1.847, 0.899)   2.848 (1.878, 0.971)   3.8

Next, we consider a case where the assumed model is actually correct. Namely, the true underlying model is (5.53) with $b_0 = 0$; the errors $e_{ik}$ are homoscedastic with variance equal to 0.1, and everything else is the same as the case considered above. Results based on $K = 1000$ simulation runs are presented in Table 5.3. This time, we see that the EBLUP performs slightly better than OBP under different $m$, but the difference is diminishing as the

sample size increases. As for the bias, variance contribution, EBLUP seems to have smaller variance, and smaller bias for larger $m$ ($m = 100, 400$), but its advantages in both bias and variance shrink as $m$ increases. The results reported in Tables 5.2 and 5.3 are copied from Jiang et al. (2015b).

Table 5.3 Overall Empirical MSPE (bias, variance contribution): Assumed model is correct; % Increase is MSPE of EBLUP over MSPE of OBP (negative number indicates decrease).

  m     OBP                    EBLUP                  % Increase
  50    0.335 (0.204, 0.131)   0.330 (0.205, 0.125)   -1.4
  100   0.749 (0.457, 0.292)   0.746 (0.456, 0.290)   -0.4
  400   2.796 (1.800, 0.997)   2.794 (1.799, 0.996)   -0.1

In summary, the simulation results suggest that, when the assumed model is slightly misspecified, OBP may not outperform EBLUP when $m$, the number of small areas, is relatively small; however, OBP is expected to outperform EBLUP when $m$ is relatively large, and the advantage of OBP over EBLUP increases with $m$ (recall the definition of the overall MSPE). On the other hand, when the assumed model is correct, EBLUP is expected to perform better than OBP, although the difference may be ignorable; and the advantage of EBLUP over OBP is disappearing as $m$ increases. These findings, along with those in Subsection 5.7.2, are very much in line with those of Jiang et al. (2011a) for the Fay-Herriot model.

5.7.3 Comparison of OBP and EBP under an LN model

Finally, we consider a simulated example regarding OBP under an LN model for count data, discussed in Section 5.5. As noted (see the end of Subsection 5.5.2), the OBP method can be extended to more complicated area-level models. Here we consider one of such cases, with $y_{ij} | \mu_{ij} \sim \mathrm{Poisson}(\mu_{ij})$ and $\mu_{ij} = \exp(\eta_{ij} + v_i)$, $i = 1, \dots, m$, $j = 1, \dots, s$, where the $y_{ij}$'s are conditionally independent, and the $v_i$'s are independent $N(0, 1)$. Here the index $i$ corresponds to the area (e.g., county), and the index $j$ to a classification of demographic groups (e.g., age groups) within the area. The random effect, $v_i$, is at the area-level, but the interest is the area/demographic mean counts, $\mu_{ij}$. The assumed model has $\eta_{ij} = x_{ij}\beta$, where the $x_{ij}$'s are generated independently from the Uniform(0, 2) distribution, and then fixed throughout the simulation study.

We consider three true underlying distributions: (i) $\eta_{ij} = x_{ij} + 1$; (ii) $\eta_{ij} = 2x_{ij}^{2/3} + 1$; and (iii) $\eta_{ij} = x_{ij} + z_{ij}$, where the $z_{ij}$'s are generated

independently from the Uniform(0, 2) distribution, and fixed. It follows that, under each of the three cases, the assumed model is misspecified. Three different sample sizes are considered: $m = 8$, $m = 32$, and $m = 128$, with $s = 6$ in all of the cases. A similar measure of (overall) simulated MSPE is considered:

$$\mathrm{MSPE} = \frac{1}{K}\sum_{k=1}^{K}\sum_{i=1}^{m}\sum_{j=1}^{s}\{\hat{\mu}_{ij}^{(k)} - \mu_{ij}^{(k)}\}^2,$$

where $K = 500$, $\mu_{ij}^{(k)}$ and $\hat{\mu}_{ij}^{(k)}$ are the true small area mean count and the corresponding OBP, or EBP, for the $k$th simulation run. We compare the OBP with the EBP under the cases where the assumed model holds, and where the true underlying model is (i), (ii), or (iii). The results, rounded to the nearest integers and copied from Chen et al. (2015), are presented in Table 5.4.

Table 5.4 OBP vs EBP under an LN model. Reported are simulated MSPEs (based on K = 500 simulations) under different true models and sample sizes.

        Assumed Model    Model (i)      Model (ii)       Model (iii)
  m     OBP     EBP      OBP    EBP     OBP     EBP      OBP      EBP
  8     64      56       241    258     527     649      5500     5739
  32    218     204      811    841     2481    3277     35640    36445
  128   856     839      3421   4090    11049   14979    178534   183780

It is seen that, when the underlying model is correctly specified, the EBP performs better than the OBP, but the (proportional) difference is diminishing as $m$ increases. On the other hand, when the underlying model is misspecified, the OBP performs better than the EBP, and the (proportional) difference is not going away as $m$ increases. These patterns, as illustrated by Table 5.5, are consistent with those found in Jiang et al. (2011a) and Jiang et al. (2015b) (see the previous subsection).

Table 5.5 Ratio of Empirical MSPE (OBP over EBP) from Table 5.4

  m     Assumed Model   Model (i)   Model (ii)   Model (iii)
  8     1.14            0.93        0.81         0.96
  32    1.07            0.96        0.76         0.98
  128   1.02            0.84        0.74         0.97
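The ratios in Table 5.5 follow directly from the entries of Table 5.4. As a small check, the sketch below recomputes them; the dictionary layout and key names are assumptions of this sketch, and the numbers are the simulated MSPEs as read from Table 5.4.

```python
# Simulated MSPEs from Table 5.4, stored as (OBP, EBP) pairs per true model.
table_5_4 = {
    8:   {"assumed": (64, 56),   "i": (241, 258),
          "ii": (527, 649),      "iii": (5500, 5739)},
    32:  {"assumed": (218, 204), "i": (811, 841),
          "ii": (2481, 3277),    "iii": (35640, 36445)},
    128: {"assumed": (856, 839), "i": (3421, 4090),
          "ii": (11049, 14979),  "iii": (178534, 183780)},
}

# Ratio of OBP to EBP, rounded to two decimals as in Table 5.5.
ratios = {
    m: {model: round(obp / ebp, 2) for model, (obp, ebp) in row.items()}
    for m, row in table_5_4.items()
}

for m, row in ratios.items():
    print(m, row)
```

A ratio above 1 favors the EBP and a ratio below 1 favors the OBP, which makes the pattern described in the text easy to read off row by row.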

5.8 Estimation of area-specific MSPE

An important and challenging problem in SAE is to obtain measures of uncertainty for small-area estimators, or predictors. This is typically done, under the frequentist framework, by estimating the area-specific MSPE for the small-area predictors. A desirable property for the MSPE estimator is that the estimator is nearly unbiased or, more precisely, second-order unbiased. This means that the bias of the MSPE estimator is of the order $o(m^{-1})$, where $m$ is the number of small areas.

The Prasad-Rao method [Prasad and Rao (1990)] is well-known in deriving second-order unbiased MSPE estimator for the EBLUP. It is known that the naive estimator of the MSPE (of the EBLUP), which simply replaces the unknown variance components in the (analytic) expression of the MSPE of BLUP by their estimators, underestimates the MSPE of the EBLUP, and is only first-order unbiased, that is, the bias is of the order $O(m^{-1})$, if the parameter estimators are consistent. Prasad and Rao (1990) used Taylor expansions to obtain a second-order approximation to the MSPE, and then bias-corrected the plug-in estimator based on the approximation, again to the second order, to obtain an estimator of the MSPE whose bias is $o(m^{-1})$. The method has since been used extensively in SAE and several extensions have been given [e.g., Lahiri and Rao (1995), Datta and Lahiri (2000), Jiang and Lahiri (2001), and Datta et al. (2005)].

The Prasad-Rao method is based on the assumption that the underlying model is correct, hence the existence of the true parameters. In this section, however, such an assumption is not made, which makes estimation of the area-specific MSPE much more difficult. Consider, for example, the Fay-Herriot model. It is known (see Section 5.6) that $\tilde{\psi}$, the BPE, is $\sqrt{m}$-consistent to $\psi_*$ that is defined in Section 5.6, but $\psi_*$ is not necessarily the true parameter vector. Even in the much simpler case in which $\psi_*$ is known, hence one can replace $\tilde{\psi}$ by $\psi_*$ in the OBP, it is easy to see that the MSPE of $\tilde{\theta}_i$ is not a known function of $\psi_*$ but depends on the unknown $\mu_i$ and $\mathrm{E}(y_i^2)$. It follows that, under the weak model assumption, the MSPE cannot be consistently estimated.

On the other hand, it is still possible to obtain a nearly unbiased estimator of the MSPE of $\tilde{\theta}_i$. Jiang et al. (2011a) derived a second-order unbiased MSPE estimator for the case of Fay-Herriot model. First note that, instead of making the Taylor expansion at the point of the true parameter vector, as in the Prasad-Rao method, the expansion can be made at $\psi_*$. However, the current weak assumption about the mean functions makes it difficult to carry out the standard arguments of Prasad and Rao (1990), which make use of such assumptions as $\mathrm{E}(y_i) = x_i'\beta$ and $\mathrm{E}\{(y_i - x_i'\beta)^2\} = A + D_i$, $1 \le i \le m$, where $\psi = (\beta', A)'$ is the true parameter vector. Nevertheless, with the asymptotic property of the BPE (Theorem 5.2), and using a different technique, the authors obtained a second-order unbiased estimator of the area-specific MSPE.

First introduce some notation. Let $\tilde{B}_i$ be $B_i$ (defined in Section 5.6) with $A$ replaced by $\tilde{A}$. Define $\tilde{M}_i = \mathrm{diag}\{x_i, (\tilde{A} + D_i)^{-1}\}$, $\tilde{U}_i = (u_i, u_{m+i})'$, where $u_i = y_i - x_i'\tilde{\beta}$ and $u_{m+i} = (y_i - x_i'\tilde{\beta})^2 - (\tilde{A} + D_i)$. Let $\tilde{W}_i = (\tilde{W}_{i,ab})_{a,b=1,2}$ with

$$\tilde{W}_{i,11} = x_i x_i', \qquad \tilde{W}_{i,12} = 2(\tilde{A} + D_i)^{-1}(y_i - x_i'\tilde{\beta})x_i,$$

$$\tilde{W}_{i,21} = \tilde{W}_{i,12}', \qquad \tilde{W}_{i,22} = (\tilde{A} + D_i)^{-2}\{3(y_i - x_i'\tilde{\beta})^2 - (\tilde{A} + D_i)\}.$$

Let $\tilde{f}_i = -2(1 - \tilde{B}_i)^2\tilde{M}_i\tilde{U}_i$ and

$$\tilde{G}_2 = 2\sum_{j=1}^{m}(1 - \tilde{B}_j)^2(\tilde{W}_j - \tilde{\Delta}_j),$$

where $\tilde{\Delta}_j = \mathrm{diag}\{0, \dots, 0, (\tilde{A} + D_j)^{-1}\}$ ($p$ zeros; $p$ is the dimension of $\beta$), and $\tilde{h}_2'$ be the last row of $\tilde{G}_2^{-1}$. Jiang et al. (2011a) showed that the following is a second-order unbiased estimator of the MSPE of $\tilde{\theta}_i$, the OBP of $\theta_i$,

$$\widehat{\mathrm{MSPE}}(\tilde{\theta}_i) = (\tilde{\theta}_i - y_i)^2 + D_i(2\tilde{B}_i - 1) + 2(1 - \tilde{B}_i)^2\tilde{h}_2'\tilde{f}_i + 4D_i(1 - \tilde{B}_i)^3\mathrm{tr}(\tilde{G}_2^{-1}\tilde{W}_i). \tag{5.54}$$

In a simulation study carried out in Jiang et al. (2011a), the authors showed that the MSPE estimator (5.54) performed better in reducing the bias, compared to a bootstrap MSPE estimator (see below for more detail). However, the estimator (5.54) is not guaranteed nonnegative. Note that it does not make sense to use a negative number as an estimate for a nonnegative quantity, such as the MSPE. Furthermore, it is seen that the leading term on the right side of (5.54), $(\tilde{\theta}_i - y_i)^2 + D_i(2\tilde{B}_i - 1)$, is $O_P(1)$, and it depends on the area-specific data, $y_i$. [The remaining terms are of the order $O(m^{-1})$.] Because $y_i$ is an observation from a single small area, it has a relatively large variance. On the other hand, the term $D_i(2\tilde{B}_i - 1)$ can be negative. Thus, as a result of the high variation of $(\tilde{\theta}_i - y_i)^2$, there is a non-vanishing probability (as $m$ increases) that the leading term, hence the estimated MSPE, is negative.
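The sign behavior of the leading term of (5.54) is easy to see numerically. The sketch below is an illustration only: it assumes $B_i = A/(A + D_i)$ (the form consistent with the term $D_i(2B_i - 1)$; the definition itself is in Section 5.6), and the inputs are hypothetical numbers, not estimates from any data set.

```python
def shrinkage_factor(A, D):
    """B = A / (A + D). Assumed form of B_i; see Section 5.6 for the definition."""
    return A / (A + D)


def leading_term(theta_tilde, y, A, D):
    """Leading O_P(1) term of (5.54): (theta_tilde - y)^2 + D * (2B - 1)."""
    B = shrinkage_factor(A, D)
    return (theta_tilde - y) ** 2 + D * (2.0 * B - 1.0)


# Hypothetical inputs: A < D, so B < 1/2 and D * (2B - 1) is negative.
A, D = 0.5, 1.0

# Predictor close to the direct estimate: the squared difference is too
# small to offset the negative second term.
close = leading_term(theta_tilde=1.1, y=1.0, A=A, D=D)

# Predictor far from the direct estimate: the squared difference dominates.
far = leading_term(theta_tilde=2.0, y=1.0, A=A, D=D)

print(close, far)
```

With these inputs the first value is negative and the second is positive, which illustrates why the sign of the leading term, and hence of the estimate, depends on how far the single observation $y_i$ happens to fall from the predictor.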

Jiang et al. (2011a) also proposed an alternative, bootstrap approach to the MSPE estimation. The proposed bootstrap MSPE estimator is guaranteed nonnegative, but its bias appears to be larger than the estimator (5.54). Note that, typically, a bootstrap MSPE estimator without bias correction is only first order unbiased [e.g., Hall and Maiti (2006)]. Also, the justification of the bootstrap approach is questionable given the potential model misspecification.

A different, nonparametric bootstrap approach was proposed by Jiang et al. (2015b). Unlike Jiang et al. (2011a), there is a better justification of the nonparametric bootstrap approach in terms of Efron’s original idea [Efron (1979)]. Consider the finite-population situation of Subsection 5.4.2. Suppose that the small area subpopulations, or the $N_i$’s, are large enough, so that the sampling from the subpopulations can be treated approximately as with replacement. Let $z_{ij} = (x_{ij}', y_{ij})'$, $j = 1, \dots, n_i$ denote the (original) samples from the $i$th small area, $1 \le i \le m$. We then draw samples, $z_{ij}^{(a)} = [\{x_{ij}^{(a)}\}', y_{ij}^{(a)}]'$, $j = 1, \dots, n_i$, with replacement, from $\{z_{ij}, j = 1, \dots, n_i\}$, independently for $1 \le i \le m$. Suppose that $B$ bootstrap samples are drawn, yielding samples $z^{(a)} = \{z_{ij}^{(a)}, 1 \le j \le n_i, 1 \le i \le m\}$, $1 \le a \le B$. The bootstrapped version of the BP (5.24) is

$$\tilde{\theta}_i^{(a)} = \bar{X}_i'\beta + \left\{r_i + (1 - r_i)\frac{n_i\gamma}{1 + n_i\gamma}\right\}\left[\bar{y}_{i\cdot}^{(a)} - \{\bar{x}_{i\cdot}^{(a)}\}'\beta\right], \tag{5.55}$$

where $\beta$ and $\gamma$ are the same population parameters for $\beta$ and $\gamma$, respectively, as for the original population. Note that the original samples of $z_{ij}$ are assumed to satisfy the same NER model, (5.23), with $X_{ik}$ ($Y_{ik}$) replaced by $x_{ij}$ ($y_{ij}$). Because the original samples are treated as the bootstrap population, following Efron’s original idea, the population parameters, $\beta$, $\gamma$, for the bootstrap samples are the same as those for the original samples. Nevertheless, as mentioned, the proposed bootstrap procedure is nonparametric in the sense that the assumed model, (5.23), plays no role in drawing the bootstrap samples. In particular, the BPE of $\beta$ and $\gamma$, based on the original samples, are not used anywhere in the bootstrapping; and the population quantities of interest are $\bar{Y}_i$, $1 \le i \le m$, whose bootstrap analogies are $\bar{y}_{i\cdot}$, $1 \le i \le m$. This is different from the parametric bootstrap of Jiang et al. (2011a), where the BPE of the model parameters, based on the original samples, are used to draw bootstrap samples under the assumed model. Also note that, because the $\bar{X}_i$ are known, they are treated as known constants, and therefore do not change during the bootstrapping (it does not make sense to “estimate” something that one already knows). Other than

those, the procedure follows closely the standard procedure [e.g., Efron and Tibshirani (1993)]. The bootstrap estimator of $\mathrm{MSPE}(\hat{\theta}_i) = \mathrm{E}(\hat{\theta}_i - \bar{Y}_i)^2$ is

$$\widehat{\mathrm{MSPE}}(\hat{\theta}_i) = \frac{1}{B}\sum_{a=1}^{B}\left\{\hat{\theta}_i^{(a)} - \bar{y}_{i\cdot}\right\}^2, \tag{5.56}$$

where $\hat{\theta}_i^{(a)}$ is (5.55) with $\beta$, $\gamma$ replaced by their BPE based on the bootstrapped samples.

One might be concerned that, because the $n_i$’s may be small in typical SAE problems, there may not be many distinct bootstrap samples for each small area. However, the data consist of not just one, but many small areas. When all of the small areas are combined, there are, still, a lot of distinct bootstrap samples, even if the $n_i$’s are small. As an example, below we report the results of a simulation study by Jiang et al. (2015b).

Example 5.2. Consider the NER model (5.52). The setting is the same as in Subsection 5.7.2 except that the true underlying model is

$$Y_{ik} = b + v_i + e_{ik}, \quad i = 1, \dots, m, \; k = 1, \dots, 1000 \tag{5.57}$$

(so the subpopulation size is $N_i = 1000$, $1 \le i \le m$), and the sample sizes are smaller. More specifically, we have $b = 0.5$; $m = 10$ or $20$, and $n_i = 5$ or $10$.

We first consider the design-based bias of $\widehat{\mathrm{MSPE}}(\hat{\theta}_i)$. Two finite populations are generated, and then fixed, so that the finite population for $m = 10$ is a subpopulation of the finite population for $m = 20$. Table 5.6 reports, for the first 10 small areas (these are all the small areas that are common under different values of $m$), the simulated true MSPE (MSPE), obtained the same way as in Section 5.7, the simulated mean of $\widehat{\mathrm{MSPE}}(\hat{\theta}_i)$ ($\widehat{\mathrm{MSPE}}$), and the percentage relative bias (%RB) defined as

$$100 \times \left\{\frac{\mathrm{E}(\widehat{\mathrm{MSPE}}) - \text{True MSPE}}{\text{True MSPE}}\right\},$$

where the expectation is based on the simulations. Another measure of performance is the square root of the mean squared error (RMSE) over the simulations, defined as

$$\sqrt{\frac{1}{K}\sum_{k=1}^{K}\left(\widehat{\mathrm{MSPE}}_{i,k} - \mathrm{MSPE}_i\right)^2}$$

for the $i$th small area, where $\mathrm{MSPE}_i$ is the true MSPE for the $i$th small area (which does not depend on $k$), evaluated over the simulations, and $\widehat{\mathrm{MSPE}}_{i,k}$ is the MSPE estimate based on the $k$th simulated data set. We

consider $B = 100$ as the number of bootstrap samples used to evaluate the MSPE estimator, (5.56). All results are based on 1,000 simulation runs.

Table 5.6 Empirical Performance of $\widehat{\mathrm{MSPE}}$

m    n_i    i   MSPE   MSPE-hat   %RB     RMSE     i    MSPE   MSPE-hat   %RB     RMSE
10   5      1   .041   .042       4.5     .103     6    .034   .043       26.3    .070
10   10     1   .036   .036       -0.4    .068     6    .034   .036       6.4     .070
20   5      1   .031   .032       4.1     .051     6    .028   .031       12.5    .046
10   5      2   .046   .038       -16.1   .078     7    .032   .040       25.4    .078
10   10     2   .035   .033       -4.1    .078     7    .033   .034       2.7     .068
20   5      2   .031   .029       -7.2    .050     7    .030   .031       3.6     .055
10   5      3   .038   .042       10.2    .121     8    .042   .042       -0.4    .150
10   10     3   .037   .036       -1.7    .091     8    .033   .035       7.5     .067
20   5      3   .031   .032       4.4     .052     8    .030   .031       4.1     .058
10   5      4   .056   .052       -7.6    .121     9    .050   .042       -15.0   .074
10   10     4   .037   .040       6.3     .072     9    .034   .034       -1.0    .063
20   5      4   .040   .035       -11.3   .068     9    .034   .030       -11.1   .049
10   5      5   .033   .037       11.8    .066     10   .041   .043       3.1     .082
10   10     5   .032   .033       2.5     .066     10   .034   .033       -2.9    .073
20   5      5   .024   .025       2.9     .052     10   .035   .033       -7.9    .062

It is seen that, overall, the results improve when either $n_i$ or $m$ increases, but, in terms of %RB, the improvement is more universal, or effective, when $n_i$ increases. This is mainly due to the fact that, as $n_i$ increases, the sample provides a better approximation to the population; hence, the bootstrap distribution better approximates the population distribution. Also note that, depending on the area, the sign of the RB can be either positive or negative. This is mainly due to the area-to-area difference (recall that the populations are fixed) as well as the bootstrap errors. To obtain some overall measures, we report the mean and standard deviation (s.d.) of the %RBs over the 10 small areas as follows: $m = 10$, $n_i = 5$: mean = 4.2%, s.d. = 14.8%; $m = 10$, $n_i = 10$: mean = 1.5%, s.d. = 4.2%; $m = 20$, $n_i = 5$: mean = −0.6%, s.d. = 8.1%. The boxplots of the %RBs are presented in Figure 5.3. The plots further illustrate the pattern of improvement.

On the other hand, in terms of RMSE, the improvement is much more significant when $m$ increases than when $n_i$ increases. This is because having a larger $m$ reduces the MSPEs, in general; hence, naturally, the corresponding MSPE estimates also drop. In other words, both the estimator and the parameter (the MSPE) decrease, which typically results in a reduction in RMSE.

Finally, we consider a measure of uncertainty in the case of count data, discussed in Section 5.5. Chen et al. (2015) proposed a bootstrap approach for estimating the conditional MSPE of the OBP given the small area

means, $\mu_i$, $1 \le i \le m$ [e.g., Datta et al. (2011)]. The method is surprisingly simple. For each $1 \le i \le m$, draw a bootstrap sample, $y_i^*$, from the Poisson distribution with mean $y_i$. Let $y_{i,b}^*$, $1 \le i \le m$ be the $b$th bootstrap sample, $1 \le b \le B$, and $\hat{\mu}_{i,b}^*$ be the OBP computed based on the $b$th bootstrap sample for area $i$. Then, for $1 \le i \le m$, the conditional MSPE of the OBP for area $i$, $\mathrm{MSPE}(\hat{\mu}_i|\mu_i) = \mathrm{E}\{(\hat{\mu}_i - \mu_i)^2|\mu_i\}$, is estimated by

$$\widehat{\mathrm{MSPE}}_i = \frac{1}{B}\sum_{b=1}^{B}(\hat{\mu}_{i,b}^* - y_i)^2. \tag{5.58}$$

Fig. 5.3 Boxplots of %RB. 1: $m = 10$, $n_i = 5$; 2: $m = 10$, $n_i = 10$; 3: $m = 20$, $n_i = 5$.

Although the proposal might sound simple, it actually carries an idea

on how to use the original non-parametric bootstrap idea of Efron (1979) to justify something that might sound like parametric bootstrap with one observation. A key idea is called Poisson approximation to binomial, detailed below, from which one is able to create an imaginative “box”, and draw samples from the box as in Efron’s (1979) bootstrap. After the bootstrap samples are drawn, one “throws out” the box and goes back to the original Poisson observation (this is why the box is called imaginative).

Consider the case that the (conditional) Poisson distribution can be viewed as an approximation to a binomial distribution. The latter is, of course, well known in probability theory. Part of the theory states that, if $Y \sim \mathrm{Binomial}(n, p)$, where $n$ is large and $p$ is small such that $np \approx \lambda$, the distribution of $Y$ is approximately Poisson($\lambda$). See, for example, Jiang (2010, sec. 10.3), for a discussion as well as some (historical) examples.

Suppose that the observation, $Y$, is a Poisson count such that, approximately, $Y = \sum_{i=1}^{n} Y_i$, where the $Y_i$’s are i.i.d. Bernoulli. If the $Y_i$’s were observed, it would be straightforward to apply Efron’s (1979) bootstrap by drawing samples, with replacement, from $\{Y_1, \dots, Y_n\}$, say, $Y_1^*, \dots, Y_n^*$, and then compute $Y^* = \sum_{i=1}^{n} Y_i^*$ as the bootstrapped $Y$. The point is that, actually, the $Y_i$’s need not be observed, because $Y^*$ is the same as a random sample from the Binomial($n$, $\bar{Y}$) distribution, where $\bar{Y} = Y/n$. Here, we assume that the sample size, $n$, is known. Thus, once again, using the Poisson approximation to binomial, $Y^*$ is approximately a random sample from the Poisson($Y$) distribution. It turns out that, after all, one does not need to really do anything with the “box”, nor does one need to know the sample size, $n$, for the Poisson approximation to binomial except that $n$ is large.

Note that, under the broader model, and conditioning on $\mu_i$, $y_i$ is a Poisson count. It follows that the bootstrapped $y_i$, $y_i^*$, is approximately a random sample from the Poisson($y_i$) distribution, conditioning on $\mu_i$.

Alternatively, $y_i^*$ may also be viewed as a parametric bootstrap sample in that, conditioning on $\mu_i$, the distribution of $y_i$ is Poisson($\mu_i$). The parameter $\mu_i$ is then estimated by $y_i$, resulting in the bootstrap distribution, Poisson($y_i$).

The performance of the latest bootstrap method was evaluated empirically by Chen et al. (2015), who showed that the method performs reasonably well, especially when the small area means, $\mu_i$, are relatively large.

Table 5.7 The Hospital Data, EBLUP-4, OBP with Measure of Uncertainty

Area   y_i    x_i    √D_i   OBP (RMSPE)    EBLUP-4 (RMSPE)
1      .302   .112   .055   .239 (.060)    .226 (.026)
2      .140   .206   .053   .181 (.019)    .181 (.027)
3      .203   .104   .052   .220 (.017)    .209 (.026)
4      .333   .168   .052   .249 (.085)    .238 (.026)
5      .347   .337   .047   .347 (.047)    .347 (.049)
6      .216   .169   .046   .234 (.016)    .224 (.026)
7      .156   .211   .046   .172 (.020)    .175 (.028)
8      .143   .195   .046   .197 (.045)    .193 (.026)
9      .220   .221   .044   .162 (.058)    .170 (.031)
10     .205   .077   .044   .180 (.017)    .176 (.028)
11     .209   .195   .042   .206 (.015)    .203 (.026)
12     .266   .185   .041   .228 (.031)    .222 (.026)
13     .240   .202   .041   .201 (.031)    .200 (.026)
14     .262   .108   .036   .234 (.024)    .224 (.027)
15     .144   .204   .036   .180 (.032)    .179 (.027)
16     .116   .072   .035   .154 (.040)    .152 (.029)
17     .201   .142   .033   .236 (.032)    .224 (.027)
18     .212   .136   .032   .238 (.020)    .226 (.027)
19     .189   .172   .031   .223 (.031)    .214 (.026)
20     .212   .202   .029   .199 (.014)    .198 (.026)
21     .166   .087   .029   .187 (.017)    .181 (.026)
22     .173   .177   .027   .212 (.039)    .204 (.025)
23     .165   .072   .025   .165 (.017)    .163 (.027)

5.9 Real-data examples

We illustrate the OBP method using three real-data examples. The first example is regarding a case of Fay-Herriot model; the second is regarding the TVSFP data, discussed in Section 4.4, but here used to illustrate the OBP under an NER model; the last example is about a case of count data.

5.9.1 Hospital data

Morris and Christiansen (1995) presented a data set involving 23 hospitals (out of a total of 219 hospitals) that had at least 50 kidney transplants during a 27 month period. See Table 5.7. The $y_i$’s are graft failure rates for kidney transplant operations, that is, $y_i$ = number of graft failures/$n_i$, where $n_i$ is the number of kidney transplants at hospital $i$ during the period of interest. The variance for the graft failure rate, $D_i$, is approximated by $(0.2)(0.8)/n_i$, where 0.2 is the observed failure rate for all hospitals. Thus,

$D_i$ is assumed known. In addition, a severity index $x_i$ is available for each hospital, which is the average fraction of females, blacks, children and extremely ill kidney recipients at hospital $i$. Ganesh (2009) proposed a Fay-Herriot model for the graft failure rates, which is Example 5.1 with $x_i'\beta = \beta_0 + \beta_1 x_i$. Note that the graft failure rates are binomial proportions with fairly large denominators (at least 50). Thus, a normal distribution for the $y_i$’s is not unreasonable, at least from an approximation point of view, by the central limit theorem. However, inspections of the raw data suggest some nonlinear trends in the mean function (see Figure 5.4). Jiang et al. (2010) proposed a cubic model for the same data. On the other hand, there has been a concern that the point at the upper right corner might be an “outlier”, in some way, so a cubic model to accommodate a single point might overfit. Due to such a concern, Jiang et al. (2011a) proposed a quadratic-outlying (Q-O) model. Suppose that there is an abrupt “jump” in the mean response when $x_i$ is greater than 0.3; otherwise, the mean response is a quadratic function of the covariate. This can be expressed as

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + d\,1_{(x_i > 0.3)} + v_i + e_i, \tag{5.59}$$

$i = 1, \dots, m$, where the $x_i$’s are the severity index given in Table 5.7 and so are the $D_i$’s.

Fig. 5.4 Data from Morris and Christiansen (1995), plotted as $y$ against $x$. Circles: raw data; smooth curve and dots: cubic fit from Jiang et al. (2010).

Jiang et al. (2011a) considered OBP and EBLUP-4 (see Subsection 5.7.1) under the Q-O model. The latter appears to perform the best among the EBLUPs in this case based on the simulation results of Jiang et al. (2011a). Furthermore, the MSPE estimates (5.54) for the OBP and the Prasad-Rao MSPE estimates for EBLUP-4 [Prasad and Rao (1990)] were obtained. For 6 out of the 23 areas (area # 3, 6, 7, 11, 20, and 23) the estimates (5.54) are negative. Following the recommendation of Jiang et al. (2011a), for those areas the bootstrap MSPE estimates are used, with the bootstrap sample size equal to 100. The BPE for the parameters are, approximately, $\tilde{\beta}_0 = -0.084$, $\tilde{\beta}_1 = 4.614$, $\tilde{\beta}_2 = -16.045$, $\tilde{d} = 0.698$ and $\tilde{A} = 3.4 \times 10^{-4}$; the corresponding estimates for EBLUP-4 are, approximately, $\hat{\beta}_0 = -0.040$, $\hat{\beta}_1 = 3.723$, $\hat{\beta}_2 = -12.745$, $\hat{d} = 0.580$ and $\hat{A} = 3.6 \times 10^{-4}$. The OBP and EBLUP-4 are reported in Table 5.7 with the square roots of the corresponding MSPE estimates (RMSPE; in the parentheses) used as measures of uncertainty.

If the assumed Q-O model is correct, then according to the properties of OBP, EBLUP-4 may work better in this case; on the other hand, if the Q-O model is misspecified, the OBP may have an edge. While we may never know for sure whether the Q-O model is correct, it is very possible, as is usually the case, that the underlying mean function is (still) misspecified, to some extent. For example, a plot (see Figure 5.5) suggests that the fit (by either method) is relatively poor for the middle part of the $x$ range, where a quadratic mean function is fitted. Also note that the Prasad-Rao MSPE estimates are almost constant across the areas, with the exception of area #5 that corresponds to the outlying case. As for the OBP, areas with the largest estimated MSPEs correspond to those for which the quadratic curve fits the data relatively poorly, such as areas #1 and #4, but not area #5, for which the Q-O model fits the data well (again see Figure 5.5). This makes sense because the OBP MSPE estimates take into account the potential bias caused by model misspecification, and this is also with respect to a specific small area.

Fig. 5.5 The Q-O mean function fits for the hospital data, plotted as $y$ against $x$. Red curve (higher and slightly to the left): OBP; blue curve (lower and slightly to the right): EBLUP-4. The circles are the raw data.
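The fitted Q-O mean functions can be evaluated directly from the parameter estimates quoted above. The sketch below is only an arithmetic check of (5.59) with the random effect and error set aside; the function name `qo_mean` and the dictionaries holding the estimates are illustrative.

```python
def qo_mean(x, beta0, beta1, beta2, d, cutoff=0.3):
    """Mean function of the Q-O model (5.59): quadratic in x plus a jump d when x > cutoff."""
    jump = d if x > cutoff else 0.0
    return beta0 + beta1 * x + beta2 * x * x + jump


# Approximate parameter estimates quoted in the text.
OBP_FIT = dict(beta0=-0.084, beta1=4.614, beta2=-16.045, d=0.698)     # BPE
EBLUP4_FIT = dict(beta0=-0.040, beta1=3.723, beta2=-12.745, d=0.580)  # EBLUP-4

# Fitted means at a few severity-index values; x = 0.337 is area 5 in Table 5.7.
for x in (0.10, 0.15, 0.20, 0.337):
    print(x, round(qo_mean(x, **OBP_FIT), 3), round(qo_mean(x, **EBLUP4_FIT), 3))
```

At $x = 0.337$ both fitted means round to .347, the value reported for area 5 in Table 5.7, and at $x = 0.15$ the OBP fit lies above the EBLUP-4 fit, in line with the description of Figure 5.5.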

5.9.2 TVSFP data revisited

The TVSFP data were used in Section 4.4 to demonstrate the L-test. Here, we use the same data to illustrate the OBP method for finite-population estimation. More specifically, we consider a problem of estimating the small area means for the difference between the immediate postintervention and pretest THKS scores. Here the “small area” is understood as a number of major characteristics (e.g., residential area, teacher/student ratio) that affect the response, but are not captured by the covariates in the model (i.e., linear combination of the CC, TV and CCTV indicators). Note that, traditionally, the words “small areas” correspond to small geographical regions or subpopulations, for which adequate samples are not available [e.g., Rao and Molina (2015)], and such information as residential characteristics or teacher/student ratios would be used as additional covariates. However, such information is not available. This is why we define this unavailable information as “area-specific”, so that it can be treated as random effects. This is consistent with fundamental features of the random effects that are often used to capture unobservable effects or information [e.g., Jiang (2007)], and extends the traditional notion of SAE. Thus, a small area is the seventh graders in all of the U.S. schools that share similar major characteristics with an LA school involved in the data over a reasonable period of time (e.g., 5 years) so that these characteristics had not changed much during the time and neither had the social/educational relevance of the CC and TV programs. There are 28 LA schools in the TVSFP data that correspond to 28 sets of characteristics, so that the data are considered random samples from the 28 small areas defined as above. As such, each small area population is large enough so that $n_i/N_i \approx 0$, $1 \le i \le 28$. Recall that the $n_i$’s in the TVSFP sample range from 18 to 137, while the $N_i$’s are expected to be at least tens of thousands. Note that the only place in the OBP where the knowledge of $N_i$ is required is through the ratio $n_i/N_i$.

The proposed NER model is (4.32) with the assumptions made below the equation. It follows that all of the auxiliary data $x_i$ are at the area level; as a result, the value of $\bar{X}_i$ is known for every $i$. As noted, the sample sizes for some small areas are quite large, but there are also areas with relatively (much) smaller sample sizes. This is quite common in real-life problems. Because the auxiliary data are at area-level, we have $\bar{X}_i'\beta = \bar{x}_i'\beta$; thus, it is easy to show that the BP (5.24) can

be expressed as

$$\tilde{\theta}_i = \left\{r_i + (1 - r_i)\frac{n_i\gamma}{1 + n_i\gamma}\right\}\bar{y}_{i\cdot} + \frac{1 - r_i}{1 + n_i\gamma}\,\bar{x}_i'\beta.$$

It is seen that, when $n_i$ is large, the BP is approximately equal to $\bar{y}_{i\cdot}$, the design-based estimator, which has nothing to do with the parameter estimation. Therefore, when $n_i$ is large, there is not much difference between the OBP and the EBLUP. On the other hand, when $n_i$ is small or moderate, we expect some difference between the OBP and the EBLUP in terms of the MSPE. However, it is difficult to tell how much difference there is in this real data problem. The simulation results in Subsection 5.7.2 show that the difference between OBP and EBLUP in terms of the MSPE depends on to what extent the assumed model is misspecified. As noted, the response, $y_{ij}$, is the difference in the THKS scores, and possible values of the THKS score are integers between 0 and 7. Clearly, such data are not normal. The potential impact of the nonnormality is two-fold. On the one hand, it is likely that the NER model, as proposed by Hedeker et al. (1994), is misspecified, in which case expression (5.24) is no longer the BP, and the Gaussian ML (REML) estimators are no longer the true ML (REML) estimators. On the other hand, even without the normality, (5.24) can still be justified as the best linear predictor [BLP; e.g., Jiang (2007), p. 75]. Furthermore, the Gaussian ML (REML) estimators are consistent and asymptotically normal even without the normality assumption [Jiang (1996)].

Other aspects of the NER model include homoscedasticity of the error variance across the small areas. Figure 5.6 shows the histogram of sample variances for the 28 small areas. The bimodal shape of the histogram suggests potential heteroscedasticity in the error variance, yet another type of possible model misspecification. Therefore, the OBP method is naturally considered.

Jiang et al. (2015b) carried out the OBP analysis for the 28 small areas. The results are presented in Table 5.8. The BPE of the parameters are $\hat{\beta}_0 = 0.206$, $\hat{\beta}_1 = 0.687$, $\hat{\beta}_2 = 0.213$, $\hat{\beta}_3 = -0.288$, and $\hat{\gamma} = 0.003$. Although interpretation may be given for the parameter estimates, there is a concern about possible model misspecification (in which case the interpretation may not be sensible), as noted above. Regardless, our main interest is prediction, not parameter estimation; thus, we focus on the OBP. In addition to the OBPs, we also computed the corresponding $\widehat{\mathrm{MSPE}}$, and their square roots as the measures of uncertainty. As a comparison, the EBLUPs for the small areas as well as the corresponding square roots of the MSPE estimates, $\sqrt{\widehat{\mathrm{MSPE}}}$, using the Prasad-Rao method [P-R; Prasad

and Rao (1990)] are also included. It is seen that the OBPs are all positive, even for the small areas in the control group. As for the statistical significance (here “significance” is defined as that the OBP is greater in absolute value than 2 times the corresponding square root of the MSPE estimate), the small area means are significantly positive for all of the small areas in the (1, 1) group. In contrast, none of the small area means is significantly positive for the small areas in the (0, 0) group. As for the other two groups, the small area means are significantly positive for all the small areas in the (1, 0) group; the small area means are significantly positive for all but two small areas in the (0, 1) group. There are 7, 8, 7 and 7 small areas in the (0, 0), (0, 1), (1, 0) and (1, 1) groups, respectively.

Fig. 5.6 Histogram of sample variances (horizontal axis: sample variance; vertical axis: frequency); a kernel density smoother is fitted.
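The significance rule used above is a simple threshold comparison. As a minimal sketch, the snippet below applies it to three rows copied from Table 5.8 (school IDs 404, 193 and 407); the function name `is_significant` and the tuple layout are illustrative.

```python
def is_significant(prediction, root_mspe, multiplier=2.0):
    """Rule from the text: |prediction| exceeds `multiplier` times the root MSPE estimate."""
    return abs(prediction) > multiplier * root_mspe


# (ID, CC, TV, OBP, square root of MSPE estimate), copied from Table 5.8.
rows = [
    (404, 1, 1, 0.844, 0.296),
    (193, 0, 0, 0.215, 0.207),
    (407, 0, 1, 0.508, 0.300),
]

flags = {school: is_significant(obp, root) for school, cc, tv, obp, root in rows}
print(flags)
```

For these three rows the rule flags school 404 in the (1, 1) group and does not flag school 193 in the (0, 0) group or school 407 in the (0, 1) group.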

Comparing the OBP with the EBLUP, the values of the latter are generally higher, and their corresponding MSPE estimates are mostly lower. In terms of statistical significance, the EBLUP results are significant for the (1, 1), (1, 0) and (0, 1) groups, and insignificant for the (0, 0) group. It should be noted that the P-R MSPE estimator for the EBLUP is derived under the normality assumption, while in this case the data are clearly not normal, as noted earlier. Thus, the measure of uncertainty for the EBLUP may not be accurate. In particular, just because the (square roots of the) MSPE estimates for the EBLUPs are lower, compared to those for the OBPs, it does not mean the corresponding true MSPEs for the EBLUPs are lower than those for the OBPs. In fact, our simulation results (see Section 5.7) have shown otherwise. It is also observed that the MSPE estimates for the EBLUPs are more homogeneous across the small areas. This may be due to the fact that the P-R MSPE estimator for EBLUP is obtained assuming that the NER model is correct, while the proposed MSPE estimator for OBP does not use such an assumption.

In conclusion, in spite of the potential difference in the small area characteristics, the CC and TV programs appear to be successful in terms of improving the students’ THKS scores (whether the improved THKS score means improved tobacco use prevention and cessation is a different matter though). It also seems apparent that the CC program was relatively more effective than the TV program. Without the intervention of any of these programs, the THKS score did not seem to improve in terms of the small area means. In terms of the statistically significant results, when CC = 0 and TV = 0, the THKS score did not seem to improve; when CC = 1, the THKS score seemed to improve; and, when CC = 0 and TV = 1, the improvement of the THKS score was not so convincing.

5.9.3 Minnesota county data

Finally, we present an example regarding the OBP method for count data. Torabi (2014) reported data from 87 counties in Minnesota, USA that were first reported by Jin et al. (2005). Both references used the data to demonstrate spatial Poisson regression with multivariate CAR errors in the log-linear model for the Poisson mean. Chen et al. (2015) used part of the data, namely, data regarding esophagus cancer, to illustrate the OBP method and compare it with the EBP. The data are provided in Table 5.9 for convenience, where $y_i$, $x_i$ correspond to the observed and expected numbers of deaths due to esophagus cancer for county $i$, $i = 1, \dots, 87$.

Table 5.8 OBP, EBLUP, Measures of Uncertainty for TVSFP Data

ID    CC   TV   OBP    √MSPE-hat   EBLUP   √MSPE-hat
403   1    0    .886   .171        .913    .121
404   1    1    .844   .296        .856    .121
193   0    0    .215   .207        .217    .120
194   0    0    .221   .137        .221    .134
196   1    0    .878   .171        .907    .124
197   0    0    .225   .158        .223    .126
198   1    1    .771   .220        .807    .131
199   0    1    .426   .142        .453    .130
401   1    1    .826   .133        .844    .127
402   0    0    .188   .171        .199    .123
405   0    1    .394   .147        .432    .129
407   0    1    .508   .300        .508    .133
408   1    0    .871   .240        .903    .123
409   0    0    .230   .125        .227    .136
410   1    1    .778   .304        .813    .124
411   0    1    .409   .195        .444    .115
412   1    0    .913   .219        .930    .126
414   1    0    .929   .257        .941    .127
415   1    1    .869   .199        .872    .135
505   1    1    .790   .154        .818    .136
506   0    1    .389   .169        .428    .134
507   0    1    .426   .148        .452    .135
508   0    1    .411   .108        .442    .136
509   1    0    .915   .097        .929    .143
510   1    0    .880   .119        .905    .143
513   0    0    .185   .215        .197    .123
514   1    1    .866   .144        .870    .140
515   0    0    .180   .102        .192    .143

We consider a GM model with $x_i'\beta = x_i$ in (5.29), or an LN model with $x_i'\beta = \log(x_i)$ in (5.28). In other words, the slope for the covariate is 1 and there is no intercept, under both models. Thus, the only unknown parameters are $\phi$ under the GM model, and $\sigma$ under the LN model. Similar models were also proposed by Jin et al. (2005) and Torabi (2014). While these models may not hold exactly, a main point that we intend to make in this section is that the OBP method is more robust to potential model misspecifications.

The BPE of $\phi$ under the GM model is given by (5.40) with $x_i'\hat{\beta}$ replaced by $x_i$, $1 \le i \le m = 87$. The value is $\hat{\phi} = 9.13$. The OBP of $\mu_i$, under the GM model, is given by (5.34) with $x_i'\beta = x_i$ and $\phi$ replaced by $\hat{\phi}$. On the other hand, the ML estimator (MLE) of $\phi$ is $\tilde{\phi} = 3.44$; the EBP of $\mu_i$ is obtained similarly except using $\tilde{\phi}$. It is seen that the BPE and MLE are

very different. Note that the BP (5.34) can now be expressed as a weighted average of $y_i$ and $x_i$ with the weights being $\phi/(\phi + 1)$ for $y_i$ and $1/(\phi + 1)$ for $x_i$. Thus, the OBP is giving relatively more weight to $y_i$ than the EBP. This is not unreasonable. Inspection of the raw data (Exercise 5.11) suggests that there is little difference between the observed and expected; in other words, the variation is small, conditioning on the mean. Therefore, naturally, OBP is showing its faith in the data, $y_i$. However, because the $x_i$ and $y_i$ are so close, in the end, there is not much difference between the OBP and EBP anyway, as shown in Figure 5.7 and Table 5.10, where the margin of error is obtained as predicted value plus/minus the square root of the estimated MSPE ($\sqrt{\widehat{\mathrm{MSPE}}}$), and the MSPE of EBP is obtained the same way as that of the OBP using the bootstrap method introduced in Section 5.8 [see (5.58)]. Due to the large range of the predicted values, it is difficult to make a good display with all of the cases in one figure. As a compromise solution, only the cases with $y_i \le 200$ are shown in Figure 5.7, while the cases with $y_i > 200$ are reported in Table 5.10.

Table 5.9 Esophagus Cancer Data. Observed number of deaths due to esophagus cancer for 87 Minnesota counties. Source: Torabi (2014).

County   Data
1-10     y_i   96    673   142   126   114   41    169   100   160   124
         x_i   84    582   131   126   108   37    189   131   131   152
11-20    y_i   147   67    137   161   52    15    51    239   725   48
         x_i   129   72    124   181   42    21    73    234   695   62
21-30    y_i   116   73    79    151   149   28    3797  80    105   90
         x_i   147   96    111   166   185   39    3528  84    80    92
31-40    y_i   232   45    56    127   41    86    45    68    24    68
         x_i   192   63    57    169   32    70    51    55    21    103
41-50    y_i   53    88    86    23    35    94    66    88    113   187
         x_i   44    106   136   25    53    120   100   94    131   205
51-60    y_i   34    76    73    48    296   226   52    136   54    172
         x_i   54    95    101   46    346   280   61    102   57    156
61-70    y_i   63    1945  20    76    74    158   52    62    1000  145
         x_i   63    1741  22    92    94    176   52    60    921   162
71-80    y_i   128   54    350   89    35    53    113   21    93    53
         x_i   117   72    394   126   46    61    110   28    90    68
81-87    y_i   67    450   65    27    193   211   62
         x_i   78    402   60    36    185   225   66

As for the analysis under the LN model, because the $y_i$ are fairly large (some are very large) in this case, Laplace approximation is used [see (5.33) for the OBP; a similar approximation is used for computing the MLE for the EBP]. The BPE and MLE of $\sigma$ are $\hat{\sigma} = 0.157$ and $\tilde{\sigma} = 0.158$, respectively.

This time the BPE and MLE are very close; as a result, the OBP and EBP are also very close. The details are omitted.

Table 5.10 Analysis of Esophagus Cancer Data. Reported are OBP and EBP, with square roots of corresponding estimated MSPEs in the parentheses, for counties with $y_i > 200$.

County   OBP (√MSPE-hat)   EBP (√MSPE-hat)
2        664.0 (25.6)      652.5 (28.8)
18       238.5 (15.1)      237.9 (14.2)
19       722.0 (24.7)      718.2 (21.5)
27       3770.5 (61.3)     3736.4 (66.2)
31       228.1 (14.1)      223.0 (14.0)
55       300.9 (13.8)      307.3 (15.5)
56       231.3 (15.7)      238.2 (15.2)
62       1924.9 (44.5)     1899.1 (57.9)
69       992.2 (32.9)      982.2 (30.4)
73       354.3 (16.3)      359.9 (18.3)
82       445.3 (18.0)      439.2 (18.4)
86       212.4 (13.8)      214.2 (11.9)

Fig. 5.7 Analysis of Esophagus Cancer Data (horizontal axis: county; vertical axis: predictive value): OBP (blue triangle, on the left) and EBP (red square, on the right) for counties with $y_i \le 200$; dash lines are margins of errors.
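Under the GM model both predictors are weighted averages of $y_i$ and $x_i$, so the point predictions in Table 5.10 can be checked by hand. The sketch below is a rough check only: it uses the rounded estimates $\hat{\phi} = 9.13$ and $\tilde{\phi} = 3.44$ and the $(y_i, x_i)$ pairs as read from Table 5.9 for the twelve counties with $y_i > 200$, so agreement with Table 5.10 is expected only up to rounding. The names `gm_predictor`, `COUNTIES` and `TABLE_5_10` are illustrative.

```python
PHI_BPE = 9.13   # BPE of phi (drives the OBP)
PHI_MLE = 3.44   # MLE of phi (drives the EBP)


def gm_predictor(y, x, phi):
    """Weighted average with weight phi/(phi+1) on y and 1/(phi+1) on x."""
    w = phi / (phi + 1.0)
    return w * y + (1.0 - w) * x


# (y_i, x_i) as read from Table 5.9 for the counties with y_i > 200.
COUNTIES = {
    2: (673, 582), 18: (239, 234), 19: (725, 695), 27: (3797, 3528),
    31: (232, 192), 55: (296, 346), 56: (226, 280), 62: (1945, 1741),
    69: (1000, 921), 73: (350, 394), 82: (450, 402), 86: (211, 225),
}

# (OBP, EBP) point predictions copied from Table 5.10.
TABLE_5_10 = {
    2: (664.0, 652.5), 18: (238.5, 237.9), 19: (722.0, 718.2),
    27: (3770.5, 3736.4), 31: (228.1, 223.0), 55: (300.9, 307.3),
    56: (231.3, 238.2), 62: (1924.9, 1899.1), 69: (992.2, 982.2),
    73: (354.3, 359.9), 82: (445.3, 439.2), 86: (212.4, 214.2),
}

print("weight on y: OBP %.3f, EBP %.3f"
      % (PHI_BPE / (PHI_BPE + 1.0), PHI_MLE / (PHI_MLE + 1.0)))
for county, (y, x) in COUNTIES.items():
    obp = gm_predictor(y, x, PHI_BPE)
    ebp = gm_predictor(y, x, PHI_MLE)
    print(county, round(obp, 1), round(ebp, 1), TABLE_5_10[county])
```

The weight on $y_i$ is about 0.90 for the OBP and about 0.77 for the EBP, which is the sense in which the OBP leans more heavily on the observed counts.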

5.10 Exercises

5.1. Show that, in Example 5.1, the $i$th component of the BP of $\zeta$ is given by (5.13), where $\beta$, $A$ are the true parameters. The BPE of $\beta$ is given by (5.12), and the MLE of $\beta$ is given by (5.14).

5.2. This exercise has several parts.
a. Derive the expressions (5.17) and (5.18).
b. Show that, if $c = d$, then $g_1 \le g_2$ and $h_1 \ge h_2$ with equality holding in both cases if and only if (5.19) holds.
c. Suppose that $A/(c - d)^2 \approx 0$, $A/b \approx 0$ and $b/a \approx 0$. Show that, in this case, the value of $g_1/g_2$ is approximately 0.5.

5.3. Verify the identity (5.20) and the expression (5.22).

5.4. Show that, under the finite-population setting described in Subsection 5.4.2, the BP of the small area mean, $\theta_i$, has the expression (5.24).

5.5. Show that the design-based MSPE (5.26) can be expressed as $\mathrm{MSPE}(\tilde{\theta}) = \mathrm{E}\{Q(\psi) + \cdots\}$, where $\cdots$ does not depend on $\psi$, and $Q(\psi)$ is given by (5.26).

5.6. Show that (5.27) is a design-unbiased estimator of $\bar{Y}_i^2$, assuming $n_i > 1$.

5.7. Use Laplace approximation [e.g., Jiang (2007), §3.5.1] to derive (5.33).

5.8. Derive the BP formula (5.34) under the GM model.

5.9. Verify the expressions of BPE under the GM model, that is, (5.39) and (5.40).

5.10. Show that, in Example 5.1 (continued) in Section 5.6, the minimizer of $M(\psi)$ is given by (5.50).

5.11. Explore visually the esophagus cancer data of Table 5.9.


Chapter 6

Model Selection

Statistical models are key to potential gains that Statistics could make over a “simple-minded” approach. For example, a variance component model allows one to obtain a more efficient estimator of the fixed effects (regression coefficients) than the LS estimator; a SAE model makes it possible for one to “borrow strength”, and thus do better than direct estimators such as the (area-level) sample mean. On the other hand, as is always said, “there is no free lunch”, in the sense that such gains via statistical modeling are not risk-free with respect to model failure, and there will be a consequence in the latter case. Therefore, the importance of careful model selection, in particular for mixed effects models, cannot be overstated.

In a way, robustness of a model-based method has to do with the model choice. For example, in some cases the linear model assumption is violated by some outliers but a higher-order model may fit the data well. In this case, it is possible that the linear model is not a good choice in the first place. For example, recall the hospital data of §5.9.1. Ganesh (2009) proposed a Fay-Herriot model for the graft failure rates. The model is the same as that in Example 5.1 with $x_i'\beta = \beta_0 + \beta_1 x_i$. However, inspections of the raw data suggest one potential outlier (at the upper right corner) when the linear model is fitted; see Figure 5.4. Jiang et al. (2009) used a model selection procedure, called the fence methods (see below), to identify the optimal model in this case, which led to a cubic model corresponding to the smooth curve in the figure. It is apparent that, with respect to the cubic model, there is no presence of outliers.

We shall discuss three main approaches in mixed model selection. The first is generalized information criteria; the second is the fence methods; the third is shrinkage mixed model selection. A special topic regarding nonparametric mixed model selection is deferred to the next chapter.

6.1 Generalized information criteria

The information criteria, first proposed by Akaike [Akaike's information criterion (AIC); Akaike (1973)], have had a profound impact in statistical model selection and related fields. See, for example, de Leeuw (1992), for a review. A number of similar criteria have since been proposed, including the Bayesian information criterion [BIC; Schwarz (1978)], a criterion due to Hannan and Quinn [HQ; Hannan and Quinn (1979)], and the generalized information criterion [GIC; Nishii (1984), Shibata (1984)]. All of these criteria can be expressed as

$$\mathrm{GIC}(M) = -2\hat{l}(M) + \lambda_n |M|, \tag{6.1}$$

where $\hat{l}(M)$ is the maximized log-likelihood function under model $M$, $\lambda_n$ is a penalty for complexity of the model, which may depend on the effective sample size, $n$, and $|M|$ is the dimension of $M$, defined as the number of free parameters under $M$.

Although the information criteria are broadly used, difficulties are often encountered, especially in some non-conventional situations. One of the difficulties is that, in many cases, the distribution of the data is not fully specified (up to a number of unknown parameters); as a result, the likelihood function is not available. For example, suppose that normality is not assumed in a LMM for the random effects and errors, such as in the non-Gaussian LMM of Chapter 3, and that one wishes to select the fixed covariates using AIC, BIC, or HQ. It is not clear how to do this because the likelihood is unknown under the assumed model. Of course, one could blindly use those criteria, pretending that the data are normal, but the criteria are then no longer what they are meant to be. For example, Akaike's bias approximation that led to the AIC is no longer valid [e.g., Jiang and Nguyen (2015), sec. 1.1].

Nevertheless, it is of interest, from both theoretical and practical points of view, to know about the behavior of the information criteria when the assumed distribution does not hold. In fact, it is not uncommon at all that a practitioner uses an information criterion without making sure that the distributional assumption holds. Then what?

There are many aspects of an assumed model. Quite often in practice, only one, or some, of these aspects are of direct interest. For example, in the example mentioned above, one is only concerned about the fixed covariates that are involved in the model. So, even if the normality assumption does not hold, as long as the information criteria can correctly identify the model

in terms of the fixed covariates, it is good enough for the problem of direct interest.

Such an idea has been explored in the context of LMM selection. Jiang and Rao (2003) proposed two procedures of GIC type for LMM selection. The authors considered a general LMM that can be expressed as (5.2) with the same assumptions underneath the equation. The problems of interest concern selection of the columns of $X$, which correspond to the fixed covariates, and of the factors of random effects, which correspond to subvectors of $\alpha$.

6.1.1 Selecting the fixed covariates only

The first procedure of Jiang and Rao (2003) is for the case of selecting the fixed covariates when the random-effect factors are not subject to selection. This problem is closely related to a regression model selection problem with correlated errors. Thus, consider the following general linear model:

$$y = X\beta + \zeta, \tag{6.2}$$

where $\zeta$ is a vector of correlated errors, and everything else is as in (5.2). It is assumed that there are a number of candidate vectors of covariates, $X_1, \dots, X_q$, from which the columns of $X$ are to be selected. Let $K = \{1, \dots, q\}$. It is assumed that there exists a subset of $X_1, \dots, X_q$ such that, with $X$ being the matrix corresponding to the subset, model (6.2) holds with all of the components of $\beta$ nonzero. Such a subset is called the optimal model. The set of all possible models can be expressed as $B = \{M : M \subseteq K\}$, and there are $2^q$ possible models. Let $A$ be a subset of $B$ that is known to contain the optimal model, so the selection will be within $A$. In an extreme case, $A$ may be $B$ itself.

For any matrix $A$, let $L(A)$ be the linear space spanned by the columns of $A$; $P_A$ the projection onto $L(A)$: $P_A = A(A'A)^{-}A'$, where $B^{-}$ denotes the Moore-Penrose inverse; and $P_A^{\perp}$ the orthogonal projection: $P_A^{\perp} = I - P_A$. For any $M \in B$, let $X(M)$ be the matrix whose columns are $X_j$, $j \in M$, if $M \neq \emptyset$; and $X(M) = 0$ if $M = \emptyset$. Consider the following model selection criterion:

$$C_N(M) = |y - X(M)\hat{\beta}(M)|^2 + \lambda_N |M| = |P^{\perp}_{X(M)} y|^2 + \lambda_N |M|, \tag{6.3}$$

$M \in A$, where $|M|$ represents the cardinality of $M$; $\hat{\beta}(M)$ is the ordinary least squares (OLS) estimator of $\beta(M)$ in the model $y = X(M)\beta(M) + \zeta$, which can be expressed as $\hat{\beta}(M) = [X(M)'X(M)]^{-}X(M)'y$,

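and $\lambda_N$ is a positive number satisfying certain conditions specified below. Note that $P_{X(M)}$ is understood as 0 if $M = \emptyset$. Denote the optimal model by $M_0$. If $M_0 \neq \emptyset$, denote the corresponding $X$ and $\beta$ by $X$ and $\beta = (\beta_j)_{1 \le j \le p}$ ($p = |M_0|$). Note that $\beta_j \neq 0$, $1 \le j \le p$, by the definition of the true model. (This is reasonable because, otherwise, the optimal model can be further simplified.) If $M_0 = \emptyset$, $X$, $\beta$ and $p$ are understood as 0. For $1 \le j \le q$, let $\{j\}^c$ represent the set $K \setminus \{j\}$. Define the following sequences:

$$\omega_N = \min_{1 \le j \le q} |P^{\perp}_{X(\{j\}^c)} X_j|^2, \quad \nu_N = \max_{1 \le j \le q} |X_j|^2, \quad \rho_N = \lambda_{\max}(ZGZ') + \lambda_{\max}(R),$$

where $\lambda_{\max}$ denotes the largest eigenvalue. Let $\hat{M}$ be the minimizer of (6.3) over $M \in A$, which is the selected model. The following theorem, proved in Jiang and Rao (2003), provides sufficient conditions under which the model selection is consistent in the sense that

$$P(\hat{M} \neq M_0) \longrightarrow 0. \tag{6.4}$$

Theorem 6.1. Suppose that $\nu_N > 0$ for large $N$,

$$\rho_N/\nu_N \longrightarrow 0, \quad \text{while} \quad \liminf(\omega_N/\nu_N) > 0. \tag{6.5}$$

Then, (6.4) holds for any $\lambda_N$ such that

$$\lambda_N/\nu_N \longrightarrow 0 \quad \text{and} \quad \rho_N/\lambda_N \longrightarrow 0. \tag{6.6}$$

Note 1. If the first part of (6.5) holds, there always exists $\lambda_N$ that satisfies (6.6). For example, take $\lambda_N = \sqrt{\rho_N \nu_N}$.

Note 2. Typically, one has $\nu_N \sim N$. To see what the order of $\rho_N$ may turn out to be, consider a special but important case of LMM. Suppose that $Z = (Z_1, \dots, Z_s)$, where each $Z_r$ is a standard design matrix in the sense that it consists only of 0's and 1's, there is exactly one 1 in each row, and at least one 1 in each column. Likewise, $\alpha = (\alpha_1', \dots, \alpha_s')'$ such that $Z\alpha = Z_1\alpha_1 + \cdots + Z_s\alpha_s$ [see (1.1)], where $\alpha_r$ is an $m_r$-dimensional vector of uncorrelated random effects with mean 0 and variance $\sigma_r^2$. Furthermore, $\epsilon$ is a vector of uncorrelated errors with mean 0 and variance $\sigma_0^2$. Finally, $\alpha_1, \dots, \alpha_s, \epsilon$ are uncorrelated. Let $n_{rk}$ be the number of 1's in the $k$th column of $Z_r$. Note that $n_{rk}$ is the number of appearances of the $k$th component of $\alpha_r$, and $Z_r'Z_r = \mathrm{diag}(n_{rk}, 1 \le k \le m_r)$. Thus, we have

$$\lambda_{\max}(ZGZ') \le \sum_{r=1}^{s} \sigma_r^2 \lambda_{\max}(Z_r Z_r') = \sum_{r=1}^{s} \sigma_r^2 \max_{1 \le k \le m_r} n_{rk}.$$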

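Also, one has $\lambda_{\max}(R) = \sigma_0^2$. It follows that

$$\rho_N = O\left(\max_{1 \le r \le s} \max_{1 \le k \le m_r} n_{rk}\right).$$

Therefore, (6.6) is satisfied provided that $\lambda_N/N \to 0$ and

$$\frac{\max_{1 \le r \le s} \max_{1 \le k \le m_r} n_{rk}}{\lambda_N} \longrightarrow 0.$$

It is seen that, so far, only weak assumptions are made regarding the random effects and errors. Namely, neither the normality assumption, nor even the independence (or i.i.d.) assumptions, such as those under the non-Gaussian LMM (see Chapter 3), are made. For example, in the special case noted above, it is only assumed that the components of the random effect and error vectors are uncorrelated, and that the vectors of random effects and errors are uncorrelated. Because of the weak distributional assumption, the consistency result of Theorem 6.1 holds without the normality or independence assumptions. We illustrate with a specific example.

Example 6.1. Consider the following simple LMM:

$$y_{ij} = \beta_0 + \beta_1 x_{ij} + \alpha_i + \epsilon_{ij}, \tag{6.7}$$

$i = 1, \dots, m$, $j = 1, \dots, n$, where $\beta_0, \beta_1$ are unknown coefficients (the fixed effects). It is assumed that the random effects $\alpha_1, \dots, \alpha_m$ are uncorrelated with mean 0 and variance $\sigma^2$. Furthermore, assume that the errors $\epsilon_{ij}$ have the following exchangeable correlation structure: Let $\epsilon_i = (\epsilon_{ij})_{1 \le j \le n}$. Then, we have $\mathrm{Cov}(\epsilon_i, \epsilon_{i'}) = 0$ if $i \neq i'$, and $\mathrm{Var}(\epsilon_i) = \tau^2\{(1-\rho)I + \rho J\}$, where $I$ is the identity matrix and $J$ the matrix of 1's. Also assume that the $\alpha$'s are uncorrelated with the $\epsilon$'s. Suppose that $m \to \infty$,

$$\liminf\left[\frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{\cdot\cdot})^2\right] > 0, \quad \text{and} \quad \limsup\left[\frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}x_{ij}^2\right] < \infty, \tag{6.8}$$

where $\bar{x}_{\cdot\cdot} = (mn)^{-1}\sum_{i=1}^{m}\sum_{j=1}^{n}x_{ij}$. Then, it is easy to show that the conditions of Theorem 6.1 are satisfied. In fact, in this case, $\rho_N \sim n$, while $\nu_N \sim \omega_N \sim mn$ (Exercise 6.1).

The above procedure requires selecting $\hat{M}$ from the entire $A$. Note that $A$ may contain up to $2^q$ subsets, if $A = B$. When $q$ is relatively large, alternative procedures have been proposed in the (fixed effects) linear model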

context, which require less computation [e.g., Zheng and Loh (1995)]. Jiang and Rao (2003) consider an approach that is similar, in spirit, to that of Rao and Wu (1989). First, note that one can always express $X\beta$ in (6.2) as

$$X\beta = \sum_{j=1}^{q} \beta_j X_j \tag{6.9}$$

with the understanding that some of the coefficients $\beta_j$ may be zero. It follows that $M_0 = \{1 \le j \le q : \beta_j \neq 0\}$. Let $X_{-j} = (X_u)_{1 \le u \le q, u \neq j}$, $1 \le j \le q$, $\eta_N = \min_{1 \le j \le q} |P^{\perp}_{X_{-j}} X_j|^2$, and $\delta_N$ be a sequence of positive numbers satisfying conditions specified in Theorem 6.2 below. Let $\hat{M}$ be the subset of $K$ such that $j \in \hat{M}$ iff

$$\frac{|P^{\perp}_{X_{-j}} y|^2 - |P^{\perp}_{X} y|^2}{|P^{\perp}_{X_{-j}} X_j|^2 \delta_N} > 1. \tag{6.10}$$

The following theorem, proved in Jiang and Rao (2003), states that, under suitable conditions, $\hat{M}$ is a consistent model selection. Recall that $\rho_N$ is defined above Theorem 6.1.

Theorem 6.2. Suppose that $\eta_N > 0$ for large $N$, and

$$\rho_N/\eta_N \longrightarrow 0. \tag{6.11}$$

Then, (6.4) holds for any $\delta_N$ such that

$$\delta_N \longrightarrow 0 \quad \text{and} \quad \rho_N/(\eta_N \delta_N) \longrightarrow 0. \tag{6.12}$$

Example 6.1 (continued). It is easy to show that, under exactly the same conditions [i.e., $m \to \infty$ and (6.8)], one has $\eta_N \sim mn$. Recall that $\rho_N \sim n$. Thus, the conditions of Theorem 6.2 are satisfied (Exercise 6.2).

6.1.2 Selecting fixed covariates and random effect factors

There are some major differences between selecting the fixed covariates, $X_j$, as considered in the previous subsection, and selecting the random effect factors. One difference is that, in the latter case, one is going to determine whether the vector, $\alpha_r$, as opposed to any component of $\alpha_r$, should be included in the model. In other words, the components of $\alpha_r$ are either all "in" or all "out". Another difference is that, unlike selecting the fixed covariates, where it is reasonable to assume that the $X_j$'s are linearly independent, in a LMM it is possible to have $r \neq r'$ but $L(Z_r) \subset L(Z_{r'})$. For example, see Example 6.2 below. Due to these features, the selection of random effect factors cannot be handled the same way as before.
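As a small illustration of the all-subsets criterion (6.3) of Subsection 6.1.1, the following Python sketch computes $|P^{\perp}_{X(M)} y|^2 + \lambda_N |M|$ for every subset of a few candidate columns and returns the minimizer. The function names `rss` and `select_model`, the toy data, and the value `lam=1.0` are assumptions made only for this illustration; they are not taken from Jiang and Rao (2003). In particular, the theorem requires $\lambda_N/\nu_N \to 0$ and $\rho_N/\lambda_N \to 0$, which a single fixed constant does not encode. Indices are 0-based here, whereas the text uses $1, \dots, q$.

```python
from itertools import combinations


def rss(y, columns):
    """Squared norm of the residual of y after projecting onto span(columns).

    Uses modified Gram-Schmidt; linearly dependent columns are skipped.
    """
    basis = []
    for col in columns:
        v = list(col)
        for b in basis:
            coef = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        norm2 = sum(vi * vi for vi in v)
        if norm2 > 1e-12:
            norm = norm2 ** 0.5
            basis.append([vi / norm for vi in v])
    resid = list(y)
    for b in basis:
        coef = sum(r * bi for r, bi in zip(resid, b))
        resid = [r - coef * bi for r, bi in zip(resid, b)]
    return sum(r * r for r in resid)


def select_model(y, candidates, lam):
    """Minimize RSS(M) + lam * |M| over all subsets M of the candidates."""
    q = len(candidates)
    best, best_val = (), None
    for size in range(q + 1):
        for M in combinations(range(q), size):
            val = rss(y, [candidates[j] for j in M]) + lam * len(M)
            if best_val is None or val < best_val:
                best, best_val = M, val
    return set(best), best_val


# Toy data (hypothetical): y = 2*X0 + 3*X1 + small perturbation; X2 is irrelevant.
X0 = [1.0] * 8
X1 = [-1.0] * 4 + [1.0] * 4
X2 = [-1.0, 1.0] * 4
y = [-0.9, -1.1, -1.1, -0.9, 5.1, 4.9, 4.9, 5.1]

M_hat, value = select_model(y, [X0, X1, X2], lam=1.0)
print(sorted(M_hat), round(value, 2))
```

The exhaustive loop visits all $2^q$ subsets, which is the computational burden that motivates the one-covariate-at-a-time rule (6.10).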

In this section, we assume that $Z\alpha$ can be expressed as

$$Z\alpha = \sum_{r=1}^{s} Z_r \alpha_r, \tag{6.13}$$

where $Z_1, \dots, Z_s$ are known matrices; each $\alpha_r$ is a vector of independent random effects with mean 0 and variance $\sigma_r^2$, which is unknown, $1 \le r \le s$. Furthermore, we assume that $\epsilon$ is a vector of independent errors with mean 0 and variance $\tau^2 > 0$, and $\alpha_1, \dots, \alpha_s, \epsilon$ are independent. Such assumptions are customary in the mixed effects model context [e.g., Jiang (2007)]; therefore, (6.13) represents a fairly general class of LMMs. If $\sigma_r^2 > 0$, $\alpha_r$ is in the model; otherwise, it is not. It follows that selection of the random effect factors is equivalent to simultaneously determining which of the variance components $\sigma_1^2, \dots, \sigma_s^2$ are positive, and which of them are zero. The true model can be expressed as

$$y = X\beta + \sum_{r \in l_0} Z_r \alpha_r + \epsilon, \tag{6.14}$$

where $X = (X_j)_{j \in k_0}$ and $k_0 \subseteq K$ (see Subsection 6.1.1); $l_0 \subseteq L = \{1, \dots, s\}$ such that $\sigma_r^2 > 0$, $r \in l_0$, and $\sigma_r^2 = 0$, $r \in L \setminus l_0$.

It should be noted that there is a hypothesis testing approach to selecting the random effect factors by testing the hypothesis that some of the $\sigma_r^2$ are zero (see the next subsection). However, if the null hypothesis is rejected, one knows that at least one of these variance components is non-zero, but one still does not know which one(s) are zero and which one(s) are non-zero. On the other hand, if the null hypothesis is accepted, one concludes that these variance components are zero, but it is not immediately clear what happens to other variance components not involved in the null hypothesis. Thus, the result of hypothesis testing may be inconclusive for model selection purposes. Furthermore, a hypothesis testing approach to selecting the random effect factors often relies on the normality assumption (but see the next subsection), and no such assumption is made here. Due to such considerations, Jiang and Rao (2003) took a different approach.

Recall that, in the previous subsection, we had a procedure to select the fixed covariates of the model, which leads to $\hat{M}$ that satisfies (6.4). In fact, the only place where the determination of $\hat{M}$ might use knowledge about $Z$, hence $l_0$, is through $\lambda_N$, which depends on the order of $\lambda_{\max}(ZGZ')$. However, under (6.13), we have

$$\lambda_{\max}(ZGZ') \le \sum_{r=1}^{s} \sigma_r^2 \|Z_r\|^2,$$

where, for any matrix $A$, $\|A\| = [\lambda_{\max}(A'A)]^{1/2}$. Thus, an upper bound for the order of $\lambda_{\max}(ZGZ')$ is $\max_{1 \le r \le s} \|Z_r\|^2$, which does not depend on $l_0$. Therefore, $\hat{M}$ could be determined without knowing $l_0$. In any case, one may write $\hat{M} = \hat{M}(l_0)$, be it dependent on $l_0$ or not. Now, suppose that a selection for the random effect factors, that is, a determination of $l_0$, is $\hat{l}$. We then define $\hat{M} = \hat{M}(\hat{l})$. It is easy to establish the following theorem, which states that the combined procedure, which determines both the fixed covariates and the random effect factors, is consistent (Exercise 6.3).

Theorem 6.3. Suppose that $P(\hat{l} \neq l_0) \to 0$ and $P(\hat{M}(l_0) \neq M_0) \to 0$. Then, we have $P(\hat{M} = M_0 \text{ and } \hat{l} = l_0) \to 1$.

Theorem 6.3 allows us to focus on how to select the random effect factors, because, once we have a consistent procedure for selecting the random effect factors, a procedure for jointly selecting the fixed covariates and random effect factors is readily in place. We now describe how to obtain $\hat{l}$.

First divide the vectors $\alpha_1, \dots, \alpha_s$, or, equivalently, the matrices $Z_1, \dots, Z_s$, into several groups. The first group is called the "largest random factors". Roughly speaking, those are $Z_r$, $r \in L_1 \subseteq L$, such that $\mathrm{rank}(Z_r)$ is of the same order as $N$, the sample size. We assume that $L(X, Z_u, u \in L \setminus \{r\}) \neq L(X, Z_u, u \in L)$, $r \in L_1$, where $L(A_1, \dots, A_t)$ represents the linear space spanned by the columns of the matrices $A_1, \dots, A_t$. Such an assumption is reasonable because $Z_r$ is supposed to be largest, hence must contribute to the linear space. The second group consists of $Z_r$, $r \in L_2 \subseteq L$, such that $L(X, Z_u, u \in L \setminus L_1 \setminus \{r\}) \neq L(X, Z_u, u \in L \setminus L_1)$, $r \in L_2$. The ranks of the matrices in this group are of lower order than $N$. Similarly, the third group consists of $Z_r$, $r \in L_3 \subseteq L$, such that $L(X, Z_u, u \in L \setminus L_1 \setminus L_2 \setminus \{r\}) \neq L(X, Z_u, u \in L \setminus L_1 \setminus L_2)$, and so on. Note that if the first group, i.e., the largest random factors, does not exist, the second group becomes the first, and the other groups also move up.

Intuitively, a selection procedure will not work if there is linear dependence among the candidate design matrices, because of an identifiability problem. To consider a rather extreme example, suppose that $Z_1$ is a design matrix consisting of 0's and 1's such that there is exactly one 1 in each row, and $Z_2 = 2Z_1$. Then, to have $Z_1\alpha_1$ in the model means that there is a term $\alpha_{1i}$; to have $Z_2\alpha_2 = 2Z_1\alpha_2$ in the model means that there is a corresponding term, $2\alpha_{2i}$. However, it makes no difference in terms of a LMM, because both $\alpha_{1i}$ and $\alpha_{2i}$ are random effects with mean 0 and certain variances. However, by grouping the random effect factors we have divided the $Z_r$'s into several groups such that there is linear independence

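within each group. This is the motivation behind the grouping strategy described above. To illustrate the procedure and also to show that such a division of groups does exist in typical situations, consider the following.

Example 6.2. Consider the following random effects model:

$$y_{ijkl} = \mu + a_i + b_j + c_k + d_{ij} + f_{ik} + g_{jk} + h_{ijk} + e_{ijkl}, \tag{6.15}$$

$i = 1, \dots, m_1$, $j = 1, \dots, m_2$, $k = 1, \dots, m_3$, $l = 1, \dots, n$, where $\mu$ is an unknown mean; $a, b, c$ are random main effects; $d, f, g, h$ are (random) two- and three-way interactions; and $e$ is error. The model can be written as

$$y = X\mu + Z_1 a + Z_2 b + Z_3 c + Z_4 d + Z_5 f + Z_6 g + Z_7 h + e,$$

where $X = 1_N$ with $N = m_1 m_2 m_3 n$, $Z_1 = I_{m_1} \otimes 1_{m_2} \otimes 1_{m_3} \otimes 1_n$, ..., $Z_4 = I_{m_1} \otimes I_{m_2} \otimes 1_{m_3} \otimes 1_n$, ..., and $Z_7 = I_{m_1} \otimes I_{m_2} \otimes I_{m_3} \otimes 1_n$. Here $I_r$ and $1_r$ represent the $r$-dimensional identity matrix and vector of 1's, and $\otimes$ means Kronecker product. It is easy to see that the $Z_r$'s are not linearly independent. For example, $L(Z_r) \subset L(Z_4)$, $r = 1, 2$, and $L(Z_r) \subset L(Z_7)$, $r = 1, \dots, 6$. Also, $L(X) \subset L(Z_r)$ for any $r$.

Suppose that $m_r \to \infty$, $r = 1, 2, 3$, while $n$ is bounded. Then, the first group consists of $Z_7$; the second group of $Z_4, Z_5, Z_6$; and the third group of $Z_1, Z_2, Z_3$. If $n$ also $\to \infty$, the largest random factor does not exist. However, one still has these three groups. It is easy to see that the $Z_r$'s within each group are linearly independent (Exercise 6.4).

In general, suppose that the $Z_r$'s are divided into $h$ groups such that $L = L_1 \cup \cdots \cup L_h$. We describe a procedure that determines the indexes $r \in L_1$ for which $\sigma_r^2 > 0$; then a procedure that determines the indexes $r \in L_2$ for which $\sigma_r^2 > 0$; and so on.

Group one. Consider the first group. Write $B = L(X, Z_1, \dots, Z_s)$, $B_{-r} = L(X, Z_u, u \in L \setminus \{r\})$, $r \in L_1$; $d = N - \mathrm{rank}(B)$, $d_r = \mathrm{rank}(B) - \mathrm{rank}(B_{-r})$; $D = |P_B^{\perp} y|^2$, $D_r = |(P_B - P_{B_{-r}})y|^2$. If $A$ is a matrix, define $\|A\|_2 = [\mathrm{tr}(A'A)]^{1/2}$. For any $1 < \rho < 2$, let $\hat{l}_1$ be the set of indexes $r \in L_1$ such that

$$(d/D)(D_r/d_r) > 1 + d^{(\rho/2)-1} + d_r^{(\rho/2)-1}. \tag{6.16}$$

Let $l_{01} = \{r \in L_1 : \sigma_r^2 > 0\}$.

Lemma 6.1. Suppose that $d \to \infty$; and $d_r \to \infty$, $\liminf(\|P^{\perp}_{B_{-r}} Z_r\|_2^2 / d_r) > 0$, $\|P^{\perp}_{B_{-r}} Z_r\|_2^2 / d_r^2 \to 0$, and $\|Z_r' P^{\perp}_{B_{-r}} Z_r\|_2^2 / d_r^2 \to 0$, $r \in L_1$. Then, we have $P(\hat{l}_1 = l_{01}) \to 1$.

Example 6.2 (continued). Suppose that $m_t \to \infty$, $t = 1, 2, 3$, while $n$ is bounded but $n \ge 2$. Then, group one corresponds to a single index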

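$r = 7$. Furthermore, it is easy to see that $d = m_1 m_2 m_3 (n - 1)$, $d_7 \ge g = m_1 m_2 m_3 - m_1 m_2 - m_2 m_3 - m_3 m_1$, $ng \le \|P^{\perp}_{B_{-7}} Z_7\|_2^2 \le N = m_1 m_2 m_3 n$, and $\|Z_7' P^{\perp}_{B_{-7}} Z_7\|_2^2 \le nN$. It follows that all of the conditions of Lemma 6.1 are satisfied.

Group two. Now consider the second group. Let $B_1 = L(X, Z_u, u \in L \setminus L_1)$, $B_2 = L(X, Z_u, u \in L \setminus L_1 \setminus L_2)$, $B_{1,-r} = L(X, Z_u, u \in L \setminus L_1 \setminus \{r\})$, $r \in L_2$, and $B_1(l_2) = L(X, Z_u, u \in (L \setminus L_1 \setminus L_2) \cup l_2)$, $l_2 \subseteq L_2$. Consider

$$C_{1,N}(l_2) = |P^{\perp}_{B_1(l_2)} y|^2 + \lambda_{1,N} |l_2|, \tag{6.17}$$

$l_2 \subseteq L_2$, where $\lambda_{1,N}$ is a positive number satisfying conditions specified below. Let $\hat{l}_2$ be the minimizer of $C_{1,N}$ over $l_2 \subseteq L_2$, and $l_{02} = \{r \in L_2 : \sigma_r^2 > 0\}$. Let $L_0 = \{0\}$, and $Z_0 = I$, the identity matrix; $\rho_{1,N} = \max_{r \in L_0 \cup L_1} \|P_{B_1} Z_r\|_2^2$, $\nu_{1,N} = \min_{r \in L_2} \|P^{\perp}_{B_{1,-r}} Z_r\|_2^2$, and $\gamma_{1,N} = \max_{r \in L_2}(\|P^{\perp}_{B_2} Z_r\|^2 / \|P^{\perp}_{B_{1,-r}} Z_r\|_2^2)$.

Lemma 6.2. Suppose that $\nu_{1,N} > 0$ for large $N$,

$$\rho_{1,N}/\nu_{1,N} \longrightarrow 0, \quad \text{and} \quad \gamma_{1,N} \longrightarrow 0. \tag{6.18}$$

Then, $P(\hat{l}_2 \neq l_{02}) \to 0$ for any $\lambda_{1,N}$ such that

$$\lambda_{1,N}/\nu_{1,N} \longrightarrow 0 \quad \text{and} \quad \rho_{1,N}/\lambda_{1,N} \longrightarrow 0. \tag{6.19}$$

Example 6.2 (continued). Group two corresponds to three indexes: $r = 4, 5, 6$. It is easy to see that $B_1 = L(Z_4, Z_5, Z_6)$, $B_{1,-4} = L(Z_5, Z_6)$, etc. Thus, $\|P_{B_1}\|_2^2 = \mathrm{rank}(B_1) \le f = m_1 m_2 + m_2 m_3 + m_3 m_1$, $\|P_{B_1} Z_7\|_2^2 \le nf$, and $\|P^{\perp}_{B_2} Z_4\|^2 \le \|Z_4\|^2 = m_3 n$, etc. Finally, it is easy to verify that $P_{Z_5} P_{Z_6} = P_{Z_6} P_{Z_5}$. Thus, by an inequality in Exercise 6.5, we have $P_{B_{1,-4}} = P_{(Z_5\ Z_6)} \le P_{Z_5} + P_{Z_6}$. It follows that

$$\mathrm{tr}(Z_4' P_{B_{1,-4}} Z_4) \le \mathrm{tr}(Z_4' P_{Z_5} Z_4) + \mathrm{tr}(Z_4' P_{Z_6} Z_4) = n(m_2 m_3 + m_3 m_1);$$

hence, we have

$$\|P^{\perp}_{B_{1,-4}} Z_4\|_2^2 = \mathrm{tr}(Z_4' Z_4) - \mathrm{tr}(Z_4' P_{B_{1,-4}} Z_4) \ge n(m_1 m_2 m_3 - m_2 m_3 - m_3 m_1),$$

etc. Thus, all of the conditions of Lemma 6.2 are satisfied.

General. The above procedure can be extended to the remaining groups. In general, let $B_t = L(X, Z_u, u \in L \setminus L_1 \setminus \cdots \setminus L_t)$, $1 \le t \le h$; $B_{t,-r} = L(X, Z_u, u \in L \setminus L_1 \setminus \cdots \setminus L_t \setminus \{r\})$, $r \in L_{t+1}$, and $B_t(l_{t+1}) = L(X, Z_u, u \in (L \setminus L_1 \setminus \cdots \setminus L_{t+1}) \cup l_{t+1})$, $l_{t+1} \subseteq L_{t+1}$, $1 \le t \le h - 1$. Define

$$C_{t,N}(l_{t+1}) = |P^{\perp}_{B_t(l_{t+1})} y|^2 + \lambda_{t,N} |l_{t+1}|, \quad l_{t+1} \subseteq L_{t+1}; \tag{6.20}$$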

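where $\lambda_{t,N}$ is a positive number satisfying the conditions specified below. Let $\hat{l}_{t+1}$ be the minimizer of $C_{t,N}$ over $l_{t+1} \subseteq L_{t+1}$, and $l_{0,t+1} = \{r \in L_{t+1} : \sigma_r^2 > 0\}$. By exactly the same proof as that of Lemma 6.2 [see Jiang and Rao (2003)], one can establish the consistency of $\hat{l}_{t+1}$, $2 \le t \le h - 1$. Let $\rho_{t,N} = \max_{r \in L_0 \cup \cdots \cup L_t} \|P_{B_t} Z_r\|_2^2$, $\nu_{t,N} = \min_{r \in L_{t+1}} \|P^{\perp}_{B_{t,-r}} Z_r\|_2^2$, and $\gamma_{t,N} = \max_{r \in L_{t+1}}(\|P^{\perp}_{B_{t+1}} Z_r\|^2 / \|P^{\perp}_{B_{t,-r}} Z_r\|_2^2)$. Then, we have the following theorem for determining the combination of $l_{01}, \dots, l_{0h}$.

Theorem 6.4. Suppose that the conditions of Lemma 6.1 are satisfied,

$$\rho_{t,N}/\nu_{t,N} \longrightarrow 0, \quad \text{and} \quad \gamma_{t,N} \longrightarrow 0, \quad 1 \le t \le h - 1. \tag{6.21}$$

Then, for any $\lambda_{t,N}$ such that

$$\lambda_{t,N}/\nu_{t,N} \longrightarrow 0 \quad \text{and} \quad \rho_{t,N}/\lambda_{t,N} \longrightarrow 0, \quad 1 \le t \le h - 1, \tag{6.22}$$

we have $P(\hat{l}_1 = l_{01}, \dots, \hat{l}_h = l_{0h}) \to 1$.

Note. Unlike $\hat{M}$ in Subsection 6.1.1 (see discussion above Theorem 6.3), here $\hat{l}_t$ does not depend on $\hat{l}_{t'}$, $t'$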

refers to the fact that one is considering sampling from the finite population, which is where the randomness comes from. For example, the randomness of $y_{ij}$ in (3.28) is not due to $v_i$ and $e_{ij}$ being random variables from normal distributions, but rather to $y_{ij}$ being randomly sampled, without replacement, from the $i$th finite subpopulation.

Before we discuss the design-based approximation, let us first review a connection between the Bayes factor (BF) and BIC. Let $y_s = (y_i)_{1 \le i \le n}$ denote a vector of i.i.d. samples, whose distribution belongs to a family of probability distributions parameterized by $\psi = (\phi, \theta)$ with $\dim(\psi) = m$ and $\dim(\phi) = m_0$. Consider testing the hypothesis

$$M_0 : \theta = \theta_0 \quad \text{versus} \quad M_a : \theta \in R^{m - m_0}. \tag{6.23}$$

The BF is defined as the ratio of the a posteriori and a priori odds in favor of a larger model, $M$:

$$\mathrm{BF} = \frac{P(M|y_s)}{P(M_0|y_s)} \left\{\frac{P(M)}{P(M_0)}\right\}^{-1} = \frac{\int p(y_s|\phi, \theta)\pi(\phi, \theta)\,d\phi\,d\theta}{\int p(y_s|\phi, \theta_0)\pi_0(\phi)\,d\phi}, \tag{6.24}$$

where $\pi(\phi, \theta)$ and $\pi_0(\phi)$ are the joint prior for $\phi$ and $\theta$ and the marginal prior for $\phi$, respectively. The BF has been used in hypothesis testing and model selection. However, the calculation of the BF requires a full specification of the prior distributions under both $M_0$ and $M$. Alternatively, one may use a suitable approximation to the logarithm of the BF, and one popular approximation is the BIC [Schwarz (1978)], given by

$$S = \lambda - \frac{m - m_0}{2}\log(n), \tag{6.25}$$

where $\lambda = l(\hat{\phi}, \hat{\theta}) - l_0(\hat{\phi}_0, \theta_0)$ is the logarithm of the likelihood ratio; namely, $l(\hat{\phi}, \hat{\theta})$ is the log-likelihood evaluated at the MLE, $\hat{\psi} = (\hat{\phi}, \hat{\theta})$, of $\psi$, and $l_0(\hat{\phi}_0, \theta_0)$ is the log-likelihood evaluated at the MLE of $\phi$ under $M_0$, $\hat{\phi}_0$. Approximation (6.25) is based on the Laplace approximation [e.g., Jiang (2007), sec. 3.5.1]. For a suitable choice of the prior, it can be shown [e.g., Kass and Wasserman (1995)] that

$$S = \log(\mathrm{BF}) + O_P(n^{-1/2}). \tag{6.26}$$

Although the derivation of $S$ is from a hypothesis testing point of view, it can be viewed as, at least, a simple problem of model selection, that is, the choice between two models, $M$ and $M_0$. In this regard, the BIC criterion, which is equivalent to minimizing $-S$, is consistent in the sense that $S$ goes to $\infty$ ($-\infty$) if $M$ ($M_0$) is true.

On the other hand, the BIC criterion in the above is derived under the infinite-population assumption. Fabrizi and Lahiri (2013) proposed two

approaches to adapt the BIC to the finite-population setting. The first approach is an estimator of the finite-population BIC. The authors showed that, if all of the units of the finite population were observed, the BIC based on all of the observations in the population would be given by

$$S_{\mathrm{pop}}(y_U) = \frac{N}{2}\bar{y}_U^2 - \frac{1}{2}\log(N), \tag{6.27}$$

where $N$ is the population size, and $y_U = (y_1, \dots, y_N)$ is the population. (6.27) is called the finite population BIC. Of course, one cannot compute $S_{\mathrm{pop}}(y_U)$ because $\bar{y}_U$ is unknown. Let $\hat{\bar{y}}_U$ be a design-consistent estimator of $\bar{y}_U$. This means that, as the sample size, $n$, goes to $\infty$, $\hat{\bar{y}}_U$ converges to $\bar{y}_U$ in the probability that is induced by the sampling design. It should be noted that, here, the sampling is without replacement. Then, because $n \le N$, $N$ must also go to $\infty$ as $n \to \infty$. Replacing $\bar{y}_U$ in (6.27) by $\hat{\bar{y}}_U$, one obtains a naive model selection criterion:

$$S_{\mathrm{plugin}}(y_s) = \frac{N}{2}(\hat{\bar{y}}_U)^2 - \frac{1}{2}\log(N), \tag{6.28}$$

where $s$ represents the sampled indexes and $y_s = (y_i)_{i \in s}$. (6.28) is called naive because it may not work even under simple random sampling with replacement. To see this, note that, if $n$ is large, one would expect (6.28) to be very close to the BIC obtained under i.i.d. sampling from a normal population,

$$S_{\mathrm{iid}}(y_s) = \frac{n}{2}\bar{y}_s^2 - \frac{1}{2}\log(n), \tag{6.29}$$

where $\bar{y}_s$ is the sample mean. It is known that $\bar{y}_s$ is a design-consistent estimator of $\bar{y}_U$. However, if we let $\hat{\bar{y}}_U = \bar{y}_s$ in (6.28), we have

$$S_{\mathrm{plugin}}(y_s) - S_{\mathrm{iid}}(y_s) = \frac{N - n}{2}\bar{y}_s^2 - \frac{1}{2}\log\left(\frac{N}{n}\right). \tag{6.30}$$

Now suppose that $n = \rho N$, where $\rho \in (0, 1)$. Then, the right side of (6.30) goes to $\infty$ as $N$ (and $n$) goes to $\infty$. This implies that (6.28) provides stronger evidence against $M_0$ than (6.29), which is known to be a consistent model selection criterion. The failure of (6.28) is due to the fact that (6.28) approximates (6.27), the BIC that would apply if all of the units in the finite population were observed, which corresponds to $N$, not $n$, in the expression. This makes the disagreement between the data and the null hypothesis look larger than it actually is.

Because, here, we are dealing with a finite population, it is more reasonable to use the BIC based on the exact likelihood for the sample, or sample likelihood. The latter can be derived using a super-population model (see

Subsection 5.4.2) for the finite population and the underlying sampling design. However, survey populations usually have complex structures; as a result, misspecification of the assumed model is quite likely [e.g., Kott (1991)]. The following simple example shows that, when a model misspecification takes place, there could be serious consequences for model selection.

Example 6.3. Consider the one-way random effects model of Example 3.1. Suppose that the super-population satisfies the one-way random effects model. For simplicity, suppose that $\mathrm{var}(y_{ij}) = \sigma^2 + \tau^2 = 1$; thus, we have $\tau^2 = 1 - \sigma^2$. Suppose that the finite population has size $N$, and is divided into $M$ clusters of size $N_c$. A sample of $m$ clusters is selected by simple random sampling (without replacement), and all of the units in the selected clusters are selected. Thus, the total sample size is $n = mN_c$. It can be shown that the MLE of $\mu$, based on the sample likelihood, is $\bar{y}_s$, the sample mean. Furthermore, the BIC based on the sample likelihood is given by

$$S_s(y_s) = \frac{n}{2} \cdot \frac{\bar{y}_s^2}{1 + (N_c - 1)\sigma^2} - \frac{1}{2}\log(n). \tag{6.31}$$

It is clear that, when $N_c = 1$, (6.31) reduces to (6.29). In fact, the latter is the appropriate BIC when there is no clustering, that is, the cluster size $N_c = 1$; or when there is no cluster effect, that is, $\sigma^2 = 0$. However, when $N_c > 1$ and $\sigma^2 > 0$, the difference between the two BICs is

$$S_{\mathrm{iid}}(y_s) - S_s(y_s) = \frac{n}{2} \cdot \frac{(N_c - 1)\sigma^2}{1 + (N_c - 1)\sigma^2}\,\bar{y}_s^2,$$

which goes to $\infty$ when $m \to \infty$ (hence $N \to \infty$) while $N_c > 1$. This means that, if one neglects the clustering in the population, resulting in a model misspecification, one is likely to reject the null hypothesis more often than one should.

In order to make the BIC more robust to model misspecification, Fabrizi and Lahiri (2013) proposed a robust design-based approximation to the BIC. The procedure was presented in a simple case described below.

Let $y_U$ be a realization from an underlying super-population distribution, $\xi$, characterized by a parameter, $\theta$. We are interested in testing

$$M_0 : \theta = \theta_0 \quad \text{versus} \quad M_a : \theta \neq \theta_0. \tag{6.32}$$

In this special case, by (6.25), the BIC is given by $S = \lambda - (1/2)\log(n)$, where $\lambda = l(\hat{\theta}_{\mathrm{ML}}) - l(\theta_0)$ is the logarithm of the likelihood ratio. The parameter $\theta$ is estimated by solving the likelihood equation,

$$f(y_U, \theta) = 0. \tag{6.33}$$
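The BIC expressions of Example 6.3 and of the preceding discussion, namely (6.28), (6.29) and (6.31), are simple enough to evaluate numerically. The Python sketch below is only an illustration: the function names and the numerical inputs ($\bar{y}_s = 0.2$, $n = 100$, $N = 1000$, $N_c = 10$, $\sigma^2 = 0.3$) are assumptions chosen for the example and do not come from Fabrizi and Lahiri (2013).

```python
import math


def bic_iid(ybar, n):
    """S_iid of (6.29): BIC under i.i.d. sampling, sample mean ybar, size n."""
    return 0.5 * n * ybar ** 2 - 0.5 * math.log(n)


def bic_plugin(ybar_hat, N):
    """Naive plug-in criterion S_plugin of (6.28), using population size N."""
    return 0.5 * N * ybar_hat ** 2 - 0.5 * math.log(N)


def bic_cluster(ybar, n, cluster_size, sigma2):
    """S_s of (6.31): sample-likelihood BIC under cluster sampling."""
    inflation = 1.0 + (cluster_size - 1) * sigma2
    return 0.5 * n * ybar ** 2 / inflation - 0.5 * math.log(n)


# Hypothetical inputs, for illustration only.
ybar, n, N, Nc, sigma2 = 0.2, 100, 1000, 10, 0.3

# Gap (6.30): the plug-in criterion overstates the evidence against M0.
gap_plugin = bic_plugin(ybar, N) - bic_iid(ybar, n)

# Gap between ignoring and accounting for clustering (Example 6.3).
gap_cluster = bic_iid(ybar, n) - bic_cluster(ybar, n, Nc, sigma2)

print(round(gap_plugin, 4), round(gap_cluster, 4))
```

Both gaps are positive for these inputs, and both grow linearly in the sample size when $\bar{y}_s^2$ stays away from zero, which is the mechanism behind the over-rejection described above.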

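If $y_U$ were observed, the solution to (6.33), denoted by $\hat{\theta}_{\mathrm{ML}} = T(y_U)$, would be an estimator of $\theta$, known as the corresponding descriptive population quantity (CDPQ) of $\theta$. Because $y_U$ is not entirely observed, we estimate $T(y_U)$ by a design-based estimator, $\hat{\theta} = \hat{T}(z_s)$. For example, $\hat{\theta}$ may be obtained using a quasi-likelihood approach as discussed in Section 3.2. We then consider the following approximation to $S$:

$$S_{\mathrm{DB}} = \frac{1}{2}W_{\mathrm{DB}} - \frac{1}{2}\log(n), \tag{6.34}$$

where $W_{\mathrm{DB}} = (\hat{\theta} - \theta_0)^2 / \hat{V}_D(\hat{\theta})$ and $\hat{V}_D(\hat{\theta})$ is a consistent estimator of $V_D(\hat{\theta})$, the variance of $\hat{\theta}$ under the randomization distribution. Fabrizi and Lahiri (2013) noted that the $n$ in (6.34) is supposed to be the "effective" sample size, so some adjustment may be needed. But, typically, such an adjustment leads to a difference that is of lower order than the leading term in (6.34). So, asymptotically, (6.34) still provides the right approximation. For example, if the effective sample size is $n^* = Cn$, where $C$ is any constant, then $\log(n^*) = \log(n) + \log(C)$, where $\log(C)$ is another constant. To further justify the approximation, we have the following theorem. The proof can be found in Fabrizi and Lahiri (2013).

Theorem 6.5. Suppose that the following regularity conditions hold:
i) $\hat{\theta}_{\mathrm{ML}} - \theta_0 = O_{\xi}(n^{-1/2})$ under model $M_a$, where $O_{\xi}$ means $O_P$ (e.g., Jiang 2010, sec. 3.4) with respect to the super-population distribution $\xi$;
ii) $l(\cdot)$ is twice differentiable with $-l''(\hat{\theta}_{\mathrm{ML}}) = I(\theta_0) + O_{\xi}(n^{-1/2})$, where $I(\theta_0) = -E\left(\partial^2 l / \partial\theta^2\right)\big|_{\theta_0}$ is the Fisher information evaluated at $\theta_0$;
iii) $\hat{\theta}_{\mathrm{ML}} = \hat{\theta} + o_{\xi D}(n^{-1/2})$, where $o_{\xi D}$ denotes $o_P$ (e.g., Jiang 2010, sec. 3.4) with respect to the combined model/randomization distribution;
iv) $\hat{V}_D(\hat{\theta}) = I^{-1}(\theta_0) + o_{\xi D}(n^{-1})$.
Then, we have $S_{\mathrm{DB}} - S = o_{\xi D}(n^{-1/2})$.

Example 6.4. Fabrizi and Lahiri (2013) carried out a simulation study under a setting in which $K = 200$ clusters, each of size $N_c = 10$, were generated under the following model:

$$y_{ij} \overset{\mathrm{ind}}{\sim} \mathrm{Bernoulli}(\pi_i), \qquad \pi_i \overset{\mathrm{ind}}{\sim} \mathrm{Beta}\left(\frac{\mu}{\gamma}, \frac{1 - \mu}{\gamma}\right),$$

$i = 1, \dots, M$, $j = 1, \dots, N_c$. The above model implies that the marginal proportion of the observation is $\mu$, and the intra-cluster correlation is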

$\gamma/(1 + \gamma)$ (Exercise 6.6). Simple random sampling (without replacement) of 3 clusters (case I) or 6 clusters (case II) is used in selecting the clusters; once the clusters are chosen, all of the units in the selected clusters are sampled. The authors compared three BICs in testing the hypothesis $H_0 : \mu = 0.25$ versus $H_1 : \mu \neq 0.25$. These are the exact BIC, $S_E$, based on the exact sample likelihood; a naive BIC, $S_N$, that ignores the sampling design; and the design-based BIC, $S_{\mathrm{DB}}$, introduced above. In addition, the authors considered two different values of $\gamma$: $\gamma = 0.3$ and $\gamma = 1.0$. These values correspond to intra-cluster correlation coefficients of 0.25 and 0.5, respectively. The results, based on 1000 simulated runs, showed that the performance of $S_{\mathrm{DB}}$ is much closer to that of $S_E$, which is considered the gold standard, than that of $S_N$ in terms of both the size and the powers of the test. Here, the powers were considered under the alternatives $\mu = 0.5$, 0.6, 0.75 and 0.9, respectively. See Fabrizi and Lahiri (2013) for the details.

6.1.4 A robust conditional AIC for LMM

Consider a hybrid of the non-Gaussian mixed ANOVA model, introduced in Subsection 3.1.1, and the longitudinal model, introduced in Subsection 3.1.2. The model can be expressed as (3.1) with

$$Z_i \alpha_i = \sum_{r=1}^{s} Z_{ir} \alpha_{ir}, \tag{6.35}$$

where $Z_{ir}$ is an $n_i \times q_r$ known design matrix, with $n_i = \dim(y_i)$, and $\alpha_{ir}$ is a $q_r \times 1$ vector of random effects, $1 \le r \le s$. It is assumed that $\alpha_{ir}$, $1 \le i \le m$, $1 \le r \le s$, are independent with $E(\alpha_{ir}) = 0$ and $\mathrm{Var}(\alpha_{ir}) = \sigma_r^2 I_{q_r}$, $1 \le r \le s$. Furthermore, it is assumed that $\epsilon_1, \dots, \epsilon_m$ are independent with $E(\epsilon_i) = 0$, $\mathrm{Var}(\epsilon_i) = \tau^2 I_{n_i}$, $1 \le i \le m$, and are independent of the $\alpha$'s. Let $\psi = (\tau^2, \sigma_1^2, \dots, \sigma_s^2)'$.

Under the normality assumption, let $f(y|\beta, \psi)$ denote the marginal likelihood function. Then, the marginal AIC is given by

$$\mathrm{mAIC} = -2\log\{f(y|\hat{\beta}, \hat{\psi})\} + 2(p + s + 1), \tag{6.36}$$

where $\hat{\beta}, \hat{\psi}$ are the MLEs of $\beta, \psi$, respectively, and $p = \dim(\beta)$. Vaida and Blanchard (2005) noted that the marginal AIC is inappropriate when selection of the random effects is of interest. They proposed a conditional AIC in the form of

$$\mathrm{cAIC} = -2\log\{f_{y|\alpha}(y|\hat{\beta}, \hat{\alpha}, \hat{\psi})\} + 2\{\mathrm{tr}(H) + 1\}, \tag{6.37}$$

where $\hat{\beta}, \hat{\psi}$ are the REML estimators of $\beta, \psi$, respectively, and $\hat{\alpha}$ is the EBLUP based on the REML estimators. Furthermore, $H$ corresponds to a matrix that maps the observed vector, $y$, into the fitted vector $\hat{y} = X\hat{\beta} + Z\hat{\alpha}$, that is, $\hat{y} = Hy$. It should be noted that (6.37) is an approximate version of the cAIC proposed by Vaida and Blanchard (2005); the authors also derived an exact version, which is an unbiased estimator of a conditional Akaike information, defined by the authors as an extension of the original Akaike information [Akaike (1973)]. Consider, for example, a linear mixed model that can be expressed as (3.1). First assume that $G_i = \mathrm{Var}(\alpha_i)$ is known, and $R_i = \mathrm{Var}(\epsilon_i) = \tau^2 I_{n_i}$, where $\tau^2$ is an unknown variance. If the REML method is used to estimate the model parameters, the exact version of cAIC is given by (6.37) with $\mathrm{tr}(H) + 1$ replaced by

$$K_{\mathrm{REML}} = \frac{(N - p - 1)(\rho + 1) + p + 1}{N - p - 2}, \tag{6.38}$$

where $N$ is the total sample size, $p = \mathrm{rank}(X)$ with $X = (X_i)_{1 \le i \le m}$, and $\rho = \mathrm{tr}(H)$. If the ML method is used to estimate the parameters, the expression of the exact cAIC is the same except that $K_{\mathrm{REML}}$ is replaced by $K_{\mathrm{ML}} = \{N/(N - p)\}K_{\mathrm{REML}}$.

Note that (6.37) is in the general form of (6.1), where the second term corresponds to a penalty for model complexity. To see the difference between cAIC and AIC [Akaike (1973)] in terms of the penalty, note that the terms in AIC corresponding to $K_{\mathrm{REML}}$ and $K_{\mathrm{ML}}$ are the same, namely $p + 1$, assuming, again, that $G_i$ is known and $R_i = \tau^2 I_{n_i}$. There is also a finite-sample version of AIC [Vaida and Blanchard (2005)], in which case the penalty terms are given by $(N - p)(N - p - 2)^{-1}(p + 1)$ for REML, and $N(N - p - 2)^{-1}(p + 1)$ for ML.

When the covariance matrices of the random effects involve additional unknown parameters, the form of cAIC becomes more complicated. We use an example to illustrate.

Example 6.5. Consider a special case of Example 3.1 with $\mu = 0$, $a = m$, and $b_i = n$, $1 \le i \le m$. In this case, the cAIC is given by (6.37) with $\mathrm{tr}(H) + 1$ replaced by an estimator of a bias-correction term, BC. Furthermore, BC has the asymptotic expansion

$$\mathrm{BC} = \rho + \frac{2}{1 + n\gamma} + o(n^{-1}),$$

where $\gamma = \sigma^2/\tau^2$. Thus, the estimator of BC is given by $\widehat{\mathrm{BC}} = \hat{\rho} + 2/(1 + n\hat{\gamma})$, where $\hat{\rho}, \hat{\gamma}$ are the ML estimators of $\rho, \gamma$, respectively.
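To make the penalty terms of this subsection concrete, the following Python sketch evaluates $K_{\mathrm{REML}}$ in (6.38), $K_{\mathrm{ML}} = \{N/(N-p)\}K_{\mathrm{REML}}$, and the finite-sample AIC terms quoted above. The function names and the inputs ($N = 50$, $p = 2$, $\rho = \mathrm{tr}(H) = 5$) are hypothetical and serve only to show how the exact terms compare with the approximate term $\mathrm{tr}(H) + 1$ in (6.37).

```python
def k_reml(N, p, rho):
    """Exact cAIC term (6.38) under REML; rho = tr(H), p = rank(X)."""
    return ((N - p - 1) * (rho + 1) + p + 1) / (N - p - 2)


def k_ml(N, p, rho):
    """Exact cAIC term under ML: {N / (N - p)} * K_REML."""
    return N / (N - p) * k_reml(N, p, rho)


def aic_finite_sample_term(N, p, method="REML"):
    """Finite-sample AIC term: (N-p)(p+1)/(N-p-2) for REML, N(p+1)/(N-p-2) for ML."""
    factor = (N - p) if method == "REML" else N
    return factor * (p + 1) / (N - p - 2)


# Hypothetical inputs, for illustration only.
N, p, rho = 50, 2, 5.0

print(k_reml(N, p, rho), k_ml(N, p, rho), rho + 1)
print(aic_finite_sample_term(N, p, "REML"), aic_finite_sample_term(N, p, "ML"), p + 1)
```

For fixed $p$ and $\rho$, both exact terms approach $\rho + 1 = \mathrm{tr}(H) + 1$ as $N$ grows, which is consistent with (6.37) being described as an approximate version.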

6.2 The fence methods

Although the information criteria are broadly used, difficulties are often encountered, especially in some non-conventional situations. We discuss a number of such cases below.

1. The effective sample size. In many cases, the effective sample size, $n$, is not the same as the number of data points. This often happens when the data are correlated. Take a look at two extreme cases. In the first case, the observations are independent; therefore, the effective sample size should be the same as the number of observations. In the second case, the data are so strongly correlated that all of the data points are identical. In this case, the effective sample size is 1, regardless of the number of data points. A practical situation may be somewhere between these two extreme cases, such as cases of mixed effects models, which makes the effective sample size difficult to determine.

2. The dimension of a model. Not only the effective sample size, but also the dimension of a model, $|M|$, can cause difficulties. In some cases, such as ordinary linear regression, this is simply the number of parameters under $M$, but in other situations, where nonlinear, adaptive models are fitted, this can be substantially different. Ye (1998) developed the concept of generalized degrees of freedom (gdf) to track model complexity. For example, in the case of multivariate adaptive regression splines [Friedman (1991)], $k$ nonlinear terms can have an effect of approximately $3k$ degrees of freedom. While a general algorithm in its essence, the gdf approach requires significant computations. It is not at all clear how a plug-in of gdf for $|M|$ in (6.1) affects the selection performance of the criterion.

3. Unknown distribution. In many cases, the distribution of the data is not fully specified (up to a number of unknown parameters); as a result, the likelihood function is not available. For example, suppose that normality is not assumed under a linear mixed model (e.g., Chapter 3). Then, the likelihood function is typically not available. Now suppose that one wishes to select the fixed covariates using AIC, BIC, or HQ. It is not clear how to do this because the first term on the right side of (6.1) is not available. Of course, one could still blindly use those criteria, pretending that the data are normal, but the criteria are then no longer what they are meant to be. For example, Akaike's bias approximation that led to the AIC [Akaike (1973)] is no longer valid. Furthermore, it is not clear how robust the performance of these criteria is to misspecification of the distribution (e.g., normal).

4. Finite-sample performance, and the effect of a constant. Even in the conventional situation, there are still practical issues regarding the use of these criteria. For example, the BIC is known to have the tendency of overly penalizing "bigger" models. In other words, the penalizer, $\lambda_n = \log n$, may be a little too much in some cases. In such a case, one may wish to replace the penalizer by $c\log(n)$, where $c$ is a constant less than one. The question is: what $c$? Asymptotically, the choice of $c$ does not make a difference in terms of consistency of model selection, so long as $c > 0$. However, practically, it does. As another example, comparing BIC with HQ, the penalizer in HQ is lighter in its order, that is, $\log n$ for BIC and $c\log\log n$ for HQ, where $c > 2$ is a constant. However, if $n = 100$, we have $\log n \approx 4.6$ and $\log\log n \approx 1.5$; hence, if $c$ is chosen as 3, BIC and HQ are almost the same.

In fact, there have been a number of modifications of the BIC aiming at improving the finite-sample performance. For example, Broman and Speed (2002) proposed a $\delta$-BIC method by replacing the $\lambda_n = \log n$ in BIC by $\delta\log n$, where $\delta$ is a constant carefully chosen to optimize the finite-sample performance. However, the choice of $\delta$ relies on extensive Monte-Carlo simulations, is case-by-case, and, in particular, depends on the sample size. Therefore, it is not easy to generalize the $\delta$-BIC method.

5. Criterion of optimality. Strictly speaking, model selection is hardly a purely statistical problem; it is usually associated with a problem of practical interest. Therefore, it seems a bit unnatural to let the criterion of optimality in model selection be determined by purely statistical considerations, such as the likelihood and K-L information. Other considerations, such as scientific and economic concerns, need to be taken into account. For example, what if the optimal model selected by the AIC is not in the best interest of a practitioner, say, an economist? In the latter case, can the economist change one of the selected variables, and do so "legitimately"? Furthermore, the dimension of a model, $|M|$, is used to balance the model complexity through (6.1). However, the minimum-dimension criterion, also known as parsimony, is not always as important. For example, the criterion of optimality may be quite different if prediction is of main interest.

Concerns such as the above led to the development of a new class of strategies for model selection, known as the fence methods, first introduced in Jiang et al. (2008). Also see Jiang et al. (2009). The idea consists of a procedure to isolate a subgroup of what are known as correct models (those within the fence) via the inequality

$$Q(M) - Q(\tilde{M}) \le c, \tag{6.39}$$

where $Q(M)$ is the measure of lack-of-fit for model $M$, $\tilde{M}$ is a "baseline

model” that has the minimum Q, and c is a cut-off. The optimal model is then selected from the models within the fence according to a criterion of optimality that can be flexible; in particular, the criterion can incorporate the problem of practical interest. Furthermore, the choice of the measure of lack-of-fit, $\hat{Q}$, is also flexible and can incorporate the problem of interest.

To see how the fence helps to resolve the difficulties of the information criteria, note that the (effective) sample size is not used in the fence procedure, although the cut-off c, when chosen adaptively (see below), may implicitly depend on the effective sample size. Depending on the criterion of optimality for selecting the optimal model within the fence, the dimension of the model may be involved, but the criterion does allow flexibility. Also, the measure of lack-of-fit, Q, does not have to be the negative log-likelihood, as in the information criteria. For example, the residual sum of squares (RSS) is often used as Q, which does not require complete specification of the distribution. Furthermore, a data-driven approach is introduced in Section 6.2.1 for choosing the cut-off or tuning constant, c, that optimizes the finite-sample performance. Finally, the criterion of optimality for selecting the model within the fence can incorporate practical interests.

It should be noted that, as far as consistency is concerned, which is, by far, the most important theoretical property for a model selection procedure, the basic underlying assumptions for the fence are the same as those for the traditional model selection approaches, such as the information criteria and cross-validation [CV; e.g., Shao (1993)]. For the most part, it is assumed that the space of candidate models is finite, which contains a true model, and that the sample size goes to infinity while the model space remains the same.

There is a simple numerical procedure, known as the fence algorithm, which applies when model simplicity is used as the criterion to select the model within the fence. Given the cut-off c in (6.39), the algorithm may be described as follows: Check the candidate models, from the simplest to the most complex. Once one has discovered a model that falls within the fence and checked all the other models of the same simplicity (for membership within the fence), one stops. One immediate implication of the fence algorithm is that one does not need to evaluate all the candidate models in order to identify the optimal one. This leads to potential computational savings. A software package, The Fence Package, is available at https://cran.r-project.org/package=fence

Several variations of the fence will be discussed below. We refer to a
monograph, Jiang and Nguyen (2015), for further details.

6.2.1 Adaptive fence

Finite-sample performance of the fence depends heavily on the choice of the cut-off, or tuning parameter, c in (6.39). In a way, this is similar to one of the difficulties with the information criteria noted in the previous section. Jiang et al. (2008) came up with an idea, known as adaptive fence (AF), to let the data “speak” on how to choose this cut-off. Let $\mathcal{M}$ denote the set of candidate models. To be more specific, assume that the minimum-dimension criterion is used in selecting the models within the fence. Furthermore, assume that there is a correct model in $\mathcal{M}$ as well as a full model, $M_f$, so that every model in $\mathcal{M}$ is a submodel of $M_f$. It follows that $\tilde{M} = M_f$ in (6.39). First note that, ideally, one wishes to select c that maximizes the probability of choosing the optimal model, here defined as a correct model that has the minimum dimension among all of the correct models. This means that one wishes to choose c that maximizes
$$P = P(M_c = M_{\rm opt}), \tag{6.40}$$
where $M_{\rm opt}$ represents the optimal model, and $M_c$ is the model selected by the fence (6.39) with the given c. However, two things are unknown in (6.40): (i) under what distribution should the probability P be computed? and (ii) what is $M_{\rm opt}$?

To solve problem (i), note that the assumptions above on $\mathcal{M}$ imply that $M_f$ is a correct model. Therefore, it is possible to bootstrap under $M_f$. For example, one may estimate the parameters under $M_f$, then use a model-based (or parametric) bootstrap to draw samples under $M_f$. This allows us to approximate the probability P on the right side of (6.40).

To solve problem (ii), we use the idea of maximum likelihood. Namely, let $p^*(M) = P^*(M_c = M)$, where $M \in \mathcal{M}$ and $P^*$ denotes the empirical probability obtained by the bootstrapping. In other words, $p^*(M)$ is the sample proportion of times out of the total number of bootstrap samples that model M is selected by the fence with the given c. Let $p^* = \max_{M \in \mathcal{M}} p^*(M)$. Note that $p^*$ depends on c. The idea is to choose c that maximizes $p^*$. It should be kept in mind that the maximization is not without restriction. To see this, let $M_*$ denote a model in $\mathcal{M}$ that has the minimum dimension. Note that if $c = 0$, then $p^* = 1$ because the procedure always chooses $M_f$. Similarly, $p^* = 1$ for very large c, if $M_*$ is unique (because, when c is large enough, every $M \in \mathcal{M}$ is in the fence; hence, the
procedure always chooses $M_*$). Therefore, what one looks for is “a peak in the middle” of the plot of $p^*$ against c. See the upper-left plot of Figure 6.1 for an illustration, where $c = c_n$. The highest peak in the middle of the plot (corresponding to approximately $c = 9$) gives the optimal choice.

[Fig. 6.1 A few plots of $p^*$ against $c_n = c$ (four panels; vertical axis $p^*$, horizontal axis $c_n$).]

Here is another look at the AF. Typically, the optimal model is the model from which the data are generated; this model should then be the most likely given the data. Thus, given c, one is looking for the model (using the fence) that is most supported by the data or, in other words, one that has the highest posterior probability. The latter is estimated by bootstrapping. One then pulls off the c that maximizes the posterior probability.
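
The fence algorithm and the adaptive choice of the cut-off described above can be sketched in a few lines. The following is a minimal illustration, not the implementation of the fence R package: it assumes an ordinary linear regression with normal errors, takes Q to be the RSS, uses the full model as the baseline, and applies the minimum-dimension criterion within the fence. The function names (`rss`, `fence_select`, `adaptive_fence`, `pick_peak`) and the simple "highest interior local maximum" rule for locating the peak are hypothetical choices made for this sketch.

```python
import itertools
from collections import Counter

import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit on an intercept plus columns `cols`."""
    Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return float(resid @ resid)


def fence_select(X, y, c):
    """Fence algorithm with the minimum-dimension criterion (assumed setup: Q = RSS).

    Models are checked from the simplest to the most complex; the search stops at
    the first dimension that has at least one model inside the fence
    Q(M) - Q(M_full) <= c, and ties are broken by the smallest Q.
    """
    p = X.shape[1]
    q_full = rss(X, y, tuple(range(p)))  # baseline: the full model
    for k in range(p + 1):
        inside = []
        for cols in itertools.combinations(range(p), k):
            q = rss(X, y, cols)
            if q - q_full <= c:
                inside.append((q, cols))
        if inside:
            return min(inside)[1]
    return tuple(range(p))  # only reached if c < 0


def adaptive_fence(X, y, c_grid, n_boot=100, seed=0):
    """Return p*(c) for each c in c_grid, using a parametric bootstrap under the full model."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Zf = np.column_stack([np.ones(n), X])
    beta_f, *_ = np.linalg.lstsq(Zf, y, rcond=None)
    fitted = Zf @ beta_f
    resid = y - fitted
    sigma = np.sqrt(resid @ resid / (n - p - 1))
    # The same bootstrap samples are reused for every value of c.
    boots = [fitted + sigma * rng.standard_normal(n) for _ in range(n_boot)]
    p_star = []
    for c in c_grid:
        counts = Counter(fence_select(X, yb, c) for yb in boots)
        p_star.append(max(counts.values()) / n_boot)
    return np.array(p_star)


def pick_peak(p_star):
    """Index of the highest interior local maximum of p*, or None if there is no peak."""
    best = None
    for i in range(1, len(p_star) - 1):
        left, mid, right = p_star[i - 1], p_star[i], p_star[i + 1]
        if mid >= left and mid >= right and (mid > left or mid > right):
            if best is None or mid > p_star[best]:
                best = i
    return best
```

A typical call would evaluate `adaptive_fence(X, y, c_grid)` on an increasing grid of cut-offs, take `i = pick_peak(p_star)`, and, when `i` is not `None`, report `fence_select(X, y, c_grid[i])` as the selected model. The endpoints of the grid are excluded from the peak search because, as noted above, $p^* = 1$ trivially at $c = 0$ and for very large c.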

Note: In the original paper of Jiang et al. (2008), the fence inequality (6.39) was presented with the c on the right side replaced by $c\hat{\sigma}_{M,\tilde{M}}$, where $\hat{\sigma}_{M,\tilde{M}}$ is an estimated standard deviation of the left side. Although, in some special cases, such as when Q is the negative log-likelihood, $\hat{\sigma}_{M,\tilde{M}}$ is easy to obtain, the computation of $\hat{\sigma}_{M,\tilde{M}}$, in general, can be time-consuming. This is especially the case for the AF, which calls for repeated computation of the fence under the bootstrap samples. Jiang et al. (2009) proposed to merge the factor $\hat{\sigma}_{M,\tilde{M}}$ with the tuning constant c, which leads to (6.39), and use the AF idea to choose the tuning constant adaptively. The latter authors called this modification simplified adaptive fence, and showed that it enjoys similarly impressive finite-sample performance as the original AF.

Recall the AF looks to find a peak in the middle of the plot of $p^*$ against c. Sometimes, there may not be a peak in the middle, or there may be multiple peaks. Figure 6.1 shows a few other possible patterns. See, for example, Jiang et al. (2015b) for discussion on how to deal with such cases.

6.2.2 Invisible fence

Another variation of the fence that is intended for high-dimensional selection problems is invisible fence [Jiang et al. (2011b)]. A critical assumption in Jiang et al. (2008) is that there exists a correct model among the candidate models. Although the assumption is necessary in establishing consistency of the fence, it limits the scope of applications because, in practice, a correct model simply may not exist, or exist but not among the candidate models. We can extend the fence by dropping this assumption.

Note that the measure of lack-of-fit, Q in (6.39), typically has the expression $Q(M) = \inf_{\theta_M \in \Theta_M} Q(M, \theta_M; y)$ for some measure Q. A vector $\theta_M^* \in \Theta_M$ is called an optimal parameter vector under M with respect to Q if it minimizes $E\{Q(M, \theta_M; Y)\}$, that is,
$$E\{Q(M, \theta_M^*; Y)\} = \inf_{\theta_M \in \Theta_M} E\{Q(M, \theta_M; Y)\} \equiv Q(M), \tag{6.41}$$
where the expectation is with respect to the true distribution of Y (which may be unknown, but not model-dependent). A correct model with respect to Q is a model $M \in \mathcal{M}$ such that
$$Q(M) = \inf_{M' \in \mathcal{M}} Q(M'). \tag{6.42}$$
When M is a correct model with respect to Q, the corresponding $\theta_M^*$ is called a true parameter vector under M with respect to Q. Note that here a correct model is defined as a model that provides the best approximation,
or best fit to the data (note that Q typically corresponds to a measure of lack-of-fit), which is not necessarily a correct model in the traditional sense. However, the above definitions are extensions of the traditional concepts in model selection [e.g., Jiang et al. (2008)]. The main difference is that, in the latter reference, the measure Q must satisfy a minimum requirement that $E\{Q(M, \theta_M; y)\}$ is minimized when M is a correct model, and $\theta_M$ a true parameter vector under M. With the extended definition, the minimum condition is no longer needed, because it is automatically satisfied.

Let us now take another look at the fence. To be specific, assume that the minimum-dimension criterion is used to select the optimal model within the fence. In case there are ties (i.e., two models within the fence, both with the minimum dimension), the model with the minimum dimension and minimum $Q(M)$ will be chosen. Recall the cut-off c in (6.39). As it turns out, whatever one does in choosing the cut-off (adaptively or otherwise), only a fixed small subset of models have nonzero chance to be selected; in other words, the majority of the candidate models do not even have a chance. We illustrate this with an example. Suppose that the maximum dimension of the candidate models is 3. Let $M_j^\dagger$ be the model with dimension j such that $c_j = Q(M_j^\dagger)$ minimizes $\hat{Q}(M)$ among all models with dimension j, $j = 0, 1, 2, 3$. Note that $c_3 \le c_2 \le c_1 \le c_0$; assume no equality holds for simplicity. The point is that any $c \ge c_0$ does not make a difference in terms of the final model selected by the fence, which is $M_0^\dagger$. Similarly, any $c_1 \le c$

find a model that has the highest empirical probability to best fit the data, this is the model we select. Although the new procedure might look quite different from the fence, it actually uses implicitly the principle of the AF. For such a reason, the procedure is called invisible fence, or IF.

Computation is a major concern in high dimension problems. For example, for the 522 gene pathways developed by Subramanian et al. (2005), at dimension $k = 2$ there are 135,981 different $\hat{Q}(M)$’s to be evaluated; at dimension $k = 3$ there are 23,570,040 different $\hat{Q}(M)$’s to be evaluated, .... If one has to consider all possible k’s, the total number of evaluations is $2^{522}$, an astronomical number. Jiang et al. (2011b) proposed the following strategy, called the fast algorithm, to meet the computational challenge. Consider the situation where there are a (large) number of candidate elements (e.g., gene-sets, variables), denoted by $1, \dots, m$, such that each candidate model corresponds to a subset of the candidate elements. A measure Q is said to be subtractive if it can be expressed as
$$Q(M) = s - \sum_{i \in M} s_i, \tag{6.43}$$
where $s_i$, $i = 1, \dots, m$ are some nonnegative quantities computed from the data, M is a subset of $1, \dots, m$, and s is some quantity computed from the data that does not depend on M. Typically we have $s = \sum_{i=1}^m s_i$, but the definition does not impose such a restriction. For example, in gene-set analysis [GSA; e.g., Efron and Tibshirani (2007)], $s_i$ corresponds to the gene-set score for the ith gene-set. As another example, Mou (2012) considered $s_i = |\hat{\beta}_i|$, where $\hat{\beta}_i$ is the estimate of the coefficient for the ith candidate variable under the full model, in selecting the fixed covariates.

For a subtractive measure, the models that minimize $Q(M)$ at different dimensions are found almost immediately. Let $r_1, r_2, \dots, r_m$ be the ranking of the candidate elements in terms of decreasing $s_i$. Then, the model that minimizes $Q(M)$ at dimension one is $r_1$; the model that minimizes $Q(M)$ at dimension two is $\{r_1, r_2\}$; the model that minimizes $Q(M)$ at dimension three is $\{r_1, r_2, r_3\}$, and so on (Exercise 6.8).

Jiang et al. (2011b) implemented the IF with the fast algorithm, for which a natural choice of $s_i$ is the gene-set score, as noted, and showed that IF significantly outperforms GSA in simulation studies. The authors also showed that IF has a nice theoretical property, called signal-consistency, which GSA does not have. Consider, for example, the problem of selecting the fixed covariates in a linear mixed model. Instead of letting the sample size increase (in a certain way), one may consider letting the absolute values of the fixed effects, called the signals, increase. A model selection procedure
is called signal-consistent if the probability of selecting the optimal model goes to one as the signals go to infinity. In a way, signal-consistency is equivalent to consistency in the traditional sense.

6.2.3 Model selection with incomplete data

The missing-data problem has a long history [e.g., Little and Rubin (2002)]. While there is an extensive literature on statistical analysis with missing or incomplete data, the literature on model selection in the presence of missing data is relatively sparse. See Jiang et al. (2015a) for a review of literature on model selection with incomplete data. Existing model selection procedures face special challenges when confronted with missing or incomplete data. Obviously, the naive complete-data-only strategy is inefficient, sometimes even unacceptable to practitioners due to the overwhelmingly wasted information. For example, in a study of backcross experiments [e.g., Lander and Botstein (1989)], a data set was obtained by researchers at UC-Riverside (personal communications; see Zhan et al. (2011)). Out of the 150 or so subjects, only 4 have a complete data record. Situations like this are, unfortunately, the reality that we often have to deal with.

Verbeke et al. (2008) offered a review of formal and informal model selection strategies with incomplete data, but the focus is on model comparison, instead of model selection. As noted by Ibrahim et al. (2008), while model comparisons “demonstrate the effect of assumptions on estimates and tests, they do not indicate which modeling strategy is best, nor do they specifically address model selection for a given class of models”. The latter authors further proposed a class of model selection criteria based on the output of the E-M algorithm. Jiang et al. (2015a) point out a potential drawback of the E-M approach of Ibrahim et al. (2008) in that the conditional expectation in the E-step is taken under the assumed (candidate) model, rather than an objective (true) model. Note that the complete-data log-likelihood is also based on the assumed model. Thus, by taking the conditional expectation, again, under the assumed model, it may bring false supporting evidence for an incorrect model. Similar problems have been noted in the literature, which are sometimes referred to as “double-dipping” [e.g., Copas and Eguchi (2005)].

On the other hand, the AF idea works naturally with the incomplete data. For the simplicity of illustration, let us assume, for now, that the candidate models include a correct model as well as a full model. It then follows that the full model is, at least, a correct model, even though it
may not be the most efficient one. Thus, in the presence of missing data, we can run the E-M [Dempster et al. (1977)] to obtain the MLE of the parameters, under the full model. Note that, here, we do not have the double-dipping problem, because the conditional expectation (under the full model) is “objective”. Once the parameter estimates are obtained, we can use the model-based (or parametric) bootstrap to draw samples, under the full model, as in the AF. The best part of this strategy is that, when one draws the bootstrap samples, one draws samples of complete data, rather than data with the missing values. Therefore, one can apply any existing model selection procedure that is built for the complete-data situation, such as the fence, to the bootstrap sample. Suppose that B bootstrap samples are drawn. The model with the highest frequency of being selected (as the optimal model), out of the bootstrap samples, is the (final) optimal model. We call this procedure the EMAF algorithm due to its similarity to the AF idea. One can extend the EMAF idea to situations where a correct model may not exist, or exists but not among the candidates. In such a case, the bootstrap samples may be drawn under a model $\tilde{M}$, which is the model with the minimum Q in the sense of the second paragraph of §6.2.2.

6.2.4 Examples

1. Fay-Herriot model. The Fay-Herriot model was introduced in Example 5.1. Jiang et al. (2008) reported results of a simulation study, in which AF was compared with several other non-adaptive choices of the cut-off, c in (6.39). The measure Q was taken as the negative log-likelihood, and the sample size was $n = m = 30$. The candidate predictors are $x_1, \dots, x_5$, generated from the N(0, 1) distribution, and then fixed throughout the simulations. The candidate models included all possible models with at least an intercept. Five cases were considered, in which the data y were generated under the model $y = \sum_{j=1}^5 \beta_j x_j + v + e$, where $\beta' = (\beta_1, \dots, \beta_5) = (1, 0, 0, 0, 0)$, $(1, 2, 0, 0, 0)$, $(1, 2, 3, 0, 0)$, $(1, 2, 3, 2, 0)$ and $(1, 2, 3, 2, 3)$, denoted by Model 1, 2, 3, 4, 5, respectively. The authors considered the simple case $D_i = 1$, $1 \le i \le n$. The true value of A is 1 in all cases. The number of bootstrap samples for the evaluation of the $p^*$’s is 100. In addition to the AF, five different non-adaptive choices of $c = c_n$ were considered, which satisfy the consistency requirements given in Jiang et al. (2008), namely, that $c_n \to \infty$ and $c_n/n \to 0$ in this case. The results are presented in Table 6.1, which reports the percentage of times, out of the 100 simulations, that the optimal model was selected by each
method. It is seen that the performances of the fence with $c = \log(n)$, $\sqrt{n}$ or $n/\log(n)$ are fairly close to that of the AF. Of course, in any particular case one might get lucky to find a good c value, but one cannot be lucky all the time. Regardless, the AF always seems to pick up the optimal value, or something close to the optimal value of c in terms of the finite-sample performance.

Table 6.1 Fence with different choice of c

                            Optimal Model
                        1     2     3     4     5
Adaptive c            100   100   100    99   100
c = log log(n)         52    63    70    83   100
c = log(n)             96    98    99    96   100
c = sqrt(n)           100   100   100   100   100
c = n/log(n)          100    91    95    90   100
c = n/log log(n)      100     0     0     0     6

2. Gene set analysis. Efron and Tibshirani (2007) carried out an empirical study, in which the authors simulated 1000 genes and 50 samples in each of 2 classes, control and treatment. The genes were evenly divided into 50 gene-sets, with 20 genes in each gene-set. The data matrix was originally generated independently from the N(0, 1) distribution, then the treatment effect was added according to one of the following five scenarios:
1. All 20 genes of gene-set 1 are 0.2 units higher in class 2.
2. The first 15 genes of gene-set 1 are 0.3 units higher in class 2.
3. The first 10 genes of gene-set 1 are 0.4 units higher in class 2.
4. The first 5 genes of gene-set 1 are 0.6 units higher in class 2.
5. The first 10 genes of gene-set 1 are 0.4 units higher in class 2, and the second 10 genes of gene-set 1 are 0.4 units lower in class 2.

Jiang et al. (2011b) considered the same five scenarios in a simulation study. In Efron and Tibshirani’s study only the first gene-set is of potential interest. Jiang et al. (2011b) expanded the one-gene-set case to a two-gene-set case, in which they duplicated the five scenarios to the second gene-set. Also, in Efron and Tibshirani’s study the genes were simulated independently. Jiang et al. (2011b) considered, in addition to the independent case (ρ = 0), a case where the genes are correlated with equal correlation coefficient ρ = 0.3. The correlation is generated by associating with each microarray a random effect. The genes on the same microarray are then correlated for sharing the same random effect. Let $x_{ij}$ be the (i, j) element of the data matrix, X, where i represents the gene and j the microarray, $i = 1, \dots, 1000$, $j = 1, \dots, 100$. Here $j = 1, \dots, 50$ correspond to the controls and $j = 51, \dots, 100$ the treatments. Then, we have
$$x_{ij} = \alpha_j + \epsilon_{ij}, \tag{6.44}$$
where the $\alpha_j$’s and $\epsilon_{ij}$’s are independent random effects and errors that are distributed as $N(0, \rho)$ and $N(0, 1 - \rho)$, respectively. It follows that each $x_{ij}$ is distributed as N(0, 1), and ${\rm cor}(x_{ij}, x_{i'j}) = \rho$, $i \ne i'$. The treatment effects are then added to the right side of (6.44) for $j = 51, \dots, 100$ and genes i in the given gene-set(s), as above.

We now compare the performance of IF with that of GSA in gene-set identification. In Efron and Tibshirani’s simulation study, the authors showed that the maxmean has the best overall performance as compared with other methods, including the mean, the absolute mean, GSEA (Gene Set Enrichment Analysis; Subramanian et al. (2005)) and the GSEA version of the absolute mean. Therefore, the comparisons focus on the best performer of GSA, that is, the maxmean. In addition to the one-gene-set and two-gene-set cases, each with the five scenarios listed above, the simulation comparisons also include the case where no gene-set is potentially interesting, that is, no treatment effect is added to any gene-set. This is what we call the null scenario. For GSA one needs to choose the FDR as well as the number of permutation samples for the test of significance of gene-sets. For IF, on the other hand, one also needs to specify the level of significance as well as the number of permutation samples for the test for no gene-set. The FDR and level of significance are both chosen as α = 0.05. The number of permutations for both GSA and IF is 200 [which is the number that Efron and Tibshirani (2007) used for their simulations].

The first comparison is on the probability of correct identification, or true-positive (TP). For IF this means that the gene-sets selected match exactly those to which the treatment effects are added, which we call true gene-sets; similarly, for GSA this means that the gene-sets that are found significant are exactly those true gene-sets. Table 6.2 reports the empirical probability of TP based on 100 simulation runs. For example, for the Null Scenario, One-Gene-Set case, with ρ = 0, the numbers mean that for 95 out of the 100 simulation runs, IF selected no (0) gene-sets; while for 59 of the 100 simulation runs, GSA found no (0) gene-sets. As another example, for Scenario 2, Two-Gene-Set case, with ρ = 0.3, IF selected the exact two gene-sets, to which the treatment effects are added, for 97 out of the 100 simulation runs; while GSA found the exact two gene-sets for 66 out of the 100 simulation runs. Note that these are results of same-data comparisons, that is, for each simulation run, the results for both methods are based on the same simulated data. Also reported (in the parentheses) are empirical probabilities of overfit (OF, in the sense that the identified gene-sets include all the true gene-sets plus some false discoveries) and underfit (UF, in the
sense that at least one of the true gene-sets is not discovered). It appears that IF has better performance than GSA in terms of TP uniformly across all the cases and scenarios. While most of the losses for IF are due to UF, OF appears to be the major problem for GSA. Furthermore, both methods appear to be fairly robust against correlations between genes.

Table 6.2 IF vs GSA - Empirical Probabilities (in %) of TP (OF, UF)

                            ρ = 0                        ρ = 0.3
Scenario  Method   1-Gene-Set   2-Gene-Set     1-Gene-Set   2-Gene-Set
Null      IF       95 (5, 0)    95 (5, 0)      64 (36, 0)   64 (36, 0)
          GSA      59 (41, 0)   59 (41, 0)     52 (48, 0)   52 (48, 0)
1         IF       80 (6, 14)   68 (1, 31)     80 (16, 4)   88 (4, 8)
          GSA      53 (37, 10)  53 (25, 22)    61 (36, 3)   62 (30, 8)
2         IF       88 (5, 7)    88 (0, 12)     88 (12, 0)   97 (2, 1)
          GSA      67 (32, 1)   65 (26, 9)     65 (35, 0)   66 (32, 2)
3         IF       87 (5, 8)    84 (0, 16)     83 (14, 3)   96 (2, 2)
          GSA      66 (31, 3)   68 (24, 8)     66 (33, 1)   69 (27, 4)
4         IF       73 (6, 21)   63 (2, 35)     75 (15, 10)  80 (7, 13)
          GSA      64 (28, 8)   57 (19, 24)    66 (29, 5)   63 (21, 16)
5         IF       87 (6, 7)    84 (0, 16)     91 (9, 0)    99 (0, 1)
          GSA      70 (30, 0)   76 (17, 7)     82 (18, 0)   86 (12, 2)

The next comparison focuses on the signal-consistency properties of both methods (see the last paragraph of Subsection 6.2.2). As noted, traditionally, consistency in model identification (including parameter estimation and model selection) involves sample size going to infinity. Such an assumption, however, is not very realistic in gene-set analysis, because the sample size n usually is much smaller than the number of genes under consideration. Therefore, signal-consistency is considered. A gene-set identification procedure is signal-consistent if its probability of TP goes to one as the treatment effects, or signals, increase to infinity. Of course, one may not be able to increase the signals in real life, but the point is to see if a procedure works perfectly well in the “ideal situation”, just like consistency in the traditional sense. To investigate the signal-consistency property of IF and GSA, one of the cases, namely, the two-gene-set case of Scenario 5, was expanded by increasing the treatment effects in two different ways. First, one increases the signals in a balanced manner, that is, the signals increase at the same pace for both gene-sets. Next, one lets the signals increase in an unbalanced manner, so that the pace is much faster for the first gene-set than for the second one. Table 6.3 reports the empirical probabilities of TP based on 100 simulation runs. Here the signals are expressed in the form of (a, b, c, d), where the values a, b, c, d are added to the right side
of (6.44) for $51 \le j \le 100$ and 1st 10 genes of gene-set one, 2nd 10 genes of gene-set one, 1st 10 genes of gene-set two, and 2nd 10 genes of gene-set two, respectively. Case 1 is taken from the bottom two rows of Table 6.2 (two-gene-set case), which serves as a baseline. Then, one can see what happens when the signals increase. In cases 1–3, where the signals increase in the balanced way, both IF and GSA seem to work perfectly well as both methods show signs of signal-consistency. However, in cases 1, 4–10, where the signals increase in the unbalanced way, the empirical probability drops, and eventually falls apart for GSA, even with increasing signals. On the other hand, IF still shines in this situation, having perfect empirical probabilities of TP. It is interesting to know what happens to GSA in the latest situation. Jiang et al. (2011b) noted that this was due to the restandardization method used in Efron and Tibshirani (2007) (Exercise 6.9).

Table 6.3 IF vs GSA - Empirical Probabilities (in %) of TP with Increasing Signals

                                     ρ = 0         ρ = 0.3
Case #  Signals                    IF    GSA     IF    GSA
1       (0.4, -0.4, 0.4, -0.4)     84     76     99     86
2       (0.5, -0.5, 0.5, -0.5)    100     88    100     97
3       (1.0, -1.0, 1.0, -1.0)    100    100    100    100
4       (1.0, -1.0, 0.5, -0.5)    100     97    100     99
5       (1.5, -1.5, 0.5, -0.5)    100     88    100     88
6       (2.0, -2.0, 0.5, -0.5)    100     64    100     56
7       (2.5, -2.5, 0.5, -0.5)    100     26    100     23
8       (3.0, -3.0, 0.5, -0.5)    100     10    100      3
9       (3.5, -3.5, 0.5, -0.5)    100      2    100      0
10      (4.0, -4.0, 0.5, -0.5)    100      0    100      0

In introducing the GSA method, Efron and Tibshirani (2007) considered a situation where the same treatment effect is added to all the gene-sets. In other words, all the gene-sets are equally differentially expressed (d.e.). The authors used this example to make the point for the need of restandardization. The claim is that, in this case, there is “nothing special about any one gene-set”. While the claim is arguable from a practical point of view, it would be interesting to see how the two methods, IF and GSA, work in a situation like this. Jiang et al. (2011b) simulated data according to Scenario 1 above, except that the 0.2 units are added to all the gene-sets. If, as Efron and Tibshirani claimed, there is nothing special about any gene-set, one expects a procedure to identify no (zero) gene-set in this case. According to the results based on 100 simulation runs, when ρ = 0, the empirical probabilities
of identifying zero gene-set is 50% for IF and 47% for GSA; when ρ = 0.3, the corresponding empirical probabilities are 58% for IF and 54% for GSA. So, in the latest comparison, the two methods performed similarly with IF doing slightly better.

3. Backcross experiment: A real-data example. Recall the data set obtained by the UC-Riverside researchers mentioned in Subsection 6.2.3. The gene expression data were originally published by Luo et al. (2007). The phenotypic values of eight quantitative traits of barley were published by Hayes et al. (1993). Detailed description of the experiment can be found in the latter reference, which involved 150 double haploid (DH) lines derived from the cross of two spring barley varieties, Morex and Steptoe. The DH lines are considered as the subjects here. In all there were 495 SNP markers on seven chromosomes that are under investigation. As mentioned, there are significant missing values in the data so that only 4 of the 150 subjects have complete genotype records. On the other hand, there are no missing values in the phenotypic data. We use this data set to illustrate a variation of the EMAF algorithm, described in Subsection 6.2.3, with the AF replaced by IF, introduced in Subsection 6.2.2. The new algorithm is thus called EMIF.

Following Broman and Speed (2002), we have a conditional linear regression model for the phenotype variable, Y, such that, given the marker indicators, x, we have $Y_i = \sum_{k=1}^r \sum_{j \in M_k} \beta_{jk} x_{ijk} + \epsilon_i$, where r is the number of chromosomes, $M_k$ is a subset of $\{1, \dots, q\}$ and q is the number of markers on each chromosome, and $\epsilon_i$ is a normal error, with mean zero and unknown variance $\sigma^2$. The $\epsilon_i$’s are uncorrelated and also independent of the $X_{ijk}$’s. Furthermore, the marker indicators, $X_{ijk}$, are assumed to be a Markov chain within each chromosome with $P(X_{i1k} = 0) = P(X_{i1k} = 1) = 1/2$ (Mendel’s rule) and $P(X_{i,j+1,k} = 1 | X_{ijk} = 0) = P(X_{i,j+1,k} = 0 | X_{ijk} = 1) = \theta$, where θ is the recombination fraction. The problem of interest is to identify the subset $M = (M_1, \dots, M_r)$, which is viewed as a model selection problem as in Broman and Speed (2002). However, the high-dimensional nature of the data presents a problem for the direct application of the EMIF, because the total number of markers (495) is much larger than the sample size (n = 150). More specifically, the least squares (LS) fit is unfeasible when the number of predictors is larger than the sample size. To overcome this difficulty, Jiang et al. (2015a) used the following idea of conditional modeling, described under a more general setting.

Suppose that, conditional on $X = (x_i)_{1 \le i \le n}$, one has a linear regression $Y = X\beta + \epsilon$, where $Y = (Y_i)_{1 \le i \le n}$ are the observations, and $\epsilon = (\epsilon_i)_{1 \le i \le n}$
are the errors such that the components of $\epsilon$ are independent with mean 0, and $\epsilon$ is independent of X. Furthermore, suppose that $X = [X_{(1)}\ X_{(2)}]$ with $X_{(r)} = (X_{ir})_{1 \le i \le n}$, $r = 1, 2$ such that $X_{(1)}, X_{(2)}$ are independent [e.g., Broman and Speed (2002)]. Then, it is easy to show that $X_{(1)}$ is independent of $[X_{(2)}, \epsilon]$. Note that we can express the regression model as $Y = X_{(1)}\beta_1 + X_{(2)}\beta_2 + \epsilon$. Without loss of generality, we assume that $X_{(1)}\beta_1$ does not involve an intercept [which, if it exists, belongs to $X_{(2)}\beta_2$]. Now suppose that $X_{i2}$, $i = 1, \dots, n$ are independent, and that $E(X_{i2})$ does not depend on i. Then, $E(X_{i2}\beta_2 + \epsilon_i) = E(X_{i2})\beta_2$ is a constant, say, $\beta_0$. Let $e_i = X_{i2}\beta_2 + \epsilon_i - \beta_0$. It is easy to show that $e_i$, $i = 1, \dots, n$ are independent with $E(e_i) = 0$, and $Y = [1_n\ X_{(1)}](\beta_0, \beta_1')' + e$, e being independent of $[1_n\ X_{(1)}]$. In other words, conditional on $X_{(1)}$, we, once again, have a standard linear regression model (i.e., the errors are independent with mean zero, and independent of the predictors). The point is that $X_{(1)}$ can be of much lower dimension than X.

For the barley cross data, we can let $X_{(1)}$ correspond to markers on any particular chromosome. The numbers of markers on the 7 chromosomes are 60, 78, 81, 60, 93, 56 and 67, respectively, all of which are smaller than the sample size 150. Within each chromosome, we apply the EMIF in conjunction with the IF (see Subsection 6.2.2). The number of bootstrap samples is B = 100. It is known that, for high-dimensional data, the IF may suffer from the so-called dominant factor effect [Jiang et al. (2011b), sec. 3.3]. For the most part, this means that the IF frequency (i.e., the empirical probability of the most frequently selected model; see Subsection 2.2) tends to be in favor of a lower dimensional model than the true model, if the “signals” are relatively weak due to the limited sample size. This problem is dealt with naturally by the EMIF. First we apply the IF, under the full model, that is, all the markers on a given chromosome, to obtain the IF frequencies at different dimensions, say, $p_1^*, p_2^*, \dots, p_q^*$, where $p_j^*$ is the IF frequency at dimension j, and q is the total number of markers, for the chromosome. If the frequencies show a “peak”, that is, there is a $1 < j < q$ such that $p_j^* > p_{j-1}^*$ and $p_j^* > p_{j+1}^*$, the EMIF shall continue; otherwise, we conclude that there is no more than one QTL on the chromosome. In the latter case, the highest IF frequency must take place at the boundary, that is, either at dimension one or at the highest dimension corresponding to all the markers on the chromosome. However, it is unlikely that all the markers are QTLs; therefore, dimension one is chosen, and the EMIF stops.

If the frequency plot shows a “peak”, and therefore the EMIF is to continue, we first look for the last peak, that is, the highest dimension that
corresponds to a peak in order to be conservative. This is similar to the AF (Subsection 6.2.1), where the first significant peak is chosen in order to determine the cut-off for the fence [e.g., Jiang et al. (2009), Jiang and Nguyen (2015)]. The first peak for the AF corresponds to the last peak for the IF. The markers corresponding to the last peak are selected, the current model is updated, and the updated model is treated as the (new) full model for the next step of iteration. The procedure is repeated until either the updated model is identical to the current model, or no peak is found during the current step; in both cases, the current model is chosen as the final model. For the latter case, when no peak is found, we choose the highest dimension, instead of dimension one as above in the initial step. This is because, at this stage, we have already determined that there is more than one QTL on the chromosome (the EMIF would not have continued otherwise); furthermore, the highest dimension possibly has been updated, so it no longer corresponds to all of the markers on the chromosome.

Table 6.4 EMIF Results for Grain Protein

Chromosome   Marker ID#             Chromosome   Marker ID#
1            12, 13                 5            280, 285, 332, 333
2            65, 66                 6            379, 380
3            184, 186, 199, 200     7            467, 470
4            176

The results for the grain protein phenotype are presented in Table 6.4. The results show some consistency with the findings of Zhan et al. (2011). For example, the latter authors found that chromosomes 2, 3, 5 “seem to control more genes than other chromosomes”. According to our results, those three chromosomes contain nearly 60% of all the QTLs found. In particular, chromosomes 3 and 5 are the top two according to the number of QTLs found. It should be noted that the number of QTLs found on a chromosome is not the only thing that represents the relative importance of the chromosome; the magnitude of the QTL effect is also important. In this application, however, our focus is identification of the QTLs, rather than estimation of the QTL effects.

6.3 Shrinkage mixed model selection

There has been some recent work on joint selection of the fixed and random effects in mixed effects models. Bondell et al. (2010) considered such a selection problem in a certain type of linear mixed models, which can be
expressed as
$$y_i = X_i\beta + Z_i\alpha_i + \epsilon_i, \quad i = 1, \dots, m, \tag{6.45}$$
where $y_i$ is an $n_i \times 1$ vector of responses for subject i, $X_i$ is an $n_i \times p$ matrix of explanatory variables, β is a $p \times 1$ vector of regression coefficients (the fixed effects), $Z_i$ is an $n_i \times q$ known design matrix, $\alpha_i$ is a $q \times 1$ vector of subject-specific random effects, $\epsilon_i$ is an $n_i \times 1$ vector of errors, and m is the number of subjects. It is assumed that the $\alpha_i, \epsilon_i$, $i = 1, \dots, m$ are independent with $\alpha_i \sim N(0, \sigma^2\Psi)$ and $\epsilon_i \sim N(0, \sigma^2 I_{n_i})$, where Ψ is an unknown covariance matrix. The problem of interest, using the terms of shrinkage model selection, is to identify the nonzero components of β and $\alpha_i$, $1 \le i \le m$. For example, the components of $\alpha_i$ may include a random intercept and some random slopes.

6.3.1 An E-M based approach

To take advantage of the idea of shrinkage variable selection, Bondell et al. (2010) adopted a modified Cholesky decomposition. Note that the covariance matrix, Ψ, can be expressed as $\Psi = D\Omega\Omega'D$, where $D = {\rm diag}(d_1, \dots, d_q)$ and $\Omega = (\omega_{kj})_{1 \le k, j \le q}$ is a lower triangular matrix with 1’s on the diagonal. Thus, one can express (6.45) as
$$y_i = X_i\beta + Z_i D\Omega\xi_i + \epsilon_i, \quad i = 1, \dots, m, \tag{6.46}$$
where the $\xi_i$’s are independent $N(0, \sigma^2)$ random variables. The idea is to apply shrinkage estimation to both $\beta_j$, $1 \le j \le p$ and $d_k$, $1 \le k \le q$. Note that setting $d_k = 0$ is equivalent to setting all of the elements in the kth column and kth row of Ψ to zero, and thus creating a new submatrix by deleting the corresponding row and column, or the exclusion of the kth component of $\alpha_i$. However, direct implementation of this idea is difficult, because the $\xi_i$’s are unobserved, even though their distribution is much simpler. To overcome this difficulty, Bondell et al. (2010) used the E-M algorithm [Dempster et al. (1977)]. By treating the $\xi_i$’s as observed, the complete-data log-likelihood can be expressed as (Exercise 6.10)
$$l_c = c_0 - \frac{N + mq}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\left(|y - X\beta - Z\tilde{D}\tilde{\Omega}\xi|^2 + |\xi|^2\right), \tag{6.47}$$
where $c_0$ is a constant, $N = \sum_{i=1}^m n_i$, $X = (X_i)_{1 \le i \le m}$, $Z = {\rm diag}(Z_1, \dots, Z_m)$, $\tilde{D} = I_m \otimes D$, $\tilde{\Omega} = I_m \otimes \Omega$ (⊗ means Kronecker product), $\xi = (\xi_i)_{1 \le i \le m}$, and $|\cdot|$ denotes the Euclidean norm. (6.47) leads to the
shrinkage estimation by minimizing
\[ P_c(\phi|y, \xi) = |y - X\beta - Z\tilde{D}\tilde{\Omega}\xi|^2 + \lambda\left(\sum_{j=1}^p \frac{|\beta_j|}{|\tilde{\beta}_j|} + \sum_{k=1}^q \frac{|d_k|}{|\tilde{d}_k|}\right), \tag{6.48} \]
where $\phi$ represents all of the parameters, including the $\beta$'s, the $d$'s, and the $\omega$'s, $\tilde{\beta} = (\tilde{\beta}_j)_{1 \le j \le p}$ is given by the right side of (5.4) with the variance components involved in $V = \mathrm{Var}(y)$ replaced by their REML estimators (see Section 3.2), $\tilde{d}_k$, $1 \le k \le q$ are obtained by decomposition of the estimated $\Psi$ via the REML, and $\lambda$ is the regularization parameter. Bondell et al. (2010) proposed to use the BIC in choosing the regularization parameter. Here the form of the $L_1$ penalty in (6.48) is in terms of the adaptive Lasso [Zou (2006)]. To incorporate with the E-M algorithm, one replaces (6.48) by its conditional expectation given $y$, and the current estimate of $\phi$. Note that only the first term on the right side of (6.48) involves $\xi$, with respect to which the conditional expectation is taken. The conditional expectation is then minimized with respect to $\phi$ to obtain the updated (shrinkage) estimate of $\phi$.

A similar approach was taken by Ibrahim et al. (2011) for joint selection of the fixed and random effects in GLMMs [see Section 2.4], although the performance of the proposed method was studied only for the special case of linear mixed models.

As in Subsection 6.2.1, one may use the AF idea to derive a data-driven approach to selection of the regularization parameters in shrinkage mixed model selection, such as the $\lambda$ in (6.48). Below we consider selection problems from a different perspective.

6.3.2 Predictive shrinkage selection

Suppose that the purpose of the joint selection is for predicting some mixed effects. We can incorporate this into the model selection criterion [see discussion in the paragraph below the one containing (6.39)]. We consider a predictive measure of lack-of-fit developed in Section 5.2, and incorporate it with the shrinkage idea of Bondell et al. (2010) and Ibrahim et al. (2011). Namely, we replace the first term on the right side of (6.48) by the measure,
\[ Q(\phi|y) = (y - X\beta)'\Gamma'\Gamma(y - X\beta) - 2\,\mathrm{tr}(\Gamma'\Sigma), \tag{6.49} \]
which can be derived in a similar way as (5.10) [or (5.20); Exercise 6.11]. The regularization parameter, $\lambda$, may be chosen using a similar procedure as in Subsection 6.2.1. We refer to this method as predictive shrinkage

selection, or PSS. The idea has been explored by Hu et al. (2015). The authors found that PSS performs better than the shrinkage selection method based on (6.48) not only in terms of the predictive performance, but also in terms of parsimony. The latter refers to the classical criterion of selecting a correct model with the minimum number of parameters. The authors also extended the PSS to Poisson mixed models, a special case of GLMM, and obtained similar results.

There is another advantage of the PSS in terms of computation. Denote the first term on the right side of (6.48) by $Q_c(\phi|y, \xi)$. Note that, unlike $Q_c(\phi|y, \xi)$, the unobserved $\xi$ is not involved in $Q(\phi|y)$ defined by (6.49), which is what the PSS is based on. This means that, unlike Bondell et al. (2010) and Ibrahim et al. (2011), PSS does not need to run the E-M algorithm, and thus is computationally (much) more efficient. We illustrate this with a real-data example.

6.3.3 Real-data example: Analysis of high-speed network

Efficient data access is essential for sharing massive amounts of data among geographically distributed research collaborators. The analysis of network traffic is becoming more and more important for utilizing the limited resources offered by the network infrastructures and for wisely planning large data transfers. The latter can be improved by learning the current conditions and accurately predicting future network performance. Short-term prediction of network traffic guides the immediate scientific data placements for network users; long-term forecast of the network traffic enables capacity-planning of the network infrastructure needs for network designers. Such predictions become non-trivial when the amount of network data grows in unprecedented speed and volumes. One such available data source is NetFlow [Cisco Systems (1966)].

The NetFlow measurements provide high volume, abundant specific information for each data flow; some sample records are shown in Table 6.5 (with IP addresses masked for privacy). Each record contains the following list of variables:

Start, End  The start and end time of the recorded data transfer.
Sif, Dif  The source and destination interface assigned automatically for the transfer.
SrcIPaddress, DstIPaddress  The source and destination IP addresses of the transfer.
SrcP, DstP  The source and destination Port chosen based on the transfer type such as email, FTP, SSH, etc.

Table 6.5 Sample NetFlow Records.
Start              End                Sif  SrcIPaddress (masked)  SrcP   Dif  DstIPaddress (masked)  DstP   P  Fl  Pkts  Octets
0930.23:59:37.920  0930.23:59:37.925  179  xxx.xxx.xxx.xxx        62362  175  xxx.xxx.xxx.xxx        22364  6  0   1     52
0930.23:59:38.345  0930.23:59:39.051  179  xxx.xxx.xxx.xxx        62362  175  xxx.xxx.xxx.xxx        28335  6  0   4     208
1001.00:00:00.372  1001.00:00:00.372  179  xxx.xxx.xxx.xxx        62362  175  xxx.xxx.xxx.xxx        20492  6  0   2     104
0930.23:59:59.443  0930.23:59:59.443  179  xxx.xxx.xxx.xxx        62362  175  xxx.xxx.xxx.xxx        26649  6  0   1     52
1001.00:00:00.372  1001.00:00:00.372  179  xxx.xxx.xxx.xxx        62362  175  xxx.xxx.xxx.xxx        26915  6  0   1     52
1001.00:00:00.372  1001.00:00:00.372  179  xxx.xxx.xxx.xxx        62362  175  xxx.xxx.xxx.xxx        20886  6  0   2     104

P  The protocol chosen based on the general transfer type such as TCP, UDP, etc.
Fl  The flags, which measure the transfer error caused by the congestion in the network.
Pkts  The number of packets of the recorded data transfer.
Octets  The size of the transfer in bytes.

Features of NetFlow data have led to consideration of GLMMs (see Section 2.4) for predicting the network performance. First, a NetFlow record is composed of multiple time series with unevenly collected timestamps. Because of this feature, traditional time series methods such as the ARIMA model, wavelet analysis, and exponential smoothing [e.g., Fan and Yao (2003), sec. 1.3.5] encounter difficulties, because these methods are mainly designed for evenly collected timestamps and for dealing with a single time series. Thus, there is a need for modeling a large number of time series without constraints on even collection of timestamps. On the other hand, GLMM is able to fully utilize all of the variables involved in the dataset without requiring an evenly-spaced time variable.

Second, there is empirical evidence of associations, as well as heteroscedasticity, found in the NetFlow records. For example, with increasing number of packets in a data transfer, it takes longer, in general, to finish the transfer. This suggests that the number of packets may be considered as a (fixed) predictor for the duration of data transfer. Furthermore, there appears to be fluctuation among network paths in terms of slope and range in the plots of duration against the number of packets. See Fig. 1 of Hu

et al. (2015). This suggests that the network path for data transfer may be associated with a random effect to explain the duration under varying conditions. Thus, again, a mixed effects model seems to be plausible. Moreover, GLMM is more flexible than the linear mixed model in terms of the mean-variance association.

Third, NetFlow measurements are big data, with millions of observations for a single router within a day and 14 variables in each record, with interaction terms numbering in the 30s or 40s as candidates. The large volume and complexity of the data require efficient modeling. However, this is difficult to do with fixed effects modeling. For example, traditional hierarchical modeling requires dividing the data into groups, but the grouping is not clear, and requires investigation to identify the variable that classifies the observed data. Explorative data analysis shows that the grouping factor could be the path of the data transfer, the delivering time of the day, the transfer protocol used, or the combination of some or all of these. With so many uncertainties, one approach to simplifying the modeling is via the use of random effects [Jiang (2007)].

The NetFlow data used for the current analysis was provided by ESnet for the duration from May 1, 2013 to June 30, 2013. Considering the network users' interests, the established model should be able to predict the duration of a data transfer so that the users can expect how long it would take for the data transfer, given the size of their data, the start time of the transfer, selected path and protocols. Considering the network designers' interests, the established model should be able to predict the long-time usage of the network so that the designer will know which link in the network is usually congested and requires more bandwidth, or rerouting of the path.

In the following, we illustrate a model built for these interests, and compare its prediction accuracy with two traditional GLMM procedures: Backward-Forward selection (B-F) and Estimation-based Lasso (E-Lasso). The latter refers to Lasso for the penalized likelihood method, corresponding to (6.48) [Bondell et al. (2010)]. The full model predicts the transfer duration, assuming influences from the fixed effects including transfer start time, transfer size (Octets and Packets) and the random effects including network transfer conditions such as Flag and Protocol, source and destination Port numbers, and transfer path such as source and destination IP addresses and Interfaces. The PSS

Fig. 6.2 Fitted Smoothing Spline: Transfer Duration vs Start Time.

procedure, with the Lasso penalty, that is, (6.48) without the denominators, $|\tilde{\beta}_j|$ and $|\tilde{d}_j|$, $1 \le j \le p$, has selected the model
\[ y = \beta_{\mathrm{start}}\, s(x_{\mathrm{start}}) + \beta_{\mathrm{pkt}}\, x_{\mathrm{pkt}} + Z_{\mathrm{ip\text{-}path}}\, v_{\mathrm{ip\text{-}path}} + e, \tag{6.50} \]
where $s(\cdot)$ is a fitted smoothing spline implemented to take into account that the mean response is usually nonlinearly associated with the time variable, $x_{\mathrm{start}}$, with the parameters of the smoothing spline chosen automatically by cross-validation. The parameter estimates and their corresponding P-values for model (6.50) are given in Table 6.6. The fitted smoothing spline is plotted in Figure 6.2, which shows how the transfer duration varies with start time. Furthermore, in (6.50), $Z_{\mathrm{ip\text{-}path}}$ is the design matrix whose columns correspond to the ip-paths, $v_{\mathrm{ip\text{-}path}}$ is a vector-valued random effect whose components correspond to the ip-paths, and $e$ is an additional error corresponding to the background noise. The PSS has identified six paths with non-zero random-effect standard deviations, indexed as 14, 16, 38, 41, 61, and 83. Among those paths, all except path 61 have an estimated standard deviation of at least 10, while the estimated standard deviation for path 61 is almost zero. A plot is shown in Figure 6.3. The estimated standard deviation for the background noise is 11.239.
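The difference between the plain Lasso penalty used for (6.50) and the adaptive Lasso penalty of (6.48) can be illustrated with a small sketch. The function name `shrinkage_penalty`, its argument names, and the numerical inputs are illustrative assumptions, not taken from Hu et al. (2015) or Bondell et al. (2010); the code only evaluates the penalty term and does not perform any fitting.

```python
def shrinkage_penalty(beta, d, lam, beta_tilde=None, d_tilde=None):
    """Penalty term of (6.48): lam * (sum_j |beta_j|/|beta~_j| + sum_k |d_k|/|d~_k|).

    If the initial estimates beta_tilde / d_tilde are omitted, the
    denominators are taken to be 1, which gives the plain Lasso penalty.
    """
    if beta_tilde is None:
        beta_tilde = [1.0] * len(beta)
    if d_tilde is None:
        d_tilde = [1.0] * len(d)
    total = sum(abs(b) / abs(bt) for b, bt in zip(beta, beta_tilde))
    total += sum(abs(dk) / abs(dt) for dk, dt in zip(d, d_tilde))
    return lam * total


# Hypothetical values, chosen only to make the arithmetic easy to follow.
beta = [1.0, -2.0]
d = [0.5, 0.0]   # d_k = 0 excludes the k-th random effect
lasso = shrinkage_penalty(beta, d, lam=2.0)
adaptive = shrinkage_penalty(beta, d, lam=2.0,
                             beta_tilde=[2.0, 4.0], d_tilde=[1.0, 1.0])
print(lasso, adaptive)
```

With the adaptive weights, coefficients whose initial estimates are large are penalized less, which is the motivation for the denominators in (6.48).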

Fig. 6.3 Estimated Non-zero Standard Deviation.

Table 6.6 Estimates of Non-zero Fixed Effects.
Fixed Effects  Estimates  Standard Error  P-value
Intercept      -13.809    0.914           <2e-16
Start Time     0.574      0.0169          <2e-16
Packets        1.115      0.035           <2e-16

Table 6.7 Comparison of PSS, B-F, E-Lasso in Terms of MSPE and Computing Time.
                   E-Lasso       B-F              PSS
MSPE               2306          42230            127
Time (in seconds)  6.26 × 10^7   5.43 × 10^10     142

Regarding the comparison of PSS with B-F and E-Lasso, the overall MSPE as well as computation speed for the comparing methods are presented in Table 6.7. Note that, in this case, one actually knows the truth for the prediction, and therefore is able to compute the (exact) MSPE, and record the total computational time, of course. The results show that, in terms of prediction accuracy (MSPE), PSS is about 18 times better than E-Lasso, and 330 times better than B-F; in terms of computational time, PSS is about $4 \times 10^5$ times less than E-Lasso and $3.8 \times 10^8$ times less than B-F. In conclusion, at least for this application, PSS greatly improves the prediction accuracy that fits the interests of modeling noted earlier and, at the same time, provides an efficient, fast algorithm compared to the E-M based E-Lasso and the regression based B-F procedure.
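The ratios quoted above follow from simple arithmetic on the entries of Table 6.7. The short check below is only a sketch: the dictionary and function names are illustrative, and the numbers are the table entries as read from the text.

```python
# Entries of Table 6.7 (MSPE and computing time in seconds).
mspe = {"E-Lasso": 2306.0, "B-F": 42230.0, "PSS": 127.0}
time_sec = {"E-Lasso": 6.26e7, "B-F": 5.43e10, "PSS": 142.0}


def ratios_to_pss(table):
    """Ratio of each competing method's value to the PSS value."""
    return {name: value / table["PSS"]
            for name, value in table.items() if name != "PSS"}


mspe_ratio = ratios_to_pss(mspe)
time_ratio = ratios_to_pss(time_sec)

for name in ("E-Lasso", "B-F"):
    print(f"{name}: MSPE ratio {mspe_ratio[name]:.1f}, "
          f"time ratio {time_ratio[name]:.2e}")
```

The MSPE ratios come out close to 18 and 330, and the time ratios close to $4 \times 10^5$ and $3.8 \times 10^8$, in agreement with the figures stated in the text.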

6.4 Exercises

6.1. Show that, in Example 6.1, the conditions of Theorem 6.1 are satisfied with $\rho_N \sim n$ and $\nu_N \sim \omega_N \sim mn$.

6.2. Continue with the previous exercise. Show that, in this case, $\eta_N \sim mn$, where $\eta_N$ is defined above (6.10). Thus, the conditions of Theorem 6.2 are satisfied.

6.3. Prove Theorem 6.3.

6.4. Show that, in Example 6.2, the $Z_r$'s within each group are linearly independent.

6.5. Let $A$ and $B$ be matrices with the same number of rows such that $P_A P_B = P_B P_A$. Show that $P_{(A\ B)} \le P_A + P_B$.

6.6. In Example 6.4, it is understood that $y_{ij}$, $j = 1, \dots, N_c$ are conditionally independent given $\pi_i$. Show that
a. $E(y_{ij}) = \mu$;
b. $\mathrm{var}(y_{ij}) = \mu(1 - \mu)$; and
c. $\mathrm{cor}(y_{ij}, y_{ik}) = \gamma/(1 + \gamma)$, $1 \le j \ne k \le N_c$.

6.7. In Subsection 6.2.2, the paragraph below the one containing (6.42), an example is given on how many different models can be possibly selected by the fence. Extend the example to a general rule, assuming that the maximum dimension of the candidate models is $k$.

6.8. In Subsection 6.2.2, a statement is made in the paragraph below the one that contains (6.43) regarding the fast algorithm under a subtractive measure. Prove the statement.

6.9. Consider the GSA method mentioned in Example 2 of Subsection 6.2.4. Let $s_i$ denote the gene set score for gene set $i$, $1 \le i \le n$. Consider a simple case where each gene set consists of a single gene. Then, by restandardization, one subtracts the mean of the $s_i$'s from each $s_i$, then divides the difference by the standard deviation of the $s_i$'s. Can you

explain what would happen to the performance of GSA when one gene set is so dominant that all but one gene set scores are below the overall mean?

6.10. Show that, by treating the $\xi_i$'s in Section 6.3 as observed, the complete-data log-likelihood can be expressed as (6.47).

6.11. Derive the predictive measure of lack-of-fit (6.49) [Hint: this is similar to the derivation of (5.10)].


Chapter 7

Other Topics

There are other types, or aspects, of robust mixed model analysis that, so far, have not been systematically covered. However, these topics are relatively unrelated, or have some unique features of their own. Also, the literature covering each one of these topics is not as extensive as the ones covered in the previous chapters. Due to such considerations, we have put together this final chapter, with a collection of several topics. These include mixed model diagnostics, nonparametric and semiparametric methods, Bayesian analysis, outliers, benchmarking, and more about prediction. It should be noted that some of these topics may be associated, in a certain way, with topics covered in the previous chapters; yet, there are obvious reasons that justify their inclusion in the current chapter. As we have been doing all along, the focus is on robustness, or implication to robustness.

7.1 Mixed model diagnostics

In a way, the first topic is closely related to the previous chapter. Model checking, or model diagnostics, is sometimes the first step of a modeling process in practice; some other times, it is the last step. In fact, quite often the process involves back-and-forth practices of model selection and diagnostics. At each step, a model is selected from amongst a set of candidate models; the selected model is then checked using some model diagnostic techniques (see below). If the diagnostics do not find any problem with the selected model, the latter will be used, and the process ends; otherwise, some further consideration will be implemented, either to the candidate models or to the selection procedure, and the cycle is repeated.

McCullagh and Nelder (1989) (ch. 12) categorize model diagnostic tools as informal and formal. Informal checking is mainly based on diagnostic plots, while formal checking mostly relies on goodness-of-fit tests. Since the latter have been discussed earlier in Subsection 4.4, the present section will focus on informal checking. An important feature of mixed effects models is the presence of random effects; however, the latter are unobservable. Thus, in addition to the standard tools for informal checking [e.g., McCullagh and Nelder (1989), ch. 12], special attention has been paid to checking the assumptions about the random effects.

Several authors have used the idea of EBLUP or empirical Bayes estimators (EB), discussed in the previous chapter, for diagnosing distributional assumptions regarding the random effects [e.g., Dempster and Ryan (1985); Calvin and Sedransk (1991)]. The approach is reasonable because the EBLUP or EB are natural estimators of the random effects. In the following we first describe a method proposed by Lange and Ryan (1989) based on a similar idea.

A commonly used assumption regarding the random effects, and errors, is that they are normally distributed. If such an assumption holds, one has a case of Gaussian mixed models; otherwise, one is dealing with a non-Gaussian linear mixed model (see Chapter 3). Lange and Ryan considered the longitudinal model (see Subsection 3.1.2), assuming that $G_i = G$, $R_i = \tau^2 I_{k_i}$, $i = 1, \dots, m$, and developed a weighted normal plot for assessing normality of the random effects in such a model. First, under the model (3.1), and normality, one can derive the BPs, or Bayes estimators, of the random effects $\alpha_i$, $i = 1, \dots, m$ (see Section 5.1), assuming that $\beta$ and $\gamma$, the vector of variance components, are known. This is given by
\[ \tilde{\alpha}_i = E(\alpha_i|y_i) = G Z_i' V_i^{-1}(y_i - X_i\beta), \]
where $V_i = \mathrm{Var}(y_i) = \tau^2 I_{k_i} + Z_i G Z_i'$. Furthermore, the covariance matrix of $\tilde{\alpha}_i$ is given by $\mathrm{Var}(\tilde{\alpha}_i) = G Z_i' V_i^{-1} Z_i G$. Lange and Ryan proposed to examine a Q–Q plot of the standardized linear combinations
\[ z_i = \frac{c'\tilde{\alpha}_i}{\{c'\,\mathrm{Var}(\tilde{\alpha}_i)\,c\}^{1/2}}, \quad i = 1, \dots, m, \tag{7.1} \]
where $c$ is a known vector. They argued that, through appropriate choices of $c$, the plot can be made sensitive to different types of model departures. For example, for a model with two random effects factors, a random intercept and a random slope, one may choose $c_1 = (1, 0)'$ and $c_2 = (0, 1)'$ and produce two Q–Q plots. On the other hand, such plots may not reveal possible nonzero correlations between the (random) slope and intercept. Thus, Lange and Ryan suggested producing a set of plots ranging from one

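marginal to the other by letting $c = (1 - u, u)'$ for some moderate number of values $0 \le u \le 1$.

Dempster and Ryan (1985) suggested that the normal plot should be weighted to reflect the differing sampling variances of $\tilde{\alpha}_i$. Following the same idea, Lange and Ryan proposed a generalized weighted normal plot. They suggested plotting $z_i$ against $\Phi^{-1}\{F^*(z_i)\}$, where $F^*$ is the weighted empirical cdf defined by
\[ F^*(x) = \frac{\sum_{i=1}^m w_i 1_{(z_i \le x)}}{\sum_{i=1}^m w_i}, \]
with $w_i = c'\,\mathrm{Var}(\tilde{\alpha}_i)\,c = c' G Z_i' V_i^{-1} Z_i G c$.

In practice, however, $\beta$ and $\gamma$ are unknown. In such cases, Lange and Ryan suggested using the ML or REML estimators in place of these parameters. They argued that, under suitable conditions, the limiting distribution of $\sqrt{n}\{\hat{F}^*(x) - \Phi(x)\}$ is normal with mean zero and variance equal to the variance of $\sqrt{n}\{F^*(x) - \Phi(x)\}$ minus an adjustment, where $\hat{F}^*(x)$ is $F^*(x)$ with the unknown parameters replaced by their ML or REML estimators. See Lange and Ryan (1989) for details. This suggests that, in the case of unknown parameters, the Q–Q plot will be $\hat{z}_i$ against $\Phi^{-1}\{\hat{F}^*(\hat{z}_i)\}$, where $\hat{z}_i$ is $z_i$ with the unknown parameters replaced by their ML (REML) estimates. However, the (asymptotic) variance of $\hat{F}^*(x)$ is different from that of $F^*(x)$, as indicated above. Therefore, if one wishes to include, for example, a $\pm 1$ SD bound in the plot, the adjustment for estimation of parameters must be taken into account. We consider an example.

Example 7.1. Consider, again, the one-way random effects model of Example 3.1 with normality assumption. Note that $a = m$ and $b_i = k_i$ under the new notation. Because $\alpha_i$ is real-valued, $c = 1$ in (7.1). If $\mu$, $\sigma^2$, $\tau^2$ are known, the EB estimator of $\alpha_i$ is given by
\[ \hat{\alpha}_i = \frac{k_i\sigma^2}{\tau^2 + k_i\sigma^2}(\bar{y}_{i\cdot} - \mu), \]
where $\bar{y}_{i\cdot} = k_i^{-1}\sum_{j=1}^{k_i} y_{ij}$, with
\[ w_i = \mathrm{var}(\hat{\alpha}_i) = \frac{k_i\sigma^4}{\tau^2 + k_i\sigma^2}. \]
Therefore, in this case, we have
\[ z_i = \frac{\hat{\alpha}_i}{\mathrm{sd}(\hat{\alpha}_i)} = \frac{\bar{y}_{i\cdot} - \mu}{\sqrt{\sigma^2 + \tau^2/k_i}}, \]
$i = 1, \dots, m$ and
\[ F^*(x) = \left(\sum_{i=1}^m \frac{k_i\sigma^4}{\tau^2 + k_i\sigma^2}\right)^{-1} \sum_{i=1}^m \frac{k_i\sigma^4}{\tau^2 + k_i\sigma^2}\, 1_{(z_i \le x)}. \]
In practice, $\mu$, $\sigma^2$, and $\tau^2$ are unknown and therefore replaced by their REML (ML) estimators when making a Q–Q plot (Exercise 7.1).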
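The quantities in Example 7.1 are straightforward to compute. The following is a minimal sketch of the weighted normal plot coordinates for the one-way random effects model, assuming that $\mu$, $\sigma^2$ and $\tau^2$ are supplied (known values or plug-in estimates). The function name `weighted_normal_plot` and the toy data are illustrative assumptions. Since $\Phi^{-1}(1)$ is infinite, the point with $F^*(z_i) = 1$ is simply dropped here; this is a convenience of the sketch, not a recommendation from Lange and Ryan (1989).

```python
from math import sqrt
from statistics import NormalDist


def weighted_normal_plot(groups, mu, sigma2, tau2):
    """Return (z, w, points) for the one-way random effects model.

    groups : list of lists, groups[i] holds y_i1, ..., y_ik_i
    sigma2 : variance of the random effects; tau2 : error variance
    points : pairs (z_i, Phi^{-1}(F*(z_i))) with F*(z_i) < 1
    """
    z, w = [], []
    for y in groups:
        k = len(y)
        ybar = sum(y) / k
        z.append((ybar - mu) / sqrt(sigma2 + tau2 / k))   # standardized EB estimator
        w.append(k * sigma2 ** 2 / (tau2 + k * sigma2))   # w_i = var(alpha_hat_i)
    total = sum(w)
    std_normal = NormalDist()
    points = []
    for zi in z:
        # Weighted empirical cdf F* evaluated at z_i.
        f_star = sum(wj for zj, wj in zip(z, w) if zj <= zi) / total
        if f_star < 1.0 - 1e-12:   # skip the largest z_i, where Phi^{-1} is infinite
            points.append((zi, std_normal.inv_cdf(f_star)))
    return z, w, points


# Toy data (hypothetical): three groups of unequal size.
groups = [[1.0, 3.0], [0.0, 0.0, 0.0], [5.0]]
z, w, points = weighted_normal_plot(groups, mu=1.0, sigma2=1.0, tau2=2.0)
for zi, qi in sorted(points):
    print(f"z = {zi:.3f}, normal quantile = {qi:.3f}")
```

Plotting the first coordinate of each pair against the second gives the weighted Q–Q plot; groups with larger $k_i$ receive larger weights $w_i$ and therefore move the weighted cdf by larger steps.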

7.2 Nonparametric/semiparametric methods

7.2.1 A P-spline nonparametric model

One way to achieve robustness is to make the underlying model less restrictive. For example, instead of assuming a linear model, one may include higher order terms, such as quadratic or cubic functions of the covariates. The higher-order model includes the linear model as a special case (when the coefficients of the higher-order terms are zero), and thus is less restrictive than the linear model in that, even if the linear model fails, the higher-order model may still be valid. More generally, one may model the mean function non-parametrically, or semi-parametrically.

Let us begin with the Fay-Herriot model of Example 5.1. An extension of the model may be written as
\[ y_i = f(x_i) + v_i + e_i, \quad i = 1, \dots, m, \tag{7.2} \]
where the assumptions about $v_i$ and $e_i$ are the same as in the Fay-Herriot model (see Subsection 2.1), but $f(\cdot)$ is an unknown function. In order to make inference about $f(\cdot)$ tractable, Opsomer et al. (2008) used a P-spline approximation to $f(\cdot)$ in the form of
\[ \tilde{f}(x) = \beta_0 + \beta_1 x + \cdots + \beta_p x^p + \gamma_1 (x - \kappa_1)_+^p + \cdots + \gamma_q (x - \kappa_q)_+^p, \tag{7.3} \]
where $p$ is the degree of the spline, $q$ is the number of knots, $\kappa_j$, $1 \le j \le q$ are the knots, and $x_+ = x 1_{(x > 0)}$. Graphically, a P-spline is pieces of ($p$th degree) polynomials smoothly connected at the knots. The P-spline model, which is (7.2) with $f$ replaced by $\tilde{f}$, is fitted by penalized least squares, that is, by minimizing
\[ |y - X\beta - Z\gamma|^2 + \lambda|\gamma|^2, \tag{7.4} \]
with respect to $\beta$ and $\gamma$, where $y = (y_i)_{1 \le i \le n}$, the $i$th row of $X$ is $(1, x_i, \dots, x_i^p)$, the $i$th row of $Z$ is $[(x_i - \kappa_1)_+^p, \dots, (x_i - \kappa_q)_+^p]$, $1 \le i \le n$, and $\lambda$ is a penalty, or smoothing, parameter. To determine $\lambda$, Wand (2003) used the following interesting connection to a linear mixed model (see the previous chapter). Suppose that the $\epsilon_i$'s are distributed as $N(0, \tau^2)$. Then if the $\gamma$'s are treated as independent random effects with the distribution $N(0, \sigma^2)$, the minimizer of (7.4) is the same as the best linear unbiased estimator (BLUE) for $\beta$ and the best linear unbiased predictor (BLUP) for $\gamma$, provided that $\lambda$ is identical to the ratio $\tau^2/\sigma^2$ (Exercise 7.2). Thus, the P-spline model is fitted the same way as the linear mixed model
\[ y = X\beta + Z\gamma + \epsilon \tag{7.5} \]

[e.g., Jiang (2007), §2.3.1].

Although the above P-spline–linear mixed model connection is convenient for computational purposes, Jiang (2010) (p. 453) noted that the connection is asymptotically valid only if the true underlying function $f$ is not a P-spline. Nevertheless, in most applications of P-splines, the unknown function $f$ is unlikely to be a P-spline, so, in a way, (7.3) is only used as an approximation, with the approximation error vanishing as $q \to \infty$.

Opsomer et al. (2008) incorporated the spline model with random effects. By making use of the P-spline–mixed-model connection, their approximating model has the term $U\alpha$ added to the right side of (7.5):
\[ y = X\beta + Z\gamma + U\alpha + \epsilon, \tag{7.6} \]
where $\alpha$ is the vector of random effects and $U$ is a known matrix. It is assumed that $\gamma$, $\alpha$, and $\epsilon$ are uncorrelated with means 0 and covariance matrices $\Sigma_\gamma$, $\Sigma_\alpha$, and $\Sigma_\epsilon$, respectively. The BLUE and BLUP are given by
\[ \tilde{\beta} = (X'V^{-1}X)^{-1}X'V^{-1}y, \]
\[ \tilde{\gamma} = \Sigma_\gamma Z'V^{-1}(y - X\tilde{\beta}), \]
\[ \tilde{\alpha} = \Sigma_\alpha U'V^{-1}(y - X\tilde{\beta}), \]
where $V = \mathrm{Var}(y) = Z\Sigma_\gamma Z' + U\Sigma_\alpha U' + \Sigma_\epsilon$ [see (5.4), (5.5)]. The EBLUE and EBLUP, denoted by $\hat{\beta}$, $\hat{\gamma}$, and $\hat{\alpha}$, are obtained by replacing $V$ by $\hat{V}$ in the corresponding expressions. Here, we assume that $V = V(\varphi)$, where $\varphi$ is a vector of variance components, so that $\hat{V} = V(\hat{\varphi})$. The REML estimator is used for $\hat{\varphi}$, as suggested by Opsomer et al. (2008).

7.2.2 Nonparametric model selection

Note that a P-spline is characterized by $p$, $q$, and also the location of the knots. Note that, however, given $p$, $q$, the location of the knots can be selected by the space-filling algorithm implemented in R [cover.design()]. But the question of how to choose $p$ and $q$ remains. The general "rule-of-thumb" is that $p$ is typically between 1 and 3, and $q$ proportional to the sample size, $n$, with 4 or 5 observations per knot. But there may still be a lot of choices given the rule-of-thumb. For example, if $n = 200$, the possible choices for $q$ range from 40 to 50, which, combined with the range of 1 to 3 for $p$, gives a total of 33 choices for the P-spline.

Jiang et al. (2010) proposed to use a version of the fence methods to choose the degree of the spline, $p$, the number of knots, $q$, and the smoothing parameter, $\lambda$, at the same time. Note that, here again, the true underlying

model is not among the class of candidate models, that is, the approximating splines (7.3). This is similar to what was discussed in the beginning of Subsection 6.2.2. Furthermore, the role of $\lambda$ in the model should be made clear: $\lambda$ controls the degree of smoothness of the underlying model. A natural measure of lack-of-fit is $Q(M, \theta_M; y) = |y - X\beta - Z\gamma|^2$. However, $Q(M)$ is not obtained by minimizing $Q(M, \theta_M; y)$ over $\beta$ and $\gamma$ without constraint. Instead, we have $Q(M) = |y - X\hat{\beta} - Z\hat{\gamma}|^2$, where $\hat{\beta}$ and $\hat{\gamma}$ are the minimizer of (7.4), and hence depend on $\lambda$. Note that, instead of using the P-spline–linear mixed model connection, Jiang et al. (2010) suggest solving the minimization problem directly.

Another difference is that there may not be a full model among the candidate models. Therefore, the fence inequality (6.39) is replaced by
\[ Q(M) - Q(\tilde{M}) \le c, \tag{7.7} \]
where $\tilde{M}$ is the candidate model that has the minimum $Q(M)$. We use the following criterion of optimality within the fence which combines model simplicity and smoothness. For the models within the fence, choose the one with the smallest $q$; if there is more than one such model, choose the model with the smallest $p$. This gives the best choice of $p$ and $q$. Once $p$, $q$ are chosen, we choose the model within the fence with the largest $\lambda$.

The tuning constant $c$ is chosen adaptively using the AF idea (see Subsection 6.2.1), where parametric bootstrap is used for computing $p^*$. More specifically, $\tilde{M}$ and ML estimators under $\tilde{M}$ are used for the bootstrapping. The following theorem is proved in Jiang et al. (2010). For simplicity, assume that the matrix $W = (X\ Z)$ is of full rank. Let $P_{W^\perp} = I_n - P_W$, where $n = \sum_{i=1}^m n_i$ and $P_W = W(W'W)^{-1}W'$.

Theorem 7.1. Computationally, the above fence procedure is equivalent to the following: (i) first use the AF to select $p$ and $q$ using (7.7) with $\lambda = 0$ and $Q(M) = y'P_{W^\perp}y$ (see Lemma 7.1 below), and same criterion as above for choosing $p$, $q$ within the fence; (ii) let $M_0^*$ denote the model corresponding to the selected $p$ and $q$, find the maximum $\lambda$ such that
\[ Q(M_0^*, \lambda) - Q(\tilde{M}) \le c^*, \tag{7.8} \]
where for any model $M$ with the corresponding $X$ and $Z$, we have
\[ Q(M, \lambda) = |y - X\hat{\beta}_\lambda - Z\hat{\gamma}_\lambda|^2, \]
\[ \hat{\beta}_\lambda = (X'V_\lambda^{-1}X)^{-1}X'V_\lambda^{-1}y, \]
\[ \hat{\gamma}_\lambda = \lambda^{-1}(I_q + \lambda^{-1}Z'Z)^{-1}Z'(y - X\hat{\beta}_\lambda), \]
\[ X'V_\lambda^{-1}X = X'X - \lambda^{-1}X'Z(I_q + \lambda^{-1}Z'Z)^{-1}Z'X, \]
\[ X'V_\lambda^{-1}y = X'y - \lambda^{-1}X'Z(I_q + \lambda^{-1}Z'Z)^{-1}Z'y, \]

and $c^*$ is chosen by the AF ($V_\lambda$ is defined below but not directly needed here for the computation because of the last two equations).

Note that in step (i) of the Theorem one does not need to deal with $\lambda$. The motivation for (7.8) is that this inequality is satisfied when $\lambda = 0$, so one would like to see how far $\lambda$ can go. In fact, the maximum $\lambda$ is a solution to the equation $Q(M_0^*, \lambda) - Q(\tilde{M}) = c^*$. The purpose of the last two equations is to avoid direct inversion of $V_\lambda = I_n + \lambda^{-1}ZZ'$, whose dimension is equal to $n$, the total sample size. Note that $V_\lambda$ does not have a block diagonal structure because of $ZZ'$, so if $n$ is large direct inversion of $V_\lambda$ may be computationally burdensome. The proof of the Theorem requires the following lemma, whose proof is left as an exercise (Exercise 7.3).

Lemma 7.1. For any $M$ and $y$, $Q(M, \lambda)$ is an increasing function of $\lambda$ with $\inf_{\lambda > 0} Q(M, \lambda) = Q(M)$.

7.2.3 Examples

1. A simulation study. Jiang et al. (2010) carried out a simulation study designed to evaluate performance of the proposed fence method. Consider (7.2) with $D_i = D$, $1 \le i \le m$. The model can be written as
\[ y_i = f(x_i) + \epsilon_i, \quad i = 1, \dots, m, \tag{7.9} \]
where $\epsilon_i \sim N(0, \sigma^2)$ with $\sigma^2 = A + D$, which is unknown. Thus, the model is the same as a nonparametric regression model. Consider three different cases that cover various situations and aspects. In the first case, Case 1, the true underlying function is a linear function, $f(x) = 1 - x$, $0 \le x \le 1$, hence the model reduces to the traditional Fay-Herriot model. The goal is to find out if the fence can validate the traditional Fay-Herriot model in the case that it is valid. In the second case, Case 2, the true underlying function is the quadratic spline with two knots, given by
\[ f(x) = 1 - x + x^2 - 2(x - 1)_+^2 + 2(x - 2)_+^2, \quad 0 \le x \le 3 \tag{7.10} \]
(the shape is half circle between 0 and 1 facing up, half circle between 1 and 2 facing down, and half circle between 2 and 3 facing up). Note that this function is smooth in that it has a continuous derivative (Exercise 7.4). Here the intention is to investigate whether the fence can identify the true function in the "perfect" situation, that is, when $f(x)$ itself is a spline.

The last case, Case 3, is perhaps the most practical situation, in which no spline can provide a perfect approximation to $f(x)$. In other words, the

Table 7.1 Nonparametric Model Selection - Case 1 & Case 2. Reported are empirical probabilities, in terms of percentage, based on 100 simulations that the optimal model is selected.
                  Case 1                  Case 2
Sample size       m=10   m=15   m=20     m=30   m=40   m=50
Highest Peak      62     91     97       71     83     97
Confidence L.B.   73     90     97       73     80     96

true underlying function is not among the candidates. In this case $f(x)$ is chosen as $0.5\sin(2\pi x)$, $0 \le x \le 1$, which is one of the functions considered by Kauermann (2005).

Consider situations of small or medium sample size, namely, $m = 10$, 15 or 20 for Case 1, $m = 30$, 40 or 50 for Case 2, and $m = 10$, 30 or 50 for Case 3. The covariates $x_i$ are generated from the Uniform[0, 1] distribution in Case 1, and from Uniform[0, 3] in Case 2; then fixed throughout the simulations. Following Kauermann (2005), we let $x_i$ be equidistant in Case 3. The error standard deviation $\sigma$ in (7.9) is chosen as 0.2 in Case 1 and Case 2. This value is chosen such that the signal standard deviation in each case is about the same as the error standard deviation. As for Case 3, we consider three different values for $\sigma$: 0.2, 0.5 and 1.0. These values are also of the same order as the signal standard deviation in this case.

The candidate approximating splines for Case 1 and Case 2 are the following: $p = 0, 1, 2, 3$, $q = 0$ and $p = 1, 2, 3$, $q = 2, 5$ (so there are a total of 10 candidates). As for Case 3, following Kauermann (2005), we consider only linear splines (i.e., $p = 1$); furthermore, we consider the number of knots in the range of the "rule-of-thumb" (i.e., roughly 4 or 5 observations per knot; see Section 7.2.2), plus the intercept model ($p = q = 0$) and the linear model ($p = 1$, $q = 0$). Thus, for $m = 10$, $q = 0, 2, 3$; for $m = 30$, $q = 0, 6, 7, 8$; and for $m = 50$, $q = 0, 10, 11, 12, 13$.

Table 7.1 shows the results based on 100 simulations under Case 1 and Case 2. As in Jiang et al. (2009), we consider both the highest peak, that is, choosing $c$ with the highest $p^*$, and 95% lower bound, that is, choosing a smaller $c$ corresponding to a peak of $p^*$ in order to be conservative, if the corresponding $p^*$ is greater than the 95% lower bound of the $p^*$ for any larger $c$ that corresponds to a peak of $p^*$. It is seen that performance of the AF is satisfactory even with the small sample size. Also, it appears that the confidence lower bound method works better in smaller samples, but makes almost no difference in larger samples. These are consistent with the findings of Jiang et al. (2009).

Table 7.2 shows the results for Case 3. Note that, unlike Case 1 and

Table 7.2 Nonparametric Model Selection - Case 3. Reported are empirical distributions, in terms of percentage, of the selected models.
Sample Size                  m=10             m=30             m=50
# of Knots                   0,2,3            0,6,7,8          0,10,11,12,13
                             (p,q)    %       (p,q)    %       (p,q)     %
σ=.2   Highest Peak          (0,0)    1       (1,0)    9       (1,10)    100
                             (1,0)    31      (1,6)    91
                             (1,2)    68
       Confidence L.B.       (1,0)    24      (1,0)    9       (1,10)    100
                             (1,2)    76      (1,6)    91
σ=.5   Highest Peak          (0,0)    14      (1,0)    21      (1,0)     13
                             (1,0)    27      (1,6)    77      (1,10)    84
                             (1,2)    56      (1,7)    2       (1,11)    2
                             (1,3)    3                        (1,12)    1
       Confidence L.B.       (0,0)    8       (1,0)    8       (1,0)     2
                             (1,0)    23      (1,6)    89      (1,10)    94
                             (1,2)    65      (1,7)    3       (1,11)    2
                             (1,3)    4                        (1,12)    2
σ=1    Highest Peak          (0,0)    27      (0,0)    15      (0,0)     10
                             (1,0)    20      (1,0)    18      (1,0)     26
                             (1,2)    49      (1,6)    63      (1,10)    60
                             (1,3)    4       (1,7)    4       (1,11)    2
                                                               (1,12)    2
       Confidence L.B.       (0,0)    20      (0,0)    1       (0,0)     2
                             (1,0)    13      (1,0)    13      (1,0)     13
                             (1,2)    59      (1,6)    82      (1,10)    80
                             (1,3)    8       (1,7)    4       (1,11)    2
                                                               (1,12)    3

Case 2, here there is no optimal model (an optimal model must be a true model, according to our definition, which does not exist). So, instead of giving the empirical probabilities of selecting the optimal model, we give the empirical distribution of the selected models in each case. It is apparent that, as $\sigma$ increases, the distribution of the models selected becomes more spread out. A reverse pattern is observed as $m$ increases. The confidence lower bound method appears to perform better in picking up a model with splines. Within the models with splines, fence seems to overwhelmingly prefer fewer knots than more knots.

Note that the fence procedure allows one to choose not only $p$ and $q$ but also $\lambda$. In each simulation we compute $\hat{\beta} = \hat{\beta}_\lambda$ and $\hat{\gamma} = \hat{\gamma}_\lambda$, given below (7.8), based on the $\lambda$ chosen by the adaptive fence. The fitted values are calculated by (7.3) with $\beta$ and $\gamma$ replaced by $\hat{\beta}$ and $\hat{\gamma}$, respectively. We then average the fitted values over the 100 simulations. Figure 7.1 shows the average fitted values for the three cases ($m = 10, 30, 50$) with $\sigma = 0.2$

Fig. 7.1 Case 3 Simulation. Top figure: Average fitted values for m = 10. Middle figure: Average fitted values for m = 30. Bottom figure: Average fitted values for m = 50. In all cases, the red dots represent the fitted values, while the blue circles correspond to the true underlying function.

under Case 3. The true underlying function values, $f(x_i) = 0.5\sin(2\pi x_i)$, $i = 1, \dots, m$ are also plotted for comparison.

2. Hospital data revisited. Recall the hospital data of Subsection 5.9.1. Ganesh (2009) proposed a Fay-Herriot model for the graft failure rates as follows: $y_i = \beta_0 + \beta_1 x_i + v_i + e_i$, where the $v_i$'s are hospital-specific random effects and the $e_i$'s are sampling errors. It is assumed that $v_i$, $e_i$ are independent with $v_i \sim N(0, A)$ and $e_i \sim N(0, D_i)$. Here the variance $A$ is unknown. Based on the model, Ganesh obtained credible intervals for selected contrasts. However, inspections of the raw data suggest some

nonlinear trends, which raises the question of whether the fixed effects part of the model can be made more flexible in its functional form.

To answer this question, we consider the Fay-Herriot model as a special member of a class of approximating spline models discussed in this section. More specifically, we assume the model (7.2) with $x_i$ being the severity index. We then consider the approximating spline (7.3) with $p = 0, 1, 2, 3$ and $q = 0, 1, \dots, 6$ ($p = 0$ is only for $q = 0$). Here the upper bound 6 is chosen according to the "rule-of-thumb" (because $m = 23$, so $m/4 = 5.75$). Note that the Fay-Herriot model corresponds to the case $p = 1$ and $q = 0$. The question is then to find the optimal model, in terms of $p$ and $q$, from this class.

We apply the AF method described in Subsection 7.2.2 to this case. Here to obtain the bootstrap samples needed for obtaining $c^*$, we first compute the ML estimator under the model $\tilde{M}$, which minimizes $Q(M) = y'P_{W^\perp}y$ among the candidate models [i.e., (6.2); see Theorem 6.1], then draw parametric bootstrap samples under $\tilde{M}$ with the ML estimators treated as the true parameters. This is reasonable because $\tilde{M}$ is the best approximating model in terms of the fit, even though under model (7.2) there may not be a true model among the candidate models. The bootstrap sample size is chosen as 100. The fence method has selected the model $p = 3$ and $q = 0$, that is, a cubic function with no knots, as the optimal model. We repeated the analysis 100 times, each time using different bootstrap samples. All results led to the same model: a cubic function with no knots. The left figure of Figure 7.2 shows the plot of $p^*$ against $c = c_n$ in the AF model selection.

A few comparisons are always helpful. Our first comparison is to fence itself but with a more restricted space of candidate models. More specifically, we consider (7.3) with the restriction to linear splines only, that is, $p = 1$, and knots in the range of the "rule-of-thumb", namely, $q = 4, 5, 6$, plus the intercept model ($p = q = 0$) and the linear model ($p = 1$, $q = 0$). In this case, the fence method selected a linear spline with four knots (i.e., $p = 1$, $q = 4$) as the optimal model. The value of $\lambda$ corresponding to this model is approximately equal to 0.001. The plot of $p^*$ against $c_n$ for this model selection is very similar to the left figure of Figure 7.2, and therefore omitted. In addition, the right figure of Figure 7.2 shows the fitted values and curves under the two models selected by the fence from within the different model spaces as well as the original data points.

A further comparison can be made by treating (7.2) as a generalized additive model [GAM; e.g., Hastie and Tibshirani (1990)] with heteroscedastic

errors. A weighted fit can be obtained with the amount of smoothing optimized by using a generalized cross-validation (GCV) criterion. Here the weights used are $w_i = 1/(A + D_i)$, where the ML estimate for $A$ is used as a plug-in estimate. Recall that the $D_i$'s are known. This fitted function is also overlaid in the right plot of Figure 7.2. Notice how closely this fitted function resembles the restricted space fence fit. To expand the class of models under consideration by GCV-based smoothing, we used the BRUTO procedure [Hastie and Tibshirani (1990)], which augments the class of models to look at a null fit and a linear fit for the spline function, and embeds the resulting model selection (i.e., null, linear or smooth fits) into a weighted backfitting algorithm using GCV for computational efficiency. Interestingly here, BRUTO finds simply an overall linear fit for the fixed effects functional form. While certainly an interesting comparison, BRUTO's theoretical properties for models like (7.2) have not really been studied in depth.

Finally, as mentioned in Subsection 7.2.1, by using the connection between P-spline and linear mixed model one can formulate (6.2) as a linear mixed model, where the spline coefficients are treated as random effects. The problem then becomes a (parametric) mixed model selection problem, hence the method of Section 6.2 can be applied. In fact, this was our initial approach to this dataset, and the model we found was the same as the one by BRUTO. However, there is some reservation about this approach, as explained in Subsection 7.2.1.

7.2.4 Functional and semiparametric mixed models

There have been studies in functional mixed effects models. These include models for functional data, in which the random effects are Euclidean (i.e., not functional), and models in which both the data and the random effects are functional. Guo (2002) noted that data in many cases arise as curves, such as growth curves, hormone profiles, and biomarkers measured over time. The author proposed a functional mixed effects model, in which the random effects are modeled as realizations of a Gaussian process. Suppose that there is a response curve associated with each of the $m$ subjects. Let $y_{ij}$ be the observed value of the $i$th curve at time $t_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n_i$, such that

$$y_{ij} = x_{ij}'\beta(t_{ij}) + z_{ij}'\alpha_i(t_{ij}) + \epsilon_{ij},$$

where $\beta(t) = [\beta_k(t)]_{1 \le k \le p}$ is a $p \times 1$ vector of fixed functions, $\alpha_i(t) = [\alpha_{ik}(t)]_{1 \le k \le q}$ is a $q \times 1$ vector of random functions that are modeled as realizations of Gaussian processes, $A(t) = [\alpha_k(t)]_{1 \le k \le q}$, with zero means,

$x_{ij} = (x_{ijk})_{1 \le k \le p}$ and $z_{ij} = (z_{ijk})_{1 \le k \le q}$ are design matrices that include covariates as well as dummy variables, and $\epsilon_{ij} \sim N(0, \sigma^2)$ and are independent of the $\alpha$'s. In fitting the proposed model, Guo first approximated both the fixed and random functions by smoothing splines. He then used a connection built by Wahba (1978, 1983) to model $\beta(t)$ and $A(t)$ as

$$\beta_k(t) = b_{1k} + b_{2k} t + \lambda_{b,k}^{-1/2} \int_0^t W_{b,k}(s)\, ds, \quad k = 1, \dots, p;$$

$$\alpha_k(t) = a_{1k} + a_{2k} t + \lambda_{a,k}^{-1/2} \int_0^t W_{a,k}(s)\, ds, \quad k = 1, \dots, q,$$

where $(b_{1k}, b_{2k})' \sim N(0, \tau I_2)$ with $\tau \to \infty$ (diffuse prior), and $(a_{1k}, a_{2k})' \sim N(0, \sigma^2 D_k)$, $D_k$ being an unknown covariance matrix, and $W_{b,k}(s)$, $W_{a,k}(s)$ are Wiener processes. Here, $a_{1k}$ and $a_{2k}$ are considered random intercept and random slope, respectively.

Fig. 7.2 Left: A plot of $p^*$ against $c = c_n$ from the search over the full model space. Right: The raw data and the fitted values and curves; red dots and curve correspond to the cubic function resulting from the full model search; blue squares and lines correspond to the linear spline with 4 knots resulting from the restricted model search; green X's and lines represent the GAM fits.

In terms of statistical inference, Guo was mainly interested in testing linearity of $\beta_k(t)$ and $\alpha_k(t)$, that is, $\lambda_{b,k}^{-1} = 0$ or $\lambda_{a,k}^{-1} = 0$. Using a result from Self and Liang (1987) on the likelihood ratio test for parameters on the boundary of the parameter space, Guo (2002) showed that, under the null hypothesis of $\lambda_{b,k}^{-1} = 0$, the asymptotic distribution of the likelihood ratio statistic (LRS) is a mixture of 0 and $\chi_1^2$ with equal weight (1/2 for both). As for testing for a single component of $\beta(t)$, that is, $\beta_k(t) = 0$, it is equivalent to testing

for $b_{1k} = b_{2k} = \lambda_{b,k}^{-1} = 0$. The asymptotic null distribution of the LRS is a mixture of $\chi_2^2$ and $\chi_3^2$ with equal weight.

Example 7.2. We illustrate the fitting of the functional mixed effects model using a simulated example. The data were generated under the model

$$y_i(t) = 5 x_{i1} \sin(2\pi t) + 5 x_{i2} \sin(\pi t) + r_i(t) + e_i, \quad i = 1, \dots, 10, \quad t \in \{t_1, \dots, t_{30}\}.$$

For each subject $i$, $x_{i1}$ is generated from Uniform$[-0.5, 0.5]$ and $x_{i2}$ is binary with equal number of 0s and 1s. Thus, we have $\beta_1(t) = 2\sin(2\pi t)$, $\beta_2(t) = 5\sin(\pi t)$. Furthermore, $r_i(t) \sim N(0, \sigma_1^2 \Sigma(t))$ with $\sigma_1^2 = 90$ and $\Sigma(t)$ is simulated by the reproducing kernel $R_1(t, t)$ [e.g., Guo (2002)]. Finally, $e_i = (e_{ij})_{1 \le j \le 30}$ and the $e_{ij}$ are i.i.d. $N(0, 1)$. The generated data are subject to random missingness with probability 0.1. The model was fitted using two methods proposed in Guo's paper, the state space model (SSM) and the functional mixed effects model (FMEM). There are software packages in SAS that implement these methods, PROC SSM for SSM and PROC MIXED for FMEM. See Figure 7.3 for the illustration.

Fig. 7.3 Fitted Curves via SSM and FMEM

Krafty et al. (2011) considered functional mixed effects spectral analysis. In biomedical experiments, data are often collected from multiple subjects as time series. It is natural to carry out time series analysis to study

the effects of design covariates. One of the standard analyses in time series is spectral analysis [e.g., Koopmans (1995)]. Here, a random effect is associated with a group of subjects, while multiple time series correspond to subjects within the group. The random effects are function-valued, and different groups are assumed to be independent. A novel contribution of the work is a mixed effects Cramér representation in the following form:

$$X_{jkt} = \int_{-1/2}^{1/2} A_0(\omega; U_{jk})\, A_j(\omega; V_{jk})\, e^{2\pi i t \omega}\, dZ_{jk}(\omega),$$

$j = 1, \dots, N$, $k = 1, \dots, n_j$, where $N$ is the number of groups and $n_j$ the number of time series within the $j$th group; $X_{jkt}$ is the $k$th replicate time series of the $j$th group; $U_{jk}$, $V_{jk}$ are vectors of covariates; and $A_0(\cdot; U_{jk})$, $A_j(\cdot; V_{jk})$ are functions corresponding to the fixed and random effects, respectively. Through asymptotic analysis, the authors showed that, when the replicate-specific spectra are smooth, the log-periodograms converge to a functional mixed effects model. The authors also developed BLUP for the unit-specific random effects.

In another related work, Rady et al. (2015) considered estimation in mixed-effects functional ANOVA models. The authors considered the following so-called mixed functional ANOVA model:

$$y_{ij}(t) = \mu(t) + \alpha_i(t) + \beta_j(t) + \epsilon_{ij}(t), \quad i = 1, \dots, a, \quad j = 1, \dots, b.$$

This is similar to a two-way random effects ANOVA model (see Example 3.2) except that everything involved is a function. However, in their analysis, the authors have focused on a fixed value of $t$. Such results have little interest in terms of understanding the functional relationship.

Semiparametric GLMM also has received attention in the literature. For example, Lombardía and Sperlich (2008) considered an extension of GLMM in that the conditional mean of the response given the random effects can be expressed as

$$\mu_{ij} = E(y_{ij} \mid \alpha_i, T_{ij}, X_{ij}) = g\{\lambda(T_{ij}) + x_{ij}'\beta + z_{ij}'\alpha_i\},$$

where $\alpha_i$ is a vector-valued random effect, $x_{ij}$, $t_{ij}$ are observed vectors of regressors, and $z_{ij}$ is a subvector of $(1, x_{ij}')'$. Furthermore, $\lambda(\cdot)$ is an unknown function. It is assumed that (i) the responses $y_{ij}$ are conditionally independent given $\alpha$, $T$, $X$; (ii) the random effects are i.i.d. with mean 0 and covariance matrix $\Sigma$; and (iii) $(T, X)$ are independent of $\alpha$. The authors combine the likelihood approach for mixed effects models with kernel methods. In terms of asymptotic properties, a point-wise asymptotic normality result was given regarding estimation of the function $\lambda(\cdot)$. As for measure

of uncertainty, the authors proposed a bootstrap procedure and provided a theoretical justification, also in terms of asymptotic approximation of the distribution of $(y, X, T)$ by the bootstrap distribution. The authors discussed application of their method to SAE [e.g., Rao and Molina (2015); also see Chapter 5], currently a very active area of research and applications.

7.2.5 Nonparametric bootstrapping

Model-based (or parametric) bootstrap methods have been used in mixed model analysis; see, for example, Jiang (2017) (sec. 4.5.3). It is a bit tricky, though, to bootstrap nonparametrically in a mixed model situation. Clearly, Efron's original idea [Efron (1979)] of i.i.d. bootstrapping would not work due to the fact that the observations are neither independent nor identically distributed. On the other hand, the random effects in a mixed effects model are often assumed to be i.i.d., for each random effect factor. The difficulty is that the random effects are unobserved so, once again, Efron's i.i.d. bootstrap would not work.

Hall and Maiti (2006) proposed a method to bootstrap the random effects nonparametrically, at least in some cases of mixed effects models. Consider the NER model (e.g., Subsection 5.4.2). Suppose that the random effects are i.i.d. with an unknown distribution. The question is how to generate samples of the random effects? An answer by Hall and Maiti: Depending on what one wants to do. In many cases, the quantity of interest does not involve every piece of information about the distribution of the random effects. For example, Hall and Maiti observed that the MSPE of EBLUP involves only the second and fourth moments of the random effects and errors, up to the order of $o(m^{-1})$. This means that for random effects and errors from any distributions with the same second and fourth moments, the MSPE of EBLUP, with some suitable estimators of the variance components, are different only by a term of $o(m^{-1})$.

This observation leads to a seemingly simple strategy: First, estimate the second and fourth moments of the random effects and errors; then, draw bootstrap samples of the random effects and errors from certain distributions that match the first (which is 0) and estimated second and fourth moments; given the bootstrapped random effects and errors, use the NER model to generate the bootstrapped data. Specifically, let the NER model be expressed as

$$y_{ij} = x_{ij}'\beta + v_i + e_{ij}, \quad i = 1, \dots, m, \quad j = 1, \dots, n_i. \tag{7.11}$$

Then, the bootstrapped data, $y_{ij}^*$, are generated by (7.11) with $\beta$ replaced by $\hat\beta$, the empirical BLUE (EBLUE) of $\beta$ given by (5.4) with the variance components involved in $V$ replaced by consistent estimators, and $v_i$, $e_{ij}$ replaced by $v_i^*$, $e_{ij}^*$, respectively, where $v_i^*$, $e_{ij}^*$, $i = 1, \dots, m$, $j = 1, \dots, n_i$ are the bootstrapped random effects and errors.

Regarding the moment-matching bootstrap of the random effects and errors, clearly, the choice is not unique. A simple choice is a three-point distribution depending on a single parameter $p \in (0, 1)$, defined as $P(\xi = 0) = 1 - p$ and

$$P\left(\xi = \frac{1}{\sqrt{p}}\right) = P\left(\xi = -\frac{1}{\sqrt{p}}\right) = \frac{p}{2}. \tag{7.12}$$

It is easy to show (Exercise 7.5) that $E(\xi) = 0$, $E(\xi^2) = 1$, and $E(\xi^4) = 1/p$. Thus, if we let $p = \mu_2^2/\mu_4$, the distribution of $\sqrt{\mu_2}\,\xi$ has second moment $\mu_2$ and fourth moment $\mu_4$. Note that the inequality $\mu_2^2 \le \mu_4$ always holds, with the equality holding if and only if $\xi^2$ is degenerate (i.e., a.s. a constant).

Another choice is the rescaled Student $t$-distribution whose degrees of freedom (d.f.) $\nu > 4$ and is not necessarily an integer. The distribution has first and third moments 0, $\mu_2 = 1$, and $\mu_4 = 3(\nu - 2)/(\nu - 4)$, which is always greater than 3, meaning that the tails are heavier than those of the normal distribution. However, the moment-matching may not always take place. To see this, let $\hat\mu_2$ and $\hat\mu_4$ be the estimated second and fourth moments of the random effect, and $\eta$ have the rescaled $t$-distribution as above. Then, $\hat\mu_2^{1/2}\eta$ has second moment $\hat\mu_2$ and fourth moment $3(\nu - 2)\hat\mu_2^2/(\nu - 4)$. By letting the latter equal to $\hat\mu_4$, one has

$$\hat\mu_4/\hat\mu_2^2 = 3(\nu - 2)/(\nu - 4) > 3, \tag{7.13}$$

which may not hold even if the true moments satisfy $\mu_4/\mu_2^2 > 3$. When (7.13) does not hold, the matching d.f., $\nu$, cannot be found.

Based on the moment-matching bootstrap, Hall and Maiti (2006) developed a double-bootstrap MSPE estimator and showed that it is second-order unbiased. However, the moment-matching, double-bootstrap procedure may be computationally intensive, and so far there have not been published comparisons of the method with other existing methods that also produce second-order unbiased MSPE estimators, such as the Prasad-Rao (see Subsection 5.8) and jackknife [Jiang et al. (2002)] methods. On the other hand, the idea of Hall and Maiti (2006) is potentially applicable to mixed models with more complicated covariance structure, such as mixed models with crossed random effects. We conclude this section with an illustrative example.
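The three-point construction in (7.12) is simple enough to sketch in code. The following is a minimal illustration, not the implementation of Hall and Maiti (2006): the function names, the representation of the fitted fixed part as a list `xb_hat`, and the convention that cluster labels are integers 0, ..., m-1 are all assumptions made here for the example. It builds the distribution of $\sqrt{\mu_2}\,\xi$ with $p = \mu_2^2/\mu_4$ and uses it to generate one bootstrap sample from (7.11).

```python
import math
import random


def three_point_distribution(mu2, mu4):
    """Support and probabilities of sqrt(mu2) * xi, with xi as in (7.12)."""
    if mu2 <= 0 or mu4 < mu2 ** 2:
        raise ValueError("need mu2 > 0 and mu4 >= mu2**2")
    p = mu2 ** 2 / mu4            # p = mu_2^2 / mu_4
    a = math.sqrt(mu2 / p)        # sqrt(mu2) * (1 / sqrt(p))
    return [-a, 0.0, a], [p / 2, 1 - p, p / 2]


def draw_three_point(mu2, mu4, size, rng):
    """Draw `size` i.i.d. values with mean 0 and moments (mu2, mu4)."""
    support, probs = three_point_distribution(mu2, mu4)
    return rng.choices(support, weights=probs, k=size)


def bootstrap_ner_data(xb_hat, cluster, mom_v, mom_e, rng):
    """One bootstrap sample y*_ij = x_ij' beta_hat + v*_i + e*_ij.

    xb_hat[k]  : fitted fixed part for observation k (hypothetical input)
    cluster[k] : cluster label of observation k, assumed in 0..m-1
    mom_v/mom_e: (second moment, fourth moment) estimates, assumed given
    """
    m = max(cluster) + 1
    v_star = draw_three_point(mom_v[0], mom_v[1], m, rng)
    e_star = draw_three_point(mom_e[0], mom_e[1], len(cluster), rng)
    return [xb + v_star[c] + e for xb, c, e in zip(xb_hat, cluster, e_star)]


# Usage: the moments of the constructed distribution can be checked exactly.
support, probs = three_point_distribution(2.0, 20.0)
second = sum(q * s ** 2 for s, q in zip(support, probs))
fourth = sum(q * s ** 4 for s, q in zip(support, probs))
y_star = bootstrap_ner_data([1.0] * 6, [0, 0, 1, 1, 2, 2],
                            (2.0, 20.0), (1.0, 3.0), random.Random(0))
```

The exact moment check is used instead of a sampling check because it does not depend on the random seed; how the moment estimates themselves are obtained is left outside the sketch.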

Example 7.3. Consider a two-way random effects model $y_{ij} = \mu + u_i + v_j + e_{ij}$, $i = 1, \dots, m_1$, $j = 1, \dots, m_2$, where $\mu$ is an unknown mean; the $u_i$'s are i.i.d. with mean 0, variance $\sigma_1^2$, and an unknown distribution $F_1$; the $v_j$'s are i.i.d. with mean 0, variance $\sigma_2^2$, and an unknown distribution $F_2$; the $e_{ij}$'s are i.i.d. with mean 0, variance $\sigma_0^2$, and an unknown distribution $F_0$; and $u$, $v$, and $e$ are independent. Note that the observations are not clustered under this model. As a result, the jackknife method of Jiang et al. (2002) may not apply in estimating the MSPE of EBLUP in this case. However, it is fairly easy to obtain estimators of the second and fourth moments of the random effects and errors. For the second moments one may use the ML or REML methods (see Section 3.2); for the fourth moments one may use the EMM method (see Subsection 3.3.1; also Theorem 4.8). It would be interesting to see if the moment-matching, double-bootstrap method can produce a second-order unbiased MSPE estimator in this case.

7.3 Bayesian analysis

7.3.1 A robust hierarchical Bayes method

In the context of SAE, Chakraborty et al. (2018) showed that a hierarchical Bayes (HB) small area predictor proposed by Datta and Ghosh (1991) is not robust to outliers in a way similar to the EBLUP. Following Sinha and Rao (2009)'s effort to robustify the EBLUP (see below), Chakraborty et al. (2018) proposed a robust Bayesian method based on a normal mixture idea. A similar approach was considered by Gershunskaya (2018), but under a non-Bayesian framework. Below we shall focus on describing the method of Chakraborty et al. (2018).

They considered a Bayesian version of the NER model [Battese et al. (1988), see Subsection 5.4.2] except that the distribution of the unit-level errors is assumed to follow a two-component normal mixture distribution, instead of normal distribution. Specifically, their proposed normal-mixture (NM) HB model is defined as follows:

(I) Conditional on $\beta = (\beta_j)_{1 \le j \le p}$, $v = (v_i)_{1 \le i \le m}$, $z_{ij}$, $1 \le i \le m$, $1 \le j \le N_i$, $p_e$, $\sigma_1^2$, $\sigma_2^2$, and $\sigma_v^2$, the $Y_{ij}$, $1 \le i \le m$, $1 \le j \le N_i$ are independent with

$$Y_{ij} \sim z_{ij}\, N(x_{ij}'\beta + v_i, \sigma_1^2) + (1 - z_{ij})\, N(x_{ij}'\beta + v_i, \sigma_2^2). \tag{7.14}$$

(II) The $z_{ij}$'s are i.i.d. indicators with $P(z_{ij} = 1 \mid p_e) = p_e$, and are independent of $\beta$, $v$, $\sigma_1^2$, $\sigma_2^2$, and $\sigma_v^2$.

(III) Conditional on $\beta$, the $z_{ij}$'s, $p_e$, $\sigma_1^2$, $\sigma_2^2$, and $\sigma_v^2$, the random effects $v_i$, $1 \le i \le m$ are independent and distributed as $N(0, \sigma_v^2)$.

Note that, in (I), both components of the NM have the same mean but different variances; equivalently, this means that the unit-level error, $e_{ij}$, has a NM distribution, in which both components have mean zero but different variances, $\sigma_1^2$ and $\sigma_2^2$, with the larger variance corresponding to the source of the outliers. The prior for the parameters was assumed to be noninformative; sufficient conditions were given to ensure propriety of the posterior. The prior was also chosen carefully so that the conditional distributions used in the Markov chain Monte Carlo computation are simple.

The authors studied empirical, frequentist properties of their proposed Bayesian predictors, and showed that they performed similarly to the robust EBLUP of Sinha and Rao (2009) when outliers were present, and both methods outperformed the HB method of Datta and Ghosh (1991) and the M-quantile method of Chambers and Tzavidis (2006). In the absence of outliers, the proposed method performed similarly to that of Datta and Ghosh (1991). As an application, Chakraborty et al. (2018) reanalyzed the Iowa crops data of Battese et al. (1988). We present their analysis results below as an example.

7.3.2 Real-data example: Iowa crops data

One of the well-known problems in SAE was discussed in Battese et al. (1988), in which the authors presented data from 12 Iowa counties obtained from the 1978 June Enumerative Survey of the U.S. Department of Agriculture as well as from land observatory satellites on crop areas involving corn and soybeans. The objective was to predict the mean hectares of corn and soybeans per segment for the 12 counties using the satellite information. In this paper, the authors introduced, for the first time, the NER model that has since become popular in SAE. The model for the crops data can be expressed as

$$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + v_i + e_{ij}, \tag{7.15}$$

$i = 1, \dots, 12$, $j = 1, \dots, n_i$, where $n_i$ ranges from 1 to 6. Here $i$ represents county and $j$ segment within the county; $y_{ij}$ is the number of hectares of corn (or soybeans); $x_{1ij}$ and $x_{2ij}$ are the number of pixels classified as corn and soybeans, respectively, according to the satellite data. Furthermore, $v_i$ is a county-specific random effect, and $e_{ij}$ is the sampling error.

Table 7.3 Data for Hardin County [Battese et al. (1988)]

    Reported (hectares)      # of Pixels          Mean # of Pixels
    Corn      Soybeans       Corn    Soybeans     Corn      Soybeans
     88.59    102.59         220     262          325.99    177.05
     88.59     29.46         340      87
    165.35     69.28         355     160
    104.00     99.15         261     221
     88.63    143.66         187     345
    153.70     94.49         350     190

It is assumed that the random effects are independent and distributed as $N(0, \sigma_v^2)$, the sampling errors are independent and distributed as $N(0, \sigma_e^2)$, and the random effects and sampling errors are uncorrelated. The primary interest was to estimate the mean hectares of crops, which can be expressed as $\zeta_i = \beta_0 + \beta_1 \bar{x}_{1i(p)} + \beta_2 \bar{x}_{2i(p)} + v_i$, where $\bar{x}_{1i(p)}$ and $\bar{x}_{2i(p)}$ are the population mean numbers of pixels classified as corn and soybeans per segment, respectively, for the $i$th county, which are available.

The corn data are the focus in this analysis. Battese et al. (1988) suggested that the second observation from Hardin county was an outlier. An inspection of the data from that county (see Table 7.3) tells why: the reported corn hectares for the second observation are exactly the same as for the first one, possibly due to a recording error. Battese et al. (1988) recommended simply removing this case from the analysis; also see Datta and Ghosh (1991). A concern raised by Chakraborty et al. (2018) is that "removing any data which may be a non-representative outlier from analysis will result in loss of valuable information about a part of the non-sampled units of the population which may contain outliers."

The latter authors proposed to analyze the corn data using the robust HB method based on the NM distribution, as described above. They compared the results of the analysis using four different methods: the robust HB method, the Sinha-Rao method, the M-quantile method proposed by Chambers and Tzavidis (2006), and the Datta-Ghosh method. The first three methods were developed on the basis of robustness against outliers (see the next section for more details about the Sinha-Rao and M-quantile methods); the Datta-Ghosh method was known to be not robust against outliers [Chakraborty et al. (2018)].

The results show that, for the first 11 counties, which do not contain any potential outlier, there is a close agreement among the robust HB method, the Sinha-Rao method, and the Datta-Ghosh method in the estimated county mean hectares. This suggests that the robust HB method performs similarly to the well-established methods, such as Datta-Ghosh,

when no outlier is present. On the other hand, for the 12th county, which contains the outlier, the estimated county mean hectares based on the robust HB method and Sinha-Rao method are very similar; they are, however, very different from that based on the Datta-Ghosh method. This agrees with the expectation, in view of the non-robustness of the Datta-Ghosh method against outliers. As for the M-quantile method, it was found that for the first three counties the estimates based on the M-quantile method are widely different from those based on the robust HB and Sinha-Rao methods. Chakraborty et al. (2018) suggest that this indicates some potential bias of the M-quantile estimates.

7.3.3 Bayesian empirical likelihood

Chaudhuri and Ghosh (2011) proposed a Bayesian empirical likelihood (EL) approach to semiparametric Bayesian inference. EL [Owen (1988); also see Owen (2001)] is a method of estimation that relies on fewer assumptions than the traditional maximum likelihood method. To introduce the Bayesian EL method, let us begin with the hierarchical model that, given $\eta_1, \dots, \eta_m$, observations $y_1, \dots, y_m$ are independent with

$$y_i \mid \eta_i \sim \exp\left\{\frac{\eta_i y_i - b(\eta_i)}{\phi_i} + c(y_i, \phi_i)\right\}, \tag{7.16}$$

$$\theta_i = x_i'\beta + v_i, \quad E(v_i \mid \beta, A) = 0, \quad \mathrm{var}(v_i \mid \beta, A) = A, \tag{7.17}$$

where $\theta_i = h(\eta_i)$ for some function $h(\cdot)$. (7.16) is in the form of an exponential family [McCullagh and Nelder (1989)]. Let $w_i$, $1 \le i \le m$ denote the possible jumps in the empirical distribution of $y_i$, $1 \le i \le m$. For a given $\theta = (\theta_i)_{1 \le i \le m}$, the EL is defined as $L = \prod_{i=1}^m w_i$ with the constraints $w_i \ge 0$, $1 \le i \le m$, $\sum_{i=1}^m w_i = 1$, and that

$$\sum_{i=1}^m w_i\{y_i - \mu(\theta_i)\} = 0, \quad \sum_{i=1}^m w_i\left[\frac{\{y_i - \mu(\theta_i)\}^2}{V(\theta_i)} - 1\right] = 0, \tag{7.18}$$

for specified functions $\mu(\cdot)$ and $V(\cdot)$ (Exercise 7.6). Note that (7.18) corresponds to the mean and variance under the empirical distribution.

The next step is to specify a prior distribution for the random effects, $v = (v_i)_{1 \le i \le m}$, and parameters $\beta$ and $A$. Any proper prior for these leads to a proper prior for $\theta$ through (7.17). Chaudhuri and Ghosh showed that, if a prior for $\theta$ is proper, the corresponding posterior is also proper. The posterior is evaluated via Markov chain Monte Carlo.

Now consider the case where the data are divided into independent clusters, $y_{ij}$, $j = 1, \dots, n_i$, for $i = 1, \dots, m$. The exponential family (7.16)

is now replaced by

$$y_{ij} \mid \eta_{ij} \sim \exp\left\{\frac{\eta_{ij} y_{ij} - b(\eta_{ij})}{\phi_{ij}} + c(y_{ij}, \phi_{ij})\right\}$$

with $\theta_{ij} = h(\eta_{ij}) = x_{ij}'\beta + v_i$. For a given $\theta = (\theta_{ij})$, the EL is defined as $L = \prod_{i=1}^m \prod_{j=1}^{n_i} w_{ij}$ with the constraints $w_{ij} \ge 0$, $\sum_{i,j} w_{ij} = 1$, and

$$\sum_j w_{ij}\{y_{ij} - \mu(\theta_{ij})\} = 0, \quad \sum_j w_{ij}\left[\frac{\{y_{ij} - \mu(\theta_{ij})\}^2}{V(\theta_{ij})} - 1\right] = 0,$$

for each $1 \le i \le m$, where the functions $\mu(\cdot)$ and $V(\cdot)$ are as in (7.18).

Intuitively, because EL avoids full parametric model assumptions, the method is potentially more robust to departure from model assumptions. However, the robustness feature was not fully explored in Chaudhuri and Ghosh (2011) although, through an application to median family income data, the authors showed that the EL approach produced better estimates than the Current Population Survey estimates [e.g., Ghosh et al. (1996)] and the standard HB estimates.

7.3.4 Bayesian model diagnostics

Yan and Sedransk (2007) and Yan and Sedransk (2010) discussed problems associated with Bayesian model diagnostics. The specific aspect of model inadequacy has to do with fitting a model that does not account for all of the hierarchical structures that are present. The authors proposed two testing procedures based on the predictive posterior distribution,

$$f(\tilde{y} \mid y) = \int f(\tilde{y} \mid \theta)\, p(\theta \mid y)\, d\theta. \tag{7.19}$$

In (7.19), $y$ denotes the observed data and $\tilde{y}$ the future data, $f(\cdot \mid \theta)$ is the pdf given $\theta$, the parameter vector, and $p(\cdot \mid y)$ is the posterior. It is assumed that $\tilde{y}$ and $y$ are independent given $\theta$. Note that (7.19) is very similar to the predictive likelihood, (4.45), except that now $\tilde{y}$ and $y$ are different [while $\tilde{y} = y$ in (4.45)].

Yan and Sedransk (2007) developed two procedures for the diagnostics. The first is to compute the posterior predictive p-value, defined as

$$p_{ij} = P(\tilde{y}_{ij} \le y_{ij} \mid y), \tag{7.20}$$

where $\tilde{y}_{ij}$, $y_{ij}$ are the $(i, j)$ components of $\tilde{y}$, $y$, respectively. The second procedure is to compute the p-value of a diagnostic statistic, $t(\cdot)$, that is, $P[t(\tilde{y}) \ge t(y) \mid y]$.
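In practice, both p-values are typically approximated by Monte Carlo. The sketch below is an illustration under stated assumptions rather than the procedure of Yan and Sedransk (2007): it assumes that draws from the predictive posterior (7.19) are already available (for example, by drawing $\theta$ from the posterior and then $\tilde{y}$ from $f(\cdot \mid \theta)$), that the observations are flattened to a single index, and the function names are hypothetical.

```python
from statistics import mean


def posterior_predictive_pvalues(y_obs, y_rep_draws):
    """Monte Carlo estimate of (7.20) for every observation.

    y_obs       : observed data, flattened to one list
    y_rep_draws : list of replicated data sets, each the same length as
                  y_obs, assumed drawn from the predictive posterior (7.19)
    """
    n_draws = len(y_rep_draws)
    return [sum(draw[k] <= y_k for draw in y_rep_draws) / n_draws
            for k, y_k in enumerate(y_obs)]


def statistic_pvalue(y_obs, y_rep_draws, stat):
    """Monte Carlo estimate of P[t(y_rep) >= t(y_obs) | y] for t = stat."""
    t_obs = stat(y_obs)
    return sum(stat(draw) >= t_obs for draw in y_rep_draws) / len(y_rep_draws)


# Usage with a toy set of four replicated data sets of two observations.
y_obs = [1.0, 5.0]
draws = [[0.0, 6.0], [2.0, 4.0], [1.0, 7.0], [0.5, 5.0]]
p_values = posterior_predictive_pvalues(y_obs, draws)   # [0.75, 0.5]
p_mean = statistic_pvalue(y_obs, draws, mean)           # 0.75
```

The individual values returned by `posterior_predictive_pvalues` are the quantities that would be compared with the uniform distribution in a Q-Q plot; any callable summary can be passed as `stat`.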

Examples of $t(\cdot)$ include the sample mean, median, and standard deviation. We consider an example.

Example 7.4. Consider a simple model in which $y_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, k$ are independent and distributed as $N(\mu, \tau^2)$, given $\mu$, $\tau^2$. It can be shown that, if the assumed model is correct, then, as $N = mk \to \infty$, the distributions of $y$ and $\tilde{y} \mid y$ are approximately the same; hence, the p-value (7.20) has a Uniform$[0, 1]$ distribution (Exercise 7.7). This suggests a graphical way of checking the model by means of a Q-Q plot of the distribution of $p_{ij}$ against the uniform distribution. For example, suppose that the true underlying distribution has an extra hierarchy, that is, $y_{ij} \mid \mu_i, \tau^2 \sim N(\mu_i, \tau^2)$ and $\mu_i \mid \sigma^2, \tau^2 \sim N(\mu, \sigma^2)$; the conditional independence assumption is unchanged. Yan and Sedransk (2007) showed that, in this case, the previous model is correct in terms of the mean and variance, but not in terms of the within-cluster covariance. Thus, if the intra-cluster correlation, defined as $\mathrm{cov}(y_{ij}, y_{ik})/\mathrm{var}(y_{ij})$ for $j \ne k$, is sufficiently high, the Q-Q plot would detect such a difference.

In a subsequent work, Yan and Sedransk (2010) proposed another method of detecting a missing hierarchical structure, which is based on the Q-Q plot of the predictive standardized residuals,

$$r_{ij} = \frac{y_{ij} - E(\tilde{y}_{ij} \mid y)}{\{\mathrm{var}(\tilde{y}_{ij} \mid y)\}^{1/2}}.$$

The authors showed that the new procedure works in a way similar to that described in Example 7.4.

7.4 More about outliers

The work of Sinha and Rao (2009) was mentioned in Subsections 7.3.1 and 7.3.2. We now describe their method in further detail. The goal was to study the impact of "representative outliers" on the normality-based EBLUP (see Section 5.1). A representative outlier is defined as [Chambers (1986)] a "sample element with a value that has been correctly recorded and cannot be regarded as unique" that "there is no reason to assume that there are no more similar outliers in the nonsampled part of the population". Although the EBLUPs are efficient under the assumed Gaussian mixed model, they are sensitive to outliers that deviate from the assumed model. Such outliers exist in practice, because the Gaussian model may never hold exactly.

To robustify the EBLUP, Sinha and Rao focused on the likelihood equation, derived under normality. Consider the longitudinal model (see Subsection

3.1.2). The likelihood equation can be written as

$$\sum_{i=1}^m X_i' V_i^{-1}(y_i - X_i\beta) = 0, \tag{7.21}$$

$$\sum_{i=1}^m \left\{(y_i - X_i\beta)' V_i^{-1} \frac{\partial V_i}{\partial \psi_k} V_i^{-1}(y_i - X_i\beta) - \mathrm{tr}\left(V_i^{-1}\frac{\partial V_i}{\partial \psi_k}\right)\right\} = 0, \quad k = 1, \dots, q, \tag{7.22}$$

where $\psi = (\psi_k)_{1 \le k \le q}$ is the vector of variance components involved in the $V_i$s. Sinha and Rao propose to robustify the EBLUP by robustifying the likelihood equation. Write $y_i - X_i\beta$ as $U_i^{1/2} U_i^{-1/2}(y_i - X_i\beta) = U_i^{1/2} r_i$, where $U_i$ is a diagonal matrix with diagonal elements equal to those of $V_i$. Replacing $r_i$ by $\psi(r_i)$, where $\psi(\cdot)$ is the vector-valued function with dimension $n_i = \dim(y_i)$ and component function $\psi_h(\cdot)$, and $\psi_h$ is Huber's $\psi$-function [Huber (1964)], and adding a matrix $K_i$ inside the trace in (7.22) (to be explained below), one has the robust versions of (7.21) and (7.22):

$$\sum_{i=1}^m X_i' V_i^{-1} U_i^{1/2} \psi(r_i) = 0, \tag{7.23}$$

$$\sum_{i=1}^m \left\{\psi'(r_i) U_i^{1/2} V_i^{-1} \frac{\partial V_i}{\partial \psi_k} V_i^{-1} U_i^{1/2} \psi(r_i) - \mathrm{tr}\left(K_i V_i^{-1}\frac{\partial V_i}{\partial \psi_k}\right)\right\} = 0, \quad k = 1, \dots, q. \tag{7.24}$$

In (7.24), $K_i$ is a diagonal matrix chosen as $K_i = c I_{n_i}$, where $c = E\{\psi_h^2(Z)\}$ with $Z \sim N(0, 1)$. The robustified ML estimator of $(\beta, \psi)$ is defined as the solution to (7.23) and (7.24). Sinha and Rao established asymptotic normality of the robustified ML estimator, which may be viewed as an M-estimator [Huber (1981)]. In terms of measure of uncertainty for the robust EBLUP, Sinha and Rao adopted a parametric bootstrap approach, and studied its empirical performance.

Also mentioned earlier (see Subsection 7.3.2) was the M-quantile method of Chambers and Tzavidis (2006). Quantiles have been used in statistics as alternatives to the mean as measures of location. One particular case is the median of, say, a data set, which corresponds to a point such that 50% of the data are less than it. In the context of SAE, Chambers and Tzavidis (2006) proposed a quantile-based approach. The intention was to offer an alternative to the random effects approach as a way to model the between-area variation. Motivated by the quantile regression [Koenker and Bassett

(1978)], the authors defined the M-quantile as a quantity, $Q = Q_q(x; \psi)$, that satisfies

$$\int \psi_q(y - Q) f(y \mid x)\, dy = 0,$$

where $q$ is a given number in $(0, 1)$ and

$$\psi_q(r) = 2\psi(r/s)\{q 1_{(r > 0)} + (1 - q) 1_{(r \le 0)}\},$$

$\psi(\cdot)$ being an influence function, and $s$ is a "suitable robust estimator of scale". Here $y$ and $x$ denote the response and covariate, respectively. The authors argued that the use of M-quantile instead of standard quantile regression is mainly due to practical reasons, because the M-quantile regression is easier to fit by utilizing an iteratively reweighted least squares algorithm.

In applying the M-quantiles to SAE, the authors define a unit-level M-quantile coefficient, $q_i$, via the equation $Q_{q_i}(x_i; \psi) = y_i$, where $x_i$ and $y_i$ are $x$ and $y$ for the $i$th unit. The area-specific M-quantile coefficient is then defined as the average of the unit-level ones. Estimates of the area M-quantile coefficients are obtained by fitting the model with sample M-quantiles. The authors claimed that a main advantage of their M-quantile model is that it allows for outlier-robust inference using "widely available M-estimation software". The authors tried to link their area-specific M-quantile coefficient to the random effect by arguing that, for example, if all area M-quantile coefficients are equal to 0.5, one concludes "there is no between-area variation beyond that explained by the model covariates". In terms of measure of uncertainty, the authors suggested using the mean squared error (MSE). This seems a bit contradictory, as the point was thought to be to avoid using the mean-based approach, which leads to the best predictor under the MSE. Also, unlike the mean-based approach, such as the EBLUP, there was no optimality consideration under the M-quantile framework, whose theoretical foundation appears to be lacking.

7.5 Benchmarking

Benchmarking is a technique that is often used, especially in the context of surveys. For example, Pfeffermann (2013) wrote, in his review of "new important developments" in SAE: "Benchmarking robustifies the inference

by forcing the model-based predictors to agree with a design-based estimator for an aggregate of the areas for which the design-based estimator is reliable". A benchmarking equation typically looks like the following:

$$\sum_{i=1}^m w_i \hat\theta_i = \sum_{i=1}^m w_i y_i, \tag{7.25}$$

where $\hat\theta_i$ is a model-based predictor of small area mean for the $i$th small area, and $y_i$ is a design-based estimator for the same small area mean; the $w_i$'s are known weights.

Unfortunately, the standard predictors, such as the EBLUP, do not necessarily satisfy the benchmarking equation, (7.25). In order to handle this problem, two main approaches have been proposed. The first approach is to make some adjustment of the EBLUP to force it to satisfy the benchmarking condition. A ratio-benchmarked predictor is defined as

$$\hat\theta_{rb,i} = \hat\theta_i\, \frac{\sum_{i=1}^m w_i y_i}{\sum_{i=1}^m w_i \hat\theta_i}. \tag{7.26}$$

A difference-benchmarked predictor is defined as

$$\hat\theta_{db,i} = \hat\theta_i + \left(\sum_{i=1}^m w_i y_i - \sum_{i=1}^m w_i \hat\theta_i\right). \tag{7.27}$$

It is easy to see that both $\hat\theta_{rb,i}$, $1 \le i \le m$ and $\hat\theta_{db,i}$, $1 \le i \le m$ satisfy (7.25).

A question of both theoretical and practical interest is regarding the "optimal" adjustment. For example, one can extend (7.27) as

$$\hat\theta_{i,\lambda} = \hat\theta_i + \lambda_i\left(\sum_{i=1}^m w_i y_i - \sum_{i=1}^m w_i \hat\theta_i\right) \tag{7.28}$$

for any constants $\lambda_i$, $1 \le i \le m$ with the resulting predictor, $\hat\theta_{i,\lambda}$, $1 \le i \le m$, still satisfying the benchmarking condition, (7.25), provided that

$$\sum_{i=1}^m \lambda_i w_i = 1. \tag{7.29}$$

The difference-benchmarked predictor is a special case of $\hat\theta_{i,\lambda}$ with $\lambda_i = 1$, $1 \le i \le m$.

In the subsequent discussion, we shall focus on the area-level model, or Fay-Herriot model, which is widely used in SAE. Wang et al. (2008) considered a weighted sum of MSPE, that is,

$$S(\lambda) = \sum_{i=1}^m \phi_i E(\hat\theta_{i,\lambda} - \theta_i)^2, \tag{7.30}$$

where $\phi_i$, $1 \le i \le m$ are specified positive weights. The authors showed that, under suitable conditions, the optimal choice of $\lambda$, in the sense of minimizing $S(\lambda)$, is given by (Exercise 7.8)

$$\lambda_i = \frac{w_i/\phi_i}{\sum_{j=1}^m w_j^2/\phi_j}, \quad 1 \le i \le m. \tag{7.31}$$

It is also easy to see that the $\lambda_i$s defined by (7.31) satisfy (7.29); therefore, the benchmarking condition (7.25) is satisfied.

Another approach to benchmarking is called self benchmarking. The idea is to make some modification in the estimation procedure so that the resulting predictors automatically satisfy the benchmarking condition. You and Rao (2002) proposed to modify the BLUE of $\beta$ [see (5.4)] so that the resulting EBLUP is self benchmarking. Another interesting approach was proposed by Wang et al. (2008), in which the authors introduced the following augmented Fay-Herriot model:

$$y_i = x_i'\beta + \beta_a w_i D_i + v_i + e_i, \quad i = 1, \dots, m, \tag{7.32}$$

with the same assumptions as for the Fay-Herriot model, where $\beta_a$ is an additional unknown parameter. It can be shown (Exercise 7.9) that the EBLUP, $\hat\theta_i$, of the small area mean, $\theta_i = x_i'\beta + v_i$, $1 \le i \le m$, obtained under the augmented model, (7.32), are self benchmarking. It should be noted that (7.32) is not the true underlying model; it is just a model that one fits to get the EBLUPs. The true underlying model is still the Fay-Herriot model (see Example 5.1).

Bandyopadhyay (2017) extended both approaches of benchmarking the EBLUP described above to benchmarking the OBP (see Chapter 5). We describe below the second approach based on an augmented model (extension of the first approach is relatively straightforward). The augmented model is different from (7.32), expressed as

$$y_i = x_i'\beta + \beta_a w_i (1 - B_i)^{-1} + v_i + e_i, \quad i = 1, \dots, m, \tag{7.33}$$

where $B_i = A/(A + D_i)$; other assumptions are the same as in the Fay-Herriot model. A main difference between (7.33) and (7.32) is that, in (7.33), $B_i$ is not a known quantity, because it depends on the unknown variance $A$. Thus, in a way, model (7.33) is not a linear model, because the mean function depends on a parameter that is involved in the variance function. Nevertheless, for any fixed $A \ge 0$, one can define $X(A) = [X_1\ X_2(A)]$, where $X_1 = (x_i')_{1 \le i \le m}$ and $X_2(A) = [w_i(1 - B_i)^{-1}]_{1 \le i \le m}$, and $\beta^* = (\beta', \beta_a)'$. Then, (7.33) can be written in a matrix form as

$$y = X(A)\beta^* + v + e, \tag{7.34}$$

where $y = (y_i)_{1 \le i \le m}$ and $v$, $e$ are defined similarly. According to (5.11), if $A$ is known, the BPE of $\beta^*$ is given by

$$\tilde\beta^*(A) = \{X(A)'\Gamma^2(A)X(A)\}^{-1} X(A)'\Gamma^2(A)\, y \tag{7.35}$$

with $\Gamma = \mathrm{diag}(1 - B_i, 1 \le i \le m)$. In case that $A$ is unknown (the practical situation), its BPE is obtained by minimizing (5.20), that is,

$$Q^*(A) = \{y - X(A)\tilde\beta^*(A)\}'\Gamma^2(A)\{y - X(A)\tilde\beta^*(A)\} + 2A\,\mathrm{tr}\{\Gamma(A)\} \tag{7.36}$$

over $A \ge 0$. Denote the minimizer of (7.36) by $\hat{A}$. Then, the BPE of $\beta^*$ is given by $\hat\beta^* = \tilde\beta^*(\hat{A})$. One then defines the OBP of $\theta = (\theta_i)_{1 \le i \le m}$ as

$$\hat\theta = \hat{A}\hat{V}^{-1} y + D\hat{V}^{-1} X(\hat{A})\hat\beta^*, \tag{7.37}$$

where $\hat{V} = \hat{A} I_m + D$ with $D = \mathrm{diag}(D_i, 1 \le i \le m)$. It can be shown (Exercise 7.9) that the OBP defined by (7.37), with the $i$th component $\hat\theta_i$, $1 \le i \le m$, are self benchmarking, that is, they automatically satisfy (7.25).

To compare performance of the benchmarked OBP with those of the benchmarked EBLUP, Bandyopadhyay (2017) carried out a simulation study under a setting used by Wang et al. (2008). The data were simulated under the model

$$y_{ij} = x_i'\beta + v_i + e_{ij}, \quad i = 1, \dots, 50, \quad j = 1, \dots, n_i, \tag{7.38}$$

where $x_i = (1, z_i)'$ with $z_i = \mathrm{pop}_i^{0.2} - \overline{\mathrm{pop}^{0.2}}$, with $\mathrm{pop}_i$ being the approximate population size of state $i$ [see Table 1 of Wang et al. (2008)]; the sample sizes $n_i$ are approximately proportional to $\sqrt{\mathrm{pop}_i}$. One then takes $y_i = n_i^{-1}\sum_{j=1}^{n_i} y_{ij}$, hence $e_i = n_i^{-1}\sum_{j=1}^{n_i} e_{ij}$, to imply a Fay-Herriot model: $y_i = x_i'\beta + v_i + e_i$.

The simulation setting is exactly as that of Wang et al. (2008). Namely, the $v_i$s are generated from the $N(0, 1)$ and the $e_{ij}$s from the $N(0, 16)$ distributions, so that $e_i \sim N(0, 16/n_i)$, thus $D_i = 16/n_i$, $1 \le i \le m$; the true $\beta$ is $(6.0, 3.0)'$. (7.38) is the true underlying model; however, it is not the assumed model. The assumed model is (7.38) with $x_i = 1$ (i.e., without $z_i$). In other words, the assumed model is misspecified. The performance of three different types of predictors: (i) predictor without benchmarking; (ii) benchmarking via adjustment; and (iii) benchmarking via augmented model, each with two different methods: EBLUP and OBP, are compared in terms of empirical MSPE based on 10,000 simulation runs. The results are presented in Figure 7.4.

Note that the first column (from the left) of Figure 7.4 corresponds to EBLUP and OBP without benchmarking; the last two columns correspond

233January14,201915:20ws-book9x6RobustMixedModelAnalysisbook4page221OtherTopics221EBLUPEBLUP:YREBLUP:WFQ0.01.02.00.01.02.00.01.02.0OBPOBP:AdjustedOBP:Augmented0.01.02.00.01.02.00.01.02.0Fig.7.4EmpiricalMSPEofPredictorstoEBLUPandOBPwithbenchmarking.Overall,itappearsthatOBPperformsbetterthanEBLUPeitherwithoutbenchmarkingorwithbench-markingviaadjustment.Asforbenchmarkingviaaugmentedmodels,OBPperformsslightlybetterthanEBLUPintermsoftheinterquartilerange.7.6Moreaboutprediction7.6.1PredictionoffutureobservationSofaraspredictionisconcerned,themajorityoftheliteratureonmixedeffectsmodelsfocusesonpredictionofmixedeffects.However,predictionoffutureobservationsarealsoofinterestinpractice.Considertheproblemofpredictingafutureobservationunderanon-GaussianLMM(seeChapter3).Becausenormalityisnotassumed,ourapproachisdistribution-free;thatis,itdoesnotrequireanyspecificas-sumptionaboutthedistributionoftherandomeffectsanderrors.Firstnotethatforthistypeofprediction,itisreasonabletoassumethatafutureobservationisindependentofthecurrentones.Weoffersomeexamples.

Example 7.5. In longitudinal studies, one may be interested in prediction, based on repeated measurements from the observed individuals, of a future observation from an individual not previously observed. It is of less interest to predict another observation from an observed individual, because longitudinal studies often aim at applications to a larger population (e.g., drugs going to the market after clinical trials).

Example 7.6. In surveys, responses may be collected in two steps: in the first step, a number of families are randomly selected; in the second step, some family members (e.g., all family members) are interviewed for each of the selected families. Again, one may be more interested in predicting what happens to a family not selected, because one already knows enough about selected families (especially when all family members in the selected families are interviewed).

Therefore, we assume that a future observation, $y_*$, is independent of the current ones. Then, we have $E(y_*|y)=E(y_*)=x_*'\beta$, so the best predictor is $x_*'\beta$ if $\beta$ is known; otherwise, an empirical best predictor (EBP) is obtained by replacing $\beta$ by an estimator. So the point prediction is fairly straightforward. A question that is often of practical interest but has so far been neglected, for the most part, is that of prediction intervals.

A prediction interval for a single future observation is an interval that has a specified coverage probability for the future observation. In model-based statistical inference, it is assumed that the future observation has a certain distribution. Sometimes, the distribution is specified up to a finite number of unknown parameters, for example, the mean and variance of the normal distribution. Then, a prediction interval may be obtained if the parameters are adequately estimated and the uncertainty in the parameter estimation is suitably assessed. Clearly, such a procedure depends on the underlying distribution in that, if the distributional assumption fails, the prediction interval may be seriously inaccurate, that is, either wider than necessary or not having the claimed coverage probability. An alternative to the parametric method is a distribution-free one, in which one does not assume that the form of the distribution is known. For a review of the literature on prediction intervals for a future observation using either distribution-dependent or distribution-free approaches, see, for example, sec. 2.3.2 of Jiang (2007).

Note that, even if $\beta$ is unknown, it is still fairly easy to obtain a prediction interval for $y_*$ if one is willing to make the assumption that the distributions of the random effects and errors are known up to a vector of variance components. To see this, consider (7.11), where the random effect $v_i$ and error $e_{ij}$ are independent such that $v_i\sim N(0,\sigma^2)$ and $e_{ij}\sim N(0,\tau^2)$. It follows that the distribution of $y_{ij}$ is $N(x_{ij}'\beta,\sigma^2+\tau^2)$. Because methods are well developed for estimating fixed parameters such as $\beta$, $\sigma^2$, and $\tau^2$ (see Chapter 5), a prediction interval with asymptotic coverage probability $1-\rho$ is easy to obtain. However, it is much more difficult if one does not know the forms of the distributions of the random effects and errors, or one is not willing to make specific distributional assumptions about the random effects and errors due to robustness concerns. This is the case that we consider below.

According to Chapter 3, for consistency of REML or ML estimators of the fixed effects and variance components, one does not need to assume that the random effects and errors are normally distributed. An LMM is said to be standard if it can be expressed as (1.1), where each $Z_r$ ($1\le r\le s$) consists only of 0s and 1s, with exactly one 1 in each row and at least one 1 in each column. The methods described below were developed by Jiang and Zhang (2002), who treat the standard and non-standard cases quite differently.

For a standard LMM, the method is surprisingly simple. First, one throws away the middle terms in (1.1) that involve the random effects, and pretends that it is a linear regression model with i.i.d. errors, $y=X\beta+\epsilon$. Next, one computes the ordinary least squares (OLS) estimator, $\hat\beta=(X'X)^{-1}X'y$, and the residuals, $\hat\epsilon=y-X\hat\beta$. Let $\hat a$ and $\hat b$ be the $\rho/2$ and $1-\rho/2$ quantiles of the residuals. Then, a prediction interval for $y_*$ with asymptotic coverage probability $1-\rho$ is

$$[\hat y_*+\hat a,\ \hat y_*+\hat b], \tag{7.39}$$

where $\hat y_*=x_*'\hat\beta$. Note that, although the method sounds almost the same as the residual method in linear regression, its justification is not so obvious because, unlike linear regression, the observations under a (standard) LMM are not independent. The method may be improved if one uses more efficient estimators, such as the EBLUE, instead of the OLS estimator.

To see how (7.39) is derived, let $y_*$ be a future observation that one wishes to predict. Suppose that $y_*$ satisfies the same standard LMM. Then, $y_*$ can be expressed as

$$y_*=x_*'\beta+\alpha_{*1}+\cdots+\alpha_{*s}+\epsilon_*,$$

where $x_*$ is a known vector of covariates (not necessarily present with the data), the $\alpha_{*r}$s are random effects, and $\epsilon_*$ is an error, such that $\alpha_{*r}\sim F_r$, $1\le r\le s$, $\epsilon_*\sim F_0$, where the $F$s are unknown distributions, and $\alpha_{*1},\dots,\alpha_{*s},\epsilon_*$ are independent. According to the earlier discussion, $y_*$ is independent of the data, $y=(y_i)_{1\le i\le n}$. It follows that the best (point) predictor of $y_*$, when $\beta$ is known, is $E(y_*|y)=E(y_*)=x_*'\beta$. Because $\beta$ is unknown, it is replaced by a consistent estimator, $\hat\beta$, which may be the OLS estimator or EBLUE. This results in an empirical best predictor:

$$\hat y_*=x_*'\hat\beta. \tag{7.40}$$

Let $\hat\delta_i=y_i-x_i'\hat\beta$. Define

$$\hat F(x)=\frac{1}{n}\#\{1\le i\le n:\hat\delta_i\le x\}=\frac{1}{n}\sum_{i=1}^n 1_{(\hat\delta_i\le x)}. \tag{7.41}$$

Note that, although (7.41) resembles the empirical distribution [e.g., Jiang (2010), sec. 7.1], it is not one in the classic sense, because the $\hat\delta_i$s are not independent (the $y_i$s are dependent, and $\hat\beta$ depends on all the data). Let $\hat a<\hat b$ be any numbers satisfying $\hat F(\hat b)-\hat F(\hat a)=1-\rho$ ($0<\rho<1$). Then, a prediction interval for $y_*$ with asymptotic coverage probability $1-\rho$ is given by (7.39). Note that a typical choice of $\hat a,\hat b$ has $\hat F(\hat a)=\rho/2$ and $\hat F(\hat b)=1-\rho/2$. Another choice would be to select $\hat a$ and $\hat b$ to minimize $\hat b-\hat a$, the length of the prediction interval. Usually, $\hat a,\hat b$ are selected such that the former is negative and the latter positive, so that $\hat y_*$ is contained in the interval. Also note that, if one considers linear regression as a special case of the LMM in which the random effects are zero, $\hat\delta_i$ is the same as $\hat\epsilon_i$, the residual, if $\hat\beta$ is the least squares estimator. In this case, $\hat F$ is the empirical distribution of the residuals, and the prediction interval (7.39) corresponds to that obtained by the bootstrap method [Efron (1979)]. One difference, however, is that our prediction interval is obtained in closed form, rather than by a Monte Carlo method. For more discussion on bootstrap prediction intervals, see, for example, Section 7.3 of Shao and Tu (1995).

Now let us consider a nonstandard LMM (i.e., an LMM that is not standard). First, the method developed for standard models may be applied to some of the nonstandard cases. We illustrate with an example.

Example 7.7. Suppose that the data are divided into two parts. The first part satisfies $y_{ij}=x_{ij}'\beta+\alpha_i+\epsilon_{ij}$, $i=1,\dots,m$, $j=1,\dots,n_i$, where $\alpha_1,\dots,\alpha_m$ are i.i.d. random effects with mean 0 and distribution $F_1$; the $\epsilon_{ij}$s are i.i.d. errors with mean 0 and distribution $F_0$, and the $\alpha$s and $\epsilon$s are independent. The second part satisfies $y_k=x_k'\beta+\epsilon_k$, $k=N+1,\dots,N+K$, where $N=\sum_{i=1}^m n_i$, and the $\epsilon_k$s are i.i.d. errors with mean 0 and distribution $F_0$. Note that the random effects only appear in the first part (hence there is no need to use a double index for the second part).

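For the first part, let the distribution of $\delta_{ij}=y_{ij}-x_{ij}'\beta$ be $F$ ($=F_0*F_1$). For the second part, let $\delta_k=y_k-x_k'\beta$. If $\beta$ were known, the $\delta_{ij}$s ($\delta_k$s) would be sufficient statistics for $F$ ($F_0$). Therefore it suffices to consider an estimator of $F$ ($F_0$) based on the $\delta_{ij}$s ($\delta_k$s). Note that the prediction interval for any future observation is determined either by $F$ or by $F_0$, depending on which part the observation corresponds to. Now, because $\beta$ is unknown, it is customary to replace it by $\hat\beta$. Thus, a prediction interval for $y_*$, a future observation corresponding to the first part, is (7.39), where $\hat y_*=x_*'\hat\beta$, and $\hat a,\hat b$ are determined by $\hat F(\hat b)-\hat F(\hat a)=1-\rho$ with

$$\hat F(x)=\frac{1}{N}\#\{(i,j):1\le i\le m,\ 1\le j\le n_i,\ \hat\delta_{ij}\le x\}$$

and $\hat\delta_{ij}=y_{ij}-x_{ij}'\hat\beta$. Similarly, a prediction interval for $y_*$, a future observation corresponding to the second part, is (7.39), where $\hat y_*=x_*'\hat\beta$, and $\hat a,\hat b$ are determined similarly with $\hat F$ replaced by

$$\hat F_0(x)=\frac{1}{K}\#\{k:N+1\le k\le N+K,\ \hat\delta_k\le x\}$$

and $\hat\delta_k=y_k-x_k'\hat\beta$. The prediction interval has asymptotic coverage probability $1-\rho$ [see Jiang and Zhang (2002)].

Jiang (1998b) considered estimation of the distributions of the random effects and errors in non-Gaussian LMM. His approach is the following. Consider the EBLUP of the random effects [see (5.5)]:

$$\hat\alpha_i=\hat\sigma_i^2Z_i'\hat V^{-1}(y-X\hat\beta),\quad 1\le i\le s,$$

where $\hat\beta$ is the EBLUE of $\beta$. The "EBLUP" of the errors can be defined as

$$\hat\epsilon=y-X\hat\beta-\sum_{i=1}^sZ_i\hat\alpha_i.$$

It was shown that, if the REML or ML estimators of the variance components are used, then, under suitable conditions,

$$\hat F_i(x)=\frac{1}{m_i}\sum_{u=1}^{m_i}1_{(\hat\alpha_{i,u}\le x)}\stackrel{P}{\longrightarrow}F_i(x),\quad x\in C(F_i),$$

where $\hat\alpha_{i,u}$ is the $u$th component of $\hat\alpha_i$, $1\le i\le s$, and

$$\hat F_0(x)=\frac{1}{n}\sum_{u=1}^{n}1_{(\hat\epsilon_u\le x)}\stackrel{P}{\longrightarrow}F_0(x),\quad x\in C(F_0),$$

where $\hat\epsilon_u$ is the $u$th component of $\hat\epsilon$. Here $C(F_i)$ represents the set of all continuity points of $F_i$, $0\le i\le s$.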
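The residual-quantile interval (7.39)–(7.41) for a standard LMM is simple enough to sketch in a few lines. The following is a minimal illustration, not the authors' code: the function name `residual_quantile_interval`, the simulated one-way random effects data, and the choice of skewed random effects and heavy-tailed errors are all assumptions made here for demonstration. Note that `np.quantile` interpolates between order statistics, so $\hat F(\hat b)-\hat F(\hat a)$ equals $1-\rho$ only approximately.

```python
import numpy as np

def residual_quantile_interval(X, y, x_new, rho=0.10):
    """Distribution-free prediction interval (7.39) from OLS residual quantiles."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS estimator of beta
    delta_hat = y - X @ beta_hat                        # residuals delta_hat_i
    # a_hat, b_hat: the rho/2 and 1 - rho/2 quantiles of the residuals
    a_hat, b_hat = np.quantile(delta_hat, [rho / 2, 1 - rho / 2])
    y_hat = float(x_new @ beta_hat)                     # point predictor (7.40)
    return y_hat + a_hat, y_hat + b_hat

# Hypothetical data from a standard LMM (one-way random effects, non-Gaussian)
rng = np.random.default_rng(0)
m, n_i = 200, 5
group = np.repeat(np.arange(m), n_i)
x = rng.normal(size=m * n_i)
X = np.column_stack([np.ones(m * n_i), x])
alpha = rng.exponential(1.0, size=m) - 1.0   # centered, skewed random effects
eps = rng.standard_t(df=5, size=m * n_i)     # heavy-tailed errors
y = X @ np.array([1.0, 2.0]) + alpha[group] + eps

# Interval for a future observation with covariate vector x_* = (1, 0.5)'
lo, hi = residual_quantile_interval(X, y, np.array([1.0, 0.5]))
```

For the two-part model of Example 7.7, the same function could be applied with the residuals restricted to the part that the future observation belongs to.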

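For simplicity, assume that all of the distributions $F_0,\dots,F_s$ are continuous. Let $y_*$ be a future observation we would like to predict. As before, we assume that $y_*$ is independent of $y$ and satisfies an LMM, which can be expressed componentwise as

$$y_i=x_i'\beta+z_{i1}'\alpha_1+\cdots+z_{is}'\alpha_s+\epsilon_i,\quad i=1,\dots,n.$$

This means that $y_*$ can be expressed as

$$y_*=x_*'\beta+\sum_{j=1}^lw_j\gamma_j+\epsilon_*,$$

where $x_*$ is a known vector of covariates (not necessarily present with the data), the $w_j$s are known nonzero constants, the $\gamma_j$s are unobservable random effects, and $\epsilon_*$ is an error. In addition, there is a partition of the indices $\{1,\dots,l\}=\cup_{k=1}^qI_k$, such that $\gamma_j\sim F_{r(k)}$ if $j\in I_k$, where $r(1),\dots,r(q)$ are distinct integers between 1 and $s$ (so $q\le s$); $\epsilon_*\sim F_0$; and $\gamma_1,\dots,\gamma_l,\epsilon_*$ are independent. Define

$$\hat F^{(j)}(x)=m_{r(k)}^{-1}\sum_{u=1}^{m_{r(k)}}1_{(w_j\hat\alpha_{r(k),u}\le x)},\quad\text{if }j\in I_k,$$

for $1\le k\le q$. Let

$$\hat F(x)=(\hat F^{(1)}*\cdots*\hat F^{(l)}*\hat F_0)(x)=\Big(n\prod_{k=1}^qm_{r(k)}^{|I_k|}\Big)^{-1}\#\Big\{(u_1,\dots,u_l,u):\sum_{k=1}^q\sum_{j\in I_k}w_j\hat\alpha_{r(k),u_j}+\hat\epsilon_u\le x\Big\}, \tag{7.42}$$

where $*$ represents convolution [e.g., Jiang (2007), Appendix C], and $1\le u_j\le m_{r(k)}$ if $j\in I_k$, $1\le k\le q$; $1\le u\le n$. It can be shown that

$$\sup_x|\hat F(x)-F(x)|\stackrel{P}{\longrightarrow}0,$$

where $F=F^{(1)}*\cdots*F^{(l)}*F_0$, and $F^{(j)}$ is the distribution of $w_j\gamma_j$, $1\le j\le l$. Note that $F$ is the distribution of $y_*-x_*'\beta$. Let $\hat y_*$ be defined by (7.40) with $\hat\beta$ being a consistent estimator, and $\hat a,\hat b$ defined by $\hat F(\hat b)-\hat F(\hat a)=1-\rho$, where $\hat F$ is given by (7.42). Then, the prediction interval (7.39) has asymptotic coverage probability $1-\rho$.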
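The empirical convolution in (7.42) puts equal mass on every sum formed by picking one value from each $\hat F^{(j)}$ and one predicted error. A minimal sketch under that reading follows; the function name `convolution_quantiles` and the toy inputs are hypothetical, and the EBLUPs and predicted errors are assumed to have been computed elsewhere. Quantiles are obtained by interpolation, so they satisfy $\hat F(\hat b)-\hat F(\hat a)=1-\rho$ only approximately.

```python
import numpy as np

def convolution_quantiles(scaled_effects, resid, probs):
    """Quantiles of the empirical convolution F_hat in (7.42).

    scaled_effects: list of 1-D arrays; the j-th array holds the values
                    w_j * alpha_hat_{r(k),u} over u (the support of F_hat^(j)).
    resid:          1-D array of predicted errors (the support of F_hat_0).
    probs:          probabilities at which quantiles are requested.
    """
    sums = np.asarray(resid, dtype=float)
    for comp in scaled_effects:
        # All pairwise sums: the support of the convolution of two
        # empirical distributions, each point carrying equal mass.
        sums = np.add.outer(sums, np.asarray(comp, dtype=float)).ravel()
    return np.quantile(sums, probs)

# Toy inputs (hypothetical): one random-effect factor with w_1 = 1
alpha_hat = np.array([-1.0, 0.0, 1.0])
eps_hat = np.array([-0.5, 0.5])
a_hat, b_hat = convolution_quantiles([1.0 * alpha_hat], eps_hat, [0.0, 1.0])
```

The number of sums grows as the product of the component sizes, so for realistic data one would likely subsample or use a Monte Carlo approximation; that choice is outside what the text specifies.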

Jiang and Zhang (2002) carried out a simulation study to compare the empirical performance of the prediction interval (7.39) based on the OLS estimator, that based on the EBLUE, and the standard regression prediction interval, which ignores the presence of random effects and treats the data as independent. They found that prediction intervals (7.39) based on either OLS or EBLUE performed better than the standard regression prediction interval. As for the comparison between OLS-based and EBLUE-based prediction intervals (7.39), the two methods performed similarly in this study.

7.6.2 Classified mixed model prediction

Mixed model prediction (MMP) has been a topic throughout this monograph. See, for example, Chapter 5. The traditional fields of application include genetics, agriculture, education, and surveys [e.g., Robinson (1991), Rao and Molina (2015)]. This is a field where frequentist and Bayesian approaches have found common ground. Nowadays, new and challenging problems have emerged from such fields as business and health sciences, in addition to the traditional fields, to which methods of MMP are potentially applicable, but not without further methodological and computational developments. Some of these problems occur when interest is at the subject level (e.g., personalized medicine) or (small) sub-population level (e.g., county), rather than at the large population level (e.g., epidemiology). In such cases, it is possible to make substantial gains in prediction accuracy by identifying a class that a new subject belongs to. This idea was recently implemented by Jiang et al. (2018), who proposed an innovative and interesting method called classified mixed model prediction (CMMP).

As noted in the earlier subsection, there are two types of prediction problems associated with mixed effects models. The first type, which is encountered more often in practice, is prediction of mixed effects; the second type is prediction of future observations. Let us first consider CMMP for mixed effects; we can then use the method to tackle prediction of future observations.

1. Prediction of mixed effects. Suppose that we have a set of training data, $y_{ij}$, $i=1,\dots,m$, $j=1,\dots,n_i$, in the sense that their classifications are known, that is, one knows which group, $i$, that $y_{ij}$ belongs to. The assumed LMM for the training data is

$$y_i=X_i\beta+Z_i\alpha_i+\epsilon_i, \tag{7.43}$$

where $y_i=(y_{ij})_{1\le j\le n_i}$, $X_i=(x_{ij}')_{1\le j\le n_i}$ is a matrix of known covariates, $\beta$ is a vector of unknown regression coefficients (the fixed effects), $Z_i$ is a known $n_i\times q$ matrix, $\alpha_i$ is a $q\times 1$ vector of group-specific random effects, and $\epsilon_i$ is an $n_i\times 1$ vector of errors. It is assumed that the $\alpha_i$s and $\epsilon_i$s are independent, with $\alpha_i\sim N(0,G)$ and $\epsilon_i\sim N(0,R_i)$, where the covariance matrices $G$ and $R_i$ depend on a vector $\psi$ of variance components.

The goal is to make a classified prediction for a mixed effect associated with a set of new observations, $y_{n,j}$, $1\le j\le n_{\rm new}$ (the subscript n refers to "new"). Suppose that

$$y_{n,j}=x_n'\beta+z_n'\alpha_I+\epsilon_{n,j},\quad 1\le j\le n_{\rm new}, \tag{7.44}$$

where $x_n$, $z_n$ are known vectors, and $I\in\{1,\dots,m\}$ but one does not know which element $i$, $1\le i\le m$, is equal to $I$. Furthermore, $\epsilon_{n,j}$, $1\le j\le n_{\rm new}$, are new errors that are independent with $E(\epsilon_{n,j})=0$ and $\mathrm{var}(\epsilon_{n,j})=R_{\rm new}$, and are independent of the $\alpha_i$s and $\epsilon_i$s. Note that the normality assumption is not always needed for the new errors, unless prediction intervals are concerned (see below). Also, the variance $R_{\rm new}$ of the new errors does not have to be the same as the variance of $\epsilon_{ij}$, the $j$th component of $\epsilon_i$ associated with the training data. The mixed effect that we wish to predict is

$$\theta=E(y_{n,j}|\alpha_I)=x_n'\beta+z_n'\alpha_I. \tag{7.45}$$

From the training data, one can estimate the parameters, $\beta$ and $\psi$. For example, one can use standard mixed model analysis to obtain ML or REML estimators (e.g., §3.2). Alternatively, one may use the OBP method (see Chapter 5), which is more robust to model misspecification in terms of predictive performance. Thus, we can assume that estimators $\hat\beta$, $\hat\psi$ are available for $\beta$, $\psi$.

To derive CMMP, let us first assume that there is a match between the random effect, $\alpha_I$, corresponding to the new observations and one of the random effects associated with the training data. This means that $I=i$ for some $1\le i\le m$. It then follows that the vectors $y_1,\dots,y_{i-1},(y_i',\theta)',y_{i+1},\dots,y_m$ are independent. Thus, we have $E(\theta|y_1,\dots,y_m)=E(\theta|y_i)$. By the normal theory [e.g., (5.3)], we have

$$E(\theta|y_i)=x_n'\beta+z_n'GZ_i'(R_i+Z_iGZ_i')^{-1}(y_i-X_i\beta). \tag{7.46}$$

The right side of (7.46) is the BP under the assumed LMM, if the true $\beta$ and $\psi$ are known. Because the latter are unknown, we replace them by $\hat\beta$ and $\hat\psi$, respectively. The result is the EBP, denoted by $\tilde\theta_{(i)}$.

In practice, however, $I$ is unknown and treated as a parameter. In order to identify, or estimate, $I$, we consider the MSPE of $\theta$ by the BP when $I$ is classified as $i$, that is,

$$\mathrm{MSPE}_i=E\{\tilde\theta_{(i)}-\theta\}^2=E\{\tilde\theta_{(i)}^2\}-2E\{\tilde\theta_{(i)}\theta\}+E(\theta^2).$$

Using the expression $\theta=\bar y_n-\bar\epsilon_n$, where $\bar y_n=n_{\rm new}^{-1}\sum_{j=1}^{n_{\rm new}}y_{n,j}$ and $\bar\epsilon_n$ is defined similarly, we have $E\{\tilde\theta_{(i)}\theta\}=E\{\tilde\theta_{(i)}\bar y_n\}-E\{\tilde\theta_{(i)}\bar\epsilon_n\}=E\{\tilde\theta_{(i)}\bar y_n\}$. Thus, we have the expression:

$$\mathrm{MSPE}_i=E\{\tilde\theta_{(i)}^2-2\tilde\theta_{(i)}\bar y_n+\theta^2\}. \tag{7.47}$$

It follows that the observed MSPE corresponding to (7.47) is the expression inside the expectation. Therefore, a natural idea is to identify $I$ as the index $i$ that minimizes the observed MSPE. Because $\theta^2$ does not depend on $i$, the minimizer is given by

$$\hat I=\mathrm{argmin}_i\big\{\tilde\theta_{(i)}^2-2\tilde\theta_{(i)}\bar y_n\big\}. \tag{7.48}$$

The classified mixed-model predictor (CMMP) of $\theta$ is then given by $\hat\theta=\tilde\theta_{(\hat I)}$.

2. Prediction of future observations. Now suppose that the interest is to predict a future observation, $y_f$, that belongs to an unknown group that matches one of the existing groups. First note that, even though the interest is prediction of a future observation, it is impossible, in general, to do better with CMMP if one does not have any other observations that are known to be from the same group as $y_f$. For example, take a look at the simplest case when there are no covariates, that is, $X_i\beta=(\mu,\dots,\mu)'$, $Z_i\alpha_i=(\alpha_i,\dots,\alpha_i)'$ in (7.43), and $x_f'\beta=\mu$, $z_f'\alpha_I=\alpha_I$ in (7.44). In this case, one knows nothing about $y_f$, because (7.44) is no different from (7.43) at the single-component level. Thus, without additional information, one, of course, cannot tell to which group the new observation $y_f$ belongs. Therefore, we shall assume that one has some observation(s) that are known to be from the same group as $y_f$, in addition to the training data. Note that this additional group is different from the training data, because we do not know the classification number of the additional group with respect to the training data groups. Let $y_{n,j}$, $1\le j\le n_{\rm new}$, be the additional observations. For example, the additional observations may be data collected prior to a medical treatment, and the future observation, $y_f$, is the outcome after the medical treatment that one wishes to predict. Suppose that $y_f$ satisfies (7.44), that is, $y_f=x_f'\beta+z_f'\alpha_I+\epsilon_f$, where $\epsilon_f$ is the new error that is independent of the training data. It follows that

$$E(y_f|y_1,\dots,y_m)=E(\theta|y_1,\dots,y_m)+E(\epsilon_f|y_1,\dots,y_m)=E(\theta|y_1,\dots,y_m) \tag{7.49}$$

with $\theta=x_f'\beta+z_f'\alpha_I$. Equation (7.49) shows that the BP for $y_f$ is the same as the BP for $\theta$, which is the right side of (7.46), that is, $\theta_{(i)}$ with $x_n,z_n$ replaced by $x_f,z_f$, respectively, when $I=i$. Suppose that the additional observations satisfy $y_{n,j}=x_{n,j}'\beta+z_{n,j}'\alpha_I+\epsilon_{n,j}$, $1\le j\le n_{\rm new}$. If $x_{n,j},z_{n,j}$ do not depend on $j$ (which includes the special case of $n_{\rm new}=1$), we can treat $y_{n,j}$, $1\le j\le n_{\rm new}$, the same way as the new observations, and identify the classification number, $\hat I$, by (7.48). The CMMP of $y_f$ is then given by the right side of (7.46) with $i=\hat I$, with $\beta,\psi$ replaced by $\hat\beta,\hat\psi$, and with $x_n,z_n$ replaced by $x_f,z_f$, respectively.

A case that is slightly more complicated is when $x_{n,j}$ depends on $j$. For simplicity of illustration, let $z_{n,j}'\alpha_I=\alpha_I$. By treating each $y_{n,j}$ as the new observation (with $n_{\rm new}=1$), and using the CMMP developed in part 1 above, we can obtain the CMMP of the mixed effect associated with $y_{n,j}$, given by $\hat\theta_{n,j}=x_{n,j}'\hat\beta+\hat\alpha_{\hat I(j)}$, where $\hat I(j)$ is the $\hat I$ of (7.48) corresponding to $y_{n,j}$. We do this for each $j=1,\dots,n_{\rm new}$, leading to $\hat\alpha_{\hat I(j)}=\hat\theta_{n,j}-x_{n,j}'\hat\beta$, $1\le j\le n_{\rm new}$. We then take the average, $\bar\alpha_I=n_{\rm new}^{-1}\sum_{j=1}^{n_{\rm new}}\hat\alpha_{\hat I(j)}$. The CMMP of $y_f$ is then given by $\hat y_f=x_f'\hat\beta+\bar\alpha_I$.

3. Robustness of CMMP. Recall that CMMP is derived under the assumption that there is a match between the random effect associated with the new observations and one of the random effects associated with the training data. A nice, and somewhat surprising, feature of CMMP is that, even if this assumption does not hold, that is, the "match" does not exist, one still gains in prediction accuracy by doing CMMP, pretending that there is a match. In this sense, CMMP is robust to failure of the matching assumption. This robust feature makes the CMMP method (much) more applicable because, in practice, an exact match may well not exist.

To illustrate the robustness property of CMMP, Jiang et al. (2018) carried out a simulation study. The training data were generated under the following model:

$$y_{ij}=1+2x_{1,ij}+3x_{2,ij}+\alpha_i+\epsilon_{ij}, \tag{7.50}$$

$i=1,\dots,m$, $j=1,\dots,n$, with $n=5$, $\alpha_i\sim N(0,G)$, $\epsilon_{ij}\sim N(0,1)$, and the $\alpha_i$s and $\epsilon_{ij}$s independent. The $x_{k,ij}$, $k=1,2$, were generated from the $N(0,1)$ distribution, then fixed throughout the simulation. There are $K=10$ new observations, generated under two scenarios. Scenario I: The new observations have the same $\alpha_i$ as the first $K$ groups in the training data ($K\le m$), but independent $\epsilon$s; that is, they have "matches". Scenario II: The new observations have independent $\alpha$s and $\epsilon$s; that is, they are "unmatched". Note that there are $K$ different mixed effects. CMMP was compared with the standard regression prediction (RP) method in predicting each of the $K$ mixed effects. The average simulated MSPEs of the predictions, based on $T=1000$ simulation runs, are reported in Table 7.4. Two cases, $m=10$ and $m=50$, were considered.

Table 7.4 Average MSPE for Prediction of Mixed Effects. %MATCH = % of times that the new observations were matched to some of the groups in the training data.

                              $\sigma_\alpha^2$
m     Scenario   Method    0.1     1       2       3
10    I          RP        0.157   1.002   1.940   2.878
10    I          CMMP      0.206   0.653   0.774   0.836
10    I          %MATCH    91.5    94.6    93.6    93.2
10    II         RP        0.176   1.189   2.314   3.439
10    II         CMMP      0.225   0.765   0.992   1.147
10    II         %MATCH    91.2    94.1    92.6    92.5
50    I          RP        0.112   1.013   2.014   3.016
50    I          CMMP      0.193   0.799   0.897   0.930
50    I          %MATCH    98.7    98.5    98.6    98.2
50    II         RP        0.113   1.025   2.038   3.050
50    II         CMMP      0.195   0.800   0.909   0.954
50    II         %MATCH    98.8    98.7    98.4    98.4

The CMMP method has a certain way of telling whether or not there is a "match"; of course, it would be totally wrong if the actual match does not exist. Nevertheless, the reported %MATCH is the percentage of times, out of the simulation runs, that CMMP thought there was a match, in which case it uses the CMMP, given below (7.48), for prediction; otherwise it uses RP for prediction.

It appears that, regardless of whether the new observations actually have matches or not, CMMP matches them anyway. More importantly, the results show that even a "fake" match still helps. At first, this might sound a little surprising, but it actually makes sense, both practically and theoretically. Think about a business situation. Even if one cannot find a perfect match for a customer, if one can find a group that is somewhat similar, one can still gain in terms of prediction accuracy. This is, in fact, how business decisions are often made. In the simulation study, even if there is no match in terms of the individual random effects, there is at least a "match" in terms of the random effects distribution, that is, the new random effect is generated from the same distribution that has generated the (previous) training-data random effects; therefore, it is not surprising that one can find one among the latter that is close to the new random effect. Comparing RP with CMMP, RP assumes that the mixed effect is $x_i'\beta$ with nothing extra, while CMMP assumes that the mixed effect is $x_i'\beta$ plus something extra. For the new observation, there is, for sure, something extra, so CMMP is right, at least, in that the extra is non-zero; it then selects the best extra from a number of choices, some of which are better than the zero extra that RP is using. Therefore, it is not surprising that CMMP is doing better, regardless of the actual match (which may or may not exist). In fact, the empirical results reported in Table 7.4 are consistent with the theoretical findings, as shown by the following theorem [see Jiang et al. (2018) for detail].

Theorem 7.2. Under regularity conditions, as $m\to\infty$, $(\log m)^{2\nu}/n_{\min}\to 0$, where $n_{\min}=\min_{1\le i\le m}n_i$, and $n_{\rm new}\to\infty$, we have

$$E\{(\hat\theta_n-\theta_n)^2\}\to 0\quad\text{and}\quad\liminf E(\hat\theta_{n,r}-\theta_n)^2\ge\delta$$

for some constant $\delta>0$, where $\theta_n$ is the mixed effect associated with the new observations, $\hat\theta_n$ is the CMMP of $\theta_n$, and $\hat\theta_{n,r}$ is the RP of $\theta_n$.

Note that $m\to\infty$ and $\min_{1\le i\le m}n_i\to\infty$ mean that the information contained in the training data is expanding, while $n_{\rm new}\to\infty$ suggests that the additional information about the mixed effect associated with the new observations is also growing. Intuitively, with these two sources of sufficient information, one can make accurate predictions, and Theorem 7.2 confirms that. Furthermore, it shows that, asymptotically, CMMP does better than RP in terms of MSPE.

3. Prediction intervals. Prediction intervals are of substantial practical interest [e.g., Chatterjee et al. (2008)]. Following the model (7.43), (7.44), we assume, in addition, that the new errors $\epsilon_{n,j}$ are distributed as $N(0,R)$, where $R$ is the same variance as that of $\epsilon_{ij}$ in (7.43). Also write $\alpha_I$ as $\alpha_{\rm new}$. Still, it is not necessary to assume that $\alpha_{\rm new}$ has the same distribution, or even the same variance, as the $\alpha_i$ in (7.43). Also, $\alpha_{\rm new}$ is understood as either (a) a new random effect or (b) identical to one of the $\alpha_i$, $1\le i\le m$. Consider the following prediction interval for $\theta=x_n'\beta+\alpha_{\rm new}$:

$$\left[\hat\theta-z_{a/2}\sqrt{\frac{\hat R}{n_{\rm new}}},\ \hat\theta+z_{a/2}\sqrt{\frac{\hat R}{n_{\rm new}}}\right], \tag{7.51}$$

where $\hat\theta$ is the CMMP of $\theta$, $\hat R$ is the REML estimator of $R$, and $z_a$ is the critical value such that $P(Z>z_a)=a$ for $Z\sim N(0,1)$.

For a future observation, $y_f$, we assume that it shares the same mixed effect as the observed new observations $y_{n,j}$ in (7.44), that is,

$$y_f=\theta+\epsilon_f, \tag{7.52}$$

where $\epsilon_f$ is a new error that is distributed as $N(0,R)$ and independent of all of the $\alpha$s and other $\epsilon$s. We consider the following prediction interval for $y_f$:

$$\left[\hat\theta-z_{a/2}\sqrt{(1+n_{\rm new}^{-1})\hat R},\ \hat\theta+z_{a/2}\sqrt{(1+n_{\rm new}^{-1})\hat R}\right], \tag{7.53}$$

where $\hat\theta,\hat R$ are the same as in (7.51). Jiang et al. (2018) proved that, under regularity conditions, (7.51) and (7.53) have asymptotic coverage probability $1-a$ for $\theta$ and $y_f$, respectively.

4. Classified mixed logistic model prediction. Sun et al. (2018) extended the CMMP method to binary observations under a mixed logistic model. The model assumes that, given the subject-specific random effects, $\alpha_1,\dots,\alpha_m$, the binary responses $y_{ij}$, $i=1,\dots,m$, $j=1,\dots,n_i$, are conditionally independent with the conditional probability satisfying

$$P(y_{ij}=1|\alpha)=p_{ij}\quad\text{with}\quad\mathrm{logit}(p_{ij})=x_{ij}'\beta+\alpha_i, \tag{7.54}$$

where $\mathrm{logit}(p)=\log\{p/(1-p)\}$. Here $i$ is the index for subject (e.g., patient, or group of patients), and $j$ is the index for observation within the subject (e.g., observation collected at the $j$th time point, or observation collected from the $j$th patient in the subject group); $x_{ij}$ is a vector of observed covariates, and $\beta$ is a vector of unknown fixed effects. Furthermore, $\alpha_i$, $1\le i\le m$, are random effects, assumed to be independent and distributed as $N(0,\sigma^2)$, where $\sigma^2$ is an unknown variance. For any known vector $x$ and function $g(\cdot)$, the BP of the mixed effect, $\theta=g(x'\beta+\alpha_i)$, is given by

$$\tilde\theta=E(\theta|y)=\frac{E\big[g(x'\beta+\sigma\xi)\exp\{y_{i\cdot}\sigma\xi-\sum_{j=1}^{n_i}\log(1+e^{x_{ij}'\beta+\sigma\xi})\}\big]}{E\big[\exp\{y_{i\cdot}\sigma\xi-\sum_{j=1}^{n_i}\log(1+e^{x_{ij}'\beta+\sigma\xi})\}\big]}, \tag{7.55}$$

where $y_{i\cdot}=\sum_{j=1}^{n_i}y_{ij}$, and the expectations are taken with respect to $\xi\sim N(0,1)$. Two special cases of (7.55) are the following.

(i) If the covariates are at the cluster level, that is, $x_{ij}=x_i$, and $g(u)=\mathrm{logit}^{-1}(u)=e^u/(1+e^u)$, then (7.55) reduces to

$$\tilde p=\frac{E\big[\mathrm{logit}^{-1}(x'\beta+\sigma\xi)\exp\{y_{i\cdot}\sigma\xi-n_i\log(1+e^{x_i'\beta+\sigma\xi})\}\big]}{E\big[\exp\{y_{i\cdot}\sigma\xi-n_i\log(1+e^{x_i'\beta+\sigma\xi})\}\big]}, \tag{7.56}$$

which is the BP of $p=\mathrm{logit}^{-1}(x'\beta+\alpha_i)$. Note that, in this case, the mixed effect is a subject-specific (conditional) probability, such as the probability of hemorrhage complication of the AT treatment in the ECMO problem discussed in the sequel, for a specific patient.

(ii) If $x=0$ and $g(u)=u$, (7.55) reduces to

$$\tilde\alpha_i=\sigma\,\frac{E\big[\xi\exp\{y_{i\cdot}\sigma\xi-\sum_{j=1}^{n_i}\log(1+e^{x_{ij}'\beta+\sigma\xi})\}\big]}{E\big[\exp\{y_{i\cdot}\sigma\xi-\sum_{j=1}^{n_i}\log(1+e^{x_{ij}'\beta+\sigma\xi})\}\big]}, \tag{7.57}$$

which is the BP of $\alpha_i$, the subject-specific (e.g., hospital) random effect.

In (7.55)–(7.57), $\beta$ and $\sigma$ are understood as the true parameters, which are typically unknown in practice. It is then customary to replace $\beta,\sigma$ by their consistent estimators. The results are called empirical BP, or EBP. It is assumed that the sample size for the training data is sufficiently large that the EBP is approximately equal to the BP [Jiang and Lahiri (2001)]. Here, by training data we refer to the data $y_{ij}$, $1\le i\le m$, $1\le j\le n_i$, described above that satisfy the assumed mixed logistic model.

The main interest is to predict a mixed effect that is associated with a set of new observations. More specifically, let the new, binary observations be $y_{n,k}$, $k=1,\dots,n_{\rm new}$, and the corresponding covariates be $x_{n,k}$, $k=1,\dots,n_{\rm new}$, such that, conditional on a random effect $\alpha_I$ that has the same $N(0,\sigma^2)$ distribution, $y_{n,k}$, $1\le k\le n_{\rm new}$, are independent with

$$P(y_{n,k}=1|\alpha_I)=p_{n,k}\quad\text{and}\quad\mathrm{logit}(p_{n,k})=x_{n,k}'\beta+\alpha_I, \tag{7.58}$$

where $\beta$ is the same as in (7.54). Typically, the sample size, $n_{\rm new}$, for the new observations is limited. If one relies only on the new observations to estimate the mixed effect, say, $p_{n,k}$ for a given $k$, the available information is limited. Luckily, one has much more than just the new observations. It would be beneficial if one could "borrow strength" from the training data, which are much larger in size. For example, if one knows that $I=i$, then there is a much larger cluster in the training data, namely, $y_{ij}$, $j=1,\dots,n_i$, corresponding to the same cluster-specific random effect, $\alpha_I$. This cluster in the training data is much larger because, quite often, $n_i$ is much larger than $n_{\rm new}$. One can also utilize the training data to estimate the unknown parameters, $\beta$ and $\sigma$, which would be much more accurate than using only the new observations. As noted, with accurate estimation of the parameters, the EBP will closely approximate the BP [Jiang and Lahiri (2001)]. Thus, potentially one has a lot more information that can be used to estimate the mixed effect of interest associated with $\alpha_I$.

The difficulty is, however, that $I$ is unknown. In fact, at this point, one does not know the answer to either of the following questions: (I) is there a "match" between $I$ and one of the $1\le i\le m$ corresponding to the training data clusters? and (II) if there is, which one?
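The expectations over $\xi\sim N(0,1)$ in (7.56) have no closed form, but the ratio can be approximated numerically. The sketch below uses Gauss-Hermite quadrature, which is one possible choice and not necessarily what the cited authors used; the function name `bp_probability`, its arguments, and the node count `deg` are assumptions for illustration. Here `eta_x` stands for $x'\beta$ (the covariates of the probability being predicted) and `eta_i` for $x_i'\beta$ (the covariates of training cluster $i$); plugging in estimates of $\beta$ and $\sigma$ would give the EBP.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def bp_probability(y_sum, n_i, eta_x, eta_i, sigma, deg=40):
    """Approximate the BP (7.56) of p = logit^{-1}(eta_x + alpha_i).

    y_sum: y_i. = sum of the binary responses in cluster i
    n_i:   cluster size
    eta_x: x' beta; eta_i: x_i' beta; sigma: random-effect standard deviation
    """
    # Nodes and weights for the weight function exp(-xi^2 / 2); the
    # normalizing constant sqrt(2*pi) cancels in the ratio below.
    xi, w = hermegauss(deg)
    # log of exp{ y_i. * sigma * xi - n_i * log(1 + e^{eta_i + sigma * xi}) }
    log_kernel = y_sum * sigma * xi - n_i * np.logaddexp(0.0, eta_i + sigma * xi)
    kernel = w * np.exp(log_kernel - log_kernel.max())   # rescaled for stability
    p = 1.0 / (1.0 + np.exp(-(eta_x + sigma * xi)))      # logit^{-1}
    return float(np.sum(p * kernel) / np.sum(kernel))

# Hypothetical cluster: 8 successes out of 10, with x' beta = x_i' beta = 0
p_tilde = bp_probability(y_sum=8, n_i=10, eta_x=0.0, eta_i=0.0, sigma=1.0)
```

In this toy case the BP falls between the marginal value 0.5 and the raw cluster proportion 0.8, which is the usual shrinkage behavior of a best predictor.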

It turns out that, as in the development of CMMP, the answer to (I) does not really matter, so far as prediction of the mixed effect is concerned. In other words, even if the actual match does not exist, a CMMP procedure based on the false match still helps in improving prediction accuracy of the mixed effects. Thus, we can simply focus on (II). To illustrate the method, which is called classified mixed logistic model prediction (CMLMP), as an extension of CMMP, let us consider a special case where the covariates are at the cluster level, that is, $x_{ij}=x_i$ for all $i,j$. Similarly, the covariates for the new observations are also at the cluster level, that is, $x_{n,k}=x_n$. First assume that there is a match between $I$, the index for the random effect associated with the new observations, and one of the indexes, $1\le i\le m$, associated with the training-data random effects. However, this match is unknown to us. Thus, as a first step, we need to identify the match, that is, an index $\hat I\in\{1,\dots,m\}$ computed from the data, which may be viewed as an estimator of $I$.

Suppose that $I=i$. Then, by (7.56), the BP of $p_n=P(y_{n,k}=1|\alpha_I)=\mathrm{logit}^{-1}(x_n'\beta+\alpha_I)=\mathrm{logit}^{-1}(x_n'\beta+\alpha_i)$ is

$$\tilde p_{n,i}=\frac{E\big[\mathrm{logit}^{-1}(x_n'\beta+\sigma\xi)\exp\{y_{i\cdot}\sigma\xi-n_i\log(1+e^{x_i'\beta+\sigma\xi})\}\big]}{E\big[\exp\{y_{i\cdot}\sigma\xi-n_i\log(1+e^{x_i'\beta+\sigma\xi})\}\big]}. \tag{7.59}$$

In (7.59), the parameters $\beta,\sigma$ are understood as the true parameters, which are unknown in practice. If we replace these parameters by their consistent estimators, such as the ML or GEE estimators [e.g., Jiang (2007), sec. 4.2; also see Chapter 2] based on the training data, we obtain the EBP of $p_n$, denoted by $\hat p_{n,i}$. On the other hand, an "observed" $p_n$ is the sample proportion, $\bar y_n=n_{\rm new}^{-1}\sum_{k=1}^{n_{\rm new}}y_{n,k}$. Our idea is to identify $I$ as the index $1\le i\le m$ that minimizes the distance between $\hat p_{n,i}$ and $\bar y_n$, that is,

$$\hat I=\mathrm{argmin}_{1\le i\le m}|\hat p_{n,i}-\bar y_n|. \tag{7.60}$$

The classified mixed logistic model predictor (CMLMP) of $p_n$ is then $\hat p_{n,\hat I}$.

Although the above development is based on the assumption that a match exists between the random effect corresponding to the new observations and one of the random effects associated with the training data, it was shown [Sun et al. (2018)], both theoretically and empirically, that CMLMP enjoys a similar nice behavior to CMMP, that is, even if the actual match does not exist, CMLMP still gains in prediction accuracy compared to the standard logistic regression prediction (SLRP). Furthermore, CMLMP is consistent in predicting the mixed effect associated with the new observations as the size of the training data grows and so does the additional information from the new observations. Sun et al. (2018) also developed a method of estimating the MSPE of CMLMP as a measure of uncertainty.

5. An application. We conclude this section with an example of a real-data application. Thromboembolic or hemorrhagic complications [e.g., Glass et al. (1997)] occur in as many as 60% of patients who underwent extracorporeal membrane oxygenation (ECMO), an invasive technology used to support children during periods of reversible heart or lung failure [e.g., Muntean (2002)]. Over half of pediatric patients on ECMO are currently receiving antithrombin (AT) to maximize heparin sensitivity. In a retrospective, multi-center, cohort study of children ($\le 18$ years of age) who underwent ECMO between 2003 and 2012, 8,601 subjects participated in 42 free-standing children's hospitals across 27 U.S. states and the District of Columbia, known as the Pediatric Health Information System (PHIS). Data were de-identified prior to inclusion in the study dataset; however, encrypted medical record numbers allowed for tracking of individuals across multiple hospitalizations. Many of the outcome variables were binary, such as the bleed_binary variable, which is a main outcome variable indicating hemorrhage complication of the treatment, and the DischargeMortalit1Flag variable, which is associated with mortality. Here the treatment refers to AT.

Prediction of characteristics of interest associated with the binary outcomes, such as probabilities of hemorrhage complication or those of mortality for specific patients, is of considerable interest. Note that the data are also potentially clustered, with the clusters corresponding to the children's hospitals. In addition to the treatment indicator, there were 20 other covariate variables for which information was available. See Table 7.5 for a list of covariate variables. We focus on the two outcomes of interest mentioned above, the bleed_binary variable and the DischargeMortalit1Flag variable.

The data include 8,601 patients from 42 hospitals. The numbers of patients in different hospitals range from 3 to 487. We first use a forward-backward (F-B) BIC procedure [e.g., Broman and Speed (2002)] to build a mixed logistic model. Namely, we use a forward selection based on logistic regression to add covariate variables, one by one, until 50% of the variables have been added; we then carry out a backward elimination to drop the variables that have been added, one by one, until all of the variables are dropped. This F-B process generates a sequence of (nested) models, to which the BIC procedure [Schwarz (1978)] is applied to select the model.

Table 7.5 ECMO Data Variable Description

Variable Name              Description
bleed_binary               hemorrhagic (Yes/No)
DischargeMortalit1Flag     mortality at discharge (Yes/No)
LengthOfStay               number of days during hospitalization
MajSurgduringHosp_Count    number of surgeries during hospitalization
MajSurgduringHosp_binary   major surgery during hospitalization (Yes/No)
ecmovol_ind1               ECMO volume: High vs. Low
ecmovol_ind2               ECMO volume: Medium vs. Low
age_ind1                   1: ≤ 30 days vs. 5: ≥ 10 yrs
age_ind2                   2: 31–364 days vs. 5: ≥ 10 yrs
age_ind3                   3: 1–2.9 yrs vs. 5: ≥ 10 yrs
age_ind4                   4: 3–9.9 yrs vs. 5: ≥ 10 yrs
Top5PrincDx                top five common ICD-9 stems [Yes: a patient had at least one of 747, 746, 745, 770, or 756 (see below); No: None]
flag_renal                 flag Renal track (Yes/No)
flag_CV                    flag Cardiovascular (Yes/No)
flag_GI                    flag Gastrointestinal (Yes/No)
flag_hemimm                flag Hematologic/Immunology (Yes/No)
flag_onc                   flag Malignancy (Yes/No)
flag_metab                 flag Metabolic (Yes/No)
flag_neuromusc             flag Neuromuscular (Yes/No)
flag_congengen             flag Other congenital/Genetic (Yes/No)
flag_resp                  respiratory (Yes/No)
ALLecmodays                number of days under ECMO during hospitalization
gender                     Sex
AT                         antithrombin treatment (Yes: at least one dose of antithrombin during hospitalization; No: None)

The F-B BIC procedure leads to a subset of 12 patient-level covariates out of a total of more than 20 covariates. The same 12 covariates were selected for both outcome variables. Specifically, in the selected model, the probability of hemorrhage complication (or mortality) is associated with number of days during hospitalization (LengthOfStay), major surgery during hospitalization (MajSurgduringHosp_binary; Yes/No), whether the patient is no more than 30 days old (age_ind1), whether the patient has had at least one of the following: 747 – Other congenital anomalies of circulatory system; 746 – Other congenital anomalies of heart, excluding endocardial fibroelastosis; 745 – Bulbus cordis anomalies and anomalies of cardiac septal closure; 770 – Other respiratory conditions of fetus and newborn; 756 – Other congenital musculoskeletal anomalies, excluding congenital myotonic chondrodystrophy (Top5PrincDx), whether the patient is flagged for cardiovascular (flag_CV; Yes/No), hematologic/immunology (flag_hemimm; Yes/No), metabolic (flag_metab; Yes/No), neuromuscular (flag_neuromusc; Yes/No), other congenital/genetic (flag_congengen; Yes/No), or respiratory (flag_resp; Yes/No), number of days under ECMO during hospitalization (ALLecmodays), and whether the patient has received the AT treatment (AT; Yes/No).

Out of the 12 patient-level covariates, two are continuous, corresponding to the number of days during hospitalization and the number of days under ECMO during hospitalization; the rest are binary. In addition to the patient-level covariates, there are two hospital-level covariates, namely, the total number of patients during the 10-year study who did receive AT (yesat) and the total number of patients that were included in the 10-year study (total). Both hospital-level covariates are continuous. It should be noted that the four continuous covariates need to be standardized before carrying out the CMLMP analysis.

The proposed mixed logistic model includes the above 12 patient-level covariates as well as the 2 hospital-level covariates, plus a hospital-specific random effect that captures the "uncaptured" as well as between-hospital variation. The mixed effects of interest are probabilities of hemorrhage complication corresponding to bleed_binary, and mortality probabilities associated with DischargeMortalit1Flag, for new observations. Note that, because most of the covariates are at the patient level, these probabilities are patient-specific. However, the responses are clustered with the clusters corresponding to the hospitals, and there are 42 random effects associated with the hospitals under the mixed logistic model.

In order to test the CMLMP method, we randomly select 5 patients from a given hospital and treat these as the new observations. The rest of the hospitals, and the rest of the patients from the same hospital (if any), correspond to the training data. We then use the matching strategy described above, with yesat and total as the cluster-level covariates that are used to identify the group for the new observations, then compute the CMLMP for each of the 5 selected patients. In addition, the MSPE estimates of Sun et al. (2018) were computed, whose square root, multiplied by 2, is used as the margin of error. This analysis applies to all but one hospital (H
ospital#2033),forwhichonlythreepatientsareavailable.Forthishospitalallthreepatientsareselectedforthenewobservations,andtheCMLMPandmarginoferrorareobtainedforall3patients.Therefore,for41out42oftheseanalyses,thereisamatchbetweenthenewobservations’groupandoneofthetrainingdatagroups;andfor1analysisthereisnosuchamatch.
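The hold-out design described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name split_new_and_training, the hospital labels, and the hospital sizes are hypothetical and are not taken from the ECMO data; only the rule "5 patients per hospital, or all patients if fewer than 5 are available" comes from the text.

```python
import random


def split_new_and_training(patient_ids, n_new=5, seed=0):
    """Pick up to n_new patients at random as new observations; the rest are training data."""
    ids = list(patient_ids)
    rng = random.Random(seed)
    k = min(n_new, len(ids))  # a hospital with fewer than n_new patients contributes all of them
    new_ids = rng.sample(ids, k)
    chosen = set(new_ids)
    training_ids = [p for p in ids if p not in chosen]
    return new_ids, training_ids


# Hypothetical hospital sizes: 41 hospitals with at least 5 patients and one with only 3.
hospital_sizes = {f"H{j:02d}": 20 + j for j in range(41)}
hospital_sizes["H_small"] = 3

total_predictions = 0
for j, (hospital, size) in enumerate(hospital_sizes.items()):
    new_ids, training_ids = split_new_and_training(range(size), n_new=5, seed=j)
    total_predictions += len(new_ids)

print(total_predictions)  # 41 * 5 + 3 = 208
```

With 41 hospitals contributing 5 new observations each and one hospital contributing 3, the design produces 208 predictions in total.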

Fig. 7.5 Predicted Probabilities of Hemorrhage Complication (bleed_binary) with Margins of Errors: Dash Lines Indicate Margins of Errors [plot not reproduced; horizontal axis: Patient, 0 to 200; vertical axis: Predicted Probability, 0.0 to 1.0]

Overall, the analysis yields a total of 208 predicted probabilities with the corresponding margins of error. The results are presented in Figure 7.5 (bleed_binary) and Figure 7.6 (DischargeMortalit1Flag). Note that, for DischargeMortalit1Flag, some of the predicted probabilities are close to zero; as a result, the lower margin is negative, and is therefore truncated at 0. On the other hand, there is no need for truncation of the lower (or upper) margin for bleed_binary.
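The margins can be computed as in the following minimal sketch, which takes a predicted probability and an MSPE estimate, forms the margin of error as twice the square root of the MSPE estimate, and truncates the resulting bounds to [0, 1]. The function name margin_bounds and the numerical inputs are hypothetical; truncating the upper bound at 1 is an assumption made by symmetry with the truncation at 0 mentioned in the text.

```python
import math


def margin_bounds(p_hat, mspe_hat):
    """Return (lower, upper) = p_hat -/+ 2*sqrt(mspe_hat), truncated to [0, 1]."""
    margin = 2.0 * math.sqrt(mspe_hat)
    lower = max(0.0, p_hat - margin)
    upper = min(1.0, p_hat + margin)
    return lower, upper


# Hypothetical values: a small predicted probability and a mid-range one.
print(margin_bounds(0.02, 0.0004))  # lower bound is truncated at 0
print(margin_bounds(0.40, 0.0025))  # no truncation needed
```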

Fig. 7.6 Predicted Mortality Probabilities (DischargeMortalit1Flag) with Margins of Errors: Dash Lines Indicate Margins of Errors [plot not reproduced; horizontal axis: Patient, 0 to 200; vertical axis: Predicted Probability, 0.0 to 1.0]

7.7 Exercises

7.1. In Example 7.1 of Subsection 7.1, let the true parameters be $\mu = -0.5$, $\sigma^2 = 2.0$, and $\tau^2 = 1.0$. Also, let $m = 100$ and $k_i = 5$, $1 \le i \le m$. In the following, the errors are always generated from a normal distribution.
a. Generate the random effects from a normal distribution. Make a Q–Q plot to assess normality of the random effects, using REML estimators of the parameters.
b. Generate the random effects from a double-exponential distribution (with the same variance). Make a Q–Q plot to assess normality of the random effects, again using REML estimators of the parameters.

c. Generate the random effects from a centralized-exponential distribution (with the same variance). Here a centralized-exponential distribution is the distribution of $\xi - E(\xi)$, where $\xi$ has an exponential distribution. Make a Q–Q plot to assess normality of the random effects, using REML estimators of the parameters.
d. Compare the plots in a, b, and c. What do you conclude?

7.2. Show that the minimizer of (7.3) is the same as the best linear unbiased estimator (BLUE) for $\beta$ and the best linear unbiased predictor (BLUP) for $\gamma$ in the linear mixed model $y = X\beta + Z\gamma + \epsilon$, where $\gamma \sim N(0, \sigma^2 I_q)$, $\epsilon \sim N(0, \tau^2 I_n)$, and $\gamma$ and $\epsilon$ are independent, provided that $\lambda$ is identical to the ratio $\tau^2/\sigma^2$. For the definition of BLUE and BLUP, see Section 5.1.

7.3. Prove Lemma 7.1.

7.4. Consider the quadratic spline given by (7.10). Show that the shape of the spline is a half circle between 0 and 1 facing up, a half circle between 1 and 2 facing down, and a half circle between 2 and 3 facing up. Also show that the function is smooth in that it has a continuous derivative.

7.5. Consider the three-point distribution in the paragraph containing (7.12). Verify the moment properties mentioned below (7.12).

7.6. Show that, under (7.16), (7.17), one has $E(y_i \mid \eta_i) = b'(\eta_i) \equiv \mu_i(\theta)$ and $\mathrm{var}(y_i \mid \eta_i) = \phi_i b''(\eta_i) \equiv V_i(\theta)$.

7.7. This exercise is related to Example 7.4.
a. Show that, if the distribution of $y$ is the same as that of $\tilde{y} \mid y$, the p-value (7.20) has a Uniform[0, 1] distribution.
b. Show that the two models described in the example agree in terms of the mean and variance of $y_{ij}$ but not in terms of the covariance between $y_{ij}$ and $y_{ik}$ for $j \ne k$.

7.8. Consider the $\hat{\theta}_{i,\lambda}$ defined by (7.28), where $\hat{\theta}_i$ is the BLUP (see Subsection 5.1), assuming the Fay-Herriot model (see Example 5.1) with $A$ being known. Show that the $\lambda_i$, $1 \le i \le m$, that minimize (7.30) are given by (7.31).

7.9. This exercise has to do with the self-benchmarking property of predictors derived under an augmented model; see Section 7.5.
a. This is about self benchmarking of the EBLUP derived under an augmented model. Show that, under the Fay-Herriot model (see Example 5.1), the EBLUPs of $\theta_i = x_i'\beta + v_i$, $1 \le i \le m$, obtained by fitting the augmented model (7.32), are self benchmarking, that is, they satisfy (7.25).

b. This is about self benchmarking of the OBP derived under an augmented model. Let $\hat{\theta}_i$ denote the $i$th component of the OBP defined by (7.37). Show that $\hat{\theta}_i$, $1 \le i \le m$, are self benchmarking, that is, they satisfy (7.25).

7.10. Verify the conditional expectation (7.55), and the two special cases (7.56) and (7.57).
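For the simulation in Exercise 7.1, the three random-effect distributions have to be scaled so that they share the same variance. The sketch below shows one way to do this with the Python standard library. It rests on assumptions that are not confirmed by the text shown here: that the model in Example 7.1 has the form y_ij = mu + alpha_i + e_ij, that sigma^2 is the variance of the random effects alpha_i, and that tau^2 is the variance of the errors e_ij. The function names are hypothetical. The REML fit and the Q–Q plot are left to the reader.

```python
import math
import random


def generate_random_effects(dist, m, variance, rng):
    """Draw m mean-zero random effects with the given variance from the named distribution."""
    sd = math.sqrt(variance)
    if dist == "normal":
        return [rng.gauss(0.0, sd) for _ in range(m)]
    if dist == "double_exponential":
        # A double-exponential (Laplace) variable with scale b has variance 2 * b**2.
        b = sd / math.sqrt(2.0)
        return [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0 / b) for _ in range(m)]
    if dist == "centralized_exponential":
        # An exponential variable with mean sd has variance sd**2; subtracting the mean centers it.
        return [rng.expovariate(1.0 / sd) - sd for _ in range(m)]
    raise ValueError(f"unknown distribution: {dist}")


def simulate_responses(dist, m=100, k=5, mu=-0.5, sigma2=2.0, tau2=1.0, seed=0):
    """Simulate y_ij = mu + alpha_i + e_ij with normal errors e_ij (assumed model form)."""
    rng = random.Random(seed)
    alpha = generate_random_effects(dist, m, sigma2, rng)
    tau = math.sqrt(tau2)
    return [[mu + a + rng.gauss(0.0, tau) for _ in range(k)] for a in alpha]


y = simulate_responses("double_exponential")
print(len(y), len(y[0]))  # 100 clusters with 5 observations each
```

Matching the variances matters for part d: differences between the three Q–Q plots should then reflect the shape of the random-effect distribution rather than its scale.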

Bibliography

Akaike, H. (1973), Information theory as an extension of the maximum likelihood principle, in Second International Symposium on Information Theory (B.N. Petrov and F. Csaki eds.), pp. 267–281 (Akademiai Kiado, Budapest).
Arvesen, J.N. (1969), Jackknifing U-statistics, in Ann. Math. Statist. 40, pp. 2076–2100.
Arvesen, J.N. and Schmitz, T.H. (1970), Robust procedures for variance component problems using the jackknife, in Biometrics 26, pp. 677–686.
Azzalini, A. and Capitanio, A. (2014), The Skew-Normal and Related Families, (Cambridge University Press, New York).
Bandyopadhyay, R. (2017), Benchmarking of Observed Best Predictor, (Ph.D. Dissertation, Dept. of Stat., Univ. of Calif., Davis, CA).
Bartlett, M.S. (1936), Some notes on insecticide tests in the laboratory and in the field, in J. Roy. Statist. Soc. Suppl. 3, pp. 185–194.
Basawa, I.V. and Rao, B.L.S.P. (1980), Statistical Inference for Stochastic Processes, (Academic Press, London).
Battese, G.E., Harter, R.M., and Fuller, W.A. (1988), An error-components model for prediction of county crop areas using survey and satellite data, in J. Amer. Statist. Assoc. 83, pp. 28–36.
Bhatia, R. (1997), Matrix Analysis, (Springer, New York).
Booth, J.G. and Hobert, J.P. (1999), Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm, in J. Roy. Statist. Soc. B 61, pp. 265–285.
Breslow, N.E. and Clayton, D.G. (1993), Approximate inference in generalized linear mixed models, in J. Amer. Statist. Assoc. 88, pp. 9–25.
Bondell, H.D., Krishna, A. and Ghosh, S.K. (2010), Joint variable selection for fixed and random effects in linear mixed-effects models, in Biometrics 66, pp. 1069–1077.
Broman, K.W. and Speed, T.P. (2002), A model selection approach for the identification of quantitative trait loci in experimental crosses, in J. Roy. Statist. Soc. Ser. B 64, pp. 641–656.
Calvin, J.A. and Sedransk, J. (1991), Bayesian and frequentist predictive inference for the patterns of care studies, in J. Amer. Statist. Assoc. 86, pp. 36–48.
Casella, G. and Berger, R.L. (2002), Statistical Inference, (2nd ed.), (Duxbury).
Chakraborty, A., Datta, G.S. and Mandal, A. (2018), Robust hierarchical Bayes small area estimation for nested error regression model, in arXiv:1702.05832v2.
Chambers, R. (1986), Outlier robust finite population estimation, in J. Amer. Statist. Assoc. 81, pp. 1063–1069.
Chambers, R. and Tzavidis, N. (2006), M-quantile models for small area estimation, in Biometrika 93, pp. 255–268.
Chatterjee, S., Lahiri, P., and Li, H. (2008), Parametric bootstrap approximation to the distribution of EBLUP, and related prediction intervals in linear mixed models, in Ann. Statist. 36, pp. 1221–1245.
Chaudhuri, S. and Ghosh, M. (2011), Empirical likelihood for small area estimation, in Biometrika 98, pp. 473–480.
Chen, C.F. (1985), Robustness aspects of score tests for generalized linear and partially linear regression models, in Technometrics 27, pp. 277–283.
Chen, S. (2012), Predictive modeling for clustered data with applications, Ph.D. Dissertation, (Dept. of Statist., Univ. of Calif., Davis, CA).
Chen, S., Jiang, J. and Nguyen, T. (2015), Observed best prediction for small area counts, in J. Survey Statist. Methodology 3, pp. 136–161.
Chernoff, H. and Lehmann, E.L. (1954), The use of maximum-likelihood estimates in $\chi^2$ tests for goodness of fit, in Ann. Math. Statist. 25, pp. 579–586.
Cisco Systems Inc. (1996), NetFlow Services and Applications, White Paper.
Claeskens, G. and Hart, J.D. (2009), Goodness-of-fit tests in mixed models (with discussion), in TEST 18, pp. 213–239.
Copas, J. and Eguchi, S. (2005), Local model uncertainty and incomplete-data bias (with discussion), in J. Roy. Statist. Soc. B 67, pp. 459–513.
Datta, G.S. and Ghosh, M. (1991), Bayesian prediction in linear models: Application to small area estimation, in Ann. Statist. 19, pp. 1748–1770.
Datta, G.S. and Lahiri, P. (2000), A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems, in Statist. Sinica 10, pp. 613–627.
Datta, G.S., Rao, J.N.K. and Smith, D.D. (2005), On measuring the variability of small area estimators under a basic area level model, in Biometrika 92, pp. 183–196.
Datta, G.S., Kubokawa, T., Rao, J.N.K., and Molina, I. (2011), Estimation of mean squared error of model-based small area estimators, in TEST 20, pp. 367–388.
de Leeuw, J. (1992), Introduction to Akaike (1973) information theory and an extension of the maximum likelihood principle, in Breakthroughs in Statistics (S. Kotz and N.L. Johnson eds.), Vol. 1, pp. 599–609 (Springer, London).
Demidenko, E. (2013), Mixed Models: Theory and Application with R, (2nd ed.), (Wiley, New York).
Dempster, A., Laird, N., and Rubin, D. (1977), Maximum likelihood from incomplete data via the EM algorithm (with discussion), in J. Roy. Statist. Soc. B 39, pp. 1–38.

Dempster, A.P. and Ryan, L.M. (1985), Weighted normal plots, in J. Amer. Statist. Assoc. 80, pp. 845–850.
Diggle, P.J., Liang, K.Y., and Zeger, S.L. (1994), Analysis of Longitudinal Data, (Oxford Univ. Press).
Diggle, P.J., Heagerty, P., Liang, K.Y., and Zeger, S.L. (2002), Analysis of Longitudinal Data (2nd ed.), (Oxford Univ. Press).
Dzhaparidze, K. (1986), Parameter Estimation and Hypothesis Testing in Spectral Analysis of Stationary Time Series, (Springer, New York).
Efron, B. (1979), Bootstrap methods: Another look at the jackknife, in Ann. Statist. 7, pp. 1–26.
Efron, B. and Hinkley, D.V. (1978), Assessing the accuracy of the maximum likelihood estimator: observed versus expected Fisher information, in Biometrika 65, pp. 457–487.
Efron, B. and Tibshirani, R.J. (1993), An Introduction to the Bootstrap, (Chapman & Hall/CRC).
Efron, B. and Tibshirani, R. (2007), On testing the significance of sets of genes, in Ann. Appl. Statist. 1, pp. 107–129.
Fabrizi, E. and Lahiri, P. (2013), A design-based approximation to the Bayes information criterion in finite population sampling, in Statistica 73, pp. 289–301.
Fan, J. and Yao, Q. (2003), Nonlinear Time Series: Nonparametric and Parametric Methods, (Springer, New York).
Fay, R.E. and Herriot, R.A. (1979), Estimates of income for small places: an application of James-Stein procedures to census data, in J. Amer. Statist. Assoc. 74, pp. 269–277.
Ferrante, M.R. and Trivisano, C. (2010), Small area estimation of the number of firms’ recruits by using multivariate models for count data, in Survey Methodology 36, pp. 171–180.
Fisher, R.A. (1922), On the interpretation of chi-square from contingency tables, and the calculation of P, in J. Roy. Statist. Soc. 85, pp. 87–94.
Foutz, R.V. and Srivastava, R.C. (1977), The performance of the likelihood ratio test when the model is incorrect, in Ann. Statist. 5, pp. 1183–1194.
Friedman, J. (1991), Multivariate adaptive regression splines (with discussion), in Ann. Statist. 19, pp. 1–67.
Fuller, W.A. (2009), Sampling Statistics, (Wiley, Hoboken, NJ).
Ganesh, N. (2009), Simultaneous credible intervals for small area estimation problems, in J. Multivariate Anal. 100, pp. 1610–1621.
Gershunskaya, J. (2018), Robust empirical best small area finite population mean estimation using a mixture model, in Calcutta Statist. Assoc. Bull. 69, pp. 183–204.
Ghosh, M., Nangia, N., and Kim, D. (1996), Estimation of median income of four-person families: A Bayesian time series approach, in J. Amer. Statist. Assoc. 91, pp. 1423–1431.
Ghosh, M., Natarajan, K., Stroud, T.W.F. and Carlin, B.P. (1998), Generalized linear models for small-area estimation, in J. Amer. Statist. Assoc. 93, pp. 273–282.

Glass, P., Bulas, D.I., Wagner, A.E., et al. (1997), Severity of brain injury following neonatal extracorporeal membrane oxygenation and outcome at age 5 years, in Dev. Med. Child Neurol. 39, pp. 441–448.
Guo, W. (2002), Functional mixed effects models, in Biometrics 58, pp. 121–128.
Gourieroux, C. and Monfort, A. (1995), Statistics and Econometric Models, Vol. 2, (Cambridge Univ. Press).
Hajarisman, N. (2013), Two-level hierarchical Bayesian Poisson models for small area estimation of infant mortality rates, Ph.D. Dissertation, (Dept. of Math. Natural Sci., Bogor Agricultural Univ., Indonesia).
Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., and Stahel, W.A. (1986), Robust Statistics: The Approach Based on Influence Functions, (Wiley, New York).
Hall, P. and Maiti, T. (2006), Nonparametric estimation of mean-squared prediction error in nested-error regression models, in Ann. Statist. 34, pp. 1733–1750.
Hand, D. and Crowder, M. (2002), Practical Longitudinal Data Analysis, (Chapman and Hall, London).
Hannan, E.J. and Quinn, B.G. (1979), The determination of the order of an autoregression, in J. Roy. Statist. Soc. B 41, pp. 190–195.
Hansen, L.P. (1982), Large sample properties of generalized method of moments estimators, in Econometrica 50, pp. 1029–1054.
Hartley, H.O. and Rao, J.N.K. (1967), Maximum likelihood estimation for the mixed analysis of variance model, in Biometrika 54, pp. 93–108.
Hastie, T.J. and Tibshirani, R.J. (1990), Generalized Additive Models, (Chapman & Hall/CRC).
Hayes, P.M. et al. (1993), Quantitative trait locus effects and environmental interaction in a sample of North American barley germplasm, in Theor. Appl. Genet. 87, pp. 392–401.
He, X., Zhu, Z.-Y., and Fung, W.K. (2002), Estimation in a semiparametric model for longitudinal data with unspecified dependence structure, in Biometrika 89, pp. 579–590.
Hedeker, D., Gibbons, R.D., and Flay, B.R. (1994), Random-effects regression models for clustered data with an example from smoking prevention research, in J. Consulting Clinical Psych. 62, pp. 757–765.
Henderson, C.R. (1948), Estimation of general, specific and maternal combining abilities in crosses among inbred lines of swine, Ph.D. Dissertation, (Iowa State Univ., Ames, IA).
Heritier, S. and Ronchetti, E. (1994), Robust bounded-influence tests in general parametric models, in J. Amer. Statist. Assoc. 89, pp. 897–904.
Heyde, C.C. (1994), A quasi-likelihood approach to the REML estimating equations, in Statist. Probab. Letters 21, pp. 381–384.
Heyde, C.C. (1997), Quasi-likelihood and Its Application, (Springer, New York).
Hu, K., Choi, J., Sim, A., and Jiang, J. (2013), Best predictive generalized linear mixed model with predictive lasso for high-speed network data analysis, in International J. Statist. Probab. 4, pp. 132–148.
Huber, P.J. (1964), Robust estimation of a location parameter, in Ann. Math. Statist. 35, pp. 73–101.
Huber, P.J. (1981), Robust Statistics, (Wiley, New York).
Ibrahim, J.G., Zhu, H., Garcia, R.I., and Guo, R. (2011), Fixed and random effects selection in mixed effects models, in Biometrics 67, pp. 495–503.
Ibrahim, J.G., Zhu, H., and Tang, N. (2008), Model selection criteria for missing-data problems using the EM algorithm, in J. Amer. Statist. Assoc. 103, pp. 1648–1658.
Jiang, J. (1996), REML estimation: Asymptotic behavior and related topics, in Ann. Statist. 24, pp. 255–286.
Jiang, J. (1997), Wald consistency and the method of sieves in REML estimation, in Ann. Statist. 25, pp. 1781–1803.
Jiang, J. (1998a), Consistent estimators in generalized linear mixed models, in J. Amer. Statist. Assoc. 93, pp. 720–729.
Jiang, J. (1998b), Asymptotic properties of the empirical BLUP and BLUE in mixed linear models, in Statistica Sinica 8, pp. 861–885.
Jiang, J. (2001), Goodness-of-fit tests for mixed model diagnostics, in Ann. Statist. 29, pp. 1137–1164.
Jiang, J. (2003), Empirical method of moments and its applications, in J. Statist. Plann. Inference 115, pp. 69–84.
Jiang, J. (2005), Partially observed information and inference about non-Gaussian mixed linear models, in Ann. Statist. 33, pp. 2695–2731.
Jiang, J. (2007), Linear and Generalized Linear Mixed Models and Their Applications, (Springer, New York).
Jiang, J. (2010), Large Sample Techniques for Statistics, (Springer, New York).
Jiang, J. (2017), Asymptotic Analysis of Mixed Effects Models: Theory, Applications, and Open Problems, (Chapman & Hall/CRC, Boca Raton, FL).
Jiang, J. (2012), On robust versions of classical tests with dependent data, in Nonparametric Statistical Methods and Related Topics - A Festschrift in Honor of Professor P.K. Bhattacharya on the Occasion of His 80th Birthday, J. Jiang, G.G. Roussas, F.J. Samaniego eds., pp. 77–99, (World Scientific, Singapore).
Jiang, J., Jia, H., and Chen, H. (2001), Maximum posterior estimation of random effects in generalized linear mixed models, in Statistica Sinica 11, pp. 97–120.
Jiang, J. and Lahiri, P. (2001), Empirical best prediction for small area inference with binary data, in Ann. Inst. Statist. Math. 53, pp. 217–243.
Jiang, J. and Lahiri, P. (2005), Mixed model prediction and small area estimation (with discussion), in TEST 15, pp. 1–96.
Jiang, J., Lahiri, P. and Wan, S. (2002), A unified jackknife theory for empirical best prediction with M-estimation, in Ann. Statist. 30, pp. 1782–1810.
Jiang, J. and Nguyen, T. (2009), Comments on: Goodness-of-fit tests in mixed models by G. Claeskens and J.D. Hart, in TEST 18, pp. 248–255.
Jiang, J. and Nguyen, T. (2012), Small area estimation via heteroscedastic nested-error regression, in Canadian J. Statist. 40, pp. 588–603.
Jiang, J., Nguyen, T. and Rao, J.S. (2009), A simplified adaptive fence procedure, in Statist. Probab. Letters 79, pp. 625–629.
Jiang, J., Nguyen, T. and Rao, J.S. (2010), Fence method for nonparametric small area estimation, in Survey Methodology 36, pp. 3–11.
Jiang, J., Nguyen, T. and Rao, J.S. (2011a), Best predictive small area estimation, in J. Amer. Statist. Assoc. 106, pp. 732–745.
Jiang, J., Nguyen, T. and Rao, J.S. (2011b), Invisible fence method and the identification of differentially expressed gene sets, in Statist. Interface 4, pp. 403–415.
Jiang, J. and Nguyen, T. (2015), The Fence Methods, (World Scientific, Singapore).
Jiang, J., Nguyen, T. and Rao, J.S. (2015a), The E-MS algorithm: Model selection with incomplete data, in J. Amer. Statist. Assoc. 110, pp. 1136–1147.
Jiang, J., Nguyen, T. and Rao, J.S. (2015b), Observed best prediction via nested-error regression with potentially misspecified mean and variance, in Survey Methodology 41, pp. 37–55.
Jiang, J., Luan, Y. and Wang, Y.-G. (2007), Iterative estimating equations: Linear convergence and asymptotic properties, in Ann. Statist. 35, pp. 2233–2260.
Jiang, J. and Rao, J.S. (2003), Consistent procedures for mixed linear model selection, in Sankhya 65A, pp. 23–42.
Jiang, J., Rao, J.S., Fan, J. and Nguyen, T. (2018), Classified mixed model prediction, in J. Amer. Statist. Assoc. 113, pp. 269–279.
Jiang, J., Rao, J.S., Gu, Z. and Nguyen, T. (2008), Fence methods for mixed model selection, in Ann. Statist. 36, pp. 1669–1692.
Jiang, J. and Torabi, M. (2018), A unified approach to goodness-of-fit tests with application to small area estimation, Technical Report.
Jiang, J. and Zhang, W. (2001), Robust estimation in generalized linear mixed models, in Biometrika 88, pp. 753–765.
Jiang, J. and Zhang, W. (2002), Distribution-free prediction intervals in mixed linear models, in Statistica Sinica 12, pp. 537–553.
Jin, X., Carlin, B.P., and Banerjee, S. (2005), Generalized hierarchical multivariate CAR models for areal data, in Biometrics 61, pp. 950–961.
Jung, S.-H. (1996), Quasi-likelihood for median regression models, in J. Amer. Statist. Assoc. 91, pp. 251–257.
Karim, M.R. and Zeger, S.L. (1992), Generalized linear models with random effects: Salamander mating revisited, in Biometrics 48, pp. 631–644.
Kass, R.E. and Wasserman, L. (1995), A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion, in J. Amer. Statist. Assoc. 90, pp. 928–934.
Kauermann, G. (2005), A note on smoothing parameter selection for penalized spline smoothing, in J. Statist. Planning & Inference 127, pp. 53–69.
Kent, J.T. (1982), Robustness properties of likelihood ratio tests, in Biometrika 69, pp. 19–27.
Khuri, A.I., Mathew, T., and Sinha, B.K. (1998), Statistical Tests for Mixed Linear Models, (Wiley, New York).
Kim, H.J. and Cai, L. (1993), Robustness of the likelihood ratio test for a change in simple linear regression, in J. Amer. Statist. Assoc. 88, pp. 864–871.
Koenker, R. and Bassett, G. (1978), Regression quantiles, in Econometrica 46, pp. 33–50.
Koopmans, L.H. (1995), The Spectral Analysis of Time Series, (Elsevier).
Kott, P.S. (1991), Robust small domain estimation using random effects modelling, in Survey Methodology 15, pp. 3–12.
Krafty, R.T., Hall, M. and Guo, W. (2011), Functional mixed effects spectral analysis, in Biometrika 98, pp. 583–598.
Lahiri, P. and Rao, J.N.K. (1995), Robust estimation of mean squared error of small area estimators, in J. Amer. Statist. Assoc. 90, pp. 758–766.
Lai, P.Y. and Lee, S. (2005), An overview of asymptotic properties of Lp regression under general classes of error distributions, in J. Amer. Statist. Assoc. 100, pp. 446–458.
Lander, E.S. and Botstein, D. (1989), Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps, in Genetics 121, pp. 185–199.
Lange, N. and Ryan, L. (1989), Assessing normality in random effects models, in Ann. Statist. 17, pp. 624–642.
Lee, L.F. (1992), On the efficiency of methods of simulated moments and maximum simulated likelihood estimation of discrete response models, in Econometric Theory 8, pp. 518–552.
Lehmann, E.L. (1999), Elements of Large-Sample Theory, (Springer, New York).
Lehmann, E.L. and Casella, G. (1998), Theory of Point Estimation, (2nd ed.), (Springer, New York).
Liang, K.Y. and Zeger, S.L. (1986), Longitudinal data analysis using generalized linear models, in Biometrika 73, pp. 13–22.
Lin, X. and Breslow, N.E. (1996), Bias correction in generalized linear mixed models with multiple components of dispersion, in J. Amer. Statist. Assoc. 91, pp. 1007–1016.
Little, R.J.A. and Rubin, D.B. (2002), Statistical Analysis with Missing Data, (2nd ed.), (Wiley, New York).
Lombardía, M.J. and Sperlich, S. (2008), Semiparametric inference in generalized mixed effects models, in J. Roy. Statist. Soc. B 70, pp. 913–930.
Luo, Z.W. et al. (2007), SFP genotyping from Affymetrix arrays is robust but largely detects cis-acting expression regulators, in Genetics 176, pp. 789–800.
Malec, D., Sedransk, J., Moriarity, C.L., and LeClere, F.B. (1997), Small area inference for binary variables in the National Health Interview Survey, in J. Amer. Statist. Assoc. 92, pp. 815–826.
McCullagh, P. and Nelder, J.A. (1989), Generalized Linear Models (2nd ed.), (Chapman and Hall, London).
McCulloch, C.E., Searle, S.R., and Neuhaus, J.M. (2008), Generalized, Linear, and Mixed Models, (2nd ed.), (Wiley, Hoboken, NJ).
McFadden, D. (1989), A method of simulated moments for estimation of discrete response models without numerical integration, in Econometrica 57, pp. 995–1026.
Moore, D.S. (1978), Chi-square tests, in Studies in Statistics (R.V. Hogg, ed.), Mathematical Society of America, Providence, RI.
Morris, C.N. and Christiansen, C.L. (1995), Hierarchical models for ranking and for identifying extremes with applications, in Bayes Statistics 5, (Oxford Univ. Press).
Morton, R. (1987), A generalized linear model with nested strata of extra-Poisson variation, in Biometrika 74, pp. 247–257.
Mou, J. (2012), Two-stage fence methods in selecting covariates and covariance for longitudinal data, Ph.D. Dissertation, (Dept. of Statist., Univ. of Calif., Davis, CA).
Müller, S., Scealy, J.L., and Welsh, A.H. (2013), Model selection in linear mixed models, in Statist. Sci. 28, pp. 135–167.
Münnich, R., Burgard, J.P., and Vogt, M. (2009), Small area estimation for population counts in the German Census 2011, in Section on Survey Research Methods, JSM 2009, Washington, D.C.
Muntean, W. (2002), Fresh frozen plasma in the pediatric age group and in congenital coagulation factor deficiency, in Thromb. Res. 107, pp. S29–S32 (ISSN 0049-3848).
Newey, W.K. (1985), Generalized method of moments specification testing, in J. Econometrics 29, pp. 229–256.
Nishii, R. (1984), Asymptotic properties of criteria for selection of variables in multiple regression, in Ann. Statist. 12, pp. 758–765.
Opsomer, J.D., Breidt, F.J., Claeskens, G., Kauermann, G. and Ranalli, M.G. (2008), Nonparametric small area estimation using penalized spline regression, in J. Roy. Statist. Soc. B 70, pp. 265–286.
Owen, A.B. (1988), Empirical likelihood ratio confidence intervals for a single functional, in Biometrika 75, pp. 237–249.
Owen, A.B. (2001), Empirical Likelihood, (Chapman & Hall).
Pan, W. (2001), On the robust variance estimator in generalised estimating equations, in Biometrika 88, pp. 901–906.
Pfeffermann, D. (2013), New important developments in small area estimation, in Statist. Sci. 28, pp. 40–68.
Pierce, D. (1982), The asymptotic effect of substituting estimators for parameters in certain types of statistics, in Ann. Statist. 10, pp. 475–478.
Prasad, N.G.N. and Rao, J.N.K. (1990), The estimation of mean squared errors of small area estimators, in J. Amer. Statist. Assoc. 85, pp. 163–171.
Press, W.H., Teukolsky, S.A., Vetterling, W.T. and Flannery, B.P. (1997), Numerical Recipes in C: The Art of Scientific Computing, (2nd ed.), (Cambridge Univ. Press).
Rady, E.A., Kilany, N.M. and Eliwa, S.A. (2015), Estimation in mixed-effects functional ANOVA models, in J. Multivariate Anal. 133, pp. 346–355.
Rao, C.R. and Wu, Y. (1989), A strongly consistent procedure for model selection in a regression problem, in Biometrika 76, pp. 369–374.
Rao, J.N.K. and Molina, I. (2015), Small Area Estimation (2nd ed.), (Wiley, New York).
Richardson, A.M. and Welsh, A.H. (1994), Asymptotic properties of restricted maximum likelihood (REML) estimates for hierarchical mixed linear models, in Austral. J. Statist. 36, pp. 31–43.
Rice, J.A. (1995), Mathematical Statistics and Data Analysis, (2nd ed.), (Duxbury Press, Belmont, CA).
Richardson, A.M. and Welsh, A.H. (1996), Covariate screening in mixed linear models, in J. Multivariate Anal. 58, pp. 27–54.
Robinson, G.K. (1991), That BLUP is a good thing: The estimation of random effects (with discussion), in Statist. Sci. 6, pp. 15–51.
Schrader, R.M. and Hettmansperger, T.P. (1980), Robust analysis of variance based upon a likelihood ratio criterion, in Biometrika 67, pp. 93–101.
Schwarz, G. (1978), Estimating the dimension of a model, in Ann. Statist. 6, pp. 461–464.
Sen, A. and Srivastava, M. (1990), Regression Analysis: Theory, Methods, and Applications, (Springer, New York).
Searle, S.R. (1971), Linear Models, (Wiley, New York).
Searle, S.R., Casella, G. and McCulloch, C.E. (1992), Variance Components, (Wiley, New York).
Self, S.G. and Liang, K.Y. (1987), Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions, in J. Amer. Statist. Assoc. 82, pp. 605–610.
Shao, J. (1993), Linear model selection by cross-validation, in J. Amer. Statist. Assoc. 88, pp. 486–494.
Shao, J. and Tu, D. (1995), The Jackknife and Bootstrap, (Springer, New York).
Shibata, R. (1984), Approximate efficiency of a selection procedure for the number of regression variables, in Biometrika 71, pp. 43–49.
Silvapulle, M.J. (1992), Robust Wald-type tests of one-sided hypotheses in the linear model, in J. Amer. Statist. Assoc. 87, pp. 156–161.
Sinha, S.K. and Rao, J.N.K. (2009), Robust small area estimation, in Canadian J. Statist. 37, pp. 381–399.
Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S. and Mesirov, J.P. (2005), Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles, in Proc. Natl. Acad. Sci. USA 102, pp. 15545–15550.
Sun, H., Nguyen, T., Luan, Y., and Jiang, J. (2018), Classified mixed logistic model prediction, in J. Multivariate Anal., in press.
Thall, P.F. and Vail, S.C. (1990), Some covariance models for longitudinal count data with overdispersion, in Biometrics 46, pp. 657–671.
Torabi, M. (2012), Likelihood inference in generalized linear mixed models with two components of dispersion using data cloning, in Comput. Statist. Data Anal. 56, pp. 4259–4265.
Torabi, M. (2014), Spatial generalized linear mixed models with multivariate CAR models for areal data, in Spatial Statist. 10, pp. 12–26.
Vaida, F. and Blanchard, S. (2005), Conditional Akaike information for mixed-effects models, in Biometrika 92, pp. 351–370.
Verbeke, G., Molenberghs, G., and Beunckens, C. (2008), Formal and informal model selection with incomplete data, in Statist. Sci. 23, pp. 201–218.

Wahba, G. (1978), Improper priors, spline smoothing and the problem of guarding against model errors in regression, in J. Roy. Statist. Soc. B 40, pp. 364–372.
Wahba, G. (1983), Bayesian confidence intervals for the cross-validated smoothing spline, in J. Roy. Statist. Soc. B 45, pp. 133–150.
Wand, M. (2003), Smoothing and mixed models, in Comput. Statist. 18, pp. 223–249.
Wang, Y.-G., Bai, Z.-D., and Jiang, J. (2015), M-estimation for analysis of longitudinal data, in Electronic J. Statist., revised.
Wang, J., Fuller, W.A. and Qu, Y. (2008), Small area estimation under a restriction, in Survey Methodology 34, pp. 29–36.
Weiss, L. (1975), The asymptotic distribution of the likelihood ratio in some nonstandard cases, in J. Amer. Statist. Assoc. 70, pp. 204–208.
Welham, S.J. and Thompson, R. (1997), Likelihood ratio tests for fixed model terms using residual maximum likelihood, in J. Roy. Statist. Soc. B 59, pp. 701–714.
White, H. (1982), Maximum likelihood estimation of misspecified models, in Econometrica 50, pp. 1–25.
Yan, G. and Sedransk, J. (2007), Bayesian diagnostic techniques for detecting hierarchical structure, in Bayesian Anal. 2, pp. 735–760.
Yan, G. and Sedransk, J. (2010), A note on Bayesian residuals as a hierarchical model diagnostic technique, in Statist. Papers 51, pp. 1–10.
Ye, J. (1998), On measuring and correcting the effects of data mining and model selection, in J. Amer. Statist. Assoc. 93, pp. 120–131.
You, Y. and Rao, J.N.K. (2002), A pseudo-empirical best linear unbiased prediction approach to small area estimation using survey weights, in Canadian J. Statist. 30, pp. 431–439.
Zhan, H., Chen, X., and Xu, S. (2011), A stochastic expectation and maximization algorithm for detecting quantitative trait-associated genes, in Bioinformatics 27, pp. 63–69.
Zheng, X. and Loh, W.-Y. (1995), Consistent variable selection in linear models, in J. Amer. Statist. Assoc. 90, pp. 151–156.
Zou, H. (2006), The adaptive Lasso and its oracle properties, in J. Amer. Statist. Assoc. 101, pp. 1418–1429.

