IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 17, NO. 4, APRIL 2005

Toward Integrating Feature Selection Algorithms for Classification and Clustering

Huan Liu, Senior Member, IEEE, and Lei Yu, Student Member, IEEE

Abstract—This paper introduces concepts and algorithms of feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares different algorithms with a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, reveals unattempted combinations, and provides guidelines in selecting feature selection algorithms. With the categorizing framework, we continue our efforts toward building an integrated system for intelligent feature selection. A unifying platform is proposed as an intermediate step. An illustrative example is presented to show how existing feature selection algorithms can be integrated into a meta algorithm that can take advantage of individual algorithms. An added advantage of doing so is to help a user employ a suitable algorithm without knowing details of each algorithm. Some real-world applications are included to demonstrate the use of feature selection in data mining. We conclude this work by identifying trends and challenges of feature selection research and development.

Index Terms—Feature selection, classification, clustering, categorizing framework, unifying platform, real-world applications.

1 INTRODUCTION

As computer and database technologies advance rapidly, data accumulates in a speed unmatchable by human's capacity of data processing. Data mining [1], [29], [35], [36], as a multidisciplinary joint effort from databases, machine learning, and statistics, is championing in turning mountains of data into nuggets. Researchers and practitioners realize that in order to use data mining tools effectively, data preprocessing is essential to s

validation [18]. Subset generation is a search procedure [48], [53] that produces candidate feature subsets for evaluation based on a certain search strategy. Each candidate subset is evaluated and compared with the previous best one according to a certain evaluation criterion. If the new subset turns out to be better, it replaces the previous best subset. The process of subset generation and evaluation is repeated
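
The generate, evaluate, and replace loop described in the text above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the exhaustive search order, the `max_evaluations` stopping rule, and the `toy_criterion` scoring function are all hypothetical choices introduced here for the example.

```python
from itertools import combinations
from typing import Callable, Iterator, Optional, Tuple

Subset = Tuple[int, ...]


def generate_subsets(n_features: int) -> Iterator[Subset]:
    """Search strategy (assumed: exhaustive, smallest subsets first)."""
    for size in range(1, n_features + 1):
        yield from combinations(range(n_features), size)


def select_features(
    n_features: int,
    evaluate: Callable[[Subset], float],
    max_evaluations: Optional[int] = None,
) -> Tuple[Optional[Subset], float]:
    """Keep the best-scoring candidate subset seen so far."""
    best_subset: Optional[Subset] = None
    best_score = float("-inf")
    for count, candidate in enumerate(generate_subsets(n_features), start=1):
        score = evaluate(candidate)
        # A strictly better candidate replaces the previous best subset.
        if score > best_score:
            best_subset, best_score = candidate, score
        # Hypothetical stopping rule: a fixed budget of evaluations.
        if max_evaluations is not None and count >= max_evaluations:
            break
    return best_subset, best_score


def toy_criterion(subset: Subset) -> float:
    """Made-up criterion: reward features 0 and 2, penalize subset size."""
    relevant = {0, 2}
    return len(relevant & set(subset)) - 0.1 * len(subset)


best, score = select_features(4, toy_criterion)
print(best, score)
```

In practice the evaluation function would measure something like class separability or the accuracy of a learner on the candidate features, and the generator would usually follow a heuristic or random search, since exhaustive enumeration grows exponentially with the number of features.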