Data-Centric Automated Data Mining
Marcos M. Campos, Peter J. Stengard, Boriana L. Milenova
Data Mining Technologies

Overview
• Data mining complexity
• Proposed design solution
• Application I: Oracle Predictive Analytics
• Application II: Spreadsheet Add-In for PA

Data Mining Complexity
• Knowledge of data mining techniques
  – Which algorithm do I use?
• Algorithm specific data preparation
  – How should I prepare my data?
• Model parameter tuning
  – What kernel function should I use?
• Deployment
  – I’ve deployed a model, now what data can I score with it?

Industry Future
“Predictive analytics builds on the data mining multistep process and statistical modeling techniques to add a layer of automation and self-directed built-in intelligence. Business users (and not just Ph.D. statisticians) can now analyze large amounts of customer, supplier, employee and product data for patterns and trends.”
- Kent Bauer, DM Review Magazine, Dec. 2005

Design Approach
• Goal: “Good results with minimum effort”
• Data-centric focus
  – familiar to database and business intelligence communities
• Process automation
  – ease-of-use for non-expert users

Data-centric Design
• Eliminates concepts of models or complex methodologies
• Requires only knowledge of the data source
• Supporting objects are either removed or linked to the data source
• Users see only predictive or descriptive results
• Goal-oriented tasks

Goal-oriented Tasks
• Explain - attribute importance
• Predict - classification or regression
• Group - clustering/segmentation
• Detect - anomaly/outlier detection
• Map - project data to lower dimensionality
• Profile - supervised segmentation

Process Automation
• Statistics computation
• Sampling
• Attribute type identification
• Attribute selection
• Algorithm selection
• Data transformation
• Model creation and selection
• Output generation

Process Automation (cont.)
• Statistics computation
  – number of records, number of attributes, attribute ranges and cardinality
  – used to make decisions about target and attribute type and guide data transformations
• Sampling (random & stratified)
  – improve training times for large datasets
  – ensure sufficient rare target value/range represe…
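
Illustrative sketch (not from the slides): the statistics computation, attribute type identification, and stratified sampling steps listed under Process Automation could look roughly like the Python below. The function names, the `max_categories` threshold, and the `per_class` cap are hypothetical choices made for this example; the slides do not say how Oracle Predictive Analytics implements these steps.

```python
import random
from collections import defaultdict


def column_stats(rows, name):
    """Record count, cardinality, and numeric range for one attribute."""
    values = [r[name] for r in rows if r[name] is not None]
    numeric = [v for v in values
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    all_numeric = bool(values) and len(numeric) == len(values)
    return {
        "count": len(values),
        "cardinality": len(set(values)),
        "min": min(numeric) if all_numeric else None,
        "max": max(numeric) if all_numeric else None,
        "all_numeric": all_numeric,
    }


def guess_type(stats, max_categories=10):
    """Assumed heuristic: numeric attributes with many distinct values
    are treated as numerical, everything else as categorical."""
    if stats["all_numeric"] and stats["cardinality"] > max_categories:
        return "numerical"
    return "categorical"


def stratified_sample(rows, target, per_class, seed=0):
    """Keep at most `per_class` rows per target value, so rare values
    are fully retained while frequent values are down-sampled."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for r in rows:
        groups[r[target]].append(r)
    sample = []
    for _, members in sorted(groups.items(), key=lambda kv: str(kv[0])):
        k = min(per_class, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Synthetic data: 1000 records, with "yes" as a rare target value (2%).
rows = [{"age": 20 + i % 50, "churn": "yes" if i % 50 == 0 else "no"}
        for i in range(1000)]

age_stats = column_stats(rows, "age")
churn_stats = column_stats(rows, "churn")
sample = stratified_sample(rows, "churn", per_class=100)

print(guess_type(age_stats), guess_type(churn_stats), len(sample))
```

Capping each target value separately shrinks the training set while keeping every rare-value record, which is the intent of the two sampling bullets above. A production system would more likely compute these statistics inside the database rather than in application code.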