Supervisionless Machine Learning with World Knowledge
Yangqiu Song
Cognitive Computation Group
Department of Computer Science
University of Illinois at Urbana-Champaign

Text Categorization
• A classical problem with many applications!
• News, social media analysis, medical records, …
• Traditional approach: label data, train a classifier, make prediction

Challenges in Big Data Era
• Shorter texts
– Tweets
– Queries
– Entities or phrases
• Many fine grained labels
[Figure: label hierarchy. Insurance branches into Auto insurance, Health insurance, Property insurance, …, Life insurance; a further level lists Vision insurance, Dental insurance, …, Disability insurance]

Acquire Labeled Data
• Expert Annotation
– Costly: only big companies can hire a lot of experts
• Crowdsourcing
– Relatively simple tasks
– Sometimes low quality
– Still costly
• Semi-supervised/transfer learning
– Not generalizable to:
  • Many diverse domains
  • Fast changing domains

Knowledge Enabled Learning
• World knowledge bases
– Wikipedia
– Freebase
– DBPedia
– Yago
– Google knowledge graph/knowledge vault

Example: Knowledge Enabled Text Classification
Dong Nguyen announced that he would be removing his hit game Flappy Bird from both the iOS and Android app stores, saying that the success of the game is something he never wanted. Some fans of the game took it personally, replying that
they would either kill Nguyen or kill themselves if he followed through with his decision.
Pick a label: Class1 or Class2? Mobile Game or Sports?
• Labels carry a lot of information!
• But traditional approaches are not using it
• Models are trained with “numbers or IDs” as labels

Classification of 20 Newsgroups Documents
[Figure: Classification F1; axis ticks 100, 200, 500, 1,000, 2,000; label "Classification with knowledge:"; values shown 0.84, 0.87, 0.83, 0.77, 0.64, 0.52]
Song and Roth. AAAI’14

Knowledge Enabled Learning
• Challenges
– General purpose vs. domain problem
– Knowledge vs. data representation
– Large vs. small scale inference
• Future
– Next generation of machine learning
– Big data enabled machine learning

Supervisionless Text Classification: Classify Documents on the Fly
[Figure: Label Name and Document; Label/document Representation; Similarity; Choose Label]
M. W. Chang, L. Ratinov, D. Roth, and V. Srikumar. Importance of semantic representation: dataless classification. AAAI, 2008.
Y. Song, D. Roth: On Dataless Hierarchical Text Classification. AAAI, 2014.

M. Elhoseiny, B. Saleh,