Knowledge discovery based on an implicit and explicit conceptual network

Asako Koike (1,2) and Toshihisa Takagi (1)

(1) Dept. of Computational Biology, Graduate School of Frontier Science, The University of Tokyo, Kiban-3A1 (CB01) 5-1-5, Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan
(2) Central Research Laboratory, Hitachi Ltd. 1-280 Higashi-Koigakubo, Kokubunji, Tokyo, 185-8601, Japan

Running title: Knowledge discovery by NLP

Keywords: knowledge discovery, information retrieval, information extraction, natural language processing

Corresponding author: Asako Koike
Dept. of Computational Biology, Graduate School of Frontier Science, Univ. of Tokyo, Kiban-3A1 (CB01) 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan
Phone: +81-4-7136-3982; Fax: +81-4-7136-3975
E-mail: akoike@hgc.jp

ABSTRACT

The amount of knowledge accumulated in published scientific papers has increased due to the continuing progress being made in scientific research. Since numerous papers have only reported fragments of scientific facts, there are possibilities for discovering new knowledge by connecting these facts. We therefore developed a system called “BioTermNet” to draft a conceptual network with hybrid methods of information extraction and information retrieval. Two concepts are regarded as related in this system if 1) their relationship is clearly described in MEDLINE abstracts or 2) they have distinctively co-occurred in abstracts. PRIME data, including protein interactions and functions extracted by NLP techniques, are used in the former and the Singhal-measure for information retrieval is used in the latter. Relationships that are not clearly or directly described in an abstract can be extracted by connecting multiple concepts. To evaluate how well this system performed, Swanson’s associations between Raynaud’s disease and fish oil and between migraine and magnesium were tested with abstracts that had been published before the discovery of those associations. As a result, when start and end concepts were given, plausible and understandable intermediate concepts connecting them could be detected. When only the start concept was given, not only the focused concepts (magnesium and fish oil) but also other probable concepts could be detected as related concept candidates. Finally, this system was applied to find di