Experiments with Data-Intensive NLP on a Computational Grid

Baden Hughes, Steven Bird
Dept of Computer Science, University of Melbourne
Victoria 3010, AUSTRALIA
{badenh,sb}@cs.mu.oz.au

Haejoong Lee
Linguistic Data Consortium, University of Pennsylvania
Philadelphia PA 19104, USA
haejoong@ldc.upenn.edu

Ewan Klein
School of Informatics, University of Edinburgh
Edinburgh EH8 9LW, UK
ewan@inf.ed.ac.uk

Abstract

Large databases of annotated text and speech are widely used for developing and testing language technologies. However, the size of these corpora and associated language models are outpacing the growth of processing power and network bandwidth available to most researchers. The solution, we believe, is to exploit four characteristics of language technology research: many large corpora are already held at most sites where the research is conducted; most data-intensive processing takes place in the development phase and not at run-time; most processing tasks can be construed as adding layers of annotation to immutable corpora; and many classes of language models can be approximated as the sum of smaller models. We report on a series of experiments with data-intensive language processing on a computational grid. Key features of the approach are its use of a scripting language for easy dissemination of control code to processing nodes, the use of a grid broker to manage the execution of tasks on remote nodes and collate their output, the use of a data-decomposition approach by which parametric and parallel processing of individual language processing components occurs on segmented ...

... the community needs to explore novel ways to collaborate on the processing of large datasets.

This paper reports a series of experiments with data-intensive natural language processing on a computational grid. A computational grid facilitates large-scale analysis using distributed resources. Given the prevalence of large data sources in natural language engineering and the need for raw computational power in the analysis and modelling of such data, the grid computing paradigm provides efficiencies and scalability otherwise unavailable to natural language engineering. A grid broker acts as a task-farming engine, employing the simplest form of distribution in which parallel tasks are fully independent. Thus we do not envisage ...
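The task-farming and data-decomposition ideas described above can be illustrated with a minimal sketch. It is not the authors' implementation: a local thread pool stands in for the grid broker and remote nodes, unigram counting stands in for a language processing component, and the names `segment`, `count_unigrams` and `farm` are hypothetical. The sketch assumes only what the text states, namely that tasks are fully independent and that a model can be approximated as the sum of smaller models.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def segment(corpus, n_segments):
    """Data decomposition: split a list of sentences into contiguous chunks."""
    size = max(1, -(-len(corpus) // n_segments))  # ceiling division
    return [corpus[i:i + size] for i in range(0, len(corpus), size)]


def count_unigrams(sentences):
    """One independent task: build a small unigram count model for a segment."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts


def farm(corpus, n_segments=4):
    """Stand-in for the broker: run one task per segment, then collate.

    Assumption: a local thread pool replaces remote grid nodes here.
    """
    with ThreadPoolExecutor(max_workers=n_segments) as pool:
        partial_models = list(pool.map(count_unigrams, segment(corpus, n_segments)))
    # Collation step: the full model is the sum of the smaller models.
    return sum(partial_models, Counter())
```

Calling `farm(sentences, n_segments=3)` on a list of sentence strings returns the same counts as running `count_unigrams` over the whole list at once. Because no task reads another task's output, the segments could in principle be processed wherever the data already resides.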