Performance Study of the HAMA Computing Platform

Abstract

With the rapid growth of data and the popularity of big data analysis, high-performance computing platforms have gradually become a research focus in many fields. HAMA is an open-source platform for large-scale matrix computations that is designed for distributed computing. In this paper, we investigate the performance of HAMA in terms of scalability, resource utilization, and efficiency. We conduct experiments on three real-world datasets of different sizes to evaluate its performance. The experimental results show that HAMA has excellent scalability and resource utilization, but its efficiency needs further optimization.

Introduction

With the explosion of data, many fields have begun to pay more attention to high-performance computing platforms. Processing large-scale datasets has become a significant challenge for traditional computing systems. HAMA is an open-source platform specifically designed for large-scale matrix computations and provides an excellent solution to this problem. HAMA is based on Apache Hadoop and uses MapReduce to distribute tasks across a cluster.

In this paper, we investigate the performance of HAMA. Specifically, we focus on the scalability, resource utilization, and efficiency of the platform. The rest of the paper is organized as follows. Section 2 introduces related work on high-performance computing platforms. Section 3 describes the experimental setup. Section 4 presents the experimental results. Section 5 discusses the results and draws conclusions.

Related work

Many high-performance computing platforms have been proposed to address the challenges of big data processing, such as Apache Hadoop, Apache Spark, and Apache Flink. Apache Hadoop is a popular platform for processing big data, and HAMA is based on it. Apache Spark is a fast, general-purpose engine for large-scale data processing. Apache Flink is a streaming dataflow engine that also supports batch processing. Each of these platforms has its strengths and weaknesses, but HAMA focuses on large-scale matrix computations and is an excellent example of a high-performance computing platform.

Experimental setup

To evaluate the performance of HAMA, we conduct experiments on three real-world datasets of different sizes: a small dataset with a matrix size of 1000x1000, a mediu