OpenCL_Best_Practices_Guide (English-language reference document)

Size: 3.28 MB

Pages: 54

Date: 2019-06-28

Resource description:

Optimization
OpenCL Best Practices Guide
May 27, 2010

REVISIONS
    July 2009 (Original Release)
    April 2010
    May 2010

Table of Contents

Preface
    What Is This Document?
    Who Should Read This Guide?
    Recommendations and Best Practices
    Contents Summary
Chapter 1. Heterogeneous Computing with OpenCL
    1.1 Differences Between Host and Device
    1.2 What Runs on an OpenCL-Enabled Device?
    1.3 Maximum Performance Benefit
Chapter 2. Performance Metrics
    2.1 Timing
        2.1.1 Using CPU Timers
        2.1.2 Using OpenCL GPU Timers
    2.2 Bandwidth
        2.2.1 Theoretical Bandwidth Calculation
        2.2.2 Effective Bandwidth Calculation
        2.2.3 Throughput Reported by the OpenCL Visual Profiler
Chapter 3. Memory Optimizations
    3.1 Data Transfer Between Host and Device
        3.1.1 Pinned Memory
        3.1.2 Asynchronous Transfers
        3.1.3 Overlapping Transfers and Device Computation
    3.2 Device Memory Spaces
        3.2.1 Coalesced Access to Global Memory
            3.2.1.1 A Simple Access Pattern
            3.2.1.2 A Sequential but Misaligned Access Pattern
            3.2.1.3 Effects of Misaligned Accesses
            3.2.1.4 Strided Accesses
        3.2.2 Shared Memory
            3.2.2.1 Shared Memory and Memory Banks
            3.2.2.2 Shared Memory in Matrix Multiplication (C = AB)
            3.2.2.3 Shared Memory in Matrix Multiplication (C = AA^T)
            3.2.2.4 Shared Memory Use by Kernel Arguments
        3.2.3 Local Memory
        3.2.4 Texture Memory
            3.2.4.1 Textured Fetch vs. Global Memory Read
            3.2.4.2 Additional Texture Capabilities
        3.2.5 Constant Memory
        3.2.6 Registers
            3.2.6.1 Register Pressure
Chapter 4. NDRange Optimizations
    4.1 Occupancy
    4.2 Calculating Occupancy
    4.3 Hiding Register Dependencies
    4.4 Thread and Block Heuristics
    4.5 Effects of Shared Memory
Chapter 5. Instruction Optimizations
