且构网

Sharing the ins and outs of programming and development...

OpenCL online compilation: getting the assembly from cl::Program or cl::Kernel

Updated: 2023-11-18 23:18:04

For Intel Graphics, you can call clGetKernelInfo(..., CL_KERNEL_BINARY_PROGRAM_INTEL, ...) to get the kernel ISA bits directly. To disassemble those bits, get the latest GEN ISA disassembler and build it as described here; specifically, see the section on Building an Intel GPU ISA Disassembler. I haven't used it in a while, but the Intel OpenCL SDK used to do a better job (I'm not a GUI person). And this is a good article on how to use that tool to scrutinize the assembly.
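As a rough sketch of the query step above: OpenCL info queries are normally made twice, once to learn the size and once to fetch the bytes. The helper below captures only that pattern. The names query_blob and Query are hypothetical, and the callable stands in for the tail of clGetKernelInfo so the snippet compiles without the OpenCL headers. Which header defines CL_KERNEL_BINARY_PROGRAM_INTEL depends on your SDK, so treat the commented usage as an assumption to check.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Generic "ask for the size, then fetch the bytes" helper (hypothetical name).
// `query` is any callable shaped like the last three parameters of
// clGetKernelInfo:
//   int query(std::size_t value_size, void* value, std::size_t* size_ret)
// and it must return 0 (the value of CL_SUCCESS) on success.
template <typename Query>
std::vector<unsigned char> query_blob(Query query) {
    std::size_t size = 0;
    // First call: no buffer, only ask how many bytes are available.
    if (query(0, nullptr, &size) != 0) {
        throw std::runtime_error("size query failed");
    }
    std::vector<unsigned char> blob(size);
    if (size == 0) {
        return blob;  // nothing to fetch
    }
    // Second call: copy the bytes into the buffer we just sized.
    if (query(blob.size(), blob.data(), nullptr) != 0) {
        throw std::runtime_error("data query failed");
    }
    return blob;
}

// Assumed usage with a real cl_kernel named `kernel` (not compiled here):
//   auto isa = query_blob([&](std::size_t n, void* p, std::size_t* out) {
//       return clGetKernelInfo(kernel, CL_KERNEL_BINARY_PROGRAM_INTEL, n, p, out);
//   });
// The resulting bytes can then be written to a file for the disassembler.
```

Keeping the OpenCL call inside the lambda means the same helper could be reused for other clGet*Info queries that return a single flat buffer.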

For NVIDIA, the "binary" returned by clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...) is actually PTX. This might be enough, but if you want the exact shader assembly that gets executed, you can feed the PTX into ptxas and then disassemble the output with cuobjdump using the --dump-sass option to get the lowest-level assembly. Note that we're reduced to guessing that the NVIDIA driver uses the same algorithm as ptxas, but it seems logical.
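The PTX-to-SASS steps above might be scripted along these lines. This is only a sketch: the function names, the file names and the sm_70 target in the usage note are hypothetical, and the PTX check is a heuristic that relies on every PTX module carrying .version and .target directives. It only builds the command strings; running them is left to the caller.

```cpp
#include <string>

// Heuristic check that a blob from CL_PROGRAM_BINARIES is PTX text:
// a PTX module declares both a .version and a .target directive.
bool looks_like_ptx(const std::string& blob) {
    return blob.find(".version") != std::string::npos &&
           blob.find(".target") != std::string::npos;
}

// Command line that assembles a PTX file into a cubin with ptxas.
// `arch` (for example "sm_70") is an assumption and must match your GPU.
// Paths are not quoted, so this sketch assumes they contain no spaces.
std::string ptxas_command(const std::string& ptx_path,
                          const std::string& cubin_path,
                          const std::string& arch) {
    return "ptxas -arch=" + arch + " " + ptx_path + " -o " + cubin_path;
}

// Command line that disassembles the cubin to SASS with cuobjdump.
std::string sass_dump_command(const std::string& cubin_path) {
    return "cuobjdump --dump-sass " + cubin_path;
}
```

For instance, after saving the PTX to kernel.ptx, ptxas_command("kernel.ptx", "kernel.cubin", "sm_70") followed by sass_dump_command("kernel.cubin") gives the two commands to run in order. As the answer notes, the SASS produced this way is only presumed to match what the driver generates.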

AMD likely has similar tools, but I am less well versed in them.