且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

如何使用顶点缓冲区将计算着色器结果馈入不包含顶点着色器的顶点中?

更新时间:2022-12-03 10:15:57

好吧,我建议的第一件事是打开调试层(在创建设备时使用Debug标志),然后转到项目属性,debug选项卡,并勾选启用非托管代码调试"或启用本机代码调试".

当您开始调试程序时,如果管道状态有问题,运行时将向您发出潜在的警告.

一个潜在的问题(从您发布的内容来看,这很可能是一个问题): 调度后,请确保清理您的计算着色器UAV插槽.如果您尝试将vertexTable2绑定到您的顶点着色器,但是资源仍被绑定为计算着色器输出,则运行时将自动将您的ShaderView设置为null(当您尝试读取它时,它将返回0).

要清洁您的Compute Shader,请在设备上下文中调用此方法,即已完成调度:

ComputeShader.SetUnorderedAccessView(TRIANGLE_SLOT, null)

还请注意,PixelShader可以访问RWStructuredBuffer(从技术上讲,如果您具有功能级别11.1,则意味着可以将RWStructuredBuffer用于任何着色器类型,这意味着需要使用最新的ATI卡和Windows 8 +).

Before I go into details I want outline the problem:

I use RWStructuredBuffers to store the output of my compute shaders (CS). Since vertex and pixel shaders can’t read from RWStructuredBuffers, I map a StructuredBuffer onto the same slot (u0/t0) and (u4/t4):

cbuffer cbWorld : register (b1) 
{
    float4x4 worldViewProj;
    int dummy;
}   

struct VS_IN
{
    float4 pos : POSITION;
    float4 col : COLOR;
};

struct PS_IN
{

    float4 pos : SV_POSITION;
    float4 col : COLOR;
};

RWStructuredBuffer<float4> colorOutputTable : register (u0);    // 2D color data
StructuredBuffer<float4> output2 :            register (t0);    // same as u0
RWStructuredBuffer<int> counterTable :        register (u1);    // depth data for z values
RWStructuredBuffer<VS_IN>vertexTable :        register (u4);    // triangle list
StructuredBuffer<VS_IN>vertexTable2 :         register (t4);    // same as u4

I use a ShaderRecourceView to grant pixel and/or vertex shader access to the buffers. This concept works fine for my pixel shader, the vertex shader however seems to read only 0 values (I use SV_VertexID as index to the buffers):

PS_IN VS_3DA ( uint vid : SV_VertexID ) 
{           
    PS_IN output = (PS_IN)0; 
    PS_IN input = vertexTable2[vid];
    output.pos = mul(input.pos, worldViewProj); 
    output.col = input.col; 
    return output;
}

No error messages or warnings from the hlsl compiler, the renderloop runs with 60 fps (using vsync), but the screen remains black. Since I blank the screen with Color.White before Draw(..) is called, the render pipeline seems to be active.

When I read the triangle data content via an UAView from the GPU into "vertArray" and feed it back into a vertex buffer, everything works however:

Program:

    let vertices = Buffer.Create(device, BindFlags.VertexBuffer, vertArray)
    context.InputAssembler.SetVertexBuffers(0, new VertexBufferBinding(vertices, Utilities.SizeOf<Vector4>() * 2, 0))

HLSL:

PS_IN VS_3D (VS_IN input )
{
    PS_IN output = (PS_IN)0;    
    output.pos = mul(input.pos, worldViewProj);
    output.col = input.col; 
    return output;
}

Here the definition of the 2D - Vertex / Pixelshaders. Please note that PS_2D accesses the buffer "output2" in slot t0 - and that's exactly the "trick" what I want to replicate for then 3D vertex shader "VS_3DA":

float4 PS_2D ( float4 input : SV_Position) : SV_Target
{        
    uint2 pixel =  uint2(input.x, input.y);         
    return output2[ pixel.y * width + pixel.x]; 
}

float4 VS_2D ( uint vid : SV_VertexID ) : SV_POSITION
{
if (vid == 0)
    return float4(-1, -1, 0, 1);
if (vid == 1)
    return float4( 1, -1, 0, 1);
if (vid == 2)
    return float4(-1,  1, 0, 1);    

return float4( 1,  1, 0, 1);    
}

For three days I have searched and experimented to no avail. All informations I gathered seem to confirm that my approach using then SV_VertexID should work.

Can anybody give advice? Thanks for reading my post!

=====================================================================

DETAILS:

I like the concept of DirectX 11 compute shaders very much and I want to employ it for algebraic computing. As a test case I render fractals (Mandelbrot sets) in 3D. Everything works as expected – except one last brick in the wall is missing.

The computation takes the following steps:

  1. Using a CS to compute a 2D texture (output is "counterTable" and "colorOutbutTable" (works)

  2. Optionally render this texture to screen (works)

  3. Using another CS to generate a mesh (triangle list). This CS takes x, y, and color values from step 1, computes the z coordinate, and finally creates a quad for each pixel. The result is stored in "vertexTable". (works)

  4. Feeding the triangles list to the vertex shader (problem!!!)

  5. Render to screen (works - using a vertex buffer).

For programming I use F# 3.0 and SharpDX as .NET wrapper. The ShaderRessourceView for both shaders (pixel & vertex) is set up with the same parameters (except the size parameters):

let mutable descr = new BufferDescription()     
descr.BindFlags <- BindFlags.UnorderedAccess ||| BindFlags.ShaderResource 
descr.Usage <- ResourceUsage.Default  
descr.CpuAccessFlags <- CpuAccessFlags.None
descr.StructureByteStride <- xxx    / / depends on shader
descr.SizeInBytes <-  yyy       / / depends on shader
descr.OptionFlags <- ResourceOptionFlags.BufferStructured

Nothing special here. Creation of 2D buffer (binds to buffer "output2" in slot t0):

outputBuffer2D <- new Buffer(device, descr) 
outputView2D <- new UnorderedAccessView (device, outputBuffer2D)  
shaderResourceView2D <- new ShaderResourceView (device, outputBuffer2D)

Creation of 3D buffer (binds to "vertexTable2" in slot t4):

vertexBuffer3D <- new Buffer(device, descr) 
shaderResourceView3D <- new ShaderResourceView (device, vertexBuffer3D)
//  UAView not required here

Setting resources for 2D:

context.InputAssembler.PrimitiveTopology <- PrimitiveTopology.TriangleStrip
context.OutputMerger.SetRenderTargets(renderTargetView2D)
context.OutputMerger.SetDepthStencilState(depthStencilState2D)
context.VertexShader.Set (vertexShader2D)
context.PixelShader.Set (pixelShader2D) 

render 2D:

context.PixelShader.SetShaderResource(COLOR_OUT_SLOT, shaderResourceView2D)
context.PixelShader.SetConstantBuffer(CONSTANT_SLOT_GLOBAL, constantBuffer2D )
context.ClearRenderTargetView (renderTargetView2D, Color.White.ToColor4())         
context.Draw(4,0)                                                
swapChain.Present(1, PresentFlags.None)            

Setting resources for 3D:

context.InputAssembler.PrimitiveTopology <- PrimitiveTopology.TriangleList
context.OutputMerger.SetTargets(depthView3D, renderTargetView2D)
context.VertexShader.SetShaderResource(TRIANGLE_SLOT, shaderResourceView3D )
context.VertexShader.SetConstantBuffer(CONSTANT_SLOT_3D, constantBuffer3D)
context.VertexShader.Set(vertexShader3D)
context.PixelShader.Set(pixelShader3D)

render 3D (doesn’t work – black screen as output result)

context.ClearDepthStencilView(depthView3D, DepthStencilClearFlags.Depth, 1.0f, 0uy)
context.Draw(dataXsize * dataYsize * 6, 0)
swapChain.Present(1, PresentFlags.None)

Finally the slot numbers:

static let CONSTANT_SLOT_GLOBAL = 0
static let CONSTANT_SLOT_3D = 1
static let COLOR_OUT_SLOT = 0
static let COUNTER_SLOT = 1
static let COLOR_SLOT = 2    
static let TRIANGLE_SLOT = 4

Ok first thing I would suggest, is to turn on debug layer (Use Debug flag when you create your device), then go to project properties, debug tab, and tick "Enable unmanaged code debugging" or "Enable native code debugging".

When you start to debug the program the runtime will give you potential warnings if something wrong with pipeline state.

One potential issue (which looks the most likely one from what you posted): Make sure to clean your compute shader UAV slots after dispatching. If you try to bind vertexTable2 to your vertex shader, but the resource is still bound as compute shader output, the runtime will automatically set your ShaderView to null (which will in turn return 0 when you try to read it).

To clean your Compute Shader, call this on your device context one you're done with dispatch:

ComputeShader.SetUnorderedAccessView(TRIANGLE_SLOT, null)

Please also note that PixelShader can access RWStructuredBuffer (technically you can use RWStructuredBuffer for any shader type if you have feature level 11.1, that means recent ATI card and Windows 8+).