Question

What are the differences between "Shader Storage Buffer Objects" (SSBO) and Image load store operations?

When should one be used and not the other?

They both support atomic operations, and I assume they are stored in the same type of memory. Regardless of whether they are stored in the same type of memory, do they have the same performance characteristics?

Edit: the original question asked about SSBOs versus uniform buffer objects; it was meant to be about SSBOs versus image load store.

Solution

The main difference between shader storage buffer objects and images, and the main reason to prefer SSBOs, is that SSBOs are declared with interface blocks.

Images are just textures, so the data is a homogeneous collection of texels. A texel is not necessarily a vec4, since other formats exist, but every element in the image has the same format.

SSBOs, by contrast, are generic. A single interface block can combine ints, floats, arrays of vec3s, and so on.

So SSBOs are much more flexible than images.

Other tips

Your question is already answered more or less definitively at http://www.opengl.org/wiki/Shader_Storage_Buffer_Object. It says:

SSBOs are a lot like Uniform Buffer Objects. Shader storage blocks are defined by GLSL interface blocks in almost the same way as uniform blocks. Buffer objects that store SSBOs are bound to SSBO binding points, just as buffer objects for uniforms are bound to UBO binding points. And so forth.

The major differences between them are:

  1. SSBOs can be much larger. The smallest required UBO size is 16KB; the smallest required SSBO size is 16MB, and typical sizes will be on the order of the size of GPU memory.

  2. SSBOs are writable, even atomically; UBOs are uniforms. SSBO reads and writes use incoherent memory accesses, so they need the appropriate barriers, just as Image Load Store operations do.

  3. SSBOs can have unbounded storage, up to the buffer range bound; UBOs must have a specific, fixed storage size. This means that you can have an array of arbitrary length in an SSBO. The actual size of the array, based on the range of the buffer bound, can be queried at runtime in the shader using the length function on the unbounded array variable.
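
The third point can be made concrete with a little arithmetic. The sketch below assumes the rule as I recall it from the ARB_shader_storage_buffer_object extension text: the reported length is the bound range minus the array's offset, divided by the array stride, and clamped at zero. The function name and its parameters are hypothetical and are not part of OpenGL.

```cpp
#include <cstdint>

// Hypothetical helper (not an OpenGL call): predicts the value that length()
// would report in the shader for a trailing unsized array.
//   bound_range_bytes  : size of the buffer range bound to the SSBO binding point
//   array_offset_bytes : byte offset of the unsized array inside the block
//   array_stride_bytes : byte distance between consecutive array elements (must be > 0)
constexpr std::int64_t unsized_array_length(std::int64_t bound_range_bytes,
                                            std::int64_t array_offset_bytes,
                                            std::int64_t array_stride_bytes) {
    // Integer division drops a partial trailing element; the ternary clamps at zero.
    return (bound_range_bytes - array_offset_bytes) / array_stride_bytes > 0
               ? (bound_range_bytes - array_offset_bytes) / array_stride_bytes
               : 0;
}
```

For instance, binding 480 bytes to a block that contains only an array with a 48-byte stride would give a length of 10 under this assumption.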

As others have mentioned, SSBOs offer much larger storage and support atomic operations. The accepted answer also notes that SSBOs are generic in the sense that they let you combine different types. Personally, I want to point out that mixing types this way is often a bad idea: interface blocks or structs are not always the ideal way to use an SSBO. Here's an example:

Let's say you have a struct in C++ like this:

struct Foo {
    glm::vec4 position;
    glm::vec4 velocity;
    glm::vec4 padding_and_range;  // range is a float padded to a vec4
};

which corresponds to an SSBO buffer in glsl:

struct Foo {
    vec4 position;
    vec4 velocity;
    vec4 padding_and_range;  // range is a float padded to a vec4
};

layout(std430, binding = 0) readonly buffer SSBO {
    Foo data[];
} foo;

Although the SSBO can hold an array of Foo, padding must be taken into account as per the std430 memory layout. Because Foo contains vec4 members, each array element is aligned to 16 bytes, so the float range ends up occupying a full vec4 slot. Here it is padded explicitly and read as foo.data[i].padding_and_range.w. This is error-prone, and it wastes memory, especially when the SSBO is large (for use in a compute shader, say) and the Foo struct is complex enough to need a lot of padding. Apart from that, you often need to fill in the buffer data in a loop like this:

// Map for writing: the loop below stores into the buffer, so GL_MAP_WRITE_BIT is needed.
Foo* foos = reinterpret_cast<Foo*>(glMapNamedBufferRange(ssbo, offset, size, GL_MAP_WRITE_BIT));

for (int i = 0; i < n_foos; i++) {
    Foo& foo = foos[i];
    foo.position          = glm::vec4(1.0f);
    foo.velocity          = glm::vec4(2.0f);
    foo.padding_and_range = glm::vec4(glm::vec3(0.0f), 3.5f);
}

glUnmapNamedBuffer(ssbo);

instead of simply writing data to it in one go using glNamedBufferData or glNamedBufferSubData.
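
The padding cost can be checked on the host side without any OpenGL calls. This is only a sketch: Vec4 is a hypothetical stand-in for glm::vec4, and the alignas(16) is an assumption that mimics the 16-byte alignment std430 gives a vec4 (depending on how GLM is configured, the real glm::vec4 may not be aligned this way on the CPU).

```cpp
#include <cstddef>

// Hypothetical stand-in for glm::vec4: four floats, forced to 16-byte
// alignment to mimic the std430 alignment of a GLSL vec4 (assumption).
struct alignas(16) Vec4 { float x, y, z, w; };

// Mirrors the explicitly padded struct: range lives in the .w component.
struct FooPadded {
    Vec4 position;
    Vec4 velocity;
    Vec4 padding_and_range;
};

// Same data with a bare float: the compiler appends 12 bytes of tail
// padding so that array elements stay 16-byte aligned.
struct FooBareFloat {
    Vec4 position;
    Vec4 velocity;
    float range;
};

// Both variants cost 48 bytes per element, of which only 36 carry data.
static_assert(sizeof(FooPadded) == 48, "three vec4 slots");
static_assert(sizeof(FooBareFloat) == 48, "tail padding rounds 36 up to 48");

// The float sits at a different byte offset in the two variants, which is
// exactly the kind of mismatch that makes mixed-type blocks error-prone.
static_assert(offsetof(FooBareFloat, range) == 32, "bare float follows velocity");
static_assert(offsetof(FooPadded, padding_and_range) + 3 * sizeof(float) == 44,
              ".w of the padded vec4");
```

Either way, a quarter of every element is padding, and the CPU and GLSL declarations have to agree on where the float actually lives.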

A better way of handling a struct is to store each member in a separate SSBO, so that each buffer holds a tightly packed, homogeneous array. Even though the performance may not be any better, it helps keep your code clean and readable. Rather than using the struct, you would want to use:

layout(std430, binding = 0) buffer FooPosition {
  vec4 position[];
};

layout(std430, binding = 1) buffer FooVelocity {
  vec4 velocity[];
};

layout(std430, binding = 2) buffer FooRange {
  float range[];
};
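
On the C++ side, the split layout maps naturally onto one contiguous array per binding. The following is a rough sketch under stated assumptions: Vec4 stands in for glm::vec4, and FooArrays, make_foo_arrays, and byte_size are hypothetical names invented for illustration.

```cpp
#include <cstddef>
#include <vector>

struct Vec4 { float x, y, z, w; };  // hypothetical stand-in for glm::vec4

// Hypothetical host-side container: one tightly packed array per SSBO binding.
struct FooArrays {
    std::vector<Vec4>  position;  // would back binding 0
    std::vector<Vec4>  velocity;  // would back binding 1
    std::vector<float> range;     // would back binding 2
};

// Bytes per element when the fields live in three separate buffers.
constexpr std::size_t kSplitBytesPerFoo = 2 * sizeof(Vec4) + sizeof(float);
// Bytes per element of the padded struct layout (three vec4 slots).
constexpr std::size_t kStructBytesPerFoo = 3 * sizeof(Vec4);

// Fills n elements with the same values used in the mapped-buffer loop.
inline FooArrays make_foo_arrays(std::size_t n) {
    FooArrays a;
    a.position.assign(n, Vec4{1.0f, 1.0f, 1.0f, 1.0f});
    a.velocity.assign(n, Vec4{2.0f, 2.0f, 2.0f, 2.0f});
    a.range.assign(n, 3.5f);
    return a;
}

// Byte count of one array, suitable as the size argument of an upload call.
template <typename T>
std::size_t byte_size(const std::vector<T>& v) {
    return v.size() * sizeof(T);
}
```

Each vector's data() pointer and byte_size could then be handed to glNamedBufferData or glNamedBufferSubData in a single call per buffer, with no per-element padding to account for.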
License: CC-BY-SA with attribution
Not affiliated with StackOverflow