>The reason people do potentially more expensive mix/lerps is because while it m...

account42 · on Feb 10, 2025

> So, to answer your question - any computation based on shader inputs (vertices, compute shader indices and what not) cannot and won't branch.

It can do an actual branch if the condition ends up the same for the entire workgroup - or to be even more pedantic, for the part of the workgroup that is still alive.

You can also check that explicitly, e.g. to take a faster special-case branch when it is valid for the entire workgroup and otherwise a slower general-case branch, again for the entire workgroup, instead of computing both and then selecting.
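A rough sketch of that explicit check in Vulkan-flavoured GLSL, assuming the GL_KHR_shader_subgroup_vote extension is available. Note that subgroupAll votes across the subgroup (warp/wave) rather than the whole workgroup, and that the buffer layout and the two paths are made up purely for illustration:

```glsl
#version 450
#extension GL_KHR_shader_subgroup_vote : require

layout(local_size_x = 64) in;

// Hypothetical storage buffer, only here to make the sketch complete.
layout(std430, binding = 0) buffer Data {
    float values[];
};

void main() {
    uint i = gl_GlobalInvocationID.x;
    bool inRange = i < uint(values.length());
    float x = inRange ? values[i] : 0.0;

    // Per-invocation condition: non-positive inputs always map to zero.
    bool trivial = x <= 0.0;

    float result;
    if (subgroupAll(trivial)) {
        // Same answer for every active invocation in the subgroup,
        // so this can be compiled as a real branch.
        result = 0.0;
    } else {
        // General case, still taken by the whole subgroup together.
        result = trivial ? 0.0 : sqrt(x) * exp(-x);
    }

    if (inRange) {
        values[i] = result;
    }
}
```

Both paths give the same result wherever the fast path is valid, so the vote only changes how much work is done, not the output.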

pandaman · on Feb 10, 2025

And this is why I wrote: "There can be shortcut branches emitted by the compiler to quickly bypass computations when all ways have the same value, but in the general case everything will be computed for the condition being true as well as for it being false."

ribit · on Feb 11, 2025

Execution with masking is pretty much how branching works on GPUs. What’s more relevant, however, is that conditional statements add overhead in terms of additional instructions and execution state management. Eliminating small branches using conditional moves or manual masking can be a performance win.
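For illustration, here is roughly what replacing a small branch with a select looks like in a WebGL 1 style fragment shader. uThreshold, vValue and the two colors are made-up names, and whether this actually wins depends on the compiler, which may well generate the same code for both forms:

```glsl
precision mediump float;

varying float vValue;      // hypothetical per-fragment input
uniform float uThreshold;  // hypothetical cutoff

void main() {
    vec3 below = vec3(0.1, 0.2, 0.8);
    vec3 above = vec3(0.9, 0.3, 0.1);

    // Branchy form, for comparison:
    //   vec3 color = below;
    //   if (vValue >= uThreshold) { color = above; }

    // Select form: step() returns 0.0 or 1.0, and mix() uses it as a mask.
    float mask = step(uThreshold, vValue);
    vec3 color = mix(below, above, mask);

    gl_FragColor = vec4(color, 1.0);
}
```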

pandaman · on Feb 12, 2025

No, branching works on a GPU just like everywhere else: the instruction pointer gets changed to another value. But you cannot branch on a vector value unless every element of the vector is the same, which is why branching on vector values is a bad idea. However, if your vectorized computation is naturally divergent, there is no way around it; conditional moves are not going to help, as they will also evaluate both branches of the conditional. The best you can do is arrange it so that you only add computation instead of alternating it, i.e. you do if() ... instead of if() ... else ..., so that you only take as long as the longest path.

This reminds me that people who believe that GPUs are not capable of branching do stupid things like writing multiple shaders instead of branching off a shader constant. E.g. you have some special mode in a game, say x-ray vision, and instead of doing a branch in your materials, you write an alternative version of every shader.
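As a sketch of what that looks like: one material shader with a branch on a uniform. uXRayMode, the tint and the lighting are hypothetical stand-ins; since the uniform has the same value for every invocation in a draw call, the branch never diverges:

```glsl
precision mediump float;

varying vec3 vNormal;     // hypothetical interpolated surface normal
uniform vec3 uBaseColor;  // hypothetical material color
uniform bool uXRayMode;   // hypothetical per-draw mode flag

void main() {
    vec3 n = normalize(vNormal);
    vec3 color;

    if (uXRayMode) {
        // Special mode: rim-style falloff, assuming the view direction is +Z.
        float rim = 1.0 - abs(n.z);
        color = vec3(0.2, 1.0, 0.6) * rim;
    } else {
        // Regular path: simple diffuse term with a fixed light direction.
        vec3 lightDir = normalize(vec3(0.3, 0.8, 0.5));
        color = uBaseColor * max(dot(n, lightDir), 0.0);
    }

    gl_FragColor = vec4(color, 1.0);
}
```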

ryao · on Feb 9, 2025

You can always have the compiler dump the assembly output so you can examine it. I suspect few do that.

vanderZwan · on Feb 10, 2025

Does this also apply to shaders? And is it even useful, given the enormous variation in hardware capabilities out there? My impression was that it's all JIT compiled unless you know which hardware you're targeting, e.g. Valve precompiling highly optimized shaders for the Steam Deck.

(I'm not a graphics programmer, mind you, so please correct any misunderstandings on my end)

swiftcoder · on Feb 10, 2025

It's all JIT'd based on the specific driver/GPU, but the intermediate assembly language is sufficient to inspect things like branches and loop unrolling.

grg0 · on Feb 11, 2025

Not really. DXIL in particular will still have branches and not care much about unrolling. You need to look at the assembly that is generated. And yes, that depends on the target hardware and compiler/driver.

account42 · on Feb 10, 2025

You will have to check for the different GPUs you are targeting. But GPU vendors don't start from scratch for each hardware generation, so you will often see similar results.

torginus · on Feb 10, 2025

I'll comment this here as I got downvoted when I made the point in a standalone comment - this is mostly an academic issue, since you don't want to use step or pixel-level if statements in your shader code, as it will lead to ugly aliasing artifacts as the pixel color transitions from a to b.

What you want is to use smoothstep which blends a bit between these two values and for that you need to compute both paths anyway.

pandaman · on Feb 11, 2025

It's absurd to claim that you'd never use step(), even in pixel shaders (there are all kinds of shaders not related to pixels at all).

torginus · on Feb 11, 2025

>since you don't want to use step or pixel-level if statements in your shader code

The observation relates to pixel shaders, and even within that, it relates to values that vary based on pixel-level data. In these cases having if statements without any sort of interpolation introduces aliasing, which tends to look very noticeable.

Now you might be fine with that, or have some way of masking it, so it might be fine in your use case, but in the most common, naive case the issue does show up.

pandaman · on Feb 11, 2025

I don't know how many graphics products you have shipped and when, but, say, clamping values at 0 is pretty common even in the most basic shaders. It's not magic and won't introduce "aliasing" just by virtue of being used. On the other hand, using negative dot products in your lighting computation, for example, will introduce bizarre artifacts. And yes, everyone has been using various forms of MSAA for the past 15 years or so, even in games. Welcome to the 21st century.
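A minimal example of the kind of clamp I mean, in the usual Lambert diffuse term (all variable names are placeholders):

```glsl
precision mediump float;

varying vec3 vNormal;    // hypothetical interpolated normal
uniform vec3 uLightDir;  // hypothetical direction towards the light
uniform vec3 uAlbedo;    // hypothetical surface color

void main() {
    float nDotL = dot(normalize(vNormal), normalize(uLightDir));

    // Clamp at zero: surfaces facing away from the light get no diffuse
    // contribution instead of a negative one.
    float diffuse = max(nDotL, 0.0);

    gl_FragColor = vec4(uAlbedo * diffuse, 1.0);
}
```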

torginus · on Feb 11, 2025

The way you write seems to imply you have professional experience in the matter, which makes it very strange you're not getting what I'm writing about.

Nobody ever talked about clamping - and it's not even relevant to the discussion, as it doesn't introduce a discontinuity that can cause aliasing.

What I'm referring to is shader aliasing, which MSAA does nothing about - MSAA is for geometry aliasing.

To illustrate what I'm talking about, here is an example that draws a red circle on a quad:

The bad version:

    gl_FragColor = vec4(vec3(1.0 - step(0.25, distance(vUv, vec2(0.5)))) * vec3(1.0, 0.0, 0.0), 1.0);

The good version:

    gl_FragColor = vec4(vec3(1.0 - smoothstep(0.24, 0.25, distance(vUv, vec2(0.5)))) * vec3(1.0, 0.0, 0.0), 1.0);

The first version has a hard boundary for the circle, which gives it an ugly, aliased and pixelated contour, while the latter version smooths it. This example might not be egregious, but this can and does show up in some circumstances.
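As a possible variation, not something the snippets above depend on: the fixed 0.24 to 0.25 band can be swapped for one sized from screen-space derivatives, so the soft edge stays about a pixel wide at any resolution. This is only a sketch, assuming WebGL 1 with the OES_standard_derivatives extension enabled on the JavaScript side:

```glsl
#extension GL_OES_standard_derivatives : enable
precision mediump float;

varying vec2 vUv;

void main() {
    float d = distance(vUv, vec2(0.5));

    // Approximate change of d across one pixel; the lower bound keeps the
    // two smoothstep edges from coinciding.
    float w = max(fwidth(d), 0.0001);

    // 1.0 inside the circle, fading to 0.0 over roughly one pixel.
    float inside = 1.0 - smoothstep(0.25 - w, 0.25, d);

    gl_FragColor = vec4(inside * vec3(1.0, 0.0, 0.0), 1.0);
}
```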