Friday, May 8, 2009

Crazy GPU Tricks.

I haven't really posted yet on graphics programming. But my current task at Bunkspeed has me doing some crazy shader tricks, and I ran into something that GPUs do that is simultaneously awesome and awful, so I had to mention it.

If you don't know anything about GPU graphics, reading this post will confuse, fluster, or bore you.

So here's the deal. You have a texture and some texture coordinates. Texturing correctly happens to involve calculating the differential of the texture coordinate across the screen, that is, how fast the coordinate changes from one pixel to the next. In the old-style fixed-function pipeline this is done for you. Well, relatively recently at least. If you remember how textures looked on the PlayStation 1, how they kind of swam and made you sick, that's what happens when the hardware skips this kind of per-pixel correction (the PS1 had neither perspective-correct texturing nor mip-mapping).

Anyway, the fixed-function hardware can do this for you because it knows what the texture coordinates are: they're stored right in the polygon's vertices. But when we start using custom shaders that generate texture coordinates dynamically, this changes. If the shader can spit out any arbitrary number, how does the GPU determine this differential? It could make the shader generate it too, but it doesn't. The shader author doesn't have to produce differentials by hand, which is a very good thing; that'd be obnoxiously difficult.

To understand the answer you have to know something about shader execution. GPUs are fast because they are massively parallel. They execute many pixels in parallel. In fact, they cluster pixels together and run them in lockstep. That is, the shader executes for every pixel in the cluster at the same time: each pixel runs every instruction simultaneously.

So the processor cheats. To calculate the differential across the screen, it literally grabs the value from the next pixel over, subtracts, and calls it a day. It's only a first-order approximation, but it's accurate enough to get the job done in most cases.
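Here's a toy sketch of that trick in Python, just to make the arithmetic concrete. It assumes the cluster is a 2x2 block of pixels (which is how this is commonly done, but take the details as illustrative), and `texcoord` is a made-up stand-in for whatever math the shader does to produce a coordinate.

```python
def texcoord(x, y):
    # Hypothetical shader math: any function of screen position will do.
    return 0.25 * x + 0.5 * y

def quad_differentials(x0, y0, f):
    """Finite differences over the 2x2 quad whose top-left pixel is (x0, y0)."""
    # One "register" per pixel in the quad, all filled in the same step.
    reg = {(dx, dy): f(x0 + dx, y0 + dy) for dx in (0, 1) for dy in (0, 1)}
    ddx = reg[(1, 0)] - reg[(0, 0)]  # neighbor to the right minus this pixel
    ddy = reg[(0, 1)] - reg[(0, 0)]  # neighbor below minus this pixel
    return ddx, ddy

print(quad_differentials(10, 20, texcoord))  # (0.25, 0.5)
```

No calculus anywhere: the GPU never needs to know what `texcoord` actually computes, only what landed in the register.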

Which value does it compare to? Well, when you execute a texture lookup, it looks at which register the input texture coordinate is in, and uses that register index to look up the same register in the neighboring pixels. Brilliant! So where's the problem?

Flow control. Say you have an if statement. Very simple. Ifs aren't too common in shaders, but we'd like them to be usable. Say you compute the texture coordinate inside the if statement. Say your neighboring pixel never went into that if statement. Now what? Your register has been computed, but the neighboring pixel's has not! The differential is utterly invalid! Just because you used an if statement. Consider the following two blocks of shader code:

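// Version 1: TC computed outside the branch. ("result" is a placeholder; the accumulator's name is missing from the original.)
float3 TC = float3(R.x, R.y, -R.z);
if (dot(R, R) > .01) {
    result += texCUBE(SpecularMap, TC).xyz;
}

// Version 2: TC computed inside the branch.
if (dot(R, R) > .01) {
    float3 TC = float3(R.x, R.y, -R.z);
    result += texCUBE(SpecularMap, TC).xyz;
}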

The two are not functionally different. In fact, most C programmers would opt for the second one, because why calculate something outside the if when you might never need it? BUT, the second one causes visual artifacts, while the first one does not!
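To see why the two shader blocks above behave differently, here's a rough Python model of two neighboring pixels. Everything in it is an assumption made for illustration: `STALE` is an invented value standing in for whatever happened to be in the register beforehand (on real hardware the result is simply undefined), and a single float stands in for the `float3` coordinate.

```python
# Toy model of two neighboring pixels running the same shader in lockstep.
# STALE is a made-up stand-in for whatever was left in the register before.
STALE = 999.0

def tc_registers(r_values, tc_inside_branch):
    """Return the TC register of each pixel after the shader has run."""
    regs = [STALE] * len(r_values)
    for i, r in enumerate(r_values):
        branch_taken = r * r > 0.01      # stands in for dot(R, R) > .01
        if branch_taken or not tc_inside_branch:
            regs[i] = -r                 # stands in for TC = float3(R.x, R.y, -R.z)
    return regs

def differential(regs):
    """What pixel 0 sees: the neighbor's register minus its own."""
    return regs[1] - regs[0]

r_values = [0.5, 0.0625]  # pixel 0 takes the branch, pixel 1 does not

print(differential(tc_registers(r_values, tc_inside_branch=False)))  # 0.4375
print(differential(tc_registers(r_values, tc_inside_branch=True)))   # 999.5
```

With the coordinate computed outside the branch, both registers hold real values and the difference is sensible. Move it inside, and pixel 0 subtracts its own coordinate from garbage.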

The second use of that differential calculation is to select which mip-level of the texture to use. If the differential is high, that implies the texture is being scaled down, so the GPU uses a smaller, lower-resolution mip-level. If you have a correctly built mip-chain, the worst you'll get is a small color aberration. But if you're using a rendered texture without a good mip-chain, well, you get garbage data. Here's my recent example for you; guess which is which.

The white edges on the cells are pixels where the differential is invalid, picking a white pixel out of the mip-chain. 
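For the curious, here's roughly how a differential turns into a mip-level. This is a simplified sketch, not what any particular GPU does: real hardware also deals with bias, anisotropy, and blending between levels. The function name and parameters are mine, and the texture size and bogus differential are invented numbers.

```python
import math

def mip_level(ddx, ddy, texture_size, num_levels):
    """Pick a mip level from texture-coordinate differentials (simplified)."""
    # Convert the per-pixel change in normalized coordinates to texels per pixel.
    footprint = max(abs(ddx), abs(ddy)) * texture_size
    if footprint <= 1.0:
        return 0  # magnified or 1:1, so use the full-resolution level
    # Each level halves the resolution, hence the log2; clamp to the last level.
    return min(int(math.log2(footprint)), num_levels - 1)

# A 256x256 texture has 9 levels: 256, 128, ..., 1.
print(mip_level(1 / 256, 1 / 256, 256, 9))  # 0: one texel per pixel
print(mip_level(4 / 256, 1 / 256, 256, 9))  # 2: four texels per pixel
print(mip_level(999.5, 0.0, 256, 9))        # 8: a bogus differential hits the 1x1 level
```

So an invalid differential doesn't just nudge the result; it can throw the lookup all the way to the bottom of the mip-chain, where a rendered texture may contain anything.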

So yes, it's a wondrous hack, because it's capable of computing complicated differentials perfectly and automatically. But it's an awful hack, because it's extremely difficult to track down when something's going wonky, and because the fix looks like an arbitrary, nonsensical change to the shader code.

But It Is Awesome.
