ps 1.x constants are clamped to [-1;1], constants in >= 2.0 pshaders
are not. This means we have to reload constants when switching between
those shader types in ARB. In GLSL this is not a concern because
constants are tied to program objects and are reloaded on a shader
change anyway.
This patch tries to find a free texture coordinate to load up to 4 clip
coordinates into the pixel shader, and uses KIL to throw away fragments
that are cut by a clipplane. If no free texture coordinate is found,
clipping is not done. If more than 4 clipplanes are used, only the first
4 are actually enabled. That should be pretty rare though.
Using GL_NV_vertex_program2_option so far. If we're really desparate we can
handle some cases without the extension by using a custom varying and texkill
in the fragment program.
This gives a small performance improvement. Don't enable NVfp for it though,
because the NVfp penalty is bigger than the gain from this patch. But if NVfp
is enabled anyway, make use of it.
This reverts patch ba35760f9fd5fd90a0fa34077862f04513d1ab16.
The original patch did not achive its goal, because CMP is a macro that is
expanded to SLT, SGE, MUL, MAD, at least on nvidia hardware. To make matters
worse, it uses a temporary register, and the assembler usually is not clever
enough to find a free temporary from the shader code. If we generate the code
outselves we can pick one of our temps for this job.
Many 2.0 and 3.0 shaders end with a "mov oC0, rx". If sRGB writing is enabled,
the ARB backend writes to a TMP_COLOR temporary, and at the end of the shader
writes the sRGB corrected color to result.color. If oC0 is not partially
rewritten after the mov, we can ignore the mov, not declare TMP_COLOR at all,
and just use the rx register as input for the sRGB correction code. This saves
a temporary and an instruction.
This reduces the number of methods in the shader backend(the instr
modifiers can be handled in that wrapper) and it will help flow
control emulation in the ARB backend.
SCS is unfortunately a fragment program only instruction. If we have the NV
extensions we can use SIN and COS. Otherwise we have to approximate sine and
cosine with a taylor series. Luckily we're provided with the necessary
constants by the application.
TMP_POS is only used in vertex shaders, declare it in the vshader
specific code. The sRGB constants are only used by pixel shaders, so
move them to the ps specific code, and avoid reading the stateblock.