I don't like that I have to do this because the posFixup is in all
vertex programs, so it's at the same position and could be loaded
globally. Unfortunately, there are only 256 env parameters usually,
which makes it impossible for any shader to use c256, even if it does
not use indirect addressing, and so we can't claim 256 constant
support.
I find it helpful for debugging to have this controlled at a central place,
without having to disable the entire GL extension or manually find all the
places where GL_SUPPORT(NV_VERTEX_PROGRAM2_OPTION) controls clipplane use. It
is useful for debugging the emulation code on NV cards and for debugging Mac
driver issues.
b2f09fd204 accidentally got the
device->vs_clipping check wrong. The FFP replacement should emulate
clipping if GL can't do this natively with vertex shaders, not the
other way. Also don't emulate clipping if we're using fixed function
vertex processing because (a) clipping is always supported by GL in
this case, and (b) fragment.texcoord[7] is undefined (or, in the
worst case, set to something bad by the app).
If the needed constants are available, we can support all vs_2_0 and ps_2_0
requirements with the plain ARB extensions. We cannot, however, run SM 2.0a or
SM 2.0b.
ps 1.x constants are clamped to [-1, 1]; constants in >= 2.0 pshaders
are not. This means we have to reload constants when switching between
those shader types in ARB. In GLSL this is not a concern because
constants are tied to program objects and are reloaded on a shader
change anyway.
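The clamping difference can be sketched in C as follows. This is only an illustration under assumptions: load_ps_constants and clamp_ps1 are hypothetical names, not functions from the actual backend, and the real loader works on GL program parameters rather than plain arrays.

```c
#include <stddef.h>

/* Hypothetical helper: clamp one constant component to [-1, 1],
 * as ps 1.x semantics require. */
static float clamp_ps1(float v)
{
    if (v < -1.0f) return -1.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* Hypothetical loader: copy `count` float4 constants from src to dst.
 * Values are clamped only when the target shader is ps 1.x
 * (ps_major < 2); ps 2.0 and later receive the values unchanged. */
void load_ps_constants(float *dst, const float *src, size_t count,
                       unsigned int ps_major)
{
    size_t i;

    for (i = 0; i < count * 4; ++i)
        dst[i] = (ps_major < 2) ? clamp_ps1(src[i]) : src[i];
}
```

Because the same source values produce different loaded values for the two shader types, a switch between them forces a reload.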
This patch tries to find a free texture coordinate to load up to 4 clip
coordinates into the pixel shader, and uses KIL to throw away fragments
that are cut by a clipplane. If no free texture coordinate is found,
clipping is not done. If more than 4 clipplanes are used, only the first
4 are actually enabled. That should be pretty rare though.
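A CPU-side sketch of the idea in C, assuming the clip distances are plain dot products of the position with each plane equation. The function names and the MAX_EMULATED_CLIPPLANES define are made up for illustration; the patch itself emits shader instructions, not C.

```c
#define MAX_EMULATED_CLIPPLANES 4

/* Vertex side (hypothetical): compute one distance per enabled plane and
 * pack them into the float4 that would travel in the free texture
 * coordinate. `planes` holds plane_count plane equations of 4 floats each.
 * Planes beyond the fourth are ignored, and unused components are set to
 * 0 so that they never trigger the kill. */
void compute_clip_distances(const float pos[4], const float *planes,
                            unsigned int plane_count, float out[4])
{
    unsigned int i;

    if (plane_count > MAX_EMULATED_CLIPPLANES)
        plane_count = MAX_EMULATED_CLIPPLANES;

    for (i = 0; i < MAX_EMULATED_CLIPPLANES; ++i) {
        if (i < plane_count) {
            const float *p = planes + 4 * i;
            out[i] = p[0] * pos[0] + p[1] * pos[1]
                   + p[2] * pos[2] + p[3] * pos[3];
        } else {
            out[i] = 0.0f;
        }
    }
}

/* Fragment side (hypothetical): mirrors KIL, which discards the fragment
 * if any component of its operand is negative. Returns 1 if killed. */
int fragment_killed(const float clip[4])
{
    unsigned int i;

    for (i = 0; i < 4; ++i)
        if (clip[i] < 0.0f)
            return 1;
    return 0;
}
```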
Using GL_NV_vertex_program2_option so far. If we're really desperate we can
handle some cases without the extension by using a custom varying and texkill
in the fragment program.
This gives a small performance improvement. Don't enable NVfp for it though,
because the NVfp penalty is bigger than the gain from this patch. But if NVfp
is enabled anyway, make use of it.
This reverts patch ba35760f9f.
The original patch did not achieve its goal, because CMP is a macro that is
expanded to SLT, SGE, MUL, MAD, at least on NVIDIA hardware. To make matters
worse, it uses a temporary register, and the assembler usually is not clever
enough to find a free temporary from the shader code. If we generate the code
ourselves we can pick one of our temps for this job.
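For illustration, here is a scalar C model of one plausible SLT/SGE/MUL/MAD sequence that reproduces CMP (result = src0 < 0 ? src1 : src2). The exact sequence a driver generates is not stated here, so the instruction order below is an assumption, and all function names are hypothetical.

```c
/* Scalar models of the ARB instructions involved. */
static float slt(float a, float b) { return a < b ? 1.0f : 0.0f; }
static float sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }

/* Reference semantics of CMP. */
float cmp_reference(float s0, float s1, float s2)
{
    return s0 < 0.0f ? s1 : s2;
}

/* Assumed expansion. `tmp` is the extra temporary register that the
 * expansion needs in addition to the destination. */
float cmp_expanded(float s0, float s1, float s2)
{
    float tmp, dst;

    tmp = slt(s0, 0.0f);   /* SLT tmp, s0, 0       */
    dst = tmp * s1;        /* MUL dst, tmp, s1     */
    tmp = sge(s0, 0.0f);   /* SGE tmp, s0, 0       */
    dst = tmp * s2 + dst;  /* MAD dst, tmp, s2, dst */
    return dst;
}
```

For finite inputs both functions return the same value. They can differ when the unselected operand is infinite or NaN, since the expansion multiplies it by zero instead of ignoring it.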
Many 2.0 and 3.0 shaders end with a "mov oC0, rx". If sRGB writing is enabled,
the ARB backend writes to a TMP_COLOR temporary, and at the end of the shader
writes the sRGB corrected color to result.color. If oC0 is not partially
rewritten after the mov, we can ignore the mov, not declare TMP_COLOR at all,
and just use the rx register as input for the sRGB correction code. This saves
a temporary and an instruction.
This reduces the number of methods in the shader backend (the instr
modifiers can be handled in that wrapper) and it will help flow
control emulation in the ARB backend.
SCS is unfortunately a fragment program only instruction. If we have the NV
extensions we can use SIN and COS. Otherwise we have to approximate sine and
cosine with a Taylor series. Luckily we're provided with the necessary
constants by the application.
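A minimal C sketch of such an approximation, evaluated with Horner's scheme. The coefficient layout (two arrays of four floats) and the name sincos_taylor are assumptions made for this example and do not claim to match the layout of the constants the application supplies.

```c
/* Hypothetical Taylor approximation around 0:
 *   sin(x) ~= x * (s[0] + s[1]*x^2 + s[2]*x^4 + s[3]*x^6)
 *   cos(x) ~=      c[0] + c[1]*x^2 + c[2]*x^4 + c[3]*x^6
 * With s = {1, -1/6, 1/120, -1/5040} and c = {1, -1/2, 1/24, -1/720}
 * these are the usual series truncated after four terms. The result is
 * only accurate for small |x|, so the caller is expected to pass an
 * angle that is already reduced to a range around zero. */
void sincos_taylor(float x, const float sin_coeff[4],
                   const float cos_coeff[4], float *s, float *c)
{
    float x2 = x * x;

    *s = x * (sin_coeff[0] + x2 * (sin_coeff[1]
            + x2 * (sin_coeff[2] + x2 * sin_coeff[3])));
    *c = cos_coeff[0] + x2 * (cos_coeff[1]
            + x2 * (cos_coeff[2] + x2 * cos_coeff[3]));
}
```

Each polynomial costs a handful of MUL/MAD instructions in shader code, which is why having the coefficients ready in constant registers matters.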
TMP_POS is only used in vertex shaders, so declare it in the vshader
specific code. The sRGB constants are only used by pixel shaders, so
move them to the ps specific code, and avoid reading the stateblock.
To be able to keep the temporary register in the type independent NRM
instruction, the vertex temporary register is renamed to TA to match
the name of a pixel shader register.
texm3x2pad knows which register the following texm3x2depth or tex instruction
will use, and it knows that this register is uninitialized. So use it for
temporary storage instead of TMP.
This is the Nth attempt to make clipping work with GLSL shaders. The patch now
uses the GLSL quirk table to handle cards that need a custom varying for
gl_ClipPos, and the code is adapted to the changed state table and shader
backend system.
This simplifies the loading code a bit. The constants were never
designed to be at the same location in all shaders, so there's no
point in using program.env. This way we don't collide with the d3d
shader constants and it's easier to work together with NP2 fixups and
other shaders.
This was needed unconditionally in the past to apply fog, but since we're
using the ARBfp fog defines it is only needed if an sRGB correction is done
at the end of the shader.