Based on a patch by Stefan Dösinger. This is more flexible, and allows
the shader backend implementation to be simpler, since it doesn't have
to know about specific formats. The next patch makes use of this.
This avoids the double search for a pixel shader. The pixel shader
compilation parameter structure is recorded in the GLSL program
hashmap, together with the WineD3D pixel shader.
Some stateblock parameters have to be compiled into the GL pixel
shader code, like lines for pixelformat fixups. This leads to problems
when applications switch those settings, requiring a recompilation of
the shader. This patch enables wined3d to have multiple GL shaders for
a D3D shader(pixel shaders only so far) to handle this more
efficiently.
This was suggested by Ivan quite a while ago, and we need it to better
handle conflicting texture format corrections and similar stateblock
value changes which until now required a recompilation of the entire
shader
A number of considerations contribute to this:
1) The shader backend knows best which shader(s) it needs. GLSL needs
both, arb only one
2) The shader backend may pass some parameters to the compilation
code(e.g. which pixel format fixup to use)
3) The structures used in (2) are different in vs and ps, so a
baseshader::Compile won't work
4) The structures in (2) are wined3d-private structures, so
having a public method in the vtable won't work(its a bad idea
anyway).
GL_ATI_envmap_bumpmap provides two things: Signed V8U8 pixel formats,
and bump mapping. The extension is only supported on fglrx, and this
driver also supports GL_ARB_fragment_program. Thus the bump mapping
code is never used on any driver out there. Furthermore, if it is
used, it tends to crash the driver
The signed pixel format is used, as it can be used by pixel shaders or
the ARBfp replacement. However, the format is broken in fglrx, and
negative values are clamped to 0.0. This results in test
failures. WineD3D has an alternative codepath using scale+bias to
enable V8U8 using a standard signed RGB which works correctly on
fglrx.
GL_ARB_fragment_program and GL_ATI_fragment_shader can disable
projected textures properly, and they can also handle
D3DTTFF_PROJECTED | D3DTTFF_COUNT3 properly.
If a format is not supported natively by opengl, a shader may be able
to convert it. Up to now, CheckDeviceFormat had magic knowldge which
GL extensions lead to which supported format. This patch adds
functions that allow CheckDeviceFormat to ask the actual
implementation for its capabilities.
This is an ATI specific format designed for compressed normal maps,
and quite a few games check for its existence. While it is an
ATI-specific "extension" in d3d9, it is a core part of
D3D10(DXGI_FORMAT_BC5), and supported on Geforce 8 cards.
It isn't related to the shader backend any longer. The nvts_enable in
the ffp code isn't quite right as well, it should be moved away once
there is a dedicated nvts fragment pipeline replacement
Calling shader_select() from inside depth_blt() isn't necessarily
safe. shader_select() assumes CompileShader() has been called for the
current shaders, but that depends on STATE_VSHADER / STATE_PIXELSHADER
being applied. That isn't always true when depth_blt() gets called,
with the result that sometimes GLSL programs could be created with no
shader objects attached.
SM3.0 requires 10 4 component float varyings for passing stuff between
vertex and pixel shaders. GF7 and earlier report 8 generic varyings +
gl_Color and gl_SecondaryColor in GLSL. This patch allows us to use
gl_Color and gl_SecondaryColor to get 2 extra varyings, which some
games, like C&C3 with highest gfx settings, require.
The previous logic assumed that if NVTS or ATIFS are available they
will be used. This happens to be true for NVTS, but ATIFS is only used
if neither ARBFP nor GLSL are supported. This breaks fixed function
fragment processing on ATI r300 and newer cards
Since the shader backend implementations might track opengl resources in
their private data inform them about reset calls. For example, the atifs
backend keeps track of the replacement shaders, which are lost during an
opengl context recreation.
The whole control structures in directx.c get terribly confusing with
the various codepaths for texturing and different shader
implementations. It is also hard to reflect the shader model
decisions this way too. This patch moves the shader specific parts of
the caps code into the shader backend where we can set our caps
dependent of the shader model decisions and without complex caps flag
checks.
Generating the shader ID and parts of the shader prolog and epilog was
done by the common vertexshader.c / pixelshader.c, which is ugly.
This patch doesn't get rid of all the uglyness, somewhen we'll still
have to sort out the relationship of [arb|glsl]_generate_shader and
[arb|glsl]_generate_declarations.
Add a new property of the shader backend which indicates whether the
shader backend is able to dirtify single constants rather than
dirtifying vshader and pshader constants as a whole. Depending on this
a different Set*ConstantF implementation is used which marks constants
dirty. The ARB shader backend uses this and marks constants clean
after uploading.
The GL_ARB_vertex_program extension does not define a standard value for
output texture coordinates. This makes problems when using vertex
shaders with fixed function fragment processing because fffp divides the
texture coords by its .w component. This means that gl shaders have to
write to the .w component of texture coords. Direct3D shaders however
do not.
gl_FogFragCoord and gl_PointSize are floats rather than vec4s in GLSL,
so they shouldn't have a destination swizzle, and the write mask we
return should consist of only the first component.
- Replace gl_FragColor with gl_FragData[0] for GLSL pixel shader output.
- Subtract 1 more constant from total GLSL allowed float constants to
accommodate the PROJECTION matrix row that we reference.
The current GLSL cmp instruction is incorrect, because:
- it ignores destination write mask
- it ignores source swizzle
- it ignores other source modifiers.
- it works incorrectly for src0 = 0
- Implement if, else, endif, rep, endrep, break
- Implement ifc, breakc, using undocumented comparison bits in the instruction token
- Fix bug in main loop processing of codes with no dst token
- Fix bug in GLSL output modifier processing of codes with no dst token
- Fix bug in loop implementation (src1 contains the integer data, src0 is aL)
- Add versioning for all the instructions above, and remove
GLSL_REQUIRED thing, which is useless and should be removed from all
opcodes in general.
- move DEF, DEFI, DEFB handling into the register counting pass
- keep track of defined constants as a linked list (because there's a
few of them)
- apply immediate constants after global constants in the constant
loading function
- both types of constants now get loaded with array notation in the
shader (into the same array)
This fixes the translations for a few instructions in GLSL and allows
Cubemap sampling in pixel shaders < 2.0. It makes some of the
lighting on textures in Half Life 2 look better, including some of the
water effects. It's not perfect yet, but much closer now.
- Implement D3DSIO_DP2ADD, D3DSIO_TEXKILL, D3DSIO_TEXM3X3PAD
- Partially implement D3DSIO_TEXBEM, D3DSIO_TEXM3X3VSPEC (as much as
they are implemented in ARB_fragment_program at least).
- Stop copying the SHADER_PARSE_STATE struct in each ARB shader
routine - use a pointer instead.
We are only checking against GL_MAX_TEXTURES when binding samplers,
when we should be checking against the maximum number of samplers that
the card supports. Spotted by H. Verbeet.
- NVidia allows "const vec4 = {1.0, 2.0, 3.0, 4.0};", even though
that's not part of the spec.
- It should be "const vec4 = vecr4(1.0, 2.0, 3.0, 4.0);"
- This patch fixes this for D3DSIO_DEF and D3DSIO_DEFI.
- Separate the declaration phase of the shader string generator into
the arb and glsl specific files.
- Add declarations and recognition for application-sent constant
integers and booleans (locally defined ones will follow).
- Standardize capitilization of pixel/vertex specific variable names.
- Moves GLSL constant loading code into glsl_shader.c and out of the
over-populated drawprim.c.
- Creates a new file named arb_program_shader.c which will hold code
specific to ARB_vertex_program & ARB_fragment_program.
- Remove the constant loading calls from drawprim.c
- Implemented: D3DSIO_SGN, LOOP, ENDLOOP, LOGP, LIT, DST, SINCOS
- Process instruction-based modifiers (function existed, it just
wasn't being called)
- Add loop checking to register maps.
- Renamed "sng" to "sgn" for D3DSIO_SGN - it's not handled anywhere
except for GLSL, so won't matter.
- track sampler declarations and store the sampler usage in reg_maps structure
- store a fake sampler usage for 1.X shaders (defined as 2D sampler)
- re-sync glsl TEX implementation with the ARB one (no idea why they diverged..)
- use sampler type in new TEX implementation to support 2D, 3D, and Cube sampling
- change drawprim to bind pixel shader samplers
Additional improvements:
- rename texture limit to texcoord to prevent confusion
- add sampler limit, and use that for samplers - *not* the same as texcoord above
SM 3.0 can pack multiple "semantics" into 12 generic input/output registers.
To support that, define temporaries called IN and OUT, and use those as
the output registers. At the end of the vshader, unpack the OUT temps
into the proper GL variables. At the beginning of the pshader, pack the
GL variables back into 12 IN registers.
Now that the declaration function is out of the way, the tracing pass,
which is very long and 100% the same can be shared between pixel and
vertex shaders.
The new function is called shader_trace_init(), and is responsible for:
- tracing the shader
- initializing the function length
- setting the shader version [needed very early]
The new function is called in pass 2 (getister counting/maps), and
it's now in baseshader. It operates on all INPUT and OUTPUT registers,
which, in addition to the old vertex shader input declarations covers
Shader Model 3.0 vshader output and pshader input declarations. The
result is stored into the reg_map structure.
Delete the entire namedArrays code path and all its dependencies (one
of which is quite long - storeOrder in drawprim is always FALSE, for
example). Delete declaredArrays, and make its code path the default.
I missed a register mask in the move to share the shader_hw_def()
function between pixel and vertex shaders for ARB shaders. Fixed
that, and made the GLSL version use the same mask for consistency.
- Based on comments from H. Verbeet
- Changed the distinction from .rgba & .xyzw masks to only use .xyzw
in GLSL shaders. They are interchangeable, and only served to make
the trace look more intuitive, but they don't always apply as-is, so
we'll just leave everything to .xyzw.
- Got rid of the "UseProgramObjectARB(0)" call in drawprim. If there
is no shader set on the next primitive, then that primitive will
call UseProgramObjectARB(0) when it begins to draw.
- Add a new file glsl_shader.c which contains almost every GLSL specific function we'll need
- Move print_glsl_info() into glsl_shader.c
- Move the shader_reg_maps struct info into the private header, and make it part of SHADER_OPCODE_ARG.
- Create a new shared ps/vs register map for float constants (future patch will make ARB programs use this, too)