- Implemented: D3DSIO_SGN, LOOP, ENDLOOP, LOGP, LIT, DST, SINCOS
- Process instruction-based modifiers (function existed, it just
wasn't being called)
- Add loop checking to register maps.
- Renamed "sng" to "sgn" for D3DSIO_SGN - it's not handled anywhere
except for GLSL, so won't matter.
There are a total of 17 instructions without a destination token. Of
those 9 have num_params != 0, which means that we will not process any
of them correctly, because we assume the first token (if present) is a
destination token.
Those are basically all the flow control instructions, which we plan to
support very soon. They have source tokens, and no destination. Add a
flag that marks them up to the ins table. Use this flag in the trace
pass, and generation pass.
- track sampler declarations and store the sampler usage in reg_maps structure
- store a fake sampler usage for 1.X shaders (defined as 2D sampler)
- re-sync glsl TEX implementation with the ARB one (no idea why they diverged..)
- use sampler type in new TEX implementation to support 2D, 3D, and Cube sampling
- change drawprim to bind pixel shader samplers
Additional improvements:
- rename texture limit to texcoord to prevent confusion
- add sampler limit, and use that for samplers - *not* the same as texcoord above
SM 3.0 can pack multiple "semantics" into 12 generic input/output registers.
To support that, define temporaries called IN and OUT, and use those as
the output registers. At the end of the vshader, unpack the OUT temps
into the proper GL variables. At the beginning of the pshader, pack the
GL variables back into 12 IN registers.
Various cleanups:
- do not use DWORD as a bitmask, that places artificial limit of 32 on
registers
- track attributes that are used and declare only those
- move declarations function call in pshader/vshader to allow us to
insert pixel or vertex specific code between the declarations and
the rest of the code
- remove redundant 0 intializers
- remove useless continue statement
Now that the declaration function is out of the way, the tracing pass,
which is very long and 100% the same can be shared between pixel and
vertex shaders.
The new function is called shader_trace_init(), and is responsible for:
- tracing the shader
- initializing the function length
- setting the shader version [needed very early]
The new function is called in pass 2 (getister counting/maps), and
it's now in baseshader. It operates on all INPUT and OUTPUT registers,
which, in addition to the old vertex shader input declarations covers
Shader Model 3.0 vshader output and pshader input declarations. The
result is stored into the reg_map structure.
Delete the entire namedArrays code path and all its dependencies (one
of which is quite long - storeOrder in drawprim is always FALSE, for
example). Delete declaredArrays, and make its code path the default.
I missed a register mask in the move to share the shader_hw_def()
function between pixel and vertex shaders for ARB shaders. Fixed
that, and made the GLSL version use the same mask for consistency.
- Based on comments from H. Verbeet
- Changed the distinction from .rgba & .xyzw masks to only use .xyzw
in GLSL shaders. They are interchangeable, and only served to make
the trace look more intuitive, but they don't always apply as-is, so
we'll just leave everything to .xyzw.
- Got rid of the "UseProgramObjectARB(0)" call in drawprim. If there
is no shader set on the next primitive, then that primitive will
call UseProgramObjectARB(0) when it begins to draw.
- Add functions to attach & detach shader objects, create and delete programs, and maintain the list of programs.
- Add a list of GLSL shader programs to the device which is initialized on Init3D(), and deleted on Release().
- Declare more variable names for GLSL programs.
- Some of these won't need to be declared eventually, but it doesn't hurt to do it for now.
- Correct output name for pixel shaders (gl_FragColor instead of glFragColor).
- Add a new file glsl_shader.c which contains almost every GLSL specific function we'll need
- Move print_glsl_info() into glsl_shader.c
- Move the shader_reg_maps struct info into the private header, and make it part of SHADER_OPCODE_ARG.
- Create a new shared ps/vs register map for float constants (future patch will make ARB programs use this, too)
loading float constants for GLSL.
- DrawPrim is just too big of a function. This separates the passing
of constants to the shader into new functions.
- Fixes an off-by-one error when loading vertex declaration constants
(should be <, not <=)
- Adds a function for GLSL loading of constants (aka Uniforms)
- Adds a GLSL program variable to the stateblock and sets it to 0 (a
future patch will actually create this program)
It is wrong to maintain a mapping from a constant index to a type
field, because different constant types do not share an index -
boolean constant 0 is supposed to co-exist with floating point
constant 0, not replace it. Drawprim and other code using the type
array to decide whether to look up a constant in bools, floats, or
ints is wrong - you can't make that decision based on the index.
Implement some basic opengl accelerated blts from and to render
targets. It's not perfect yet, but enought to make some D3D apps
happy. For now the only supported operations are:
- Full screen back -> Front buffer: Just call present
- Offscreen surface -> render target
- Render target -> offscreen surface(slow)
- render target colorfill
Each instruction can have a predication token. Account for it in the
trace pass, register count pass, and store it in the SHADER_OPCODE_ARG
structure for generation. MSDN claims the token is at the end of the
instruction, but that's not true - testing a demo, which lets me
manipulate the shader shows the predication token is the first source
token immediately following the destination token.
As previously mentioned, RASTOUT is invalid on pixel shaders.
On shaders 1.x, r0 is treated as the color output register:
http://www.gamedev.net/columns/hardcore/dxshader3/page2.asp
That's what we currently do in all cases, change it not to do so
for shaders >= 2.0. Support COLOROUT/DEPTHOUT instead.
Currently we hardcode a0.x, which I think is correct for shaders 1.0.
However, for shaders 2.0, we must look into the address token, and
print the register there. Handle both cases to correct the trace.
Change the trace pass, the register counting pass, and the hw
generator pass to take into account the new get_params() function. For
hw generation, store the address tokens into the SHADER_OPCODE_ARG
structure, so they're available to generator functions.
Add a new function to process parameters.
On shaders 1.0, processing parameters amounts to *pToken++.
On shaders 2.0+, we have a relative addressing token to account for.
This function should be used, instead of relying on num_params everywhere.
Share shader_dump_ins_modifiers(), and make vertex shaders use it.
The saturate modifer (_sat) is valid on vs_3_0+, and it isn't being
shown in the trace.