- currently half the shader selection code (GLSL vs ARB) is in
fillGLcaps. The parts that check for software shaders are in
GetDeviceCaps. That placement will work, but is definitely not optimal.
FillGLcaps should detect support - it should not make decisions as to
what's used, because that's not the purpose of the function.
GetDeviceCaps should report support as it has already been selected.
Instead, select shader mode in its own function, called in the
appropriate places.
- unifying pixel and vertex shaders into a single selection is a
mistake. A software vertex shader can be coupled with a hardware ARB or
GLSL pixel shader, or no shader at all. Split them back into two and add
a SHADER_NONE variant.
- drawprim is doing support checks for ARB_PROGRAM, and making shader
decisions based on that - that's wrong, support has already been
checked, and decided upon, and shaders can be implemented via software,
ARB_PROGRAM or GLSL, so that support check isn't valid.
- Store the selected shader mode into the shader itself. Different types
of shaders can be combined, so this is an improvement. In fact, storing
the mode into the settings globally is a mistake as well - it should be
done per device, since different cards have different capabilities.
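As a rough sketch of the selection described in the points above, the following keeps detection, preference and selection apart, and picks vertex and pixel modes independently per device. All names here (shader_mode, gl_support, shader_prefs, select_shader_modes) are hypothetical and are not the identifiers used in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not the actual wined3d identifiers. */
enum shader_mode {
    SHADER_NONE,  /* no programmable shader for this stage */
    SHADER_SW,    /* software emulation */
    SHADER_ARB,   /* ARB_vertex_program / ARB_fragment_program */
    SHADER_GLSL   /* GLSL */
};

/* What detection found: support only, no decisions. */
struct gl_support {
    bool arb_vertex_program;
    bool arb_fragment_program;
    bool glsl;
};

/* User preferences (assumed to come from configuration). */
struct shader_prefs {
    bool allow_glsl;
    bool allow_sw_vertex;
};

/* Stored per device, with separate vertex and pixel modes. */
struct device_shader_modes {
    enum shader_mode vs;
    enum shader_mode ps;
};

void select_shader_modes(const struct gl_support *gl,
                         const struct shader_prefs *prefs,
                         struct device_shader_modes *out)
{
    assert(gl != NULL && prefs != NULL && out != NULL);

    if (gl->glsl && prefs->allow_glsl)
        out->vs = SHADER_GLSL;
    else if (gl->arb_vertex_program)
        out->vs = SHADER_ARB;
    else if (prefs->allow_sw_vertex)
        out->vs = SHADER_SW;
    else
        out->vs = SHADER_NONE;

    /* This sketch assumes there is no software fallback for pixel shaders. */
    if (gl->glsl && prefs->allow_glsl)
        out->ps = SHADER_GLSL;
    else if (gl->arb_fragment_program)
        out->ps = SHADER_ARB;
    else
        out->ps = SHADER_NONE;
}
```

With this split, a card that only exposes ARB_fragment_program ends up with a software vertex mode and an ARB pixel mode, which a single unified selection cannot express.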
On nVidia cards the value of GL_MAX_TEXTURE_UNITS is generally not
larger than 4. In Direct3D that would correspond to
MaxSimultaneousTextures in the caps, rather than MaxTextureBlendStages
(which can be much larger) to which it currently corresponds in
wined3d. Using register combiners we can get around that limitation
and get up to GL_MAX_GENERAL_COMBINERS_NV (typically 8) texture
stages. This patch adds code for doing the texture operations with
register combiners instead of ARB_texture_env_combine or
NV_texture_env_combine4, but doesn't make use of that code yet. That's
what the next patch will do.
GL_LIMITS(textures) is currently used for both the number of texture
stages and the maximum number of simultaneous textures. In the current
code that's the same, but in a later patch that will be separated,
since a texture stage doesn't have to reference an actual
texture. Also, shaders can access a larger number of samplers than the
number of texture units the fixed function pipeline can access.
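A minimal sketch of how the three limits could be kept apart once they are separated. The struct and function names are hypothetical, and the GL values are passed in as plain integers (in real code they would presumably come from glGetIntegerv queries) so that the example does not need a GL context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical struct; field names are illustrative only. */
struct texture_limits {
    unsigned int textures;        /* simultaneous fixed function textures */
    unsigned int texture_stages;  /* texture blend stages */
    unsigned int samplers;        /* samplers available to shaders */
};

void fill_texture_limits(struct texture_limits *limits,
                         unsigned int max_texture_units,
                         bool have_register_combiners,
                         unsigned int max_general_combiners,
                         unsigned int max_shader_samplers)
{
    assert(limits != NULL);

    /* GL_MAX_TEXTURE_UNITS bounds the number of simultaneous textures. */
    limits->textures = max_texture_units;

    /* With register combiners a stage need not reference a texture,
     * so the stage count can exceed the texture unit count. */
    limits->texture_stages = max_texture_units;
    if (have_register_combiners && max_general_combiners > max_texture_units)
        limits->texture_stages = max_general_combiners;

    /* Shaders may be able to access more samplers than fixed function. */
    limits->samplers = max_texture_units;
    if (max_shader_samplers > max_texture_units)
        limits->samplers = max_shader_samplers;
}
```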
- Implement D3DSIO_DP2ADD, D3DSIO_TEXKILL, D3DSIO_TEXM3X3PAD
- Partially implement D3DSIO_TEXBEM, D3DSIO_TEXM3X3VSPEC (as much as
they are implemented in ARB_fragment_program at least).
- Stop copying the SHADER_PARSE_STATE struct in each ARB shader
routine - use a pointer instead.
- Separate the declaration phase of the shader string generator into
the arb and glsl specific files.
- Add declarations and recognition for application-sent constant
integers and booleans (locally defined ones will follow).
- Standardize capitalization of pixel/vertex specific variable names.
- Move GLSL constant loading code into glsl_shader.c and out of the
over-populated drawprim.c.
- Create a new file named arb_program_shader.c which will hold code
specific to ARB_vertex_program & ARB_fragment_program.
- Remove the constant loading calls from drawprim.c
- Implemented: D3DSIO_SGN, LOOP, ENDLOOP, LOGP, LIT, DST, SINCOS
- Process instruction-based modifiers (function existed, it just
wasn't being called)
- Add loop checking to register maps.
- Renamed "sng" to "sgn" for D3DSIO_SGN - it's not handled anywhere
except for GLSL, so won't matter.
There are a total of 17 instructions without a destination token. Of
those 9 have num_params != 0, which means that we will not process any
of them correctly, because we assume the first token (if present) is a
destination token.
Those are basically all the flow control instructions, which we plan to
support very soon. They have source tokens, and no destination. Add a
flag to the ins table that marks them. Use this flag in the trace
pass and the generation pass.
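The flag described above might look roughly like this. The table layout, the has_dst field name and the handful of entries are illustrative assumptions, not the actual ins table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down instruction table entry. */
struct opcode_info {
    const char *name;
    unsigned int num_params;  /* total parameter tokens */
    bool has_dst;             /* first parameter is a destination token */
};

/* A few illustrative entries; flow control has sources but no destination. */
const struct opcode_info opcode_table[] = {
    { "mov",     2, true  },
    { "add",     3, true  },
    { "if",      1, false },
    { "rep",     1, false },
    { "loop",    2, false },
    { "endloop", 0, false },
};

/* Number of source tokens, taking the destination flag into account
 * instead of assuming the first token is always a destination. */
unsigned int count_src_params(const struct opcode_info *op)
{
    assert(op != NULL);
    if (op->num_params == 0)
        return 0;
    return op->has_dst ? op->num_params - 1 : op->num_params;
}
```

Both the trace pass and the generation pass can then ask the table whether to read a destination token before reading sources.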
- track sampler declarations and store the sampler usage in reg_maps structure
- store a fake sampler usage for 1.X shaders (defined as 2D sampler)
- re-sync glsl TEX implementation with the ARB one (no idea why they diverged..)
- use sampler type in new TEX implementation to support 2D, 3D, and Cube sampling
- change drawprim to bind pixel shader samplers
Additional improvements:
- rename texture limit to texcoord to prevent confusion
- add sampler limit, and use that for samplers - *not* the same as texcoord above
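One way to picture the sampler handling above: the recorded sampler type picks the GLSL sampling function, and 1.X shaders always fall back to the fake 2D usage. The enum and function names are hypothetical; texture2D, texture3D and textureCube are the GLSL built-ins.

```c
#include <assert.h>

/* Hypothetical sampler usage values stored in the register maps. */
enum sampler_type { SAMPLER_2D, SAMPLER_3D, SAMPLER_CUBE };

/* 1.X shaders have no sampler declarations, so treat them as 2D. */
enum sampler_type effective_sampler_type(unsigned int shader_major,
                                         enum sampler_type declared)
{
    return shader_major < 2 ? SAMPLER_2D : declared;
}

/* GLSL built-in used to sample a texture of the given type. */
const char *glsl_sample_function(enum sampler_type type)
{
    switch (type) {
    case SAMPLER_2D:   return "texture2D";
    case SAMPLER_3D:   return "texture3D";
    case SAMPLER_CUBE: return "textureCube";
    }
    /* Not reached for valid enum values; fall back to 2D if asserts are off. */
    assert(!"unknown sampler type");
    return "texture2D";
}
```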
SM 3.0 can pack multiple "semantics" into 12 generic input/output registers.
To support that, define temporaries called IN and OUT, and use those as
the output registers. At the end of the vshader, unpack the OUT temps
into the proper GL variables. At the beginning of the pshader, pack the
GL variables back into 12 IN registers.
Various cleanups:
- do not use DWORD as a bitmask, that places artificial limit of 32 on
registers
- track attributes that are used and declare only those
- move declarations function call in pshader/vshader to allow us to
insert pixel or vertex specific code between the declarations and
the rest of the code
- remove redundant 0 initializers
- remove useless continue statement
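To illustrate the first two cleanup items, here is a small sketch that tracks used registers in a per-register array rather than a 32-bit mask. The type name, function names and the limit of 64 are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative limit; an array has no built-in ceiling of 32 entries. */
#define MAX_TRACKED_REGS 64

struct reg_usage {
    unsigned char used[MAX_TRACKED_REGS];
};

void reg_mark_used(struct reg_usage *u, unsigned int idx)
{
    assert(u != NULL && idx < MAX_TRACKED_REGS);
    u->used[idx] = 1;
}

bool reg_is_used(const struct reg_usage *u, unsigned int idx)
{
    assert(u != NULL && idx < MAX_TRACKED_REGS);
    return u->used[idx] != 0;
}

/* Count of registers that actually need a declaration. */
unsigned int reg_count_used(const struct reg_usage *u)
{
    unsigned int i, count = 0;
    assert(u != NULL);
    for (i = 0; i < MAX_TRACKED_REGS; ++i)
        if (u->used[i])
            ++count;
    return count;
}
```

The declaration step can then walk the array and emit a declaration only for entries that are set.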
Now that the declaration function is out of the way, the tracing pass,
which is very long and 100% the same, can be shared between pixel and
vertex shaders.
The new function is called shader_trace_init(), and is responsible for:
- tracing the shader
- initializing the function length
- setting the shader version [needed very early]
The new function is called in pass 2 (register counting/maps), and
it's now in baseshader. It operates on all INPUT and OUTPUT registers,
which, in addition to the old vertex shader input declarations covers
Shader Model 3.0 vshader output and pshader input declarations. The
result is stored into the reg_map structure.
Delete the entire namedArrays code path and all its dependencies (one
of which is quite long - storeOrder in drawprim is always FALSE, for
example). Delete declaredArrays, and make its code path the default.
- Add a new file glsl_shader.c which contains almost every GLSL specific function we'll need
- Move print_glsl_info() into glsl_shader.c
- Move the shader_reg_maps struct info into the private header, and make it part of SHADER_OPCODE_ARG.
- Create a new shared ps/vs register map for float constants (future patch will make ARB programs use this, too)
loading float constants for GLSL.
- DrawPrim is just too big of a function. This separates the passing
of constants to the shader into new functions.
- Fixes an off-by-one error when loading vertex declaration constants
(should be <, not <=)
- Adds a function for GLSL loading of constants (aka Uniforms)
- Adds a GLSL program variable to the stateblock and sets it to 0 (a
future patch will actually create this program)
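The off-by-one item above comes down to the loop bound. A minimal sketch, assuming constants are stored as flat arrays of four floats each (the function name and layout are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Copy `count` four-component float constants from src to dst.
 * The bound is i < count; using i <= count would touch one
 * constant past the end of both arrays. */
void load_float_constants(float *dst, const float *src, size_t count)
{
    size_t i, j;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < count; ++i)
        for (j = 0; j < 4; ++j)
            dst[4 * i + j] = src[4 * i + j];
}
```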
It is wrong to maintain a mapping from a constant index to a type
field, because different constant types do not share an index -
boolean constant 0 is supposed to co-exist with floating point
constant 0, not replace it. Drawprim and other code using the type
array to decide whether to look up a constant in bools, floats, or
ints is wrong - you can't make that decision based on the index.
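A sketch of the alternative: one array per constant type, each with its own index space, and no type field keyed by index. The struct, the setters and the array sizes are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sizes only. */
#define MAX_CONST_F 256
#define MAX_CONST_I 16
#define MAX_CONST_B 16

/* Each type has its own index space: float 0, int 0 and bool 0 coexist. */
struct shader_constants {
    float f[MAX_CONST_F][4];
    int   i[MAX_CONST_I][4];
    bool  b[MAX_CONST_B];
};

void set_constant_f(struct shader_constants *c, unsigned int idx,
                    const float v[4])
{
    unsigned int k;
    assert(c != NULL && v != NULL && idx < MAX_CONST_F);
    for (k = 0; k < 4; ++k)
        c->f[idx][k] = v[k];
}

void set_constant_b(struct shader_constants *c, unsigned int idx, bool v)
{
    assert(c != NULL && idx < MAX_CONST_B);
    c->b[idx] = v;
}
```

The caller picks the array from the register type in the shader token, never from the index.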
Implement some basic opengl accelerated blts from and to render
targets. It's not perfect yet, but enough to make some D3D apps
happy. For now the only supported operations are:
- Full screen back -> Front buffer: Just call present
- Offscreen surface -> render target
- Render target -> offscreen surface(slow)
- render target colorfill
Each instruction can have a predication token. Account for it in the
trace pass, register count pass, and store it in the SHADER_OPCODE_ARG
structure for generation. MSDN claims the token is at the end of the
instruction, but that's not true - testing with a demo that lets me
manipulate the shader shows the predication token is the first source
token immediately following the destination token.
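The token order observed above could be parsed along these lines. The struct and function are hypothetical, and the flag value is an assumption meant to mirror D3DSHADER_INSTRUCTION_PREDICATED; it should be checked against the headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror D3DSHADER_INSTRUCTION_PREDICATED. */
#define INSTRUCTION_PREDICATED (1u << 28)

/* Hypothetical result of splitting one instruction with a destination. */
struct parsed_ins {
    uint32_t dst;
    uint32_t predicate;      /* 0 when has_predicate is false */
    bool has_predicate;
    const uint32_t *src;     /* first real source token */
};

/* Order: opcode, destination, [predicate], sources.
 * Returns a pointer to the first source token. */
const uint32_t *parse_ins(const uint32_t *tok, struct parsed_ins *out)
{
    uint32_t opcode_token;

    assert(tok != NULL && out != NULL);
    opcode_token = *tok++;
    out->dst = *tok++;
    out->has_predicate = (opcode_token & INSTRUCTION_PREDICATED) != 0;
    out->predicate = 0;
    if (out->has_predicate)
        out->predicate = *tok++;
    out->src = tok;
    return tok;
}
```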
Currently we hardcode a0.x, which I think is correct for shaders 1.0.
However, for shaders 2.0, we must look into the address token, and
print the register there. Handle both cases to correct the trace.
Change the trace pass, the register counting pass, and the hw
generator pass to take into account the new get_params() function. For
hw generation, store the address tokens into the SHADER_OPCODE_ARG
structure, so they're available to generator functions.
Add a new function to process parameters.
On shaders 1.0, processing parameters amounts to *pToken++.
On shaders 2.0+, we have a relative addressing token to account for.
This function should be used, instead of relying on num_params everywhere.
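A possible shape for such a function, returning how many tokens the parameter consumed. The name get_param, its signature and the mask values are assumptions; the masks are meant to mirror D3DSHADER_ADDRESSMODE_MASK and D3DSHADER_ADDRMODE_RELATIVE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the D3D address mode mask and relative mode value. */
#define ADDRESSMODE_MASK  (1u << 13)
#define ADDRMODE_RELATIVE (1u << 13)

/* Read one parameter starting at tok. On shaders 1.x this is a single
 * token; on 2.0+ a relative addressing token may follow.
 * Returns the number of tokens consumed (1 or 2). */
int get_param(const uint32_t *tok, unsigned int shader_major,
              uint32_t *param, uint32_t *addr_token)
{
    int count = 1;

    assert(tok != NULL && param != NULL && addr_token != NULL);
    *param = tok[0];
    *addr_token = 0;

    if (shader_major >= 2 &&
        (tok[0] & ADDRESSMODE_MASK) == ADDRMODE_RELATIVE) {
        *addr_token = tok[1];
        count++;
    }
    return count;
}
```

Callers advance their token pointer by the return value rather than by a fixed num_params.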
Share shader_dump_ins_modifiers(), and make vertex shaders use it.
The saturate modifier (_sat) is valid on vs_3_0+, and it isn't being
shown in the trace.
Start to add support for DirectX 8 vertex shaders. Constants and
registers are now correctly assigned and loaded, allowing support for
most basic d3d8 shaders.
- use D3DCOLOR macros instead of using shift + masks
- fix a bug where diffuse.lpData was checked instead of specular.lpData
- implement color fixup on ARB VShader compilation code:
-> on input parameters using swizzle
-> add is_color parameter on vshader_program_add_param
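For the first item in the list above, component macros of roughly this form replace open-coded shifts and masks at each call site. The macro names are hypothetical stand-ins; the real D3DCOLOR macros in the tree may differ. A D3DCOLOR is packed as ARGB, with alpha in the top byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical component macros for a packed ARGB colour. */
#define COLOR_A(c) ((float)(((c) >> 24) & 0xFF) / 255.0f)
#define COLOR_R(c) ((float)(((c) >> 16) & 0xFF) / 255.0f)
#define COLOR_G(c) ((float)(((c) >>  8) & 0xFF) / 255.0f)
#define COLOR_B(c) ((float)( (c)        & 0xFF) / 255.0f)

/* Expand a packed colour into the RGBA float order that GL expects. */
void color_to_rgba(uint32_t color, float rgba[4])
{
    assert(rgba != NULL);
    rgba[0] = COLOR_R(color);
    rgba[1] = COLOR_G(color);
    rgba[2] = COLOR_B(color);
    rgba[3] = COLOR_A(color);
}
```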
a texture is locked. This is necessary for buggy games like Warhammer
40k that don't work with the odd span sizes produced by the default
non-power-of-2 support.
vertex shaders 3.0.
The Parser will now display the input shader in DirectX style, and the
cross compiler now generates valid ARB_VERTEX_PROGRAM programs and
outputs the result in ARB_VERTEX_PROGRAM style.
Support for a number of extended attributes has been added, but this
may not be complete, and dereferencing from loop counters isn't
properly parsed yet.
relate to the opengl texture object will only be updated when they are
out of sync, this reduces the number of texture object state changes
during game play in Axis and Allies from several hundreds to 0 or 1.
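The idea of updating only out-of-sync states can be sketched as a cache of the values last applied to each texture object. Everything here is hypothetical: the struct, the number of states, and the callback that stands in for the actual GL call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative number of states tracked per texture object. */
#define NUM_TEX_STATES 4

/* Values last applied to the GL texture object. */
struct texture_object {
    uint32_t applied[NUM_TEX_STATES];
};

/* Stand-in for the GL call that changes one state; may be NULL. */
typedef void (*apply_state_fn)(unsigned int state, uint32_t value);

/* Apply only the states that differ from what the texture object
 * already holds. Returns the number of state changes issued. */
unsigned int sync_texture_states(struct texture_object *tex,
                                 const uint32_t wanted[NUM_TEX_STATES],
                                 apply_state_fn apply)
{
    unsigned int i, changes = 0;

    assert(tex != NULL && wanted != NULL);
    for (i = 0; i < NUM_TEX_STATES; ++i) {
        if (tex->applied[i] == wanted[i])
            continue;
        if (apply)
            apply(i, wanted[i]);
        tex->applied[i] = wanted[i];
        ++changes;
    }
    return changes;
}
```

Binding the same texture with unchanged sampler states then costs no state changes at all.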
any objects referenced by the internal stateblock are released when
the stateblock is released (we don't reference count while a
stateblock is recording, so recorded stateblocks have no references to
clean up).
glDesciption member.
Removed Level and Target from LoadTexture, and reduced the dependency
on surface->device.
Fixed a couple of compiler warnings in d3d9.
the interface but it is a more correct way (Microsoft even have a
resource type of volume).
- Moved usage, format, allocatedMemory and size onto the resource
class structure.
- Refactored Preload for classes that inherit BaseTexture; preload now
binds the texture instead of bind texture calling preload, and
bindTexture allocates a glTexture if there isn't one.
- Added two new class static members BaseTexture_CleanUp and
Resource_CleanUp that should be called by classes that implement
BaseTexture or Resource.
Added BindTexture, GetTextureDimensions, UnBindTexture.
Proper GetContainer support for surface.
SetContainer added to surface and volume.
SetInPbufferState added to surface (until gl context management is
implemented).
Minor changes:
- BaseTexture no longer 'holds' a reference to IWineD3DDevice to
prevent circular referencing.
- Better management of referencing for textures.
- Some TODO's for implementing a context manager.
- Better preload implementation.
- Fix compile warning in device.c Set/GetSamplerState.
- Add QueryInterface support for surface.
- Format X8R8G8B8 added to locking.
- Only prototype the interfaces which are subclassed (I overdid it
last time!).
- Implement Get/Set Texture and GetBackBuffer, plus device's
GetDisplayMode / GetDeviceCaps.
- Make some of the d3d9 skeleton code issue fixme's to highlight code
which hasn't been migrated yet.
- Correct the d3d9 headers for D3DSURFACE_DESC which caused stack
corruption in demos.
Resource classes in wined3d and use when called from d3d9.
- Reduce the header includes in all the d3d9 interface to one common
set in the private header.
- Move some of the screen mode related functions into wined3d and add
untested support for the new d3d9 options of providing the format to
some of the calls.
- Move other functions from the directx interface into the common
library and implement the calls from d3d9 as well.
- Copy across the first of the functions used to make traces more readable,
creating utils.c to store them in. Eventually the ones in d3d8 will be
removed but for now just duplicate the code.