Most callers work on a stateblock rather than a device, and the main fields
we check (vertexShader and pixelShader) are part of the stateblock as well.
Based on a patch by Stefan Dösinger. This is more flexible, and allows
the shader backend implementation to be simpler, since it doesn't have
to know about specific formats. The next patch makes use of this.
Note that minMipLookup and magLookup aren't particularly safe to use:
they're global arrays initialized from IWineD3DImpl_FillGLCaps(). The same
goes for the other global dynamic lookup tables.
A number of considerations contribute to this:
1) The shader backend knows best which shader(s) it needs. GLSL needs
both, ARB only one.
2) The shader backend may pass some parameters to the compilation
code (e.g. which pixel format fixup to use).
3) The structures used in (2) are different in vs and ps, so a
baseshader::Compile won't work.
4) The structures in (2) are wined3d-private structures, so
having a public method in the vtable won't work (it's a bad idea
anyway).
This prevents the target from changing during the first PreLoad() call
on a surface, which would be inconvenient when attaching a surface to
a FBO for example.
This gives a small performance improvement for applications that are
smart enough to set the D3DPRESENTFLAG_DISCARD_DEPTHSTENCIL flag, or
to create depth stencils with Discard set to TRUE.
Since those surfaces are stored in blocks, the 4 pixel step doesn't
only apply to surfaces < 4, but also to surfaces bigger than that, with
a non-multiple-of-4 size.
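As a rough sketch of that rounding, assuming 4x4 pixel blocks (the
helper names and the bytes-per-block parameter are illustrative only,
not wined3d identifiers):

```c
/* Round a dimension up to the next multiple of 4 (the block edge). */
static unsigned int align_to_block(unsigned int size)
{
    return (size + 3u) & ~3u;
}

/* Storage for a block-compressed surface: the number of 4x4 blocks
 * times the bytes per block. block_bytes is supplied by the caller and
 * is an assumption of this sketch. */
static unsigned int compressed_surface_size(unsigned int width,
        unsigned int height, unsigned int block_bytes)
{
    return (align_to_block(width) / 4u)
            * (align_to_block(height) / 4u) * block_bytes;
}
```

With this, a 6x6 surface takes the same storage as an 8x8 one, not
just the surfaces smaller than 4 pixels.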
Clear all attachments before deleting FBOs. It should be valid to
delete FBOs that still have attachments, but for some reason the
nvidia drivers don't like it. The resulting memory corruption can be
pretty nasty, and this workaround seems clean enough.
This is the preferred format of many codecs, and for some codecs this
is the only supported output format. As usual I try to handle all the
conversion in the GPU and keep the CPU involvement minimal to gain the
full performance of PBO transfers.
Although sharing FBOs across contexts is allowed by EXT_framebuffer_object
(issue 76), it causes issues with nVidia drivers. Considering the GL 3 spec
explicitly disallows sharing of FBOs across contexts (Appendix D), this
patch is probably the right thing to do.
This is a long-needed cleanup aimed at removing the ddraw_primary,
ddraw_window, ddraw_width and ddraw_height members from
IWineD3DDeviceImpl, which just do not belong there. Destination
window and screen handling is supposed to be done by swapchains.
This is an ATI specific format designed for compressed normal maps,
and quite a few games check for its existence. While it is an
ATI-specific "extension" in d3d9, it is a core part of
D3D10 (DXGI_FORMAT_BC5), and is supported on GeForce 8 cards.
This happened to work because most cards have the same number of
pshader and vshader constants, but for some reason this doesn't hold
true on this MacBook Pro here, which led to a crash due to heap
corruption.
This is cleaner than the if statements in the code. Also np2 textures
should in theory support linear filtering, but fglrx doesn't seem to
like it. This needs further investigation. So far we've never used
linear filtering on np2 textures, so there should not be a
regression. Furthermore I think shader support is more important than
filtering, since NP2 textures are mostly used for 1:1 copying to the
screen.
ATI cards prior to the Radeon HD series did not have unconditional non
power of two support. So far we've used texture_rectangle for that, or
created a bigger power of two texture with padding. This had the
disadvantage that we had to correct the coordinates, which causes
extreme problems with shaders (it pretty much doesn't work).
Both the MacOS and the fglrx driver have support for
GL_ARB_texture_non_power_of_two, and run it on the hardware as long as
we stay within the texture_rectangle limitations. This allows us to
have conditional non power of two textures with normalized
coordinates. This patch adds an internal extension, and the code
creates a regular GL_TEXTURE_2D texture with NP2 size, but refuses
mipmapping, filtering and texture_rectangle incompatible
operations. This makes np2 textures work with shaders on fglrx and
MacOS.
Since atifs is only doing the fragment pipeline replacement right now,
there is no need for the shader backend structure any longer. The ffp
private data is stored in new fragment pipeline private data (which
could potentially be set to equal the shader private data if needed).
Destroying the stateblock potentially references the shader backend.
If the stateblock has active shaders when it is released, the shader's
destructor will tell the shader backend to destroy the corresponding
resources. This was exposed by my patch that moved the glsl program
lookup table into the backend's private data.
The idea of this patchset is to split the monolithic state set into 3
parts: vertex processing, fragment processing and other states (depth,
stencil, scissor, ...). The states will be provided in templates which
can be (mostly) independently combined, and are merged into a single
state table at device creation time. This way we retain the advantages
of the single state table while gaining separated pipeline
implementations which can be combined without any manually written
glue code.
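A minimal sketch of the merge step, under the assumption that each
template is a list of (state, handler) pairs. All names here
(state_template, build_state_table, the STATE_* values and the
handlers) are hypothetical and only illustrate the idea:

```c
#include <stddef.h>

/* Hypothetical state numbers; the real state set is much larger. */
enum { STATE_LIGHTING, STATE_TEXTURESTAGE, STATE_SCISSOR, STATE_COUNT };

typedef void (*state_handler)(int *applied);

struct state_template_entry
{
    int state;
    state_handler apply;
};

struct state_template
{
    const struct state_template_entry *entries;
    size_t count;
};

/* Stand-in handlers, one per pipeline part. */
static void vertex_lighting(int *applied) { *applied = 1; }
static void fragment_texturestage(int *applied) { *applied = 2; }
static void misc_scissor(int *applied) { *applied = 3; }

static const struct state_template_entry vertex_template[] =
        {{STATE_LIGHTING, vertex_lighting}};
static const struct state_template_entry fragment_template[] =
        {{STATE_TEXTURESTAGE, fragment_texturestage}};
static const struct state_template_entry misc_template[] =
        {{STATE_SCISSOR, misc_scissor}};

/* Merge the three templates into one flat table indexed by state.
 * This would run once, at device creation time. */
static void build_state_table(state_handler *table,
        const struct state_template *vertex,
        const struct state_template *fragment,
        const struct state_template *misc)
{
    const struct state_template *parts[3];
    size_t p, i;

    parts[0] = vertex;
    parts[1] = fragment;
    parts[2] = misc;
    for (p = 0; p < 3; p++)
        for (i = 0; i < parts[p]->count; i++)
            table[parts[p]->entries[i].state] = parts[p]->entries[i].apply;
}
```

Applying a state is then still a single table lookup, regardless of
which vertex and fragment implementations were combined.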
Constant numbers start at 0, and the loading loop has a for(i; i <
dirtyconsts; i++). This means that the highest dirty constant isn't
loaded correctly. Rather than replacing the < with <=, which would
make it impossible to have no dirty constant, add 1 to the dirty
constant counter.
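The counting convention can be sketched as follows. The structure and
function names are made up for illustration and are not the wined3d
ones; the point is only that the counter holds "highest dirty index
plus one", so 0 still means "nothing dirty":

```c
#define MAX_CONSTS 8

/* Hypothetical stand-in for the constant tracking. */
struct const_state
{
    float values[MAX_CONSTS];   /* values set by the application */
    float uploaded[MAX_CONSTS]; /* what has been loaded so far */
    unsigned int dirty_count;   /* highest dirty index + 1, 0 = none */
};

static void mark_constant_dirty(struct const_state *s, unsigned int idx,
        float value)
{
    s->values[idx] = value;
    if (idx + 1 > s->dirty_count)
        s->dirty_count = idx + 1;
}

/* Loads constants 0 .. dirty_count - 1 and returns how many were
 * loaded. With the "+ 1" above, the highest dirty constant is
 * included even though the loop keeps its "<" comparison. */
static unsigned int load_dirty_constants(struct const_state *s)
{
    unsigned int i, loaded = 0;

    for (i = 0; i < s->dirty_count; i++)
    {
        s->uploaded[i] = s->values[i];
        loaded++;
    }
    s->dirty_count = 0;
    return loaded;
}
```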
This gets rid of depth_copy_state in the device, and instead tracks
the most up to date location per-surface. This makes things a lot
easier to follow, and allows us to make a copy when switching depth
stencils in SetDepthStencilSurface().
This makes the depth copy independent of the currently attached render
targets. This is important for the next patch because it might do a
depth copy when the render targets aren't in a valid configuration
(SetDepthStencilSurface()).
The idea is to make setting depth attachments a bit more consistent
with set_render_target_fbo()/attach_surface_fbo(). I've also got an
upcoming patch in my tree that needs this.
OpenGL always offers filtering on all formats, and if the hardware
doesn't support it the driver falls back to software. Direct3D on the
other hand silently disables filtering, so that's what we should do too.
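A minimal sketch of that behaviour, assuming a per-format capability
flag. The flag, the enum and the function are hypothetical, not
existing wined3d names:

```c
enum tex_filter { FILTER_POINT, FILTER_LINEAR };

#define FMT_FLAG_FILTERING 0x1u /* hypothetical capability bit */

struct format_desc
{
    unsigned int flags;
};

/* Mimic Direct3D: silently drop to point sampling when the format
 * cannot be filtered, instead of passing the request on to GL. */
static enum tex_filter effective_filter(const struct format_desc *format,
        enum tex_filter requested)
{
    if (requested == FILTER_LINEAR && !(format->flags & FMT_FLAG_FILTERING))
        return FILTER_POINT;
    return requested;
}
```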
Strictly speaking this is redundant because the UnLoad before did the
job, but if we mess with the allocated memory we have to tell the
surface about that. Updating INDRAWABLE will automatically mark SYSMEM
outdated.
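The location bookkeeping can be pictured roughly like this. The flag
names and helpers are simplified stand-ins chosen for this sketch, and
the real code may track more than is shown here:

```c
#define LOC_SYSMEM   0x1u
#define LOC_TEXTURE  0x2u
#define LOC_DRAWABLE 0x4u

struct surface_locations
{
    unsigned int valid; /* bitmask of locations holding current data */
};

/* Declare one location as holding the newest data. Every other
 * location is outdated from then on. */
static void set_current_location(struct surface_locations *s,
        unsigned int location)
{
    s->valid = location;
}

static int location_is_current(const struct surface_locations *s,
        unsigned int location)
{
    return (s->valid & location) != 0;
}
```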
Since the shader backend implementations might track OpenGL resources in
their private data, inform them about reset calls. For example, the atifs
backend keeps track of the replacement shaders, which are lost during an
OpenGL context recreation.
Add a new property of the shader backend which indicates whether the
shader backend is able to dirtify single constants rather than
dirtifying vshader and pshader constants as a whole. Depending on this
a different Set*ConstantF implementation is used which marks constants
dirty. The ARB shader backend uses this and marks constants clean
after uploading.
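Roughly, the selection could look like the sketch below. The property
name, the structures and the upload function are assumptions made for
illustration; the upload only counts what a real backend would send to
GL:

```c
#include <string.h>

#define MAX_CONSTS 8

/* Hypothetical backend description carrying the new property. */
struct shader_backend
{
    int dirty_single_constants;
};

struct const_store
{
    float values[MAX_CONSTS][4];
    unsigned char dirty[MAX_CONSTS]; /* per-constant dirty flags */
    int all_dirty;                   /* coarse "everything changed" flag */
};

typedef void (*set_constant_f)(struct const_store *store, unsigned int idx,
        const float *value);

/* Variant for backends that can only reload everything. */
static void set_constant_f_dirty_all(struct const_store *store,
        unsigned int idx, const float *value)
{
    memcpy(store->values[idx], value, sizeof(store->values[idx]));
    store->all_dirty = 1;
}

/* Variant for backends that track single constants. */
static void set_constant_f_dirty_single(struct const_store *store,
        unsigned int idx, const float *value)
{
    memcpy(store->values[idx], value, sizeof(store->values[idx]));
    store->dirty[idx] = 1;
}

/* Pick the Set*ConstantF implementation based on the backend property. */
static set_constant_f select_set_constant_f(const struct shader_backend *backend)
{
    return backend->dirty_single_constants
            ? set_constant_f_dirty_single : set_constant_f_dirty_all;
}

/* Counts the constants that need uploading and marks them clean. */
static unsigned int upload_constants(struct const_store *store)
{
    unsigned int i, uploaded = 0;

    for (i = 0; i < MAX_CONSTS; i++)
    {
        if (store->all_dirty || store->dirty[i])
        {
            uploaded++; /* a real backend would issue the GL upload here */
            store->dirty[i] = 0;
        }
    }
    store->all_dirty = 0;
    return uploaded;
}
```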