This allows us to drop the load time conversion and the clear
readback hack, and replaces them with a color fixup in the fixed
function pipeline replacement.
Based on a patch by Stefan Dösinger. This is more flexible, and allows
the shader backend implementation to be simpler, since it doesn't have
to know about specific formats. The next patch makes use of this.
Note that minMipLookup and magLookup aren't particularly safe to use:
they're global arrays initialized from IWineD3DImpl_FillGLCaps(). The same
goes for the other global dynamic lookup tables.
Some stateblock parameters have to be compiled into the GL pixel
shader code, such as the lines for pixel format fixups. This leads to
problems when applications switch those settings, requiring a
recompilation of the shader. This patch enables wined3d to have
multiple GL shaders for a D3D shader (pixel shaders only so far) to
handle this more efficiently.
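As a rough illustration of keeping several GL shaders per D3D shader, the
sketch below caches one compiled variant per set of compile-time
arguments. All names here (ps_args_sketch, find_gl_pshader_sketch, the
fixed-size variant array) are hypothetical and are not the wined3d
structures; the "compiler" is a stand-in that only hands out IDs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VARIANTS_SKETCH 8
#define SAMPLERS_SKETCH 4

/* Hypothetical compile-time arguments taken from the stateblock. */
struct ps_args_sketch
{
    unsigned int color_fixup[SAMPLERS_SKETCH];
};

/* One compiled GL shader together with the arguments it was built for. */
struct gl_variant_sketch
{
    struct ps_args_sketch args;
    unsigned int gl_id;
};

struct d3d_pshader_sketch
{
    struct gl_variant_sketch variants[MAX_VARIANTS_SKETCH];
    size_t variant_count;
    unsigned int compile_count; /* how often the stand-in compiler ran */
};

static int args_equal_sketch(const struct ps_args_sketch *a, const struct ps_args_sketch *b)
{
    size_t i;
    for (i = 0; i < SAMPLERS_SKETCH; ++i)
        if (a->color_fixup[i] != b->color_fixup[i]) return 0;
    return 1;
}

/* Stand-in for the backend compiler; returns a fresh non-zero ID. */
static unsigned int compile_variant_sketch(struct d3d_pshader_sketch *shader)
{
    return ++shader->compile_count;
}

/* Return the GL shader matching args, compiling it only on a cache miss. */
static unsigned int find_gl_pshader_sketch(struct d3d_pshader_sketch *shader,
        const struct ps_args_sketch *args)
{
    size_t i;

    assert(shader && args);
    for (i = 0; i < shader->variant_count; ++i)
        if (args_equal_sketch(&shader->variants[i].args, args))
            return shader->variants[i].gl_id;

    /* Cache full: this sketch gives up, real code would grow or evict. */
    if (shader->variant_count == MAX_VARIANTS_SKETCH) return 0;

    shader->variants[shader->variant_count].args = *args;
    shader->variants[shader->variant_count].gl_id = compile_variant_sketch(shader);
    return shader->variants[shader->variant_count++].gl_id;
}
```

Switching back and forth between two fixup settings then costs two
compilations in total instead of one per switch.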
This was suggested by Ivan quite a while ago, and we need it to better
handle conflicting texture format corrections and similar stateblock
value changes, which until now required a recompilation of the entire
shader.
A number of considerations contribute to this:
1) The shader backend knows best which shader(s) it needs. GLSL needs
both, ARB only one.
2) The shader backend may pass some parameters to the compilation
code (e.g. which pixel format fixup to use).
3) The structures used in (2) are different in vs and ps, so a
baseshader::Compile won't work.
4) The structures in (2) are wined3d-private structures, so
having a public method in the vtable won't work (it's a bad idea
anyway).
This creates a function for setting the texture name and one for
setting the texture target. The idea is that the texture target should
get set right after the surface is created, and won't change, while
generating a texture name can wait.
This gives a small performance improvement for applications that are
smart enough to set the D3DPRESENTFLAG_DISCARD_DEPTHSTENCIL flag, or
to create depth stencils with Discard set to TRUE.
This is the preferred format of many codecs, and for some codecs this
is the only supported output format. As usual I try to handle all the
conversion on the GPU and keep the CPU involvement minimal to gain the
full performance of PBO transfers.
GL_ARB_fragment_program and GL_ATI_fragment_shader can disable
projected textures properly, and they can also handle
D3DTTFF_PROJECTED | D3DTTFF_COUNT3 properly.
Since some of those function pointers are direct GL functions, the function
prototype needs the WINE_GLAPI calling convention. This prevents
drawStridedSlow from crashing with USE_WIN32_OPENGL.
Although sharing FBOs across contexts is allowed by EXT_framebuffer_object
(issue 76), it causes issues with nVidia drivers. Considering the GL 3 spec
explicitly disallows sharing of FBOs across contexts (Appendix D), this
patch is probably the right thing to do.
If a format is not supported natively by OpenGL, a shader may be able
to convert it. Up to now, CheckDeviceFormat had magic knowledge of which
GL extensions lead to which supported format. This patch adds
functions that allow CheckDeviceFormat to ask the actual
implementation for its capabilities.
This is a long-needed cleanup aimed at removing the ddraw_primary,
ddraw_window, ddraw_width and ddraw_height members from
IWineD3DDeviceImpl, which just do not belong there. Destination
window and screen handling is supposed to be done by swapchains.
ATI cards prior to the radeon HD series did not have unconditional non
power of two support. So far we've used texture_rectangle for that, or
created a bigger power of two texture with padding. This had the
disadvantage that we had to correct the coordinates, which causes
extreme problems with shaders (it pretty much doesn't work).
Both the MacOS and the fglrx driver have support for
GL_ARB_texture_non_power_of_two, and run it on the hardware as long as
we stay within the texture_rectangle limitations. This allows us to
have conditional non power of two textures with normalized
coordinates. This patch adds an internal extension, and the code
creates a regular GL_TEXTURE_2D texture with NP2 size, but refuses
mipmapping, filtering and texture_rectangle incompatible
operations. This makes np2 textures work with shaders on fglrx and
macos.
This patch adds a new field to the state templates. If this extension
field is != 0, then the line is only applied to the final state table
if the extension is supported. Once a line is applied to the final
table, all further templates for this state from the same pipeline
part are ignored. This allows removing some extension checks from the
state handlers, which cleans them up and saves a few CPU cycles when
applying the states.
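The merge rule described above can be sketched roughly as follows. This
is a minimal illustration for a single pipeline part, assuming a
hypothetical template layout (state_template_sketch), a per-extension
support array, and dummy handlers; none of these names are taken from
the patch.

```c
#include <assert.h>
#include <stddef.h>

#define STATE_COUNT_SKETCH 4
#define EXT_NONE_SKETCH 0   /* extension field == 0: no extension required */

typedef void (*state_handler_sketch)(void);

/* Hypothetical template line: which state, which handler, which extension. */
struct state_template_sketch
{
    unsigned int state;
    state_handler_sketch apply;
    unsigned int extension;
};

/* Dummy handlers so the two variants of a state can be told apart. */
static void handler_with_ext_sketch(void) {}
static void handler_fallback_sketch(void) {}

/* Build the final table: the first applicable line for a state wins. */
static void build_state_table_sketch(state_handler_sketch table[STATE_COUNT_SKETCH],
        const struct state_template_sketch *tmpl, size_t count,
        const unsigned char *ext_supported)
{
    size_t i;

    for (i = 0; i < STATE_COUNT_SKETCH; ++i) table[i] = NULL;

    for (i = 0; i < count; ++i)
    {
        assert(tmpl[i].state < STATE_COUNT_SKETCH);
        /* A line was already applied for this state: ignore later ones. */
        if (table[tmpl[i].state]) continue;
        /* Skip lines whose required extension is not available. */
        if (tmpl[i].extension != EXT_NONE_SKETCH && !ext_supported[tmpl[i].extension])
            continue;
        table[tmpl[i].state] = tmpl[i].apply;
    }
}
```

Listing the extension-specific line before the generic one gives the
fallback behaviour: the generic handler is only picked when the
extension is missing, and the handlers themselves need no extension
checks.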
This code creates the structures and the pipeline selection, as well
as the caps filling. It does not yet move the actual code around,
since this will be a bigger task.
Since atifs is only doing the fragment pipeline replacement right now
there is no need for the shader backend structure any longer. The ffp
private data is stored in new fragment pipeline private data (which
could potentially be set to equal the shader private data if needed).
It isn't related to the shader backend any longer. The nvts_enable in
the ffp code isn't quite right either; it should be moved away once
there is a dedicated nvts fragment pipeline replacement.
Calling shader_select() from inside depth_blt() isn't necessarily
safe. shader_select() assumes CompileShader() has been called for the
current shaders, but that depends on STATE_VSHADER / STATE_PIXELSHADER
being applied. That isn't always true when depth_blt() gets called,
with the result that sometimes GLSL programs could be created with no
shader objects attached.
For now the atifs selection sticks to the old rules, thus it is bound to
the available and selected shader capabilities. We may want to change that
in the future.
The idea of this patchset is to split the monolithic state set into 3
parts: vertex processing, fragment processing and other states (depth,
stencil, scissor, ...). The states will be provided in templates which
can be (mostly) independently combined, and are merged into a single
state table at device creation time. This way we retain the advantages
of the single state table while gaining separated pipeline
implementations which can be combined without any manually written
glue code.
This gets rid of depth_copy_state in the device, and instead tracks
the most up to date location per-surface. This makes things a lot
easier to follow, and allows us to make a copy when switching depth
stencils in SetDepthStencilSurface().
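A minimal sketch of per-surface location tracking is given below. The
location flags, the structure and the function names are assumptions
made for illustration, and the depth copy itself is replaced by a
counter.

```c
#include <assert.h>

/* Hypothetical location flags; the names are illustrative only. */
enum ds_location_sketch
{
    DS_LOC_ONSCREEN_SKETCH  = 1,
    DS_LOC_OFFSCREEN_SKETCH = 2
};

struct ds_surface_sketch
{
    enum ds_location_sketch current; /* where the up-to-date depth data lives */
    unsigned int copy_count;         /* number of depth copies performed */
};

/* Record that rendering just wrote depth data at the given location. */
static void ds_modify_location_sketch(struct ds_surface_sketch *s,
        enum ds_location_sketch location)
{
    assert(s);
    s->current = location;
}

/* Make sure the given location is current, copying only when it is stale. */
static void ds_load_location_sketch(struct ds_surface_sketch *s,
        enum ds_location_sketch location)
{
    assert(s);
    if (s->current == location) return;
    ++s->copy_count; /* stand-in for the actual depth copy */
    s->current = location;
}
```

Because the state lives in the surface rather than in the device, a
depth stencil that is being switched away can be brought up to date at
that moment, without consulting any device-wide flag.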
This makes the depth copy independent of the currently attached render
targets. This is important for the next patch because it might do a
depth copy when the render targets aren't in a valid configuration
(SetDepthStencilSurface()).
SetupForBlit sets up the GL viewport and projection matrix for
screen-coordinate access to the framebuffer. These settings were not
updated if the other GL states were already set up for blitting. Guild
Wars reads back an offscreen rendered texture from the framebuffer,
which currently sets up CTXUSAGE_BLIT, then changes the render target,
and draws to the texture, which has to be reloaded from system memory
before it can be rendered to (since GW loaded some data into it). If the
two render targets had different sizes this failed.
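One way to avoid this kind of stale viewport is to remember the
dimensions the blit state was built for and redo the setup when they
differ. The sketch below shows only that idea; the structure and
function names are hypothetical, the GL calls are replaced by a counter,
and the actual patch may solve it differently.

```c
#include <assert.h>

struct blit_ctx_sketch
{
    int blit_state_valid;        /* non-zero once blit GL state is set up */
    unsigned int blit_w, blit_h; /* size the viewport/projection were built for */
    unsigned int setup_count;    /* how often viewport/projection were set */
};

static void setup_for_blit_sketch(struct blit_ctx_sketch *ctx,
        unsigned int width, unsigned int height)
{
    assert(ctx);
    /* Reuse the existing state only if it matches the target size. */
    if (ctx->blit_state_valid && ctx->blit_w == width && ctx->blit_h == height)
        return;

    /* Stand-in for setting the viewport and the projection matrix. */
    ++ctx->setup_count;
    ctx->blit_w = width;
    ctx->blit_h = height;
    ctx->blit_state_valid = 1;
}
```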
SM3.0 requires ten 4-component float varyings for passing stuff between
vertex and pixel shaders. GF7 and earlier report 8 generic varyings +
gl_Color and gl_SecondaryColor in GLSL. This patch allows us to use
gl_Color and gl_SecondaryColor to get 2 extra varyings, which some
games, like C&C3 with highest gfx settings, require.
The previous logic assumed that if NVTS or ATIFS are available they
will be used. This happens to be true for NVTS, but ATIFS is only used
if neither ARBFP nor GLSL are supported. This breaks fixed function
fragment processing on ATI r300 and newer cards.
This makes it easier to make this a per texture / per adapter property.
At some point we should rename the remaining lookup type in the general
lookup table to wraplookup.
OpenGL always offers filtering on all formats, and if the hardware
doesn't support it the driver falls back to software. Direct3D on the
other hand silently disables filtering, so that's what we should do too.
This adds code for handling fixed function fragment processing with the
GL_ATI_fragment_shader extension. This is a sort-of programmable
interface for fragment processing at the level of shader model 1.4 in
d3d. This code is of use on r200, r250 and r280 cards (radeon 8500 to
9200) which do not support GL_ARB_fragment_program, but support pixel
shader 1.4 on Windows. This code is somewhat a counterpart to the
existing fragment processing code using GL_NV_register_combiners and
GL_NV_texture_shader.
The control structures in directx.c get terribly confusing with
the various codepaths for texturing and different shader
implementations. It is also hard to reflect the shader model
decisions this way. This patch moves the shader specific parts of
the caps code into the shader backend where we can set our caps
depending on the shader model decisions and without complex caps flag
checks.
Generating the shader ID and parts of the shader prolog and epilog was
done by the common vertexshader.c / pixelshader.c, which is ugly.
This patch doesn't get rid of all the ugliness; at some point we'll still
have to sort out the relationship of [arb|glsl]_generate_shader and
[arb|glsl]_generate_declarations.
Add a new property of the shader backend which indicates whether the
shader backend is able to dirtify single constants rather than
dirtifying vshader and pshader constants as a whole. Depending on this
a different Set*ConstantF implementation is used which marks constants
dirty. The ARB shader backend uses this and marks constants clean
after uploading.
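The per-constant variant can be pictured roughly like this. The sketch
assumes a hypothetical state structure with one dirty flag per float4
constant; the upload is a stand-in that only counts constants, and the
names are not those used in wined3d.

```c
#include <assert.h>
#include <string.h>

#define MAX_CONSTS_SKETCH 16

struct const_state_sketch
{
    float values[MAX_CONSTS_SKETCH][4];
    unsigned char dirty[MAX_CONSTS_SKETCH]; /* one flag per float4 constant */
};

/* Store count float4 constants from start and mark only those dirty. */
static void set_constants_f_sketch(struct const_state_sketch *st,
        unsigned int start, const float *data, unsigned int count)
{
    unsigned int i;

    assert(st && data);
    assert(start < MAX_CONSTS_SKETCH && count <= MAX_CONSTS_SKETCH - start);

    memcpy(st->values[start], data, count * 4 * sizeof(float));
    for (i = start; i < start + count; ++i) st->dirty[i] = 1;
}

/* Upload dirty constants and mark them clean; returns how many were sent. */
static unsigned int upload_dirty_constants_sketch(struct const_state_sketch *st)
{
    unsigned int i, uploaded = 0;

    assert(st);
    for (i = 0; i < MAX_CONSTS_SKETCH; ++i)
    {
        if (!st->dirty[i]) continue;
        /* Stand-in for the per-constant GL upload call. */
        ++uploaded;
        st->dirty[i] = 0;
    }
    return uploaded;
}
```

A backend that cannot track single constants would instead mark the
whole vshader or pshader constant set dirty and skip the flag array.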
The GL_ARB_vertex_program extension does not define a standard value for
output texture coordinates. This causes problems when using vertex
shaders with fixed function fragment processing because fffp divides the
texture coords by their .w component. This means that GL shaders have to
write to the .w component of texture coords. Direct3D shaders however
do not.
I'm resending this patch because my reply to Henri's concern came too late.
Henri noted that I am enabling lights that do not exist. Existing tests show
that if no light is assigned to the index, LightEnable creates a light with a
set of default parameters, so the tests should be fine.
From 9ee4c61805b50886f79e87d744b52f27b7b00b4e Mon Sep 17 00:00:00 2001
From: Stefan Doesinger <stefan@codeweavers.com>
Date: Thu, 29 Nov 2007 13:22:47 +0100
Subject: [PATCH] WineD3D: Enabling too many lights is silently ignored
This patch adds tests for all d3d versions that show that Windows
pretends that enabling more lights than supported succeeds. D3D_OK is
returned, and the light is reported as enabled.
What is not tested in this patch is the rendering output of this
situation, thus the FIXME is still written.
If an attribute has type D3DDECLTYPE_D3DCOLOR, the red and blue channels
are swizzled in the shader. Since the attribute is stored in the vertex
declaration and not the vertex shader, it can change by setting a new
vertex declaration. If this happens, we have to recompile the shader
with the swizzling of that specific attribute turned on or off.
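The reason for the swizzle can be shown on the CPU. A D3DCOLOR is a
32-bit 0xAARRGGBB value, so read as four unsigned bytes in little-endian
memory order it yields B, G, R, A. The sketch below mimics what the
shader sees with the swizzle turned off and on; the function name and
the swizzle flag are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a 0xAARRGGBB color into a float4 as a 4-ubyte attribute fetch
 * would deliver it, optionally swapping red and blue as the shader does. */
static void d3dcolor_to_attrib_sketch(uint32_t color, int swizzle, float out[4])
{
    float raw[4];

    assert(out);
    /* Bytes in little-endian memory order: B, G, R, A. */
    raw[0] = (float)(color & 0xffu) / 255.0f;
    raw[1] = (float)((color >> 8) & 0xffu) / 255.0f;
    raw[2] = (float)((color >> 16) & 0xffu) / 255.0f;
    raw[3] = (float)((color >> 24) & 0xffu) / 255.0f;

    /* With the swizzle on, components 0 and 2 are exchanged (.zyxw). */
    out[0] = swizzle ? raw[2] : raw[0];
    out[1] = raw[1];
    out[2] = swizzle ? raw[0] : raw[2];
    out[3] = raw[3];
}
```

Since the same vertex shader may be used with a declaration that
supplies the attribute as plain floats, the swizzle cannot be applied
unconditionally, which is why the shader has to be rebuilt when the
declaration changes.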
Previously this was done in findContext, before selecting the new context,
which is bad (it doesn't always work). The new code works and this
change also fixes some draw buffer regressions that happened during
the surface rewrite from the last couple of days.