The idea here is that we should look up format information in struct
GlPixelFormatDesc, while StaticPixelFormatDesc and GlPixelFormatDescTemplate
will only be used to build the table.
The goal is to eventually use a pointer to the format description in most
places where we currently use WINED3DFORMAT. IWineD3DSurfaceImpl for example
has copies of several fields from the format description, but also needs to
look up the format description itself in several places.
The color correction cannot be done behind the back of the individual
instruction handlers because it might conflict with the instruction's
color modifications and the D3D provided writemask.
Add a new wined3d-internal PreLoad function to textures and surfaces
that takes a parameter specifying whether the rgb or srgb texture
should be loaded.
This reduces the number of srgb switching reloads quite a lot. The only
situation in which a reload is needed is if the rgb copy is modified on the GL
side and the srgb copy is needed.
The fog settings do not depend on whether the shader writes to oFog or not;
instead they depend on the FOGVERTEXMODE and FOGTABLEMODE settings, and on
whether a vertex shader is bound at all.
It works the same way as with the fixed function, and having a vertex shader
is the same as using pretransformed vertices, just that the fog coord comes
from the shader instead of the specular color:
FOGTABLEMODE != NONE: The Z coord is used, oFog is ignored
FOGTABLEMODE == NONE, with VS: oFog is used
FOGTABLEMODE == NONE, no VS, XYZ: Z is used
FOGTABLEMODE == NONE, no VS, XYZRHW: specular color is used
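
The four cases above can be read as a small decision function. The
following is only a sketch of that table; the enum and function names are
made up for illustration and are not the wined3d identifiers.

```c
#include <stdbool.h>

/* Hypothetical names, chosen for this sketch only. */
enum fog_mode { FOG_NONE, FOG_EXP, FOG_EXP2, FOG_LINEAR };

enum fog_source
{
    FOG_SOURCE_Z,            /* Z coordinate */
    FOG_SOURCE_OFOG,         /* oFog output of the vertex shader */
    FOG_SOURCE_VERTEX_COLOR  /* fog factor carried in a vertex color */
};

/* Mirrors the table: table fog wins, then a bound vertex shader,
 * then the vertex format (XYZ vs. pretransformed XYZRHW). */
static enum fog_source select_fog_source(enum fog_mode table_mode,
        bool has_vertex_shader, bool pretransformed)
{
    if (table_mode != FOG_NONE)
        return FOG_SOURCE_Z;
    if (has_vertex_shader)
        return FOG_SOURCE_OFOG;
    if (!pretransformed)
        return FOG_SOURCE_Z;
    return FOG_SOURCE_VERTEX_COLOR;
}
```

Note that whether the shader writes oFog never appears as an input here,
which is the point made above.
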
Other than being a bit nicer than passing function pointers all over the
place, this helps dxgi/d3d10. While the swapchain itself is created in dxgi,
its surfaces are constructed in d3d10core, which makes it impractical for dxgi
to pass the appropriate function pointers.
startIdx should be the first index to draw, either from the vertex
array or the index array, depending on whether the draw is indexed or
not. Having both at the same time wouldn't make sense.
Most callers work on a stateblock rather than a device, and the main fields
we check (vertexShader and pixelShader) are part of the stateblock as well.
This is a first step towards cleaning up the fog mess. The fog
parameter is added to the pixelshader compile args structure. That way
multiple pshaders are compiled for different fog settings, and the
pixel shader can remove the fog line if fog is not enabled. As a result
we don't need special fog start and end settings, and this allows us
to implement EXP and EXP2 fog in the future too.
We cannot remove this because we still have to load the surface as
RGB. The shader may take care of setting the blue channel to 1.0 now,
but we still get the red and green channels loaded incorrectly if we
don't insert a blue channel before loading.
This allows us to drop the load time conversion and the clear
readback hack, replacing them with a color fixup in the fixed
function pipeline replacement.
Based on a patch by Stefan Dösinger. This is more flexible, and allows
the shader backend implementation to be simpler, since it doesn't have
to know about specific formats. The next patch makes use of this.
Note that minMipLookup and magLookup aren't particularly safe to use;
they're global arrays initialized from IWineD3DImpl_FillGLCaps(). The same
goes for the other global dynamic lookup tables.
Some stateblock parameters have to be compiled into the GL pixel
shader code, like lines for pixelformat fixups. This leads to problems
when applications switch those settings, requiring a recompilation of
the shader. This patch enables wined3d to have multiple GL shaders for
a D3D shader (pixel shaders only so far) to handle this more
efficiently.
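
A rough sketch of the idea, assuming a small per-shader list of compiled
variants keyed by the compile arguments. The structure layout, the field
names and find_gl_pshader() are hypothetical, and the "compile" step is
replaced by a counter so the example stays self-contained.

```c
#include <stddef.h>

/* Hypothetical compile arguments; the real structure differs. */
struct ps_args
{
    unsigned int fixup_mask;  /* which pixel format fixups to emit */
    unsigned int fog_mode;
};

struct ps_variant
{
    struct ps_args args;
    unsigned int gl_id;       /* stands in for the GL shader handle */
};

#define MAX_VARIANTS 8

struct d3d_pixel_shader
{
    struct ps_variant variants[MAX_VARIANTS];
    size_t variant_count;
    unsigned int next_id;     /* placeholder for a real GL compile */
};

/* Return the GL shader matching args, compiling a new variant only
 * if none exists yet. Returns 0 if the variant list is full. */
static unsigned int find_gl_pshader(struct d3d_pixel_shader *shader,
        const struct ps_args *args)
{
    size_t i;

    for (i = 0; i < shader->variant_count; ++i)
    {
        const struct ps_args *cur = &shader->variants[i].args;
        if (cur->fixup_mask == args->fixup_mask && cur->fog_mode == args->fog_mode)
            return shader->variants[i].gl_id;
    }

    if (shader->variant_count == MAX_VARIANTS)
        return 0;

    shader->variants[shader->variant_count].args = *args;
    shader->variants[shader->variant_count].gl_id = ++shader->next_id;
    return shader->variants[shader->variant_count++].gl_id;
}
```

Switching back to a previously used setting then finds the existing
variant instead of triggering a recompilation.
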
This was suggested by Ivan quite a while ago, and we need it to better
handle conflicting texture format corrections and similar stateblock
value changes which until now required a recompilation of the entire
shader.
A number of considerations contribute to this:
1) The shader backend knows best which shader(s) it needs. GLSL needs
both, ARB only one.
2) The shader backend may pass some parameters to the compilation
code (e.g. which pixel format fixup to use).
3) The structures used in (2) are different in vs and ps, so a
baseshader::Compile won't work.
4) The structures in (2) are wined3d-private structures, so
having a public method in the vtable won't work (it's a bad idea
anyway).
This creates a function for setting the texture name and one for
setting the texture target. The idea is that the texture target should
get set right after the surface is created, and won't change, while
generating a texture name can wait.
This gives a small performance improvement for applications that are
smart enough to set the D3DPRESENTFLAG_DISCARD_DEPTHSTENCIL flag, or
to create depth stencils with Discard set to TRUE.
This is the preferred format of many codecs, and for some codecs this
is the only supported output format. As usual I try to handle all the
conversion on the GPU and keep the CPU involvement minimal to gain the
full performance of PBO transfers.
GL_ARB_fragment_program and GL_ATI_fragment_shader can disable
projected textures properly, and they can also handle
D3DTTFF_PROJECTED | D3DTTFF_COUNT3 properly.
Since some of those function pointers are direct GL functions, the function
prototype needs the WINE_GLAPI calling convention. This prevents
drawStridedSlow from crashing with USE_WIN32_OPENGL.
Although sharing FBOs across contexts is allowed by EXT_framebuffer_object
(issue 76), it causes issues with nVidia drivers. Considering the GL 3 spec
explicitly disallows sharing of FBOs across contexts (Appendix D), this
patch is probably the right thing to do.
If a format is not supported natively by OpenGL, a shader may be able
to convert it. Up to now, CheckDeviceFormat had magic knowledge of which
GL extensions lead to which supported format. This patch adds
functions that allow CheckDeviceFormat to ask the actual
implementation for its capabilities.
This is a long-needed cleanup aimed at removing the ddraw_primary,
ddraw_window, ddraw_width and ddraw_height members from
IWineD3DDeviceImpl, which just do not belong there. Destination
window and screen handling is supposed to be done by swapchains.
ATI cards prior to the Radeon HD series did not have unconditional non
power of two support. So far we've used texture_rectangle for that, or
created a bigger power of two texture with padding. This had the
disadvantage that we had to correct the coordinates, which causes
extreme problems with shaders (doesn't work, pretty much).
Both the MacOS and the fglrx driver have support for
GL_ARB_texture_non_power_of_two, and run it on the hardware as long as
we stay within the texture_rectangle limitations. This allows us to
have conditional non power of two textures with normalized
coordinates. This patch adds an internal extension, and the code
creates a regular GL_TEXTURE_2D texture with NP2 size, but refuses
mipmapping, filtering and texture_rectangle incompatible
operations. This makes NP2 textures work with shaders on fglrx and
MacOS.
This patch adds a new field to the state templates. If this extension
field is != 0, then the line is only applied to the final state table
if the extension is supported. Once a line is applied to the final
table, all further templates for this state from the same pipeline
part are ignored. This allows removing some extension checks from the
state handlers, which cleans them up and saves a few CPU cycles when
applying the states.
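
As an illustration of the rule described above, here is a minimal sketch
that merges the templates of one pipeline part into a final table. All
names (state_template, build_state_table, the sample handlers) are
invented for this example; the real structures carry more information.

```c
#include <stdbool.h>
#include <stddef.h>

#define STATE_COUNT 4
#define EXT_NONE 0  /* extension field == 0: line applies unconditionally */

typedef void (*state_handler)(void);

struct state_template
{
    unsigned int state;      /* index into the final state table */
    state_handler apply;
    unsigned int extension;  /* required extension id, or EXT_NONE */
};

static int last_applied;
static void apply_with_ext(void) { last_applied = 1; }
static void apply_fallback(void) { last_applied = 2; }

/* Two lines for the same state: the first needs extension 1,
 * the second is the unconditional fallback. */
static const struct state_template sample_templates[] =
{
    {0, apply_with_ext, 1},
    {0, apply_fallback, EXT_NONE},
};

/* Merge the templates of one pipeline part. The first usable line
 * for a state wins; later lines for that state are ignored. */
static void build_state_table(state_handler table[STATE_COUNT],
        const struct state_template *tmpl, size_t count, const bool *supported)
{
    bool set[STATE_COUNT] = {false};
    size_t i;

    for (i = 0; i < count; ++i)
    {
        if (tmpl[i].state >= STATE_COUNT || set[tmpl[i].state])
            continue;
        if (tmpl[i].extension != EXT_NONE && !supported[tmpl[i].extension])
            continue;
        table[tmpl[i].state] = tmpl[i].apply;
        set[tmpl[i].state] = true;
    }
}
```

With this ordering the extension check happens once, when the table is
built, rather than each time the state handler runs.
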
This code creates the structures and the pipeline selection, as well
as the caps filling. It does not yet move the actual code around,
since this will be a bigger task.
Since atifs is only doing the fragment pipeline replacement right now
there is no need for the shader backend structure any longer. The ffp
private data is stored in new fragment pipeline private data (which
could potentially be set to equal the shader private data if needed).
It isn't related to the shader backend any longer. The nvts_enable in
the ffp code isn't quite right either; it should be moved away once
there is a dedicated nvts fragment pipeline replacement.
Calling shader_select() from inside depth_blt() isn't necessarily
safe. shader_select() assumes CompileShader() has been called for the
current shaders, but that depends on STATE_VSHADER / STATE_PIXELSHADER
being applied. That isn't always true when depth_blt() gets called,
with the result that sometimes GLSL programs could be created with no
shader objects attached.
For now the atifs selection sticks to the old rules, thus it is bound to
the available and selected shader capabilities. We may want to change that
in the future.
The idea of this patchset is to split the monolithic state set into 3
parts: vertex processing, fragment processing and other states (depth,
stencil, scissor, ...). The states will be provided in templates which
can be (mostly) independently combined, and are merged into a single
state table at device creation time. This way we retain the advantages
of the single state table while gaining the advantage of separated
pipeline implementations which can be combined without any manually
written glue code.
This gets rid of depth_copy_state in the device, and instead tracks
the most up to date location per-surface. This makes things a lot
easier to follow, and allows us to make a copy when switching depth
stencils in SetDepthStencilSurface().
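
One way to picture per-surface tracking is a bitmask of locations that
currently hold up to date depth data. This is only a sketch under that
assumption; the flag names, the struct and both helpers are hypothetical,
and the actual depth copy is replaced by a counter.

```c
/* Hypothetical location flags for this sketch. */
enum depth_location
{
    DEPTH_LOC_ONSCREEN  = 1,
    DEPTH_LOC_OFFSCREEN = 2
};

struct depth_surface
{
    unsigned int locations;  /* locations holding current depth data */
    int copies;              /* counts the depth copies performed */
};

/* Drawing to one location makes every other location stale. */
static void depth_modify_location(struct depth_surface *s, unsigned int location)
{
    s->locations = location;
}

/* Make location current, copying only if it is stale and some other
 * location has valid data to copy from. */
static void depth_load_location(struct depth_surface *s, unsigned int location)
{
    if (s->locations & location)
        return;
    if (s->locations)
        ++s->copies;         /* stand-in for the real depth copy */
    s->locations |= location;
}
```

Because the state lives in the surface, a depth stencil that is being
switched out can be brought up to date without consulting device state.
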
This makes the depth copy independent of the currently attached render
targets. This is important for the next patch because it might do a
depth copy when the render targets aren't in a valid configuration
(SetDepthStencilSurface()).
SetupForBlit sets up the GL viewport and projection matrix for
screen-coordinate access to the framebuffer. These settings were not
updated if the other GL states were already set up for blitting. Guild
Wars reads back an offscreen rendered texture from the framebuffer,
which currently sets up CTXUSAGE_BLIT, then changes the render target,
and draws to the texture, which has to be reloaded from system memory
before it can be rendered to (since GW loaded some data into it). If the
two render targets had different sizes, this failed.
SM3.0 requires ten 4-component float varyings for passing stuff between
vertex and pixel shaders. GF7 and earlier report 8 generic varyings +
gl_Color and gl_SecondaryColor in GLSL. This patch allows us to use
gl_Color and gl_SecondaryColor to get 2 extra varyings, which some
games, like C&C3 with highest gfx settings, require.
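
The arithmetic is 8 generic varyings plus the 2 builtin colors, giving
the 10 that SM3.0 needs. A sketch of how an input index could be mapped
under that assumption follows; the enum and pick_varying() are invented
for illustration and do not describe the actual assignment order.

```c
/* Hypothetical result type for this sketch. */
enum varying_slot
{
    VARYING_GENERIC,             /* one of the generic varyings */
    VARYING_GL_COLOR,            /* gl_Color */
    VARYING_GL_SECONDARY_COLOR,  /* gl_SecondaryColor */
    VARYING_NONE                 /* no slot left */
};

/* Use the generic varyings first, then fall back to the two builtin
 * color varyings once the generic ones are exhausted. */
static enum varying_slot pick_varying(unsigned int index, unsigned int generic_count)
{
    if (index < generic_count)
        return VARYING_GENERIC;

    switch (index - generic_count)
    {
        case 0:  return VARYING_GL_COLOR;
        case 1:  return VARYING_GL_SECONDARY_COLOR;
        default: return VARYING_NONE;
    }
}
```

With generic_count set to 8, indices 0 through 9 all receive a slot.
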