This is the Nth attempt to make clipping work with GLSL shaders. The patch now
uses the GLSL quirk table to handle cards that need a custom varying for
gl_ClipPos, and the code is adapted to the changed state table and shader
backend system.
This extension is a subset of GL_NV_half_float that defines support
for the stream format (same constant), but doesn't define texture
formats or immediate mode entrypoints.
GL_ARB_shader_texture_lod is supported on dx9 ATI cards (and probably dx10
ones too). For Nvidia cards I included a fallback to a normal texld.
GL_EXT_gpu_shader4 supports similar texture*Grad GLSL functions, just
with an EXT suffix instead of ARB. For dx9 NV cards we'd have to use
GL_NV_fragment_program2, which supports a texldd equivalent on those
cards.
Nvidia doesn't offer it on GeForce 7 and earlier cards, but some games need
it. This is surprising because the extension was created specifically for
compatibility with older cards.
Some drivers apparently need private constants, or don't have an efficient
immval packing. For example, MacOS seems to need one float for each different
relative addressing offset. fglrx has the same issue, although it is more
efficient in general.
Previously this worked on most drivers because the 16 + 4 reserved int and
bool constants kept the problem hidden. Now that we are more aggressive with
uniforms, we have to leave some room free for such drivers.
This makes it possible to define driver description fixups without adding
extra if statements for each card.
For starters, there's a fixup for the GLSL constants advertised on ATI cards:
fglrx advertises 512 GLSL uniforms instead of the supported 1024 (i.e. 128
instead of 256 vec4s). This bug was confirmed by ATI.
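A rough sketch of what such a fixup table could look like (the structure,
names and matching here are illustrative, not the actual wined3d code):

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical quirk entry: match on the GL vendor/renderer strings and
     * apply a fixup to the detected limits. */
    struct driver_fixup
    {
        const char *vendor_match;
        const char *renderer_match;
        void (*apply)(unsigned int *glsl_constants);
        const char *description;
    };

    /* fglrx advertises 512 GLSL uniforms although 1024 are supported. */
    static void fixup_ati_glsl_constants(unsigned int *glsl_constants)
    {
        if (*glsl_constants == 512) *glsl_constants = 1024;
    }

    static const struct driver_fixup fixups[] =
    {
        { "ATI", "", fixup_ati_glsl_constants, "fglrx advertises too few GLSL uniforms" },
    };

    static void apply_driver_fixups(const char *vendor, const char *renderer,
            unsigned int *glsl_constants)
    {
        unsigned int i;
        for (i = 0; i < sizeof(fixups) / sizeof(fixups[0]); ++i)
        {
            if (strstr(vendor, fixups[i].vendor_match)
                    && strstr(renderer, fixups[i].renderer_match))
                fixups[i].apply(glsl_constants);
        }
    }

    int main(void)
    {
        unsigned int glsl_constants = 512;
        apply_driver_fixups("ATI Technologies Inc.", "ATI Radeon", &glsl_constants);
        printf("GLSL uniforms after fixups: %u\n", glsl_constants);
        return 0;
    }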
This moves the GLSL- and ARB-specific reserved constants out of directx.c and
into the get_caps methods of the shader backends. That way the number of
reserved constants stays internal to the backend.
GL_LIMITS({v/p}shader_constantsF) now contains the real number of constants as
advertised by GL instead of some mixture of GL info and backend implementation
specifics. This makes it easier for backends to decide how many constants to
use.
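A minimal sketch of the intended split, with simplified structures and a
made-up reserved-constant count standing in for the real code:

    /* Raw limits as reported by GL, e.g. via GL_MAX_VERTEX_UNIFORM_COMPONENTS
     * (converted to vec4 counts elsewhere). */
    struct gl_limits
    {
        unsigned int vshader_constantsF;
        unsigned int pshader_constantsF;
    };

    struct shader_caps
    {
        unsigned int max_vshader_constantsF;
        unsigned int max_pshader_constantsF;
    };

    /* Constants this particular backend keeps for internal use; the values
     * are made up for the example. */
    #define BACKEND_RESERVED_VS_CONSTS 4
    #define BACKEND_RESERVED_PS_CONSTS 2

    static void example_shader_get_caps(const struct gl_limits *limits,
            struct shader_caps *caps)
    {
        /* The backend, not directx.c, decides how many of the GL-advertised
         * constants are exposed to the application. */
        caps->max_vshader_constantsF = limits->vshader_constantsF - BACKEND_RESERVED_VS_CONSTS;
        caps->max_pshader_constantsF = limits->pshader_constantsF - BACKEND_RESERVED_PS_CONSTS;
    }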
Right now we assume that WGL_ARB_pixel_format is present, but this isn't
always the case. The next patch in this series will add a
non-WGL_ARB_pixel_format codepath to help VirtualBox and others.
The idea here is that we should look up format information in struct
GlPixelFormatDesc, while StaticPixelFormatDesc and GlPixelFormatDescTemplate
will only be used to build the table.
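As a sketch of that idea (field names are simplified, not the real
structures), the table would be built once from the two source arrays and
then be the only thing the rest of the code consults:

    #include <string.h>

    struct StaticPixelFormatDesc            /* GL-independent per-format data */
    {
        unsigned int format;                /* WINED3DFORMAT value */
        unsigned int byte_count;
    };

    struct GlPixelFormatDescTemplate        /* GL-specific data, keyed by format */
    {
        unsigned int format;
        unsigned int gl_internal, gl_format, gl_type;
    };

    struct GlPixelFormatDesc                /* the combined lookup table */
    {
        unsigned int format;
        unsigned int byte_count;
        unsigned int gl_internal, gl_format, gl_type;
    };

    static void init_format_table(struct GlPixelFormatDesc *table,
            const struct StaticPixelFormatDesc *statics, unsigned int static_count,
            const struct GlPixelFormatDescTemplate *templates, unsigned int template_count)
    {
        unsigned int i, j;

        for (i = 0; i < static_count; ++i)
        {
            memset(&table[i], 0, sizeof(table[i]));
            table[i].format = statics[i].format;
            table[i].byte_count = statics[i].byte_count;

            /* Merge in the GL-specific template for this format, if any. */
            for (j = 0; j < template_count; ++j)
            {
                if (templates[j].format != statics[i].format) continue;
                table[i].gl_internal = templates[j].gl_internal;
                table[i].gl_format = templates[j].gl_format;
                table[i].gl_type = templates[j].gl_type;
                break;
            }
        }
    }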
Other than being a bit nicer than passing function pointers all over the
place, this helps dxgi/d3d10. While the swapchain itself is created in dxgi,
its surfaces are constructed in d3d10core, which makes it impractical for dxgi
to pass the appropriate function pointers.
This prevents fallout from the GL_EXT_fog_coord emulation. glEnable
and glDisable calls other than those that change GL_FOG are not
hooked. The glEnableWINE and glDisableWINE functions can be used to
add other hooks too if ever needed.
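A simplified sketch of the hook idea; the real functions track more state
than this:

    #include <GL/gl.h>

    static int wine_fog_enabled;    /* state the fog coord emulation cares about */

    void glEnableWINE(GLenum cap)
    {
        if (cap == GL_FOG) wine_fog_enabled = 1;    /* update the emulation state */
        glEnable(cap);                              /* everything else passes straight through */
    }

    void glDisableWINE(GLenum cap)
    {
        if (cap == GL_FOG) wine_fog_enabled = 0;
        glDisable(cap);
    }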
This confuses applications like Steam, which hook d3d9 and OpenGL
functions. Steam sees that the application uses opengl32, but doesn't
realize that d3d9 is implemented on top of OpenGL. Thus it starts messing
around with wined3d's WGL context. It usually tries to draw geometry with
the context, but cannot deal with some of the obscure extensions we have
activated.
Based on a patch by Stefan Dösinger. This is more flexible, and allows
the shader backend implementation to be simpler, since it doesn't have
to know about specific formats. The next patch makes use of this.
Note that minMipLookup and magLookup aren't particularly safe to use;
they're global arrays initialized from IWineD3DImpl_FillGLCaps(). The same
goes for the other global dynamic lookup tables.
GL_ATI_envmap_bumpmap provides two things: signed V8U8 pixel formats and
bump mapping. The extension is only supported on fglrx, and this driver
also supports GL_ARB_fragment_program, so the bump mapping code is never
used on any driver out there. Furthermore, if it is used, it tends to
crash the driver.
The signed pixel format, on the other hand, is used, since pixel shaders
and the ARBfp replacement can make use of it. However, the format is broken
in fglrx: negative values are clamped to 0.0, which results in test
failures. WineD3D has an alternative codepath that uses scale+bias to
implement V8U8 on top of a standard signed RGB format, and this works
correctly on fglrx.
Since some of those function pointers are direct GL functions, the function
prototypes need the WINE_GLAPI calling convention. This prevents
drawStridedSlow from crashing with USE_WIN32_OPENGL.
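Roughly, the issue looks like this (types and names are simplified; on
32-bit Windows the GL entry points use the stdcall convention, so pointers
that may point directly at GL functions must declare it):

    #ifdef _WIN32
    # define WINE_GLAPI __stdcall
    #else
    # define WINE_GLAPI
    #endif

    /* Without WINE_GLAPI on the typedef, calling e.g. glVertex4fv through
     * such a pointer with USE_WIN32_OPENGL mismatches the calling convention
     * and corrupts the stack. */
    typedef void (WINE_GLAPI *attrib_func)(const void *data);

    struct stream_callbacks
    {
        attrib_func position;   /* may point at a GL function or a wined3d helper */
        attrib_func diffuse;
    };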
GL_RGBA doesn't guarantee an internal storage depth, which can cause the test
to fail if the data is stored with less than 8 bits of precision. Some nVidia
drivers would actually store it with only 4 bits of precision.
If a format is not supported natively by OpenGL, a shader may be able
to convert it. Up to now, CheckDeviceFormat had magic knowledge of which
GL extensions lead to which supported formats. This patch adds
functions that allow CheckDeviceFormat to ask the actual
implementation for its capabilities.
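A hypothetical sketch of such a query, to illustrate the direction (this is
not the actual interface):

    /* Instead of CheckDeviceFormat() knowing which GL extension implies which
     * conversion, it asks the active implementation whether it can handle the
     * format, e.g. via a shader fixup. */
    struct format_support_ops
    {
        int (*color_fixup_supported)(unsigned int wined3d_format);
    };

    static int format_is_usable(const struct format_support_ops *ops,
            unsigned int wined3d_format, int supported_natively)
    {
        if (supported_natively) return 1;
        return ops->color_fixup_supported(wined3d_format);
    }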
Until now the ddraw capabilities were almost static, except for D3D
support. When overlay support is added, the caps will depend on certain
settings in WineD3D or on capabilities available from OpenGL and Xv. So
set those caps in wined3d as well.
This is an ATI-specific format designed for compressed normal maps,
and quite a few games check for its existence. While it is an
ATI-specific "extension" in d3d9, it is a core part of
D3D10 (DXGI_FORMAT_BC5), and it is supported on GeForce 8 cards.
ATI cards prior to the Radeon HD series did not have unconditional
non-power-of-two support. So far we've used texture_rectangle for that, or
created a bigger power-of-two texture with padding. This had the
disadvantage that we had to correct the coordinates, which causes
extreme problems with shaders (it pretty much doesn't work).
Both the MacOS and the fglrx driver support
GL_ARB_texture_non_power_of_two, and run it on the hardware as long as
we stay within the texture_rectangle limitations. This allows us to
have conditional non-power-of-two textures with normalized
coordinates. This patch adds an internal extension; the code
creates a regular GL_TEXTURE_2D texture with an NP2 size, but refuses
mipmapping, filtering and texture_rectangle-incompatible
operations. This makes NP2 textures work with shaders on fglrx and
MacOS.
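A hedged sketch of the selection logic behind the internal extension (flag
and field names are made up):

    struct np2_caps
    {
        int arb_npot;              /* GL_ARB_texture_non_power_of_two advertised */
        int unconditional_np2_hw;  /* e.g. Radeon HD and newer */
        int normalized_texrect;    /* the internal extension added by this patch */
    };

    static void select_np2_path(struct np2_caps *caps)
    {
        if (caps->arb_npot && !caps->unconditional_np2_hw)
        {
            /* Use plain GL_TEXTURE_2D textures with NP2 sizes and normalized
             * coordinates, but refuse mipmapping, filtering and other
             * texture_rectangle-incompatible operations. */
            caps->normalized_texrect = 1;
            caps->arb_npot = 0;    /* don't treat NP2 support as unconditional */
        }
    }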
This patch adds a new field to the state templates. If this extension
field is != 0, then the line is only applied to the final state table
if the extension is supported. Once a line is applied to the final
table, all further templates for this state from the same pipeline
part are ignored. This allows removing some extension checks from the
state handlers, which cleans them up and saves a few CPU cycles when
applying the states.
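An illustrative version of the mechanism (structures simplified, names made
up; the real templates carry more information):

    typedef void (*state_handler)(unsigned int state);

    struct state_template
    {
        unsigned int state;
        state_handler apply;
        unsigned int extension;     /* 0 means "always applicable" */
    };

    struct compiled_state
    {
        state_handler apply;
    };

    static void apply_templates(struct compiled_state *table,
            const struct state_template *templates, unsigned int count,
            const int *extension_supported)
    {
        unsigned int i;

        for (i = 0; i < count; ++i)
        {
            const struct state_template *t = &templates[i];

            /* Skip lines whose extension requirement isn't met. */
            if (t->extension && !extension_supported[t->extension]) continue;
            /* Once a state is filled, later templates for it are ignored. */
            if (table[t->state].apply) continue;
            table[t->state].apply = t->apply;
        }
    }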
This code creates the structures and the pipeline selection, as well
as the caps filling. It does not yet move the actual code around,
since this will be a bigger task.
Since atifs only does the fragment pipeline replacement right now,
there is no need for the shader backend structure any longer. The ffp
private data is stored in new fragment pipeline private data (which
could potentially be set to equal the shader private data if needed);
it isn't related to the shader backend any longer. The nvts_enable in
the ffp code isn't quite right either; it should be moved away once
there is a dedicated nvts fragment pipeline replacement.
For now the atifs selection sticks to the old rules, and thus is bound to
the available and selected shader capabilities. We may want to change that
in the future.
The idea of this patchset is to split the monolithic state set into 3
parts: vertex processing, fragment processing and other states (depth,
stencil, scissor, ...). The states will be provided in templates which
can be (mostly) independently combined, and are merged into a single
state table at device creation time. This way we retain the advantages
of the single state table while gaining separate pipeline
implementations that can be combined without any manually written glue
code.
As far as I can tell we support post-pixel-shader blending in combination
with MRTs just fine. Tabula Rasa needs this cap in order to enable some of
the higher graphics settings.
Currently we only check whether ARB_HALF_FLOAT_PIXEL is supported. This is
not enough; we need ARB_TEXTURE_FLOAT as well. This fixes some errors
when running the d3d9 visual test with Mesa swrast.
Currently depth formats are handled separately from the other formats,
but depth formats can support things like filtering too, so we should
check those caps as well.
This makes it easier to turn this into a per-texture / per-adapter property.
At some point we should rename the remaining lookup type in the general
lookup table to wraplookup.
OpenGL always offers filtering on all formats, and if the hardware
doesn't support it the driver falls back to software. Direct3D on the
other hand silently disables filtering, so that's what we should do too.
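For illustration (not the actual code), the Direct3D behaviour amounts to
something like this when choosing the GL filter:

    #include <GL/gl.h>

    /* If the format cannot be filtered, silently fall back to point sampling
     * instead of asking GL for a filter the hardware can't do. */
    static GLenum select_mag_filter(int format_supports_filtering, GLenum requested)
    {
        return format_supports_filtering ? requested : GL_NEAREST;
    }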
This adds code for handling fixed function fragment processing with the
GL_ATI_fragment_shader extension. This is a sort-of programmable
interface for fragment processing at the level of shader model 1.4 in
d3d. This code is of use on r200, r250 and r280 cards (Radeon 8500 to
9200), which do not support GL_ARB_fragment_program but support pixel
shader 1.4 on Windows. This code is somewhat of a counterpart to the
existing fragment processing code using GL_NV_register_combiners and
GL_NV_texture_shader.
The control structures in directx.c get terribly confusing with
the various codepaths for texturing and the different shader
implementations. It is also hard to reflect the shader model
decisions this way. This patch moves the shader-specific parts of
the caps code into the shader backend, where we can set our caps
depending on the shader model decisions and without complex caps flag
checks.
It is legal to pass Usage=0 to CheckDeviceFormat. Both in this case and in
the case where a format isn't available, UsageCaps would be 0, so an
unsupported format could be reported as available.
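A sketch of the corrected logic (illustrative, not the real
CheckDeviceFormat):

    #define SKETCH_D3D_OK               0
    #define SKETCH_D3DERR_NOTAVAILABLE  1

    static int check_device_format(int format_supported, unsigned int usage,
            unsigned int usage_caps)
    {
        /* Reject unsupported formats first, so that Usage == 0 can no longer
         * make them look available. */
        if (!format_supported) return SKETCH_D3DERR_NOTAVAILABLE;
        /* With Usage == 0 this test is trivially satisfied, which is correct
         * for a supported format. */
        if ((usage_caps & usage) != usage) return SKETCH_D3DERR_NOTAVAILABLE;
        return SKETCH_D3D_OK;
    }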
We assume it has the same capabilities as VOLUMETEXTURE. MSDN is very
vague on this topic. Intel/Nvidia/ATI drivers seem to offer nearly the
same caps on both, so do that too.
Add a new property to the shader backend which indicates whether the
backend is able to dirtify single constants rather than dirtifying the
vshader and pshader constants as a whole. Depending on this, a different
Set*ConstantF implementation is used which marks constants dirty. The ARB
shader backend uses this and marks constants clean after uploading.
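A rough illustration of per-constant dirtification (names and layout are
made up):

    #include <string.h>

    #define MAX_CONSTS 256

    struct constant_state
    {
        float constants[MAX_CONSTS][4];
        unsigned char dirty[MAX_CONSTS];    /* per-register dirty flags */
        int dirty_all;                      /* fallback for backends without the property */
    };

    static void set_constants_f(struct constant_state *state, int can_dirtify_single,
            unsigned int start, const float *data, unsigned int count)
    {
        unsigned int i;

        memcpy(state->constants[start], data, count * 4 * sizeof(float));
        if (can_dirtify_single)
        {
            /* Only the changed registers are uploaded; the backend (e.g. ARB)
             * clears these flags again after uploading. */
            for (i = 0; i < count; ++i) state->dirty[start + i] = 1;
        }
        else
        {
            state->dirty_all = 1;   /* whole range is re-uploaded on the next draw */
        }
    }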
As some Mesa developers pointed out, the GL_ARB_vertex_program grammar
does not allow an immediate value as the source argument of ARL. Most
compilers accept it, but since it is not the purpose of the test
program to test for this, replace it with a proper constant.
Often the Linux / MacOS graphics driver version is of no use for
finding a proper driver version to report to the D3D app. So this
patch adds some infrastructure for easy hardcoding of card specific
driver versions to report to the application. This helps applications
which make assumptions based on the driver version, like bug
workarounds.
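A sketch of what such a table could look like (the entry below is a
placeholder, not real data):

    struct driver_version_entry
    {
        unsigned int card_id;       /* detected card, e.g. a PCI device id */
        unsigned int version_high;  /* reported DriverVersion, high dword */
        unsigned int version_low;   /* reported DriverVersion, low dword */
    };

    static const struct driver_version_entry driver_version_table[] =
    {
        { 0x0000, 0x00000000, 0x00000000 },     /* placeholder entry */
    };

    static int get_forced_driver_version(unsigned int card_id,
            unsigned int *high, unsigned int *low)
    {
        unsigned int i;

        for (i = 0; i < sizeof(driver_version_table) / sizeof(driver_version_table[0]); ++i)
        {
            if (driver_version_table[i].card_id != card_id) continue;
            *high = driver_version_table[i].version_high;
            *low = driver_version_table[i].version_low;
            return 1;
        }
        return 0;   /* fall back to whatever the GL driver reports */
    }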
The GL_ARB_vertex_program extension does not define a standard value for
output texture coordinates. This causes problems when using vertex
shaders with fixed function fragment processing, because the fffp divides
the texture coords by their .w component. This means that GL shaders have
to write to the .w component of texture coords, while Direct3D shaders do
not.
Without this, vertex shader 3.0 is reported on non-Nvidia cards that
only support vertex shader 2.0. Reporting 3.0 would result in slow
software rendering, as it is much more advanced than 2.0.
The removal of ENTER_GL from the fake context code requires the
addition of ENTER_GL/LEAVE_GL to FillGLCaps, which was previously
protected by the fake context code.