It's probably rare for higher render targets to get locked or updated
from sysmem, but this should still be more correct. It also makes the
code simpler.
This is an ATI specific format designed for compressed normal maps,
and quite a few games check for its existence. While it is an
ATI-specific "extension" in d3d9, it is a core part of
D3D10(DXGI_FORMAT_BC5), and supported on Geforce 8 cards.
This happened to work because most cards have the same amount of
pshader and vshader constants, but for some reason this doesn't hold
true on this macbook pro here, which lead to a crash due to heap
corruption
Some drivers(the open source ones most notably) cannot satisfy all
possible D3D formats. This doesn't mean we should fall back to the
emergency fallback instantly. Instead, try to loosen the requirements
step by step.
This is cleaner than the if statements in the code. Also np2 textures
should in theory support linear filtering, but fglrx doesn't seem to
like it. This needs further investigation. So far we've never used
linear filtering on np2 textures, so there should not be a
regression. Furthermore I think shader support is more important than
filtering, since NP2 textures are mostly used for 1:1 copying to the
screen.
ATI cards prior to the radeon HD series did not have unconditional non
power of two support. So far we've used texture_rectangle for that, or
created a bigger power of two texture with padding. This had the
disadvantage that we had to correct the coordinates, which causes
extreme problems with shaders(doesn't work, pretty much).
Both the MacOS and the fglrx driver have support for
GL_ARB_texture_non_power_of_two, and run it on the hardware as long as
we stay within the texture_rectangle limitations. This allows us to
have conditional non power of two textures with normalized
coordinates. This patch adds an internal extension, and the code
creates a regular GL_TEXTURE_2D texture with NP2 size, but refuses
mipmapping, filtering and texture_rectangle incompatible
operations. This makes np2 textures work with shaders on fglrx and
macos.
The opengl extension mentioned in that code was never finished, and as
far as I know there is no way to make use of tangent data in the d3d
fixed function pipeline as well.
Note that GL_ATI_envmap_bumpmap is not the same as
GL_ATI_fragment_shader. envmap_bumpmap is used together with the
regular opengl ffp pipeline and is not used (other than for
pixelformats) if GL_ATI_fragment_shader is used.
This patch adds a new field to the state templates. If this extension
field is != 0, then the line is only applied to the final state table
if the extension is supported. Once a line is applied to the final
table, all further templates for this state from the same pipeline
part are ignored. This allows removing some extension checks from the
state handlers, which cleans them up and saves a few CPU cycles when
applying the states.
This patch enables texture filtering for textures using the A4R4G4B4
format, I can see no reason why it shouldn't be filtered (especially
considering X4R4G4B4 has it).
This creates an nvts version of this function, and removes the nvts
code from the original one. The nvts version is used by the nvts
pipeline implementation, the original one by the nvrc-only, atifs and
ffp one.
As long as we have the shader constants in misc, it is best to keep
all the code that affects shader constants, like bumpenvmat setting,
in there as well.
This code creates the structures and the pipeline selection, as well
as the caps filling. It does not yet move the actual code around,
since this will be a bigger task.
When a sampler is changed and unconditional NP2 textures are not
supported, the texture matrix may need adjustment. The sampler state
function checks for that, and calls the texture transform setting
function in that case. However, samplers are a misc state, and the
texture transform flags a vertex state. Thus split up the code and
move the matrix changes to the vertex side.
Since atifs is only doing the fragment pipeline replacement right now
there is no need for the shader backend structure any longer. The ffp
private data is stored in new fragment pipeline private data(which
could potentially be set to equal the shader private data if needed).
It isn't related to the shader backend any longer. The nvts_enable in
the ffp code isn't quite right as well, it should be moved away once
there is a dedicated nvts fragment pipeline replacement
Destroying the stateblock potentially references the shader backend.
If the stateblock has active shaders when it is released, the shader's
destructor will tell the shader backend to destroy the corresponding
resources. This was exposed by my patch that moved the glsl program
lookup table into the backend's private data.
Calling shader_select() from inside depth_blt() isn't necessarily
safe. shader_select() assumes CompileShader() has been called for the
current shaders, but that depends on STATE_VSHADER / STATE_PIXELSHADER
being applied. That isn't always true when depth_blt() gets called,
with the result that sometimes GLSL programs could be created with no
shader objects attached.
For now the atifs selection sticks to the old rules, thus it is bound to
the available and selected shader capabilities. We may want to change that
in the future.
The idea of this patchset is to split the monolithic state set into 3
parts, vertex processing, fragment processing and other states(depth,
stencil, scissor, ...). The states will be provided in templates which
can be (mostly) independently combined, and are merged into a single
state table at device creation time. This way we retain the advantages
of the single state table and having the advantage of separated
pipeline implementations which can be combined without any manually
written glue code.
The atifs fragment processing implementation doesn't borrow a pixel shader
implementation from anywhere. It was a hack during development, but never needed.
Constant numbers start at 0, and the loading loop has a for(i; i <
dirtyconsts; i++). This means that the highest dirty constant isn't
loaded correctly. Rather than replacing the < with <=, which would
make it impossible to have no dirty constant, add 1 to the dirty
constant counter.
This gets rid of depth_copy_state in the device, and instead tracks
the most up to date location per-surface. This makes things a lot
easier to follow, and allows us to make a copy when switching depth
stencils in SetDepthStencilSurface().
This makes the depth copy independent of the currently attached render
targets. This is important for the next patch because it might do a
depth copy when the render targets aren't in a valid configuration
(SetDepthStencilSurface()).
SetupForBlit sets up the GL viewport and projection matrix for
screen-cordinate access to the framebuffer. These settings were not
updated if the other gl states were already set up for blitting. Guild
Wars reads back an offscreen rendered texture from the framebuffer,
which currently sets up CTXUSAGE_BLIT, then changes the render target,
and draws to the texture, which has to be reloaded from system memory
before it can be rendered to(since GW loaded some data into it). If the
two render targets had different size this failed.
The idea is to make setting depth attachments a bit more consistent
with set_render_target_fbo()/attach_surface_fbo(). I've also got an
upcoming patch in my tree that needs this.
Just unsetting SFLAG_INTEXTURE doesn't work for FBOs because the
drawable and texture are the same there (and ModifyLocation() is the
correct way to do this anyway). Fixes another ddraw test failure with
FBO ORM.
As far as I can tell we support post ps blending in combination with
MRTs fine. Tabula Rasa needs this cap in order to enable some of the
higher graphics settings.
Currently we only check if ARB_HALF_FLOAT_PIXEL is supported. This is
not enough, we need ARB_TEXTURE_FLOAT as well. This fixes some errors
when running the d3d9 visual test with Mesa swrast.
Currently depth formats are handled separately from the other formats,
but depth formats can support things like filtering as well, so we
should check those caps as well.
SM3.0 requires 10 4 component float varyings for passing stuff between
vertex and pixel shaders. GF7 and earlier report 8 generic varyings +
gl_Color and gl_SecondaryColor in GLSL. This patch allows us to use
gl_Color and gl_SecondaryColor to get 2 extra varyings, which some
games, like C&C3 with highest gfx settings, require.
There is no reason to do that, now that the SetGLTextureDesc bug is
fixed. This avoids an infinite recursion because PreLoad calls
ActivateContext at some point.
Fixes screen not updating or getting updated inconsistently when apps blit to
front buffer or lock it when RenderTargetLockMode=readtex, as happens in e.g.
Red Alert 2 and also in p8_primary_test in ddraw tests.
glStencilFuncSeparateATI does not take a face argument, instead it
sets the front and back facing functions at once. This means the
renderstate_stencil_twosided helper function is somewhat pointless for
this extension.
The previous logic assumed that if NVTS or ATIFS are available they
will be used. This happens to be true for NVTS, but ATIFS is only used
if neither ARBFP nor GLSL are supported. This breaks fixed function
fragment processing on ATI r300 and newer cards
This makes it easier to make this a per texture / per adapter property.
Somewhen we should rename the remaining lookup type in the general
lookup table to wraplookup.
OpenGL always offers filtering on all formats, and if the hardware
doesn't support it the driver falls back to software. Direct3D on the
other hand silently disables filtering, so that's what we should do too.