swrast: When asked to map a slice of a 1D array, give back that slice.
Until now, we've been treating 1D arrays as a single slice, and each
array slice is actually just a row of the 2D texture. While swrast
still stores them this way, hardware drivers think that 1D arrays have
actual separate slices not stored as contiguous rows.
Reviewed-by: Brian Paul <brianp@vmware.com>
intel: Consolidate texture validation copy code, and reuse it correctly.
The path for ->Data was failing to be called for the FBO draw offset
fallback, and also had mismatched compressed texture support code.
This drops the intel_prepare_render() in the blit path. We aren't
copying to/from a GL_FRONT buffer, so it doesn't matter.
intel: Clean up the function chain for mapping texture images for swrast.
Too many separate functions each called from one location (in
different files). This code should all die soon when swrast starts
using MapTextureImage.
intel: Allocate s8z24 separate renderbuffers from AllocTextureImageBuffer().
Before, we were only allocating these from our TexImage, so if the
texture image was set up in any other way (non-accelerated
glGenerateMipmaps()), they'd be missing or wrong.
mesa: Convert _mesa_generate_mipmap to MapTexImage()-based access.
Now that we can zero-copy generate the mipmaps into brand new
glTexImage()-generated storage using MapTextureImage(), we no longer
need to allocate image->Data in mipmap generate. This requires
deleting the drivers' old overrides of the miptree tracking after
calling _mesa_generate_mipmap at the same time, or the drivers
promptly lose our newly-generated data.
Reviewed-by: Eric Anholt <eric@anholt.net>
i965: Reverse the operands for INT DIV prior to Gen6.
Apparently on Gen4 and 5, the denominator comes first.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
i965: Set the signed/unsigned type bit in Gen4/5 math messages.
It never mattered before since we only did floating point math.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
i965: Fix message and response length calculations for INT DIV.
Both POW and INT DIV need a message length of 2; previously, we only
checked for POW.
Also, BRW_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER has a response
length of 2; previously, we only checked for SINCOS. We don't use this
message, but in case we ever decide to, we may as well fix it now.
While we're at it, just move these computations into
brw_set_math_message, since they're entirely based on the function.
This fixes it for both brw_math and the old backend's brw_math_16.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
i965: Fix assertions about register types for INT DIV in brw_math.
BRW_MATH_FUNCTION_REMAINDER was missing. Also, it seems worthwhile to
assert that INT DIV's arguments are signed/unsigned integers.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
ir_to_mesa: Don't assertion fail on integer modulus.
Drivers implementing GLSL 1.30 want to do integer modulus, and until we
can stop generating code via ir_to_mesa, it's easier to make it silently
generate rubbish code. Multiply will do.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Classic compiler mistake. In the example below, the OMOD optimization
was combining instructions 4 and 10, but since there was an instruction
(#8) in between them that wrote to the same registers as instruction 10,
instruction 11 was reading the wrong value.
Example of the mistake:
Before OMOD:
4: MAD temp[0].y, temp[3]._y__, const[0]._x__, const[0]._y__;
...
8: ADD temp[2].x, temp[1].x___, -temp[4].x___;
...
10: MUL temp[2].x, const[1].y___, temp[0].y___;
11: FRC temp[5].x, temp[2].x___;
After OMOD:
4: MAD temp[2].x / 8, temp[3]._y__, const[0]._x__, const[0]._y__;
...
8: ADD temp[2].x, temp[1].x___, -temp[4].x___;
...
11: FRC temp[5].x, temp[2].x___;
https://bugs.freedesktop.org/show_bug.cgi?id=41367
r300/compiler: Use consistent src swizzles for transcendent instructions
Source swizzles for transcendent instructions were being stored in the X
channel regardless of what channel the instruction was writing.
This was causing problems for some helper functions that were expecting
source swizzles to occupy channels corresponding to the instruction's
writemask. This commit makes transcendent instructions follow the same
convention as normal instructions for representing source swizzles.
Previous behavior:
LG2 temp[0].y, input[0].x___;
Current behavior:
LG2 temp[0].y, input[0]._x__;
mesa: Respect GL_RASTERIZER_DISCARD for various meta-type operations.
From the EXT_transform_feedback spec:
Primitives can be optionally discarded before rasterization by calling
Enable and Disable with RASTERIZER_DISCARD_EXT. When enabled, primitives
are discared right before the rasterization stage, but after the optional
transform feedback stage. When disabled, primitives are passed through to
the rasterization stage to be processed normally. RASTERIZER_DISCARD_EXT
applies to the DrawPixels, CopyPixels, Bitmap, Clear and Accum commands as
well.
And the GL 3.2 spec says it applies to ClearBuffer* as well.
Reviewed-by: Brian Paul <brianp@vmware.com>
Revert "vbo: Don't discount stride == 0 for testing all varyings in VBOs."
This reverts commit d631c19db4.
The commit was broken, and ended up returning false all the time
because nobody in the world binds every single possible vertex array.
On further reflection, we don't want to discount stride == 0: This
function is just used for deciding to calculate whether to compute the
bonuds on the index, and there's no sense in computing index bounds
when stride == 0.
For the separate question of "how much data do I upload for this
vertex element?", the i965 driver was fixed to upload the data.
Fixes a regression of about 2x in 3DMMES, and most importantly, makes
Hammerfight playable.
i965: Make sure to upload the data for a collection of Stride == 0 arrays.
Commit d631c19db4 avoided this problem
by forcing the driver to get the min/max index, but that commit was
broken, so just fix the driver problem (confusion between "do I need
to upload any data?" and "do I need the index bounds in order to
upload any data?").
mesa: Delay s_texcombine.c memory allocation until it's used.
Generally we're using fragment programs in all our drivers, so wasting
4MB for code that's never called is pretty lame. Reduces i965 memory
allocation for a short shader program from 21,932,128B to 17,737,816B.
As innocuous as it seemed, ebca47a basically broke the world (e.g.,
>200 piglit regressions). In vec4_visitor::emit_block_move,
src->swizzle was expected to be BRW_SWIZZLE_NOOP before setting it to
a swizzle that would replicate the existing channels of the source
type to a vec4 (e.g., .xyyy for a vec2).
The original assertion seems to have been a little bogus. In addition
to being BRW_SWIZZLE_NOOP, src->swizzle might already be a swizzle
that would replicate the existing channels of the source type to a
vec4. In other words, it might already have the value that we're
about to assign to it.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
mesa: number of combiner terms to pop depends on GL_NV_texture_env_combine4
If GL_NV_texture_env_combine4 is not supported, setting the fourth
combiner term would generate a GL error.
Of course, I noticed this right after committing the previous patch
to use a loop in the first place. <sigh>
Note that GL_EXT_texture_env_combine is always supported so the first
three combiner terms are always accepted.
mesa: use loop in pop_texture_group() to restore 4 combiner terms
There's four combiner terms (not 3) with GL_NV_texture_env_combine4.
Use a loop to make the code a little more compact.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Lots of things set and copy this field around, but nothing uses it.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
winsys/radeon: move GEM domains out of the drivers into winsys
The drivers don't need to care about the domains. All they need to set
are the bind and usage flags. This simplifies the winsys too.
This also fixes on r600g:
- fbo-depth-GL_DEPTH_COMPONENT32F-copypixels
- fbo-depth-GL_DEPTH_COMPONENT16-copypixels
- fbo-depth-GL_DEPTH_COMPONENT24-copypixels
- fbo-depth-GL_DEPTH_COMPONENT32-copypixels
- fbo-depth-GL_DEPTH24_STENCIL8-copypixels
I can't explain it.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
I have moved 'last_flush' and 'binding' from r600_bo to winsys/radeon.
The other members are now part of r600_resource.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>