i965: avoid anonymous struct in float <-> VF conversions
Anonymous structures are only supported with newer versions of
GCC. They will not work with GCC 4.2.1 used by OpenBSD or
GCC 4.4.7 shipped with RHEL6 going by a commit to fix a similiar
problem in radeonsi earlier in the year
(74388dd24b).
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
mesa: fix arithmetic error in _mesa_compute_compressed_pixelstore()
We need parenthesis around the expression which computes the number of
blocks per row.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
The i965 backends pass something out of 'screen', which is allocated
per-process, making using this as a ralloc context not thread-safe.
All callers ra_alloc_interference_graph() already ralloc_free() its
return value.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
It was totally broken:
- p_atomic_dec_zero() was returning the negation of the expected value
- p_atomic_inc_return()/p_atomic_dec_return() was
post-incrementing/decrementing, hence returning the old value instead
of the new
- p_atomic_cmpxchg() was returning the new value on success, instead of
the old
It is clear this never used in the past. I wonder if it wouldn't be better to
yank it altogether.
Reviewed-by: Matt Turner <mattst88@gmail.com>
It was much easier for me to verify things build and run as expected
with this simple test, than building and testing whole Mesa.
With scons the test can be build and run merely by doing:
scons u_atomic_test
Building the test with autotools is left as a future exercise.
Reviewed-by: Matt Turner <mattst88@gmail.com>
like how C11's stdatomic.h provides generic functions. GCC's __sync_*
builtins already take a variety of types, so that's simple.
MSVC and Sun Studio don't, but we can implement it with something that
looks a little crazy but is actually quite readable.
Thanks to Jose for some MSVC fixes!
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This doesn't reschedule much currently, just tries to fit things into the
regfile A/B write-versus-read slots (the cause of the improvements in
shader-db), and hide texture fetch latency by scheduling setup early and
results collection late (haven't performance tested it). This
infrastructure will be important for doing instruction pairing, though.
shader-db2 results:
total instructions in shared programs: 61874 -> 59583 (-3.70%)
instructions in affected programs: 50677 -> 48386 (-4.52%)
We're supposed to be checking that nothing else writes r4, which is done
by the TMU result collection signal, not the coordinate setup.
Avoids a regression when QPU instruction scheduling is introduced.
freedreno/a4xx: invalidate cache when vbo's change
Otherwise vertex shader can see stale cache data. This in particular
happens when the same vbo is updated and reused. Not sure yet if vbo's
at differing addresses but bound to same vertex buffer slot could have
issues, but seems safest to flush whenever new vertex buffers are bound.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
st/mesa: avoid exposing EXT_texture_integer for pre-GLSL 1.30
For drivers building up to GL(ES)3, only expose the actual extension if
the API will let it be used (e.g. via overrides/debug flags that enable
higher versions).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
freedreno/a3xx: add missing integer formats and enable rendering
The mesa state tracker doesn't fall back on similar integer formats, so
they must all be provided. Remove the restriction against integer color
rendering.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
freedreno/a3xx: enable sampling from integer textures
We need to produce a u32 destination type on integer sampling
instructions, so keep that in a shader key set based on the
currently-bound textures.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
freedreno/a3xx: don't use half precision shaders for int/float32
Integer outputs end up getting mangled due to cov.f32f16, and float32
loses precision. Use full precision shaders in both of those cases.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
freedreno/a3xx: remove blend clamp enables from gmem/clears
Just pass the data through unmolested. This probably has no effect since
blending isn't actually enabled.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Looks like none of the mad variants do u16 * u16 + u32, so just add in
the extra value "by hand".
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
freedreno/a3xx: unify vertex/texture formats into a single table
The table contains all the relevant information about each format. The
helper functions now just do lookups in the table.
Note that this adds support for a lot of formats that were previously
unsupported. Additionally it adds disabled support for integer render
buffers, which will require more work to actually enable.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
freedreno/a3xx: rename vertex/texture format enums to be more consistent
Switch both of them from independently inconsistent conventions to having
UINT/SINT/UNORM/SNORM/FLOAT/FIXED suffixes.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
freedreno/a3xx: only enable blend clamp for non-float formats
This fixes arb_color_buffer_float-render GL_RGBA16F.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>