Currently when lowering mod() we add an extra instruction so if
mod(a,b) == b then 0 is returned instead of b, as mathematically
mod(a,b) is in the interval [0, b).
But Vulkan spec has relaxed this restriction, and allows the result to
be in the interval [0, b].
This commit takes this in account to remove the extra instruction
required to return 0 instead.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922>
This lets us enable PIPE_CAP_FRAGMENT_SHADER_TEXTURE_LOD, which in turns
gives us ARB_shader_texture_lod.
Still fails one piglit test on ANV, namely
spec@arb_shader_texture_lod@execution@arb_shader_texture_lod-texgradcube,
but with 33 new passing tests, I think this is worth it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3140>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3140>
turnip: don't set SP_FS_CTRL_REG0_VARYING if only fragcoord is used
Fixes artifacts in the subpasses demo, which has a shader using fragcoord
without any varyings. It looks like setting this bit when there are no
varyings can cause weirdness in some cases (without this change, if the
previous shader had <= 8 varyings it would work, but with 9 varyings it
would have artifacts).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
turnip: add cache invalidate to fix input attachment cases
Fixes artifacts in the subpasses demo.
Workaround texture cache with input attachments from GMEM by adding a cache
invalidate between subpasses.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
loader: fix close on uninitialized file descriptor value
Using a drm syscall layer faking a kernel driver :
==581460== Conditional jump or move depends on uninitialised value(s)
==581460== by 0x48A4C2B: close (drm-hooks.cpp:185)
==581460== by 0x5A815F1: dri3_alloc_render_buffer (loader_dri3_helper.c:1469)
==581460== by 0x5A82050: dri3_get_buffer (loader_dri3_helper.c:1827)
==581460== by 0x5A82662: loader_dri3_get_buffers (loader_dri3_helper.c:2028)
==581460== by 0x6C78109: intel_update_image_buffers (brw_context.c:1870)
==581460== by 0x6C77805: intel_update_renderbuffers (brw_context.c:1499)
==581460== by 0x6C7789D: intel_prepare_render (brw_context.c:1520)
==581460== by 0x6C773D4: intelMakeCurrent (brw_context.c:1341)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 069fdd5f9f ("egl/x11: Support DRI3 v1.1")
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3152>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3152>
freedreno: Use new macros for CP_WAIT_REG_MEM and CP_WAIT_MEM_GTE
Similar to the existing usage for CP_COND_WRITE5, this makes it clear
what each of the magic parameters are for.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
And add fields uncovered by looking at the firmware. I think this covers
all the memory, register, and scratch manipulation opcodes that exist on
A6xx, plus one additional nice find for Vulkan and describing a
previously unknown opcode and documenting CP_WAIT_REG_MEM.
Note that the bits for the CP_REG_TO_MEM count, as well as the formula
for computing the actual count for both CP_REG_TO_MEM and CP_MEM_TO_REG,
are changed because the A630 SQE firmware actually does something
different. I haven't investigated older microcodes to see whether this
extends back to A5xx and A4xx, but the only non-A6xx uses of this
field result in the same bit-pattern when using the A6xx bit range and
formula, so it should be safe to change the definition universally.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
Fixes a hang with geekbench.
The existence of RX 580 and NAVI10 results shows that the generations
before and after this do not have the issue. (They show up on the
website). So this is likely a GFX9 only issue.
This is not something weird like LDS size since none of the shaders
seem to use LDS.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
For the sake of our testing infrastructure, disable this extension
for TGL until we can sort out a hang in Vulkan CTS.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
intel/vec4: Fix lowering of multiplication by 16-bit constant
Existing code was ignoring whether the type of the immediate source
was signed or not. If the source was signed, it would ignore small
negative values but it also would wrongly accept values between
INT16_MAX and UINT16_MAX, causing the atual value to later be
reinterpreted as a negative number (under 16-bits).
Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for older
platforms that don't support MUL with 32x32 types and use vec4.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
intel/fs: Fix lowering of dword multiplication by 16-bit constant
Existing code was ignoring whether the type of the immediate source
was signed or not. If the source was signed, it would ignore small
negative values but it also would wrongly accept values between
INT16_MAX and UINT16_MAX, causing the atual value to later be
reinterpreted as a negative number (under 16-bits).
Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for platforms
that don't support MUL with 32x32 types, including ICL and TGL.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2186
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
pan/midgard: Hoist temporary coordinate for cubemaps
We'll reuse some of this code for shadow samplers, which are represented
by a distinct source in NIR.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
I didn't realize this was in spec, but it fixes a crash in shaderdb.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
This cuts down the number of random environmental variables we need
flying around; now PAN_MESA_DEBUG=precompile is sufficient and
MIDGARD_MESA_DEBUG=shaderdb will be implied.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
panfrost: Add PAN_MESA_DEBUG=precompile for shader-db
We would like to use run.c for shader-db runs (rather than capturing in
real-time, which is limiting).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Patch adds BGRX sRGB visuals, required format translation information
to the __DRI_IMAGE_FOURCC_SXRGB8888 format and makes all BGRX visuals
sRGB capable just like is done with BGRA.
squashed patches from Yevhenii Kolesnikov:
dri: Add __DRI_IMAGE_FOURCC_SXRGB8888 conversion
i965: force visuals without alpha bits to use sRGB
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1501
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>
gallium/auxiliary: Handle count == 0 in u_vbuf_get_minmax_index_mapped
This makes u_vbuf_get_minmax_index_mapped return min = 0 / max = 0
when info->count == 0.
That should never happen anyway, but this commit makes it at least
return a sane value that callers expect, and also allows us - and
GCC - to assume count != 0 for optimization purposes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
gallium/auxiliary: Reduce conversions in u_vbuf_get_minmax_index_mapped
With this patch, GCC generates vectorized code that does the comparisons
without converting the indices to 32-bit first.
This optimization makes the aforementioned function almost twice as fast
for ARM NEON, and should speed up vectorised code on other platforms.
Without vectorisation, the function is still a percent or two faster,
but slightly larger.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
Based on the GL driver:
-Compute needs different opcode (this fixes a GPU hang problem)
-REG_A6XX_SP_IBO_LO/REG_A6XX_SP_CS_IBO_LO were swapped
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>