i965: Don't copy propagate abs into Broadwell logic instructions.
It's not clear what abs on logical instructions means on Broadwell, and
it doesn't appear to do anything sensible.
Fixes 270 Piglit tests (the bitand/bitor/bitxor tests with abs).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81157
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
i965/fs: Use WE_all for gl_SampleID header register munging.
This code should execute without regard to the currently executing
channels. Asking for gl_SampleID inside control flow might break in
strange ways. It appears to break even at the top of the program in
SIMD16 mode occasionally as well.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.
gen8_fs_generator uses these to decide whether to set the execution size
to 8 or 16, so we incorrectly made both of these MOVs the full width in
SIMD16 shaders. (It happened to work out on Gen4-7.)
Setting them should also help inform optimization passes what's really
going on, which could help avoid bugs.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
i965: Set execution size to 8 for instructions with force_sechalf set.
Both inst->force_uncompressed and inst->force_sechalf mean that the
generated instruction should be uncompressed and have an execution size
of 8. We don't require the visitor to set both flags - setting
inst->force_sechalf by itself is supposed to be enough.
On Gen4-7, guess_execution_size() demoted instructions to 8-wide based
on the default compression state. On Gen8+, we instead set a default
execution size, which worked great...except that we forgot to check
inst->force_sechalf when deciding whether to use 8 or 16.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
nouveau: check if a fence has already been signalled
nouveau_fence_update does real work unconditionally. Avoid doing that if
the fence we're checking on has already been signalled.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
exec_list: Make various places use the new length() method.
Instead of hand-rolling it.
v2 [mattst88]: Rename get_size to length. Expand comment in ir_reader.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
exec_list: Add a function to give the length of a list.
v2 [mattst88]: Remove trailing whitespace. Rename get_size to length.
Mark as const.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
This complements the existing append function. It's implemented in a
rather simple way right now; it could be changed if performance is a
concern.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
There are no queries for GL_TEXTURE_LUMINANCE_SIZE,
GL_TEXTURE_INTENSITY_SIZE, GL_TEXTURE_LUMINANCE_TYPE, or
GL_TEXTURE_INTENSITY_TYPE in any version of OpenGL ES or desktop OpenGL
core profile.
NOTE: Without changes to piglit, this regresses
required-sized-texture-formats.
v2: Rebase on different initial change.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2 <mesa-stable@lists.freedesktop.org>
There are no texture borders in any version of OpenGL ES or desktop
OpenGL core profile.
Fixes piglit's gl-3.2-texture-border-deprecated.
v2: Rebase on different initial change.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2 <mesa-stable@lists.freedesktop.org>
mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image
Instead of catching the special case early, handle it by constructing a
fake gl_texture_image that will cause the values required by the OpenGL
4.0 spec to be returned.
Previously, calling
glGenTextures(1, &t);
glBindTexture(GL_TEXTURE_2D, t);
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, 0xDEADBEEF, &value);
would not generate an error.
Anuj: Can you verify this does not regress proxy_textures_invalid_size?
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Suggested-by: Brian Paul <brianp@vmware.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
i965/fs: Relax interference check in register coalescing.
A similar attempt was made in commit 5ff1e446 and was reverted in commit
a39428cf after causing a regression in an ES 3 conformance test. The
test still passes after this commit.
total instructions in shared programs: 1994827 -> 1992858 (-0.10%)
instructions in affected programs: 128247 -> 126278 (-1.54%)
GAINED: 0
LOST: 1
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
i965/fs: Perform CSE on sends-from-GRF rather than textures.
Should potentially allow a few more cases, while avoiding doing CSE on
texture operations on Gen <= 6 with the MRF.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80211
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: lu hua <huax.lu@intel.com>
glsl: Update expression types after rebalancing the tree.
If we saw a tree that looked like
vec3
/ \
vec3 float
/ \
vec3 float
/ \
vec3 float
We would see that all of the expression types were vec3, and then
rebalance to
vec3
/ \
vec3 vec3 <-- should be float
/ \ / \
vec3 float float float
This patch adds code to visit the rebalanced tree and update the
expression types from the bottom up.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80880
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
0cbefc1bea added a source argument to
EMIT/ENDPRIM, but it did not update tgsi_ureg accordingly, causing all
users of ureg_EMIT/ENDPRIM to fail at runtime with an assertion failure.
Trivial.
glapi: Use GetProcAddress instead of dlsym on Windows.
This patch fixes this MinGW build error.
glapi_gentable.c: In function '_glapi_create_table_from_handle':
glapi_gentable.c:123:9: error: implicit declaration of function 'dlsym' [-Werror=implicit-function-declaration]
*procp = dlsym(handle, symboln);
^
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Brian Paul <brianp@vmware.com>
Vectors are falling in to the ir_dereference_array() path.
Without this change, the following glsl aborts the debug driver,
or gets the wrong answer in release:
mat2x2 a = mat2( vec2( 1.0, vertex.x ), vec2( 0.0, 1.0 ) );
Also submitting piglit tests, will reference in bug.
v2: Rebase on Mesa master.
v3: Remove unneeded check for arrays, which are covered by
process_array_constructor(), recommended by Timothy Arceri.
Signed-off-by: Cody Northrop <cody@lunarg.com>
Reviewed-by: Courtney Goeltzenleuchter <courtney@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79373
On Cygwin and MinGW, linking a shared library also generates an import library
Use a wildcard which also matches the name of the megadriver import lib,
mesa_dri_drivers.dll.a, so that is also removed after megadriver symlinks are
created
(This then matches src/gallium/targets/dri/Makefile.am, which already does
things this way)
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
i965/fs: add support for ir_*_interpolate_at_* expressions
SIMD8-only for now.
V5: - Fix style complaints
- Move prototype to be with other oddball emit functions
- Use unreachable() instead of assert() where possible
V6: - Describe what is happening with the clamping
- Add reg_width to make some expressions clearer
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965/fs: Skip channel expressions splitting for interpolation
The backend will have to do a message send, so we want to keep these in
one piece, just like texture ops.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965/fs: add generator support for pixel interpolator query
V5: - Split into separate opcodes
- Pass message data in src1 immediate
- Put noperspective bit in fs_inst rather than adding any junk to
backend_instruction
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965/disasm: Disassemble indirect sends more properly
- Don't try to disassemble send's src1 as a descriptor if it's not an
immediate.
- In the same case, show src1 as an operand (makes it easier to see
bogus register regions, etc -- the hardware is very fussy)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
i965: Avoid crashing while dumping vec4 insn operands
We'd otherwise go looking into virtual_grf_sizes for things that aren't
in there at all.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
V2: - Don't assume everyone wants interpolateAtSample() lowered to
interpolateAtOffset. It turns out this isn't what we want most
of the time for i965. Lowering can be added later in an ir pass
which drivers opt into, rather than bolting it straight into the
builtin definition.
- Only expose the interpolateAt* builtins in the fragment language.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Will be used to implement interpolateAt*() from ARB_gpu_shader5
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
allow builtin functions to require parameters to be shader inputs
The new interpolateAt* builtins have strange restrictions on the
<interpolant> parameter.
- It must be a shader input, or an element of a shader input array.
- It must not include a swizzle.
V2: Don't abuse ir_var_mode_shader_in for this; make a new flag.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
radeonsi: properly implement texture opcodes that take an offset
Instead of using intr_name in lp_build_tgsi_action, this selects the names
with a switch statement in the emit function. This allows emitting
llvm.SI.sample for instructions without offsets and llvm.SI.image.sample.*.o
otherwise.
This depends on my LLVM changes.
When LLVM 3.5 is released, I'll switch all texture instructions to the new
intrinsics.