i965: Move inst_info from brw_eu_validate.c to brw_eu.c.
Drop the uses of 'enum gen' to a plain int, so that we don't have to
expose the bitfield definitions and GEN_GE/GEN_LE macros to other users
of brw_eu.h. As a result, s/.gen/.gens/ to avoid confusion with
devinfo->gen.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965/disasm: Wrap opcode_desc look-up in a function.
The function takes a device info struct as argument in addition to the
opcode number in order to disambiguate between multiple opcode_desc
entries for different instructions with the same opcode number.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1]
[v2] mattst88: Put brw_opcode_desc() in brw_eu.c instead of moving it
there in a later patch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v2]
[v3] mattst88: Return NULL if opcode >= ARRAY_SIZE(opcode_descs)
Reviewed-by: Matt Turner <mattst88@gmail.com>
This is not strictly required for the following changes because none
of the three-source opcodes we support at the moment in the compiler
back-end has been removed or redefined, but that's likely to change in
the future. In any case having hardware instructions specified as a
pair of hardware device and opcode number explicitly in all cases will
simplify the opcode look-up interface introduced in a subsequent
commit, since the opcode number alone is in general ambiguous.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965: Pass devinfo pointer to brw_instruction_name().
A future series will implement support for an instruction that happens
to have the same opcode number as another instruction we support
already on a disjoint set of hardware generations. In order to
disambiguate which instruction it is brw_instruction_name() will need
some way to find out which device we are generating code for.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965: Write a scalar TCS backend that runs in SINGLE_PATCH mode.
Unlike most shader stages, the Hull Shader hardware makes us explicitly
tell it how many threads to dispatch and manually configure the channel
mask. One perk of this is that we have a lot of flexibility - we can
run it in either SIMD4x2 or SIMD8 mode.
Treating it as SIMD8 means that shaders with 8 or fewer output vertices
(which is overwhemingly the common case) can be handled by a single
thread. This has several intriguing properties:
- Accessing input arrays with gl_InvocationID as the index is a simple
SIMD8 URB read with g1 as the header. No indirect addressing required.
- Barriers are no-ops.
- We could potentially do output shadowing to combine writes, as the
concurrency concerns are gone. (We don't do this yet, though.)
v2: Drop first_non_payload_grf change, as it was always adding 0
(caught by Jordan Justen).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
i965: Rework the TCS passthrough shader to use NIR.
I'm about to implement a scalar TCS backend, and I'd rather not
duplicate all of this code there.
One change is that we now write the tessellation levels from all
TCS threads, rather than just the first. This is pretty harmless,
and was easier. The IF/ENDIF needed for that are gone; otherwise
the generated code is basically identical.
I chose to emit load/store intrinsics directly because it was easier.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
gallium/util: change assertion to conditional in util_bitmask_destroy()
If we fail to create a context in the VMware driver we call this function
unconditionally to free a bunch of bit vectors. Instead of asserting on
a null pointer, just no-op.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
If, for example, we previously had 2 sampler states bound and now we
are binding one, we'd leave the second sampler state unchanged.
This change nulls-out the second sampler state in this situation.
We're already doing the same thing for sampler views.
This silences an occasional warning issued by the VMware driver when
the number of sampler views and sampler states disagreed.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
svga: try to flag surfaces for sampling, in addition to rendering
This silences some warnings when we try to sample from surfaces that were
created for drawing, such as when blitting from one of the framebuffer
surfaces. We were already doing the opposite situation (adding a bind
flag for rendering to surfaces declared as texture sources).
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
svga: fix copying non-zero layers of 1D array textures
Like cube maps, we need to convert the z information to a layer index.
Also rename the *_face vars to *_face_layer to make things a little more
understandable.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
nvc0: compute a percentage for metric-achieved_occupancy
metric-issue_slot_utilization and metric-branch_efficiency are already
computed as percentages.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
nvc0: store the driver query type for performance metrics
This will allow to use percentages for some metrics because the Gallium
HUD doesn't allow to display floating point numbers and 0 is printed
instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
nvc0: fix exposing of metric-issue_slots for SM21/SM30
This is most likely a copy-paste error when I reworked this area few
weeks ago. For SM20, metric-issue_slots is equal to inst_issued because
there is only one pipeline, so the metric is not exposed there.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reported-by: Karol Herbst <nouveau@karolherbst.de>
This prevents a crash when a NULL src is passed with a non-NULL length.
fixes: dEQP-GLES31.functional.debug.object_labels.query_length_only
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95252
Signed-off-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
glsl: subroutine types cannot be used in constructors.
This fixes two of the cases in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
glsl: resource is a reserved keyword in GLSL 4.20 as well
resource just appears in GLSL 4.20 without any fanfare.
Fixes GL43-CTX.CommonBugs.CommonBug_ReservedNames
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
mesa/copyimage: make sure number of samples match.
This fixes
GL43-CTS.copy_image.samples_missmatch
which otherwise asserts in the radeonsi driver.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
mesa/objectlabel: don't do memcpy if bufSize is 0 (v2)
This prevents GL43-CTS.khr_debug.labels_non_debug from
memcpying all over the stack and crashing.
v2: actually fix the test.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
GL43-CTS.texture_view.errors checks for GL_INVALID_VALUE
here but we catch these problems in the dimensionsOK check
and return the wrong error value.
This fixes:
GL43-CTS.texture_view.errors.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
gallium/radeon: remove stencil_tile_split from metadata
this is a leftover from the days when depth-stencil buffers were
allocated by the DDX
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
winsys/amdgpu: pass PIPE_CONFIG to addrlib on texture import
This hasn't been needed, but I think we should set it.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>