target/d3dadapter9: Return Windows like card names
Add support for multiple cards and fill in Win
like card name, driver name and version info.
Use fallback for unknown vendors and unknown card names.
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Nine code uses some C11 features, and this
leads to compile error on gcc <= 4.5
Another way would have been to use the
-fms-extensions CFLAG
Signed-off-by: David Heidelberg <david@ixit.cz>
Cc: "10.4 10.5 10.6" <mesa-stable@lists.freedesktop.org>
This got missed because the piglit test only tested int images to avoid a
combinatiorial explosion of format, targets, stages and sizes which
takes more than 5 minutes to test on nvidia's driver.
This patch also drops the IMAGE_FUNCTION_AVAIL_ATOMIC which is not applicable
to the image_size codepath but was not hurting in any way.
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
There is no MDOperand in llvm 3.5.
v2: Check if kernel metadata is present to avoid crash (EdB).
v3: Second attempt to avoid crash: switch off metadata query for llvm < 3.6.
Reviewed-by: Serge Martin (EdB) <edb+mesa@sigluy.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
We have to re-validate FBOs rendering to the texture like is done
with TexImage and CopyTexImage.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91673
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
We're generating rcps as part of backend lowering of the packed coordinate
in the CS, and we don't want to lower them in NIR because of the extra
newton-raphson steps in the common case. However, GLB2.7 is moving a
vertex attribute with a 1.0 W component to the position, and that makes us
produce some silly RCPs.
total instructions in shared programs: 97590 -> 97580 (-0.01%)
instructions in affected programs: 74 -> 64 (-13.51%)
I had QPU emit code to do it, but forgot to flag the register class.
total instructions in shared programs: 97974 -> 97590 (-0.39%)
instructions in affected programs: 25291 -> 24907 (-1.52%)
vc4: Pack the unorm-packing bits into a src MUL instruction when possible.
Now that we do non-SSA QIR instructions, we can take a NIR SSA src that's
only used by the unorm packing and just stuff the pack bits into it.
total instructions in shared programs: 98136 -> 97974 (-0.17%)
instructions in affected programs: 4149 -> 3987 (-3.90%)
vc4: Make the pack-to-unorm instructions be non-SSA.
This helps ensure that the register allocator doesn't force the later pack
operations to insert extra MOVs.
total instructions in shared programs: 98170 -> 98159 (-0.01%)
instructions in affected programs: 2134 -> 2123 (-0.52%)
Now that we have NIR, most of the optimization we still need to do is
peepholes on instruction selection rather than general dataflow
operations. This means we want to be able to have QIR be a lot closer to
the actual QPU instructions, just with virtual registers. Allowing
multiple instructions writing the same register opens up a lot of
possibilities.
The constant folding pass can take a long time to complete
so rather than running through the entire pass each time
a new constant is propagated (and vice versa) interleave them.
This change helps ES31-CTS.arrays_of_arrays.InteractionFunctionCalls1
go from around 2 min -> 23 sec.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
nv50/ir: pre-compute BFE arg when both bits and offset are imm
Due to a quirk in how the nv50 opt passes run, the algebraic
optimization that looks for these BFE's happens before the constant
folding pass. Rearranging these passes isn't a great idea, but this is
easy enough to fix. Allows a following cvt to eliminate the bfe in
certain situations.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
glsl: expose textureQueryLod in GLSL 4.00+ fragment shaders
See issue from the ARB_texture_query_lod spec for LOD vs Lod confusion:
(3) The core specification uses the "Lod" spelling, not "LOD". Should
this extension be modified to use "Lod"?
RESOLVED: The "Lod" spelling is the correct spelling for the core
specification and the preferred spelling for use. However, use of
"LOD" also exists, as the extension predated the core specification,
so this extension won't remove use of "LOD".
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Revert "mesa/formats: refactor by collapsing cases in switch statement by type"
This reverts commit ffe6c6ad5f.
_mesa_format_num_components() does not include the padding bits in mesa formats
containing 'X' channels. This could cause mipmap generation for certain
uncompressed formats to underestimate the number of channels in the source
image by 1.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
FLT_TO_INT goes in the vector pipes on evergreen/NI,
not the trans unit as on earlier chips.
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This struct was getting a bit crowded, following the lead of
radeonsi, mirror the idea of having sub-structures for each
shader type. Turning 'r600_shader_key' into an union saves
some trivial memory and CPU cycles for the shader keys.
[airlied: drop as_ls, and reorder so larger fields at start.]
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
r600: Rewrite r600_shader_selector_key() to use a switch stmt
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
glsl: check if return_deref in lower_subroutine_visitor::visit_leave isn't NULL
Fixes a crash in Piglit's
spec@arb_shader_subroutine@linker@no-mutual-recursion.vert for me.
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
nvc0/ir: undo more shifts still by allowing a pre-SHL to occur
This happens with unpackSnorm lowering. There's yet another
bitfield-extract behind it, but there's too much variation to be worth
cutting through.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
nvc0/ir: don't require AND when the high byte is being addressed
unpackUnorm* lowering doesn't AND the high byte/word as it's
unnecessary. Detect that situation as well.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
nvc0/ir: detect i2f/i2i which operate on specific bytes/words
Some Unigine shaders have been observed to unpack bytes out of 32-bit
integers and convert them to floats. I2F/I2I can handle this sort of
thing directly. Detect the handleable situations.
This misses 16-bit word capabilities in nv50, but I haven't seen shaders
that would actually make use of that.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
nvc0/ir: detect AND/SHR pairs and convert into EXTBF
Some shaders appear to extract bits using shift/and combos. Detect
(some) of those and convert to EXTBF instead.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
nv50/ir: support different unordered_set implementations
If build with C++11 standard, use std::unordered_set.
Otherwise if build on old Android version with stlport,
use std::tr1::unordered_set with a wrapper class.
Otherwise use std::tr1::unordered_set.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
I pushed a half-baked version of "i965: handle nir_intrinsic_image_size" by
accident. Not having the Reviewed-by: tags on the last two commits should
have been a red flag but I somehow missed it after the QA check.
This patch should fix image-size for non-int images. I will add support to
the piglit test for all the other image types.
Sorry for the noise.
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
v2, Review from Francisco Jerez:
- avoid the camelCase for the booleans
- init the booleans using the sampler type
- force the initialization of all the components of the output register
v3:
- Rename a variable from CubeMapArray to CubeArray to re-use GLSL's name (Ilia)
- Fix some indentation and drop parenthesis (Topi)
- Fix a signed/unsigned comparaison warning
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
nir: convert the glsl intrinsic image_size to nir_intrinsic_image_size
v2, review from Francisco Jerez:
- make the destination variable as large as what the nir instrinsic
defines (4) instead of the size of the return variable of glsl. This
is still safe for the already existing code because all the intrinsics
affected returned the same amount of components as expected by glsl IR.
In the case of image_size, it is not possible to do so because the
returned number of component depends on the image type and this case
is not well handled by nir.
v3:
- Style fix
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
The code is heavily inspired from Francisco Jerez's code supporting the
image_load_store extension.
Backends willing to support this builtin should handle
__intrinsic_image_size.
v2: Based on the review of Ilia Mirkin
- Enable the extension for GLES 3.1
- Fix indentation
- Fix the return type (float to int, number of components for CubeImages)
- Add a warning related to GLES 3.1
v3: Based on the review of Francisco Jerez
- Refactor the code to share both add_image_function and _image with the other
image-related functions
v4: Based on Topi Pohjolainen's comments
- Do not add parenthesis for the return value
v5: based on Francisco Jerez's comments:
- Fix a few indent issues
- Reduce the size of a condition by testing the dimension and array properties
instead of enumerating all the formats.
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
I'll mark the OES_shader_image_atomic extension entry with this tag to
make sure that we don't expose it on earlier GLES API versions
accidentally, because according to the extension:
"OpenGL ES 3.1 and GLSL ES 3.10 are required."
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
glsl: Parse the allowed image format qualifiers in GLSL ES 3.1.
This includes the minimum required desktop/ES GLSL version in the
format qualifier table in anticipation of new GLSL versions extending
the set of supported image formats. According to section 4.4.7 of the
GLSL ES 3.1 spec:
"The format layout qualifier identifiers for image variable
declarations are:
[...]
rgba32f
rgba16f
r32f
rgba8
rgba8_snorm
[...]
rgba32i
rgba16i
rgba8i
r32i
[...]
rgba32ui
rgba16ui
rgba8ui
r32ui"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
glsl: Remove duplicate definition of gl_MaxTess*ImageUniforms built-in constants.
These seem to have been re-added at some point during the
ARB_tessellation_shader implementation work. AFAICT the second
(correct) definition of each constant would have had no effect because
the symbols were already defined.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
glsl: Accept supported image types in GLSL ES 3.1.
These are a subset of the image types supported by desktop GL,
excluding 1D, 1D array, rectangle, buffer, cube array, 2D MS and 2D
MS array texture targets.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>