brkho/mesa - mesa - Brian's Gitea

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	65ee5cc0da	anv: Use an actual binding for gl_NumWorkgroups This commit moves our handling of gl_NumWorkgroups over to work like our handling of other special bindings in the Vulkan driver. We give it a magic descriptor set number and teach emit_binding_tables to handle it. This is better than the bias mechanism we were using because it allows us to do proper accounting through the bind map mechanism. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	6 years ago
Jason Ekstrand	5c96120b5c	intel,nir: Lower TXD with min_lod when the sampler index is not < 16 When we have a larger sampler index, we get into the "high sampler" scenario and need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Fixes: `cb98e0755f` "intel/fs: Support min_lod parameters on texture..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	6 years ago
Jason Ekstrand	ca295ddbfb	spirv: OpImageQueryLod requires a sampler No idea how this fell through the cracks besides the fact that the sampler bound at 0 almost always works and the CTS isn't amazing. In any case, this appears to have been broken for almost forever. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	6 years ago
Jason Ekstrand	5049fbddb4	anv: Count surfaces for non-YCbCr images in GetDescriptorSetLayoutSupport We were accidentally not counting those surfaces Fixes: `ddc4069122` "anv: Implement VK_KHR_maintenance3" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	6 years ago
Sagar Ghuge	58bcebd987	spirv: Allow [i/u]mulExtended to use new nir opcode Use new nir opcode nir_[i/u]mul_2x32_64 and extract lower and higher 32 bits as needed instead of emitting mul and mul_high. v2: Surround the switch case with curly braces (Jason Ekstrand) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Sagar Ghuge	47ec9bdc60	nir/algebraic: Optimize low 32 bit extraction Optimize a situation where we only need lower 32 bits from 64 bit result. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Sagar Ghuge	1d8994a63b	glsl: [u/i]mulExtended optimization for GLSL Optimize mulExtended to use 32x32->64 multiplication. Drivers which are not based on NIR, they can set the MUL64_TO_MUL_AND_MUL_HIGH lowering flag in order to have same old behavior. v2: Add missing condition check (Jason Ekstrand) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <Matt Turner <mattst88@gmail.com> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Sagar Ghuge	e551040c60	nir/glsl: Add another way of doing lower_imul64 for gen8+ On Gen 8 and 9, "mul" instruction supports 64 bit destination type. We can reduce our 64x64 int multiplication from 4 instructions to 3. Also instead of emitting two mul instructions, we can emit single mul instuction and extract low/high 32 bits from 64 bit result for [i/u]mulExtended v2: 1) Allow lower_mul_high64 to use new opcode (Jason Ekstrand) 2) Add lower_mul_2x32_64 flag (Matt Turner) 3) Remove associative property as bit size is different (Connor Abbott) v3: Fix indentation and variable naming convention (Jason Ekstrand) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Axel Davy	1d363d440f	st/nine: Ignore multisample quality level if no ms Apparently instead of returning error when passing a quality level different than 0 for D3DMULTISAMPLE_NONE, we should pass. Fixes: https://github.com/iXit/Mesa-3D/issues/340 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Axel Davy <davyaxel0@gmail.com>	6 years ago
Axel Davy	86666f051e	st/nine: Ignore window size if error Check GetWindowInfo and ignore the computed sizes if there is an error. Fixes a regression caused by earlier commit when using old wine gallium nine patches. Should also address a crash at window destruction. Related issues: https://github.com/iXit/Mesa-3D/issues/331 https://github.com/iXit/Mesa-3D/issues/332 Cc: mesa-stable@lists.freedesktop.org Fixes: `2318ca68bb` ("st/nine: Handle window resize when a presentation buffer is used") Signed-off-by: Axel Davy <davyaxel0@gmail.com>	6 years ago
Mauro Rossi	ec0f465bc5	android: anv: fix libexpat shared dependency Fixes undefined reference building errors for XML_* functions Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "19.0" <mesa-stable@lists.freedesktop.org>	6 years ago
Mauro Rossi	14e7e26a09	android: anv: fix generated files depedencies (v2) Fix anv_extrypoints.{c,h} and anv_extensions.{c,h} missing dependencies Rename the variable labels according to targets and python scripts Align the building rules as per Automake for simplification Fixes building errors during rebuils due to missing dependencies (v2) Fixed a missing $(VULKAN_API_XML) reference Fixes: `9a508b7` ("android: anv/extensions: fix generated sources build") Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: "19.0" <mesa-stable@lists.freedesktop.org>	6 years ago
Brian Paul	e2369e133c	st/wgl: init a variable to silence MinGW warning MinGW release build says 'value' may be used before being initialized. Reviewed-by: Neha Bhende <bhenden@vmware.com>	6 years ago
Brian Paul	66ba12973b	svga: silence array out of bounds warning MinGW release build complains about a possible out-of-bounds array access. Test i < 4 to silence it. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	6 years ago
Brian Paul	999db9ac51	svga: init fill variable to avoid compiler warning MinGW release builds warns about use of a possbily uninitialized variable here. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	6 years ago
Brian Paul	9b07a221a4	st/mesa: whitespace fixes in st_texture.h Trivial.	6 years ago
Brian Paul	d74932dfea	st/mesa: line wrapping, whitespace fixes in st_cb_texture.c Trivial.	6 years ago
Brian Paul	fc91c2698e	st/mesa: whitespace fixes in st_sampler_view.c Replace tabs w/ spaces. 80-column wrapping. Trivial.	6 years ago
Gurchetan Singh	610758d3e5	egl/sl: also allow virtgpu to fallback to kms_swrast virtio-gpu fallbacks to software rendering when 3D features are unavailable since 6c5ab, and kms_swrast is more feature complete than swrast. v2: Add comment (Emil) Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	6 years ago
Mathias Fröhlich	904a0552aa	st/mesa: Invalidate the gallium array atom only if needed. Now that the buffer object usage history tracks if it is being used as vertex buffer object, we can restrict setting the ST_NEW_VERTEX_ARRAYS bit to dirty on glBufferData calls to buffers that are potentially used as vertex buffer object. Also put a note that the same could be done for index arrays used in indexed draws. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	6 years ago
Mathias Fröhlich	e727f8c8b8	mesa: Track buffer object use also for VAO usage. We already track the usage history for buffer objects in a lot of aspects. Add GL_ARRAY_BUFFER and GL_ELEMENT_ARRAY_BUFFER to gl_buffer_object::UsageHistory. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	6 years ago
Samuel Pitoiset	9e787904d0	rav: use 32_AR instead of 32_ABGR when alpha coverage is required This export format is faster. Seems to improve performance in Wreckfest. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	6 years ago
Alyssa Rosenzweig	72981c92ce	panfrost: List primitive restart enable bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	6 years ago
Alyssa Rosenzweig	2b5cda137f	panfrost/midgard: Preview for data hazards If a selected unit causes a data hazard, the whole block gets cut short. So, we preview for data hazards _while_ selecting units. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com	6 years ago
Alyssa Rosenzweig	93eeba623b	panfrost/midgard: Promote smul to vmul smul comes first in the pipeline, before vmul. Until we have a full instruction scheduler, it's better to have vmul prioritized to maximize bundle size. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com	6 years ago
Alyssa Rosenzweig	25bbb44dce	panfrost: Flush with offscreen rendering This special-case was needlessly added and breaks purely offscreen rendering (when there is no scanout involved) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com	6 years ago
Alyssa Rosenzweig	4f7460297b	panfrost/midgard: Don't force constant on VLUT Previously, we forced a #0 inline constant tacked on for the lut instructions to mirror the blob's behaviour, which caused some suboptimal codegen due to our constant inlining implementation. Instead, just don't force a constant at all. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com	6 years ago
Alyssa Rosenzweig	c351cc4e94	panfrost: Cleanup cruft related to clears Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Alyssa Rosenzweig	40ffee4448	panfrost: Decouple Gallium clear from FBD clear The operations of gallium->clear() and the hardware callbacks are fundamentally independent. This routine decouples them by routing shared information via panfrost_job, allowing the hardware half to be deferred to the fragment job generation. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Alyssa Rosenzweig	59c9623d0a	panfrost: Import job data structures from v3d At the moment, Panfrost state is ad hoc, which creates issues for FBOs. This commit imports the skeleton of the v3d_job structure as panfrost_job, in preparation for refactors to organize this state. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Ilia Mirkin	4eec3a2a36	glsl: fix recording of variables for XFB in TCS shaders This is purely for conformance, since it's not actually possible to do XFB on TCS output varyings. However we do have to make sure we record the names correctly, and this removes an extra level of array-ness from the names in question. Fixes KHR-GL45.tessellation_shader.single.xfb_captures_data_from_correct_stage v2: Add comment to the new program_resource_visitor::process function. (Ilia Mirkin) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108457 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	6 years ago
Jose Maria Casanova Crespo	bf1f49482d	glsl: TCS outputs can not be transform feedback candidates on GLES Avoids regression on: KHR-GLES*.core.tessellation_shader.single.xfb_captures_data_from_correct_stage that is uncovered by the following patch. "glsl: fix recording of variables for XFB in TCS shaders" v2: Rebased over glsl: fix recording of variables for XFB in TCS shaders v3: Move this patch before "glsl: fix recording of variables for XFB in TCS shaders" to avoid temporal regressions. (Illia Mirkin) Cc: 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	6 years ago
Jose Maria Casanova Crespo	cc7173b438	glsl: fix typos in comments "transfor" -> "transform" Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	6 years ago
Gert Wollny	3214f20914	mesa: Expose EXT_texture_query_lod and add support for its use shaders EXT_texture_query_lod provides the same functionality for GLES like the ARB extension with the same name for GL. v2: Set ES 3.0 as minimum GLES version as required by the extension Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Greg V	7dc2f47882	util: emulate futex on FreeBSD using umtx Obtained from: FreeBSD ports Acked-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	7 years ago
Rob Clark	00f838fa73	freedreno/ir3: track register pressure in sched Not a perfect solution, and the "pressure" target is hard-coded. But it doesn't really seem to much in the common case, and avoids exploding register usage in dEQP ssbo tests. So this should serve as a stop-gap solution until I have time to re- write the scheduler. Hurts slightly in instruction count, but gains (reduces) slightly the register usage in shader-db. Fixes ~150 dEQP-GLES31.functional.ssbo.* that were failing due to RA fail. Signed-off-by: Rob Clark <robdclark@gmail.com>	6 years ago
Rob Clark	8a5f2d9444	freedreno/ir3: add Sethi–Ullman numbering pass Signed-off-by: Rob Clark <robdclark@gmail.com>	6 years ago
Rob Clark	c8e351ee3a	freedreno/ir3: include nopN in expanded instruction count Signed-off-by: Rob Clark <robdclark@gmail.com>	6 years ago
Dave Airlie	cb4e3e3ef6	st/mesa: add support for lowering fp64/int64 for nir drivers This might enough for iris and possible r600 (when it gets NIR) This appears to work for iris. v2: * change cap return so DOUBLES == 2 means sw emu v3: * Refactor using int64/doubles lowering options which were added into nir options * Remove DOUBLES == 2 added in v2 [jordan: Remove "2" value on PIPE_CAP_DOUBLES] [jordan: Use lowering options added to nir options] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Jordan Justen	7de056e1a9	scons: Generate float64_glsl.h for glsl_to_nir fp64 lowering Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Jordan Justen	10c5579921	intel/compiler: Move int64/doubles lowering options Instead of calculating the int64 and doubles lowering options each time a shader is preprocessed, save and use the values in nir_shader_compiler_options. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Jordan Justen	31b35916dd	nir: Add int64/doubles options into nir_shader_compiler_options This will allow the options to be visible under nir_shader->options, which will allow the gallium state_tracker to use the driver preferred settings during glsl_to_nir. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Ian Romanick	bae0c36751	nir/algebraic: Optimize away an fsat of a b2f The b2f can only produce 0.0 or 1.0, so the fsat does nothing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	6 years ago
Ian Romanick	d1d56f5f9a	intel/fs: Don't assert on b2f with a saturate modifier This ran afoul of Iris's use of nir_lower_clamp_color_outputs which applies fsat() before writes to vertex shader color outpus. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `7725d60938` ("intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))")	6 years ago
Lionel Landwerlin	32ffd90002	anv: add support for INTEL_DEBUG=bat As requested by Ken ;) v2: Also decode simple batches (Caio) Fix u_vector usage issues (Lionel) v3: Make binding/instruction/state/surface available (Lionel) v4: Going through device pools for simple batches (Lionel) Centralize search BO callbacks into anv_device.c (Lionel) v5: Clear decoded batch buffer var after use (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	6 years ago
Eric Anholt	f1122f78b7	v3d: Fix build of NEON code with Mesa's cflags not targeting NEON. v3d may be built as part of a set of drivers in a system not requiring NEON, but we know V3D devices will be paired with CPUs with NEON so we should be able to use this asm. Fixes: `0c05198d6b` ("v3d: Always enable the NEON utile load/store code.")	6 years ago
Matt Turner	e0148bbcfd	intel/compiler: Add commas on final values of compaction table arrays Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	7 years ago
Ian Romanick	ecc9ffa778	nir/algebraic: Replace a-fract(a) with floor(a) I noticed this while looking at a shader that was affected by Tim's "more loop unrolling" series. In review, Tim Arceri asked: > Why the hurt on Gen6+ is this something that should be in the late > optimisations pass? As far as I can tell, it's just because our scheduler is terrible. In all the fragment shaders that I looked at (some hurt shaders were from other stages), only one of the SIMD8 or SIMD16 version would be hurt. In many of those case, the other SIMD width is improved (e.g., shaders/closed/steam/brutal-legend/3990.shader_test). Often it looks like the scheduler decides to differently schedule a SEND the occurs somewhere early in the shader. Once that happens, everything is different. I looked at one vertex shader that was hurt (from Goat Simulator). In that case, both the floor and fract are used. The optimization eliminates the add, and it should allow better scheduling. In the area of the FRC and RNDD instructions, the scheduler does the right thing. However, later in the shader a MAD and and ADD get scheduled differently, and that makes it slightly worse. In light of this, I tried adding some "is_used_once" mark-up, and that did not fix all the cycles regressions. It also did a lot more harm than good on SKL (helped 82 vs. hurt 241). All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: `15437001` -> `15435259` (-0.01%) instructions in affected programs: 213651 -> 211909 (-0.82%) helped: 988 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 1.76 x̃: 1 helped stats (rel) min: 0.15% max: 11.54% x̄: 1.14% x̃: 0.59% 95% mean confidence interval for instructions value: -1.89 -1.63 95% mean confidence interval for instructions %-change: -1.23% -1.05% Instructions are helped. total cycles in shared programs: `383007378` -> `382997063` (<.01%) cycles in affected programs: `1650825` -> `1640510` (-0.62%) helped: 679 HURT: 302 helped stats (abs) min: 1 max: 348 x̄: 23.39 x̃: 14 helped stats (rel) min: 0.04% max: 28.77% x̄: 1.61% x̃: 0.98% HURT stats (abs) min: 1 max: 250 x̄: 18.43 x̃: 7 HURT stats (rel) min: 0.04% max: 25.86% x̄: 1.41% x̃: 0.53% 95% mean confidence interval for cycles value: -13.05 -7.98 95% mean confidence interval for cycles %-change: -0.86% -0.50% Cycles are helped. Iron Lake and GM45 had similar results. (GM45 shown) total instructions in shared programs: `5043616` -> `5043010` (-0.01%) instructions in affected programs: 119691 -> 119085 (-0.51%) helped: 432 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.10% max: 8.11% x̄: 0.66% x̃: 0.39% 95% mean confidence interval for instructions value: -1.58 -1.23 95% mean confidence interval for instructions %-change: -0.72% -0.59% Instructions are helped. total cycles in shared programs: `128139812` -> `128135762` (<.01%) cycles in affected programs: `3829724` -> `3825674` (-0.11%) helped: 602 HURT: 0 helped stats (abs) min: 2 max: 486 x̄: 6.73 x̃: 6 helped stats (rel) min: 0.02% max: 4.85% x̄: 0.19% x̃: 0.10% 95% mean confidence interval for cycles value: -8.40 -5.05 95% mean confidence interval for cycles %-change: -0.22% -0.16% Cycles are helped. Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	6 years ago
Ian Romanick	1edf67fc3f	intel/fs: Generate if instructions with inverted conditions Per-platform results were all over the place, so I have included all the results here. There is an important note at the bottom of the commit message. Skylake total instructions in shared programs: `15184683` -> `15184679` (<.01%) instructions in affected programs: 2786 -> 2782 (-0.14%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.05% max: 0.84% x̄: 0.44% x̃: 0.44% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.96% 0.07% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: `370961367` -> `370961173` (<.01%) cycles in affected programs: 205867 -> 205673 (-0.09%) helped: 5 HURT: 1 helped stats (abs) min: 1 max: 149 x̄: 39.60 x̃: 16 helped stats (rel) min: 0.02% max: 1.05% x̄: 0.45% x̃: 0.55% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -93.01 28.34 95% mean confidence interval for cycles %-change: -0.82% 0.08% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: `15465366` -> `15465362` (<.01%) instructions in affected programs: 2799 -> 2795 (-0.14%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.04% max: 0.84% x̄: 0.44% x̃: 0.44% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.96% 0.07% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: `410938419` -> `410938531` (<.01%) cycles in affected programs: 566028 -> 566140 (0.02%) helped: 18 HURT: 17 helped stats (abs) min: 1 max: 16 x̄: 3.50 x̃: 1 helped stats (rel) min: <.01% max: 1.05% x̄: 0.13% x̃: <.01% HURT stats (abs) min: 1 max: 12 x̄: 10.29 x̃: 12 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.09% 95% mean confidence interval for cycles value: 0.31 6.09 95% mean confidence interval for cycles %-change: -0.10% 0.05% Inconclusive result (%-change mean confidence interval includes 0). Haswell total instructions in shared programs: `13749760` -> `13749759` (<.01%) instructions in affected programs: 2241 -> 2240 (-0.04%) helped: 1 HURT: 0 total cycles in shared programs: `385398913` -> `385398363` (<.01%) cycles in affected programs: 554914 -> 554364 (-0.10%) helped: 31 HURT: 1 helped stats (abs) min: 1 max: 453 x̄: 18.00 x̃: 6 helped stats (rel) min: <.01% max: 0.25% x̄: 0.03% x̃: 0.05% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06% 95% mean confidence interval for cycles value: -45.88 11.51 95% mean confidence interval for cycles %-change: -0.05% -0.02% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total cycles in shared programs: `180663626` -> `180663881` (<.01%) cycles in affected programs: 472350 -> 472605 (0.05%) helped: 15 HURT: 30 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% HURT stats (abs) min: 8 max: 10 x̄: 9.00 x̃: 9 HURT stats (rel) min: 0.06% max: 0.14% x̄: 0.10% x̃: 0.10% 95% mean confidence interval for cycles value: 4.21 7.12 95% mean confidence interval for cycles %-change: 0.05% 0.08% Cycles are HURT. Sandy Bridge total cycles in shared programs: `154568664` -> `154569225` (<.01%) cycles in affected programs: 356486 -> 357047 (0.16%) helped: 1 HURT: 31 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02% HURT stats (abs) min: 4 max: 33 x̄: 18.16 x̃: 8 HURT stats (rel) min: 0.05% max: 0.23% x̄: 0.14% x̃: 0.10% 95% mean confidence interval for cycles value: 12.19 22.87 95% mean confidence interval for cycles %-change: 0.10% 0.16% Cycles are HURT. Iron Lake total instructions in shared programs: `8206589` -> `8206565` (<.01%) instructions in affected programs: 3024 -> 3000 (-0.79%) helped: 12 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.75% max: 0.83% x̄: 0.80% x̃: 0.80% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -0.82% -0.77% Instructions are helped. total cycles in shared programs: `187657428` -> `187656228` (<.01%) cycles in affected programs: 95748 -> 94548 (-1.25%) helped: 12 HURT: 0 helped stats (abs) min: 80 max: 120 x̄: 100.00 x̃: 100 helped stats (rel) min: 1.00% max: 1.66% x̄: 1.27% x̃: 1.21% 95% mean confidence interval for cycles value: -113.27 -86.73 95% mean confidence interval for cycles %-change: -1.43% -1.11% Cycles are helped. GM45 total instructions in shared programs: `5037569` -> `5037557` (<.01%) instructions in affected programs: 1521 -> 1509 (-0.79%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.75% max: 0.83% x̄: 0.79% x̃: 0.79% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -0.83% -0.75% Instructions are helped. total cycles in shared programs: `128101478` -> `128100758` (<.01%) cycles in affected programs: 52746 -> 52026 (-1.37%) helped: 6 HURT: 0 helped stats (abs) min: 120 max: 120 x̄: 120.00 x̃: 120 helped stats (rel) min: 1.16% max: 1.66% x̄: 1.41% x̃: 1.41% 95% mean confidence interval for cycles value: -120.00 -120.00 95% mean confidence interval for cycles %-change: -1.70% -1.12% Cycles are helped. This change has almost no effect right now. However, removing this patch (but leaving the patch "nir/algebraic: Replace a bcsel of a b2f with a b2f(!(a \|\| b))") after adding a patch that removes !(a < b) -> (a >= b) optimizations (like https://patchwork.freedesktop.org/patch/264787/) has the following results on Skylake: Skylake total instructions in shared programs: `15071022` -> `15089710` (0.12%) instructions in affected programs: `1022219` -> `1040907` (1.83%) helped: 1 HURT: 3937 helped stats (abs) min: 41 max: 41 x̄: 41.00 x̃: 41 helped stats (rel) min: 1.01% max: 1.01% x̄: 1.01% x̃: 1.01% HURT stats (abs) min: 1 max: 256 x̄: 4.76 x̃: 4 HURT stats (rel) min: 0.05% max: 11.18% x̄: 2.59% x̃: 2.60% 95% mean confidence interval for instructions value: 4.56 4.93 95% mean confidence interval for instructions %-change: 2.54% 2.64% Instructions are HURT. total cycles in shared programs: `369777134` -> `370092923` (0.09%) cycles in affected programs: `17516573` -> `17832362` (1.80%) helped: 115 HURT: 3624 helped stats (abs) min: 1 max: 1721 x̄: 81.18 x̃: 28 helped stats (rel) min: <.01% max: 10.74% x̄: 1.24% x̃: 0.65% HURT stats (abs) min: 1 max: 12640 x̄: 89.71 x̃: 54 HURT stats (rel) min: <.01% max: 28.24% x̄: 4.72% x̃: 4.52% 95% mean confidence interval for cycles value: 75.21 93.71 95% mean confidence interval for cycles %-change: 4.43% 4.64% Cycles are HURT. total spills in shared programs: 9450 -> 9442 (-0.08%) spills in affected programs: 166 -> 158 (-4.82%) helped: 2 HURT: 0 total fills in shared programs: 21115 -> 21094 (-0.10%) fills in affected programs: 438 -> 417 (-4.79%) helped: 2 HURT: 0 LOST: 1 GAINED: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	6 years ago
Ian Romanick	d40640efe8	nir/algebraic: Replace a bcsel of a b2f sources with a b2f(!(a \|\| b)) I have not investigated the result of doing this during code generation. That should be possible, but it would be a bit more effort. All Gen6+ platforms had nearly identical results. (Skylake shown) total cycles in shared programs: `370961508` -> `370961367` (<.01%) cycles in affected programs: 5174 -> 5033 (-2.73%) helped: 2 HURT: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: `8206587` -> `8206589` (<.01%) instructions in affected programs: 1325 -> 1327 (0.15%) helped: 0 HURT: 2 total cycles in shared programs: `187657422` -> `187657428` (<.01%) cycles in affected programs: 11566 -> 11572 (0.05%) helped: 0 HURT: 2 This change has almost no effect right now. However, removing this patch (but leaving the patch "intel/fs: Generate if instructions with inverted conditions") after adding a patch that removes !(a < b) -> (a >= b) optimizations (like https://patchwork.freedesktop.org/patch/264787/) has the following results on Skylake: Skylake total instructions in shared programs: `15071804` -> `15071806` (<.01%) instructions in affected programs: 640 -> 642 (0.31%) helped: 0 HURT: 2 total cycles in shared programs: `369914348` -> `369916569` (<.01%) cycles in affected programs: 27900 -> 30121 (7.96%) helped: 4 HURT: 15 helped stats (abs) min: 2 max: 112 x̄: 30.00 x̃: 3 helped stats (rel) min: 0.28% max: 12.28% x̄: 3.34% x̃: 0.40% HURT stats (abs) min: 2 max: 758 x̄: 156.07 x̃: 81 HURT stats (rel) min: 0.20% max: 74.30% x̄: 16.29% x̃: 16.91% 95% mean confidence interval for cycles value: 12.68 221.11 95% mean confidence interval for cycles %-change: 3.09% 21.23% Cycles are HURT. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	6 years ago

1 2 3 4 5 ...

108799 Commits (faf9e40f35ff14045265556acb1f9a2c4ab33c98) All Branches Search

108799 Commits (faf9e40f35ff14045265556acb1f9a2c4ab33c98)

All Branches