brkho/mesa - mesa - Brian's Gitea

Commit Graph

Author	SHA1	Message	Date
Rob Clark	cc6484f164	gitlab-ci/deqp: preserve caselists for blocks with fails Bump cts_runner to pick up the change to preserve .qpa and caselist .txt files for blocks of tests that contain fails, and preserve the caselist files. To reproduce fails that depend on order of running tests, these are useful. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	6 years ago
Rob Clark	59ed90fc74	gitlab-ci/deqp: preserve full list of unexpected results The log only shows the first 50, but preserve the full list for easier browsing. (Also move return of exit code to end which makes later patches in the series easier) Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	6 years ago
Rob Clark	5fa397a0d9	gitlab-ci: update deqp build so we can generate xml Update the deqp build to preserve testlog-to-xml and stylesheets, so deqp runner can extract .qpa for failed/flaked tests, and convert to xml. With this, we will be able to browse output from failed tests directly from the artifacts. The main motivation is to give better visibility into what happens with flaked tests, when it is difficult/impossible to reproduce the flake locally (i.e. when it happens once out of N million tests). But this should also make it easier to debug regressions that a MR triggers, especially when it is on hw that you don't have. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	6 years ago
Markus Wick	dba903ed0b	drirc: Enable glthread for dolphin/citra/yuzu. Dolphin: 75 fps -> 88 fps - Super Mario Galaxy Citra: 81 fps -> 91 fps - A Link Between Worlds Yuzu: 21 fps -> 27 fps - Super Mario Odyssey Dolphin still has many syncs because of glFenceSync and glClientWaitSync. Moving them to the dispatcher thread might yield another speedup. Yuzu uses a compatibility profile by default. This benchmark used the variable MESA_GL_VERSION_OVERRIDE=4.5FC to override this behavior. This profiling was done on a mobile i7-8550U CPU with i965. Signed-off-by: Markus Wick <markus@selfnet.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	6 years ago
Markus Wick	f4c61d422d	mesa/glthread: Implement ARB_multi_bind. Signed-off-by: Markus Wick <markus@selfnet.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	6 years ago
Rhys Perry	517728477c	aco: fix waitcnts for barriers at block ends Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `d1b9deee` ('aco: improve waitcnt insertion around loops') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	6 years ago
Zebediah Figura	a3c8bc10aa	Revert "draw: revert using correct order for prim decomposition." This reverts commit `f97b731c82`. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/250 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	6 years ago
Kenneth Graunke	acd36e488d	iris: Change keybox parenting For temporary lookups, just allocate out of the NULL ralloc context, so we don't have to edit the linked list of ralloc children to add it and then immediately remove it again. When uploading a new shader, allocate the keybox off the shader, so if we delete the shader the keybox also goes away. Less manual cleanup.	6 years ago
Ian Romanick	ca353285cb	nir/range_analysis: Make sure the table validation only occurs once All of the tables are static const, so they only need to be validated once. As noted in the previous commit, the compiler should be able to eliminate all of this code when the assertions would pass. Even with the help of the previous commit, this does not always occur. -Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5 -O1: No difference proven at 95.0% confidence. N=5 -O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5 Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Ian Romanick	ccefce46cb	nir/range-analysis: Add pragmas to help loop unrolling I was pretty liberal with these assertions when I wrote this code because I had assumed that GCC would unroll the loops, inline the lookups of static const arrays with now constant indices, and then eliminate all the actual assertions. It seems none of this happens even at -O3. Adding the pragmas helps encourage loop unrolling at some optimization levels. I tested by running shader-db with NIR_VALIDATE=false on a Core i7 Haswell desktop system. -Og: No difference proven at 95.0% confidence. N=5 -O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5 -O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5 v2: Add a _Pragma to an inner loop that was accidentally dropped during a rebase. Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Danylo Piliaiev	25a00b449f	glsl: Add varyings to "zero-init of uninitialized vars" workaround Varyings are similar to already handled cases. And "glsl_zero_init" name of the workaround already looks like it should include varyings. The issue was observed in GiMark subtest from GpuTest. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Alyssa Rosenzweig	4c43b354c3	pan/midgard: Use lower_tex_without_implicit_lod Just a bit of cleanup. lower_tex can do this lowering for us, which should also eliminate some special cases (one less thing to fix if we ever need texturing in tess/geom/etc, perhaps?) Closes #2133 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Christian Gmeiner	47c7c4263c	etnaviv: use a more self-explanatory param name Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	6 years ago
Christian Gmeiner	a949fa9d5d	etnaviv: drop not used config_out function param Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	6 years ago
Samuel Pitoiset	6f7ec6ee39	gitlab-ci: reduce the number of scons build It seems overkill to me to build scons 7x for every pipeline. Scons is now built with the oldest llvm version in scons-old-llvm and with the newest llvm version in scons. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Alyssa Rosenzweig	2e14fe6490	panfrost: Add lcra.c to Android.mk This was forgotten. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Alyssa Rosenzweig	bda2bb31b1	pan/midgard: Enable LOD lowering only on buggy chips T720 and earlier need this workaround, so check the quirk before lowering. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Alyssa Rosenzweig	68c2c7962a	pan/midgard: Describe quirk MIDGARD_BROKEN_LOD Corresponds to errata #10471, applies to T6xx and T720. Fixed in T760. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Alyssa Rosenzweig	d32d4acf68	pan/midgard: Add LOD bias/clamp lowering We fetch the info with the new intrinsic and lower with ALU ops for txl instructions, which seemingly correspond to "TEXGRD" instructions (what we call textureLod). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Alyssa Rosenzweig	4e07e7b232	pan/midgard: Implement load_sampler_lod_paramaters_pan We can stuff this information in as parametrized system values, like we currently do texture size and SSBO addresses. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Alyssa Rosenzweig	deaebc82a7	nir: Add load_sampler_lod_paramaters_pan intrinsic This loads in the <min_lod, max_lod, lod_bias> settings for a given sampler, which is necessary for lowering clamps/biases on certain Midgard chips. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	6 years ago
Markus Wick	b1156ecdf2	mapi/glapi: Generate sizeof() helpers instead of fixed sizes. Generating source code with a fixed size leads to issues with platform-dependent types. We either hard code 4 or 8 bytes there, and both are wrong on the other platform. So this patch solves this issue by generating e.g. sizeof(GLsizeiptr), which is valid both on 32 and on 64 bit platforms. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	6 years ago
Ian Romanick	e51eda99df	intel/fs: Disable conditional discard optimization on Gen4 and Gen5 The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of valid data and 31 bits of junk. Results of comparisons that are used as Boolean values need to have a fixup applied to generate the proper 0/~0 values. Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup code from being generated. This results in a sequence like: cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */ ... cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */ (+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD g8<8,8,1>UD instead of cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */ ... cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */ or(16) g4<1>UD g4<8,8,1>UD g8<8,8,1>UD (+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD 1UD I examined a couple of the shaders hurt by this change, and ALL of them would have been affected by this bug. :( Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836 Fixes: `0ba9497e66` ("intel/fs: Improve discard_if code generation") Iron Lake total instructions in shared programs: `8122757` -> `8122957` (<.01%) instructions in affected programs: 8307 -> 8507 (2.41%) helped: 0 HURT: 100 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76% 95% mean confidence interval for instructions value: 2.00 2.00 95% mean confidence interval for instructions %-change: 2.58% 3.03% Instructions are HURT. total cycles in shared programs: `188510100` -> `188510376` (<.01%) cycles in affected programs: 76018 -> 76294 (0.36%) helped: 0 HURT: 55 HURT stats (abs) min: 2 max: 12 x̄: 5.02 x̃: 4 HURT stats (rel) min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56% 95% mean confidence interval for cycles value: 4.33 5.71 95% mean confidence interval for cycles %-change: 0.60% 1.12% Cycles are HURT. GM45 total instructions in shared programs: `4994403` -> `4994503` (<.01%) instructions in affected programs: 4212 -> 4312 (2.37%) helped: 0 HURT: 50 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72% 95% mean confidence interval for instructions value: 2.00 2.00 95% mean confidence interval for instructions %-change: 2.45% 3.07% Instructions are HURT. total cycles in shared programs: `128928750` -> `128928982` (<.01%) cycles in affected programs: 67442 -> 67674 (0.34%) helped: 0 HURT: 47 HURT stats (abs) min: 2 max: 12 x̄: 4.94 x̃: 4 HURT stats (rel) min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53% 95% mean confidence interval for cycles value: 4.19 5.68 95% mean confidence interval for cycles %-change: 0.50% 1.00% Cycles are HURT.	6 years ago
Dylan Baker	bba44ef176	docs: update calendar, add news item and link release notes for 19.2.6	6 years ago
Dylan Baker	3531d74e82	docs: Add SHA256 sum for 19.2.6	6 years ago
Dylan Baker	f8070577a4	docs: Add release notes for 19.2.6	6 years ago
Marek Olšák	0b1452ffdd	nir/serialize: do ctx = {0} instead of manual initializations Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	6 years ago
Marek Olšák	ff71fae440	nir: strip as we serialize to remove the nir_shader_clone call Serializing stripped NIR is faster now. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	6 years ago
Christian Gmeiner	8acaab1aa7	etnaviv: add drm-shim Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Eric Engestrom	609a6ae23e	vk_util: drop duplicate formats in vk_format_map[] Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Jonathan Marek	773d640efa	turnip: implement UBWC This enables UBWC for everything except 3D textures. It breaks many image_to_image copies but those aren't important and it can be worked around later (image_to_image copy needs to be done in two steps, decode from the source format and then encode to the destination format). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Jonathan Marek	91fd83d142	freedreno/regs: update UBWC related bits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	6 years ago
Vinson Lee	6613a4a029	swr: Fix build with llvm-10.0. Fix build error after llvm-10.0 commit `1dfede3122` ("Move CodeGenFileType enum to Support/CodeGen.h"). ../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp: In member function ‘void JitManager::DumpAsm(llvm::Function*, const char*)’: ../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp:428:45: error: ‘CGFT_AssemblyFile’ is not a member of ‘llvm::TargetMachine’ *pMPasses, filestream, nullptr, TargetMachine::CGFT_AssemblyFile); ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	6 years ago
Rhys Perry	29d131d619	aco: fix copy+paste error Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	6 years ago
Rhys Perry	d1b9deeea8	aco: improve waitcnt insertion around loops Do this by repeating processing of loops until no progress is made. Totals from affected shaders: SGPRS: 162576 -> 162576 (0.00 %) VGPRS: 145228 -> 145228 (0.00 %) Spilled SGPRs: 668 -> 668 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: `15778640` -> `15771336` (-0.05 %) bytes LDS: 146 -> 146 (0.00 %) blocks Max Waves: 6087 -> 6087 (0.00 %) v2: use block_kind_loop_header/block_kind_loop_exit to repeat at the end of loops instead of at each continue Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	6 years ago
Rob Clark	1a8c49d76c	freedreno/perfctrs/fdperf: periodically restore counters When GPU is idle and suspends, the currently selected countables will all reset to the first one. So periodically restore the selected countables. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	6 years ago
Rob Clark	5a13507164	freedreno/perfcntrs: add fdperf Port from the envytools tree, but converted to use the .c tables for describing the perfcounter groups/countables, rather than using rnndec to get this at runtime from the register xml. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	6 years ago
Rob Clark	b2338a5b00	freedreno/perfcntrs/a6xx: remove RBBM counters Currently these are getting blocked by the kernel. These counters don't seem to be the most useful ones, and to use them we'd have to somehow probe the kernel by submitting cmdstream to write the selector regs and see if that triggers a GPU fault. So let's just skip them. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	6 years ago
Rob Clark	6a517b3079	freedreno/perfctrs/a2xx: move CP to be first group fdperf expects this, to find the ALWAYS_COUNT counter Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	6 years ago
Rob Clark	e35c4e6ad2	freedreno/perfcntrs: add accessor to get per-gen tables Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	6 years ago
Rob Clark	b21f03ae7e	freedreno/perfcntrs: move to shared location This should eventually be useful for VK_KHR_performance_query as well. And in the more near term, for fdperf. Attempt to not break android build is best-effort and untested. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	6 years ago
Rob Clark	6727114cba	freedreno/perfcntrs: remove gallium dependencies Prep work to move to a shared location. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	6 years ago
Rob Clark	3fb6aaf42e	freedreno/perfcntrs: small cleanup When we had one gen supporting performance counters, it made sense to have these builder macros in the .c file with the table. But time has come to de-duplicate. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	6 years ago
Dave Airlie	cce07ea835	nir: fix deref offset builder Use the correct bit size Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Dave Airlie	7325f6ac98	vtn/opencl: add clz support This is needed for OpenCL Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Dave Airlie	e3b21dfcb1	nouveau: request ufind_msb64 lowering in the frontend. This passes the piglit CL builtin-ulong-clz-1.0.generated.cl test. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com>	6 years ago
Dave Airlie	d0d96053e6	nir: add 64-bit ufind_msb lowering support. (v2) This adds the option to lower 64-bit ufind_msb opcodes. v2: use split_x/y removes component loops (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Dave Airlie	12913bcf86	spirv/nir/opencl: handle some multiply instructions. This adds support for some missing 24-bit and hi multiply variants. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Dave Airlie	5375c30234	spirv: get the correct type for function returns. This needs to be derived from the address format, not always 1/32. Suggested by Jason Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago
Dave Airlie	b62a925ad1	spirv: don't store 0 to cs.ptr_size for non kernel stages. cs is a union so storing this there is wrong. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	6 years ago

117892 commits (cc6484f1641ca905074ad48b7def844540075643)
