Fixing this is a prereq for avoiding flagging all state at new
batch time. Eliminating that still causes problems, though (notably
glean logicOp fails on my GM965).
intel: When subdataing a busy buffer, use a temporary and blit in.
This cuts a massive number of waits in ET:QW, which uses a VBO ringbuffer.
Unfortunately it doesn't BufferData when wrapping back to 0, so we can't
be clever with tracking what's been initialized.
mesa: Improve the eliminate-move-use to work across multiple instructions.
This shaves more instructions off of the VS of my GL demo, but no
performance difference this time at n=6. This also fixes a regression
that was in this path, which is now piglit's glsl-vs-mov-after-deref.
intel: Don't check for context pointer to be NULL during extension init
Thanks to Chia-I Wu's changes to the extension function
infrastructure, we no longer have to tell the loader which extensions
the driver might enable. This means that intelInitExtensions will
never be called with a NULL context pointer. Remove all the NULL checks.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This didn't work for quad/quadstrips at all, and for all other primitive types
it only worked when they were unclipped.
Fix up the former in gs stage (could probably do without these changes and
instead set QuadsFollowProvokingVertexConvention to false), and the rest in
clip stage.
Previously, we'd load linearly from ParameterValues[0] for the constants,
though ParameterValues[1] may not equal ParameterValues[0] + 4. Additionally,
the STATE_VAL type paramters didn't get updated.
Fixes piglit vp-constant-array-huge.vpfp and ET:QW object locations.
Bug #23226.
Before, we weren't aggressive enough in checking for the start or end
of render-to-texture. In particular, if only the ctx->ReadBuffer had
texture attachments, we were treating that as a render-to-texture case.
This fixes a regression from commit 75bdbdd90b
"intel: Don't validate in a texture image used as a render target."
i965: avoid memsetting all the BRW_WM_MAX_INSN arrays for every compile.
For an app that's blowing out the state cache, like sauerbraten, the
memset of the giant arrays ended up taking 11% of the CPU even when only a
"few" of the entries got used. With this, the WM program compile drops back
down to 1% of CPU time.
Bug #24981 (bisected to BRW_WM_MAX_INSN increase).