My understanding is that Broadwell retains the same SCS mechanism
that Haswell has, so even if the underlying issue with this format
is not fixed, the w/a will be applied in SCS rather than needing
shader code.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965/Gen7: Allow CMS layout for multisample textures
Now that all the pieces are in place, this should provide
a nice performance boost for apps using multisample textures.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965/Gen7: Include bitfield in the sampler key for CMS layout
We need to emit extra shader code in this case to sample the
MCS surface first; we can't just blindly do this all the time
since IVB will sometimes try to access the MCS surface even if
disabled.
V3: Use actual MSAA layout from the texture's mt, rather
then computing what would have been used based on the format.
This is simpler and less fragile - there's at least one case where
we might want to have a texture's MSAA layout change based on what
the app does (CMS SINT falling back to UMS if the app ever attempts
to render to it with a channel disabled.)
This also obsoletes V2's 1/10 -- compute_msaa_layout can now remain
an implementation detail of the miptree code.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
i965/Gen7: Move decision to allocate MCS surface into intel_mipmap_create
This gives us correct behavior for both renderbuffers (which previously
worked) and multisample textures (which would never get an MCS surface
allocated, even if CMS layout was selected)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously this was only done for render targets.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
i965/wm: Set copy of sample mask in 3DSTATE_PS correctly for Haswell
The bspec says:
"SW must program the sample mask value in this field so that it matches
with 3DSTATE_SAMPLE_MASK"
I haven't observed this to actually fix anything, but stumbled across it
while adding the rest of the support for CMS layout for multisample
textures.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Haswell needs a copy of the sample mask in 3DSTATE_PS; this makes that
convenient.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
glsl: Don't emit empty declaration warning for a struct specifier
The intention is that things like
int;
will generate a warning. However, we were also accidentally emitting
the same warning for things like
struct Foo { int x; };
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68838
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Aras Pranckevicius <aras@unity3d.com>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
For some reason this was left out when the version was changed...
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
i965/fs: Extend SEL peephole to handle only matching MOVs.
Before this patch, the following code would not be optimized even though
the first two instructions were common to the then and else blocks:
(+f0) IF
MOV dst0 ...
MOV dst1 ...
MOV dst2 ...
ELSE
MOV dst0 ...
MOV dst1 ...
MOV dst3 ...
ENDIF
This commit extends the peephole to handle this case.
No shader-db changes.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
i965/fs: New peephole optimization to generate SEL.
fs_visitor::try_replace_with_sel optimizes only if statements whose
"then" and "else" bodies contain a single MOV instruction. It also
could not handle constant arguments, since they cause an extra MOV
immediate to be generated (since we haven't run constant propagation,
there are more than the single MOV).
This peephole fixes both of these and operates as a normal optimization
pass.
fs_visitor::try_replace_with_sel is still arguably necessary, since it
runs before pull constant loads are lowered.
total instructions in shared programs: 1559129 -> 1545833 (-0.85%)
instructions in affected programs: 167120 -> 153824 (-7.96%)
GAINED: 13
LOST: 6
Reviewed-by: Paul Berry <stereotype441@gmail.com>
i965/fs: Let register_coalesce_2() eliminate self-moves.
This is the last thing that register_coalesce() still handled.
total instructions in shared programs: 1561060 -> 1560908 (-0.01%)
instructions in affected programs: 15758 -> 15606 (-0.96%)
Reviewed-by: Eric Anholt <eric@anholt.net>
parent_mem_ctx was unused since db47074a, so remove the two wrappers
around create() and make create() the constructor.
Reviewed-by: Eric Anholt <eric@anholt.net>
make_list is just a one-line wrapper and was confusingly called by
NULL objects. E.g., cur_if == NULL; cur_if->make_list(mem_ctx).
Reviewed-by: Eric Anholt <eric@anholt.net>