freedreno: don't patch and re-emit same shader as much
New textures or vertex buffers don't always require patching and
re-emitting the shaders. So do a better job of figuring out when we
actually have to patch the shader.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
clang is supports most gcc options / extensions, with a some exceptions.
The biggest advantage of using clang is that compilation times are much
short.
One can tell scons to use clang when building by invoking it as
CC=clang CXX=clang++ scons libgl-xlib
We were assigning incorrect const register for immediates, and
potentially writing immediate const to the wrong location. This fixes
an incorrect-rendering bug with xonotic.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Set a few extra registers to make sure we are in proper state for
clearing. And also add some debug options to mark all state dirty in
clear and gmem operations to aid in debugging.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
There is a bit we need to set for 2D vs 3D fetch, to tell the hw whether
there are two or there valid input components.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
The previous approach of using the dst register as an intermediate
temporary doesn't work in a lot of cases. For example, if the dst
register is the same as one of the src registers.
For now, just simplify it and always allocate a new register to use as
an intermediate. In some cases this will result in more registers used
than required. I think the best solution would be to implement an
optimization pass to reduce the number of registers used, which would
also solve the problem we have now of not being able to use GPRs that
are assigned for TGSI_FILE_INPUT.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Opps, didn't notice that I had left it stubbed out.
Also, make things fail a bit more gracefully when things go wrong.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Really this should be set based on buffer format, not on color vs
depth/stencil. Probably there should be more formats that set the bit
as we add support for more render target formats.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
radeon/llvm: Fix segfault with a specifc libelf implementation
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
requires calling elf_version() prior to calling elf_memory()
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Lighter weight then using streamout. Only evergreen
and newer asics support embedded data as src with
CP DMA.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=63702
v2: add a comment that this is just a workaround
v3: fix typo in comment
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
mesa: Restore 78-column wrapping of license text in C-style comments.
The previous commit introduced extra words, breaking the formatting.
This text transformation was done automatically via the following shell
command:
$ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript
where 'vimscript' is a file containing:
/THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/\*\// !fmt -w 78 -p ' * '
:wq
Reviewed-by: Brian Paul <brianp@vmware.com>
mesa: Add "OR COPYRIGHT HOLDERS" to license text disclaiming liability.
This brings the license text in line with the MIT License as published
on the Open Source Initiative website:
http://opensource.org/licenses/mit-license.php
Generated automatically be the following shell command:
$ git grep 'THE AUTHORS BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \
sed -i 's/THE AUTHORS/THE AUTHORS OR COPYRIGHT HOLDERS/' {}
This introduces some wrapping issues, to be fixed in the next commit.
Reviewed-by: Brian Paul <brianp@vmware.com>
gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
Squashed commit of the following:
commit 04c5fa2cbb
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue Apr 23 17:37:18 2013 +0100
gallium: s/lower_left_origin/bottom_edge_rule/
commit 4dff4f64fa
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue Apr 23 17:35:04 2013 +0100
gallium: Move diagram to docs.
commit 442a63012c
Author: James Benton <jbenton@vmware.com>
Date: Fri May 11 17:50:55 2012 +0100
gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
This change is necessary to achieve correct results when using OpenGL
FBOs.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
r600g: initialize CMASK and HTILE with the GPU using streamout
This fixes a crash when a resource cannot be mapped to the CPU's address space
because it's too big.
This puts a global pipe_context in r600_screen, which is guarded by a mutex,
so that we can use pipe_context when there isn't one around.
Hopefully our multi-context support is solid.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
NOTE: This is a candidate for the 9.1 branch.
softpipe: fix streamout with an emptry geometry shader
Same approach as in the llvmpipe, if the geometry shader is
null and we have stream output then attach it to the vertex
shader right before executing the draw pipeline.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
There will be a new IR for a3xx, which has a very different shader ISA
(more scalar oriented). So rename to avoid conflicts later when I start
adding a3xx support to the gallium driver.
Signed-off-by: Rob Clark <Rob Clark robdclark@freedesktop.org>
The standalone shader assembler needed some meta-data to know about
attributes/varyings/etc, to do the shader linkage. We don't need these
parts with gallium/tgsi, so just get rid of it.
Signed-off-by: Rob Clark <Rob Clark robdclark@freedesktop.org>
While initially that opcode probably was meant for something along the
lines of sm3 break_comp it has never worked that way (not even the
argument count was right) and now the opcode has quite different
semantics so just remove it. (Discovered by Jose Fonseca)
Most test pass, issue are with border color and swizzle.
Based on ircnick<maelcum> patch.
v2: Restaged commit hunk
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
v2: Remove left over code
v3: Restage properly the commit so hunk of first one are not in
second one.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Not all are supported as render targets.
The state tracker fallback of using RGBA instead of RGBX currently
fails for blending, we could work around this by clearing their alpha
to 1 and modifying the color mask to disable writing alpha.
st/mesa: optionally apply texture swizzle to border color v2
This is the only sane solution for nv50 and nvc0 (really, trust me),
but since on other hardware the border colour is tightly coupled with
texture state they'd have to undo the swizzle, so I've added a cap.
The dependency of update_sampler on the texture updates was
introduced to avoid doing the apply_depthmode to the swizzle twice.
v2: Moved swizzling helper to u_format.c, extended the CAP to
provide more accurate information.
i915g: Release old fragment shader sampler views with current pipe
We were trying to use a destroy method from a deleted context.
This fix is based on what's in the svga driver.
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
softpipe/so: use the correct variable for reporting stream out
we were using the wrong vars, reporting incorrect stream output
statistics.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
draw: implement pipeline statistics in the draw module
This is a basic implementation of the pipeline statistics in the
draw module. The interface is similar to the stream output statistics
and also requires that the callers explicitly enable it.
Included is the implementation of the interface in llvmpipe and
softpipe. Only softpipe enables the pipeline statistics capability
though because llvmpipe is lacking gathering of the fragment shading
and rasterization statistics.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
TGSI_OPCODE_IF condition had two possible interpretations:
- src.x != 0.0f
- Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false either for
vertex and fragment shaders
- gallivm/llvmpipe
- postprocess
- vl state tracker
- vega state tracker
- most old drivers
- old internal state trackers
- many graw examples
- src.x != 0U
- Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both
vertex and fragment shaders
- tgsi_exec/softpipe
- r600
- radeonsi
- nv50
And drivers that use draw module also were a mess (because Mesa would
emit float IFs, but draw module supports native integers so it would
interpret IF arg as integers...)
This sort of works if the source argument is limited to float +0.0f or
+1.0f, integer 0, but would fail if source is float -0.0f, or integer in
the float NaN range. It could also fail if source is integer 1, and
hardware flushes denormalized numbers to zero.
But with this change there are now two opcodes, IF and UIF, with clear
meaning.
Drivers that do not support native integers do not need to worry about
UIF. However, for backwards compatibility with old state trackers and
examples, it is advisable that native integer capable drivers also
support the float IF opcode.
I tried to implement this for r600 and radeonsi based on the surrounding
code. I couldn't do this for nouveau, so I just shunted IF/UIF
together, which matches the current behavior.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
v2:
- Incorporate Roland's feedback.
- Fix r600_shader.c merge conflict.
- Fix typo in radeon, spotted by Michel Dänzer.
- Incorporte Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float)
properly in nv50/ir.
r600g: Workaround for a harware bug with nested loops on Cayman
There is a hardware bug on Cayman where a BREAK/CONTINUE followed by
LOOP_STARTxxx for nested loops may put the branch stack into a state
such that ALU_PUSH_BEFORE doesn't work as expected. Workaround this
by replacing the ALU_PUSH_BEFORE with a PUSH + ALU
Fixes piglit tests EXT_transform_feedback/order*
v2: Use existing loop count and improve comment
v3: [Vadim Girlin] Set jump address for PUSH instructions
NOTE: This is a candidate for the 9.1 branch
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>