brkho/mesa - mesa - Brian's Gitea

Commit Graph

Author	SHA1	Message	Date
Iago Toral Quiroga	5dfb085ff3	glsl: Improve precision of mod(x,y) Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a downside though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965 (operation: precision error): mod(-1.951171875, 1.9980468750): 0.0000000447; mod(121.57, 13.29): 0.0000023842; mod(3769.12, 321.99): 0.0000762939; mod(3769.12, 1321.99): 0.0001220703; mod(-987654.125, 123456.984375): 0.0160663128; mod( 987654.125, 123456.984375): 0.0312500000. This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages: mod(x,y) = x - y * floor(x/y) This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation this new implementation would come at no additional cost. v2 (Ian Romanick) - Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code. Fixes the following 16 dEQP tests: dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_* dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_* Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	10 years ago
Eduardo Lima Mitev	c27d23f0c8	mesa: Allow querying for GL_PRIMITIVE_RESTART_FIXED_INDEX under GLES 3 GLES 3.0.0 spec introduces context state PRIMITIVE_RESTART_FIXED_INDEX (2.8.1 Transferring Array Elements, page 26) which is not currently possible to query using glGet() funcs. Fixes 4 dEQP tests: dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getboolean * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger64 * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getfloat Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	10 years ago
Iago Toral Quiroga	ec7dcaf578	glsl: can't have 'const' qualifier used with struct or interface block members Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	10 years ago
Iago Toral Quiroga	5d655a43e6	glsl: interface blocks must be declared at global scope Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	10 years ago
Iago Toral Quiroga	6dd346c232	i965: Fix negate with unsigned integers For code such as: uint tmp1 = uint(in0); uint tmp2 = -tmp1; float out0 = float(tmp2); We produce code like: mov(8) g5<1>.xF -g9<4,4,1>.xUD which does not produce correct results. This code produces the results we would expect if tmp1 and tmp2 were signed integers instead. It seems that a similar problem was detected and addressed when using negations with unsigned integers as part of conditionals, but it looks like the problem has a wider impact than that. This patch fixes the problem by preventing copy-propagation of negated UD registers in all scenarios, not only in conditionals. Fixes the following 24 dEQP tests: dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uint_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec2_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec3_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec4_ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	10 years ago
Jose Fonseca	5b941ce857	scons: Fix Windows builds with LLVM 3.5. LLVMBitReader dependency was introduced, as pointed out by Rob Conde.	10 years ago
Ilia Mirkin	bc321db75b	st/mesa: add EXT_polygon_offset_clamp support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	10 years ago
Ilia Mirkin	7c211a12aa	gallium: add a cap to determine whether the driver supports offset_clamp Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	10 years ago
Ilia Mirkin	2ce29ce5af	i965/gen6+: enable EXT_polygon_offset_clamp Replace the hard-coded 0's with the context clamp value. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Ilia Mirkin	81998dda63	mesa: add support for GL_EXT_polygon_offset_clamp Nothing enables the extension yet, but the values are now available. The spec calls for it to only be exposed for GL 3.3+, which is core-only in mesa. Instead we allow any driver to enable it, including in a compat context for any GL version. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	10 years ago
Ilia Mirkin	83321009de	glapi: add GL_EXT_polygon_offset_clamp Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	10 years ago
Kenneth Graunke	0f06f12c11	glsl: Pick ast_conditional branch regardless of op1/2 being constant. If the ?: operator's condition is a constant value, and both branches were pure expressions, we can just make the resulting value one or the other. Previously, we only did this if op[1] and op[2] were also constant values - but there's no actual reason for that restriction. No changes in shader-db, probably because we usually optimize this later anyway. But it does make us generate less stupid code up front. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	11 years ago
Kenneth Graunke	534f07ee85	i965: Add a better PRM citation for the IMS dimension mangling. Paul originally had to reverse engineer these formulas based on the description about how the sampler works. The description here is not the easiest to follow - especially given that it's from the Sandybridge era, when the hardware only did 4x multisampling. Jordan and I recently found another part of the documentation where they simply state that IMS dimensions must be adjusted by a set of formulas. Quoting this section provides an easy to follow explanation for the code, including 2x/4x/8x/16x. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	11 years ago
Laura Ekstrand	e9b86cb5d6	swrast: Whitespace fixes. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	10 years ago
Laura Ekstrand	e187c2f543	DD: Refactor BlitFramebuffer. In preparation for glBlitNamedFramebuffer, the DD table function BlitFramebuffer needs to accept two arbitrary framebuffer objects rather than assuming ctx->ReadBuffer and ctx->DrawBuffer. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	10 years ago
Laura Ekstrand	ad2c64abbd	GL: Update glext.h to Khronos Revision 29537. Khronos Revision 29537 fixes ARB_direct_state_access function prototypes that had GLsizei where they should have had GLsizeiptr. This mainly affects functions related to buffer objects. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	10 years ago
Jason Ekstrand	2cebaac479	i965: Don't use tiled_memcpy to download from RGBX or BGRX surfaces Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	10 years ago
Neil Roberts	af8fd694d4	dir-locals.el: Don't set variables for non-programming modes This limits the style changes to modes inherited from prog-mode. The main reason to do this is to avoid setting fill-column for people using Emacs to edit commit messages because 78 characters is too many to make it wrap properly in git log. Note that makefile-mode also inherits from prog-mode so the fill column should continue to apply there. v2: Apply to all the .dir-locals.el files, not just the one in the root directory. Acked-by: Michel Dänzer <michel.daenzer@amd.com>	10 years ago
Iago Toral Quiroga	68155e5a36	i965: Fix intel_miptree_copy_teximage for GL_TEXTURE_1D_ARRAY For GL_TEXTURE_1D_ARRAY targets we store the depth of the array in the Height field and leave Depth=1 in the underlying texture object. When we call intel_miptree_copy_teximage in the process of re-creating a miptree (possibly because the number of miplevels has changed) we didn't account for this, so we were only copying texture images for the first slice. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	10 years ago
Eric Anholt	753c327151	vc4: Kill a bunch of color write calculation when colormask is all off. I could have done this in the bit that generates the ANDs and ORs, but it's probably generally useful. Sadly, I still need this even if I move to NIR, because I can't yet express my read of the destination color in NIR, which I would need to move my blend/logicop/colormask handling into NIR. total uniforms in shared programs: 13497 -> 13455 (-0.31%) uniforms in affected programs: 101 -> 59 (-41.58%) total instructions in shared programs: 40797 -> 40296 (-1.23%) instructions in affected programs: 1639 -> 1138 (-30.57%)	10 years ago
Fredrik Höglund	0508032413	docs: Update ARB_direct_state_access Mark vertex array objects as started.	10 years ago
Martin Peres	9272022353	doc: break down ARB_direct_state_access in GL3.txt A student was wondering what was going on + I started working on it too. CC: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Fredrik Höglund <fredrik@kde.org>	10 years ago
Eric Anholt	12ebd7e20e	vc4: Dump the VPM read index in QIR disasm. Since the VPM reads have to be in order, it's useful to see their indices in the dump.	10 years ago
Jason Ekstrand	6094619c02	i965/pixel_read: Don't try to do a tiled_memcpy from a multisampled buffer The GL spec guarantees that glGetTexImage will never get a multisampled texture, but this is not true for glReadPixels. If we get a multisampled buffer, we have to do a multisample resolve on it before we can pull the data down for the user. Since this isn't practical to handle in tiled_memcpy, we just fall back to the other paths that can handle this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Francisco Jerez	11f5d8a5d4	i965: Enable L3 caching of buffer surfaces. And remove the mocs argument of the emit_buffer_surface_state vtbl hook. Its semantics vary greatly from one generation to another, so it kind of encourages the caller to pass 0 which is the only valid setting across generations. After this commit the hardware-specific code decides what the best cacheability settings are for buffer surfaces, just like we do for textures. This together with some additional changes coming is expected to improve performance of pull constants, buffer textures, atomic counters and image objects on Gen7 and up. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
José Fonseca	11a955aef4	egl: Pass the correct X visual depth to xcb_put_image(). The dri2_x11_add_configs_for_visuals() function happily matches a 32 bits EGLconfig with a 24 bits X visual. However it was passing 32bits depth to xcb_put_image(), making X server unhappy: https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911 Cc: "10.4" <mesa-stable@lists.freedesktop.org>	10 years ago
Jason Ekstrand	5c31184cf5	intel/pixel_read: Properly flip the results for window system buffers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841 Reviewed-by: Chad Versace <chad.versace@intel.com>	10 years ago
Jason Ekstrand	837a4c42a6	i965/tiled_memcpy: Support a signed linear pitch Reviewed-by: Chad Versace <chad.versace@intel.com>	10 years ago
Jason Ekstrand	7cc3bb2318	main: Add STENCIL_INDEX formats to base_tex_format This fixes a bug on BDW when our meta-based stencil blit path assert-fails due to an invalid internal format even though we do support the ARB_stencil_texturing extension. Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Jason Ekstrand	16875bc5cd	teximage: Don't indent switch cases No functional change. Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Brian Paul	b930ef1ce8	mesa: remove some dead display list code The size of a Node is always four bytes so no need for the old code that was used when sizeof(Node)==8. Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Brian Paul	20bc72b791	mesa: remove stale comment in dlist.c code sizeof(Node) is always 4 bytes. Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Brian Paul	613974b774	mesa: s/union gl_dlist_node/Node/ in dlist.c code Just minor clean-up. Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Brian Paul	53b01938ed	mesa: fix display list 8-byte alignment issue The _mesa_dlist_alloc() function is only guaranteed to return a pointer with 4-byte alignment. On 64-bit systems which don't support unaligned loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code. The solution is to add a new _mesa_dlist_alloc_aligned() function which will return a pointer to an 8-byte aligned address on 64-bit systems. This is accomplished by inserting a 4-byte NOP instruction in the display list when needed. The only place this actually matters is the VBO code where we need to allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte aligned (just as if it were malloc'd). The gears demo and others hit this bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662 Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	10 years ago
José Fonseca	fbc3e030e6	util/u_atomic: Provide a _InterlockedCompareExchange8 for older MSVC. Fixes build with Windows SDK 7.0.7600. Tested with u_atomic_test, both on x86 and x86_64. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	10 years ago
José Fonseca	d7f2dfb67e	util/u_atomic: Use _Interlocked* intrinsics for non 64bits. The intrinsics are universally available, whereas older Windows SDKs (e.g. 7.0.7600) don't have the non-intrinsic entrypoint. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	10 years ago
Neil Roberts	a7eec6d620	i965/skl: Force a BINDING_TABLE_POINTER_* after push constant command According to the SKL bspec the 3DSTATE_CONSTANT_* commands only take effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_* command. This patch just makes it set the BRW_NEW_SURFACES state when uploading the push constants to ensure the binding tables will be updated. This fixes the fbo-blending-formats Piglit test and possibly others. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Topi Pohjolainen	083fb215e1	meta: Don't write depth when decompressing tex-images Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Topi Pohjolainen	c49c750579	meta: Don't write depth when generating miptrees Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Topi Pohjolainen	941aced635	meta/blit: Compile programs with and without depth When color buffers alone are concerned the depth is not needed. No regression on BDW where meta blit is used instead of blorp. I also disabled blorp temporarily for fbo-blits on IVB and saw no regressions there either. I also compared several graphics benchmarks on BDW and saw neither regressions nor improvements. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Topi Pohjolainen	97caf5fa04	meta/blit: Write depth only when asked for Implementing an idea from Ken, on i965 the shader program for 2D blits becomes significantly simpler. Before: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted }; pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted }; send(8) g2<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q }; mov(8) g123<1>F g2<8,8,1>F { align1 1Q compacted }; mov(8) g124<1>F g3<8,8,1>F { align1 1Q compacted }; mov(8) g125<1>F g4<8,8,1>F { align1 1Q compacted }; mov(8) g126<1>F g5<8,8,1>F { align1 1Q compacted }; mov(8) g127<1>F g2<8,8,1>F { align1 1Q compacted }; nop ; sendc(8) null g123<8,8,1>F render RT write SIMD8 LastRT Surface = 0 mlen 5 rlen 0 { align1 1Q EOT }; After: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted }; pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted }; send(8) g124<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q }; sendc(8) null g124<8,8,1>F render RT write SIMD8 LastRT Surface = 0 mlen 4 rlen 0 { align1 1Q EOT }; v2 (Matt): Removed unintended white-space change Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Topi Pohjolainen	4c157d34c0	meta/blit: Add plumbing for shaders without depth Currently all blit programs are unconditionally compiled with gl_FragDepth. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	10 years ago
Jason Ekstrand	604ae33c8b	nir/opt_algebraic: Add some constant bcsel reductions total instructions in shared programs: 5998190 -> 5997603 (-0.01%) instructions in affected programs: 54276 -> 53689 (-1.08%) helped: 293 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Jason Ekstrand	7f19cd5a56	nir/opt_algebraic: Add some boolean simplifications total instructions in shared programs: 5998321 -> 5998287 (-0.00%) instructions in affected programs: 4520 -> 4486 (-0.75%) helped: 8 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Jason Ekstrand	70273c5cd5	nir/algebraic: Support specifying variable as constant or by type Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Jason Ekstrand	81f77e4f3a	nir/algebraic: Fail to compile of a variable is used in a replace but not the search Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Jason Ekstrand	026b5cc792	nir/search: Allow for matching variables based on types This allows you to match on an unknown value but only if it is of a given type. 90% of the uses of this are for matching only booleans, but adding the generality of arbitrary types is no more complex. nir_algebraic.py doesn't handle this yet but that's ok because the C language will ensure that the default type on all variables is void. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Jason Ekstrand	d8999bcdce	nir/search: Add support for matching unknown constants There are some algebraic transformations that we want to do but only if certain things are constants. For instance, we may want to replace a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant. While this generates more instructions, some of it will get constant folded. nir_algebraic.py doesn't handle this yet, but that's ok because the C language will make sure that false is the default for now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Jason Ekstrand	5ab1489ae6	nir: Add an invalid type This allows us to indicate a concept of an invalid type. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	10 years ago
Roland Scheidegger	f01e8d3ba5	gallium/docs: fix docs wrt ARL/ARR/FLR since the address reg holds integer values, ARL/ARR do an implicit float-to-int conversion, so clarify that. Thus it is also incorrect to say that FLR really does the same as ARL. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	10 years ago
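
Commit 5dfb085ff3 in the list above contrasts two lowerings of mod(x,y): the old y * fract(x/y) and the new x - y * floor(x/y). The following is a rough CPU-side sketch of both formulas, not Mesa code. The helper names (`f32`, `mod_fract`, `mod_floor`) are made up for illustration, and rounding every intermediate result to binary32 is an assumption about how a GPU might evaluate them. Actual hardware can use reciprocals or fused multiply-add, so the output is not expected to reproduce the i965 figures quoted in the commit message.

```python
import math
import struct


def f32(value: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def mod_fract(x: float, y: float) -> float:
    """Old lowering: mod(x, y) = y * fract(x / y), rounded to binary32 at each step."""
    q = f32(f32(x) / f32(y))
    fract = f32(q - math.floor(q))
    return f32(f32(y) * fract)


def mod_floor(x: float, y: float) -> float:
    """New lowering: mod(x, y) = x - y * floor(x / y), rounded to binary32 at each step."""
    q = f32(f32(x) / f32(y))
    product = f32(f32(y) * math.floor(q))
    return f32(f32(x) - product)


# Print both results for two operand pairs taken from the commit message.
for x, y in [(121.57, 13.29), (987654.125, 123456.984375)]:
    print(x, y, mod_fract(x, y), mod_floor(x, y))
```

Comparing the two printed columns against a higher-precision reference (for example the same formula evaluated in plain binary64) gives a feel for how each lowering behaves as the operands grow.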

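Commit 6dd346c232 in the list above concerns negating an unsigned integer before converting it to float. The sketch below only illustrates the arithmetic at stake and is not the i965 fix itself. `negate_u32` and `as_i32` are hypothetical helper names, and Python's float is binary64, whereas a GLSL float would round the large value to binary32.

```python
MASK32 = 0xFFFFFFFF


def negate_u32(value: int) -> int:
    """Unary minus on a 32-bit unsigned integer: the result wraps modulo 2**32."""
    return (-value) & MASK32


def as_i32(bits: int) -> int:
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


tmp1 = 1
tmp2 = negate_u32(tmp1)

# Unsigned semantics: the negated value is a large positive number.
expected = float(tmp2)

# Signed interpretation of the same bits, which is the kind of result
# the commit message describes as incorrect.
signed_view = float(as_i32(tmp2))

print(expected, signed_view)
```

The two printed values differ by 2**32, which is why treating the negated register as signed changes the converted float.
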
67832 Commits (5dfb085ff325df3dbefda515f06106469babbefc)
