brkho/mesa - mesa - Brian's Gitea

Commit graph

Author	SHA1	Message	Date
Kenneth Graunke	fdc5941972	mesa: Delete VERT_ATTRIB_GENERIC_NV and VERT_BIT_GENERIC_NV macros. These haven't been used since we deleted NV_vertex_program support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	12 years ago
Eric Anholt	0967c362bf	i965: Fix an inconsistency in the VUE map with gl_ClipVertex on gen4/5. We are intentionally not allocating a slot for gl_ClipVertex. But by leaving the bit set in the slots_valid, the fragment shader's computation of where varyings are in urb entry coming out of the SF would be off by one. Fixes rendering in Freespace 2 SCP, and improves rendering in TF2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62830 Tested-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> NOTE: This is a candidate for the 9.1 branch. Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	12 years ago
Eric Anholt	9dd19575d3	intel: Remove a never-taken debug print path. Alessandro Pignotti noted when I added this code in commit `0e723b135b` that it's in the else block for "if (busy)", so this debug print couldn't happen. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Brian Paul	c34bbe110d	st/mesa: add ir_lod case in GLSL->TGSI code to silence warning	12 years ago
Ian Romanick	e0131196ca	glsl: Generate masked write instead of vector array index for UBO lowering. When reading a column from a row-major matrix, we would slot the single value read into the vector using an ir_dereference_array of the vector with a constant index. This will (eventually) get optimized to a masked-write, so just generate the masked write in the first place. v2: Remove unused variable 'chan'. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net>	12 years ago
Ian Romanick	65cc68f430	glsl: Replace open-coded dot-product with dot Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net> Cc: Paul Berry <stereotype441@gmail.com>	12 years ago
Ian Romanick	dbf94d105a	glsl: Replace constant-index vector array accesses with swizzles Search and replace: ][0] -> ].x ][1] -> ].y ][2] -> ].z ][3] -> ].w Fixes piglit tests inverse-mat[234].{vert,frag}. These tests call the inverse function with constant parameters and expect proper constant folding to happen. My suspicion is that this patch papers over some bug in constant propagation involving array accesses. Either way, all of these accesses eventually get lowered to swizzles. This cuts out the middle man (saving a trivial amount of CPU). NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net> Cc: Paul Berry <stereotype441@gmail.com>	12 years ago
Ian Romanick	c770faea0a	glsl: Add missing bool case in glsl_type::get_scalar_type Since the case was missing, bvec4->get_scalar_type() would return bvec4, but vec4->get_scalar_type() would return float. NOTE: This is a candidate for stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	12 years ago
Kenneth Graunke	57a502518e	i965: Fix INTEL_DEBUG=shader_time for fragment shaders with discards. "discard" instructions generate HALT instructions which jump to a final HALT near the end of the shader. Previously, fs_generator created this final jump target when it saw the first FS_OPCODE_FB_WRITE, causing it to jump right before the FB write epilogue. This is normally good. However, INTEL_DEBUG=shader_time also has an epilogue section which records the final timestamp. The frontend emits IR for this just before FS_OPCODE_FB_WRITE. Unfortunately, this led to the following ordering: 1. Shader Time Epilogue 2. Final HALT (where discards jump) 3. Framebuffer Write Epilogue This meant that discarded pixels completely skipped the shader time epilogue, causing no ending timestamp to be written. This obviously led to inaccurate results. This patch adds a new FS_OPCODE_PLACEHOLDER_HALT in the IR stream just before any epilogue sections. This is where the final HALT should be generated, and makes it easy to ensure the correct ordering: 1. Final HALT 2. Shader Time Epilogue 3. Framebuffer Write Epilogue For shaders that don't discard, this opcode compiles away to nothing. The scheduler adds barrier dependencies to make sure that it doesn't get moved above any FS_OPCODE_DISCARD_JUMP instructions. One 8-wide shader in GLBenchmark 2.7 dropped from 2291.67 Gcycles to a mere 5.13 Gcycles. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	12 years ago
Eric Anholt	20d846ce8b	i965: Add names for all instructions to dump_instruction() in FS and VS. I'd previously added the minimum names to understand my dumps, but this makes dumps in general much easier to read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Matt Turner	ed6186f0e8	i965: Enable ARB_texture_query_lod. v2: Support Ironlake as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Matt Turner	b8aa9f7d3a	i965/fs: Generate LOD sampler message from ir_lod. v2: Support Ironlake as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Dave Airlie	110ca8b1f3	glsl: Implement ARB_texture_query_lod v2 [mattst88]: - Rebase. - #define GL_ARB_texture_query_lod to 1. - Remove comma after ir_lod in ir.h for MSVC. - Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp, opt_tree_grafting.cpp. - Rename textureQueryLOD to textureQueryLod, see https://www.khronos.org/bugzilla/show_bug.cgi?id=821 - Fix ir_reader of (lod ...). v3 [mattst88]: - Rename textureQueryLod to textureQueryLOD, pending resolution of Khronos 821. - Add ir_lod case to ir_to_mesa.cpp. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	13 years ago
Matt Turner	0e0ab8a071	i965/fs: Use measured Gen7 instruction timings on Gen6. Statistics (x = before, + = after; ASCII distribution plot omitted): before: N 23, Min 8083.78, Max 8287.83, Median 8205.55, Avg 8162.7461, Stddev 68.307951; after: N 23, Min 8107.56, Max 8358.74, Median 8224.33, Avg 8186.1765, Stddev 71.506301. No difference proven at 95.0% confidence. Reviewed-by: Eric Anholt <eric@anholt.net>	12 years ago
Matt Turner	f085b21b25	i965/fs: Increase and document MAD latency on Gen7. 58% of mad(8) generated in shader-db are reading registers from the same bank. Reviewed-by: Eric Anholt <eric@anholt.net>	12 years ago
Matt Turner	414ea2f560	i965/fs: Add LRP instruction latency. Set its latency to what happens to be the default floating-point instruction latency. One day we may want to handle latency based on register bank information. Reviewed-by: Eric Anholt <eric@anholt.net>	12 years ago
Matt Turner	ad4507b355	i965/fs: Add Haswell cycle timings Reviewed-by: Eric Anholt <eric@anholt.net>	12 years ago
Matt Turner	7997e59b65	i965: Note that write-after-write dependencies are blocking. Reviewed-by: Eric Anholt <eric@anholt.net>	12 years ago
Matt Turner	f91e371fee	i965: Reword comment about the shared mathbox. Reviewed-by: Eric Anholt <eric@anholt.net>	12 years ago
Roland Scheidegger	5f41e08cf3	gallivm: consolidate some half-to-float and r11g11b10-to-float code Similar enough that we can try to use shared code. v2: fix a stupid bug using wrong variable causing mayhem with Inf and NaNs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	12 years ago
Chris Forbes	4412f3bc13	mesa: provide default implementation of QuerySamplesForFormat Previously at least i915 failed to provide an implementation, but exposed ARB_internalformat_query anyway, leading to crashes when QueryInternalformativ was called. Default implementation just returns 1 for everything, so is suitable for any driver which does not support multisampling. V2: - Move from intel to core mesa. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Christoph Bumiller	ee624ced36	nvc0: implement MP performance counters There's more, but this only adds (most) of the counters that are handled directly by the shader processors. The other counter domains are not handled on the multiprocessor and there are no FIFO object methods for configuring them. Instead, they have to be programmed by the kernel via PCOUNTER, and the interface for this isn't in place yet.	12 years ago
Christoph Bumiller	480359bcf6	nvc0: enable compression when supported	12 years ago
Christoph Bumiller	25722e3454	nvc0: use NOUVEAU_GETPARAM_GRAPH_UNITS to get MP count	12 years ago
Christoph Bumiller	443b247878	nv50,nvc0: fix 3d blits, restore viewport after blit	12 years ago
Christoph Bumiller	090e73fc46	nv50: fix 3D render target setup	12 years ago
Brian Paul	b54ce3738a	llvmpipe: put .bmp extension on dumped image files	12 years ago
Brian Paul	e90c56bc4e	llvmpipe: add 'f' suffix to 1.0 in fixed_to_float()	12 years ago
Brian Paul	499aa3ddb4	draw: fix some build breakage when LLVM is not used Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62883 Tested-by: Vinson Lee <vlee@freedesktop.org>	12 years ago
Marek Olšák	9ad9141917	mesa: handle STATE_CURRENT_ATTRIB_MAYBE_VP_CLAMPED for parameter printing Reviewed-by: Brian Paul <brianp@vmware.com>	12 years ago
Kenneth Graunke	9fe47756b3	i965: Tidy shader time printing code by using printf's field widths. We can use %-6s%-6s rather than manually counting characters, resulting in much more readable code. This necessitates a small secondary change: using "total fs16" and "" now causes the "" string to be padded out to 6 characters, resulting in too much whitespace. Splitting it into "total" and "fs16" produces the same output as before. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Eric Anholt	6192e9b377	i965/vs: Include URB payload setup in shader_time. This much more accurately reflects the cost of the vertex shader, since the payload setup is often a significant fraction of the instructions in the VS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Eric Anholt	55feb19704	i965/vs: Use a send from a 2-register VGRF for shader time writes. This will let us emit it later, after we're setting up MRFs for the URB write. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Eric Anholt	130138030a	i965/vs: Teach copy propagation about sends from GRFs. This incidentally also teaches it a bit about gen6 math -- we now allow unswizzled, unmodified GRF temps as the sources for math. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Eric Anholt	c3a22d42a8	i965/vs: Prepare split_virtual_grfs() for the presence of SENDs from GRFs. v2: Fix silly bool handling, and don't add new tabs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Eric Anholt	47e795d861	i965/fs: Include everything but the final FB write in shader_time. Previously, if you just wrote a constant color to the render target, no time got noted at all. This is convenient for doing single-instruction timings, but not so much for actual program analysis. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Eric Anholt	5c5218ea61	i965/fs: Switch shader_time writes to using GRFs. This avoids conflicts between shader_time and FB writes, so we can include more of the program under our profiling. This does mean hiding more of the message setup from the optimizer, which doesn't have a way to handle multi-reg sends from GRFs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Eric Anholt	5c039543db	i965: Provide more detailed information to match shader_time to programs. Ken asked me the other day what -1 vs 0 vs 3 vs other meant in our shader names, and I realized that it was really unclear. I'd like to do even better, like noting which one is the clear shader, but that would require exposing the metaops struct to the driver. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Eric Anholt	d2ba1c24b4	i965: Track ARB program state along with GLSL state for shader_time. This will let us do much better printouts for non-GLSL programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Marek Olšák	a19f6e880a	st/dri: fix crash with HUD and single buffering	12 years ago
Marek Olšák	6b5dfa42c9	st/mesa: remove leftover printfs from ReadPixels Oops, I thought I had removed all debugging code.	12 years ago
Eric Anholt	eda434921d	i965/fs: Improve performance of copy propagation dataflow using bitsets. Reduces compile time of l4d2's slowest shader by 17.8% +/- 1.3% (n=10). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	12 years ago
Zack Rusin	d066133a76	llvmpipe/draw: Fix texture sampling in geometry shaders We weren't correctly propagating the samplers and sampler views when they were related to geometry shaders. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	12 years ago
Zack Rusin	186a6bffdd	draw/llvm: Cleanup the store debugging code Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	12 years ago
Zack Rusin	10964fc73d	draw: Allocate the output buffer for output primitives We were allocating the output buffer but using the input primitives. We need to allocate that buffer using the maximum number of output, not input, primitives. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	12 years ago
Zack Rusin	f20f981553	gallivm: Implement the breakc instruction Required by more modern examples. Like BRK but with a condition. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	12 years ago
Zack Rusin	b66ffcf2f8	gallivm: implement implicit primitive flushing TGSI semantics currently require an implicit endprim at the end of GS if an ending primitive hasn't been emitted. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	12 years ago
Zack Rusin	e96f4e3b85	gallium/llvm: implement geometry shaders in the llvm paths This commit implements code generation of the geometry shaders in the SOA paths. All the code is there but bugs are likely present. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	12 years ago
Zack Rusin	edcebe665d	draw/gs: Fetch more than one primitive per invocation Allows executing gs on up to 4 primitives at a time. Will also be required by the llvm code because there we definitely don't want to flush with just a single primitive. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	12 years ago
Zack Rusin	014c4d1cd7	draw/gs: Abstract the portions of GS that are tgsi specific To be able to add llvm paths later on we need to have some common interface for them. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	12 years ago


55879 commits (bdfbeb9633eb3f8cf1ad76723f6c3839e57a08a3)
