瀏覽代碼

ac: only load used channels when sampling buffer views

This allows to reduce the number of dwords that are loaded
with buffer_load_format_xyzw. For example, when the only used
channel is 1, the driver will emit buffer_load_format_x instead.

Shader stats for DOW3 (with some local hacky scripts for SPIRV):

143 shaders in 143 tests
Totals:
SGPRS: 5344 -> 5352 (0.15 %)
VGPRS: 3476 -> 3452 (-0.69 %)
Spilled SGPRs: 30 -> 29 (-3.33 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 269860 -> 269808 (-0.02 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1267 -> 1272 (0.39 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
tags/18.1-branchpoint
Samuel Pitoiset 7 年之前
父節點
當前提交
310d17fcf1
共有 1 個檔案被更改,包括 4 行新增1 行删除
  1. 4
    1
      src/amd/common/ac_nir_to_llvm.c

+ 4
- 1
src/amd/common/ac_nir_to_llvm.c 查看文件

@@ -2315,11 +2315,14 @@ static LLVMValueRef build_tex_intrinsic(struct ac_nir_context *ctx,
struct ac_image_args *args)
{
if (instr->sampler_dim == GLSL_SAMPLER_DIM_BUF) {
unsigned mask = nir_ssa_def_components_read(&instr->dest.ssa);

return ac_build_buffer_load_format(&ctx->ac,
args->resource,
args->addr,
ctx->ac.i32_0,
4, true);
util_last_bit(mask),
true);
}

args->opcode = ac_image_sample;

Loading…
取消
儲存