
nir/algebraic: Strip double negatives from comparison sources

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 17224623 -> 17224337 (<.01%)
instructions in affected programs: 32648 -> 32362 (-0.88%)
helped: 148
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.93 x̃: 2
helped stats (rel) min: 0.16% max: 2.74% x̄: 1.07% x̃: 1.08%
95% mean confidence interval for instructions value: -1.97 -1.89
95% mean confidence interval for instructions %-change: -1.15% -1.00%
Instructions are helped.

total cycles in shared programs: 360828714 -> 360826090 (<.01%)
cycles in affected programs: 347416 -> 344792 (-0.76%)
helped: 148
HURT: 26
helped stats (abs) min: 1 max: 426 x̄: 26.33 x̃: 18
helped stats (rel) min: 0.03% max: 15.10% x̄: 1.78% x̃: 1.41%
HURT stats (abs)   min: 2 max: 337 x̄: 48.96 x̃: 6
HURT stats (rel)   min: 0.04% max: 18.82% x̄: 2.15% x̃: 0.27%
95% mean confidence interval for cycles value: -23.78 -6.38
95% mean confidence interval for cycles %-change: -1.59% -0.79%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
tags/19.2-branchpoint
Ian Romanick, 6 years ago
Commit 281f20e26d
1 changed file with 28 additions and 0 deletions
src/compiler/nir/nir_opt_algebraic.py (+28, -0)

@@ -199,6 +199,20 @@ optimizations = [
    (('inot', ('ieq', a, b)), ('ine', a, b)),
    (('inot', ('ine', a, b)), ('ieq', a, b)),

+   # This helps some shaders because, after some optimizations, they end up
+   # with patterns like (-a < -b) || (b < a). In an ideal world, this sort of
+   # matching would be handled by CSE.
+   (('flt', ('fneg', a), ('fneg', b)), ('flt', b, a)),
+   (('fge', ('fneg', a), ('fneg', b)), ('fge', b, a)),
+   (('feq', ('fneg', a), ('fneg', b)), ('feq', b, a)),
+   (('fne', ('fneg', a), ('fneg', b)), ('fne', b, a)),
+   (('flt', ('fneg', a), -1.0), ('flt', 1.0, a)),
+   (('flt', -1.0, ('fneg', a)), ('flt', a, 1.0)),
+   (('fge', ('fneg', a), -1.0), ('fge', 1.0, a)),
+   (('fge', -1.0, ('fneg', a)), ('fge', a, 1.0)),
+   (('fne', ('fneg', a), -1.0), ('fne', 1.0, a)),
+   (('feq', -1.0, ('fneg', a)), ('feq', a, 1.0)),
+
    # 0.0 >= b2f(a)
    # b2f(a) <= 0.0
    # b2f(a) == 0.0 because b2f(a) can only be 0 or 1
@@ -1124,6 +1138,20 @@ late_optimizations = [

    (('~fge', ('fmin(is_used_once)', ('fadd(is_used_once)', a, b), ('fadd', c, d)), 0.0), ('iand', ('fge', a, ('fneg', b)), ('fge', c, ('fneg', d)))),

+   (('flt', ('fneg', a), ('fneg', b)), ('flt', b, a)),
+   (('fge', ('fneg', a), ('fneg', b)), ('fge', b, a)),
+   (('feq', ('fneg', a), ('fneg', b)), ('feq', b, a)),
+   (('fne', ('fneg', a), ('fneg', b)), ('fne', b, a)),
+   (('flt', ('fneg', a), -1.0), ('flt', 1.0, a)),
+   (('flt', -1.0, ('fneg', a)), ('flt', a, 1.0)),
+   (('fge', ('fneg', a), -1.0), ('fge', 1.0, a)),
+   (('fge', -1.0, ('fneg', a)), ('fge', a, 1.0)),
+   (('fne', ('fneg', a), -1.0), ('fne', 1.0, a)),
+   (('feq', -1.0, ('fneg', a)), ('feq', a, 1.0)),
+
+   (('ior', a, a), a),
+   (('iand', a, a), a),
+
    (('fdot2', a, b), ('fdot_replicated2', a, b), 'options->fdot_replicates'),
    (('fdot3', a, b), ('fdot_replicated3', a, b), 'options->fdot_replicates'),
    (('fdot4', a, b), ('fdot_replicated4', a, b), 'options->fdot_replicates'),
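
The comparison rewrites in this diff rely on negating both sources being the same as swapping them, including for NaN, infinities and signed zero. A minimal sketch of how that could be sanity-checked outside Mesa follows. It is not part of the commit: `OPS`, `RULES`, `SPECIALS`, `evaluate` and `rule_holds` are hypothetical names, variables are written as strings instead of Mesa's `a` and `b` objects, and Python's float comparisons are assumed to stand in for NIR's `flt`/`fge`/`feq`/`fne`.

```python
import itertools
import math
import operator

# Assumption: Python float comparisons model the NIR comparison opcodes.
OPS = {
    'flt': operator.lt,
    'fge': operator.ge,
    'feq': operator.eq,
    'fne': operator.ne,
}

# (search, replace) pairs copied from the diff; 'a' and 'b' are variables.
RULES = [
    (('flt', ('fneg', 'a'), ('fneg', 'b')), ('flt', 'b', 'a')),
    (('fge', ('fneg', 'a'), ('fneg', 'b')), ('fge', 'b', 'a')),
    (('feq', ('fneg', 'a'), ('fneg', 'b')), ('feq', 'b', 'a')),
    (('fne', ('fneg', 'a'), ('fneg', 'b')), ('fne', 'b', 'a')),
    (('flt', ('fneg', 'a'), -1.0), ('flt', 1.0, 'a')),
    (('flt', -1.0, ('fneg', 'a')), ('flt', 'a', 1.0)),
    (('fge', ('fneg', 'a'), -1.0), ('fge', 1.0, 'a')),
    (('fge', -1.0, ('fneg', 'a')), ('fge', 'a', 1.0)),
    (('fne', ('fneg', 'a'), -1.0), ('fne', 1.0, 'a')),
    (('feq', -1.0, ('fneg', 'a')), ('feq', 'a', 1.0)),
]

# Values chosen to cover signed zero, infinities and NaN.
SPECIALS = [0.0, -0.0, 1.0, -1.0, 0.5, 2.0, math.inf, -math.inf, math.nan]


def evaluate(expr, env):
    """Evaluate a nested-tuple expression with variables bound in env."""
    if isinstance(expr, str):      # variable name
        return env[expr]
    if isinstance(expr, float):    # literal constant
        return expr
    op, *srcs = expr
    vals = [evaluate(src, env) for src in srcs]
    if op == 'fneg':
        return -vals[0]
    return OPS[op](*vals)


def rule_holds(search, replace):
    """True if both sides agree for every pair of special values."""
    return all(
        evaluate(search, {'a': a, 'b': b}) == evaluate(replace, {'a': a, 'b': b})
        for a, b in itertools.product(SPECIALS, repeat=2)
    )


for search, replace in RULES:
    assert rule_holds(search, replace), (search, replace)

# A deliberately wrong rewrite (sources not swapped) is rejected.
assert not rule_holds(('flt', ('fneg', 'a'), ('fneg', 'b')), ('flt', 'a', 'b'))
```

This only samples a handful of values, so it is a spot check rather than a proof. It does show why the sources must be swapped: the last assertion fails to hold for the unswapped form.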
