- Use a lookup table for log2.
- Compute (float) (1 << ipart) by tweaking with the exponent directly to
avoid integer overflow and float conversion.
- Also table negative exponents to avoid float division and branching.
- Implement util_fast_exp as function of util_fast_exp2.
tgsi: SSE2 optimized exp2, log2 and pow implementations.
Special care must be taken when calling compiler generated SSE2 functions
from the runtime generated SSE2: saving the xmm registers, and notify gcc
the stack is not 16byte aligned.
It would be more efficient to keep the stack pointer 16byte aligned, but
too hairy, and not consistent in all x86 architectures.
This has been tested in linux x86 and windows x86 userspace. Not tested on
x86-64 because it is broken for other reasons (even without this change).
cell: checkpoint: support for function calls in SPU shaders
Will be used for instructions like SIN/COS/POW/TEX/etc. The PPU needs to
know the address of some functions in the SPU address space. Send that
info to the PPU/main memory rather than patch up shaders on the SPU side.
Not finished/tested yet...
mesa: fix/simplify initialization of vertex/fragment program limits
Defaults for program length, num ALU instructions, num indirections, etc.
basically indicate no limit for software rendering. Driver should override
as needed.
The VBO module may use a real VBO or a malloc'd buffer for vertex storage.
Be careful not to accidentally replace the later with the former when drawing.
Check if using a real VBO at destroy time to prevent a double-free.
intel: Fix clears to depth_stencil texture attachments.
Broken by 0adfd1021035e90995a25ec5f20b736e55075d92, showed up as an assertion
failure in a software fallback in the shadowtex demo when we failed to
recognize the texture format.