cell: Fixed usage of MAX_INSTRUCTIONS to use the new MAX_PROGRAM_INSTRUCTIONS instead of the old MAX_NV_XXX definitions so that the Cell TGSI fragment program generator works again.
- Use a lookup table for log2.
- Compute (float) (1 << ipart) by tweaking the exponent directly to
avoid integer overflow and float conversion.
- Also tabulate negative exponents to avoid float division and branching.
- Implement util_fast_exp as a function of util_fast_exp2.
tgsi: SSE2 optimized exp2, log2 and pow implementations.
Special care must be taken when calling compiler-generated SSE2 functions
from the runtime-generated SSE2 code: save the xmm registers, and notify gcc
that the stack is not 16-byte aligned.
It would be more efficient to keep the stack pointer 16-byte aligned, but
that is too hairy and not consistent across all x86 architectures.
This has been tested on Linux x86 and Windows x86 userspace. Not tested on
x86-64 because it is broken for other reasons (even without this change).
cell: checkpoint: support for function calls in SPU shaders
Will be used for instructions like SIN/COS/POW/TEX/etc. The PPU needs to
know the address of some functions in the SPU address space. Send that
info to the PPU/main memory rather than patch up shaders on the SPU side.
Not finished/tested yet...
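
One way to picture the function info sent to the PPU is a small name-to-address table that the PPU-side code generator consults when emitting a call. This is only a sketch: the struct layout, the sizes and every sketch_* name are assumptions, not the driver's actual data structures.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_FUNCS 8
#define SKETCH_NAME_LEN  16

/* Hypothetical table filled in on the SPU side and copied to main memory. */
struct sketch_spu_function_info {
    uint32_t num;                                   /* number of valid entries */
    char     names[SKETCH_MAX_FUNCS][SKETCH_NAME_LEN];
    uint32_t addrs[SKETCH_MAX_FUNCS];               /* SPU-space addresses */
};

/* PPU-side lookup: returns the SPU-space address of the named function,
 * or 0 if the name is not in the table. */
static uint32_t
sketch_lookup_function(const struct sketch_spu_function_info *info,
                       const char *name)
{
    uint32_t i;
    for (i = 0; i < info->num && i < SKETCH_MAX_FUNCS; i++) {
        if (strcmp(info->names[i], name) == 0)
            return info->addrs[i];
    }
    return 0;
}
```

With such a table available, the PPU can emit call instructions with the final address already in place, so the SPU never has to patch a shader after receiving it.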
mesa: fix/simplify initialization of vertex/fragment program limits
Defaults for program length, num ALU instructions, num indirections, etc.
basically indicate no limit for software rendering. Drivers should override
as needed.
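
The default-then-override pattern described here might look roughly like the following. The struct, its field names and the "no limit" value are all hypothetical and do not reflect Mesa's actual definitions.

```c
#include <limits.h>

/* Hypothetical stand-in for "no limit": the largest representable count. */
#define SKETCH_NO_LIMIT UINT_MAX

/* Hypothetical set of per-program limits. */
struct sketch_program_limits {
    unsigned max_instructions;
    unsigned max_alu_instructions;
    unsigned max_tex_instructions;
    unsigned max_tex_indirections;
};

/* Core initialization: software rendering has no inherent limits. */
static void sketch_init_program_limits(struct sketch_program_limits *lim)
{
    lim->max_instructions     = SKETCH_NO_LIMIT;
    lim->max_alu_instructions = SKETCH_NO_LIMIT;
    lim->max_tex_instructions = SKETCH_NO_LIMIT;
    lim->max_tex_indirections = SKETCH_NO_LIMIT;
}

/* A hardware driver lowers only the fields its hardware restricts.
 * The numbers here are made up for the example. */
static void sketch_driver_override(struct sketch_program_limits *lim)
{
    lim->max_instructions     = 512;
    lim->max_tex_indirections = 4;
}
```

Calling the core initializer first and the driver hook second means a driver only has to mention the limits it actually enforces.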