llvmpipe: fast path for interpolated Z16 LESS depth testing
Because this is interpolated (i.e. early) depth, we can build in an
assumption about the quads emitted by triangle setup, i.e. that they
are actually linear spans. Interpolate z over those spans directly in
z16 format to save on math and conversion.
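A minimal sketch of the idea, not the code from this commit: convert the
starting depth and its x gradient to fixed point once, then walk the span
with integer adds and compare against the 16-bit depth buffer. The name
z16_less_span, the 16.16 fixed-point layout and the 32-pixel span limit
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define Z16_FRAC_BITS 16   /* assumed: 16 fractional bits under the z16 value */

/*
 * Hypothetical helper: depth-test one linear span with a LESS comparison
 * and write the depth values that pass. Returns a bitmask of passing
 * pixels (bit i set means pixel i passed).
 */
static uint32_t
z16_less_span(uint16_t *zbuf, unsigned count, float z0, float dzdx)
{
   /* Scale [0,1] depth to the z16 range, plus the fractional bits. */
   const float scale = 65535.0f * 65536.0f;
   int64_t z = (int64_t)(z0 * scale);
   const int64_t dz = (int64_t)(dzdx * scale);
   uint32_t mask = 0;
   unsigned i;

   assert(count <= 32);   /* the mask only has 32 bits */

   for (i = 0; i < count; i++) {
      /* Clamp to the representable range, then drop the fraction. */
      int64_t zi = (z < 0 ? 0 : z) >> Z16_FRAC_BITS;
      if (zi > 0xffff)
         zi = 0xffff;

      if ((uint16_t)zi < zbuf[i]) {
         zbuf[i] = (uint16_t)zi;
         mask |= 1u << i;
      }

      z += dz;   /* integer step, no per-pixel float math or conversion */
   }

   return mask;
}
```

The per-pixel work is one add, one shift and one compare; the float to
z16 conversion happens only once per span.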
llvmpipe: avoid flushing depth buffer cache on swapbuffers
There's no need to push out depth buffer contents on swapbuffers.
Note that this change doesn't throw away depth buffer changes; it simply
holds them in the cache across calls to swapbuffers. The hope is
that swapbuffers will be followed by a clear() which means in that case
we won't have to write the changes out.
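A toy model of the intended effect, with all names (toy_cache,
toy_swapbuffers, etc.) invented for illustration rather than taken from
llvmpipe: swapbuffers flushes only the color cache, and a following
clear drops the pending depth data so it is never written back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tile cache: tracks only dirty state. */
struct toy_cache {
   bool dirty;
   unsigned writebacks;   /* number of times data was pushed to memory */
};

struct toy_context {
   struct toy_cache color;
   struct toy_cache depth;
};

static void toy_cache_write(struct toy_cache *c)
{
   c->dirty = true;
}

static void toy_cache_flush(struct toy_cache *c)
{
   if (c->dirty) {
      c->writebacks++;
      c->dirty = false;
   }
}

/* A clear overwrites everything, so pending changes need no write-back. */
static void toy_cache_clear(struct toy_cache *c)
{
   c->dirty = false;
}

static void toy_swapbuffers(struct toy_context *ctx)
{
   assert(ctx != NULL);
   toy_cache_flush(&ctx->color);
   /* Depth is deliberately left in the cache, still marked dirty. */
}
```

If no clear follows, the depth data is still dirty in the cache and gets
written out whenever the cache is next flushed for another reason.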
llvmpipe: short-circuit repeated lookups of the same tile
The lp_tile_cache is often called repeatedly to look up the same
tile. Add a cache (to the cache) of the single tile most recently
retrieved, and make a quick inline check to see if this matches the
subsequent request.
Add a tile_address bitfield struct to make this check easier.
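A sketch of how such a check might look, assuming a 32-bit address: the
bitfield widths, the toy_tile_cache layout, the hash and the function
names below are illustrative guesses, not the actual lp_tile_cache
definitions. Packing the address into a union lets the fast path compare
every field with a single integer compare.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field widths; they only need to add up to 32 bits. */
union tile_address {
   struct {
      unsigned x:12;       /* tile x index */
      unsigned y:12;       /* tile y index */
      unsigned level:4;    /* mipmap level */
      unsigned face:3;     /* cube face */
      unsigned invalid:1;  /* set on empty cache entries */
   } bits;
   uint32_t value;
};

#define TOY_NUM_ENTRIES 16

struct toy_tile {
   union tile_address addr;
};

struct toy_tile_cache {
   struct toy_tile entries[TOY_NUM_ENTRIES];
   struct toy_tile *last_tile;   /* the "cache of the cache" */
   unsigned slow_lookups;        /* counts trips through the slow path */
};

static union tile_address
toy_tile_address(unsigned x, unsigned y, unsigned level, unsigned face)
{
   union tile_address addr;
   addr.value = 0;               /* also clears the invalid bit */
   addr.bits.x = x;
   addr.bits.y = y;
   addr.bits.level = level;
   addr.bits.face = face;
   return addr;
}

static void toy_tile_cache_init(struct toy_tile_cache *tc)
{
   size_t i;
   for (i = 0; i < TOY_NUM_ENTRIES; i++) {
      tc->entries[i].addr.value = 0;
      tc->entries[i].addr.bits.invalid = 1;  /* never matches a request */
   }
   tc->last_tile = NULL;
   tc->slow_lookups = 0;
}

/* Slow path: hash to a slot and (in a real cache) fetch the tile data. */
static struct toy_tile *
toy_find_tile(struct toy_tile_cache *tc, union tile_address addr)
{
   unsigned slot = (addr.bits.x + addr.bits.y * 7 +
                    addr.bits.level * 13 + addr.bits.face * 17) % TOY_NUM_ENTRIES;
   struct toy_tile *tile = &tc->entries[slot];

   tc->slow_lookups++;
   if (tile->addr.value != addr.value)
      tile->addr = addr;   /* write-back and fetch would happen here */

   tc->last_tile = tile;
   return tile;
}

/* Fast path: one compare against the most recently returned tile. */
static inline struct toy_tile *
toy_get_tile(struct toy_tile_cache *tc, union tile_address addr)
{
   assert(!addr.bits.invalid);
   if (tc->last_tile && tc->last_tile->addr.value == addr.value)
      return tc->last_tile;
   return toy_find_tile(tc, addr);
}
```

Repeated requests for the same tile never reach toy_find_tile after the
first one, which is the case the commit is targeting.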
llvmpipe: remove backwards dependency from tilecache to llvmpipe
The tile cache is a utility; it shouldn't know anything about the
entity which is making use of it (i.e. llvmpipe).
Remove the llvmpipe parameter from all the tilecache function calls, and
also remove the need to keep a llvmpipe pointer in the sampler structs.
There are some inconsistencies in pipe_format, but above all, there
simply aren't enough bits in an enum to conveniently store all the
information about a pixel format that we need in order to dynamically
generate pixel packing/unpacking code.
The problem was finding the correct place to run prediction. The only
place that is called for every primitive is ALLOC_VERTS, so we have to
do prediction there, before allocation.