Simulated palette with 1d texture lookup is way too slow...
by Gabest » Fri, 20 May 2005 22:10:39 GMT
.. is there any way to improve it?
I think there is none, but I thought it might be worth asking once.
Just for reference, my lookup code is something like this in HLSL:
tex1D(Palette, tex2D(Texture, TexCoord).x - a_little_correction);
Doing it four times and lerping the samples to get bilinear filtering only
makes it even slower, of course.
The HLSL code compiles to 10-20 assembly instructions for point sampling and
30-40 with bilinear.
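For readers following along, here is a minimal CPU-side sketch in Python (not
shader code) of what the lookup above does. Point sampling fetches one index
and looks it up in the palette. The bilinear version fetches four neighbouring
indices, looks each one up, and lerps the resulting colours. The function
names, the clamp addressing, and the texel-centre mapping in index_to_u are
assumptions made for illustration; the thread does not say what
a_little_correction actually is.

```python
import math

def index_to_u(i, n):
    # Assumed correction: map palette index i to the centre of texel i.
    return (i + 0.5) / n

def palette_fetch(palette, u):
    # Model of a point-sampled tex1D(Palette, u) with u in [0, 1].
    n = len(palette)
    return palette[min(int(u * n), n - 1)]

def lerp(a, b, t):
    # Component-wise linear interpolation between two colours.
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def sample_point(index_tex, palette, u, v):
    # One index fetch followed by one dependent palette fetch.
    h, w = len(index_tex), len(index_tex[0])
    x = min(int(u * w), w - 1)
    y = min(int(v * h), h - 1)
    return palette_fetch(palette, index_to_u(index_tex[y][x], len(palette)))

def sample_bilinear(index_tex, palette, u, v):
    # Four index fetches, four dependent palette fetches, then three lerps.
    h, w = len(index_tex), len(index_tex[0])
    fx, fy = u * w - 0.5, v * h - 0.5      # texel centres sit at k + 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0

    def fetch(x, y):
        x = min(max(x, 0), w - 1)          # clamp addressing (assumption)
        y = min(max(y, 0), h - 1)
        return palette_fetch(palette, index_to_u(index_tex[y][x], len(palette)))

    top = lerp(fetch(x0, y0), fetch(x0 + 1, y0), tx)
    bottom = lerp(fetch(x0, y0 + 1), fetch(x0 + 1, y0 + 1), tx)
    return lerp(top, bottom, ty)
```

The point here is only the shape of the work: every filtered pixel costs four
palette fetches whose coordinates depend on an earlier texture read.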
Re: Simulated palette with 1d texture lookup is way too slow...
by Wessam Bahnassi » Fri, 20 May 2005 23:33:18 GMT
I think the problem is in the dependent read itself. Four dependent reads
will really hurt performance.
Unfortunately, support for indexed textures is gone with new hardware.
Perhaps there is another way of doing this without palettes?
What are you trying to do exactly?
Wessam Bahnassi
Microsoft DirectX MVP,
Lead Programmer
In|Framez
Re: Simulated palette with 1d texture lookup is way too slow...
by Gabest » Sat, 21 May 2005 02:31:44 GMT
>I think the problem is in the dependent read itself. Four dependent reads
Emulation ;) A great number of textures have to be uploaded to the video
card every frame, especially when the palette changes, but as it turned out
that actually hurts less than doing the palette lookups in the shader.
Re: Simulated palette with 1d texture lookup is way too slow...
by mikegi » Sat, 21 May 2005 04:36:45 GMT
What is in your tex2D? Is it a scalar value? If so, you probably should
sample it four times, average the result, then do the tex1D lookup. That
would be post-classification. What you're doing now is pre-classification,
which usually produces ugly results.
Mike
Re: Simulated palette with 1d texture lookup is way too slow...
by Eyal Teler » Sat, 21 May 2005 17:33:22 GMT
Doesn't make sense to me that this code will come to 10 instructions,
unless "a_little_correction" is pretty complicated. What code are you
getting?
Eyal
Re: Simulated palette with 1d texture lookup is way too slow...
by Gabest » Sun, 22 May 2005 01:56:47 GMT
This is not the full shader, just the part doing the sampling. But there are
not many more instructions, just something to substitute the texture
functions of the fixed pipeline (modulation, decal, two highlight modes).
The source can be viewed here, if you are interested:
http://www.**--****.com/
Re: Simulated palette with 1d texture lookup is way too slow...
by Gabest » Sun, 22 May 2005 02:01:08 GMT
> What is in your tex2d? Is it a scalar value?
Nope, it's the index of the palette. Can't be averaged.
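A small Python sketch may help show why a palette index cannot be averaged.
The three-colour palette and the function names are made up for illustration.
Averaging the indices and then looking up (post-classification, in Mike's
terms) returns whichever entry happens to sit between them, while looking up
first and then averaging the colours (pre-classification) gives the blend.

```python
# Hypothetical palette: neighbouring entries are unrelated colours.
palette = [(1.0, 0.0, 0.0),   # 0: red
           (0.0, 1.0, 0.0),   # 1: green
           (0.0, 0.0, 1.0)]   # 2: blue

def post_classification(indices):
    # Average the raw indices, then do a single palette lookup.
    mean = sum(indices) / len(indices)
    return palette[round(mean)]

def pre_classification(indices):
    # Look up every index first, then average the colours.
    colours = [palette[i] for i in indices]
    n = len(colours)
    return tuple(sum(c[k] for c in colours) / n for k in range(3))
```

With indices 0 and 2 (red and blue), post_classification returns green, which
is in neither sample, while pre_classification returns the half-red, half-blue
mix. Averaging first only makes sense when the stored value is a scalar
quantity and the palette is a smooth ramp over it.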
Re: Simulated palette with 1d texture lookup is way too slow...
by Eyal Teler » Mon, 23 May 2005 09:16:15 GMT
> This is not the full shader, just the part doing the sampling. But there are
If the extras you're talking about are "if(!fRT) c.a *= 2;" and so on,
they likely take most of the shader. A == or != comparison is quite
costly, IIRC (translates into quite a few instructions). Try using >
or >= instead. Better yet, use different shaders if you can (one which
doubles a, one which doesn't).
Eyal
Re: Simulated palette with 1d texture lookup is way too slow...
by Gabest » Tue, 24 May 2005 00:17:58 GMT
Not really, only the shaders that do a palette lookup are much slower.
I've already tried commenting out as many of those extras as I could.
That's a funny thing actually. I'm setting those consts to either 0 or 1,
so != 0 should be the same as > 0 (which compiles to one instruction less),
but in reality it just does not want to work on NVIDIA, while on ATI
hardware it does.
The current number of shaders is already the result of such separation.
They could be split further, but each split doubles their number. I wish
there was something like #ifdef in HLSL, that would surely make it easier to
prepare different versions of the shaders.
Re: Simulated palette with 1d texture lookup is way too slow...
by Eyal Teler » Tue, 24 May 2005 06:35:35 GMT
> Not really, only the shaders that do a palette lookup are much slower.
It's possible, but in that case it's unlikely to have anything to do with
the number of instructions.
The compiler doesn't know which values you will use. Since the only
way to compare in PS 2.0 is with cmp, which tests ">= 0", any exact
comparisons need to use two tests.
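As a rough illustration of that point, here is a Python model of the cmp
instruction (pick the second operand when the first is >= 0, otherwise the
third) and of how the different tests could be built from it. The helper names
are hypothetical, and the exact sequence a real compiler emits may differ.

```python
def cmp(src0, src1, src2):
    # Model of ps_2_0 cmp: src1 where src0 >= 0, otherwise src2.
    return src1 if src0 >= 0 else src2

def select_greater_zero(x, if_true, if_false):
    # x > 0 is the negation of -x >= 0, so one cmp on the negated input.
    return cmp(-x, if_false, if_true)

def select_not_equal_zero(x, if_true, if_false):
    # x == 0 only when both x >= 0 and -x >= 0 hold, so two cmp are needed.
    inner = cmp(-x, if_false, if_true)
    return cmp(x, inner, if_true)
```

For inputs restricted to 0 and 1 the two selects return the same result, but
the second one costs an extra instruction because the compiler also has to
handle negative values.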
It's helpful to understand what the code compiles to. You can then
choose the constants and code better. We're not yet in a position
where there's such an abundance of computing power that we can just
use whatever code we like. For example, if instead of 0 and 1 you
chose 1 and 2, then you could simply multiply the value by this,
instead of having to test anything -- that'd be much shorter and
faster. (Although even with 0 and 1, just add 1 and multiply :)
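The arithmetic can be sketched in Python as follows. The names are
hypothetical, and double_flag is assumed to be 1 when the alpha should be
doubled and 0 otherwise (for a negated flag such as the "if(!fRT)" case, the
factor would be 2 - flag instead).

```python
def alpha_with_test(a, double_flag):
    # Original style: compare the constant, then conditionally double.
    if double_flag != 0:
        a *= 2.0
    return a

def alpha_add_and_multiply(a, double_flag):
    # Flag stays 0 or 1: add 1 to get a factor of 1 or 2, then multiply.
    return a * (double_flag + 1.0)

def alpha_prescaled(a, scale):
    # The application uploads 1 or 2 directly, so a single multiply remains.
    return a * scale
```

All three give the same result for the two legal flag values; the last two
simply avoid the comparison altogether.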
You can use constants in the shader definitions. Check for example the
HLSL Workshop tutorial 1 (goal1.fx). At the end of the file you'll see
the same vertex shader compiled quite a few times with different
boolean parameters, resulting in totally different code.
Eyal