Metal kernel shader -- fade implementation

Question

I haven't written many Metal kernel shaders yet; here's a fledgling "fade" shader between two RGBX-32 images, using a tween value from 0.0 to 1.0 that blends from inBuffer1 (0.0) to inBuffer2 (1.0).

Is there something I'm missing here? It strikes me that this may be terribly inefficient.

My first inclination was to do the subtraction and multiplication using the vector data types (e.g. char4), thinking that might be better, but the results of this are certainly undefined (as some components will be negative).

Also, is there some advantage to using MTLTexture versus MTLBuffer objects as I've done?

kernel void fade_Kernel(device const uchar4  *inBuffer1  [[ buffer(0) ]],
                        device const uchar4  *inBuffer2  [[ buffer(1) ]],
                        device const float   *tween      [[ buffer(2) ]],
                        device uchar4        *outBuffer  [[ buffer(3) ]],
                        uint gid [[ thread_position_in_grid ]])
{
    const float t = tween[0];
    uchar4 pixel1 = inBuffer1[gid];
    uchar4 pixel2 = inBuffer2[gid];

    // these differences may be negative, so the scaled results are stored in signed shorts
    short r = (pixel2.r - pixel1.r) * t;
    short g = (pixel2.g - pixel1.g) * t;
    short b = (pixel2.b - pixel1.b) * t;

    outBuffer[gid]=uchar4(pixel1.r+r,pixel1.g+g,pixel1.b+b,0xff);
}

Ken Thomases · Accepted Answer · 2017-07-23 21:38:41Z

First, you should probably declare the tween parameter as:

constant float &tween [[ buffer(2) ]],

Using the constant address space is more appropriate for a value like this that's the same for all invocations of the function (and not indexed into by grid position or the like). Also, making it a reference instead of a pointer tells the compiler that you won't be indexing other elements in the "array" that a pointer might be.

Finally, there's a mix() function that performs exactly the sort of computation that you're doing here. So, you could replace the body of the function with:

uchar4 pixel1 = inBuffer1[gid];
uchar4 pixel2 = inBuffer2[gid];

outBuffer[gid] = uchar4(uchar3(mix(float3(pixel1.rgb), float3(pixel2.rgb), tween)), 0xff);
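To check the arithmetic off the GPU, here is a minimal CPU-side sketch in plain C++ (not Metal). It assumes the formula the Metal Shading Language specification gives for mix(x, y, a), namely x + (y - x) * a. The names Pixel, mixf and fadeReference are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cstdint>

// CPU stand-in for MSL's uchar4: four 8-bit channels in R, G, B, X order.
using Pixel = std::array<std::uint8_t, 4>;

// Linear blend, written the way mix(x, y, a) is defined: x + (y - x) * a.
inline float mixf(float x, float y, float a) {
    return x + (y - x) * a;
}

// Hypothetical reference for the kernel body: blend RGB, force the fourth byte to 0xff.
inline Pixel fadeReference(const Pixel& p1, const Pixel& p2, float tween) {
    Pixel out{};
    for (int c = 0; c < 3; ++c) {
        // (y - x) may be negative in float; the blended value stays within
        // [0, 255] as long as tween is in [0, 1].
        float blended = mixf(float(p1[c]), float(p2[c]), tween);
        out[c] = static_cast<std::uint8_t>(blended); // truncates toward zero
    }
    out[3] = 0xff;
    return out;
}
```

Because the blend happens in floating point, a negative difference between channels is not a problem; only the final value is converted back to 8 bits.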

As to whether it would be better to use textures, that depends somewhat on what you plan to do with the result after running this kernel. If you're going to be doing texture-like things with it anyway, it might be better to use textures all throughout. Indeed, it might be better to use drawing operations with blending rather than a compute kernel. After all, such blending is something GPUs have to do all the time, so that path is probably fast. You'd have to test the performance of each approach.

edited Jul 23, 2017 at 21:38

answered Jul 23, 2017 at 9:05

Thanks for this, Ken. You're very helpful, once again. Oddly, "mix" doesn't appear to be part of <metal_common> under pre-Metal 2 implementations. Looking at the Metal Shading Language docs, I can call "saturate" but not "mix": No matching function for call to 'mix'.
– zzyzy
Commented Jul 23, 2017 at 16:40
My mistake. mix() only works for floating-point types. I've edited my answer to convert back and forth. The conversion was implicit in your original code. You might want to wrap the call to mix() in a call to round(), too, although your original code just truncated like my new code does.
– Ken Thomases
Commented Jul 23, 2017 at 21:41
Thanks, Ken. If alpha were to be considered, I suppose outBuffer[gid]=uchar4(mix(float4(pixel1),float4(pixel2),tween)) could also work just fine.
– zzyzy
Commented Jul 24, 2017 at 22:03
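
The truncate-versus-round point raised in the comments above can be illustrated with a small CPU-side sketch in plain C++ (not Metal). The helper names fadeTruncated and fadeRounded are hypothetical; the blend is the same x + (y - x) * a formula, applied to a single 8-bit channel.

```cpp
#include <cmath>
#include <cstdint>

// Blend one 8-bit channel and truncate, as a plain float-to-uchar conversion does.
inline std::uint8_t fadeTruncated(std::uint8_t a, std::uint8_t b, float tween) {
    float blended = float(a) + (float(b) - float(a)) * tween;
    return static_cast<std::uint8_t>(blended);
}

// Same blend, rounded to the nearest integer before the conversion.
inline std::uint8_t fadeRounded(std::uint8_t a, std::uint8_t b, float tween) {
    float blended = float(a) + (float(b) - float(a)) * tween;
    return static_cast<std::uint8_t>(std::round(blended));
}
```

Fading a channel from 0 to 255 at a tween of 0.5 gives 127.5 before conversion, so the truncating version yields 127 and the rounding version yields 128. Whether that one-level difference matters depends on the use case.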

Satoshi Nakajima · Accepted Answer · 2017-07-25 05:05:39Z

If you are dealing with images, it's much more efficient to use MTLTexture than MTLBuffer. It is also better to use "half" than "uchar". I learned this directly from an Apple engineer at WWDC this year.

kernel void alpha(texture2d<half, access::read>  inTexture2 [[texture(0)]],
                  texture2d<half, access::read>  inTexture1 [[texture(1)]],
                  texture2d<half, access::write> outTexture [[texture(2)]],
                  const device float&            tween      [[buffer(3)]],
                  uint2                          gid        [[thread_position_in_grid]])
{
    // Check if the pixel is within the bounds of the output texture
    if((gid.x >= outTexture.get_width()) || (gid.y >= outTexture.get_height())) {
        // Return early if the pixel is out of bounds
        return;
    }
    half4 color1  = inTexture1.read(gid);
    half4 color2  = inTexture2.read(gid);
    outTexture.write(half4(mix(color1.rgb, color2.rgb, half(tween)), color1.a), gid);
}
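
The bounds check at the top of the kernel matters when the host dispatches whole threadgroups and the image size is not a multiple of the threadgroup size. The following is a CPU-side sketch in plain C++ (not Metal) of that host arithmetic. The GridSize struct, the helper names, and the 16x16 threadgroup size used in the note below are assumptions for illustration, not part of the answer.

```cpp
#include <cstdint>

struct GridSize {
    std::uint32_t width;
    std::uint32_t height;
};

// Number of threadgroups needed to cover the image, rounding up so that
// pixels along the right and bottom edges are not skipped.
inline GridSize threadgroupsFor(GridSize image, GridSize group) {
    return { (image.width  + group.width  - 1) / group.width,
             (image.height + group.height - 1) / group.height };
}

// Mirrors the kernel's early-return test for one thread position.
inline bool isOutOfBounds(std::uint32_t x, std::uint32_t y, GridSize image) {
    return x >= image.width || y >= image.height;
}
```

For a 1000x600 image and 16x16 threadgroups this gives 63x38 threadgroups, which covers 1008x608 threads. The threads beyond column 999 or row 599 are the ones the early return is there to discard.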

I suppose the efficient use of "half" or "char4" would depend on whether the source buffer was RGBA-32 (4 bytes/pixel) or some other format, no? — zzyzy, Commented Jul 25, 2017 at 20:23
