Edge Detection in OpenCL

Our Raving Buddy as the Source

One of the OpenCL-based filters I wanted to add before the next build was simple edge detection. The first step was to find a suitable algorithm that would scale nicely with OpenCL. There are many algorithms out there and I’ll probably implement more (Sobel and Canny come to mind) but this first implementation is based on Christian Graus’s Difference Edge Detection algorithm. The C/C++ code is fairly straightforward and requires a double for loop with some very basic subtraction and comparison operations in the core of the loop.
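For reference, here is a rough sketch of what that native double loop could look like in plain C. It is not the original code: the function names (max_channel_diff, difference_edge_detect) are made up for illustration, pixels are assumed to be packed as 0xAARRGGBB to match the shifts used in the kernel below, and border pixels are simply skipped.

```c
#include <stdint.h>
#include <stdlib.h>

/* Per-channel absolute difference of two packed pixels (R, G, B only),
   folded into the running per-channel maximum `best`. */
static uint32_t max_channel_diff(uint32_t a, uint32_t b, uint32_t best)
{
    uint32_t out = 0;
    for (int shift = 0; shift <= 16; shift += 8) {
        int ca = (int)((a >> shift) & 0xff);
        int cb = (int)((b >> shift) & 0xff);
        uint32_t cur = (best >> shift) & 0xff;
        uint32_t d = (uint32_t)abs(ca - cb);
        out |= (d > cur ? d : cur) << shift;
    }
    return out;
}

/* Assumed layout: width * height pixels, no row padding.
   Border pixels are left untouched in dst. */
void difference_edge_detect(const uint32_t *src, uint32_t *dst,
                            int width, int height)
{
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            const uint32_t *p = src + y * width + x;
            uint32_t best = 0;
            best = max_channel_diff(p[-width + 1], p[width - 1], best);  /* top right vs bottom left */
            best = max_channel_diff(p[width + 1], p[-width - 1], best);  /* bottom right vs top left */
            best = max_channel_diff(p[-width], p[width], best);          /* top vs bottom */
            best = max_channel_diff(p[1], p[-1], best);                  /* right vs left */
            dst[y * width + x] = (*p & 0xff000000u) | best;              /* keep the source alpha */
        }
    }
}
```

Each iteration only reads from src and writes a single element of dst, which is what makes the algorithm a comfortable fit for one work-item per pixel.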

The OpenCL version of this algorithm isn’t too much more complex than its native code predecessor. The bulk of the extra work lies in figuring out the adjacent pixels and I also added wrapping to handle pixels on the edges¹. We’ll start by showing the complete listing of the OpenCL kernel then break down each component with an in-depth overview.

Complete listing for edgeDetect.cl

__attribute__ ((always_inline)) int4 GetColorInt4(int srcColor) {

    int4 color;

    color.x = (srcColor >> 16) & 0xff;
    color.y = (srcColor >> 8 ) & 0xff;
    color.z = (srcColor & 0xff);
    color.w = (srcColor >> 24) & 0xff;

    return color;
}

__attribute__ ((always_inline)) int ColorToInt(int4 srcColor) {

    return ((srcColor.w & 0xff) << 24) |
           ((srcColor.x & 0xff) << 16) |
           ((srcColor.y & 0xff) << 8 ) |
            (srcColor.z & 0xff);
}

__kernel void difference(__global int *srcImage,
                         __global int *destImage,
                         int iImageStride,
                         int iImageHeight) {

    const int gid = get_global_id(0);
    const int x = gid % iImageStride;
    const int y = gid / iImageStride;
    int4 srcColor = GetColorInt4(srcImage[gid]);

    uint4 destColor; // The destination color
    uint4 tmpColor;  // Used to hold the results of a computation before maximum comparison.
                     // It's a uint because abs() returns uint's.

    int2 pixelCoords[8]; // Holds neighboring pixel coordinates
    int4 pixels[8];      // The pixels around us: 0=TopLeft, 1=Top, 2=TopRight,
                         // 3=Left, 4=Right
                         // 5=BottomLeft, 6=Bottom, 7=BottomRight

    // Gather up all the pixel coordinate values we'll use (we'll do wrapping next)
    pixelCoords[0].xy = (int2){x - 1, y - 1};
    pixelCoords[1].xy = (int2){x    , y - 1};
    pixelCoords[2].xy = (int2){x + 1, y - 1};
    pixelCoords[3].xy = (int2){x - 1, y};
    pixelCoords[4].xy = (int2){x + 1, y};
    pixelCoords[5].xy = (int2){x - 1, y + 1};
    pixelCoords[6].xy = (int2){x    , y + 1};
    pixelCoords[7].xy = (int2){x + 1, y + 1};

    // Wrap the X-coordinates if the source pixel is on the left or right edge
    if(x == 0) {

        // Wrap the left side
        pixelCoords[0].x = iImageStride - 1;
        pixelCoords[3].x = iImageStride - 1;
        pixelCoords[5].x = iImageStride - 1;

    } else if(x == (iImageStride - 1)) {

        // Wrap the right side
        pixelCoords[2].x = 0;
        pixelCoords[4].x = 0;
        pixelCoords[7].x = 0;
    }

    // Wrap the Y-coordinates if the source pixel is on the top or bottom edge
    if(y == 0) {

        // Wrap the top
        pixelCoords[0].y = iImageHeight - 1;
        pixelCoords[1].y = iImageHeight - 1;
        pixelCoords[2].y = iImageHeight - 1;

    } else if(y == (iImageHeight - 1)) {

        // Wrap the bottom
        pixelCoords[5].y = 0;
        pixelCoords[6].y = 0;
        pixelCoords[7].y = 0;
    }

    // Retrieve all of the pixel values.
    pixels[0] = GetColorInt4(srcImage[(pixelCoords[0].y * iImageStride) + pixelCoords[0].x]);
    pixels[1] = GetColorInt4(srcImage[(pixelCoords[1].y * iImageStride) + pixelCoords[1].x]);
    pixels[2] = GetColorInt4(srcImage[(pixelCoords[2].y * iImageStride) + pixelCoords[2].x]);
    pixels[3] = GetColorInt4(srcImage[(pixelCoords[3].y * iImageStride) + pixelCoords[3].x]);
    pixels[4] = GetColorInt4(srcImage[(pixelCoords[4].y * iImageStride) + pixelCoords[4].x]);
    pixels[5] = GetColorInt4(srcImage[(pixelCoords[5].y * iImageStride) + pixelCoords[5].x]);
    pixels[6] = GetColorInt4(srcImage[(pixelCoords[6].y * iImageStride) + pixelCoords[6].x]);
    pixels[7] = GetColorInt4(srcImage[(pixelCoords[7].y * iImageStride) + pixelCoords[7].x]);

    // Pre-set our maximum color to be Top Right - Bottom Left
    destColor = abs(pixels[2] - pixels[5]);

    // Bottom Right - Top Left
    tmpColor = abs(pixels[7] - pixels[0]);
    destColor = max(tmpColor, destColor);

    // Top - Bottom
    tmpColor = abs(pixels[1] - pixels[6]);
    destColor = max(tmpColor, destColor);

    // Right - Left
    tmpColor = abs(pixels[4] - pixels[3]);
    destColor = max(tmpColor, destColor);

    // Stuff the pixel back in there
    int4 finalColor = {destColor.x, destColor.y, destColor.z, srcColor.w};

    destImage[gid] = ColorToInt(finalColor);

}//end of difference()

Where’s image2d_t and why are you using int * ?

Not every device has to support image objects. Sure, there’s some extra performance gain to be had by using them but while OpenCL is still in its infancy I’m shooting for maximum compatibility across all vendors and platforms. In lieu of using image2d_t we pass in good ole 32-bit pixels. PhotoMonkee uses 32-bit RGBA internally so that’s what gets passed down into all OpenCL kernels. The GetColorInt4() and ColorToInt() functions perform all of the necessary bit shifting that allows us to work with each color component. We opted for ints to leave some headroom for overflow when performing computations. Filters whose results can stray outside 0 to 255 call clamp(val, 0, 255) before passing the int4 back into ColorToInt(); this kernel gets away without it because the absolute difference of two 8-bit components can never leave that range.
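If you want to sanity-check the bit shifting on the host, something like the following plain C sketch mirrors the two helpers. The Color4 struct is only a stand-in for OpenCL’s int4, and unpack_color(), pack_color() and clamp_int() are hypothetical names; the channel positions (alpha in the top byte, then red, green, blue) are taken from the shifts in the listing above. Unlike ColorToInt(), this version clamps each component instead of masking it.

```c
#include <stdint.h>

/* Stand-in for OpenCL's int4: x = red, y = green, z = blue, w = alpha. */
typedef struct { int x, y, z, w; } Color4;

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Split a packed pixel into four ints so the math has room to overflow. */
Color4 unpack_color(uint32_t p)
{
    Color4 c;
    c.x = (int)((p >> 16) & 0xff);
    c.y = (int)((p >> 8) & 0xff);
    c.z = (int)(p & 0xff);
    c.w = (int)((p >> 24) & 0xff);
    return c;
}

/* Clamp every component to 0..255, then pack it back into 32 bits. */
uint32_t pack_color(Color4 c)
{
    return ((uint32_t)clamp_int(c.w, 0, 255) << 24) |
           ((uint32_t)clamp_int(c.x, 0, 255) << 16) |
           ((uint32_t)clamp_int(c.y, 0, 255) << 8) |
            (uint32_t)clamp_int(c.z, 0, 255);
}
```

Clamping saturates an out-of-range result (300 becomes 255), whereas masking with 0xff alone would wrap it around (300 becomes 44), which is usually not what you want in a color channel.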

Where in the image…is Carmen Sandiego?

The global ID is our 1D index into the image. We need to derive our X,Y coordinate from it so that we can find our neighbors. Using the modulus and division operators that’s no problem. While we’re at it, we’ll snag the current pixel.

const int gid = get_global_id(0);
const int x = gid % iImageStride;
const int y = gid / iImageStride;
int4 srcColor = GetColorInt4(srcImage[gid]);

Example: If we’re at index 200 of a 64×64 image: 200 % 64 = 8 = X and 200 / 64 = 3 = Y. Welcome to (8, 3), enjoy your stay!
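The same arithmetic as a tiny C sketch, with the reverse mapping thrown in since the kernel uses it later to fetch neighbors. Coord, index_to_coord() and coord_to_index() are illustrative names, and the stride is assumed to be the image width in pixels.

```c
/* Illustrative pair of ints standing in for the kernel's x and y. */
typedef struct { int x, y; } Coord;

/* 1D index -> (x, y): the remainder is the column, the quotient is the row. */
Coord index_to_coord(int gid, int stride)
{
    Coord c;
    c.x = gid % stride;
    c.y = gid / stride;
    return c;
}

/* (x, y) -> 1D index: skip y full rows, then move x pixels in. */
int coord_to_index(Coord c, int stride)
{
    return c.y * stride + c.x;
}
```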

Welcome to the neighborhood, neighbor!

Now it’s time to find our neighboring pixels. To start we set up 8 coordinates with the appropriate offsets from our current X,Y position. Example: the pixel to our top left is at X - 1, Y - 1.

pixelCoords[0].xy = (int2){x - 1, y - 1};
pixelCoords[1].xy = (int2){x , y - 1};
pixelCoords[2].xy = (int2){x + 1, y - 1};
pixelCoords[3].xy = (int2){x - 1, y};
pixelCoords[4].xy = (int2){x + 1, y};
pixelCoords[5].xy = (int2){x - 1, y + 1};
pixelCoords[6].xy = (int2){x , y + 1};
pixelCoords[7].xy = (int2){x + 1, y + 1};

Astute readers will correctly inquire what happens if we’re at 0,0: don’t worry, we’ve got you covered! If we detect our current pixel is on an edge then we “reach” around the image (wrapping, in graphics parlance) and snag the pixel from the opposite side. For example, if we are at 0,0 in an image that’s 64×64 then the top-left neighbor will be 63,63.

// Wrap the X-coordinates if the source pixel is on the left or right edge
if(x == 0) {

    // Wrap the left side
    pixelCoords[0].x = iImageStride - 1;
    pixelCoords[3].x = iImageStride - 1;
    pixelCoords[5].x = iImageStride - 1;

} else if(x == (iImageStride - 1)) {

    // Wrap the right side
    pixelCoords[2].x = 0;
    pixelCoords[4].x = 0;
    pixelCoords[7].x = 0;
}

// Wrap the Y-coordinates if the source pixel is on the top or bottom edge
if(y == 0) {

    // Wrap the top
    pixelCoords[0].y = iImageHeight - 1;
    pixelCoords[1].y = iImageHeight - 1;
    pixelCoords[2].y = iImageHeight - 1;

} else if(y == (iImageHeight - 1)) {

    // Wrap the bottom
    pixelCoords[5].y = 0;
    pixelCoords[6].y = 0;
    pixelCoords[7].y = 0;
}
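The branches above spell the wrapping out case by case. An alternative worth knowing about, which is not what the kernel does, is to fold the wrap into modular arithmetic. Here’s a small C sketch of that idea; wrap_coord() and wrapped_index() are hypothetical helpers, and the image is assumed to have no row padding.

```c
/* Wrap v into [0, size). The extra "+ size" keeps the result non-negative,
   because % in C can return a negative value for a negative left operand. */
static int wrap_coord(int v, int size)
{
    return (v % size + size) % size;
}

/* 1D index of the neighbor at offset (dx, dy) from (x, y), with wrapping. */
int wrapped_index(int x, int y, int dx, int dy, int stride, int height)
{
    int nx = wrap_coord(x + dx, stride);
    int ny = wrap_coord(y + dy, height);
    return ny * stride + nx;
}
```

This trades the branches for a couple of integer modulo operations per neighbor. Whether that is faster depends on the device, so treat it as something to measure rather than a guaranteed win.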

Now that we have the coordinates for our neighbors...GO FETCH!

// Retrieve all of the pixel values.
pixels[0] = GetColorInt4(srcImage[(pixelCoords[0].y * iImageStride) + pixelCoords[0].x]);
pixels[1] = GetColorInt4(srcImage[(pixelCoords[1].y * iImageStride) + pixelCoords[1].x]);
pixels[2] = GetColorInt4(srcImage[(pixelCoords[2].y * iImageStride) + pixelCoords[2].x]);
pixels[3] = GetColorInt4(srcImage[(pixelCoords[3].y * iImageStride) + pixelCoords[3].x]);
pixels[4] = GetColorInt4(srcImage[(pixelCoords[4].y * iImageStride) + pixelCoords[4].x]);
pixels[5] = GetColorInt4(srcImage[(pixelCoords[5].y * iImageStride) + pixelCoords[5].x]);
pixels[6] = GetColorInt4(srcImage[(pixelCoords[6].y * iImageStride) + pixelCoords[6].x]);
pixels[7] = GetColorInt4(srcImage[(pixelCoords[7].y * iImageStride) + pixelCoords[7].x]);

Ssshhh. This is where the magic happens.

Now that we’ve got the neighboring pixels it’s time to perform the actual edge detection algorithm. We’ll subtract the corresponding pixel pairs, comparing the result against the current maximum value. Because we operate on whole int4 values without naming individual components, every subtraction, abs() and max() works on Red, Green, Blue and Alpha (which we don’t really care about) in tandem. Hopefully the OpenCL compiler is working its magic and applying some sweet SIMD action in there for us.

// Pre-set our maximum color to be Top Right - Bottom Left
destColor = abs(pixels[2] - pixels[5]);
// Bottom Right - Top Left
tmpColor = abs(pixels[7] - pixels[0]);
destColor = max(tmpColor, destColor);
// Top - Bottom
tmpColor = abs(pixels[1] - pixels[6]);
destColor = max(tmpColor, destColor);
// Right - Left
tmpColor = abs(pixels[4] - pixels[3]);
destColor = max(tmpColor, destColor);
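To make the “in tandem” part concrete, here is roughly what those vector operations boil down to when written out per component in plain C. Vec4 is a made-up stand-in for int4, abs_diff4(), max4() and edge_strength() are illustrative names, and the neighbor ordering follows the pixels[] array (0=TopLeft through 7=BottomRight).

```c
#include <stdlib.h>

/* Stand-in for int4: c[0..3] = red, green, blue, alpha. */
typedef struct { int c[4]; } Vec4;

/* Component-wise |a - b|, the scalar spelling of abs(a - b) on vectors. */
Vec4 abs_diff4(Vec4 a, Vec4 b)
{
    Vec4 r;
    for (int i = 0; i < 4; ++i) r.c[i] = abs(a.c[i] - b.c[i]);
    return r;
}

/* Component-wise maximum, the scalar spelling of max(a, b) on vectors. */
Vec4 max4(Vec4 a, Vec4 b)
{
    Vec4 r;
    for (int i = 0; i < 4; ++i) r.c[i] = a.c[i] > b.c[i] ? a.c[i] : b.c[i];
    return r;
}

/* Largest per-channel difference across the four opposite neighbor pairs. */
Vec4 edge_strength(const Vec4 px[8])
{
    Vec4 best = abs_diff4(px[2], px[5]);      /* top right vs bottom left */
    best = max4(abs_diff4(px[7], px[0]), best); /* bottom right vs top left */
    best = max4(abs_diff4(px[1], px[6]), best); /* top vs bottom */
    best = max4(abs_diff4(px[4], px[3]), best); /* right vs left */
    return best;
}
```

Note that each channel picks its own winning pair, so the red value of the result can come from one direction while the green comes from another.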

Pack it in, boys.

The last thing we do is re-apply the original alpha value, convert the int4 back into an int and stuff it into the output buffer (the result image).

int4 finalColor = {destColor.x, destColor.y, destColor.z, srcColor.w};
destImage[gid] = ColorToInt(finalColor);

The Result

After all that hard work we’re left with a pretty good outline of the original source image. In my testing I saw better than a 2X improvement when comparing the native CPU implementation against OpenCL running ONLY on the CPU: the native algorithm ran in ~400ms with OpenCL (CPU device) clocking in at 170ms. A solid showing even without the GPU on duty. OpenCL using a GPU device clocked in at 10ms. 400ms down to 10ms? OpenCL’s future is looking pretty damn promising.

Conclusion

The native C++ implementation of this took about a half hour to code up while the OpenCL version took roughly two hours (probably about an hour of that was spent on a logic bug on my part). Coding up image filters using OpenCL is super easy. For filters that require user interaction (i.e. tweakable parameters) the ability to modify a high-definition image in under 50 milliseconds is simply incredible. We’ve got a color substitution filter (video demo coming soon) that is nothing short of amazing when you see it live.

Thanks for reading and be sure to follow us on Twitter and hit up our Facebook Fanpage. Until next time – viva le technology!

¹The original algorithm simply skipped over the edge pixels by confining the X and Y axes to (1 to width-1) and (1 to height-1).