Dealing with no Image2d support in OpenCL

The Image2D type (image2d_t in kernel code) is not guaranteed to be available because the underlying device must support it. Since I like to keep things as open as possible I opted to skip using Image2D in PhotoMonkee’s image filtering system and instead go with using int.

int?

Yeah, like 32 bits of colory goodness. Since Qt currently only supports 32 bit images there’s no need to deal with higher bit depths in the filter infrastructure. The tricky part (and maybe the performance-reducing part) is converting the integer into its component color parts so that we can modify it.

Let’s dig into an example to break things down. In this sample I’ll be showing you the kernel for image desaturation (i.e. converting to grayscale).

Passing image data to the kernel

As previously mentioned, PhotoMonkee uses Qt for holding image data. We use a combination of QPixmap and QImage, depending on the area. In this case, we’re using QImage for direct pixel access. The OpenCL memory buffer is populated with the contents inside of QImage::bits().

The Kernel

Because we’re working with 32bpp the kernel takes in int * as the source and destination buffer types.

__kernel void grayscale(__global int *srcImage,
                         __global int *destImage) {

In order to pick out the exact pixel we need from the buffer we simply need the GID, which indexes directly to our source and destination pixel. With the GID in hand, we reach into the array and get back our int. One problem: we need RGBA components we can work with.

    int gid = get_global_id(0);
    int4 color = GetColorInt4(srcImage[gid]);

Enter: the bit shifter! 

This function, and its corresponding reversal function, will convert between the int and int4 data types.

__attribute__ ((always_inline)) int4 GetColorInt4(int srcColor) {

    int4 color;

    color.x = (srcColor >> 16) & 0xff;
    color.y = (srcColor >> 8) & 0xff;
    color.z = (srcColor & 0xff);
    color.w = (srcColor >> 24) & 0xff;

    return color;
}

__attribute__ ((always_inline)) int ColorToInt(int4 srcColor) {

    return ((srcColor.w & 0xff) << 24) |
            ((srcColor.x & 0xff) << 16) |
            ((srcColor.y & 0xff) << 8) |
            (srcColor.z & 0xff);
}

Why not convert from an int into a char4? Some image manipulation operations will blow past the upper bound (255) or lower bound (0) of the color range. By using an int for each component we allow for that to occur. Note: it’s up to the kernel to make sure the value gets properly clamped before returning, because ColorToInt() only masks off the low 8 bits of each component!
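To make the overflow point concrete, here is a small host-side C++ sketch of the same unpack / modify / clamp / pack round trip. The names Color4, unpackArgb, packArgb and brighten are made up for this illustration (they are not PhotoMonkee code), and the "brighten" filter is just a hypothetical operation that pushes values out of range. Without the clamp, masking alone would turn 272 into 16.

```cpp
#include <algorithm>
#include <cstdint>

// Host-side stand-in for the kernel's int4 (hypothetical type for this sketch).
// Channels are full ints so intermediate math may leave the 0..255 range.
struct Color4 {
    int r, g, b, a;
};

// Mirrors GetColorInt4(): split a packed 0xAARRGGBB pixel into channels.
Color4 unpackArgb(std::uint32_t pixel) {
    return Color4{static_cast<int>((pixel >> 16) & 0xffu),
                  static_cast<int>((pixel >> 8) & 0xffu),
                  static_cast<int>(pixel & 0xffu),
                  static_cast<int>((pixel >> 24) & 0xffu)};
}

// Force a channel back into the displayable 0..255 range.
int clampChannel(int value) {
    return std::max(0, std::min(255, value));
}

// Like ColorToInt(), except it clamps each channel instead of only masking it.
std::uint32_t packArgb(const Color4 &c) {
    return (static_cast<std::uint32_t>(clampChannel(c.a)) << 24) |
           (static_cast<std::uint32_t>(clampChannel(c.r)) << 16) |
           (static_cast<std::uint32_t>(clampChannel(c.g)) << 8) |
           static_cast<std::uint32_t>(clampChannel(c.b));
}

// Hypothetical filter: add a signed offset to R, G and B, leaving alpha alone.
std::uint32_t brighten(std::uint32_t pixel, int amount) {
    Color4 c = unpackArgb(pixel);
    c.r += amount;  // may exceed 255 or drop below 0 here
    c.g += amount;
    c.b += amount;
    return packArgb(c);  // clamped on the way back out
}
```

Calling brighten(pixel, 32) on a pixel whose red channel is already 240 leaves red pinned at 255 rather than wrapping around to a dark value.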

Now that we’ve got our RGBA color components we can modify the values.

    int grayColor = 0.299f * color.x +
                    0.587f * color.y +
                    0.114f * color.z;

    grayColor = clamp((float)grayColor, 0.0f, 255.0f);

    color.x = grayColor;
    color.y = grayColor;
    color.z = grayColor;

Lastly, we convert back from int4 to int and then stash the result in the output buffer.

    destImage[gid] = ColorToInt(color);
}

I’ve done every filter in PhotoMonkee using this method for accessing image data. While I don’t have any performance comparisons to using Image2D I’m happy with the rate at which the filters execute on both GPU and CPU-only accelerated platforms. The only downside I’ve seen from using this methodology is that reading or writing to buffers outside of their range is BAD! It causes all sorts of strange things to happen, the most severe of which is watching the GPU driver crash. Not cool.
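One common way to stay inside the buffer is to pass the pixel count along and bail out early for any work item past the end, which matters when the global work size gets rounded up to a multiple of the work-group size. The kernels in this post don’t take a count parameter, so treat the following as a sketch of the idea rather than what PhotoMonkee does. It simulates the launch on the CPU in plain C++; roundUpToMultiple and runGuarded are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round a work size up to the next multiple of the work-group size.
std::size_t roundUpToMultiple(std::size_t count, std::size_t groupSize) {
    assert(groupSize > 0);
    return ((count + groupSize - 1) / groupSize) * groupSize;
}

// CPU stand-in for a kernel launch: gid runs over the padded global size and
// the guard skips every id that falls outside the real buffers.
// Returns how many work items were skipped by the guard.
std::size_t runGuarded(const std::vector<int> &src, std::vector<int> &dst,
                       std::size_t groupSize) {
    const std::size_t pixelCount = src.size();
    const std::size_t globalSize = roundUpToMultiple(pixelCount, groupSize);
    std::size_t skipped = 0;

    for (std::size_t gid = 0; gid < globalSize; ++gid) {
        // In a kernel this would be: if (gid >= pixelCount) return;
        if (gid >= pixelCount || gid >= dst.size()) {
            ++skipped;
            continue;
        }
        dst[gid] = src[gid];
    }
    return skipped;
}
```

For a 1000-pixel buffer and a group size of 64 the padded launch covers 1024 ids, and the guard quietly drops the extra 24 instead of letting them scribble past the end.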

In summary: I thought this was a pretty clever workaround to the problem of spotty Image2D support. Do you have a solution that you have used in your implementations? I’d love to hear about it. Drop a comment below or feel free to shoot me an email: admin at this website.

Thanks for reading!

Creating a Ruler Widget in Qt

In the latest PhotoMonkee build I cooked up some rulers to serve as a handy reference point when editing items on the canvas. I’m pretty happy with how they turned out. The feature took a little longer than planned due to some struggles with how I wanted to handle zooming. In the end I took a shortcut (violated some OOP “rules”) to quickly bypass the issues I was hitting.

So what do the new rulers look like?

Well, like rulers! In the above picture you’ll see both a horizontal and vertical ruler extending along the edge of the window. A blue hash mark shows the current position of the cursor. I went with the “quarters” type of ruler, whereby three quarter hash marks are drawn in between each pair of major hash marks.

Scrolling the canvas (when zoomed in far enough to allow for scrolling) will force a redraw of the ruler to update the coordinates as needed. In addition, zooming the canvas will also force an update to the rulers so that they can account for the new scaling that is required.

Future enhancements I’ve got planned for the rulers include:

  • Highlighting an area on the ruler when a selection is made via the marquee tool
  • Allowing the user to select the unit type used in the ruler
  • Allowing the user to turn the ruler on or off (it’s always on right now)
  • Creating guides by dragging them “out” of the rulers (not really a ruler feature but it’ll be involved)

The Design

As with any Qt widget, the ruler widget (PMRuler was the class name I used) will be based off of the QWidget class. Rather than break out the horizontal and vertical drawing logic into separate classes I just created an orientation enum and tossed some conditionals into the drawing routines.

Here is the header file for PMRuler:

#ifndef PMRULER_H
#define PMRULER_H

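#include <QWidget>
#include <QPixmap>   // m_CachePixmap is stored by value

class QTimer;        // only used through a pointer
class QPainter;      // only used by reference in DrawHash()
class QLine;         // only used by reference in DrawHash()
namespace Ui { class PMRuler; }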

class PMRuler : public QWidget
{
    Q_OBJECT

public:
    enum Orientation {
        Horizontal,
        Vertical
    };

    PMRuler(PMRuler::Orientation orientation, QWidget *parent = 0);
    ~PMRuler();


    QSize sizeHint() const;
    QSize minimumSizeHint() const;
    void paintEvent(QPaintEvent* pEvent);
    void DrawHash(QPainter &painter, QLine &hashLine, int iHashType, int iMinorInterval);

    void ZoomUpdate();
private slots:
    void MousePosUpdate();

private:
    Ui::PMRuler *ui;
    PMRuler::Orientation m_Orientation;
    QSize                m_Size;
    QPoint               m_MousePos;   
    QTimer               *m_pTimer;
    QPixmap              m_CachePixmap;

};

#endif // PMRULER_H

What’s The Timer For?

Rather than have the ruler connect to another widget or have the parent window notify the ruler when the mouse moved I opted for polling the mouse using QCursor::pos(). Right now I’ve got it set on a 50 millisecond polling interval.

Widget positioning

I found that using a QGridLayout in the parent window allowed for easy positioning of the ruler widgets and the QScrollArea (which houses the canvas).

// Set the layout to make sure that the scroll area takes up the entire window's client area.
    QGridLayout *gridLayout = new QGridLayout();
    gridLayout->setSpacing(0);
    gridLayout->setContentsMargins(0, 0, 0, 0);   // setMargin() is obsolete in newer Qt versions

    QLabel *pLabel = new QLabel("px");
    pLabel->setBackgroundRole(QPalette::Window);
    pLabel->setStyleSheet("color: rgb(255, 255, 255);");
    pLabel->setAlignment(Qt::AlignCenter);
    pLabel->setFixedSize(24, 24);
   
    m_pHorizRuler = new PMRuler(PMRuler::Horizontal, this);
    m_pHorizRuler->show();     

    m_pVertRuler = new PMRuler(PMRuler::Vertical, this);
    m_pVertRuler->show();  

    gridLayout->addWidget(pLabel, 0, 0);
    gridLayout->addWidget(m_pHorizRuler, 0, 1);
    gridLayout->addWidget(m_pVertRuler, 1, 0);
    gridLayout->addWidget(m_ScrollArea, 1, 1); 

    this->setLayout(gridLayout);

Drawing the Ruler

First rule of drawing rulers: cache drawing the ruler as often as possible. Right now we’re drawing major hash marks, quarter hash marks and numbers on the major hash lines. If we add in eighth or sixteenth hash marks things could get a little more expensive. Even without the other hash types, drawing everything each time the mouse moves (to update the blue mouse position indicator) is a huge waste of resources. We only redraw the ruler to a QPixmap cache when the zoom factor is updated or a scroll bar is dragged.

The painting handler

void PMRuler::paintEvent(QPaintEvent *) {
    QPainter painter(this);
    painter.drawPixmap(0, 0, m_CachePixmap);

    QLine mousePosLine;
   
    // Draw the blue hash mark for the current mouse position
    if(m_Orientation == Vertical)
        mousePosLine = QLine(0, m_MousePos.y(), 24, m_MousePos.y());               
    else
        mousePosLine = QLine(m_MousePos.x(), 0, m_MousePos.x(), 24);   
   
    painter.setPen(Qt::blue);
    painter.drawLine(mousePosLine);
   
}//end of PMRuler::paintEvent()

So what is the actual algorithm to draw the ruler? Well, I break it down into two passes: drawing from 0 to the end of the ruler and from 0 back to the start of the ruler. Each loop simply calls a function to perform drawing where it’s appropriate. The iMinorInterval variable determines the spacing, in real screen coordinates, between hash marks. This is currently set to 25 pixels which means every 100 pixels we get a major hash mark. The iHashType variable is simply a counter. Inside DrawHash() we do an “iHashType % 4” check to see if we’re on a major or minor hash.

        // Draw the hashes from canvas(0,0) to the end of the ruler
    for(int x = iStart; x < iEnd; x+= iMinorInterval) {                            

        DrawHash(painter, hashLine, iHashType, iMinorInterval);    
        iHashType++;
    }

The hashLine variable (a QLine object) is the current line we’re drawing. It’s a reference so DrawHash() will actually move the line to the next location for the next iteration (woops, did I just violate another OOP rule?)
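Stripped of the painting, the forward pass boils down to a counter and a modulus. Here is a standalone C++ sketch of that logic with no Qt involved. The Tick struct and layoutTicks function are hypothetical names for this example. Unlike DrawHash(), which moves a QLine along by reference, this version just returns the positions so the logic can be checked on its own.

```cpp
#include <cassert>
#include <vector>

// One hash mark on the ruler: where it sits and whether it is a major mark.
struct Tick {
    int position;
    bool isMajor;
};

// Walk from start to end in steps of minorInterval. Every fourth mark
// (counter % 4 == 0) is a major hash; the three in between are quarter marks.
std::vector<Tick> layoutTicks(int start, int end, int minorInterval) {
    assert(minorInterval > 0);

    std::vector<Tick> ticks;
    int hashType = 0;  // plain counter, same role as iHashType in the post
    for (int x = start; x < end; x += minorInterval) {
        ticks.push_back(Tick{x, hashType % 4 == 0});
        ++hashType;
    }
    return ticks;
}
```

With a 25 pixel minor interval, layoutTicks(0, 300, 25) yields twelve marks with the major ones landing at 0, 100 and 200.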

Coordinate Conversion

During the initial implementation I tried to pass in the scale factor to the PMRuler object and have it convert from actual pixel coordinate space to canvas coordinate space. After a couple of nights of frustration I just punted. The ghetto hack? Have the PMRuler object ask its parent to convert a coordinate from actual pixel space to canvas coordinate space.

You see, when a horizontal PMRuler draws its first major hash mark it is at (100,0). If the canvas is currently set to 100% zoom then we are ALSO at (100,0) on the canvas. If the canvas is currently zoomed in to 110% then each canvas pixel covers 1.1 screen pixels, so (100,0) on the ruler is actually about (91,0) on the canvas, and canvas (100,0) sits at (110,0) on screen.

To perform the conversion I convert the PMRuler’s local coordinates to global coordinates using mapToGlobal(). Then I take that global coordinate and map it to the QGraphicsView coordinate system using mapFromGlobal(). Lastly we go from widget space to canvas space by calling mapToScene().
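For the simple case of a uniform zoom and a scroll offset, the arithmetic that the mapping calls end up doing can be sketched in a few lines of plain C++. This is only an illustration under stated assumptions: zoom is taken to mean screen pixels per canvas pixel (1.0 is 100%), scrollOffset is how far the view has been scrolled in screen pixels, and rulerToCanvas / canvasToRuler are hypothetical helper names. The real Qt calls also handle rotation and other transforms that this ignores.

```cpp
#include <cassert>

// Convert a position along the ruler (screen pixels) into canvas coordinates.
// zoom: screen pixels per canvas pixel (assumption for this sketch).
// scrollOffset: how far the canvas has been scrolled, in screen pixels.
double rulerToCanvas(double rulerPos, double scrollOffset, double zoom) {
    assert(zoom > 0.0);
    return (rulerPos + scrollOffset) / zoom;
}

// Inverse mapping: where on the ruler does a given canvas coordinate land?
double canvasToRuler(double canvasPos, double scrollOffset, double zoom) {
    assert(zoom > 0.0);
    return canvasPos * zoom - scrollOffset;
}
```

At 200% zoom with no scrolling, ruler position 100 maps to canvas coordinate 50, and scrolling 20 screen pixels to the right shifts that to 60.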

Gotchas

Along the way I hit a couple of issues that took some time to resolve. The first was the failure to re-implement the sizeHint() and minimumSizeHint() functions in the PMRuler class. That means when Qt was trying to figure out the appropriate size for the rulers it would end up calling the default QWidget implementation. Who knows what that said the size should be…it definitely wasn’t right!

Readability of the code was also an issue that I took some time to address. Initially I had two methods in PMRuler to perform the drawing. One method was for drawing “forward” (zero to end of ruler) and the other was “backwards” (zero to the start of ruler). I’m glad I took the time to clean things up while writing this feature. There are other sections of PhotoMonkee that have very messy code and I’m not looking forward to reacquainting myself with them :)

Conclusion

I’m pretty happy with how the control turned out. It serves as a great basis for some future enhancements and helps PhotoMonkee look a little more like a “real” image editing program. Granted, breaking away from the traditional mold of image editing wouldn’t be a bad thing.

If there are more features you’d like me to write about please drop a comment below or contact me directly: admin at photomonkee dot you know what! Until next time….stay thirsty my friends.

 

OpenCL Hard at Work: The Color Substitution Filter

I’ve been asked by a few people “Why do you require OpenCL? Photoshop doesn’t need it and they sell a hojillion copies!”. My response is twofold:

1) I want to push the envelope and incorporate the latest technologies.
2) I don’t need to sell a hojillion copies…just enough to pay the mortgage!

Before continuing, here’s a quick video of the color substitution filter in action (pop the player out to 1080p for the best viewing).

The concept of image filtering is simple: given a source image, apply some type of algorithm to the image which yields a modified version. The filter can be as simple as inverting the colors or something a little more taxing such as a multi-pass convolution filter. Regardless of how simple or complex the filter is, it must be run on every single pixel of the image.

Given a 3000×2000 image (common resolution from a compact digital camera) that means our filter must be run 6 million times to produce the output image.

Some filters have parameters that the user can tweak to vary the intensity of the effect. Using PhotoMonkee’s Color Substitution Filter as an example, the user can spin the color wheel to change the mapping of source and destination hues. Each time the user modifies a parameter we’ve got to reapply the filter to all 6 million of our pixels.

Here’s where things get sticky.

The computer is going to be awfully busy trying to update 6 million pixels when the application asks to redraw the UI because the user moved a slider or clicked a button. Best case? The application spins the filter off to a separate thread so the UI doesn’t hang but you’re still going to end up with a very slow rendering of your image.

What about a preview?

Great idea! Instead of trying to update 6 million pixels we can scale down the image to 256×256 and just update that scaled down preview until the user clicks “accept”. At that point we can give it the full monty. That sucks! I want to see my beautiful 3000×2000 image updated on my 24″ monitor as I tweak the filter. Unfortunately any application that only makes use of the CPU is bound to this slow fate.

How does OpenCL fix this?

OpenCL leverages a concept called heterogeneous computing. In short, it can farm out work to all of your CPU cores and all of the processing units on your graphics card (GPU). In particular, those GPU processing units are really good at crunching image data. An HD resolution image (1920×1080) contains 2,073,600 pixels and is a walk in the park, even on older GPUs.

OpenCL compatibility issues

OpenCL is still a relatively young API at only 3.5 years old. The initial driver implementations by nVidia and ATI were clunky and are only now getting stable. In addition, Intel is relatively new to the game with their driver offering for CPU-only acceleration. Windows doesn’t yet ship with an OpenCL driver, and until it does OpenCL isn’t quite mainstream. Think back to the days when OpenGL didn’t come stock with your OS.

Going Forward

Even given the compatibility issues I think that OpenCL is here to stay and is laying a solid foundation for high performance applications in the future. Just this week I started coding up the image filter plugins so 3rd party developers can add filters to PhotoMonkee. Those plugins will all have the ability to leverage OpenCL for accelerated processing.

Interested in getting involved with PhotoMonkee plugin development? Send an email to “support@photomonkee.awesome” =~ s/awesome/com/; and we’ll let you know when the API becomes available. Thanks for your interest!

Edge Detection in OpenCL

Our Raving Buddy as the Source

One of the OpenCL based filters I wanted to add before the next build was simple edge detection. The first step was to find a suitable algorithm that would scale nicely with OpenCL. There are many algorithms out there and I’ll probably implement more (Sobel and Canny come to mind) but this first implementation is based on Christian Graus’s Difference Edge Detection algorithm. The C/C++ code is fairly straightforward and requires a double for loop with some very basic subtraction and comparison operations in the core of the loop.
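For reference, here is roughly what such a double loop looks like in plain C++. This is my own minimal sketch that mirrors the comparisons used in the kernel in this post, not Christian Graus’s original code. It works on a single channel (one int per pixel) to keep things short and, like the original algorithm, it skips the border pixels. The function name differenceEdges is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Difference edge detection on a single-channel image stored row by row.
// Each interior pixel becomes the largest absolute difference between the
// four pairs of opposing neighbors. Border pixels are left at 0.
std::vector<int> differenceEdges(const std::vector<int> &src, int width, int height) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    std::vector<int> dst(src.size(), 0);
    auto at = [&](int px, int py) {
        return src[static_cast<std::size_t>(py * width + px)];
    };

    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            // Top right vs. bottom left
            int best = std::abs(at(x + 1, y - 1) - at(x - 1, y + 1));
            // Bottom right vs. top left
            best = std::max(best, std::abs(at(x + 1, y + 1) - at(x - 1, y - 1)));
            // Top vs. bottom
            best = std::max(best, std::abs(at(x, y - 1) - at(x, y + 1)));
            // Right vs. left
            best = std::max(best, std::abs(at(x + 1, y) - at(x - 1, y)));

            dst[static_cast<std::size_t>(y * width + x)] = best;
        }
    }
    return dst;
}
```

A flat image comes back all zeros, while a hard vertical step between black and white columns lights up the pixels along the step at full intensity.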

The OpenCL version of this algorithm isn’t too much more complex than its native code predecessor. The bulk of the extra work lies in figuring out the adjacent pixels and I also added wrapping to handle pixels on the edges¹. We’ll start by showing the complete listing of the OpenCL kernel then break down each component with an in-depth overview.

Complete listing for edgeDetect.cl

__attribute__ ((always_inline)) int4 GetColorInt4(int srcColor) {

int4 color;

color.x = (srcColor >> 16) & 0xff;
color.y = (srcColor >> 8) & 0xff;
color.z = (srcColor & 0xff);
color.w = (srcColor >> 24) & 0xff;

return color;
}

__attribute__ ((always_inline)) int ColorToInt(int4 srcColor) {

return ((srcColor.w & 0xff) << 24) |
((srcColor.x & 0xff) << 16) |
((srcColor.y & 0xff) << 8) |
(srcColor.z & 0xff);
}

__kernel void difference(__global int *srcImage,
__global int *destImage,
int iImageStride,
int iImageHeight) {

const int gid = get_global_id(0);
const int x = gid % iImageStride;
const int y = gid / iImageStride;
int4 srcColor = GetColorInt4(srcImage[gid]);

uint4 destColor; // The destination color
uint4 tmpColor; // Used to hold the results of a computation before maximum comparison.
// It's a uint because abs() returns uint's.

int2 pixelCoords[8]; // Holds neighboring pixel coordinates
int4 pixels[8]; // The pixels around us: 0=TopLeft, 1=Top, 2=TopRight,
// 3=Left, 4=Right
// 5=BottomLeft, 6=Bottom, 7=BottomRight

// Gather up all the pixel coordinate values we'll use (we'll do wrapping next)
pixelCoords[0].xy = (int2){x - 1, y - 1};
pixelCoords[1].xy = (int2){x , y - 1};
pixelCoords[2].xy = (int2){x + 1, y - 1};
pixelCoords[3].xy = (int2){x - 1, y};
pixelCoords[4].xy = (int2){x + 1, y};
pixelCoords[5].xy = (int2){x - 1, y + 1};
pixelCoords[6].xy = (int2){x , y + 1};
pixelCoords[7].xy = (int2){x + 1, y + 1};

// Wrap the X-coordinates if the source pixel is on the left or right edge
if(x == 0) {

// Wrap the left side
pixelCoords[0].x = iImageStride - 1;
pixelCoords[3].x = iImageStride - 1;
pixelCoords[5].x = iImageStride - 1;

} else if(x == (iImageStride - 1)) {

// Wrap the right side
pixelCoords[2].x = 0;
pixelCoords[4].x = 0;
pixelCoords[7].x = 0;
}

// Wrap the Y-coordinates if the source pixel is on the top or bottom edge
if(y == 0) {

// Wrap the top
pixelCoords[0].y = iImageHeight - 1;
pixelCoords[1].y = iImageHeight - 1;
pixelCoords[2].y = iImageHeight - 1;

} else if(y == (iImageHeight - 1)) {

// Wrap the bottom
pixelCoords[5].y = 0;
pixelCoords[6].y = 0;
pixelCoords[7].y = 0;
}

// Retrieve all of the pixel values.
pixels[0] = GetColorInt4(srcImage[(pixelCoords[0].y * iImageStride) + pixelCoords[0].x]);
pixels[1] = GetColorInt4(srcImage[(pixelCoords[1].y * iImageStride) + pixelCoords[1].x]);
pixels[2] = GetColorInt4(srcImage[(pixelCoords[2].y * iImageStride) + pixelCoords[2].x]);
pixels[3] = GetColorInt4(srcImage[(pixelCoords[3].y * iImageStride) + pixelCoords[3].x]);
pixels[4] = GetColorInt4(srcImage[(pixelCoords[4].y * iImageStride) + pixelCoords[4].x]);
pixels[5] = GetColorInt4(srcImage[(pixelCoords[5].y * iImageStride) + pixelCoords[5].x]);
pixels[6] = GetColorInt4(srcImage[(pixelCoords[6].y * iImageStride) + pixelCoords[6].x]);
pixels[7] = GetColorInt4(srcImage[(pixelCoords[7].y * iImageStride) + pixelCoords[7].x]);

// Pre-set our maximum color to be Top Right - Bottom Left
destColor = abs(pixels[2] - pixels[5]);

// Bottom Right - Top Left
tmpColor = abs(pixels[7] - pixels[0]);
destColor = max(tmpColor, destColor);

// Top - Bottom
tmpColor = abs(pixels[1] - pixels[6]);
destColor = max(tmpColor, destColor);

// Right - Left
tmpColor = abs(pixels[4] - pixels[3]);
destColor = max(tmpColor, destColor);

// Stuff the pixel back in there
int4 finalColor = {destColor.x, destColor.y, destColor.z, srcColor.w};

destImage[gid] = ColorToInt(finalColor);

}//end of difference()

Where’s image2d_t and why are you using int * ?

Not every device has to support image objects. Sure, there’s some extra performance gain to be had by using them but while OpenCL is still in its infancy I’m shooting for maximum compatibility across all vendors and platforms. In lieu of using image2d_t we pass in good ole 32-bit pixels. PhotoMonkee uses 32 bit RGBA internally so that’s what gets passed down into all OpenCL kernels. The GetColorInt4() and ColorToInt() functions perform all of the necessary bit shifting that allows us to work with each color component. We opted for using ints to allow some overflow when performing computations. We always call clamp(val, 0, 255) before passing the int4 type back into ColorToInt(). (This particular kernel can skip that step because the absolute difference of two values between 0 and 255 is already in range.)

Where in the image…is Carmen Sandiego?

The global ID is our 1D index into the image. We need to extrapolate our X,Y coordinate so that we can find our neighbors. Using the modulus and division operators that’s no problem. While we’re at it, we’ll snag the current pixel.

const int gid = get_global_id(0);
const int x = gid % iImageStride;
const int y = gid / iImageStride;
int4 srcColor = GetColorInt4(srcImage[gid]);

Example: If we’re at index 200 of a 64×64 image: 200 % 64 = 8 = X and 200 / 64 = 3 = Y. Welcome to (8, 3), enjoy your stay!
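The same mapping, plus its inverse, as a tiny standalone C++ sketch. PixelCoord, indexToCoord and coordToIndex are names invented for this example; the stride here is assumed to be measured in pixels, as it is in the kernel.

```cpp
#include <cassert>

// A 2D pixel position (hypothetical helper type for this sketch).
struct PixelCoord {
    int x;
    int y;
};

// Turn a 1D buffer index into an (x, y) position. stride = pixels per row.
PixelCoord indexToCoord(int index, int stride) {
    assert(stride > 0 && index >= 0);
    return PixelCoord{index % stride, index / stride};
}

// And back again: row offset plus column.
int coordToIndex(PixelCoord coord, int stride) {
    assert(stride > 0);
    return coord.y * stride + coord.x;
}
```

Round-tripping index 200 through indexToCoord and coordToIndex with a stride of 64 gives (8, 3) and then 200 again.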

Welcome to the neighborhood, neighbor!

Now it’s time to find our neighboring pixels. To start we set up 8 coordinates with the appropriate offsets from our current X,Y position. Example: The pixel to our top left is (X - 1, Y - 1).

pixelCoords[0].xy = (int2){x - 1, y - 1};
pixelCoords[1].xy = (int2){x , y - 1};
pixelCoords[2].xy = (int2){x + 1, y - 1};
pixelCoords[3].xy = (int2){x - 1, y};
pixelCoords[4].xy = (int2){x + 1, y};
pixelCoords[5].xy = (int2){x - 1, y + 1};
pixelCoords[6].xy = (int2){x , y + 1};
pixelCoords[7].xy = (int2){x + 1, y + 1};

Astute readers will correctly inquire what happens if we’re at 0,0: don’t worry, we’ve got you covered! If we detect our current pixel is on an edge then we “reach” around the image (wrapping in graphics parlance) and snag the pixel from the opposite side of the image. For example, if we are at 0,0 in an image that’s 64×64 then the top-left neighbor will be 63,63.

// Wrap the X-coordinates if the source pixel is on the left or right edge
if(x == 0) {

// Wrap the left side
pixelCoords[0].x = iImageStride - 1;
pixelCoords[3].x = iImageStride - 1;
pixelCoords[5].x = iImageStride - 1;
} else if(x == (iImageStride - 1)) {

// Wrap the right side
pixelCoords[2].x = 0;
pixelCoords[4].x = 0;
pixelCoords[7].x = 0;
}

// Wrap the Y-coordinates if the source pixel is on the top or bottom edge
if(y == 0) {

// Wrap the top
pixelCoords[0].y = iImageHeight - 1;
pixelCoords[1].y = iImageHeight - 1;
pixelCoords[2].y = iImageHeight - 1;

} else if(y == (iImageHeight - 1)) {

// Wrap the bottom
pixelCoords[5].y = 0;
pixelCoords[6].y = 0;
pixelCoords[7].y = 0;
}
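The if/else chains above only ever need to handle one pixel of overhang. An alternative that is not what the kernel does, but that some people find easier to reason about, is a single modulo-based helper that wraps any coordinate. Here is a plain C++ sketch of that idea; wrapCoord and wrappedIndex are hypothetical names. Whether this is faster or slower than branching on a given device is something you would have to measure.

```cpp
#include <cassert>

// Wrap a coordinate into [0, size). The double modulo keeps the result
// non-negative even when v is negative (C++ '%' can return negatives).
int wrapCoord(int v, int size) {
    assert(size > 0);
    return ((v % size) + size) % size;
}

// Buffer index of the (possibly out-of-range) pixel at (x, y) after wrapping.
int wrappedIndex(int x, int y, int stride, int height) {
    return wrapCoord(y, height) * stride + wrapCoord(x, stride);
}
```

For a 64×64 image, the top-left neighbor of pixel (0,0) is requested as (-1,-1) and resolves to (63,63), matching the example above.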

Now that we have the coordinates for our neighbors...GO FETCH!

// Retrieve all of the pixel values.
pixels[0] = GetColorInt4(srcImage[(pixelCoords[0].y * iImageStride) + pixelCoords[0].x]);
pixels[1] = GetColorInt4(srcImage[(pixelCoords[1].y * iImageStride) + pixelCoords[1].x]);
pixels[2] = GetColorInt4(srcImage[(pixelCoords[2].y * iImageStride) + pixelCoords[2].x]);
pixels[3] = GetColorInt4(srcImage[(pixelCoords[3].y * iImageStride) + pixelCoords[3].x]);
pixels[4] = GetColorInt4(srcImage[(pixelCoords[4].y * iImageStride) + pixelCoords[4].x]);
pixels[5] = GetColorInt4(srcImage[(pixelCoords[5].y * iImageStride) + pixelCoords[5].x]);
pixels[6] = GetColorInt4(srcImage[(pixelCoords[6].y * iImageStride) + pixelCoords[6].x]);
pixels[7] = GetColorInt4(srcImage[(pixelCoords[7].y * iImageStride) + pixelCoords[7].x]);

Ssshhh. This is where the magic happens.

Now that we’ve got the neighboring pixels it’s time to perform the actual edge detection algorithm. We’ll subtract the corresponding pixel pairs, comparing the result against the current maximum value. We’re making use of vector components by not specifying any at all. That means we’re actually performing the Red, Green, Blue and Alpha (which we don’t really care about) operations in tandem. Hopefully the OpenCL compiler is working its magic and applying some sweet SIMD action in there for us.

// Pre-set our maximum color to be Top Right - Bottom Left
destColor = abs(pixels[2] - pixels[5]);
// Bottom Right - Top Left
tmpColor = abs(pixels[7] - pixels[0]);
destColor = max(tmpColor, destColor);
// Top - Bottom
tmpColor = abs(pixels[1] - pixels[6]);
destColor = max(tmpColor, destColor);
// Right - Left
tmpColor = abs(pixels[4] - pixels[3]);
destColor = max(tmpColor, destColor);

Pack it in, boys.

The last thing we do is re-apply the original alpha value, convert the int4 back into an int and stuff it into the output buffer (the result image).

int4 finalColor = {destColor.x, destColor.y, destColor.z, srcColor.w};
destImage[gid] = ColorToInt(finalColor);

The Result

After all that hard work we’re left with a pretty good outline of the original source image. In my testing I saw better than a 2X improvement when comparing the native CPU implementation against OpenCL running ONLY on the CPU: the native algorithm ran in ~400ms with OpenCL (CPU device) clocking in at 170ms. A solid showing even without the GPU on duty. OpenCL using a GPU device clocked in at 10ms. 400ms down to 10ms? OpenCL’s future is looking pretty damn promising.

Conclusion

The native C++ implementation of this took about a half hour to code up while the OpenCL version took roughly two hours (probably about an hour was spent on a logic bug on my part). Coding up image filters using OpenCL is super easy. For filters that require user interaction (i.e. tweakable parameters) the ability to modify a high-definition image in under 50 milliseconds is simply incredible. We’ve got a color substitution filter (video demo coming soon) that is nothing short of amazing when you see it live.

Thanks for reading and be sure to follow us on Twitter and hit up our Facebook Fanpage. Until next time – viva le technology!

¹The original algorithm simply skipped over the edge pixels by confining the X and Y axes to (1 to width-1) and (1 to height-1).

 

Enter once…return once.

Back in CS101 (or was it CS201?) I was taught that if you plan on returning a value from a function you do it one place, at the end. 12 years have elapsed since that course and I’ve become a bit ignorant of that sage piece of coding advice. My ignorance finally came back to bite me in the ass this weekend. I couldn’t figure out why PhotoMonkee was crashing while working on the newly revamped free transformation tool. Behold…the offending function:

QCursor MoveTool::GetCursor() {

    switch(m_eState) {
    case None:
        return QCursor(Qt::ArrowCursor);
    case None_Moving_Cursor:
        return QCursor(Qt::SizeAllCursor);
    case OverHandle:
        return QCursor(Qt::OpenHandCursor);
    }
}

Two problems

One – no default statement. Woops…must have spaced that one. Two – there is no return statement at the end of the function! If the current state of the move tool is MoveTool::Resize then we fall off the end of the function and return…whatever garbage happens to be lying around (in C++ that’s undefined behavior). Yeah, that’s bad. GCC would have bitched at me for this one but for whatever reason, Visual Studio let it slide. I’m sure there’s probably a warning somewhere in my compile logs (there are quite a few I admit I need to clean up) but this seems like a disaster waiting to happen.
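Here is a sketch of what the “return once” version could look like. To keep it compilable without Qt it returns a cursor name as a string instead of a QCursor, and the ToolState enum is a stand-in for the real MoveTool states. Falling back to the arrow cursor for Resize (and anything else unhandled) is my assumption for the example, not necessarily the right cursor for that state.

```cpp
#include <string>

// Stand-in for the MoveTool state enum (names adapted for this sketch).
enum class ToolState { None, NoneMovingCursor, OverHandle, Resize };

// Single exit point: every path assigns to 'cursor' and the function
// returns exactly once, at the end.
std::string cursorNameFor(ToolState state) {
    std::string cursor = "ArrowCursor";  // safe fallback (assumption)

    switch (state) {
    case ToolState::None:
        cursor = "ArrowCursor";
        break;
    case ToolState::NoneMovingCursor:
        cursor = "SizeAllCursor";
        break;
    case ToolState::OverHandle:
        cursor = "OpenHandCursor";
        break;
    default:
        // Resize and any state added later land here instead of
        // running off the end of the function.
        break;
    }

    return cursor;
}
```

The default branch plus the single return means a newly added state can give you the wrong cursor, but never an uninitialized one.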

So, young coding adventurer, heed the words of your instructor. He or she probably had a good reason for the twenty minute diatribe about why they…wah wah waaaahh wah wah wahhh wah wah.

Let the solution come to you

I spend a full day at the office cranking away at my day job and then spend my evenings writing PhotoMonkee features. I’ll admit that I don’t cook up my best stuff during the hours when I’d rather be reading, vegging out playing video games or some other leisure activity. It’s during those ‘tired’ times when problem solving gets difficult and putting forth more effort is all for nothing. Chances are I’ll spend a solid hour fumbling around the problem and never really come up with a creative solution. Instinct and a desire to solve the problem say you should pour a cup of coffee and really buckle down. I say that’s when you pack it in. Yeah, that’s right: give up!

Chances are, if I get a good night’s rest, before I arrive at the office I’ll already have a potential solution come to mind without really “focusing” on the problem like I would sitting in front of my computer. Sometimes, trying harder just isn’t the answer.

What was the problem I was working on? Glad you asked!

PhotoMonkee uses Qt’s QGraphicsScene and QGraphicsView classes to manage drawing operations of the canvas. The handles used to transform the current selection “live” within the same scene. So if you want to click a handle and drag it outside of the canvas (i.e. stretch it a lot) you aren’t allowed because we can’t draw the handle.

My first idea (that I was pounding on last night) was to draw the resize item and its handles as custom QWidget items that were children of the current window. The major gotcha there? You can’t rotate windows. Back to square one.

The solution that came to mind this morning? Create an invisible “overlay canvas” that encompasses the entire window. Since the overlay scene covers the entire window we’ll be able to drag the handles anywhere within the confines of the window.

So, the next time you’re stuck on a problem, try ‘backgrounding’ your brain and let a solution come to you!

Launched

Call it a “soft launch” but it’s a launch nonetheless. The website is now live, along with the store. An announcement was made on our Facebook fanpage as well as to my friends and family (feel free to spread the word if you’re reading this).

We’re selling an alpha version (build 0.12a) of PhotoMonkee for eight bucks (buyers are getting all of 1.x for $8) with the hopes of drawing in a few folks to give some solid feedback and help flesh out the last major bugs.

So, the easy part is done: writing tons of code and getting stuff to work on my development machine and our family computers. Now comes the hard part – fixing bugs and adding support for everyone else’s configurations!

It’s crazy to imagine that just about seven months ago I started learning a new GUI framework just out of boredom from my day job. It’s funny what you can do with a little time, effort and boatloads of support from your family.

Viva le e-commerce!

Skill Set

Twelve years ago when I graduated from college coding was fun. I thought my passion for coding was enough to get a business up and running and in 2003 I started a game development company. A short 8 months later we closed up shop and I was left wondering what the hell just happened. I pissed away my “dot com winnings”, put my family in quite a financial pickle and was really questioning my judgement.

It wasn’t lack of passion that was the problem – it was lack of experience and guidance. Let me qualify that last statement. By “experience” I don’t mean a corporate cube farm but rather the process of taking a project from inception to completion. Sometimes you make decisions to cut corners, skip functionality, enhance specific areas and make other “on the fly” judgement calls that only come from working on real-world projects.

In my professional career I’ve authored roughly a half dozen major projects from inception and each time gained valuable experience to improve my own personal process for the next iteration. PhotoMonkee has been a very interesting experience on that front. I’ve been able to, very quickly, learn new algorithms, turn out new features and cycle through bugs. Most recently I completely changed the development environment from Qt Creator to Visual Studio in a couple of hours. No small feat, and without years of Windows software development under my belt that quick turnaround would not have been possible.

Now, I’m not advocating that everyone spend 12 years in a cube farm before finding a personal pet project and striking out on their own. However, if that’s what it takes for you to perfect your own development cycle then so be it. If coding new features and the “debug cycle” don’t come easily to you then it’s worth taking a step back to think about your skill set before venturing out on your own.

Custom Dialog Boxes in Qt

PhotoMonkee is built using the (wonderful) Qt UI framework. Something that’s been on my TODO list for quite some time has been to customize the look of all the dialog boxes in the application. Qt allows an application to completely customize the look of any window control but doing a dialog is a bit tricky. The first thing I needed to do in order to fully customize the dialog box was to drop the entire window frame that the window manager encases the dialog in. That includes the title bar, which is a problem: one of the main features of a dialog box is the ability for the user to move it around on the screen by clicking on the title bar and dragging it.

Enter: Subclassing. I created a new class, PMDialog that is derived from the QDialog class. In it, I create a QLabel that always exists at the top of the dialog (32 pixels tall). It contains the nice gradient you see as well as the title of the dialog box. An event filter handles the magic of capturing the mouse when the user clicks on the header and drags the dialog around. Seriously…Qt has hooks for everything!

Another sweet feature is the ability to set a mask on any control so that it displays as a non-rectangular shape. In the case of PhotoMonkee’s dialog boxes I’m cooking up a 1-bit QBitmap with QPainter::drawRoundedRect() and applying the result to the dialog as its mask. Voila…curved edges.

After we get out of alpha I’d like to come back and animate the display and closure of dialog boxes. Now that all the functionality is wrapped up into a single base class it’s an “implement once and enjoy everywhere” feature!

Viva le Qt!

Histogram? More like Fantastigram!

One of the todo items for the road to beta is an automatic image level adjuster. Put simply, it’s the easy button for histogram manipulation. Advanced graphics nerds know how to use curves to modify an image’s histogram. Everyone else just hits “auto adjust” and BOOYA…the image magically looks better (sometimes).

For the uninitiated, histograms are graphs that show how colors are distributed throughout an image. Each channel of an 8-bit color space has 256 possible values (0 through 255). Knowing that, we design our histogram to have 256 possible x-axis values. What we do is inspect each pixel’s color, and whatever value it is (from 0 to 255) we increment that point in the graph by one unit.
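The counting step can be sketched on the CPU in a few lines of C++. This is only an illustration, not PhotoMonkee’s actual OpenCL kernel: the RgbHistogram struct and buildRgbHistogram() are made-up names, and the pixels are assumed to be packed as 0xAARRGGBB, the same layout GetColorInt4() unpacks. Values 0 through 255 make 256 distinct buckets per channel.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical container: one 256-bucket histogram per color channel.
struct RgbHistogram {
    std::array<std::uint32_t, 256> red{};
    std::array<std::uint32_t, 256> green{};
    std::array<std::uint32_t, 256> blue{};
};

// Assumption: each pixel is packed as 0xAARRGGBB.
RgbHistogram buildRgbHistogram(const std::vector<std::uint32_t>& pixels) {
    RgbHistogram h;
    for (std::uint32_t p : pixels) {
        // Each channel value (0..255) is the index of the bucket to bump.
        ++h.red[(p >> 16) & 0xff];
        ++h.green[(p >> 8) & 0xff];
        ++h.blue[p & 0xff];
    }
    return h;
}
```

Plotting each array with the bucket index on the x-axis and the count on the y-axis gives the kind of graph described above.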

In the upper right hand corner you’ll see the newly minted histogram window for PhotoMonkee. The peaks on the left side of the graph show that there are a lot of red, green and blue pixels with values between 0 and 75 with not many in the middle to higher range. RGB values (that’s short for red, green and blue in the nerd circles) are cool to see on a graph but what we’re really after is a completely different color space called HSL or Hue, Saturation and Lightness.
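For reference, the Lightness part of HSL is the cheap bit: by the standard definition it is the midpoint between the largest and smallest of the three RGB components. A minimal C++ sketch, assuming 0xAARRGGBB pixels and keeping the result on the same 0 to 255 scale (lightnessOf() is a hypothetical helper name, not something from the PhotoMonkee source):

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: pixel is packed as 0xAARRGGBB; alpha is ignored.
std::uint8_t lightnessOf(std::uint32_t argb) {
    int r = (argb >> 16) & 0xff;
    int g = (argb >> 8) & 0xff;
    int b = argb & 0xff;

    int maxC = std::max({r, g, b});
    int minC = std::min({r, g, b});

    // HSL lightness: midpoint of the strongest and weakest channel
    // (integer division, so the result is rounded down).
    return static_cast<std::uint8_t>((maxC + minC) / 2);
}
```

Hue and Saturation need divisions and a few branches per pixel, which is where most of the conversion cost comes from.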

Here’s the histogram in HSL space for the cork image on the left. On the right is the image of just the Lightness parameter.

  

What we want to do is get the lightness component to use more of its available spectrum. You can see on the very right edge of the graph there are no values being used. That means we can “stretch” the graph out to use the entire spread of values. After “stretching” the graph we then go back and map the lightness value of each pixel to its new spot on the graph. The result looks like so:

If you compare the two images you’ll note that the overall brightness of the resultant image has been increased but the color values have remained relatively unchanged. There is still plenty of tweaking to be done but this is a fantastic start.
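The “stretch” step described above can be sketched as a lookup table. This is a plain linear stretch between the lowest and highest lightness values that actually occur; it is an assumption about the general technique, not necessarily the exact mapping PhotoMonkee uses, and buildStretchTable() is a made-up name.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Takes a 256-bucket lightness histogram and returns a table that maps
// each old lightness value to its stretched value.
std::array<std::uint8_t, 256> buildStretchTable(
        const std::array<std::uint32_t, 256>& hist) {
    // Find the lowest and highest buckets that contain any pixels.
    int lo = 0;
    int hi = 255;
    while (lo < 255 && hist[lo] == 0) ++lo;
    while (hi > lo && hist[hi] == 0) --hi;

    std::array<std::uint8_t, 256> table{};
    for (int v = 0; v < 256; ++v) {
        if (hi == lo) {
            // Empty or single-valued histogram: nothing to stretch.
            table[v] = static_cast<std::uint8_t>(v);
            continue;
        }
        // Map [lo, hi] linearly onto [0, 255]; clamp values outside it.
        int clamped = std::min(std::max(v, lo), hi);
        table[v] = static_cast<std::uint8_t>((clamped - lo) * 255 / (hi - lo));
    }
    return table;
}
```

Each pixel is then converted to HSL, its lightness is replaced with table[lightness], and the result is converted back to RGB. Hue and saturation are left alone, which is why the colors stay put while the brightness changes.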

Calculating the RGB and HSL histograms for a 1920×1080 image takes roughly 90 milliseconds. That’s stunning when you consider that converting every pixel from RGB to HSL color space isn’t exactly cheap. Also, my way of incrementing the ‘buckets’ is ridiculously slow and contention-prone since every work-item bumps the same shared counters with OpenCL’s atomic_inc(). I’ll have to revisit this later for further optimization.
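One common way to cut down on that contention (not something PhotoMonkee does yet, just a possible direction) is to have each group of workers fill its own private histogram and merge the partial results at the end, so the shared buckets are touched far less often. In OpenCL the private copy would typically live in __local memory per work-group. The C++ sketch below only models the idea sequentially; histogramByChunks() and the chunk size are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Histogram = std::array<std::uint32_t, 256>;

// Each chunk stands in for one work-group: it counts into its own
// private histogram, then adds that into the shared total once.
Histogram histogramByChunks(const std::vector<std::uint8_t>& values,
                            std::size_t chunkSize) {
    Histogram total{};
    if (chunkSize == 0) chunkSize = 1;  // guard against an endless loop

    for (std::size_t start = 0; start < values.size(); start += chunkSize) {
        std::size_t end = std::min(start + chunkSize, values.size());

        Histogram partial{};  // private to this chunk, no sharing
        for (std::size_t i = start; i < end; ++i) {
            ++partial[values[i]];
        }

        // Merge step: 256 additions per chunk instead of one shared
        // increment per pixel.
        for (std::size_t b = 0; b < partial.size(); ++b) {
            total[b] += partial[b];
        }
    }
    return total;
}
```

Whether this actually beats the 90 millisecond figure depends on the device and work-group size, so it would need measuring.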

Viva Le Histogram!