How to write a library from a stand-alone program


There are lots of open-source programs out there, ranging from example files in an SDK to fully-fledged projects on GitHub or Google Code. One often wants to reuse the logic of these programs in their own projects. A beginner may think that the easiest way is to write some shell code that takes the example program as-is and tries convoluted ways to transform inputs and outputs, which were designed for testing and visual inspection, into machine-processable data. There is, however, a way that is much simpler in the end, and more robust: convert the stand-alone program into a library.

Let’s take an example: the face detection example program of the OpenCV library. For reference, here is a copy of the source code:

// OpenCV Sample Application: facedetect.c

// Include header files
#include "cv.h"
#include "highgui.h"

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <math.h>
#include <float.h>
#include <limits.h>
#include <time.h>
#include <ctype.h>

// Create memory for calculations
static CvMemStorage* storage = 0;

// Create a new Haar classifier
static CvHaarClassifierCascade* cascade = 0;

// Function prototype for detecting and drawing an object from an image
void detect_and_draw( IplImage* image );

// Create a string that contains the cascade name
const char* cascade_name =
    "haarcascade_frontalface_alt.xml";
/*    "haarcascade_profileface.xml";*/

// Main function, defines the entry point for the program.
int main( int argc, char** argv )
{

    // Structure for getting video from camera or avi
    CvCapture* capture = 0;

    // Images to capture the frame from video or camera or from file
    IplImage *frame, *frame_copy = 0;

    // Used for calculations
    int optlen = strlen("--cascade=");

    // Input file name for avi or image file.
    const char* input_name;

    // Check for the correct usage of the command line
    if( argc > 1 && strncmp( argv[1], "--cascade=", optlen ) == 0 )
    {
        cascade_name = argv[1] + optlen;
        input_name = argc > 2 ? argv[2] : 0;
    }
    else
    {
        fprintf( stderr,
        "Usage: facedetect --cascade=\"<cascade_path>\" [filename|camera_index]\n" );
        return -1;
        /*input_name = argc > 1 ? argv[1] : 0;*/
    }

    // Load the HaarClassifierCascade
    cascade = (CvHaarClassifierCascade*)cvLoad( cascade_name, 0, 0, 0 );

    // Check whether the cascade has loaded successfully. Else report an error and quit
    if( !cascade )
    {
        fprintf( stderr, "ERROR: Could not load classifier cascade\n" );
        return -1;
    }

    // Allocate the memory storage
    storage = cvCreateMemStorage(0);

    // Find whether to detect the object from file or from camera.
    if( !input_name || (isdigit(input_name[0]) && input_name[1] == '\0') )
        capture = cvCaptureFromCAM( !input_name ? 0 : input_name[0] - '0' );
    else
        capture = cvCaptureFromAVI( input_name );

    // Create a new named window with title: result
    cvNamedWindow( "result", 1 );

    // Find if the capture is loaded successfully or not.

    // If loaded successfully, then:
    if( capture )
    {
        // Capture from the camera.
        for(;;)
        {
            // Capture the frame and load it in IplImage
            if( !cvGrabFrame( capture ))
                break;
            frame = cvRetrieveFrame( capture );

            // If the frame does not exist, quit the loop
            if( !frame )
                break;

            // Allocate framecopy as the same size of the frame
            if( !frame_copy )
                frame_copy = cvCreateImage( cvSize(frame->width,frame->height),
                                            IPL_DEPTH_8U, frame->nChannels );

            // Check the origin of image. If top left, copy the image frame to frame_copy.
            if( frame->origin == IPL_ORIGIN_TL )
                cvCopy( frame, frame_copy, 0 );
            // Else flip and copy the image
            else
                cvFlip( frame, frame_copy, 0 );

            // Call the function to detect and draw the face
            detect_and_draw( frame_copy );

            // Wait for a while before proceeding to the next frame
            if( cvWaitKey( 10 ) >= 0 )
                break;
        }

        // Release the images, and capture memory
        cvReleaseImage( &frame_copy );
        cvReleaseCapture( &capture );
    }

    // If the capture is not loaded successfully, then:
    else
    {
        // Assume the image to be lena.jpg, or the input_name specified
        const char* filename = input_name ? input_name : (char*)"lena.jpg";

        // Load the image from that filename
        IplImage* image = cvLoadImage( filename, 1 );

        // If the image is loaded successfully, then:
        if( image )
        {
            // Detect and draw the face
            detect_and_draw( image );

            // Wait for user input
            cvWaitKey(0);

            // Release the image memory
            cvReleaseImage( &image );
        }
        else
        {
            /* assume it is a text file containing the
               list of the image filenames to be processed - one per line */
            FILE* f = fopen( filename, "rt" );
            if( f )
            {
                char buf[1000+1];

                // Get the line from the file
                while( fgets( buf, 1000, f ) )
                {

                    // Remove the spaces if any, and clean up the name
                    int len = (int)strlen(buf);
                    while( len > 0 && isspace(buf[len-1]) )
                        len--;
                    buf[len] = '\0';

                    // Load the image from the filename present in the buffer
                    image = cvLoadImage( buf, 1 );

                    // If the image was loaded successfully, then:
                    if( image )
                    {
                        // Detect and draw the face from the image
                        detect_and_draw( image );

                        // Wait for the user input, and release the memory
                        cvWaitKey(0);
                        cvReleaseImage( &image );
                    }
                }
                // Close the file
                fclose(f);
            }
        }

    }

    // Destroy the window previously created with filename: "result"
    cvDestroyWindow("result");

    // return 0 to indicate successful execution of the program
    return 0;
}

// Function to detect and draw any faces that are present in an image
void detect_and_draw( IplImage* img )
{
    int scale = 1;

    // Create a new image based on the input image
    IplImage* temp = cvCreateImage( cvSize(img->width/scale,img->height/scale), 8, 3 );

    // Create two points to represent the face locations
    CvPoint pt1, pt2;
    int i;

    // Clear the memory storage which was used before
    cvClearMemStorage( storage );

    // Find whether the cascade is loaded, to find the faces. If yes, then:
    if( cascade )
    {

        // There can be more than one face in an image. So create a growable sequence of faces.
        // Detect the objects and store them in the sequence
        CvSeq* faces = cvHaarDetectObjects( img, cascade, storage,
                                            1.1, 2, CV_HAAR_DO_CANNY_PRUNING,
                                            cvSize(40, 40) );

        // Loop the number of faces found.
        for( i = 0; i < (faces ? faces->total : 0); i++ )
        {
           // Create a new rectangle for drawing the face
            CvRect* r = (CvRect*)cvGetSeqElem( faces, i );

            // Find the dimensions of the face,and scale it if necessary
            pt1.x = r->x*scale;
            pt2.x = (r->x+r->width)*scale;
            pt1.y = r->y*scale;
            pt2.y = (r->y+r->height)*scale;

            // Draw the rectangle in the input image
            cvRectangle( img, pt1, pt2, CV_RGB(255,0,0), 3, 8, 0 );
        }
    }

    // Show the image in the window named "result"
    cvShowImage( "result", img );

    // Release the temp image created.
    cvReleaseImage( &temp );
}

First step: identify the key components

As the first step, you have to identify the three components of the program: input retrieval, data processing and result output. As you are writing a library, you want to isolate the data processing from the input and the output. A program has inputs when it has external dependencies. There are several types of them:

  • Parameters passed on the command-line,
  • Data passed on standard input,
  • Files read,
  • Data from the network (e.g. a web page), if the purpose of the program is not retrieving this data,
  • Transient data from other sources.

These external dependencies are transformed into internal structures to be processed by the main part of the program, the data processing. Programs are also likely to have some initialization steps, or to use “resources”, i.e. data files that rarely change.

Once processed, these internal structures are output to the user, in ways similar to the inputs:

  • Writing to standard output (console),
  • Writing to a file,
  • GUI display
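
To make the three roles concrete, here is a minimal sketch on a toy problem unrelated to OpenCV: summing the integers found in a string. The function names (parse_values, sum_values, format_result) are made up for this illustration. The point is only that the middle function touches neither the outside world nor the user.

```c
#include <stdio.h>
#include <stdlib.h>

/* Input retrieval: turn external text into an internal array of ints.
   Returns the number of values parsed (at most max_values). */
size_t parse_values(const char *text, int *values, size_t max_values)
{
    size_t count = 0;
    char *end;

    while (count < max_values) {
        long v = strtol(text, &end, 10);
        if (end == text)   /* nothing left to convert */
            break;
        values[count++] = (int)v;
        text = end;
    }
    return count;
}

/* Data processing: pure computation on internal structures.
   This is the part that belongs in a library. */
long sum_values(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Result output: present the internal result in a user-facing form. */
int format_result(long total, char *buffer, size_t size)
{
    return snprintf(buffer, size, "sum=%ld", total);
}
```

A stand-alone program would call the three functions in sequence from main; a library would export only sum_values and let the caller decide where the numbers come from and where the result goes.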

A close look at the code above will tell you that the main(int argc, char** argv) function deals mainly with input retrieval and initialization. But let’s go into the details.

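From the start to line 60, variables are declared and cleared, and the command-line arguments are parsed. (Line numbers here and below count from the first line of the listing, the “// OpenCV Sample Application” comment.)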

In lines 60 to 69, we initialize cascade from the file named by cascade_name, a string taken from the --cascade= command-line option. This is resource initialization.

In lines 75 to 78, the program attempts to open a video, either from a file or from a camera. We are well into the input retrieval phase.

In lines 87 to 124, we have an infinite loop that only stops when no frame can be grabbed or a key is pressed. Each iteration does the following:

  1. Grab an image from the video
  2. Do some allocations
  3. Preprocess (flip the image if necessary)
  4. Process (detect_and_draw)
  5. Wait and restart

We can see that the code between lines 132 and 145 (a single image file) and the code between lines 166 and 177 (each image named in a list file) follow the same pattern, without the frame loop.
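
The five steps above can be sketched as a generic loop. This is only an illustration under simplifying assumptions: a “frame” is reduced to a single int, and frame_source, grab_frame, preprocess, run_loop and accumulate are hypothetical names, not OpenCV functions. Allocation and waiting (steps 2 and 5) are left out because an int needs neither.

```c
#include <stddef.h>

/* Stand-in for a video: an array holding one int per "frame". */
typedef struct {
    const int *frames;
    size_t count;
    size_t next;
} frame_source;

/* Step 1: grab the next frame. Returns 0 when the source is exhausted. */
int grab_frame(frame_source *src, int *frame)
{
    if (src->next >= src->count)
        return 0;
    *frame = src->frames[src->next++];
    return 1;
}

/* Step 3: preprocess. Taking the absolute value stands in for the flip. */
int preprocess(int frame)
{
    return frame < 0 ? -frame : frame;
}

/* The loop itself: grab, preprocess, process, restart.
   Returns the number of frames that were processed. */
size_t run_loop(frame_source *src,
                void (*process)(int frame, void *context),
                void *context)
{
    size_t processed = 0;
    int frame;

    while (grab_frame(src, &frame)) {
        process(preprocess(frame), context);   /* step 4 */
        processed++;
    }
    return processed;
}

/* Example processing step: add each frame value to a running total. */
void accumulate(int frame, void *context)
{
    *(long *)context += frame;
}
```

Only the process callback is specific to the application; everything else is input plumbing. That callback is the part worth turning into a library.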

There’s only one subroutine here, so this is going to be easy. It is called detect_and_draw and, as we have seen above, it is likely the main processing step of the program. However, we haven’t seen any code directly related to the program output yet, so this function must contain some in addition to the data processing code, and we must stay on the lookout.

From lines 196 to 206, we again have some initialization. The if() at line 209 should always be true given lines 62 to 69, where the program exits if the cascade cannot be loaded. Then, we have the meat (see below). Lastly, cvShowImage (“show” is the keyword here) displays the final result, and cvReleaseImage frees the temp image. You can also notice that the IplImage temp is allocated and released but never otherwise used…

// There can be more than one face in an image. So create a growable sequence of faces.
// Detect the objects and store them in the sequence
CvSeq* faces = cvHaarDetectObjects( img, cascade, storage,
                                    1.1, 2, CV_HAAR_DO_CANNY_PRUNING,
                                    cvSize(40, 40) );

// Loop the number of faces found.
for( i = 0; i < (faces ? faces->total : 0); i++ )
{
    // Create a new rectangle for drawing the face
    CvRect* r = (CvRect*)cvGetSeqElem( faces, i );

    // Find the dimensions of the face,and scale it if necessary
    pt1.x = r->x*scale;
    pt2.x = (r->x+r->width)*scale;
    pt1.y = r->y*scale;
    pt2.y = (r->y+r->height)*scale;

    // Draw the rectangle in the input image
    cvRectangle( img, pt1, pt2, CV_RGB(255,0,0), 3, 8, 0 );
}

This piece does the following: first, it detects faces using the Haar cascade; then, for each detected face, it computes the corners of the bounding rectangle and draws it on top of the image. You can easily conclude that everything except the last line, the cvRectangle call, is about detecting faces.
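
The separation between detecting and drawing can be illustrated without OpenCV. In the sketch below, everything is a stand-in: the image is a raw grayscale buffer, rect is a made-up structure playing the role of CvRect, and detect_bright is a toy detector that reports every pixel at or above a threshold as a 1x1 rectangle. It is in no way a face detector; it only shows the shape of the split.

```c
#include <stddef.h>

typedef struct { int x, y, width, height; } rect;

/* Detection only: fills `found` with up to max_found rectangles and
   returns how many were stored. The image is never modified. */
size_t detect_bright(const unsigned char *gray, int width, int height,
                     unsigned char threshold, rect *found, size_t max_found)
{
    size_t count = 0;

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (gray[y * width + x] >= threshold && count < max_found) {
                rect r = { x, y, 1, 1 };
                found[count++] = r;
            }
        }
    }
    return count;
}

/* Drawing only: writes `value` on the outline of r, clipped to the image. */
void draw_rect(unsigned char *gray, int width, int height,
               rect r, unsigned char value)
{
    for (int y = r.y; y < r.y + r.height; y++) {
        for (int x = r.x; x < r.x + r.width; x++) {
            int on_border = (y == r.y || y == r.y + r.height - 1 ||
                             x == r.x || x == r.x + r.width - 1);
            if (on_border && x >= 0 && x < width && y >= 0 && y < height)
                gray[y * width + x] = value;
        }
    }
}
```

Because the detector takes a const buffer and returns plain rectangles, a caller is free to draw them, crop them, or just count them.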

Second step: decide what we (really) want

Now that we know all the components of the initial program, it is time to ask ourselves what we really want from our library. What are the inputs? What are the outputs?

For the inputs, you can decide to give your library an image filename, a video filename, a video stream, or simply one image you managed to get by other means. The basic rule is: the more abstract your inputs are, the more flexible your library will be. If your library needs a filename, it can’t be used to detect faces in videos streamed from the network. If instead it takes a single image (as an IplImage), then moving from detecting faces in the files of a directory to detecting faces in a video stream only requires writing the network streaming code, outside of the library.
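
Here is a small sketch of that rule on a toy task (counting bright pixels), assuming raw 8-bit data. Both function names are hypothetical. The core function only sees memory; the file-based variant is a thin convenience wrapper that could live outside the library entirely.

```c
#include <stdio.h>
#include <stddef.h>

/* Core function: works on data already in memory, wherever it came from
   (a file, a camera, a network stream...). */
size_t count_bright(const unsigned char *pixels, size_t length,
                    unsigned char threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < length; i++)
        if (pixels[i] >= threshold)
            count++;
    return count;
}

/* Convenience wrapper: reads a raw file in chunks and delegates to the
   core function. Returns -1 if the file cannot be opened. */
long count_bright_in_file(const char *path, unsigned char threshold)
{
    unsigned char buffer[4096];
    long total = 0;
    size_t n;
    FILE *f = fopen(path, "rb");

    if (!f)
        return -1;
    while ((n = fread(buffer, 1, sizeof buffer, f)) > 0)
        total += (long)count_bright(buffer, n, threshold);
    fclose(f);
    return total;
}
```

Supporting a new source of data then means writing another wrapper like the second function, while count_bright stays untouched.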

The same applies to the outputs. Here, for the sake of the example, let’s say we would like our library to return each detected face (not just the bounding rectangle, but the actual sub-image).

We now have all the parts; we just have to write our own code.

Third and last step: library

Once we have done this preliminary analysis, writing the library is totally straightforward. First, we write our function declaration:

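int detect_faces( IplImage* input, IplImage*** output )

The function returns the number of faces found, or -1 on error. Note the extra level of indirection on output: the function allocates the array of sub-images itself, so it must receive the address of the caller’s pointer in order to modify it. The caller is responsible for releasing the images and the array.

Then, we modify the “meat” to match our needs (extracting a sub-image):

int i, count;

// Clear the memory storage which was used before
cvClearMemStorage( storage );

// There can be more than one face in an image. So create a growable sequence of faces.
// Detect the objects and store them in the sequence
CvSeq* faces = cvHaarDetectObjects( input, cascade, storage,
                                    1.1, 2, CV_HAAR_DO_CANNY_PRUNING,
                                    cvSize(40, 40) );

// Check the result before dereferencing it
count = faces ? faces->total : 0;

// Allocate the array of sub-images through the caller's pointer
*output = count > 0 ? (IplImage**)malloc( count * sizeof(IplImage*) ) : NULL;
if( count > 0 && !*output )
    return -1;

// Loop the number of faces found.
for( i = 0; i < count; i++ )
{
    // Get the bounding rectangle of the face
    CvRect* r = (CvRect*)cvGetSeqElem( faces, i );

    // Copy the face region into its own image.
    // CopySubImage is a helper you have to provide; it is not part of OpenCV.
    (*output)[i] = CopySubImage( input, r->x, r->y, r->width, r->height );
}

return count;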
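
CopySubImage is not defined in this article and is not an OpenCV function, so you have to write it yourself. With the OpenCV C API, this is commonly done by restricting the region of interest with cvSetImageROI, copying with cvCopy, and restoring the image with cvResetImageROI. The idea behind such a helper can be sketched on a raw buffer. The version below is only an illustration under simplifying assumptions (a single-channel, 8-bit image with no row padding), and crop_gray is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/* Copies the width x height region whose top-left corner is (x, y) out of
   a src_width x src_height grayscale image. Returns a newly allocated
   buffer that the caller must free(), or NULL if the region does not fit
   inside the source image or if memory runs out. */
unsigned char *crop_gray(const unsigned char *src, int src_width, int src_height,
                         int x, int y, int width, int height)
{
    unsigned char *dst;

    if (!src || width <= 0 || height <= 0 || x < 0 || y < 0 ||
        x + width > src_width || y + height > src_height)
        return NULL;

    dst = (unsigned char *)malloc((size_t)width * (size_t)height);
    if (!dst)
        return NULL;

    /* Copy one row of the region at a time. */
    for (int row = 0; row < height; row++)
        memcpy(dst + (size_t)row * width,
               src + (size_t)(y + row) * src_width + x,
               (size_t)width);

    return dst;
}
```

Whatever the implementation, each returned sub-image is a separate allocation, so the caller of detect_faces ends up owning both the array and every image in it.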

We also need to do some initialization. This can be done in an “init_library” function that you put above detect_faces:

#include <stdbool.h>  // bool, true and false when compiling as C

// Create memory for calculations
static CvMemStorage* storage = 0;
// Create a new Haar classifier
static CvHaarClassifierCascade* cascade = 0;

// Create a string that contains the cascade name
const char* cascade_name = "haarcascade_frontalface_alt.xml";

// Initialize the library. Call this function first.
bool init_library(void) {
    // Load the HaarClassifierCascade
    cascade = (CvHaarClassifierCascade*)cvLoad( cascade_name, 0, 0, 0 );

    // Check whether the cascade has loaded successfully. Else report the failure to the caller
    if( !cascade )
    {
        return false;
    }

    // Allocate the memory storage
    storage = cvCreateMemStorage(0);
    return true;
}
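
A library that relies on an init function usually benefits from two additions: a guard so that calling the processing function too early fails cleanly, and a matching cleanup function. The sketch below shows the pattern with a plain heap buffer standing in for the cascade and the memory storage. All names (demo_init, demo_process, demo_shutdown) are hypothetical and unrelated to OpenCV.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Shared state of the library, hidden from callers by `static`. */
static int *scratch = NULL;

/* Acquire the resources. Safe to call more than once. */
bool demo_init(void)
{
    if (scratch)
        return true;   /* already initialized */
    scratch = (int *)calloc(16, sizeof *scratch);
    return scratch != NULL;
}

/* Processing function: refuses to run before demo_init has succeeded.
   Returns -1 in that case, otherwise twice the input value. */
int demo_process(int value)
{
    if (!scratch)
        return -1;
    scratch[0] = value;
    return scratch[0] * 2;
}

/* Release the resources; the library can be initialized again later. */
void demo_shutdown(void)
{
    free(scratch);
    scratch = NULL;
}
```

Applied to the face detection library, the same guard would go at the top of detect_faces, and the cleanup function would release the cascade and the memory storage.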

Then add all the necessary header files, and you are done!

I will leave it as an exercise for the reader to piece together the final .c file.