Teigha Multithreading Low-Level API (Part 2 of 4)

Andrew Markovich
February 01, 2018

Tags: performance Example getting started

This article is part of a series about the low-level API used for multithreading. For the first article, see Part 1.

Built-in Teigha mutex objects

Teigha Kernel provides a cross-platform implementation of mutex objects. For a description of mutex (MUTual EXclusion) objects, see Wikipedia.

Mutex objects are wrapped in the OdMutex class. This class contains only two methods:

void lock();
void unlock();

Call the OdMutex::lock method at the beginning of a critical section, for example a block of code that touches resources shared between multiple threads, and call the OdMutex::unlock method at the end of it. The two calls delimit the scope in which the shared resource is accessed, so they must always be paired. If OdMutex::unlock is never called after OdMutex::lock, the mutex stays locked and every other thread that calls OdMutex::lock on it blocks forever (a deadlock). When the owning thread calls OdMutex::unlock, it opens the gate for the threads that are waiting inside their OdMutex::lock calls.

Mutex objects are reentrant: the thread that already holds a locked mutex can enter the locked section again any number of times. Each additional entry must be matched by its own OdMutex::unlock call, which decrements the entry count; only the final OdMutex::unlock call releases the critical section to other threads.
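
The pairing and reentrancy rules can be illustrated without Teigha headers. The following is a minimal sketch that assumes std::recursive_mutex as a stand-in for OdMutex (both expose lock/unlock and allow the owning thread to lock again); the names g_mutex, g_counter, incrementCounter and addTwice are hypothetical and do not come from the Teigha API.

```cpp
#include <mutex>

// Stand-in for OdMutex (assumption): std::recursive_mutex also lets the owning
// thread lock again and requires one unlock() per lock().
static std::recursive_mutex g_mutex;
static int g_counter = 0; // shared resource protected by g_mutex

// Inner critical section; may be called while g_mutex is already held.
void incrementCounter()
{
  g_mutex.lock();   // nested entry when called from addTwice()
  ++g_counter;
  g_mutex.unlock(); // matches the lock() above, mutex is still held by the caller
}

// Outer critical section that re-enters the same mutex on the same thread.
// Returns the counter value observed while the mutex is still held.
int addTwice()
{
  g_mutex.lock();           // first entry
  incrementCounter();       // second entry by the same thread does not block
  incrementCounter();
  const int value = g_counter;
  g_mutex.unlock();         // final unlock releases the mutex for other threads
  return value;
}
```

With a non-reentrant mutex, the nested lock() inside incrementCounter would never return when called from addTwice, which is why reentrancy matters as soon as locked functions call each other.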

To simplify working with mutex objects, use the OdMutexAutoLock class. It calls the OdMutex::lock method in its constructor and the OdMutex::unlock method in its destructor. This guarantees that the lock/unlock calls are always paired, so the developer cannot forget to unlock the mutex when leaving a critical section.
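
To show why the constructor/destructor pairing helps, here is a minimal sketch of such a guard. It is not the OdMutexAutoLock source: ScopedLock, g_tableMutex, g_table and storeValue are hypothetical names, and std::recursive_mutex again stands in for OdMutex.

```cpp
#include <mutex>

// Hypothetical guard in the spirit of OdMutexAutoLock (assumption: the real
// class behaves as described in the text, locking in the constructor and
// unlocking in the destructor).
class ScopedLock
{
  std::recursive_mutex &m_mutex;
public:
  explicit ScopedLock(std::recursive_mutex &mutex) : m_mutex(mutex) { m_mutex.lock(); }
  ~ScopedLock() { m_mutex.unlock(); }
  // A guard owns exactly one lock entry, so copying is disabled.
  ScopedLock(const ScopedLock &) = delete;
  ScopedLock &operator=(const ScopedLock &) = delete;
};

static std::recursive_mutex g_tableMutex;
static int g_table[4] = {0, 0, 0, 0}; // shared data protected by g_tableMutex

// Stores a value under the lock. Returns false for an out-of-range index.
bool storeValue(int index, int value)
{
  ScopedLock lock(g_tableMutex);
  if (index < 0 || index >= 4)
    return false;          // destructor unlocks on this early return
  g_table[index] = value;
  return true;             // and on the normal return path
}
```

Because the unlock happens in the destructor, every exit path from storeValue releases the mutex, including early returns and exceptions.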

Example of using mutex objects

To demonstrate using mutex objects, we will extend our multithreaded image processing example from Part 1 in this series of articles. We will add a new ExampleMtStringAccumulator class, which accumulates output text strings from multiple threads into one large text string:

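// Example of multithread strings accumulator invoking mutex object
class ExampleMtStringAccumulator
{
    OdString m_outputString; // Accumulated text
    OdMutex m_mutex;         // Protects m_outputString
  public:
    ExampleMtStringAccumulator() {}
    ~ExampleMtStringAccumulator() {}
    // Can be called from multiple threads
    void addString(const OdString &str)
    {
      TD_AUTOLOCK(m_mutex);
      m_outputString += str;
    }
    // Call after all threads are completed
    const OdString &getString() { return m_outputString; }
};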

Because the ExampleMtStringAccumulator::addString method is called from multiple threads, the string concatenation is protected by a mutex object to avoid conflicts between threads. The TD_AUTOLOCK macro simply invokes the OdMutexAutoLock class in multithreaded Teigha configurations, so it is used here to simplify the final code. Alternatively, the OdMutexAutoLock class can be used as-is:

OdMutexAutoLock autoLock(m_mutex);
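
For readers who want to try the pattern outside Teigha, the following sketch is a standard-library analogue of the accumulator. It assumes std::string, std::mutex and std::lock_guard as stand-ins for OdString, OdMutex and TD_AUTOLOCK; StringAccumulator and accumulateFromThreads are hypothetical names used only for this illustration.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Standard-library analogue of ExampleMtStringAccumulator (assumption, not
// Teigha code).
class StringAccumulator
{
  std::string m_output;
  std::mutex m_mutex;
public:
  void addString(const std::string &str)
  {
    std::lock_guard<std::mutex> lock(m_mutex); // plays the role of TD_AUTOLOCK
    m_output += str;
  }
  // Call only after all writer threads have been joined.
  const std::string &getString() const { return m_output; }
};

// Starts nThreads writers that each append nLines two-character lines,
// waits for all of them, and returns the accumulated length.
std::size_t accumulateFromThreads(unsigned nThreads, unsigned nLines)
{
  StringAccumulator accum;
  std::vector<std::thread> threads;
  for (unsigned i = 0; i < nThreads; ++i)
    threads.emplace_back([&accum, nLines] {
      for (unsigned n = 0; n < nLines; ++n)
        accum.addString("x\n");
    });
  for (std::thread &t : threads)
    t.join();
  return accum.getString().size();
}
```

Without the lock in addString, concurrent appends to the same string would be a data race; with it, no append is lost and the final length is exactly nThreads * nLines * 2 characters.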

Now we construct an instance of the ExampleMtStringAccumulator class in our main example function:

// Multithread output strings accumulator
ExampleMtStringAccumulator stringsAccum;

Next, extend the ProcessImageCaller class so that it reports the scanline numbers it processed through the ExampleMtStringAccumulator class:

// Thread running method implementation
class ProcessImageCaller : public OdRxObject
{
  OdSmartPtr<ProcessedRasterImage> m_pProcImage;
  OdUInt32 m_scanLineFrom, m_nScanLines;
  ExampleMtStringAccumulator *m_pOutput;
  public:
    // Store the image and the range of scanlines assigned to this caller
    ProcessImageCaller *setup(ProcessedRasterImage *pProcImage, OdUInt32 scanLineFrom, OdUInt32 nScanLines)
    { m_pProcImage = pProcImage; m_scanLineFrom = scanLineFrom; m_nScanLines = nScanLines;
      return this; }
    // Thread entry point: process the range, then report it to the accumulator
    static DWORD WINAPI entryPoint(LPVOID pArg)
    {
      ProcessImageCaller *pCaller = (ProcessImageCaller*)pArg;
      pCaller->m_pProcImage->process(pCaller->m_scanLineFrom, pCaller->m_nScanLines);
      pCaller->m_pOutput->addString(OdString().format(OD_T("Thread processed scanlines from %u to %u\n"),
                                    (unsigned)pCaller->m_scanLineFrom, (unsigned)(pCaller->m_scanLineFrom + pCaller->m_nScanLines - 1)));
      return EXIT_SUCCESS;
    }
    // Remember the output accumulator and start the thread
    ProcessImageCaller *run(SimpleWinThreadsPool &threadPool, ExampleMtStringAccumulator *pOutput)
    {
      m_pOutput = pOutput;
      threadPool.runNewThread(entryPoint, this, ThreadsCounter::kNoAttributes);
      return this;
    }
};

Now we can pass our ExampleMtStringAccumulator class instance into each ProcessImageCaller instance and output the accumulated string to the console after all threads are completed:

// Run threads for raster image processing
const OdUInt32 nScanlinesPerThread = pProcImage->pixelHeight() / 4;
for (OdUInt32 nThread = 0; nThread < 4; nThread++)
{
  OdUInt32 nScanlinesPerThisThread = nScanlinesPerThread;
  if (nThread == 3) // Height may not be divisible by 4, so the last thread takes all remaining scanlines.
    nScanlinesPerThisThread = pProcImage->pixelHeight() - nScanlinesPerThread * 3;
      setup(pProcImage, nScanlinesPerThread * nThread, nScanlinesPerThisThread)->
        run(winThreadPool, &stringsAccum));
}

// Wait for threads completion

// Output accumulated string
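
The scanline arithmetic used in the loop above can be checked in isolation. The sketch below generalizes it to any thread count; ScanlineRange and splitScanlines are hypothetical helpers written for this illustration and are not part of the example from Part 1.

```cpp
#include <cstdint>
#include <vector>

// First scanline and number of scanlines assigned to one thread.
struct ScanlineRange
{
  std::uint32_t from;
  std::uint32_t count;
};

// Splits `height` scanlines between `nThreads` threads. Every thread gets
// height / nThreads scanlines; the last thread also takes the remainder.
std::vector<ScanlineRange> splitScanlines(std::uint32_t height, std::uint32_t nThreads)
{
  std::vector<ScanlineRange> ranges;
  if (nThreads == 0)
    return ranges; // nothing to split between
  const std::uint32_t perThread = height / nThreads;
  for (std::uint32_t i = 0; i < nThreads; ++i)
  {
    std::uint32_t count = perThread;
    if (i == nThreads - 1) // last thread: whatever is left over
      count = height - perThread * (nThreads - 1);
    ranges.push_back({perThread * i, count});
  }
  return ranges;
}
```

For a 1024-scanline image and four threads this yields ranges starting at 0, 256, 512 and 768 with 256 scanlines each. For a height of 1023, the first three threads get 255 scanlines and the last one gets 258.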

The final output looks like this (the order of the thread lines depends on thread scheduling and can differ between runs):

4 files loaded and rendered in 0.421743 seconds
Thread processed scanlines from 768 to 1023
Thread processed scanlines from 512 to 767
Thread processed scanlines from 0 to 255
Thread processed scanlines from 256 to 511
Final raster image processed in 0.583598 seconds

Watch for another article coming soon about working with the built-in Teigha Thread ID accessor and using the multithreading low-level API.
