DcmStack vs stack of STL

All other questions regarding DCMTK

Moderator: Moderator Team

Post Reply
Message
Author
karmakul
Posts: 4
Joined: Tue, 2005-02-08, 05:48
Location: Novosibirsk, Russia

DcmStack vs stack of STL

#1 Post by karmakul »

Guten Tag!

I've recently come across slow loading and saving of DICOMDIR. Profiler claims that much time is spent on stack.pop() and stack.push(). When profiler traces down these methods it comes clear that most of the time is spent on new and delete calls.

Have you faced the problem before and how did you solve it, if ever? Did you try to use Standard Template Library's stack instead?

Danke schön
Kirill

Jörg Riesmeier
ICSMED DICOM Services
ICSMED DICOM Services
Posts: 2217
Joined: Fri, 2004-10-29, 21:38
Location: Oldenburg, Germany

#2 Post by Jörg Riesmeier »

We're not aware of any issues regarding OFStack. There was a performance problem with the DICOMDIR implementation in versions prior to DCMTK 3.5.3 (inefficient use of list classes) but this has been solved.

Anyway, you should be able to use the STL class instead of OFStack by defining HAVE_STL (or HAVE_STL_STACK). Would be interesting to hear your experiences.

karmakul
Posts: 4
Joined: Tue, 2005-02-08, 05:48
Location: Novosibirsk, Russia

#3 Post by karmakul »

Hello. In the past couple of weeks I worked on optimization of DICOMDIR loading/saving. Before the optimization a run of loading and saving one DICOMDIR took 130 seconds; after the optimization the same task took 16 seconds.

Background:
In our lab we're developing an utility that will provide us with conformance to Media Storage Service, Interchange Option, General Purpose CD-R and DVD Interchange Application Profiles: mainly STD-GEN-DVD-RAM, but also STD-GEN-CD. In our case there will be about 20K images, resulting in ~100K records. => loading/saving of such DICOMDIRs is a kind of bottleneck for us.

Experience:
1) DcmUnsignedLong::getUint32() was the main contributor to loading/saving time. It was called mostly for DcmUnsignedLongOffset objects. So, I created a short DcmUnsignedLongOffset::getUint32(), which got value right from fValue of DcmElement. It gave me 20% overall speedup.

2) The stack.pop() and stack.push() methods were the next in queue. They were called by search() and searchSubFromHere(). While loading/saving DICOMDIR I saw that code can be a bit organized to search only in elementList of some particular item, so I added DcmItem::searchPlain(), which scans only local elementList and doesn't descend to sub-items. Then I used the new method instead of search() where it was possible in methods of DcmDicomDir class

Then it came clear that neither the search()es, nor the get() themselves eated the time, but the enormous number of times they were called. DcmDicomDir::convertGivenPointer() and DcmDicomDir::resolveGivenOffsets() came up. In both of them there were nested loops. The outer loop iterated on DcmUnsignedLongOffset objects, the inner searched for the corresponding value and updated the current DcmUnsignedLongOffset.

3) In convertGivenPointer() the inner loop was:

Code: Select all

for (unsigned long i = 0; i < numOffsets; i++ )
{
  if ( offElem->getNextRecord() == itOffsets[i].item )
  {
     offElem->putUint32( itOffsets[i].fileOffset );
     break;
  }
}
In convertAllPointers() the part that came before the 5 calls to convertGivenPointer() set file offset of each record. That is, the data is already there and there is no need to iterate on array to find it. So, I replaced the abovementioned loop with

Code: Select all

dO = offset->getNextRecord();
if (dO != NULL) {
  dirRec = OFstatic_cast(DcmDirectoryRecord *, dO);
  offset->putUint32(dirRec->getFileOffset());
} else
  offset->putUint32((Uint32)0);
It worked fine.

4) In resolveGivenOffsets things were more complex. The inner loop had to iterate on array of record pointers since it had no idea of what record the current offset should point to. So, I decided to push all the DcmDirectoryRecords to hash table with their file offsets as keys. I did it in the loop of resolveAllOffsets:

Code: Select all

for (unsigned long i = 0; i < maxitems; i++ )
{
    DcmDirectoryRecord *rec;
    rec = OFstatic_cast(DcmDirectoryRecord *, localDirRecSeq.getItem( i ));
    long filePos = rec->getFileOffset();
    offsetHash[filePos] = rec;
}    
Then I replaced the inner loop of resolveGivenOffsets()

Code: Select all

for (unsigned long i = 0; i < numOffsets; i++ )
{
    l_error = offElem->getUint32(offset);
    if (offset == itOffsets[ i ].fileOffset )
    {
        offElem->setNextRecord( itOffsets[ i ].item );
        break;
    }
}
with

Code: Select all

offsetValue = offElem->getUint32();
dirRec = offsetHash[offsetValue];
offElem->setNextRecord(dirRec);
where offsetHash was the mentioned hash table.

Since both convertGivenPointer() and resolveGivenOffsets() methods access DcmUnsignedLongOffset attributes of DcmDirectoryRecords I decided to cache them. It added to DcmDirectoryRecord class the following:

Code: Select all

    DcmUnsignedLongOffset* getNextRecordOffset();
    DcmUnsignedLongOffset* getLowerRecordOffset();
    DcmUnsignedLongOffset* getMDRDRecordOffset();
These use searchPlain() to find the proper item. The first two also create such an item if it was not found. I use the new methods in DcmDicomDir.

5) In order to get rid of slow search()es in convertGivenPointer() and resolveGivenOffsets() I did the following. In convertAllPointers() and resolveAllOffsets I introduced an std::vector<DcmUnsignedLongOffsets*> to hold all the offset objects that existed. Then the vector was delivered to convertGivenPointer() and resolveGivenOffsets(); the latter iterate on the vector instead of search()ing for each offset attribute.[/b] Here is the filling of the vector:

Code: Select all

...
DcmSequenceOfItems &localDirRecSeq = getDirRecSeq( dset );
unsigned long maxitems = localDirRecSeq.card();
std::vector<DcmUnsignedLongOffset *> offsets;
DcmUnsignedLongOffset *tmp;
for (unsigned long i = 0; i < maxitems; i++ )
{
    ...
    if ( (tmp = rec->getNextRecordOffset()) != NULL )
      offsets.push_back(tmp);
    if ( (tmp = rec->getLowerRecordOffset()) != NULL )
      offsets.push_back(tmp);
    if ( (tmp = rec->getMRDRRecordOffset()) != NULL )
      offsets.push_back(tmp);
}
if ((tmp = (DcmUnsignedLongOffset*)dset.searchPlain(DCM_OffsetOfTheFirstDirectoryRecordOfTheRootDirectoryEntity)) != NULL)
  offsets.push_back(tmp);
if ((tmp = (DcmUnsignedLongOffset*)dset.searchPlain(DCM_OffsetOfTheLastDirectoryRecordOfTheRootDirectoryEntity)) != NULL)
  offsets.push_back(tmp);
...

Jörg Riesmeier
ICSMED DICOM Services
ICSMED DICOM Services
Posts: 2217
Joined: Fri, 2004-10-29, 21:38
Location: Oldenburg, Germany

#4 Post by Jörg Riesmeier »

Thank you for your detailed analysis and report. I admit that we never tried to add more than about 5,000 files to a DICOMDIR.

We would appreciate if you could send us the changes you've made as a patch for dcmtk 3.5.3 (use email address: dicom/at/offis/dot/de).

J. Riesmeier
DCMTK Developer
Posts: 2517
Joined: Tue, 2011-05-03, 14:38
Location: Oldenburg, Germany
Contact:

#5 Post by J. Riesmeier »

It took some time, but finally we've managed to enhance the performance of the DICOMDIR code significantly: Could you please check the latest DCMTK snapshot (from today)? Thanks.

Post Reply

Who is online

Users browsing this forum: Google [Bot] and 0 guests