FileFormat/Dataset read performance

All other questions regarding DCMTK

alanxz
Posts: 9
Joined: Mon, 2012-05-21, 08:37

FileFormat/Dataset read performance

#1 Post by alanxz »

What can I do to maximize read performance when reading a DICOM FileFormat or Dataset using DCMTK?

A bit more detail:
I have a situation where we store a large number of DICOM files on disk, which are accessed by an interactive program. As part of opening an image, this program sorts all of the files in a directory into loadable images. This currently involves reading each file (I left MaxReadLen at 4096). Unfortunately, this didn't scale beyond a few hundred files per directory, and the bottleneck appeared to be the amount of data being read off disk.

The next thing I tried was to create a cache of the header data: I removed the Pixel Data from each image and wrote only the dataset (omitting the meta-header), using the DeflatedLittleEndianExplicit transfer syntax, to a sqlite3 database. I also set dcmEnableAutomaticInputDataCorrection, dcmAcceptOddAttributeLength, and dcmAutoDetectDatasetXfer to false. This improves read performance somewhat, but not as much as I had hoped. Profiling shows that after switching to the DeflatedLittleEndianExplicit transfer syntax I'm no longer I/O bound but CPU bound.

I'm looking for performance of around 3 s or less to read 5000 headers; right now I'm seeing about 50 s for 5000 headers.

So my question is: is there anything else I could do to try to improve this read performance?

I've tried this with DCMTK 3.6.0 and 3.6.1-20120515. I have not tried any of the -DHAVE_STL -DHAVE_STD_STRING build options yet (a shame these aren't detected and enabled as part of system introspection at build time).

J. Riesmeier
DCMTK Developer
Posts: 2501
Joined: Tue, 2011-05-03, 14:38
Location: Oldenburg, Germany
Contact:

Re: FileFormat/Dataset read performance

#2 Post by J. Riesmeier »

I don't know what your code looks like, but did you already try reading the first n bytes of the DICOM files into a block and feeding this block to the dcmdata read-from-buffer methods?
> I have not tried any of the -DHAVE_STL -DHAVE_STD_STRING build options yet (a shame these aren't detected and enabled as part of system introspection at build time).
This is mainly for historical reasons ... also the fact that we have "alternative" implementations for all STL/STD classes used within DCMTK. Automatic detection is on our to-do list but priority is not that high.

alanxz
Posts: 9
Joined: Mon, 2012-05-21, 08:37

Re: FileFormat/Dataset read performance

#3 Post by alanxz »

That was one of my first thoughts: read a chunk of data from the sqlite3 database, then feed the whole block in memory to DCMTK.

Since I've removed the Pixel Data from these bits of DICOM, the blocks of data are under 128 KB, so I don't bother breaking the read into chunks.

The code I'm using to actually turn the memory buffer into DCMTK objects:

Code: Select all

// Wrap the in-memory blob in a DCMTK input stream
DcmInputBufferStream is;
is.setBuffer(buffer, length);
is.setEos(); // the buffer holds the complete dataset

// Global parser flags: skip auto-correction and transfer syntax detection
dcmEnableAutomaticInputDataCorrection.set(OFFalse);
dcmAcceptOddAttributeLength.set(OFFalse);
dcmAutoDetectDatasetXfer.set(OFFalse);

DcmFileFormat dicom_format;
dicom_format.transferInit();

OFCondition status = dicom_format.getDataset()->read(is, EXS_LittleEndianExplicit);
// Handle errors (check status)
dicom_format.transferEnd();
Other things I've tried as well:
- Compressing the buffer using ZLIB
- Using EXS_DeflatedLittleEndianExplicit transfer syntax
- Compressing the buffer using LZO

Michael Onken
DCMTK Developer
Posts: 2048
Joined: Fri, 2004-11-05, 13:47
Location: Oldenburg, Germany
Contact:

Re: FileFormat/Dataset read performance

#4 Post by Michael Onken »

Hi,

For your deflate experiments with the SQLite database, might it be possible to parallelize the reading of different images from the database to get past the CPU bottleneck, i.e. let 2-4 cores do the work? Or are you already doing that?

Best,
Michael

Edit: Or implement your own, faster compression/decompression. E.g., some time ago I read about Google Snappy being pretty fast and still effective.

alanxz
Posts: 9
Joined: Mon, 2012-05-21, 08:37

Re: FileFormat/Dataset read performance

#5 Post by alanxz »

I guess the first question I wanted answered is: is there any guidance like "only write out the Dataset in EXS_LittleEndianExplicit because it's the fastest to parse"? I'm getting the drift that there isn't anything like that.

And that's fine - I just wanted to be sure.

LZO is pretty quick as far as compressors/decompressors go. I wouldn't expect a huge performance gain (maybe 5-10%) from switching to something like Snappy, but I'll give it a shot.

Multithreading is something I hadn't considered, as the way I'm doing things currently doesn't really lend itself to it. That said, where there's a will there's a way, and I can think of a few creative ways to make the cache do a multithreaded read-ahead.

Another option I'm testing is serializing/deserializing to another format that's a bit quicker to parse (Protocol Buffers).

I'll give some of these a shot and report back.

alanxz
Posts: 9
Joined: Mon, 2012-05-21, 08:37

Re: FileFormat/Dataset read performance

#6 Post by alanxz »

For future reference, here's a quick rundown of things I did to improve the read speed:

1. The thing that made the largest difference was making sure to read all of the data in one chunk. Since I was using a sqlite3 database as the backing store, writing one SELECT statement that fetches all of the cache entries in one go and iterating through the result set gave the best performance, as opposed to issuing one SELECT per cache entry.

2. Compressing the DICOM data written to disk had the second-largest impact on performance. Initially I tried ZLib, which certainly gave an improvement, but LZO and Snappy ended up being quite a bit quicker at the price of a slightly larger blob. LZO and Snappy are very comparable in speed and compression ratio; I ended up choosing Snappy because its C++ API was dead simple to use compared to LZO's.

3. Multi-threading was a mixed bag. My approach was to have the main thread read the blobs from sqlite3 (see #1), then farm out the decoding to separate threads by placing the blobs in a TBB work queue; each task takes a blob and turns it into a DcmFileFormat*, which is then placed in a shared result queue. When all of the decoding is complete, the entries are put into an associative map (std::map was my data structure of choice).

When tags were more numerous and smaller in size, the multi-threading had a positive impact (Philips MR data is a good example). When tags were fewer and larger in size, it had less of a positive impact (Siemens MR data is a good example).

4. The serialized format had a bit of an effect. Writing in DICOM format, EXS_LittleEndianExplicit had the best performance (I'm on x86_64, so the machine is little-endian). Others I tried: EXS_DeflatedLittleEndianExplicit, EXS_LittleEndianImplicit, and leaving the data as it came into the program (a mix of little and big endian) and letting DCMTK detect the transfer syntax.

I also tried writing my own serializer/deserializer using Google protobufs, and this did provide some performance improvement, though it wasn't as big as I had expected.

5. Turning on -DHAVE_STD_STRING -DHAVE_STL in DCMTK improved read performance a bit (2-3%).

6. Used a better memory allocator. (I'm running on RHEL 6.x, which uses the default glibc allocator.) tcmalloc gave me a 2-3% improvement.

7. Tuned sqlite3 to be faster but less safe by setting these options:
PRAGMA page_size = 16384
PRAGMA synchronous = OFF
PRAGMA journal_mode = MEMORY
PRAGMA temp_store = MEMORY

Things that were left on the table for future improvements:
- Only cache the tags used for sorting. This was one of my first thoughts when I was faced with this problem; however, the sort code expected an entire DcmFileFormat*, and changing it would have been difficult.
- Cache the final sort; this only works well if the directories of files aren't changing much.

Michael Onken
DCMTK Developer
Posts: 2048
Joined: Fri, 2004-11-05, 13:47
Location: Oldenburg, Germany
Contact:

Re: FileFormat/Dataset read performance

#7 Post by Michael Onken »

Great, thanks for your insights - this will help other DCMTK users and of course also ourselves at OFFIS. :wink:
