Robust way to tell whether a file is a DICOM

dake
Posts: 5
Joined: Wed, 2005-11-02, 16:13

#1 Post by dake »

Hello

I implemented a simple function to check whether a given file is a valid DICOM file or not. I'm recursively scanning some directories, and non-DICOM files may be scattered among them.

My code is based on the assumption that "loadFile" from DcmFileFormat returns a bad status if the file isn't a DICOM file:

Code:

// fileformat and m_loadStatus (an OFCondition) are declared elsewhere in the class.
fileformat = new DcmFileFormat();
m_loadStatus = fileformat->loadFile(filename.c_str());

// bad() is true whenever loadFile reported an error.
return m_loadStatus.bad() ? DICOMENTRY_LOADING_FAILED : DICOMENTRY_LOADING_SUCCESSFUL;

Unfortunately, trying to load some invalid (binary) files produces a crash. I haven't investigated the exact cause, but it fails an assertion in isctype.c, line 68, expression: (unsigned)(c+1) <= 256.

My question is: what is the most robust way to tell whether a file is a DICOM file? For other formats, I used to read a few bytes at the beginning of the file, but this doesn't seem to be possible with DICOM.

mhavu
Posts: 15
Joined: Wed, 2005-11-23, 16:23
Location: Jyväskylä, Finland

#2 Post by mhavu »

DICOM 3.0 files begin with a 128-byte preamble followed by the ASCII characters 'D', 'I', 'C', 'M'. You might want to take a look at http://support.dcmtk.org/docs/dcmftest.html. If your files are not DICOM Part 10 compliant, you might find some of the code at http://paine.wiau.man.ac.uk/pub/doc_vxl ... ource.html helpful.
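The signature check described above could be sketched roughly as follows in plain C++, without any DCMTK calls. The function name hasDicomMagic is made up for this illustration and is not part of the toolkit; the only facts it relies on are the 128-byte preamble and the four characters that follow it.

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper (not a DCMTK function): returns true if the file
// carries the "DICM" signature right after the 128-byte preamble.
bool hasDicomMagic(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;  // file could not be opened

    // 128 bytes of preamble + 4 bytes of signature
    char header[132];
    in.read(header, sizeof(header));
    if (in.gcount() != static_cast<std::streamsize>(sizeof(header)))
        return false;  // file is too short to hold preamble and signature

    // The preamble content is not inspected; only bytes 128..131 matter.
    return std::memcmp(header + 128, "DICM", 4) == 0;
}
```

A directory scanner could call this before handing a file to the loader and skip anything that fails the test. The price is that files without the preamble are rejected, even if they contain DICOM data.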

Hope this helps,
Marko

Marco Eichelberg
OFFIS DICOM Team
Posts: 1493
Joined: Tue, 2004-11-02, 17:22
Location: Oldenburg, Germany

#3 Post by Marco Eichelberg »

For the "official" DICOM file format, the check is easy: at byte offset 128 you will find the magic word "DICM". However, there is also an old, unofficial (pre-1995) file format that is used by some older DICOM tools, and no identifier exists for it. DcmFileFormat::loadFile first checks for the magic word and, if it is absent, assumes an "old style" DICOM file and tries to load that. This may cause trouble if a genuine non-DICOM file is passed to loadFile, because some sequence of non-DICOM bytes will be interpreted as an element length (in bytes) and the toolkit will try to allocate that much memory, which may fail.
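To illustrate how arbitrary bytes can turn into a huge element length, here is a small standalone sketch. It assumes the headerless data is read as implicit VR little endian (a 2-byte group, a 2-byte element number, then a 4-byte length); that encoding is an assumption for this example, and the struct and function names are invented here. It does not reproduce what the toolkit does internally.

```cpp
#include <cstdint>

// Hypothetical type for illustration only; not a DCMTK class.
struct ElementHeader {
    std::uint16_t group;
    std::uint16_t element;
    std::uint32_t length;
};

// Decodes 8 bytes as an implicit VR little endian element header:
// bytes 0-1 = group, bytes 2-3 = element, bytes 4-7 = value length.
// The caller must supply at least 8 readable bytes.
ElementHeader readImplicitLittleEndian(const unsigned char* b)
{
    ElementHeader h;
    h.group   = static_cast<std::uint16_t>(b[0] | (b[1] << 8));
    h.element = static_cast<std::uint16_t>(b[2] | (b[3] << 8));
    h.length  = static_cast<std::uint32_t>(b[4])
              | (static_cast<std::uint32_t>(b[5]) << 8)
              | (static_cast<std::uint32_t>(b[6]) << 16)
              | (static_cast<std::uint32_t>(b[7]) << 24);
    return h;
}
```

Feeding it the 8-byte PNG signature (89 50 4E 47 0D 0A 1A 0A) gives group 0x5089, element 0x474E and a length of 0x0A1A0A0D, which is about 169 million bytes. A reader that trusts this value would try to allocate that much memory for a single element.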

dake
Posts: 5
Joined: Wed, 2005-11-02, 16:13

#4 Post by dake »

Thanks for your answers.
Marco Eichelberg wrote:For the "official" DICOM file format, the check is easy: at byte offset 128 you will find the magic word "DICM". However, there is also an old, unofficial (pre-1995) file format that is used by some older DICOM tools, and no identifier exists for it. DcmFileFormat::loadFile first checks for the magic word and, if it is absent, assumes an "old style" DICOM file and tries to load that. This may cause trouble if a genuine non-DICOM file is passed to loadFile, because some sequence of non-DICOM bytes will be interpreted as an element length (in bytes) and the toolkit will try to allocate that much memory, which may fail.
Is there a way to disable this forced attempt to read the old format? If not, please consider this a feature request for loadFile or something similar :) I don't expect to have to work with the old format. My target CT is a Lightspeed VCT, and although I haven't yet had the opportunity to get a DICOM file produced by this modern device, I'm pretty confident it sticks to the 3.0 format.

Marco Eichelberg
OFFIS DICOM Team
Posts: 1493
Joined: Tue, 2004-11-02, 17:22
Location: Oldenburg, Germany

#5 Post by Marco Eichelberg »

Actually, this feature request has been on our to-do list for some time, but it's probably not going to make it into the next release, sorry.
