I implemented a simple function to check whether a given file is a valid DICOM file. I'm recursively scanning some directories, and non-DICOM files can turn up here and there.
My code is based on the assumption that "loadFile" from DcmFileFormat will return a bad status if the file isn't a DICOM file:
Unfortunately, trying to load some invalid (binary) files produces a crash. I haven't investigated the exact cause, but it crashed in an assert (isctype.c, line 68, expression: (unsigned)(c+1) <= 256).
My question is: what is the most robust way to tell whether a file is a DICOM file? For other formats, I used to read a few bytes at the beginning of the file, but this doesn't seem to be possible with DICOM.
For the "official" DICOM file format, a check is easy: at byte offset 128 you will find the magic word "DICM". However, there is also an old, unofficial (pre-1995) file format that is used by some older DICOM tools. For this format, no identifier exists. DcmFileFormat::loadFile first checks for the magic word and, if it is absent, assumes an "old style" DICOM file and tries to load that. This may cause trouble if a real non-DICOM file is passed to loadFile, because some sequence of non-DICOM bytes will be interpreted as an element length (in bytes) and the toolkit will try to allocate sufficient memory, which may fail.
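The magic-word check described above can be sketched in plain standard C++, without calling into the toolkit at all. This is only a minimal illustration: hasDicomMagic is a hypothetical helper name, not a DCMTK function, and it deliberately reports old-style files (which have no preamble and no magic word) as non-DICOM.

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper (not part of DCMTK): returns true if the file carries
// the "DICM" magic word at byte offset 128, i.e. right after the preamble.
// Assumption: old-style files without a preamble are treated as non-DICOM.
bool hasDicomMagic(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in)
        return false;                 // cannot open: treat as non-DICOM

    in.seekg(128, std::ios::beg);     // skip the 128-byte preamble
    char magic[4] = {0, 0, 0, 0};
    in.read(magic, 4);
    if (in.gcount() != 4)
        return false;                 // file is shorter than 132 bytes

    return std::memcmp(magic, "DICM", 4) == 0;
}
```

A directory scan could call this helper first and only hand files that pass to DcmFileFormat::loadFile, so that arbitrary binary files never reach the old-style parsing path. The trade-off is that genuine old-style files would be skipped as well.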
Marco Eichelberg wrote: For the "official" DICOM file format, a check is easy: at byte offset 128 you will find the magic word "DICM". However, there is also an old, unofficial (pre-1995) file format that is used by some older DICOM tools. For this format, no identifier exists. DcmFileFormat::loadFile first checks for the magic word and, if it is absent, assumes an "old style" DICOM file and tries to load that. This may cause trouble if a real non-DICOM file is passed to loadFile, because some sequence of non-DICOM bytes will be interpreted as an element length (in bytes) and the toolkit will try to allocate sufficient memory, which may fail.
Is there a way to disable this forced attempt to read the old format? If not, please consider this a feature request for loadFile or something similar. I guess I won't have to work with this old format: my target CT is a Lightspeed VCT, and although I haven't had the opportunity to get a DICOM file produced by this modern device, I'm pretty confident it sticks to the 3.0 format.