Undefined behavior (assert by isalpha) in DcmItem::foundVR

All other questions regarding DCMTK

Moderator: Moderator Team

Post Reply
Message
Author
Niels Dekker
Posts: 40
Joined: Thu, 2005-05-26, 17:11
Location: Leiden (work), Amsterdam (home)
Contact:

Undefined behavior (assert by isalpha) in DcmItem::foundVR

#1 Post by Niels Dekker »

At the LKEB we often have assert failures when trying to read a file by using DcmFileFormat::loadFile(). The assert failures are caused by DcmItem::foundVR(), dcmtk-3.5.4\dcmdata\libsrc\dcitem.cxx (line 152):

Code: Select all

 
   if (isalpha(c1) && isalpha(c2))
The call stack is as follows:

Code: Select all

DcmItem::foundVR()
DcmItem::checkTransferSyntax()
DcmMetaInfo::checkAndReadPreamble()
DcmMetaInfo::read()
DcmFileFormat::read()
DcmFileFormat::loadFile()
The assert failure occurs when the argument passed to isalpha (c1 or c2) is out of range. Originally I thought that the assert failure was a Visual C++ compiler bug, but it's not. Quoting the C Standard:
The header <ctype.h> declares several functions useful for testing and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
So when the character read by DcmFileFormat::loadFile() is less than -1, we typically have undefined behaviour in our medical application!!!

Can you please get this bug out of DcmItem::foundVR()? The problem is easily solved, by casting the argument to unsigned char, before calling isalpha:

Code: Select all

   if (isalpha(OFstatic_cast(unsigned char, c1)) && isalpha(OFstatic_cast(unsigned char, c2)))
See also:
Newsgroup: microsoft.public.vc.language
Subject: [CRT] How to avoid isalpha(c) assert failure?
http://www.microsoft.com/communities/ne ... EB+isalpha

PS BTW, is this the best place, when reporting a bug like this?
Niels Dekker
Scientific programmer at LKEB, Leiden University Medical Center, The Netherlands

Michael Onken
DCMTK Developer
Posts: 2054
Joined: Fri, 2004-11-05, 13:47
Location: Oldenburg, Germany
Contact:

#2 Post by Michael Onken »

Hi Niels,

first of all, thank you for your report. You can use this forum for bug reporting or the email address dicom-bugs(at)offis(dot)de.

The assertion should fire, if something is passed to isalpha() which indeed is smaller than -1. I would say it makes more sense to change the API of foundVR() from accepting char* to Uint8* instead. This would implicate corresponding changes in checkTransferSyntax() to read the characters from the input stream into an Uint8 array instead of into (signed) chars. This solution seems to be more reliable for me than the cast you suggested in your posting. I internally tried this patch to Uint8 and for me it seems to work. However it would be very interesting if you could provide some (anonymized) example files forcing the current DCMTK to abort with an assertion. Maybe you could send them to dicom(at)offis(dot)de.

It is not clear to me, what kind of characters were read from the DICOM files that could end up in calling isalpha() with a negative value < -1. This could be the case if reading characters with ASCII code > 128 because then the signed char currently implemented in foundVR() would have an overflow to the negative range.

A patch to DCMTK in this regard would make sense in any case (if the file is corrupt or not) in my opinion. Bailing out with an assertion is not a nice way to say good bye :) Maybe the rest of the DICOM team will join this thread, if I missed some points ;-)

Regards,
Michael

Niels Dekker
Posts: 40
Joined: Thu, 2005-05-26, 17:11
Location: Leiden (work), Amsterdam (home)
Contact:

#3 Post by Niels Dekker »

Thanks for your reply, Michael!

Michael Onken (OFFIS DICOM Team) wrote:
I would say it makes more sense to change the API of foundVR() from accepting char* to Uint8* instead. This would implicate corresponding changes in checkTransferSyntax() to read the characters from the input stream into an Uint8 array instead of into (signed) chars. This solution seems to be more reliable for me than the cast you suggested in your posting.
Although I think my cast (char to unsigned char) is quite reliable, your solution definitely sounds like the better one! :-)
However it would be very interesting if you could provide some (anonymized) example files forcing the current DCMTK to abort with an assertion.
Well... it wouldn't be that interesting. Because actually we often try to read a file that isn't a DICOM file at all!!! Still I think when doing so, DcmFileFormat::loadFile() should just return an error code. It should not have assert failures or "undefined behaviour" (crashes).
A patch to DCMTK in this regard would make sense in any case (if the file is corrupt or not) in my opinion. Bailing out with an assertion is not a nice way to say good bye Maybe the rest of the DICOM team will join this thread, if I missed some points
Thank you :-)

FYI, we're currently developing a general purpose multi-fileformat image reader framework at LKEB. It will be somewhat like ITK's image IO framework, but based on various other libraries as well. When trying to read an unknown file, our "image reader factory" tries various fileformat specific readers, including the one from DCMTK, by just starting to read until it fails. So for our purpose, such a failure should be dealt with in a friendly way, just by trying another fileformat specific reader.

Jörg Riesmeier
ICSMED DICOM Services
ICSMED DICOM Services
Posts: 2217
Joined: Fri, 2004-10-29, 21:38
Location: Oldenburg, Germany

#4 Post by Jörg Riesmeier »

Because actually we often try to read a file that isn't a DICOM file at all!!! Still I think when doing so, DcmFileFormat::loadFile() should just return an error code.Because actually we often try to read a file that isn't a DICOM file at all!!! Still I think when doing so, DcmFileFormat::loadFile() should just return an error code.
In this case, you should probably use loadFile() with read mode ERM_fileOnly.
It should not have assert failures or "undefined behaviour" (crashes).
I guess that assertions only occur in debug mode, right?

Niels Dekker
Posts: 40
Joined: Thu, 2005-05-26, 17:11
Location: Leiden (work), Amsterdam (home)
Contact:

#5 Post by Niels Dekker »

Jörg Riesmeier wrote:
you should probably use loadFile() with read mode ERM_fileOnly.
Thanks. I just tried ERM_fileOnly, but unfortunately I still got the assert failures.
I guess that assertions only occur in debug mode, right?
Right. But in release mode, DcmItem::foundVR might still crash, when the parameter it passes to isalpha is out of range. Quoting MSDN:
When used with a release CRT library, isalpha will use the parameter as an index into an array, with undefined results if the parameter is not EOF or in the range of 0 through 0xFF.
From http://msdn2.microsoft.com/en-us/librar ... s.90).aspx

So please fix the bug! :-)

Michael Onken
DCMTK Developer
Posts: 2054
Joined: Fri, 2004-11-05, 13:47
Location: Oldenburg, Germany
Contact:

#6 Post by Michael Onken »

Hi Niels,
Niels Dekker wrote: So please fix the bug! :-)
fixed in CVS :wink:

Regards,
Michael

Niels Dekker
Posts: 40
Joined: Thu, 2005-05-26, 17:11
Location: Leiden (work), Amsterdam (home)
Contact:

#7 Post by Niels Dekker »

Michael Onken wrote:
fixed in CVS
Cool! Is the DCMTK CVS publicly accessible? Otherwise, can you please send me the fixed version, so that we can use it here at the LKEB in Leiden?

There's a mail address of mine at www.xs4all.nl/~nd/dekkerware
Niels Dekker
Scientific programmer at LKEB, Leiden University Medical Center, The Netherlands

Michael Onken
DCMTK Developer
Posts: 2054
Joined: Fri, 2004-11-05, 13:47
Location: Oldenburg, Germany
Contact:

#8 Post by Michael Onken »

Hello Niels,

I've send you an email how to download a recent CVS snapshot.

Regards,
Michael

Niels Dekker
Posts: 40
Joined: Thu, 2005-05-26, 17:11
Location: Leiden (work), Amsterdam (home)
Contact:

#9 Post by Niels Dekker »

Michael Onken wrote:
I've send you an email how to download a recent CVS snapshot.
Thank you very much, Michael!

-- Niels
Niels Dekker
Scientific programmer at LKEB, Leiden University Medical Center, The Netherlands

Post Reply

Who is online

Users browsing this forum: Ahrefs [Bot] and 1 guest