Hi Per,
Good question. I looked into the code. When it encounters an unknown VR, it checks whether the VR implies a 4-byte length field, which is the expected behaviour for an unknown VR, as you pointed out.
However, there is indeed a small heuristic, introduced because a large German company (...) built systems in the past that wrote "??" as the explicit VR, followed by a 2-byte length field. We are able to read those encodings.
Thus, if the parser finds an unknown VR, it checks for "??" and, as an extension of that check, also tests whether either of the two VR characters lies outside the range of uppercase letters (A-Z), which hints that something is fishy. In that case, the VR is mapped internally to the special VR EVR_UNKNOWN2B (aimed mostly at "??"), which is associated with a 2-byte length field. This applies to the wrongly encoded VRs discussed in this thread. This is the relevant code from dcmvr.cc:
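// c1, c2: the two characters of the VR string (c2 is only read if c1 is not the terminator)
register char c1 = *vrName;
register char c2 = (c1)?(*(vrName+1)):('\0');
// "??" written as explicit VR: treat as unknown VR with a 2-byte length field
if ((c1=='?')&&(c2=='?')) vr = EVR_UNKNOWN2B;
// no known VR matched and a character is outside A-Z: also assume a 2-byte length field
if (!found && ((c1<'A')||(c1>'Z')||(c2<'A')||(c2>'Z'))) vr = EVR_UNKNOWN2B;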
If both characters turn out to be in A-Z, a 4-byte length field is assumed.
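To make the resulting decision easier to follow, here is a minimal standalone sketch of the rule for a VR string that matched no known VR. The function name unknownVrLengthFieldBytes is made up for illustration and is not part of DCMTK; it only mirrors the two checks from the dcmvr.cc excerpt and assumes the caller has already established that the VR is unknown.

```cpp
// Hypothetical helper (not a DCMTK function): for a VR string that matched
// no known VR, return the assumed size of the length field in bytes.
int unknownVrLengthFieldBytes(const char *vrName)
{
    // Read the two VR characters; do not read past an empty string.
    const char c1 = *vrName;
    const char c2 = c1 ? *(vrName + 1) : '\0';

    // "??" and any pair with a character outside A-Z are treated like
    // EVR_UNKNOWN2B, i.e. a 2-byte length field is assumed.
    const bool bothUppercase =
        (c1 >= 'A' && c1 <= 'Z') && (c2 >= 'A' && c2 <= 'Z');

    // Two uppercase letters look like a well-formed VR, so a 4-byte
    // length field is assumed.
    return bothUppercase ? 4 : 2;
}
```

With this sketch, "??" and "X1" yield 2, while an unrecognized but well-formed pair such as "ZZ" yields 4.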
Best regards,
Michael
P.S.: However, I did not verify this in the debugger.