Broken character in DICOMDIR using dcmmkdir

Bruno.Milutin
Posts: 2
Joined: Thu, 2022-02-24, 13:22

Broken character in DICOMDIR using dcmmkdir

#1 Post by Bruno.Milutin »

Hi all,

I found a strange problem when creating DICOMDIR files using dcmmkdir. The DICOM files contain German umlauts, e.g. in the patient's name. The encoding of the DICOM files is ISO_IR 100, and the DICOMDIR also has the ISO_IR 100 encoding.

In most cases there is no problem and the DICOMDIR has the same patient name and characters as the source DICOM files.
In some cases the characters in the DICOMDIR are reproducibly wrong (characters from a different charset).

How to reproduce:

1. Directory structure is important
-> Create this directory structure: C:\UMLAUTE\SUBDIR\DCM
- When calling dcmmkdir with source path C:\UMLAUTE\SUBDIR the characters are NOT broken
- When calling dcmmkdir with source path C:\UMLAUTE the characters are broken
- Regarding the path and separators, I tried forward slashes, backslashes and quoting with ""; dcmmkdir accepts all of them, but none has any effect on the encoding or the broken umlauts

2. Two files are required
-> Place two DICOM files in the folder: C:\UMLAUTE\SUBDIR\DCM
- I tried each file on its own and the DICOMDIR was OK; the encoding error only occurs when two or more files are added at the same time

3. Content of the DICOM files
-> The patient name should of course contain at least one umlaut character, e.g. "Müller"
- There are two more DICOM tags in the source files that are relevant:
- 0008/0081 (Institution Address) should be >=39 chars (I can't reproduce the problem when shortening the content to 38 chars)
- 0008/0080 (Institution Name) should be >=20 chars

4. Run dcmmkdir
-> dcmmkdir.exe -Pdv -nb -Nxc +m +r +I +id C:\UMLAUTE +D DICOMDIR
- In this case the umlaut characters in the DICOMDIR are broken
- When changing the directory (e.g. to C:\UMLAUTE\SUBDIR) the umlauts are correct again


Some information about the broken characters:

The original DICOM file contains e.g. Müller (ISO_IR 100). ü == 0xfc
If the DICOMDIR was created correctly, it has the same encoding and also 0xfc for ü
If the DICOMDIR is broken, the character 0xfc has changed to 0xd1 0x8c; the encoding in the DICOMDIR is still ISO_IR 100 and there is no error or warning in the console
If you dump the DICOMDIR, the characters - even the broken ones - are displayed correctly, so you need to check the bytes in a text or hex editor.


To make it easier to reproduce, I can provide the DICOM files as well as a working and a broken DICOMDIR file.


Screenshot:
https://drive.google.com/file/d/1MLRiUh ... sp=sharing

FILE1: https://drive.google.com/file/d/1Bu5C15 ... sp=sharing
FILE2: https://drive.google.com/file/d/1QwQpY8 ... sp=sharing

DICOMDIR_BROKEN: https://drive.google.com/file/d/1NlpOwt ... sp=sharing
DICOMDIR_OK: https://drive.google.com/file/d/1fs0NxB ... sp=sharing


Thanks for any hints
Bruno

J. Riesmeier
DCMTK Developer
Posts: 2501
Joined: Tue, 2011-05-03, 14:38
Location: Oldenburg, Germany

Re: Broken character in DICOMDIR using dcmmkdir

#2 Post by J. Riesmeier »

> The original DICOM file contains e.g. Müller (ISO_IR 100). ü == 0xfc
> If the DICOMDIR was created correctly, it has the same encoding and also 0xfc for ü
> If the DICOMDIR is broken, the character 0xfc has changed to 0xd1 0x8c; the encoding in the DICOMDIR is still ISO_IR 100 and there is no error or warning in the console
> If you dump the DICOMDIR, the characters - even the broken ones - are displayed correctly, so you need to check the bytes in a text or hex editor.

Thank you for your report. Unfortunately, I cannot confirm your description above: the "ü" is encoded with the same byte (0xfc) in both DICOMDIR files:

Code:

> hexdump -C DICOMDIR_BROKEN 
[...]
000001e0  20 31 30 30 10 00 10 00  50 4e 06 00 4d fc 6c 6c  | 100....PN..M.ll|
[...]

Code:

> hexdump -C DICOMDIR_OK 
[...]
000001e0  30 30 10 00 10 00 50 4e  06 00 4d fc 6c 6c 65 72  |00....PN..M.ller|
[...]
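
The same check can be done without a hex viewer. The following is only a minimal sketch, not part of DCMTK: it assumes the file uses Explicit VR Little Endian (as the hexdumps above show) and does a naive byte scan for the Patient's Name element (0010,0010) with VR "PN". The function name `find_patient_names` is made up for this example, and a byte scan like this can in principle produce false matches inside other data.

```python
import struct


def find_patient_names(data: bytes) -> list[bytes]:
    """Return the raw value bytes of every (0010,0010) PN element found.

    Naive scan: looks for the tag in little-endian byte order followed by
    the VR "PN", then reads the 2-byte little-endian value length.
    """
    marker = b"\x10\x00\x10\x00PN"
    values = []
    pos = data.find(marker)
    while pos != -1:
        length_at = pos + len(marker)
        if length_at + 2 > len(data):
            break  # truncated element, nothing more to read
        (length,) = struct.unpack_from("<H", data, length_at)
        start = length_at + 2
        values.append(data[start:start + length])
        pos = data.find(marker, pos + 1)
    return values


# Bytes copied from the DICOMDIR_OK hexdump line above.
sample = bytes.fromhex("30 30 10 00 10 00 50 4e 06 00 4d fc 6c 6c 65 72")

for raw in find_patient_names(sample):
    # ISO_IR 100 corresponds to ISO 8859-1 (Latin-1).
    print(raw.hex(" "), "->", raw.decode("latin-1"))
```

For a real file, the bytes would be read with something like `open("DICOMDIR", "rb").read()` and passed to the same function.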

Bruno.Milutin
Posts: 2
Joined: Thu, 2022-02-24, 13:22

Re: Broken character in DICOMDIR using dcmmkdir

#3 Post by Bruno.Milutin »

Thanks for the fast response. I was very surprised and checked it with another hex dump tool. You're right.
The problem was the wrong display in Notepad++ with the Hex View plugin. I don't think I will use this hex plugin again ;-)
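
One possible explanation for the bytes 0xd1 0x8c, which is an assumption and was not confirmed in this thread: if a viewer interprets the byte 0xfc as Windows-1251 instead of ISO 8859-1 and then shows the UTF-8 encoding of the resulting character, exactly these two bytes come out. A short sketch of that round trip:

```python
# The byte that is really stored in the DICOMDIR for "ü" (ISO_IR 100 / Latin-1).
latin1_byte = b"\xfc"

# Assumption: the viewer decodes the byte as Windows-1251 (Cyrillic) ...
misread = latin1_byte.decode("cp1251")

# ... and then displays the UTF-8 bytes of that character.
reencoded = misread.encode("utf-8")

print(repr(misread), reencoded.hex(" "))  # 'ь' d1 8c
```

This would also fit the observation that the wrong characters looked like they came from a different charset, while the file on disk was unchanged.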

Thanks again and best regards
Bruno

J. Riesmeier
DCMTK Developer
Posts: 2501
Joined: Tue, 2011-05-03, 14:38
Location: Oldenburg, Germany

Re: Broken character in DICOMDIR using dcmmkdir

#4 Post by J. Riesmeier »

You're welcome. I would have been really surprised if DCMTK's DICOMDIR generation code had converted any characters.
