Hi all,
I found a strange problem when creating DICOMDIR files with dcmmkdir. The DICOM files contain German umlauts, e.g. in the patient's name. The DICOM files are encoded as ISO_IR 100, and the DICOMDIR is written with ISO_IR 100 as well.
In most cases there is no problem, and the DICOMDIR has the same patient name and characters as the source DICOM files.
In some cases, however, the characters in the DICOMDIR are reproducibly wrong (characters from a different character set).
How to reproduce:
1. Directory structure is important
-> Create this directory structure: C:\UMLAUTE\SUBDIR\DCM
- When calling dcmmkdir with source path C:\UMLAUTE\SUBDIR the characters are NOT broken
- When calling dcmmkdir with source path C:\UMLAUTE the characters are broken
- Regarding the path and separators, I tried forward slashes, backslashes and quoting with "". dcmmkdir accepts all of them, but none has any effect on the encoding or the broken umlauts
2. Two files are required
-> Place two DICOM files in the folder: C:\UMLAUTE\SUBDIR\DCM
- I tried each file on its own and the DICOMDIR was OK; the encoding error only occurs when two or more files are added at the same time
3. Content of the DICOM files
-> The patient name should of course contain at least one umlaut character, e.g. "Müller"
- Two more DICOM tags in the source files are relevant:
- 0008/0081 should be >= 39 chars (I can't reproduce the problem when shortening the content to 38 chars)
- 0008/0080 should be >= 20 chars
4. Run dcmmkdir
-> dcmmkdir.exe -Pdv -nb -Nxc +m +r +I +id C:\UMLAUTE +D DICOMDIR
- In this case the umlaut characters in the DICOMDIR are broken
- When changing the directory (e.g. to C:\UMLAUTE\SUBDIR) the umlauts are correct again
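The two runs described in step 4 could be scripted roughly as follows. This is only a sketch: the options are copied from the command line above, while the helper name `build_command` and the output file names `DICOMDIR_ROOT` and `DICOMDIR_SUBDIR` are made up for illustration so that the two results can be compared side by side.

```python
import shutil
import subprocess
from typing import List

# Options copied verbatim from the command line in step 4.
OPTIONS = ["-Pdv", "-nb", "-Nxc", "+m", "+r", "+I"]


def build_command(input_dir: str, output_file: str, exe: str = "dcmmkdir") -> List[str]:
    """Assemble the dcmmkdir call from step 4 for a given input directory."""
    return [exe, *OPTIONS, "+id", input_dir, "+D", output_file]


if __name__ == "__main__":
    # Hypothetical output names so the two runs do not overwrite each other.
    runs = [
        (r"C:\UMLAUTE", "DICOMDIR_ROOT"),
        (r"C:\UMLAUTE\SUBDIR", "DICOMDIR_SUBDIR"),
    ]
    for input_dir, output_file in runs:
        command = build_command(input_dir, output_file)
        print(" ".join(command))
        if shutil.which(command[0]):  # only run when dcmmkdir is on the PATH
            subprocess.run(command, check=False)
```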
Some info about the broken characters:
The original DICOM file contains e.g. Müller (ISO_IR 100), with ü == 0xfc.
If the DICOMDIR was created correctly, it has the same encoding and also 0xfc for ü.
If the DICOMDIR is broken, the byte 0xfc has changed to 0xd1 0x8c. The encoding declared in the DICOMDIR is still ISO_IR 100, and there is no error or warning in the console.
If you dump the DICOMDIR, the characters, even the broken ones, are displayed correctly. So you need to check the bytes in a text or hex editor.
To make it easier to reproduce, I can provide the DICOM files as well as working and broken DICOMDIR files.
Screenshot:
https://drive.google.com/file/d/1MLRiUh ... sp=sharing
FILE1: https://drive.google.com/file/d/1Bu5C15 ... sp=sharing
FILE2: https://drive.google.com/file/d/1QwQpY8 ... sp=sharing
DICOMDIR_BROKEN: https://drive.google.com/file/d/1NlpOwt ... sp=sharing
DICOMDIR_OK: https://drive.google.com/file/d/1fs0NxB ... sp=sharing
Thanks for any hints
Bruno
Broken character in DICOMDIR using dcmmkdir
- DCMTK Developer
Re: Broken character in DICOMDIR using dcmmkdir
Thank you for your report. Unfortunately, I cannot confirm your description above: the "ü" is encoded with the same byte (0xfc) in both DICOMDIR files:
> hexdump -C DICOMDIR_BROKEN
[...]
000001e0 20 31 30 30 10 00 10 00 50 4e 06 00 4d fc 6c 6c | 100....PN..M.ll|
[...]
> hexdump -C DICOMDIR_OK
[...]
000001e0 30 30 10 00 10 00 50 4e 06 00 4d fc 6c 6c 65 72 |00....PN..M.ller|
[...]
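The same check can be done without any editor or plugin by reading the raw bytes directly. The following is a minimal sketch: the helper names `find_all` and `hex_line` are made up, and the file names are assumed to match the labels of the uploaded files.

```python
from pathlib import Path
from typing import List


def find_all(data: bytes, needle: bytes) -> List[int]:
    """Return every offset at which `needle` occurs in `data`."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def hex_line(data: bytes, offset: int, width: int = 16) -> str:
    """Format up to `width` bytes starting at `offset`, hexdump style."""
    chunk = data[offset:offset + width]
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{offset:08x}  {hex_part}  |{text_part}|"


if __name__ == "__main__":
    # Assumed local file names; adjust to wherever the files were saved.
    for name in ("DICOMDIR_OK", "DICOMDIR_BROKEN"):
        path = Path(name)
        if not path.exists():
            continue
        data = path.read_bytes()
        # "Müller" in ISO_IR 100 (ISO 8859-1) is 4d fc 6c 6c 65 72.
        for offset in find_all(data, b"M\xfcller"):
            print(name, hex_line(data, offset))
        # The byte pair reported as "broken" in the first post.
        print(name, "contains d1 8c:", b"\xd1\x8c" in data)
```

If both files list the name with `fc` and neither contains `d1 8c`, the files themselves are fine and the difference lies in the viewer.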
Re: Broken character in DICOMDIR using dcmmkdir
Thanks for the fast response. I was very surprised and checked it with another hex dumper. You're right.
The problem was an incorrect display in Notepad++ with the Hex View plugin. I don't think I will use this hex plugin again.
Thanks again and best regards
Bruno
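A possible explanation for the 0xd1 0x8c display, which is a guess and not confirmed anywhere in this thread: if a viewer reads the byte 0xfc as Windows-1251 (Cyrillic) instead of ISO 8859-1 and then shows the result as UTF-8, it ends up with exactly those two bytes. The sketch below illustrates this; the function name `misread` is made up, and the choice of `cp1251` is an assumption about what the plugin may have done.

```python
def misread(raw: bytes, assumed_codec: str) -> bytes:
    """Decode `raw` with `assumed_codec`, then re-encode the text as UTF-8."""
    return raw.decode(assumed_codec).encode("utf-8")


raw = b"M\xfcller"  # bytes as stored in both DICOMDIR files

# Assumption: the viewer treated the data as Windows-1251 (Cyrillic).
print(misread(raw, "cp1251").hex(" "))   # 4d d1 8c 6c 6c 65 72

# For comparison: a correct ISO 8859-1 reading, converted to UTF-8.
print(misread(raw, "latin-1").hex(" "))  # 4d c3 bc 6c 6c 65 72
```

Under this assumption the file on disk is untouched; only the viewer's interpretation of the byte differs.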
- DCMTK Developer
Re: Broken character in DICOMDIR using dcmmkdir
You're welcome. I would have been really surprised if DCMTK's DICOMDIR generation code converted any characters.