FINDSCU encoding of result files doesnt match
Hi,
when using findscu from DCMTK v3.6.5, the resulting XML files have the header 'encoding="UTF-8"', but the file itself is being created with ANSI encoding.
Any ideas on that one?
Kind regards
- DCMTK Developer
Re: FINDSCU encoding of result files doesnt match
What is the value of Specific Character Set (0008,0005)? What do you mean by "the file itself is being created with ANSI encoding"?
Re: FINDSCU encoding of result files doesnt match
Opening the result file with Notepad shows "ANSI" in the status bar.
Opening the file in editors that can switch encodings (e.g. Notepad++) garbles characters like äöü when UTF-8 is selected, so the file content is clearly not UTF-8 encoded. Switching to ANSI displays the special characters correctly.
Opening the file with .NET System.Xml.XmlDocument throws an error saying that the encoding specified in the header does not match the actual file encoding.
The input.xml (which is converted with xml2dcm before being sent through findscu) uses "iso-8859-1" in the XML header and writes the corresponding "ISO_IR 100" to (0008,0005).
Below are the input.xml, the findscu result, and the findscu call, in that order. The result declares UTF-8 in its header, but the non-ASCII characters do not show up correctly (see the patient name).
Input.xml:
<?xml version="1.0" encoding="iso-8859-1"?>
<file-format>
<meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0002,0010" vr="UI" len="19" name="TransferSyntaxUID">1.2.840.10008.1.2.1</element>
<element tag="0002,0012" vr="UI" len="26" name="ImplementationClassUID">...</element>
<element tag="0002,0013" vr="SH" len="11" name="ImplementationVersionName">...</element>
</meta-header>
<data-set xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0008,0005" vr="CS" len="10" name="SpecificCharacterSet">ISO_IR 100</element>
<sequence tag="0040,0100" vr="SQ" name="ScheduledProcedureStepSequence">
<item>
<element tag="0040,0001" vr="AE" len="7">PDC-SCU</element>
<element tag="0008,0060" vr="CS" len="3">ECG</element>
<element tag="0040,0010" vr="SH" len="0"></element>
</item>
</sequence>
<element tag="0010,0010" vr="PN" len="0"></element>
<element tag="0010,0020" vr="LO" len="9">000000003</element>
<element tag="0010,0030" vr="DA" len="0"></element>
<element tag="0010,0040" vr="CS" len="0"></element>
<element tag="0008,0050" vr="SH" len="0"></element>
</data-set>
</file-format>
Output.xml (findscu result):
<?xml version="1.0" encoding="UTF-8"?>
<responses type="C-FIND">
<data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
<element tag="0008,0050" vr="SH" vm="1" len="6" name="AccessionNumber">acc no</element>
<element tag="0010,0010" vr="PN" vm="1" len="18" name="PatientName">Sch䦥r</element>
<element tag="0010,0020" vr="LO" vm="1" len="10" name="PatientID">000000003</element>
<element tag="0010,0030" vr="DA" vm="1" len="8" name="PatientBirthDate">19000101</element>
<element tag="0010,0040" vr="CS" vm="1" len="2" name="PatientSex">M</element>
<sequence tag="0040,0100" vr="SQ" card="1" name="ScheduledProcedureStepSequence">
<item card="2">
<element tag="0040,0002" vr="DA" vm="1" len="8" name="ScheduledProcedureStepStartDate">20200910</element>
<element tag="0040,0003" vr="TM" vm="1" len="6" name="ScheduledProcedureStepStartTime">101140</element>
</item>
</sequence>
</data-set>
</responses>
findscu call:
findscu.exe -Xx -to 30 -ta 30 -td 30 -Xs Output.xml -od <target path> <server> -aet PDC-SCU -aec DVTK_MW_SCP Input.dcm -ll debug
Last edited by CRIno on Thu, 2020-09-10, 10:35, edited 1 time in total.
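The mismatch described above can be reproduced without an editor. The following is a minimal Python sketch, not part of DCMTK, that compares the encoding named in the XML declaration with the bytes that follow. The helper names and the sample name "Müller" are made up for illustration.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very start of the data.
_DECL_RE = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None if there is none."""
    match = _DECL_RE.match(data)
    return match.group(1).decode("ascii") if match else None


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if the bytes are valid in the given encoding."""
    try:
        data.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        return False
    return True


def header_matches_content(data: bytes) -> bool:
    """Check the content against the declared encoding (UTF-8 if none is declared)."""
    return decodes_as(data, declared_encoding(data) or "utf-8")


# Hypothetical sample: the header says UTF-8, but the umlaut is a single Latin-1 byte.
SAMPLE = b'<?xml version="1.0" encoding="UTF-8"?>\n<name>M\xfcller</name>\n'

if __name__ == "__main__":
    print(declared_encoding(SAMPLE), header_matches_content(SAMPLE))
```

To check a real file, the same functions can be applied to the bytes read with `open(path, "rb").read()`.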
Re: FINDSCU encoding of result files doesnt match
It seems to me that the SCP is sending a C-FIND response dataset that contains non-ASCII characters without specifying the character set used in (0008,0005) Specific Character Set. This is not valid according to the DICOM standard. As a result, findscu assumes ASCII encoding and therefore does not perform any character set conversion. The "UTF-8" in the XML header is a result of the -Xs (--extract-xml-single) option you used.
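The condition described here (non-ASCII bytes in a response that carries no (0008,0005) element) can be checked roughly on the XML output itself. The Python sketch below is only an approximation under stated assumptions: it inspects the whole file instead of each data-set, relies on the `<element tag="...">` layout shown in the listings above, and uses made-up helper names and sample data.

```python
import re

# Finds the value of an <element> whose tag attribute is 0008,0005 (Specific Character Set).
_CHARSET_RE = re.compile(rb'<element[^>]*tag="0008,0005"[^>]*>([^<]*)</element>')


def specific_character_set(xml_bytes: bytes):
    """Return the (0008,0005) value found in the XML, or None if absent or empty."""
    match = _CHARSET_RE.search(xml_bytes)
    if match is None:
        return None
    value = match.group(1).strip()
    return value.decode("ascii", errors="replace") or None


def has_non_ascii(xml_bytes: bytes) -> bool:
    """Return True if any byte lies outside the 7-bit ASCII range."""
    return any(byte > 0x7F for byte in xml_bytes)


def missing_charset_for_non_ascii(xml_bytes: bytes) -> bool:
    """Flag output that contains non-ASCII bytes but names no character set."""
    return has_non_ascii(xml_bytes) and specific_character_set(xml_bytes) is None


# Hypothetical samples: one without and one with a Specific Character Set element.
BAD = b'<data-set><element tag="0010,0010" vr="PN">M\xfcller</element></data-set>'
GOOD = (b'<data-set><element tag="0008,0005" vr="CS">ISO_IR 100</element>'
        b'<element tag="0010,0010" vr="PN">M\xfcller</element></data-set>')

if __name__ == "__main__":
    print(missing_charset_for_non_ascii(BAD), missing_charset_for_non_ascii(GOOD))
```

A check on the DICOM dataset itself would be more reliable than this text-level test. The sketch only mirrors what is visible in the XML files discussed in this thread.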
Re: FINDSCU encoding of result files doesnt match
The SCP is DVTk RIS Emulator 5.0.0
So the SCP should include (0008,0005) in the response (which it apparently does not), and then findscu would set the XML header of the output XML correctly?
Re: FINDSCU encoding of result files doesnt match
If the SCP correctly specified the character set used, e.g. (0008,0005) Specific Character Set = "ISO_IR 100", then findscu with option -Xs (and compiled with support for character set conversion) would convert the Latin-1 (ISO 8859-1) encoding to UTF-8. The encoding in the XML header is always "UTF-8" when using option -Xs (see the manpage of this tool for details).
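Until the SCP sends (0008,0005), a client-side workaround might be to redo the Latin-1 to UTF-8 conversion on the result file after the fact. The Python sketch below is not what findscu does internally. It assumes that every non-UTF-8 file is really ISO 8859-1, and the function name is hypothetical. Note that "ANSI" in Notepad usually means Windows-1252, which differs from ISO 8859-1 in the byte range 0x80 to 0x9F.

```python
def latin1_xml_to_utf8(data: bytes) -> bytes:
    """Re-encode XML bytes to UTF-8, assuming invalid UTF-8 input is ISO 8859-1."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the bytes are Latin-1, matching "ISO_IR 100" in the query.
        return data.decode("iso-8859-1").encode("utf-8")
    # Already valid UTF-8 (this includes pure ASCII), so leave it untouched.
    return data


if __name__ == "__main__":
    # Hypothetical sample: a single Latin-1 byte for the umlaut.
    print(latin1_xml_to_utf8(b"<name>M\xfcller</name>"))
```

Because the XML header written with -Xs already says "UTF-8", the converted bytes match the declaration and no header change is needed.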
Re: FINDSCU encoding of result files doesnt match
OK, thank you very much!