FINDSCU encoding of result files doesn't match

CRIno
Posts: 4
Joined: Tue, 2020-05-12, 14:35
Location: Germany

#1 Post by CRIno »

Hi,

When using findscu from DCMTK v3.6.5, the resulting XML files have the header 'encoding="UTF-8"', but the file itself is being created with ANSI encoding.

Any ideas on that one?

Kind regards

J. Riesmeier
DCMTK Developer
Posts: 2501
Joined: Tue, 2011-05-03, 14:38
Location: Oldenburg, Germany

#2 Post by J. Riesmeier »

What is the value of Specific Character Set (0008,0005)? What do you mean by "the file itself is being created with ANSI encoding"?

#3 Post by CRIno »

Opening the result file with Notepad shows "ANSI" in the status bar.
Opening the file with other editors that can switch encodings (e.g. Notepad++) messes up characters like äöü when UTF-8 is selected (so the file content is clearly not UTF-8 encoded). Switching to ANSI displays the special characters correctly.
Opening the file with .NET System.Xml.XmlDocument throws an error saying that the encoding specified in the header does not match the actual file encoding.
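
The editor observations above can also be checked programmatically. The following is a minimal Python sketch and not part of DCMTK; the function name `diagnose_encoding` and the sample string are made up for illustration. It only tests whether a byte string decodes as ASCII or as UTF-8, which is enough to tell a Latin-1 ("ANSI") file apart from a real UTF-8 file as soon as it contains characters like äöü.

```python
def diagnose_encoding(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8' or 'not utf-8'."""
    try:
        data.decode("ascii")
        return "ascii"  # pure ASCII is valid in UTF-8 and Latin-1 alike
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Typical for Latin-1 / Windows "ANSI" text containing umlauts
        return "not utf-8"


if __name__ == "__main__":
    sample = "äöü"  # hypothetical sample text
    print(diagnose_encoding(sample.encode("latin-1")))
    print(diagnose_encoding(sample.encode("utf-8")))
```

To check a real result file, read it in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the function.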

The input.xml (which is converted with xml2dcm before being sent through findscu) uses "iso-8859-1" in the XML header and writes the corresponding "ISO_IR 100" to (0008,0005).

Code:

<?xml version="1.0" encoding="iso-8859-1"?>
<file-format>
  <meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
    <element tag="0002,0010" vr="UI" len="19" name="TransferSyntaxUID">1.2.840.10008.1.2.1</element>
    <element tag="0002,0012" vr="UI" len="26" name="ImplementationClassUID">...</element>
    <element tag="0002,0013" vr="SH" len="11" name="ImplementationVersionName">...</element>
  </meta-header>
  <data-set xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
    <element tag="0008,0005" vr="CS" len="10" name="SpecificCharacterSet">ISO_IR 100</element>
    <sequence tag="0040,0100" vr="SQ" name="ScheduledProcedureStepSequence">
      <item>
        <element tag="0040,0001" vr="AE" len="7">PDC-SCU</element>
        <element tag="0008,0060" vr="CS" len="3">ECG</element>
        <element tag="0040,0010" vr="SH" len="0"></element>
      </item>
    </sequence>
    <element tag="0010,0010" vr="PN" len="0"></element>
    <element tag="0010,0020" vr="LO" len="9">000000003</element>
    <element tag="0010,0030" vr="DA" len="0"></element>
    <element tag="0010,0040" vr="CS" len="0"></element>
    <element tag="0008,0050" vr="SH" len="0"></element>
  </data-set>
</file-format>

The findscu result looks like this. The header declares UTF-8, but the non-ASCII characters do not show up correctly (see the patient name).

Code:

<?xml version="1.0" encoding="UTF-8"?>
<responses type="C-FIND">
<data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
<element tag="0008,0050" vr="SH" vm="1" len="6" name="AccessionNumber">acc no</element>
<element tag="0010,0010" vr="PN" vm="1" len="18" name="PatientName">Sch䦥r</element>
<element tag="0010,0020" vr="LO" vm="1" len="10" name="PatientID">000000003</element>
<element tag="0010,0030" vr="DA" vm="1" len="8" name="PatientBirthDate">19000101</element>
<element tag="0010,0040" vr="CS" vm="1" len="2" name="PatientSex">M</element>
<sequence tag="0040,0100" vr="SQ" card="1" name="ScheduledProcedureStepSequence">
<item card="2">
<element tag="0040,0002" vr="DA" vm="1" len="8" name="ScheduledProcedureStepStartDate">20200910</element>
<element tag="0040,0003" vr="TM" vm="1" len="6" name="ScheduledProcedureStepStartTime">101140</element>
</item>
</sequence>
</data-set>
</responses>

The findscu call is:

Code:

findscu.exe -Xx -to 30 -ta 30 -td 30 -Xs Output.xml -od <target path> <server> -aet PDC-SCU -aec DVTK_MW_SCP Input.dcm -ll debug

Last edited by CRIno on Thu, 2020-09-10, 10:35, edited 1 time in total.

#4 Post by J. Riesmeier »

To me it seems that the SCP is sending a C-FIND Response dataset with non-ASCII characters without specifying the character set that was used in (0008,0005) Specific Character Set. This is not valid according to the DICOM standard. As a result, findscu assumes ASCII encoding and therefore does not perform any character set conversion. The "UTF-8" in the XML header is a result of the -Xs (--extract-xml-single) option you've used.
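
One way to confirm this diagnosis is to look for (0008,0005) in the extracted responses. The sketch below is only an illustration in Python and makes several assumptions: the function name `datasets_missing_charset` and the sample bytes are hypothetical, and the XML layout (`responses` containing `data-set` containing `element` with a `tag` attribute) is taken from the result file shown earlier in this thread. Because the file may contain bytes that are not valid UTF-8, the bytes are first decoded as Latin-1 (which never fails) and re-encoded, purely so that the structure can be parsed.

```python
import xml.etree.ElementTree as ET
from typing import List

SPECIFIC_CHARACTER_SET = "0008,0005"


def datasets_missing_charset(xml_bytes: bytes) -> List[int]:
    """Return the indices of data-sets without a (0008,0005) element."""
    # Latin-1 maps every byte to a character, so this decode cannot fail.
    # The text content may be wrong afterwards, but the tags are intact.
    text = xml_bytes.decode("latin-1")
    root = ET.fromstring(text.encode("utf-8"))
    missing = []
    for index, dataset in enumerate(root.iter("data-set")):
        # Only top-level elements of the data-set are inspected here
        tags = {element.get("tag") for element in dataset.findall("element")}
        if SPECIFIC_CHARACTER_SET not in tags:
            missing.append(index)
    return missing


if __name__ == "__main__":
    # Hypothetical response: Latin-1 byte 0xFC under a UTF-8 header
    sample = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<responses type="C-FIND">'
        b'<data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">'
        b'<element tag="0010,0010" vr="PN">M\xfcller</element>'
        b'</data-set>'
        b'</responses>'
    )
    print(datasets_missing_charset(sample))  # [0]
```

A non-empty result together with non-ASCII bytes in the file would match the situation described above: the response carries extended characters but does not say which character set they use.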

#5 Post by CRIno »

The SCP is DVTk RIS Emulator 5.0.0

So the SCP should include (0008,0005) in the response (which it apparently does not), and then findscu would set the XML header of the output file correctly?

#6 Post by J. Riesmeier »

If the SCP correctly specified the character set used, e.g. (0008,0005) Specific Character Set = "ISO_IR 100", then findscu with option -Xs (and compiled with support for character set conversion) would convert the Latin-1 (ISO 8859-1) encoding to UTF-8. The encoding in the XML header is always "UTF-8" when using option -Xs (see the manpage of this tool for details).
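
For result files that have already been written without conversion, the same Latin-1 to UTF-8 step can be applied afterwards. The following Python sketch is a workaround under stated assumptions, not what findscu does internally: the function name `reencode_to_utf8` is made up, and the fallback encoding is assumed to be ISO 8859-1 because that is what the query in this thread used. If the SCP actually sent a different character set, the output would be wrong, so the proper fix remains an SCP that sets (0008,0005).

```python
def reencode_to_utf8(data: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return UTF-8 bytes, converting from `fallback` if needed."""
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8 (or pure ASCII), leave as is
    except UnicodeDecodeError:
        # Assumption: bytes that are not UTF-8 are in the fallback encoding
        return data.decode(fallback).encode("utf-8")


if __name__ == "__main__":
    latin1 = "äöü".encode("iso-8859-1")  # hypothetical sample content
    fixed = reencode_to_utf8(latin1)
    print(fixed.decode("utf-8"))  # äöü
```

Since the XML header written with -Xs already says "UTF-8", only the file content needs to be converted; the header can stay unchanged.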

#7 Post by CRIno »

OK, thank you very much! :)
