Unicode conversion

Jan Schlamelcher
OFFIS DICOM Team
Posts: 318
Joined: Mon, 2014-03-03, 09:51
Location: Oldenburg, Germany

Re: Unicode conversion

#16 Post by Jan Schlamelcher » Wed, 2018-09-05, 10:52

Soumya Basheer wrote: Which is better to use, ICU or ICONV? Which one supports more languages?
ICU is in general a bigger library than ICONV, with support for many more things, e.g. heuristically detecting the encoding of some text. However, the current DCMTK (3.6.3) only uses a subset of its possibilities (the implementation is still incomplete), which leaves you with less than what you get when using the ICONV library. In short: today, ICONV is better, while ICU will probably be the better bet for the future.

Soumya Basheer
Posts: 17
Joined: Wed, 2018-07-11, 13:21

Re: Unicode conversion

#17 Post by Soumya Basheer » Wed, 2018-09-05, 11:10

Could you please tell me which character sets I can pass to the iconv library?
When I gave ISO 2022 IR 87, I got this error:
"Cannot select source character set: SpecificCharacterSet (0008,0005) value 'ISO 2022 IR 87' not supported"
I need to convert Japanese kanji script. Could you please help me with this?

I see it is defined in http://dicom.nema.org/medical/dicom/cur ... D.6.2.html

J. Riesmeier
DCMTK Developer
Posts: 2295
Joined: Tue, 2011-05-03, 14:38
Location: Oldenburg, Germany

Re: Unicode conversion

#18 Post by J. Riesmeier » Wed, 2018-09-05, 16:51

Short answer: according to the DICOM standard, 'ISO 2022 IR 87' is only allowed for character sets with code extensions, i.e. if multiple values are given in the Specific Character Set (0008,0005) attribute. For Japanese characters without code extensions (ISO 2022), you have to use 'ISO_IR 13' as the source character set (but, of course, only if this is applicable to your input data).

Soumya Basheer
Posts: 17
Joined: Wed, 2018-07-11, 13:21

Re: Unicode conversion

#19 Post by Soumya Basheer » Thu, 2018-09-06, 06:41

I need to use ISO 2022 IR 87 (kanji, extended). Does DCMTK support it? I am getting a "not supported" error.

With ICU I could convert using ISO_IR 13, but with iconv I get an "illegal byte sequence" error. Is there something wrong with my understanding?
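
One possible way to narrow down an "illegal byte sequence" error like this (a minimal sketch, not something confirmed in this thread) is to check whether the raw element value really fits ISO_IR 13, i.e. JIS X 0201. That character set is single-byte: the bytes 0x80 to 0xA0 and 0xE0 to 0xFF are unassigned, and an ESC byte (0x1B) usually indicates ISO 2022 code extensions, which a single-valued 'ISO_IR 13' does not declare. The helper name firstSuspiciousByte is hypothetical, and how iconv or ICU actually treat such bytes is not verified here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of the first byte that is either
// unassigned in JIS X 0201 (0x80-0xA0, 0xE0-0xFF) or an ESC (0x1B), which
// hints at ISO 2022 escape sequences. Returns std::string::npos if no such
// byte is found.
std::size_t firstSuspiciousByte(const std::string &bytes)
{
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        const bool escape = (c == 0x1B);
        const bool unassigned = (c >= 0x80 && c <= 0xA0) || (c >= 0xE0);
        if (escape || unassigned)
            return i;
    }
    return std::string::npos;
}
```

If this returns an index for the raw Patient's Name bytes, the data probably does not match the declared character set, which would be worth checking before blaming either conversion library.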

J. Riesmeier
DCMTK Developer
Posts: 2295
Joined: Tue, 2011-05-03, 14:38
Location: Oldenburg, Germany

Re: Unicode conversion

#20 Post by J. Riesmeier » Thu, 2018-09-06, 08:06

As I wrote in my previous posting, "ISO 2022 IR 87" is only used for character sets with code extensions, which requires providing multiple values in the Specific Character Set (0008,0005) attribute. For example, "ISO 2022 IR 13\ISO 2022 IR 87" would be a valid value.
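
The rule described above can be sketched as a small check on the attribute value. This is only an illustration of the statement in this posting, not DCMTK code and not a full validation against the DICOM standard; splitValues and isSingleCodeExtensionTerm are hypothetical helper names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split a multi-valued DICOM string at the backslash delimiter.
// Always yields at least one (possibly empty) component.
std::vector<std::string> splitValues(const std::string &value)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = value.find('\\', start);
        if (pos == std::string::npos) {
            parts.push_back(value.substr(start));
            break;
        }
        parts.push_back(value.substr(start, pos - start));
        start = pos + 1;
    }
    return parts;
}

// Hypothetical check for the case discussed in this thread: returns true if
// Specific Character Set holds a single value that is an "ISO 2022 ..." term,
// i.e. a code-extension term used without multiple values.
bool isSingleCodeExtensionTerm(const std::string &specificCharacterSet)
{
    const std::vector<std::string> parts = splitValues(specificCharacterSet);
    assert(!parts.empty()); // guaranteed by splitValues
    if (parts.size() > 1)
        return false;
    return parts[0].compare(0, 8, "ISO 2022") == 0;
}
```

With this sketch, isSingleCodeExtensionTerm("ISO 2022 IR 87") is true (the case that was rejected as not supported), while the multi-valued "ISO 2022 IR 13\ISO 2022 IR 87" and the single value "ISO_IR 13" both give false.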

Soumya Basheer
Posts: 17
Joined: Wed, 2018-07-11, 13:21

Re: Unicode conversion

#21 Post by Soumya Basheer » Thu, 2018-09-06, 10:21

I have loaded a file and converted it to UTF-8 using ICU:
#include "dcmtk/dcmdata/dctk.h"
DcmFileFormat fileformat;
OFString reqpatientName;
if (fileformat.loadFile("E:\\mod\\A00268282.dcm").good())
{
    DcmDataset *dataset1 = fileformat.getDataset();
    // convert all affected element values in the dataset to UTF-8
    OFCondition status1 = dataset1->convertToUTF8();
    dataset1->findAndGetOFString(DCM_PatientName, reqpatientName);
}
The patient name is now in an OFString, but I cannot see it in Japanese. Is there any method in DCMTK to view it in Japanese?
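
A note on viewing the result: assuming convertToUTF8() succeeded, the OFString should hold UTF-8 encoded bytes, and whether they appear as Japanese depends on the console, debugger or GUI used to display them. One way to check the content independently of any display is to decode the bytes into Unicode code points. The following is a minimal sketch using only the standard library; decodeUtf8 is a hypothetical helper, not part of DCMTK, and it does not reject overlong encodings or surrogate values.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decodes UTF-8 bytes into Unicode code points.
// Returns false on a malformed or truncated sequence.
bool decodeUtf8(const std::string &in, std::vector<std::uint32_t> &out)
{
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else return false; // stray continuation byte or invalid lead byte
        if (i + extra >= in.size())
            return false; // sequence is cut off at the end of the input
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                return false; // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}
```

The patient name could be passed in as std::string(reqpatientName.c_str()). Code points in the kanji range, such as U+5C71 and U+7530 for the bytes E5 B1 B1 E7 94 B0, would indicate that the conversion itself worked and only the display is the problem.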

When I tried it with iconv (input is ISO_IR 13), I got the error
"Cannot convert character encoding: Illegal byte sequence", but the conversion still happens and I do get output. Any idea about this?