About 0008,0005 Specific Character Set

#1 Post by hamlet » Tue, 2004-11-16, 10:30

I have downloaded some DICOM images that use different specific character sets.

In my code:

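Code:

if (dO->search(tag, stack, ESM_fromHere, OFTrue) == EC_Normal)
{
    char *szTmp = NULL;
    // the element found by search() is on top of the stack
    DcmElement *dcmTmp = (DcmElement *) stack.pop();
    // szTmp now points to the element's internal string value (do not delete it)
    dcmTmp->getString(szTmp);
}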

How should I process szTmp so that all of these images are supported?

When I do nothing with GB18030 (Chinese) and the OS is Chinese, the patient's name is shown correctly.

When I use MultiByteToWideChar with ISO_IR 192 (UTF-8) and the OS is Chinese, the patient's name is also shown correctly.

Code:

WCHAR *szUnicode = NULL;
// first call: query the required buffer size in wide characters (including the terminator)
int i = MultiByteToWideChar(CP_UTF8, 0, szTmp, -1, NULL, 0);
szUnicode = new WCHAR[i];
// second call: perform the actual conversion
MultiByteToWideChar(CP_UTF8, 0, szTmp, -1, szUnicode, i);

(PS 3.5-2004 Annex J has these two examples.)

For ISO_IR 100, ISO_IR 101, ISO_IR 109, ISO_IR 110, ISO_IR 148, ISO_IR 144, ISO_IR 127, ISO_IR 126 and ISO_IR 138, can I also do nothing, so that the patient's name is shown correctly on an OS with the matching locale?

And how do I process the Japanese character sets?

(Sorry, I don't clearly understand PS 3.5-2004 section 6, and I don't understand what "multi-valued" means for 0008,0005.)

Re: About 0008,0005 Specific Character Set

#2 Post by Jörg Riesmeier » Tue, 2004-11-16, 17:52

First of all, I would recommend a much easier way of accessing the value of particular data elements (in this case as a C string):

Code:

const char *szTemp = NULL;
dO->findAndGetString(tag, szTemp);

Anyway, this question is probably not directly related to the DCMTK, since up to now the toolkit does not contain support for specific character sets, at least not at this level (module "dcmdata"). Your question seems to be related to the MS Windows API.

"Multi-valued" in the standard text means that the attribute Specific Character Set (0008,0005) may contain multiple values. This allows switching between different character sets within one DICOM dataset, even within a single element value.
Details are described in part 5 of the DICOM standard, as you've already noticed. Further questions regarding this topic are probably better posted to the newsgroup comp.protocols.dicom.
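
To illustrate the multi-valued case, here is a minimal sketch (plain C++, not DCMTK code) that splits a Specific Character Set value on the backslash delimiter and looks up a Windows code page for the single-valued defined terms mentioned in the question. The helper names splitValues and windowsCodePage are hypothetical and only serve this example. The code-extension terms ("ISO 2022 IR ...", which are what Japanese datasets typically use) require escape-sequence handling and are deliberately left unmapped here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Remove leading and trailing spaces (DICOM pads string values with spaces).
static std::string trim(const std::string &s)
{
    const std::string::size_type first = s.find_first_not_of(' ');
    if (first == std::string::npos) return std::string();
    const std::string::size_type last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: split a multi-valued attribute value on '\'.
std::vector<std::string> splitValues(const std::string &value)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;)
    {
        const std::string::size_type pos = value.find('\\', start);
        const std::string::size_type len =
            (pos == std::string::npos) ? std::string::npos : pos - start;
        parts.push_back(trim(value.substr(start, len)));
        if (pos == std::string::npos) break;
        start = pos + 1;
    }
    return parts;
}

// Hypothetical helper: map a defined term to a Windows code page identifier.
// Returns 0 for terms this sketch does not handle (e.g. "ISO 2022 IR 87").
unsigned int windowsCodePage(const std::string &term)
{
    static const struct { const char *term; unsigned int codePage; } table[] = {
        { "",           20127 }, // empty value: default repertoire (US-ASCII)
        { "ISO_IR 100", 28591 }, // ISO 8859-1, Latin alphabet No. 1
        { "ISO_IR 101", 28592 }, // ISO 8859-2, Latin alphabet No. 2
        { "ISO_IR 109", 28593 }, // ISO 8859-3, Latin alphabet No. 3
        { "ISO_IR 110", 28594 }, // ISO 8859-4, Latin alphabet No. 4
        { "ISO_IR 144", 28595 }, // ISO 8859-5, Cyrillic
        { "ISO_IR 127", 28596 }, // ISO 8859-6, Arabic
        { "ISO_IR 126", 28597 }, // ISO 8859-7, Greek
        { "ISO_IR 138", 28598 }, // ISO 8859-8, Hebrew
        { "ISO_IR 148", 28599 }, // ISO 8859-9, Latin alphabet No. 5
        { "ISO_IR 192", 65001 }, // UTF-8
        { "GB18030",    54936 }  // GB18030
    };
    for (std::size_t i = 0; i < sizeof(table) / sizeof(table[0]); ++i)
    {
        if (term == table[i].term) return table[i].codePage;
    }
    return 0;
}
```

On Windows, a non-zero result could be passed as the first argument of MultiByteToWideChar in place of CP_UTF8, so the conversion no longer depends on the locale of the OS.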

J. Riesmeier
DCMTK Developer

#3 Post by J. Riesmeier » Thu, 2011-11-03, 11:01

A late follow-up (hopefully not too late): this week, we've completed a first version of enhanced character set support for the DCMTK. This also includes the Chinese and Japanese character sets. If compiled with "libiconv", all affected strings in a DICOM dataset that uses any of the DICOM character sets can now be converted to UTF-8 (see "dcmconv --convert-to-utf8"). As already mentioned, this is only a first step towards enhanced character set support ...
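
To give an idea of what such a conversion to UTF-8 does in the simplest case, here is a hand-written sketch for ISO_IR 100 (ISO 8859-1) only. It relies on the fact that every Latin-1 byte value equals its Unicode code point, so no table or library is needed. This is not how the DCMTK implements it (the toolkit relies on libiconv, as stated above), and the function name latin1ToUtf8 is hypothetical.

```cpp
#include <string>

// Convert an ISO 8859-1 (ISO_IR 100) encoded string to UTF-8.
std::string latin1ToUtf8(const std::string &latin1)
{
    std::string utf8;
    utf8.reserve(latin1.size() * 2); // worst case: two bytes per character
    for (std::string::size_type i = 0; i < latin1.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(latin1[i]);
        if (c < 0x80)
        {
            // 7-bit characters are identical in both encodings
            utf8 += static_cast<char>(c);
        }
        else
        {
            // U+0080..U+00FF become a two-byte sequence: 110xxxxx 10xxxxxx
            utf8 += static_cast<char>(0xC0 | (c >> 6));
            utf8 += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return utf8;
}
```

For example, the Latin-1 byte 0xF6 ("ö") becomes the two bytes 0xC3 0xB6. The other single-byte character sets need a lookup table per set, and the multi-byte ones need a real conversion library.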

If you are interested in details, check our public git repository!
