About 0008,0005 Specific Character Set

#1 Post by **hamlet** » Tue, 2004-11-16, 10:30

Hi,
I have downloaded some DICOM images with different specific character sets:
http://dclunie.com/images/charsettests.20030219.tar.gz

In my code:

Code:

if (dO->search(tag, stack, ESM_fromHere, OFTrue) == EC_Normal)
{
    OFString strTmp;  // getOFStringArray() fills an OFString, not a char *
    dcmTmp = (DcmElement*) stack.pop();
    dcmTmp->getOFStringArray(strTmp);
    const char *szTmp = strTmp.c_str();

}

How should I process szTmp so that all of these images are supported?

PS:
When I do nothing with GB18030 (Chinese) and the OS is Chinese, the patient's name is shown correctly.

When I use MultiByteToWideChar with ISO_IR 192 (UTF-8) and the OS is Chinese, the patient's name is also shown correctly.

Code:

// The first call only returns the required buffer size in wide characters
// (including the terminating null, because the input length is -1).
int i = MultiByteToWideChar(CP_UTF8, 0, szTmp, -1, NULL, 0);
WCHAR *szUnicode = new WCHAR[i];
MultiByteToWideChar(CP_UTF8, 0, szTmp, -1, szUnicode, i);

(PS 3.5-2004 Annex J contains these two samples.)

Can I do nothing for ISO_IR 100, ISO_IR 101, ISO_IR 109, ISO_IR 110, ISO_IR 148, ISO_IR 144, ISO_IR 127, ISO_IR 126 and ISO_IR 138 on an appropriately localized OS and still have the patient's name shown correctly?

And how do I process the Japanese character sets?

(Sorry, I don't clearly understand PS 3.5-2004 section 6, and I don't understand what "multi-valued" means for 0008,0005.)

#2 Post by **Jörg Riesmeier** » Tue, 2004-11-16, 17:52

First of all, I would recommend a much easier way of accessing the value of particular data elements (in this case as a C string):

Code:

dO->findAndGetString(tag, szTemp);

Anyway, this question is probably not directly related to the DCMTK since up to now the toolkit does not contain support for specific character sets - at least not at this level (module "dcmdata"). Your question seems to be related to the MS Windows API.
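
As a rough illustration of the Windows side of the question: for the single-valued character sets, the MultiByteToWideChar pattern from the first post could be reused with a code page chosen from the Specific Character Set value instead of a fixed CP_UTF8. The sketch below assumes a hypothetical helper, codePageForCharset, which is not part of the DCMTK. The numbers are the Windows code page identifiers for US-ASCII, the ISO 8859 family, UTF-8 and GB18030. The ISO 2022 terms used for Japanese are deliberately not covered, because they need escape sequence handling rather than a single code page.

```cpp
#include <map>
#include <string>

// Hypothetical helper (an assumption for illustration, not a DCMTK function):
// maps a single-valued Specific Character Set (0008,0005) defined term to a
// Windows code page identifier. Returns 0 for terms this sketch does not
// cover, such as the ISO 2022 terms used for Japanese.
unsigned int codePageForCharset(const std::string &definedTerm)
{
    static const std::map<std::string, unsigned int> table = {
        {"",           20127},  // absent or empty: default repertoire (US-ASCII)
        {"ISO_IR 100", 28591},  // ISO 8859-1, Latin alphabet No. 1
        {"ISO_IR 101", 28592},  // ISO 8859-2, Latin alphabet No. 2
        {"ISO_IR 109", 28593},  // ISO 8859-3, Latin alphabet No. 3
        {"ISO_IR 110", 28594},  // ISO 8859-4, Latin alphabet No. 4
        {"ISO_IR 144", 28595},  // ISO 8859-5, Cyrillic
        {"ISO_IR 127", 28596},  // ISO 8859-6, Arabic
        {"ISO_IR 126", 28597},  // ISO 8859-7, Greek
        {"ISO_IR 138", 28598},  // ISO 8859-8, Hebrew
        {"ISO_IR 148", 28599},  // ISO 8859-9, Latin alphabet No. 5
        {"ISO_IR 192", 65001},  // UTF-8 (same value as CP_UTF8)
        {"GB18030",    54936}   // GB18030
    };
    const auto it = table.find(definedTerm);
    return it != table.end() ? it->second : 0;
}
```

The returned value would be passed as the first argument of MultiByteToWideChar in place of CP_UTF8. A result of 0 should be treated as "not handled by this table" rather than forwarded, since 0 is also the value of CP_ACP on Windows.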

Multi-valued in the standard text means that the attribute Specific Character Set (0008,0005) may contain multiple values. This allows switching between different character sets within one DICOM dataset, even within one element value.
Details are described in part 5 of the DICOM standard, as you've already noticed. Further questions regarding this topic are probably better posted to the newsgroup comp.protocols.dicom.
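
To make "multi-valued" a bit more concrete: like other multi-valued DICOM strings, the values of (0008,0005) are separated by a backslash, so a Japanese dataset may carry something like `\ISO 2022 IR 87`, where the empty first value stands for the default repertoire. The following is only a minimal sketch of the splitting step. The helper names trimSpaces and splitValues are hypothetical, and the actual switching between character sets via escape sequences is not shown.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: removes leading and trailing spaces, since DICOM
// string values may be padded with spaces.
std::string trimSpaces(const std::string &s)
{
    const std::string::size_type first = s.find_first_not_of(' ');
    if (first == std::string::npos)
        return "";
    const std::string::size_type last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: splits the raw value of Specific Character Set
// (0008,0005) at each backslash. An empty entry is kept on purpose,
// because an empty first value is meaningful (default repertoire).
std::vector<std::string> splitValues(const std::string &raw)
{
    std::vector<std::string> values;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = raw.find('\\', start);
        if (pos == std::string::npos) {
            values.push_back(trimSpaces(raw.substr(start)));
            break;
        }
        values.push_back(trimSpaces(raw.substr(start, pos - start)));
        start = pos + 1;
    }
    return values;
}
```

For the raw value `\ISO 2022 IR 87` this yields two entries, an empty string and `ISO 2022 IR 87`. A single-valued attribute such as `ISO_IR 100` yields exactly one entry.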

#3 Post by **J. Riesmeier** » Thu, 2011-11-03, 11:01

A late follow-up (hopefully, not too late): This week, we've completed a first version of an enhanced character set support for the DCMTK. This also includes Chinese and Japanese character sets. If compiled with "libiconv", all affected strings in a DICOM dataset using any of the DICOM character sets can now be converted to UTF-8 (see "dcmconv --convert-to-utf8"). As already mentioned, this is only a first step of enhanced character set support ...

If you are interested in details, check our public git repository!

DICOM @ OFFIS
