CJKVconv.pl is a Perl 5.003 utility by Ken Lunde to convert between scads of codesets, including Unicode in raw and UTF forms.

The command line is:

  cjkvconv -iX -oY out_file

Here are the options:

SWITCHES:
  -h = This help message
  -i = Input language
  -o = Output language
  -s = Ignore EUC-JP code set 3 (JIS X 0212-1990) for EUC-JP output
  -v = Attempt to substitute mappable variants for unmappable characters
  -x = "Unmappable" character handling customization

The following characters (or combinations of characters) can be used after the "-i" and "-o" options (no intervening space):

  c  = Traditional Chinese (EUC-TW)
  e  = Japanese (EUC-JP)
  g  = Simplified Chinese (GBK)
  j  = Japanese (Shift-JIS)
  k  = Korean (EUC-KR)
  s  = Simplified Chinese (EUC-CN)
  t  = Traditional Chinese (Big Five)
  u  = Generic CJKV (big-endian UTF-16)
  ul = Generic CJKV (little-endian UTF-16)
  u8 = Generic CJKV (UTF-8)

The following characters (or combinations of characters) can be used after the "-x" option (no intervening space) to specify "unmappable" character handling:

  t  = Output four-digit hexadecimal Unicode (big-endian UTF-16) tag <....>
  c  = Output hexadecimal code of input encoding
  ch = Output hexadecimal code of input encoding, with hyphens between multiple codes
  e  = Output nothing
  gX = Output a user-specified character string 'X'

NOTE! The default "undefined" code point and "unmappable" character is two question marks ("??"). Round-trip conversion is possible when using the "-xt" option.
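
The idea behind the "-xt" tags can be illustrated with a short Python sketch. This is not CJKVconv.pl's code and does not call the tool. It only mimics the behavior described above, replacing each character that the target encoding cannot represent with a four-digit hexadecimal tag and later expanding the tags again. The function names, the uppercase hex digits, and the use of Python's built-in codecs are assumptions made for illustration. The tool's exact tag format may differ.

```python
import re

# Assumed tag shape: "<" + four hexadecimal digits + ">", per the "-xt" description.
TAG = re.compile(r"<([0-9A-Fa-f]{4})>")


def encode_with_tags(text, encoding):
    """Encode text, writing an ASCII <XXXX> tag for each unmappable character."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            # Four-digit tags only cover code points up to U+FFFF.
            out += "<{:04X}>".format(ord(ch)).encode("ascii")
    return bytes(out)


def decode_with_tags(data, encoding):
    """Decode bytes, then turn every <XXXX> tag back into its character."""
    text = data.decode(encoding)
    return TAG.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    original = "日本한"  # the Hangul syllable has no Shift-JIS mapping
    tagged = encode_with_tags(original, "shift_jis")
    print(tagged)
    print(decode_with_tags(tagged, "shift_jis") == original)
```

Because the tag keeps the Unicode value of the character that could not be mapped, no information is lost, which is why a round trip is possible. One limitation of this sketch is that input text which already contains a literal sequence such as "<1234>" would be expanded as if it were a tag.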