Mini Japanese Dictionary for PocketPC.

For Windows CE 3.0 and Pocket PC 2002.

MJDictC.cab or MJDictA.cab is a dictionary search program using EDICT, KANJIDIC and ENAMEDIC. A cut down and sorted version of KANJIDIC is included; the other dictionaries need to be downloaded from the Monash web site.

This has not been changed since the version of Jan 2001; it has just been recompiled for the ARM cpu so that it works on PocketPC 2002. I also omitted the dictionaries, since there are newer versions on the Monash site. The PocketPC that I have is a UK English colour version of the Compaq iPAQ with 32 Mbytes of memory. I have been using it for a while and it seems to work OK, plus it's lighter than carrying 2 or 3 dictionaries, but there are probably some bugs or strange features that I don't know about. The latest dictionary files can be found on "The Monash Nihongo ftp Archive" at http://ftp.monash.edu.au/pub/nihongo/.

The new features were:

Additional dictionary supported.
Japanese verb and adjective search (almost works).
Dictionaries in EUC, UTF-16LE or UTF-8 format (Unicode files must have the BOM).
Dictionaries or index files can be on compact flash memory (or Microdrive?).
Copy and paste now work normally.
Kanji file sorted by radical to make SKIP lookup easier.

Cab file

The file MJDictA.cab is for the ARM cpu; MJDictC.cab is in portable CEF format for Windows CE 3.0. The cab file should be installed after the font file and input SIP. It contains the dictionary search program and a cut down and sorted version of KANJIDIC called KANJIDIC.EUC.

The program has a text entry box at the top and a results area below, with the menu and 5 buttons on the toolbar. Enter English or Japanese text in the top box. The left button searches EDICT, and the second button searches EDICT for modified verbs. The third button searches KANJIDIC and the fourth button searches ENAMEDIC. The search ignores the differences between Hiragana and Katakana, and between lowercase and capitals.
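As an illustration of how a search can ignore the Hiragana/Katakana and upper/lower case differences, here is a minimal sketch in Python. It is not the program's own code; the function names fold and matches are made up for this example, and it relies only on the fact that in Unicode the Katakana U+30A1 to U+30F6 sit exactly 0x60 above the matching Hiragana.

```python
def fold(text):
    """Fold Katakana to Hiragana and capitals to lowercase (illustrative only)."""
    out = []
    for ch in text:
        code = ord(ch)
        # Katakana U+30A1..U+30F6 are 0x60 above the matching Hiragana.
        if 0x30A1 <= code <= 0x30F6:
            ch = chr(code - 0x60)
        out.append(ch.lower())
    return "".join(out)


def matches(query, entry):
    """True if the folded query appears anywhere in the folded entry."""
    return fold(query) in fold(entry)


print(fold("ネコ"))                 # ねこ
print(matches("CAT", "ねこ /cat/"))  # True
print(matches("ねこ", "ネコ /cat/"))  # True
```

Folding both the query and the dictionary line means neither needs to be stored in a special form; the cost is that every comparison does the conversion again.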

For the Kanji search, kanji can be entered by drawing the character, by cut and paste from somewhere else, or by entering a SKIP code, e.g. 1-2-3 (I often use Jack Halpern's Kanji Learner's Dictionary). If the first character in the input box is kanji, then the search looks up all the kanji in the input string. If the first character is not kanji, then the search tries to match the whole string; again the difference between Hiragana and Katakana is ignored.
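The rule above for choosing between the two kinds of Kanji search can be sketched roughly as follows. This is only an illustration in Python, not the program's code: the names is_kanji and plan_kanji_search are invented here, and "kanji" is assumed to mean the CJK Unified Ideographs block U+4E00 to U+9FFF, which may not be exactly the test the program uses.

```python
def is_kanji(ch):
    # Assumption: treat only the CJK Unified Ideographs block as kanji.
    return 0x4E00 <= ord(ch) <= 0x9FFF


def plan_kanji_search(text):
    """Return (mode, keys) for the given input box text."""
    text = text.strip()
    if not text:
        return ("none", [])
    if is_kanji(text[0]):
        # First character is kanji: look up every kanji in the string.
        return ("per_kanji", [ch for ch in text if is_kanji(ch)])
    # Otherwise match the whole string, e.g. a SKIP code or a reading.
    return ("whole_string", [text])


print(plan_kanji_search("日本語"))  # ('per_kanji', ['日', '本', '語'])
print(plan_kanji_search("食べる"))  # ('per_kanji', ['食'])
print(plan_kanji_search("1-2-3"))   # ('whole_string', ['1-2-3'])
```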

The last button searches the output text.

If some of the output text is selected, then it is used as the input for the search, without having to cut and paste it to the input.

The copy and paste functions are now on the menu, and work normally.

The input box also has a drop down history of past searches, which is cleared after exiting from the program.

The Dictionaries... entry on the menu allows the location and name of the dictionaries to be changed. The small button after the current file name, marked with ..., is a browse button that allows the file to be found and selected. Directories are marked with a + sign and can be expanded by tapping with the pen on the + or by double tapping on the directory name.

The required dictionary and index files are opened and closed for each search. Adding or removing a compact flash card with the dictionary or index file should have no unexpected side effects. Removing the index file should just slow down the searches.

One known bug is that on at least German CE machines the "Program Files" directory has another name, so the program cannot find the dictionary and index files. It is necessary either to use the dictionaries dialog box to find the files, or to create the directory and move the dictionary and index files there, for the program to work.

Other hardware

I have only tried these programs on one machine. Microsoft recommends trying CEF on at least 2 different machines, so there may be some problems.

Dictionaries

The program now accepts dictionaries in EUC code or in Unicode (UTF-8 or UTF-16 little-endian). For Unicode files the first few bytes of the file must be the standard Unicode BOM (Byte Order Mark); anything without these bytes is assumed to be in EUC code. The UTF-8 BOM is EF BB BF and the UTF-16 little-endian BOM is FF FE. For information on Unicode see "http://www.unicode.org".
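The BOM check described above is simple enough to show as a short sketch. This is Python written for illustration, not taken from the program; the function name detect_encoding and the returned labels are made up here. It can be handy for checking a dictionary file on a desktop machine before copying it to the PocketPC.

```python
UTF8_BOM = b"\xef\xbb\xbf"   # EF BB BF
UTF16LE_BOM = b"\xff\xfe"    # FF FE


def detect_encoding(head):
    """Guess the dictionary encoding from the first bytes of the file."""
    if head.startswith(UTF8_BOM):
        return "UTF-8"
    if head.startswith(UTF16LE_BOM):
        return "UTF-16LE"
    # No recognised BOM: assumed to be EUC, as described above.
    return "EUC"


print(detect_encoding(b"\xef\xbb\xbfabc"))   # UTF-8
print(detect_encoding(b"\xff\xfea\x00"))     # UTF-16LE
print(detect_encoding(b"\xc7\xad /cat/"))    # EUC
```

To test a real file, read its first three bytes in binary mode and pass them in, for example detect_encoding(open(path, "rb").read(3)), where path is whatever file is being checked.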

EUC code uses 1 byte for English characters and 2 bytes for Japanese, which gives the smallest file size. UTF-8 uses 1 byte for English characters and 3 bytes for Japanese characters, so it is only a little larger if the file has only a few Japanese characters. UTF-16 uses 2 bytes per character, so it is significantly larger than EUC but can be smaller than UTF-8 if there are a lot of Japanese characters in the file.
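The size differences are easy to check with a small experiment. The sketch below uses Python's standard codecs (euc_jp, utf-8, utf-16-le); the function name encoded_sizes and the sample strings are made up for illustration, and the counts exclude the BOM.

```python
def encoded_sizes(text):
    """Byte counts for the same text in the three supported encodings."""
    return {
        "EUC": len(text.encode("euc_jp")),
        "UTF-8": len(text.encode("utf-8")),
        "UTF-16LE": len(text.encode("utf-16-le")),
    }


# Mostly English: UTF-8 is close to EUC, UTF-16 is much larger.
print(encoded_sizes("猫 /cat/"))  # {'EUC': 8, 'UTF-8': 9, 'UTF-16LE': 14}

# All Japanese: UTF-16 matches EUC and is smaller than UTF-8.
print(encoded_sizes("日本語"))    # {'EUC': 6, 'UTF-8': 9, 'UTF-16LE': 6}
```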

The best source of dictionaries, information and utilities is probably The Monash University Nihongo ftp Archive at "http://ftp.monash.edu.au/pub/nihongo/".

The format of the dictionaries and optional index files is the same as for EDICT. Applications such as ejoin (for joining dictionaries) and jdxgen95 (for making the index file) can also be used to create new or customized dictionaries in EUC code. There are also programs to convert to and from other codes such as UTF-8, JIS and Shift-JIS.

Mike Johnson

E-mail: mikejohnson @ dsl . pipex . com

16 Feb 2002