PURPOSE

This programme takes James Breen's Kanjidic as its input and outputs a copy of
it with the unwanted fields stripped. The remaining fields are put into a consistent
order, and may be aligned into columns by padding them with spaces. It can also
attempt to output a romaji transliteration of the hiragana and katakana readings.

LICENCE

KDCol is issued under the terms of the GNU licence.

AUTHOR

KDCol was written by Leo V. Tilson (lvtilson@hotmail.com).

LANGUAGE

This programme has been written in Pascal and compiled using Borland Turbo Pascal 3.02.
This compiler is rather old now, but it is still my favourite Pascal compiler. One great
advantage is that the whole compiler (including editor), the source code, the
compiled code, and the whole of Kanjidic can be fitted on a single floppy disc, which
allows one to work at any DOS or Windows machine without having to install any
new software.

ISSUES

Kanjidic is issued in UNIX text format. This results in lines being terminated by
a single Line Feed (LF), while KDCol running under DOS expects lines to terminate
with the Carriage Return, Line Feed (CR LF) pair. Kanjidic's line endings should
therefore be changed before it is used under DOS. Several utilities
exist to do this, but the method I used was simply to open Kanjidic using Windows
WordPad, and 'save as' in a txt format.
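
The same conversion can be done with a few lines of script. The following is a
minimal sketch in Python and is not part of KDCol; the function names and the
file names in the usage note are my own illustrative assumptions.

```python
def to_dos_line_endings(data: bytes) -> bytes:
    """Return data with every line ending converted to CR LF."""
    # Collapse any existing CR LF pairs first so they are not doubled.
    unix = data.replace(b"\r\n", b"\n")
    return unix.replace(b"\n", b"\r\n")


def convert_file(src: str, dst: str) -> None:
    """Copy src to dst, converting line endings (hypothetical helper)."""
    # Binary mode is used so that the Japanese text is passed through untouched.
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(to_dos_line_endings(data))
```

For example, convert_file("kanjidic", "kanjidic.txt") would write a DOS-style
copy. Running the conversion on a file that already uses CR LF leaves it unchanged.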

DETAILS

Kanjidic consists of approximately 6500 lines of text, each of which contains several kinds
of fields. There may be more than one entry of each field type per line. Very often
a lot of the information contained in each line is surplus to requirements
and could be removed. Furthermore it would be convenient if the fields were all in
the same order in each line, to facilitate sorting programmes. KDCol (Kanji Dic Columnator)
takes Kanjidic as its input and outputs a file containing just the required fields,
consistently ordered into columns. The tag is separated from each field entry and
written separately. To align the columns, the fields may be padded with spaces. The
columns are separated from each other by a user-settable character.
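
To illustrate the idea of separating the tag, ordering the fields and padding
them, here is a rough sketch in Python. It is not KDCol's Pascal code: the
function names are hypothetical, and the tags and widths in COLUMNS are only the
handful visible in the sample output further down, not the full table.

```python
# Assumed subset of tags and column widths, for illustration only.
COLUMNS = [("Y", 12), ("B", 4), ("S", 4), ("IN", 10), ("F", 10)]


def split_tag(entry, tags):
    """Split an entry such as 'IN970' into its tag and value: ('IN', '970')."""
    # Try longer tags first so that a two-letter tag is matched in full.
    for tag in sorted(tags, key=len, reverse=True):
        if entry.startswith(tag):
            return tag, entry[len(tag):]
    return None, entry  # not one of the requested fields


def columnate(entries, columns=COLUMNS, sep=";"):
    """Return one output row with the requested fields in a fixed order."""
    tags = [tag for tag, _ in columns]
    found = {}
    for entry in entries:
        tag, value = split_tag(entry, tags)
        if tag is not None and tag not in found:
            found[tag] = value  # keep the first instance only in this sketch
    # A missing field still gets its column, filled with spaces.
    return "".join(
        f"{tag}{sep}{found.get(tag, '').ljust(width)}{sep}"
        for tag, width in columns
    )
```

Calling columnate(["S6", "B8", "Yhai4", "F2072"]) gives a row in which the
fields appear in the order Y, B, S, IN, F whatever their order in the input,
with an empty padded column where the IN entry is absent.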

The fields are output in the following order:

Kanji character.
Readings.
   Katakana ON readings.
   Hiragana kun readings.
   Romaji ON readings.
   Romaji kun readings.
   English readings.
   Korean readings.
   Pinyin readings.
Descriptors.
   Bushu number.
   Classical Radical Number.
   SKIP Codes.
   Four Corner Code.
   Stroke counts.
   Fr Joseph De Roo codes.
Computer Codes.
   JIS code.
   Unicode.
Dictionary Indices.
   James Breen Kanjidic Line No.
   Morohashi Daikanwajiten.
   Gakken Kanji Dictionary.
   Jack Halpern's Modern Dictionary.
   Jack Halpern's Kanji Learners Dictionary.
   James Heisig: Remembering the Kanji.
   Kenneth G. Henshall Guide to Japanese.
   Andrew Nelson's Modern Dictionary.
   Andrew Nelson's New Dictionary.
   Spahn & Hadamitzky Kanji & Kana.
   Spahn & Hadamitzky Kanji Dictionary.
   P.G. O'Neill's Japanese Names.
   P.G. O'Neill's Essential Kanji.
Jouyou grade.
Frequency Rank.
Reference Codes.
Mis-Classification code.

Where more than one entry for a field type occurs, the user is given the
opportunity to select the number of entries to be output. While the width
of most of the fields is fairly constant, the readings, particularly the
English ones, can vary considerably in length. For the readings, therefore, the
user is given the opportunity to set the maximum field width. This may
cause a reading to be truncated. In the famous last words of a general
whose name I cannot remember: "Nonsense man! They couldn't hit an elephant
at this dist..."
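
The effect of the entry limit and the maximum width can be sketched as follows.
This is an illustration in Python under my own assumptions (the function names
are hypothetical, and KDCol itself is written in Pascal), not the actual code.

```python
def fit(value, width):
    """Cut value down to width characters, then pad it with spaces to width."""
    return value[:width].ljust(width)


def select_entries(entries, max_entries, width):
    """Keep at most max_entries entries, each fitted to the column width."""
    return [fit(entry, width) for entry in entries[:max_entries]]
```

With a width of 12, an entry such as "cultural progress" is cut to
"cultural pro", while a short entry such as "one" is padded out with nine
spaces. Entries beyond the chosen number are simply dropped.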

The Romaji versions of the ON and kun yomi are produced using a rather simple
algorithm, and cannot be entirely relied upon, but may prove useful.
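
To give an idea of what a simple table-driven transliteration looks like, here
is a sketch in Python. It is not KDCol's algorithm, which I am not reproducing
here; the table, the Hepburn-style spellings, the upper case for katakana input
and all names are assumptions made for the example. Like any such scheme it
covers only the common cases.

```python
# Consonant rows of the kana table; each row pairs with the vowels a i u e o.
ROWS = {
    "": "あいうえお",
    "k": "かきくけこ", "g": "がぎぐげご",
    "s": "さしすせそ", "z": "ざじずぜぞ",
    "t": "たちつてと", "d": "だぢづでど",
    "n": "なにぬねの",
    "h": "はひふへほ", "b": "ばびぶべぼ", "p": "ぱぴぷぺぽ",
    "m": "まみむめも", "r": "らりるれろ",
}
KANA = {k: c + v for c, row in ROWS.items() for k, v in zip(row, "aiueo")}
KANA.update({"や": "ya", "ゆ": "yu", "よ": "yo", "わ": "wa", "を": "wo", "ん": "n"})
# Hepburn-style exceptions to the regular table.
KANA.update({"し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu",
             "じ": "ji", "ぢ": "ji", "づ": "zu"})
SMALL_Y = {"ゃ": "a", "ゅ": "u", "ょ": "o"}


def is_katakana(ch):
    return "ァ" <= ch <= "ヶ"


def to_hiragana(text):
    """Map katakana onto hiragana; the two blocks are 0x60 code points apart."""
    return "".join(chr(ord(c) - 0x60) if is_katakana(c) else c for c in text)


def romaji(reading):
    """Transliterate a kana reading; katakana input comes out in upper case."""
    upper = any(is_katakana(c) for c in reading)
    kana = to_hiragana(reading)
    out, double_next, i = [], False, 0
    while i < len(kana):
        ch = kana[i]
        if ch == "っ":  # small tsu: double the next consonant
            double_next = True
            i += 1
            continue
        syl = KANA.get(ch, ch)  # unknown characters such as '.' pass through
        nxt = kana[i + 1] if i + 1 < len(kana) else ""
        if ch in KANA and len(syl) > 1 and syl.endswith("i") and nxt in SMALL_Y:
            stem = syl[:-1]  # ki -> k, shi -> sh
            glide = "" if stem in ("sh", "ch", "j") else "y"
            syl = stem + glide + SMALL_Y[nxt]
            i += 1
        if double_next and syl[:1].isalpha():
            syl = syl[0] + syl
        double_next = False
        out.append(syl)
        i += 1
    result = "".join(out)
    return result.upper() if upper else result
```

Characters that are not in the table, such as the '.' and '-' markers in kun
readings, are copied through unchanged, so "そだ.つ" becomes "soda.tsu" and
"トウ" becomes "TOU".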

An example of the output using default column widths follows:

  ;i           ;  ;GAI         ;{;sign of the ;Y;hai4        ;B;8   ;S;6   ;A;   71;IN;          ;F;2072      ;
  ;            ;  ;IKI         ;{;range       ;Y;yu4         ;B;32  ;S;11  ;A;   72;IN;970       ;F;609       ;
  ;soda.tsu    ;  ;IKU         ;{;bring up    ;Y;yu4         ;B;8   ;S;8   ;A;   73;IN;246       ;F;232       ;
T1;aya         ;  ;IKU         ;{;cultural pro;Y;yu4         ;B;163 ;S;9   ;A;   74;IN;          ;F;2030      ;
  ;iso         ;  ;KI          ;{;seashore    ;Y;ji1         ;B;112 ;S;17  ;A;   75;IN;          ;F;1997      ;
  ;hito-       ;  ;ICHI        ;{;one         ;Y;yi1         ;B;1   ;S;1   ;A;   76;IN;2         ;F;2         ;
  ;hitotsu     ;  ;ICHI        ;{;I           ;Y;yi1         ;B;32  ;S;7   ;A;   77;IN;1730      ;F;1820      ;
  ;kobo.reru   ;  ;ITSU        ;{;overflow    ;Y;yi4         ;B;85  ;S;13  ;A;   78;IN;          ;F;          ;
  ;so.reru     ;  ;ITSU        ;{;deviate     ;Y;yi4         ;B;162 ;S;11  ;A;   79;IN;734       ;F;1564      ;
  ;ine         ;  ;TOU         ;{;rice plant  ;Y;dao4        ;B;115 ;S;14  ;A;   80;IN;1220      ;F;921       ;
^ ^  ^               ^          ^              ^                 ^     ^         ^      ^            ^
| |  |               |          |              |                 |     |         |      |            |
| |  |               |          |              |                 |     |         |      |            Rank frequency
| |  |               |          |              |                 |     |         |      Kanji&Kana index
| |  |               |          |              |                 |     |         Kanjidic line number
| |  |               |          |              |                 |     Stroke count
| |  |               |          |              |                 Bushu number
| |  |               |          |              Pinyin field identifier
| |  |               |          English field identifier
| |  |               Romaji ON reading
| |  Romaji kun reading
| Field Separator Character
Field Identifier Column

Note that if more than one instance of a field type is output, the instances are placed one 
after another with only one field identifier included at the beginning of the group.
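
The grouping described in the note can be sketched as below. This is an
illustrative Python fragment with a hypothetical function name. It assumes that
the instances within a group are separated by the same field separator
character, and that unused slots are filled with spaces so that later columns
stay aligned; neither detail is confirmed by the description above.

```python
def format_group(tag, entries, count, width, sep=";"):
    """Write the tag once, followed by exactly count fixed-width cells."""
    cells = [entry[:width].ljust(width) for entry in entries[:count]]
    # Pad with blank cells when the line has fewer instances than requested.
    cells += [" " * width] * (count - len(cells))
    return tag + sep + "".join(cell + sep for cell in cells)
```

For example, format_group("{", ["bring up", "grow up", "raise"], 2, 8) returns
"{;bring up;grow up ;", with the identifier appearing once at the start of the
group and the third instance dropped.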