MSX Technical Guidebook
From ASCAT. The source of this article is in Japanese. This document describes how an MSX handles kanji, a topic I assume you know little about.
Simply implementing what is described here on your MSX or in emulator code will not give you a working kanji system. You should also learn about MSX-JE, the software interface which converts typed kana into kanji. Still, this is better than nothing.
The JIS 1st and 2nd standards (the 1st and 2nd class kanji referred to below) are the Japanese kanji tables defined by the national standard, and 2nd class kanji are used less often than 1st class ones. Sanyo MSX2+ machines (WAVYs) do not support 2nd class kanji. All other MSX2+ and turboR machines have both 1st and 2nd class kanji built in.
Building a kanji system without this knowledge is next to impossible. However, at this point there are no Japanese coders who can explain such a system in a language other than Japanese. Therefore, the only way I can help you is by translating as much as I can.
Chapter 4 Kanji ROM
This chapter mainly explains how to access the kanji ROM.
2. 4. 1 Input/output of Kanji ROM
To access the kanji ROM on an MSX, access I/O ports D8H to DBH directly.
To read the font data of a kanji, you use a 12-bit value called a kanji code. Kanji codes are different from JIS codes and kuten codes, so you must first work out the kanji code of the character you want to display.
Note: you can convert a kanji into its JIS code by executing CALL JIS in kanji mode.
CALL JIS (A$,kanjistring$)
After this call, A$ holds the JIS code as a 4-digit hexadecimal string. Kanjistring$ can be of any length, but what you get is always the JIS code of its first character.
How to Calculate a Kanji Code
Subtract 32 from the upper byte of the kanji's JIS code and call the result ku. Subtract 32 from the lower byte of the JIS code and call the result ten. (Together these are the so-called kuten code.)
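As a rough illustration of the subtraction above, here is a minimal Python sketch. The function name jis_hex_to_kuten is made up for this example. It takes a 4-digit hexadecimal string of the kind CALL JIS returns and splits it into ku and ten.

```python
def jis_hex_to_kuten(hex_string):
    """Convert a 4-digit hexadecimal JIS string such as "2140" into (ku, ten)."""
    jis = int(hex_string, 16)
    upper = jis >> 8        # upper byte of the JIS code
    lower = jis & 0xFF      # lower byte of the JIS code
    return upper - 32, lower - 32


print(jis_hex_to_kuten("2140"))  # (1, 32)
print(jis_hex_to_kuten("737E"))  # (83, 94)
```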
For 1st class kanji, if ku is 15 or less, the kanji code is ku*96+ten. If ku is 16 or greater, the kanji code is ku*96+ten-512.
For 2nd class kanji, the kanji code is (ku-48)*96+ten.
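The rules above can be sketched in Python as follows. The function name kanji_code is hypothetical, and the sketch assumes that any ku of 48 or more belongs to the 2nd class, which the text implies through the (ku-48) term but does not state outright.

```python
def kanji_code(ku, ten):
    """Return (kanji_class, code) for a kuten pair, following the rules above.

    Assumption: a ku of 48 or more is treated as a 2nd class kanji.
    """
    if ku >= 48:
        return 2, (ku - 48) * 96 + ten
    if ku <= 15:
        return 1, ku * 96 + ten
    return 1, ku * 96 + ten - 512


print(kanji_code(1, 32))    # JIS 2140H -> (1, 128)
print(kanji_code(16, 1))    # JIS 3021H -> (1, 1025)
print(kanji_code(83, 94))   # JIS 737EH -> (2, 3454)
```

In every case the result fits in 12 bits, which is what the port access expects.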
Then access the I/O ports using the kanji code you calculated. The ports differ between 1st and 2nd class kanji.
To Read a 1st Class Kanji
Output the lower 6 bits of the kanji code to I/O port D8H, and the upper 6 bits to D9H. Then read D9H 32 times in succession to get the font of the kanji.
To Read a 2nd Class Kanji
Output the lower 6 bits of the kanji code to I/O port DAH, and the upper 6 bits to DBH. Then read DBH 32 times in succession to get the font of the kanji.
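The two procedures above differ only in the port numbers, so they can be sketched as one routine. Python cannot touch Z80 I/O ports, so in this sketch out_port and in_port are hypothetical callables standing in for the OUT and IN instructions. The function name read_font is likewise invented for illustration.

```python
def read_font(out_port, in_port, kanji_class, code):
    """Read the 32 font bytes of a kanji code through the given port callables.

    out_port(port, value) and in_port(port) are stand-ins for Z80 OUT and IN.
    """
    if kanji_class == 1:
        low_port, high_port = 0xD8, 0xD9
    else:
        low_port, high_port = 0xDA, 0xDB
    out_port(low_port, code & 0x3F)          # lower 6 bits of the kanji code
    out_port(high_port, (code >> 6) & 0x3F)  # upper 6 bits of the kanji code
    # 32 consecutive reads from the second port return the font data.
    return bytes(in_port(high_port) for _ in range(32))


# Dry run with fake ports: record what is written, return 0 on every read.
writes = []
font = read_font(lambda port, value: writes.append((port, value)),
                 lambda port: 0, 1, 128)
print([(hex(p), v) for p, v in writes])
```

On a real machine, nothing else should touch these ports between the two writes and the 32 reads.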
The font data are arranged as follows: first 8 bytes for the upper-left quarter, then 8 bytes for the upper-right, then 8 bytes for the lower-left, and finally 8 bytes for the lower-right.
Order of Data
1st to 8th byte      A
9th to 16th byte     B
17th to 24th byte    C
25th to 32nd byte    D
Font Dots
  1          9
 to   A  B  to
  8         16
 17         25
 to   C  D  to
 24         32
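To make the layout concrete, here is a minimal Python sketch that rearranges the 32 bytes into 16 rows of 16 dots. The names font_rows and render are hypothetical, and the sketch assumes that the most significant bit of each byte is the leftmost dot.

```python
def font_rows(font):
    """Rearrange 32 font bytes (blocks A, B, C, D) into 16 rows of 16 bits."""
    if len(font) != 32:
        raise ValueError("a kanji font is 32 bytes long")
    a, b, c, d = font[0:8], font[8:16], font[16:24], font[24:32]
    top = [(left << 8) | right for left, right in zip(a, b)]     # rows 1 to 8
    bottom = [(left << 8) | right for left, right in zip(c, d)]  # rows 9 to 16
    return top + bottom


def render(font):
    """Draw the font as text, '#' for a set dot (assumes MSB is leftmost)."""
    lines = []
    for row in font_rows(font):
        bits = format(row, "016b")
        lines.append(bits.replace("0", ".").replace("1", "#"))
    return "\n".join(lines)


# Example: only block A is filled in, the rest is blank.
sample = bytes([0x00, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01]) + bytes(24)
print(render(sample))
```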
Usually, direct input/output to I/O ports is forbidden, but for kanji ROM access there is no other way. The BASIC command PUT KANJI and the BIOS routine KNJPRT, which were added in MSX2, support only 1st class kanji (on MSX2+ and later they support 2nd class kanji too). They are also useless when you want to manipulate the kanji fonts directly, for example to compress 16*16 dot fonts into 12*16 dots, or to print a kanji font as a bit image. In addition, they do not support SCREEN 2 or 4.
On MSX1 there is no BIOS routine for kanji ROM access at all. Therefore, we access the I/O ports directly when reading the kanji ROM.
Also, as many of you may know, if you read an unassigned part of the kanji ROM, you do not get a blank but the font of some kanji you did not want at all. To get a 2-byte space, make sure to use JIS code 2121H. For hankaku characters (narrow katakana, alphabet and punctuation), use only the left half (A and C in the diagram above) of the fonts at JIS codes 2021H to 207EH and 2921H to 295FH. These hankaku fonts are not supported by the kanji driver or the PUT KANJI command.
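Taking only the left half of a font means keeping blocks A and C and discarding B and D. A minimal Python sketch, with the hypothetical name hankaku_glyph, might look like this:

```python
def hankaku_glyph(font):
    """Return the left half (blocks A and C) of a 32-byte font as 16 bytes.

    The result is an 8-dot-wide, 16-dot-high glyph, one byte per row.
    """
    if len(font) != 32:
        raise ValueError("a kanji font is 32 bytes long")
    return bytes(font[0:8]) + bytes(font[16:24])


# Example with dummy data so that each byte shows its own position.
glyph = hankaku_glyph(bytes(range(32)))
print(list(glyph))
```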
2. 4. 2 Existence of Kanji ROM
If you try kanji input/output without a kanji ROM, "bean curds" (16*16 dot white squares) appear on the screen, which is a sorry sight. Therefore, you need to check whether a kanji ROM is present.
To Check Whether 1st Class Kanji ROM Exists
In the kanji ROM, the first 8 bytes of the font with JIS code 2140H (ku 1, ten 32) are always 00H, 40H, 20H, 10H, 08H, 04H, 02H, 01H, in that order. So if you read out the font for this code and all of its first 8 bytes match these values, a 1st class kanji ROM must be present.
To Check Whether 2nd Class Kanji ROM Exists
Read out the font data for JIS code 737EH (ku 83, ten 94), add up the values of its first 8 bytes and divide the sum by 256. If the remainder is 149 (95H), a 2nd class kanji ROM must be present.
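Both checks can be sketched in Python as below. Here read_font(kanji_class, code) is a hypothetical callable that returns the 32 font bytes for a kanji code. The kanji codes 128 and 3454 are derived from the formulas in 2. 4. 1 (1*96+32 and (83-48)*96+94). All function names are made up for illustration.

```python
CLASS1_SIGNATURE = bytes([0x00, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01])


def has_class1_rom(read_font):
    """Check the first 8 bytes of JIS 2140H (kanji code 128) for the signature."""
    return bytes(read_font(1, 128)[:8]) == CLASS1_SIGNATURE


def has_class2_rom(read_font):
    """Check that the first 8 bytes of JIS 737EH (kanji code 3454) sum to 95H mod 256."""
    return sum(read_font(2, 3454)[:8]) % 256 == 0x95


# Fake machine without any kanji ROM: every read returns FFH (an assumption
# made for this example only, real hardware may return something else).
def no_rom(kanji_class, code):
    return bytes([0xFF] * 32)


print(has_class1_rom(no_rom), has_class2_rom(no_rom))
```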
When writing software, make sure it checks for the kanji ROM, and prepare 1-byte messages so that it can also be used on machines without one. Software that uses plenty of rare kanji on machines equipped with the 2nd class ROM and displays simpler alternative kanji on machines without it is possible, but I have never seen any.
2. 4. 3 SHIFT JIS Code
On MSX, kanji files use SHIFT JIS code (Microsoft kanji code). This is because MS-DOS standardized kanji files on the Microsoft kanji code, and MSX, which uses the same file format, uses the same code.
SHIFT JIS code has no kanji start and end codes, so hankaku characters and 2-byte kanji can be mixed freely.
Here we present algorithms to convert between JIS code and SHIFT JIS code.
Let JIH be the upper byte of the JIS code and JIL its lower byte. Likewise, let SJH be the upper byte of the SHIFT JIS code and SJL its lower byte.
To Convert from JIS Code into SHIFT JIS Code
1. If JIH is 5EH or below, SJH is INT((JIH-1)/2+71H). Otherwise, SJH is INT((JIH-1)/2+B1H).
2. If JIH is an even number, SJL is JIL+7EH. Otherwise, temporarily take JIL+1FH as SJL. If this value is 7FH or more, add 1 to it. If it is smaller than 7FH, leave it as is.
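The two steps above translate almost directly into code. The following Python sketch uses the hypothetical name jis_to_sjis and works on 16-bit integers. Since 71H and B1H are whole numbers, the INT is applied as an integer division of (JIH-1) by 2.

```python
def jis_to_sjis(jis):
    """Convert a 16-bit JIS code into a 16-bit SHIFT JIS code."""
    jih, jil = jis >> 8, jis & 0xFF
    # Step 1: upper byte
    if jih <= 0x5E:
        sjh = (jih - 1) // 2 + 0x71
    else:
        sjh = (jih - 1) // 2 + 0xB1
    # Step 2: lower byte
    if jih % 2 == 0:
        sjl = jil + 0x7E
    else:
        sjl = jil + 0x1F
        if sjl >= 0x7F:
            sjl += 1
    return (sjh << 8) | sjl


print(format(jis_to_sjis(0x2121), "04X"))  # 2-byte space -> 8140
print(format(jis_to_sjis(0x3021), "04X"))  # -> 889F
print(format(jis_to_sjis(0x737E), "04X"))  # -> EA9E
```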
To Convert from SHIFT JIS Code into JIS Code
1. If SJH is smaller than A0H, subtract 71H from SJH and call the result H. Otherwise, subtract B1H from SJH and call the result H. In either case, then replace H with H*2+1.
2. If SJL is greater than 7FH, let L be SJL-1. Otherwise, let L be SJL itself. Then, if L is smaller than 9EH, replace L with L-1FH. If L is 9EH or more, replace L with L-7DH and H with H+1.
3. H is now the upper byte of the JIS code, and L is its lower byte.
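The reverse conversion can be sketched the same way. The name sjis_to_jis is hypothetical, and the sketch assumes its input is a valid 2-byte SHIFT JIS code, so it does no range checking.

```python
def sjis_to_jis(sjis):
    """Convert a 16-bit SHIFT JIS code into a 16-bit JIS code (no validation)."""
    sjh, sjl = sjis >> 8, sjis & 0xFF
    # Step 1: provisional upper byte
    if sjh < 0xA0:
        h = sjh - 0x71
    else:
        h = sjh - 0xB1
    h = h * 2 + 1
    # Step 2: lower byte, with a carry into the upper byte
    l = sjl - 1 if sjl > 0x7F else sjl
    if l < 0x9E:
        l -= 0x1F
    else:
        l -= 0x7D
        h += 1
    # Step 3: combine
    return (h << 8) | l


print(format(sjis_to_jis(0x8140), "04X"))  # -> 2121
print(format(sjis_to_jis(0x889F), "04X"))  # -> 3021
print(format(sjis_to_jis(0xEA9E), "04X"))  # -> 737E
```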