Chinese Translation
Chinese Character Encoding
Computers do not speak any language; they only work with numbers. For computers to handle human languages such as Chinese and English, mappings between numbers and letters or characters are defined as standards that different computers and programs understand. These agreed mappings are called character sets or code sets.
GB (short for "Guojia Biaozhun", or "National Standard") was established by Mainland China in 1981 to represent simplified Chinese characters. The original standard, GB 2312, is a one- or two-byte encoding that defines 6,763 Chinese characters (excluding symbols); its later extension GB 18030 is a one-, two- or four-byte encoding. GB is used in countries such as China, Singapore and Malaysia.
Big5, whose name refers to the five companies that collaborated in its development, was established in 1984 and is the character encoding standard most commonly used for traditional Chinese characters. It is a one- or two-byte encoding and is used in places such as Taiwan, Hong Kong and Malaysia.
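As a rough illustration of these byte widths, the following Python sketch encodes the same text with the standard library's "gb2312", "big5" and "gb18030" codecs. The sample string and the helper name byte_length are chosen only for this example.

```python
# Encode the same text with the GB and Big5 families and compare the bytes.
SAMPLE = "中文"  # both characters exist in GB 2312 and in Big5


def byte_length(text, codec):
    """Return how many bytes `text` occupies in the given codec."""
    return len(text.encode(codec))


gb_bytes = SAMPLE.encode("gb2312")
big5_bytes = SAMPLE.encode("big5")

# Same two characters, two bytes each, but different numbers in each standard.
print(gb_bytes.hex(), big5_bytes.hex())

# ASCII letters take a single byte in both encodings.
print(byte_length("A", "gb2312"), byte_length("A", "big5"))

# A character outside the Basic Multilingual Plane needs four bytes in GB 18030.
print(byte_length("\U00020000", "gb18030"))
```

Because the two standards assign different numbers to the same character, text has to be decoded with the codec it was written in, otherwise it shows up as unreadable symbols.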
GB text is usually displayed in simplified characters and Big5 text in traditional characters. There is, however, no mandated connection between the encoding system and the font used to display the characters, although in practice font and encoding are usually tied together.
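The loose link between encoding and script can be checked with a small Python sketch. It assumes the standard library codecs "gb2312", "gbk" (a later extension of GB 2312) and "big5"; the helper can_encode is a hypothetical name used only here.

```python
def can_encode(text, codec):
    """Return True if every character in `text` exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


simplified = "语"   # simplified form of the character for "language"
traditional = "語"  # traditional form of the same character

for codec in ("gb2312", "gbk", "big5"):
    print(codec, can_encode(simplified, codec), can_encode(traditional, codec))
```

In this sketch the traditional form is missing from GB 2312 but present in GBK, while the simplified form is missing from Big5, so a character may be representable in only one of the encoding families.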
Conversion between traditional and simplified Chinese is often problematic. The traditional-to-simplified (many-to-one) conversion is relatively simple, but it merges distinct characters, so the opposite conversion cannot reliably restore the original text. The simplified-to-traditional (one-to-many) conversion often requires usage context or common phrases to resolve conflicts, because one simplified character may correspond to several traditional ones. The vocabularies of the two writing systems also lack a one-to-one correspondence, and the standards for choosing characters differ between them. Some rare characters are recognised by only one of the encoding systems. No software can perform this transformation with complete reliability, so the result has to be checked and corrected manually.
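The one-to-many problem can be sketched in Python with a toy converter. The three small tables below are hypothetical and cover only a handful of characters; real converters rely on far larger dictionaries, and this is not the method of any particular product.

```python
# Toy tables for illustration only.
TRAD_TO_SIMP = {"發": "发", "髮": "发", "頭": "头"}   # many-to-one
SIMP_TO_TRAD_DEFAULT = {"发": "發", "头": "頭"}       # most likely guess per character
SIMP_TO_TRAD_PHRASES = {"头发": "頭髮"}               # context overrides the guess


def to_simplified(text):
    """Character-by-character lookup; no context is needed."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def to_traditional(text):
    """Try known phrases first (longest match), then fall back to single characters."""
    max_len = max(len(phrase) for phrase in SIMP_TO_TRAD_PHRASES)
    out = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + length]
            if chunk in SIMP_TO_TRAD_PHRASES:
                out.append(SIMP_TO_TRAD_PHRASES[chunk])
                i += length
                break
        else:
            # No phrase matched at this position: use the default mapping.
            out.append(SIMP_TO_TRAD_DEFAULT.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(to_simplified("頭髮"), to_simplified("發展"))
print(to_traditional("头发"), to_traditional("发展"), to_traditional("理发"))
```

In this sketch the phrase table resolves "头发" (hair) correctly, but "理发" (haircut) is not in the table and falls back to the default character, giving "理發" instead of the correct "理髮". This is the kind of error that still needs manual correction.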