Background
TitleFactory supports several scripts. This section describes the supported scripts and the degree of support that one can expect from the product.
The user interface for TitleFactory supports ASCII or non-ASCII text as of TitleFactory Release 2.0.2 BLDA. ASCII is an encoding that allows a single character to be represented in a single byte. In fact, ASCII defines only 128 characters. Unfortunately, a single byte does not allow for all of the various scripts to be represented, because the world's scripts contain far more characters than one byte can encode.
TitleFactory supports non-ASCII scripts if the text is placed into UTF-8 format. Every script that we have tried, from ASCII to Greek, Russian, Chinese, and Czech, can be saved in UTF-8 format in Notepad, Word, or many other text editors. In fact, TitleFactory is used all over the world to create image text files supporting a variety of scripts for DVD production.
If you have a script that you believe cannot be placed into UTF-8 format, please contact us, and we will see if we can provide an alternative conversion method for your particular script.
How to Process Non-ASCII Text in TitleFactory
To process text other than ASCII text or extended ASCII text (Latin-1, or ISO-8859-1), the input text file should be encoded in Unicode UTF-8 format. Many common word or text processing packages, such as Microsoft Word and Notepad, can save text in UTF-8 format.
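If a text file already exists in a legacy single-byte encoding, it can also be re-encoded outside of an editor. The following is a minimal Python sketch, not part of TitleFactory; the function name `convert_to_utf8` and the default source encoding are assumptions chosen for illustration. It writes no byte-order mark.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, src_encoding: str = "iso-8859-1") -> None:
    """Re-encode the text file `src` from `src_encoding` into UTF-8 at `dst`.

    Hypothetical helper for illustration; it is not a TitleFactory function.
    """
    # Read the raw bytes and decode them with the encoding the file was saved in.
    text = Path(src).read_bytes().decode(src_encoding)
    # Write the same characters back out as UTF-8 bytes (no byte-order mark).
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, `convert_to_utf8("titles_latin1.txt", "titles_utf8.txt")` would produce a UTF-8 copy of a Latin-1 file (both file names are placeholders). The source encoding must be known in advance; decoding with the wrong one silently produces the wrong characters.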
Unicode is a character standard that covers the world's scripts and defines several multi-byte encodings of varying length. UTF-8 is one of the encodings defined by the Unicode standard; it represents each character with 1 to 4 bytes, and ASCII characters still take a single byte. Note, however, that in most software packages an encoding option labeled "Unicode" is not the same as the option labeled "UTF-8" (it usually means UTF-16), so select UTF-8 explicitly when saving.
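The variable length of UTF-8 can be checked directly. The short Python sketch below is illustrative only (the helper name `utf8_length` is made up for this example); it prints how many bytes a few characters occupy in UTF-8 and shows that the same letter is stored differently in UTF-16.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes that `ch` occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# One sample character from several scripts.
samples = {
    "Latin A": "A",
    "Latin e with acute": "\u00e9",
    "Cyrillic Ya": "\u042f",
    "Chinese zhong": "\u4e2d",
}

for label, ch in samples.items():
    print(label, ch, utf8_length(ch))  # byte counts: 1, 2, 2, 3

# The same letter has different bytes in UTF-8 and in UTF-16.
print("A".encode("utf-8"))      # b'A'
print("A".encode("utf-16-le"))  # b'A\x00'
```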
To create images with TitleFactory using text in scripts other than Western European (which includes North/South American) scripts, one or more parameters must be set properly.
Note that some fonts have re-used the Latin-1 code points to represent a non-Latin-1 script. An example of this is the Hindi font known as DV-TTNandan, in which the script is represented in the area normally reserved for Latin-1 characters. In this case, even though TitleFactory may work with the Encoding setting on the Input Settings window set to ASCII, it would still be better to convert the file to UTF-8 and specify UTF-8 on the Input Settings window.
Also, when using a non-Western script with the Textmode set to Un-Wrapped or Wrapped, make sure that 'Ending Punctuation' is set to the characters that the script uses to end a sentence. This is necessary because in these modes TitleFactory needs to parse the text.
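To see why the 'Ending Punctuation' setting matters, consider a simple sentence splitter. The Python sketch below is not TitleFactory's parser; `split_sentences` is a hypothetical function written only to show that text in a non-Western script is not divided into sentences unless the script's own ending characters are supplied.

```python
def split_sentences(text: str, ending_punctuation: str) -> list[str]:
    """Split `text` after every character found in `ending_punctuation`.

    Illustrative only; TitleFactory's actual parsing rules are not shown here.
    """
    sentences = []
    current = []
    for ch in text:
        current.append(ch)
        if ch in ending_punctuation:
            sentence = "".join(current).strip()
            if sentence:
                sentences.append(sentence)
            current = []
    # Keep any trailing text that has no ending punctuation.
    tail = "".join(current).strip()
    if tail:
        sentences.append(tail)
    return sentences


chinese = "\u4f60\u597d\u3002\u518d\u89c1\u3002"  # two sentences ending in U+3002

# With Western punctuation only, the text stays in one piece.
print(len(split_sentences(chinese, ".!?")))     # 1
# With the ideographic full stop added, both sentences are found.
print(len(split_sentences(chinese, "\u3002")))  # 2
```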
Special Notes on Script Support
ImageMagick is a set of low-level routines that are used to annotate text on images. While ImageMagick supports UTF-8, it does not perform any special script encoding. Script encoding, therefore, is performed by TitleFactory for certain scripts. At present, special encoding is performed only on right-to-left scripts.
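As a rough illustration of what reordering for a right-to-left script involves, the Python sketch below detects right-to-left letters with the standard `unicodedata` module and reverses a line so that a left-to-right renderer draws it in the expected visual order. This is a deliberately naive assumption about the technique, not TitleFactory's implementation: the function names are hypothetical, and mixed-direction text, digits, and Arabic shaping are all ignored.

```python
import unicodedata


def is_rtl_char(ch: str) -> bool:
    """True for characters whose bidirectional class is R (e.g. Hebrew) or AL (Arabic)."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


def to_visual_order(line: str) -> str:
    """Naively reverse a line that contains right-to-left letters.

    Hypothetical helper: a full solution would follow the Unicode
    Bidirectional Algorithm instead of reversing the whole line.
    """
    if any(is_rtl_char(ch) for ch in line):
        return line[::-1]
    return line


hebrew = "\u05e9\u05dc\u05d5\u05dd"  # "shalom" in logical (typing) order
print(to_visual_order(hebrew) == hebrew[::-1])  # True
print(to_visual_order("hello"))                 # hello
```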
What do we know works? Only those scripts that have been tested.
Successfully tested scripts include:
Western European, also known as Latin Script (for languages such as English, French, Spanish),
Scandinavian (for languages such as Norwegian, Swedish),
Eastern European (for languages such as Czech, Romanian),
Cyrillic (for languages such as Russian, Bulgarian),
Greek,
Arabic. Note: There are two encoding schemes supported for Arabic (i.e., Arabic and Arabic2). Arabic encoding requires that the font supports the Arabic Presentation Forms A and B, and it provides better results than Arabic2 encoding. Arabic2 encoding only requires the use of Arabic Presentation Forms B.
Chinese
Hindi
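The Arabic note in the list above refers to two Unicode blocks: Arabic Presentation Forms-A (U+FB50 to U+FDFF) and Arabic Presentation Forms-B (U+FE70 to U+FEFF). These blocks hold the already-shaped (isolated, initial, medial, final) variants of Arabic letters, so a font must contain glyphs for them to display shaped text. The Python sketch below, with the hypothetical helper `presentation_block`, shows how a shaped code point relates to its base letter; it does not reproduce the Arabic or Arabic2 encodings themselves.

```python
import unicodedata

FORMS_A = range(0xFB50, 0xFE00)  # Arabic Presentation Forms-A: U+FB50..U+FDFF
FORMS_B = range(0xFE70, 0xFF00)  # Arabic Presentation Forms-B: U+FE70..U+FEFF


def presentation_block(ch: str):
    """Return "A" or "B" if `ch` is in a presentation-forms block, else None."""
    cp = ord(ch)
    if cp in FORMS_A:
        return "A"
    if cp in FORMS_B:
        return "B"
    return None


alef = "\u0627"        # base letter as stored in ordinary text
alef_final = "\ufe8e"  # the same letter in its final (end-of-word) shape

print(unicodedata.name(alef_final))      # ARABIC LETTER ALEF FINAL FORM
print(presentation_block(alef_final))    # B
print(presentation_block(alef))          # None
# Compatibility normalization maps the shaped form back to the base letter.
print(unicodedata.normalize("NFKC", alef_final) == alef)  # True
```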
Untested scripts include:
Several right-to-left scripts (the only encoding done with these is to transform the characters such that they appear in the correct order),
East and Southeast Asian scripts such as Korean and Thai,
Indic
Niger-Congo
and all the rest of the scripts that are in the yellowish hue below.
Note also that since the user interface does not support non-ASCII characters, the use of TitleFactory for non-Latin scripts is extremely limited.
Language Tree

| # | Family | Branch | Language | Speakers (millions) | Region | Script | Computer encoding |
|---|---|---|---|---|---|---|---|
| | Tower of Babel | Everyone spoke with the same language | | | | | |
| 1 | Indo-European | Germanic | English | 430 | Global | Latin | utf-8 or ASCII |
| | | | German | 120 | Germany | Latin | utf-8 or ASCII |
| | | | Yiddish | - | | Hebrew | - |
| | | | Dutch | 22 | Netherlands | Latin | utf-8 |
| | | | Afrikaans | | South Africa | Latin | utf-8 |
| | | | Swedish | 9 | North Europe | Latin | utf-8 |
| | | | Danish | 5.4 | | Latin | utf-8 |
| | | | Norwegian | 4.5 | | Latin | utf-8 |
| | | | Icelandic | 0.3 | | Latin | utf-8 |
| | | Latin (Romance) | Spanish | 310 | Global | Latin | utf-8 |
| | | | Portuguese | 175 | Brazil | Latin | utf-8 |
| | | | French | 115 | Global | Latin | utf-8 |
| | | | Italian | 63 | Italy | Latin | utf-8 |
| | | | Romanian | 22 | Romania | Latin | utf-8 |
| | | Hellenic | Greek | 11 | Greece | Greek | utf-8 |
| | | Slavic | Russian | 280 | Soviet Union | Cyrillic | utf-8 |
| | | | Bulgarian | 7.6 | Bulgaria | Cyrillic | utf-8 |
| | | | Polish | 39 | Poland | Latin | utf-8 |
| | | | Ukrainian | 48 | Ukraine | Cyrillic | utf-8 |
| | | | Czech | 10 | Czech Republic | Latin | utf-8 |
| | | Indic | Hindi | 320 | India | Devanagari | - |
| | | | Bengali | 185 | Bangladesh | Bengali | - |
| | | | Urdu | 88 | Pakistan | Nastaliq | - |
| | | | Punjabi | 75 | Pakistan | Gurmukhi | - |
| | | | Konkani | - | India | Latin | - |
| | | | Other: Nepali, Assamese, Oriya, Kashmiri, Sindhi, Gujarati, Sinhalese, Maldivian, Romany | | | | |
| 2 | Altaic | Japonic | Japanese | 125 | Japan | Japanese | utf-8 |
| | | Korean | Korean | 68 | Korea | Hangul | utf-8 |
| | | Mongolian | Mongolian, Buryat, Kalmyk | | | | |
| | | Tungusic | Evenki, Lamut, Manchu, Nanai, Sibo | | | | |
| | | Turkic | Turkish | 83 | Turkey | Latin | - |
| | | | Other: Azeri, Turkmen, Kazakh, Kirghiz, Tatar, Bashkir, Uzbek, Uigur, Chuvash, Balkar, Nogai, Salar | | | | |
| 3 | Sino-Tibetan | Sinitic | Mandarin | 900 | China | Chinese | utf-8 |
| | | | Other: Wu, Gan, Min, Hakka, Xiang, Cantonese, Yue | | | | |
| | | Tibeto-Burman | Burmese | 42 | Burma | Indian | - |
| | | | Other: Tibetan, Yi, Lisu, Moso, Lahu, Karen, Kachin, Chin, Bodo, Garo, Meithei, Lushei, Newari, Murmi, Jonkha, Mizo, Lepcha, Manipuri | | | | |
| | | Tai | Thai | 62 | Thailand | South Indian | iso-8859-11, tis-620 |
| | | | Other: Lao, Chuang, Puyi, Tung, Nung, Shan, Kam-Sui, Zhuang, Li, Be | | | | |
| | | Southern | Miao, Yao, She | | | | |
| 4 | Afro-Asiatic | Semitic | Arabic | 185 | Middle East | Arabic | utf-8 |
| | | | Hebrew | - | Israel | Hebrew | utf-8 |
| | | | Maltese | 0.4 | Malta | Latin | utf-8 |
| | | | Aramaic | - | Middle East | Latin | utf-8 |
| | | | Other: Amharic, Tigrinya, Tigre, Aramaic, Gurage, Harari, Geez | | | | |
| | | Berber | Shluh, Tamazight, Riffian, Kabyle, Shawia, Tuareg | | | | |
| | | Cushitic | Somali, Galla, Sidamo, Beja, Afar, Saho | | | | |
| | | Chadic | Hausa | | | | |
| 5 | Austro-Asiatic | Viet-Muong | Vietnamese | 81 | Vietnam | Latin | utf-8 |
| | | | Muong | - | | | |
| | | Mon-Khmer | Khmer, Mon, Palaung, Wa, Bahnar, Sedang, Khasi, Nicobarese, So, Nancowry, Sengoi, Temiar | | | | |
| | | Munda | Santali, Mundari, Ho, Savara, Korku | | | | |
| 6 | Uralic | Finnic | Finnish, Estonian, Mordvin, Udmurt, Mari, Votyak, Komi, Sami | | | | |
| | | Ugric | Hungarian, Ostyak, Vogul | | | | |
| | | Samoyed | Nenets, Selkup, Nganasan, Enets, Kamas | | | | |
| | | Yukaghir | Yukaghir | - | Eastern Siberia | Pictogram | - |
| 7 | Malayo-Polynesian (Austronesian) | Formosan | Amis, Atayal, Paiwan, Tsou | | | | |
| | | Western | Indonesian | 140 | Indonesia | Latin | utf-8 |
| | | | Malay | | Indonesia | Latin | - |
| | | | Tagalog | 85 | Philippines | Latin | utf-8 |
| | | | Other: Javanese, Sundanese, Madurese, Visayan, Malagasy, Achinese, Batak, Buginese, Balinese, Ilocano, Bikol, Igorot, Maranao, Pampangan, Pangasinan, Jarai, Rhade, Cham | | | | |
| | | Micronesian | Marshallese, Gilbertese, Chamorro, Ponapean, Yapese, Palau, Trukese, Nauruan | | | | |
| | | Melanesian | Fijian, Motu, Yabim | | | | |
| | | Polynesian | Maori, Uvea, Samoan, Tongan, Niuean, Rarotongan, Tahitian, Tuamotu, Marquesan, Hawaiian, Rapa, Nui | | | | |
| 8 | Caucasian | Kartvelian | Georgian, Laz, Svan, Chan, Mingrelian | | | | |
| | | Abkhaz-Adyghean | Abaza, Abkhaz, Adyghe, Kabardian, Circassian | | | | |
| | | Nakh | Chechen, Ingush, Tsova-Tush | | | | |
| | | Daghestanian | Tsez, Hunzib, Beshta, Avar, Andi, Chamali, Lak, Dargwa, Lezgian, Tabasaran, Tsakhur | | | | |
| 9 | Dravidian | Southern | Tamil | 66 | South India | Tamil | - |
| | | | Other: Telugu, Kannada, Malayalam, Tulu | | | | |
| | | Central | Brahui, Gondhi, Kurukh, Kui | | | | |
| 10 | Niger-Congo | English and French appear to be the official languages of most of these nations. | | | | | |
| | | Mande | Mende, Malinke, Bambara, Dyula, Soninke, Susu, Kpelle, Vai, Loma | | | | |
| | | West Atlantic | Fulani, Wolof, Serer, Dyola, Temne, Kissi, Gola, Balante | | | | |
| | | Voltaic | Mossi, Gurma, Dagomba, Kabre, Senufo, Bariba | | | | |
| | | Kwa | Yoruba, Ibo, Ewe, Twi, Fanti, Ga, Adangme, Fon, Edo, Urhobo, Idoma, Nupe, Agni, Baule, Kru, Grebo, Bassa | | | | |
| | | Bantu | Luba, Kongo, Lingala, Mongo, Ruanda, Rundi, Kikuyu, Kamba, Sukuma, Nyamwezi, Hehe, Chagga, Makonde, Yao, Ganda, Nkole, Chiga, Gisu, Toro, Nyoro, Nyanja, Tumbuka, Bemba, Tonga, Lozi, Lwena, Lunda, Shona, Fang, Bulu, Yaundé, Duala, Bubi, Mbundu, Chokwe, Ambo, Herero, Makua, Thonga, Sotho, Tswana, Pedi, Swazi, Zulu, Matebele, Xhosa, Venda | | | | |
| | | | Swahili | | | | |
| | | Efik | Efik, Ibibio, Tiv | | | | |
| | | Adamawa | Mbum | | | | |
| | | Eastern | Zande, Sango, Gbaya, Banda | | | | |
| | | Ijo | Ijo | | | | |
| 11 | Other | There are over 100 language families | | | | | |
from http://www.teachinghearts.org/
Copyright © 2002-2009 . All rights reserved.