Creating international applications

PC Repair Tools

Advanced Registry Cleaner PC Diagnosis and Repair

Get Instant Access

This chapter discusses guidelines for writing applications that you plan to distribute to an international market. By planning ahead, you can reduce the amount of time and code necessary to make your application function in its foreign market as well as in its domestic market.

Internationalization and localization

To create an application that you can distribute to foreign markets, there are two major steps that need to be performed:

  • Internationalization
  • Localization

If your edition includes the Translation Tools, you can use the them to manage localization. For more information, see the online Help for the Translation Tools (ETM.hlp).

Internationalization

Internationalization is the process of enabling your program to work in multiple locales. A locale is the user's environment, which includes the cultural conventions of the target country as well as the language. Windows supports many locales, each of which is described by a language and country pair.

Internationalizing applications

Localization

Localization is the process of translating an application so that it functions in a specific locale. In addition to translating the user interface, localization may include functionality customization. For example, a financial application may be modified for the tax laws in different countries.

Internationalizing applications

You need to complete the following steps to create internationalized applications:

  • Enable your code to handle strings from international character sets.
  • Design your user interface to accommodate the changes that result from localization.
  • Isolate all resources that need to be localized.

Enabling application code

You must make sure that the code in your application can handle the strings it will encounter in the various target locales.

-Character sets

The Western editions (including English, French, and German) of Windows use the ANSI Latin-1 (1252) character set. However, other editions of Windows use different character sets. For example, the Japanese version of Windows uses the Shift-JIS character set (code page 932), which represents Japanese characters as multibyte character codes.

There are generally three types of characters sets:

  • Single-byte
  • Multibyte
  • Wide characters

Windows and Linux both support single-byte and multibyte character sets as well as Unicode. With a single-byte character set, each byte in a string represents one character. The ANSI character set used by many western operating systems is a single-byte character set.

In a multibyte character set, some characters are represented by one byte and others by more than one byte. The first byte of a multibyte character is called the lead byte. In general, the lower 128 characters of a multibyte character set map to the 7-bit ASCII characters, and any byte whose ordinal value is greater than 127 is the lead byte of a multibyte character. Only single-byte characters can contain the null value (#0). Multibyte character sets—especially double-byte character sets (DBCS)—are widely used for Asian languages.

OEM and ANSI character sets

It is sometimes necessary to convert between the Windows character set (ANSI) and the character set specified by the code page of the user's machine (called the OEM character set).

Multibyte character sets

The ideographic character sets used in Asia cannot use the simple 1:1 mapping between characters in the language and the one byte (8-bit) char type. These languages have too many characters to be represented using the single-byte char. Instead, a multibyte string can contain one or more bytes per character. AnsiStrings can contain a mix of single-byte and multibyte characters.

The lead byte of every multibyte character code is taken from a reserved range that depends on the specific character set. The second and subsequent bytes can sometimes be the same as the character code for a separate one-byte character, or it can fall in the range reserved for the first byte of multibyte characters. Thus, the only way to tell whether a particular byte in a string represents a single character or is part of a multibyte character is to read the string, starting at the beginning, parsing it into two or more byte characters when a lead byte from the reserved range is encountered.

When writing code for Asian locales, you must be sure to handle all string manipulation using functions that are enabled to parse strings into multibyte characters. See "MBCS utilities" in the online Help for a list of the RTL functions that are enabled to work with multibyte characters.

Delphi provides you with many of these runtime library functions, as listed in the following table:

Table 17.1 Runtime library functions

AdjustLineBreaks

AnsiStrLower

ExtractFileDir

AnsiCompareFileName

AnsiStrPos

ExtractFileExt

AnsiExtractQuotedStr

AnsiStrRScan

ExtractFileName

AnsiLastChar

AnsiStrScan

ExtractFilePath

AnsiLowerCase

AnsiStrUpper

ExtractRelativePath

AnsiLowerCaseFileName

AnsiUpperCase

FileSearch

AnsiPos

AnsiUpperCaseFileName

IsDelimiter

AnsiQuotedStr

ByteToCharIndex

IsPathDelimiter

AnsiStrComp

ByteToCharLen

LastDelimiter

AnsiStrIComp

ByteType

StrByteType

AnsiStrLastChar

ChangeFileExt

StringReplace

AnsiStrLComp

CharToByteIndex

WrapText

AnsiStrLIComp

CharToByteLen

Remember that the length of the strings in bytes does not necessarily correspond to the length of the string in characters. Be careful not to truncate strings by cutting a multibyte character in half. Do not pass characters as a parameter to a function or procedure, since the size of a character can't be known up front. Instead, always pass a pointer to a character or a string.

Wide characters

Another approach to working with ideographic character sets is to convert all characters to a wide character encoding scheme such as Unicode. Unicode characters and strings are also called wide characters and wide character strings. In the Unicode character set, each character is represented by two bytes. Thus a Unicode string is a sequence not of individual bytes but of two-byte words.

The first 256 Unicode characters map to the ANSI character set. The Windows operating system supports Unicode (UCS-2). The Linux operating system supports UCS-4, a superset of UCS-2. Delphi supports UCS-2 on both platforms. Because wide characters are two bytes instead of one, the character set can represent many more different characters.

Using a wide character encoding scheme has the advantage that you can make many of the usual assumptions about strings that do not work for MBCS systems. There is a direct relationship between the number of bytes in the string and the number of characters in the string. You do not need to worry about cutting characters in half or mistaking the second half of a character for the start of a different character.

The biggest disadvantage of working with wide characters is that Windows supports a few wide character API function calls. Because of this, the VCL components represent all string values as single byte or MBCS strings. Translating between the wide character system and the MBCS system every time you set a string property or read its value would require additional code and slow your application down. However, you may want to translate into wide characters for some special string processing algorithms that need to take advantage of the 1:1 mapping between characters and WideChars.

Including bi-directional functionality in applications

Some languages do not follow the left to right reading order commonly found in western languages, but rather read words right to left and numbers left to right. These languages are termed bi-directional (BiDi) because of this separation. The most common bi-directional languages are Arabic and Hebrew, although other Middle East languages are also bi-directional.

TApplication has two properties, BiDiKeyboard and NonBiDiKeyboard, that allow you to specify the keyboard layout. In addition, the VCL supports bi-directional localization through the BiDiMode and ParentBiDiMode properties.

Note Bi-directional properties are not available for cross-platform applications.

BiDiMode property

The BiDiMode property controls the reading order for the text, the placement of the vertical scrollbar, and whether the alignment is changed. Controls that have a text property, such as Name, display the BiDiMode property on the Object Inspector.

The BiDiMode property is a new enumerated type, TBiDiMode, with four states: bdLeftToRight, bdRightToLeft, bdRightToLeftNoAlign, and bdRightToLeftReadingOnly.

Note THintWindow picks up the BiDiMode of the control that activated the hint.

bdLeftToRight bdLeftToRight draws text using left to right reading order. The alignment and scroll bars are not changed. For instance, when entering right to left text, such as Arabic or Hebrew, the cursor goes into push mode and the text is entered right to left. Latin text, such as English or French, is entered left to right. bdLeftToRight is the default value.

Figure 17.1 TListBox set to bdLeftToRight bdRightToLeft bdRightToLeft draws text using right to left reading order, the alignment is changed and the scroll bar is moved. Text is entered as normal for right-to-left languages such as Arabic or Hebrew. When the keyboard is changed to a Latin language, the cursor goes into push mode and the text is entered left to right.

Figure 17.2 TListBox set to bdRightToLeft bdRightToLeftNoAlign bdRightToLeftNoAlign draws text using right to left reading order, the alignment is not changed, and the scroll bar is moved.

Figure 17.3 TListBox set to bdRightToLeftNoAlign bdRightToLeftReadingOnly bdRightToLeftReadingOnly draws text using right to left reading order, and the alignment and scroll bars are not changed.

Figure 17.4 TListBox set to bdRightToLeftReadingOnly

ParentBiDiMode property

ParentBiDiMode is a Boolean property. When True (the default) the control looks to its parent to determine what BiDiMode to use. If the control is a TForm object, the form uses the BiDiMode setting from Application. If all the ParentBiDiMode properties are True, when you change Application's BiDiMode property, all forms and controls in the project are updated with the new setting.

FlipChildren method

The FlipChildren method allows you to flip the position of a container control's children. Container controls are controls that can accept other controls, such as TForm, TPanel, and TGroupBox. FlipChildren has a single boolean parameter, AllLevels. When False, only the immediate children of the container control are flipped. When True, all the levels of children in the container control are flipped.

Delphi flips the controls by changing the Left property and the alignment of the control. If a control's left side is five pixels from the left edge of its parent control, after it is flipped the edit control's right side is five pixels from the right edge of the parent control. If the edit control is left aligned, calling FlipChildren will make the control right aligned.

To flip a control at design-time select Edit! Flip Children and select either All or Selected, depending on whether you want to flip all the controls, or just the children of the selected control. You can also flip a control by selecting the control on the form, right-clicking, and selecting Flip Children from the context menu.

Note Selecting an edit control and issuing a Flip Children ! Selected command does nothing. This is because edit controls are not containers.

Additional methods

There are several other methods useful for developing applications for bi-directional users.

Table 17.2 VCL methods that support Bi Di Method Description

OkToChangeFieldAlignment Used with database controls. Checks to see if the alignment of a control can be changed.

DBUseRightToLeftAlignment A wrapper for database controls for checking alignment.

ChangeBiDiModeAlignment Changes the alignment parameter passed to it. No check is done for BiDiMode setting, it just converts left alignment into right alignment and vice versa, leaving center-aligned controls alone.

IsRightToLeft Returns True if any of the right to left options are selected. If it returns False the control is in left to right mode.

UseRightToLeftReading Returns True if the control is using right to left reading.

UseRightToLeftAlignment Returns True if the control is using right to left alignment. It can be overridden for customization.

UseRightToLeftScrollBar Returns True if the control is using a left scroll bar.

DrawTextBiDiModeFlags Returns the correct draw text flags for the BiDiMode of the control.

DrawTextBiDiModeFlagsReadingOnly Returns the correct draw text flags for the BiDiMode of the control, limiting the flag to its reading order.

AddBiDiModeExStyle Adds the appropriate ExStyle flags to the control that is being created.

Locale-specific features

You can add extra features to your application for specific locales. In particular, for Asian language environments, you may want your application to control the input method editor (IME) that is used to convert the keystrokes typed by the user into character strings.

Controls offer support in programming the IME. Most windowed controls that work directly with text input have an ImeName property that allows you to specify a particular IME that should be used when the control has input focus. They also provide an ImeMode property that specifies how the IME should convert keyboard input. TWinControl introduces several protected methods that you can use to control the IME from classes you define. In addition, the global Screen variable provides information about the IMEs available on the user's system.

The global Screen variable also provides information about the keyboard mapping installed on the user's system. You can use this to obtain locale-specific information about the environment in which your application is running.

The IME is available in VCL applications only.

Designing the user interface

When creating an application for several foreign markets, it is important to design your user interface so that it can accommodate the changes that occur during translation.

Text

All text that appears in the user interface must be translated. English text is almost always shorter than its translations. Design the elements of your user interface that display text so that there is room for the text strings to grow. Create dialogs, menus, status bars, and other user interface elements that display text so that they can easily display longer strings. Avoid abbreviations—they do not exist in languages that use ideographic characters.

Short strings tend to grow in translation more than long phrases. Table 17.3 provides a rough estimate of how much expansion you should plan for given the length of your English strings:

Table 17.3 Estimating string lengths

Length of English string (in characters) Expected increase

Graphic images

Ideally, you will want to use images that do not require translation. Most obviously, this means that graphic images should not include text, which will always require translation. If you must include text in your images, it is a good idea to use a label object with a transparent background over an image rather than including the text as part of the image.

There are other considerations when creating graphic images. Try to avoid images that are specific to a particular culture. For example, mailboxes in different countries look very different from each other. Religious symbols are not appropriate if your application is intended for countries that have different dominant religions. Even color can have different symbolic connotations in different cultures.

Formats and sort order

The date, time, number, and currency formats used in your application should be localized for the target locale. If you use only the Windows formats, there is no need to translate formats, as these are taken from the user's Windows Registry. However, if you specify any of your own format strings, be sure to declare them as resourced constants so that they can be localized.

The order in which strings are sorted also varies from country to country. Many European languages include diacritical marks that are sorted differently, depending on the locale. In addition, in some countries, two-character combinations are treated as a single character in the sort order. For example, in Spanish, the combination ch is sorted like a single unique letter between c and d. Sometimes a single character is sorted as if it were two separate characters, such as the German eszett.

Keyboard mappings

Be careful with key-combinations shortcut assignments. Not all the characters available on the US keyboard are easily reproduced on all international keyboards. Where possible, use number keys and function keys for shortcuts, as these are available on virtually all keyboards.

Isolating resources

The most obvious task in localizing an application is translating the strings that appear in the user interface. To create an application that can be translated without altering code everywhere, the strings in the user interface should be isolated into a single module. Delphi automatically creates a .dfm (.xfm in CLX applications) file that contains the resources for your menus, dialogs, and bitmaps.

In addition to these obvious user interface elements, you will need to isolate any strings, such as error messages, that you present to the user. String resources are not included in the form file. You can isolate them by declaring constants for them using the resourcestring keyword. For more information about resource string constants, see the Delphi Language Guide. It is best to include all resource strings in a single, separate unit.

Creating resource DLLs

Isolating resources simplifies the translation process. The next level of resource separation is the creation of a resource DLL. A resource DLL contains all the resources and only the resources for a program. Resource DLLs allow you to create a program that supports many translations simply by swapping the resource DLL.

Use the Resource DLL wizard to create a resource DLL for your program. The Resource DLL wizard requires an open, saved, compiled project. It will create an RC file that contains the string tables from used RC files and resourcestring strings of the project, and generate a project for a resource only DLL that contains the relevant forms and the created RES file. The RES file is compiled from the new RC file.

You should create a resource DLL for each translation you want to support. Each resource DLL should have a file name extension specific to the target locale. The first two characters indicate the target language, and the third character indicates the country of the locale. If you use the Resource DLL wizard, this is handled for you. Otherwise, use the following code to obtain the locale code for the target translation:

unit locales;

interface uses

Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs, StdCtrls;

type

TForml = class(TForm) Buttonl: TButton; LocaleList: TListBox; procedure Button1Click(Sender: TObject); private

  • Private declarations } public
  • Public declarations } end;

Forml: TForml; implementation

function GetLocaleData(ID: LCID; Flag: DWORD): string; var

BufSize: Integer; begin

BufSize := GetLocaleInfo(ID, Flag, nil, 0); SetLength(Result, BufSize);

GetLocaleinfo(ID, Flag, PChar(Result), BufSize); SetLength(Result, BufSize - 1); end;

{ Called for each supported locale. }

function LocalesCallback(Name: PChar): Bool; stdcall;

LCID: Integer; begin

Form1.LocaleList.Items.Add(GetLocaleData(LCID, LOCALE_SLANGUAGE)); Result := Bool(1); end;

procedure TForm1.Button1Click(Sender: TObject); var

I: Integer; begin with Languages do begin for I := 0 to Count - 1 do begin

Using resource DLLs

The executable, DLLs, and packages (bpls) that make up your application contain all the necessary resources. However, to replace those resources by localized versions, you need only ship your application with localized resource DLLs that have the same name as your executable, DLL, or package files.

When your application starts up, it checks the locale of the local system. If it finds any resource DLLs with the same name as the EXE, DLL, or BPL files it is using, it checks the extension on those DLLs. If the extension of the resource module matches the language and country of the system locale, your application will use the resources in that resource module instead of the resources in the executable, DLL, or package. If there is not a resource module that matches both the language and the country, your application will try to locate a resource module that matches just the language. If there is no resource module that matches the language, your application will use the resources compiled with the executable, DLL, or package.

If you want your application to use a different resource module than the one that matches the locale of the local system, you can set a locale override entry in the Windows registry. Under the HKEY_CURRENT_USER\Software\Borland\Locales key, add your application's path and file name as a string value and set the data value to the extension of your resource DLLs. At startup, the application will look for resource DLLs with this extension before trying the system locale. Setting this registry entry allows you to test localized versions of your application without changing the locale on your system.

For example, the following procedure can be used in an install or setup program to set the registry key value that indicates the locale to use when loading applications:

procedure SetLocalOverrides(FileName: string, LocaleOverride: string); var

Reg: TRegistry; begin

Reg := TRegistry.Create; try if Reg.OpenKey('Software\Borland\Locales', True) then Reg.WriteString(LocalOverride, FileName); finally

Within your application, use the global FindResourceHInstance function to obtain the handle of the current resource module. For example:

LoadStr(FindResourceHInstance(HInstance), IDS_AmountDueName, szQuery, SizeOf(szQuery));

You can ship a single application that adapts itself automatically to the locale of the system it is running on, simply by providing the appropriate resource DLLs.

Dynamic switching of resource DLLs

In addition to locating a resource DLL at application startup, it is possible to switch resource DLLs dynamically at runtime. To add this functionality to your own applications, you need to include the ReInit unit in your uses statement. (ReInit is located in the Richedit sample in the Demos directory.) To switch languages, you should call LoadResourceModule, passing the LCID for the new language, and then call ReinitializeForms.

For example, the following code switches the interface language to French:

const

FRENCH = (SUBLANG_FRENCH shl 10) or LANG_FRENCH; if LoadNewResourceModule(FRENCH) <> 0 then ReinitializeForms;

The advantage of this technique is that the current instance of the application and all of its forms are used. It is not necessary to update the registry settings and restart the application or re-acquire resources required by the application, such as logging in to database servers.

When you switch resource DLLs the properties specified in the new DLL overwrite the properties in the running instances of the forms.

Note Any changes made to the form properties at runtime will be lost. Once the new DLL is loaded, default values are not reset. Avoid code that assumes that the form objects are reinitialized to the their startup state, apart from differences due to localization.

Localizing applications

Localizing applications

Once your application is internationalized, you can create localized versions for the different foreign markets in which you want to distribute it.

Localizing resources

Ideally, your resources have been isolated into a resource DLL that contains form files (.dfm in VCL applications or .xfm in CLX applications) and a resource file. You can open your forms in the IDE and translate the relevant properties.

Note In a resource DLL project, you cannot add or delete components. It is possible, however, to change properties in ways that could cause runtime errors, so be careful to modify only those properties that require translation. To avoid mistakes, you can configure the Object Inspector to display only Localizable properties; to do so, right-click in the Object Inspector and use the View menu to filter out unwanted property categories.

You can open the RC file and translate relevant strings. Use the StringTable editor by opening the RC file from the Project Manager.

Was this article helpful?

0 0

Responses

  • nora
    How to enter multi byte characters in string Table editor in delphi?
    9 years ago

Post a comment