The Unicode character set known as UCS-2 (Universal Character Set containing 2 bytes) is a fixed-width encoding scheme that uses 16 bits per character. EWL represents these characters as wide characters of type wchar_t, which can be manipulated with the wide character functions defined in the library.
To reduce the size of a file that contains wide character text, the library offers functions to read and write wide characters using multibyte encoding. Instead of storing every character in a file at its full fixed width, multibyte encoding takes advantage of each wide character's bit pattern to store it in one or more sequential bytes.
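As a rough illustration of the space saving, the sketch below counts how many bytes a wide string occupies once each character is converted with the standard C function wctomb. The helper name multibyte_length is hypothetical and not part of EWL, and the result depends on the encoding selected by the current locale; for plain ASCII text each character is expected to take a single byte.

```c
#include <limits.h>   /* MB_LEN_MAX */
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* wctomb */
#include <wchar.h>    /* wchar_t */

/* Hypothetical helper (not an EWL function): returns the number of bytes
   needed to store the wide string ws in the current multibyte encoding,
   excluding the terminator, or (size_t)-1 if a character cannot be
   converted. */
size_t multibyte_length(const wchar_t *ws)
{
    char buf[MB_LEN_MAX];   /* large enough for any single character */
    size_t total = 0;

    (void)wctomb(NULL, 0);  /* reset wctomb's internal conversion state */

    for (; *ws != L'\0'; ++ws) {
        int n = wctomb(buf, *ws);
        if (n < 0) {
            return (size_t)-1;  /* no multibyte form in this locale */
        }
        total += (size_t)n;
    }
    return total;
}
```

Comparing the result with wcslen(ws) * sizeof(wchar_t) gives an idea of how much smaller the multibyte form is for a given text.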
There are two types of multibyte encoding, modal and non-modal. With modal encoding, a conversion state is associated with a multibyte string. This state is akin to the shift state of a keyboard. The library uses the mbstate_t type to record a shift state. With non-modal encoding, no such state is involved and the first byte of a multibyte sequence contains information about the number of bytes in the sequence. The actual encoding scheme is defined in the LC_CTYPE component of the current locale.
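To make the non-modal case concrete, the following sketch uses UTF-8 as an example of such an encoding and derives the sequence length from the first byte alone, with no conversion state. The function utf8_seq_len is a hypothetical helper, not an EWL function; it only covers sequences of one to three bytes and does not reject overlong forms.

```c
#include <stddef.h>   /* size_t */

/* Hypothetical helper (not an EWL function): returns the length in bytes
   of the UTF-8 sequence introduced by lead, or 0 if lead cannot start a
   sequence of one to three bytes. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) {
        return 1;               /* 0xxxxxxx: single byte */
    }
    if ((lead & 0xE0) == 0xC0) {
        return 2;               /* 110xxxxx: two-byte sequence */
    }
    if ((lead & 0xF0) == 0xE0) {
        return 3;               /* 1110xxxx: three-byte sequence */
    }
    return 0;                   /* continuation byte or longer form */
}
```

Because the length is encoded in the lead byte, a reader can resynchronize at any sequence boundary without carrying state across calls, which is the property that distinguishes a non-modal encoding from a modal one.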
In EWL, two encoding schemes are available: a direct encoding, where only a single byte is used, and the non-modal UTF-8 (UCS Transformation Format 8) encoding scheme, where each Unicode character is represented by one to three 8-bit characters. For Unicode characters in the range 0x00 to 0x7F, the UTF-8 encoding is direct and only a single byte is used.
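The one-to-three byte mapping can be sketched as follows. This is a minimal illustration of standard UTF-8 encoding for a 16-bit UCS-2 value, not EWL's actual implementation; the name ucs2_to_utf8 is hypothetical, and the sketch assumes the input is an ordinary character rather than a surrogate value (0xD800 to 0xDFFF), which it does not check.

```c
#include <stddef.h>   /* size_t */

/* Hypothetical helper (not an EWL function): encodes one UCS-2 value as
   UTF-8. Writes one to three bytes into out and returns the count. */
size_t ucs2_to_utf8(unsigned short wc, unsigned char out[3])
{
    if (wc < 0x80) {
        /* 0x0000-0x007F: direct, single byte */
        out[0] = (unsigned char)wc;
        return 1;
    }
    if (wc < 0x800) {
        /* 0x0080-0x07FF: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (wc >> 6));
        out[1] = (unsigned char)(0x80 | (wc & 0x3F));
        return 2;
    }
    /* 0x0800-0xFFFF: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | ((wc >> 12) & 0x0F));
    out[1] = (unsigned char)(0x80 | ((wc >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (wc & 0x3F));
    return 3;
}
```

For example, 0x0041 is stored as the single byte 0x41, 0x00E9 as the two bytes 0xC3 0xA9, and 0x20AC as the three bytes 0xE2 0x82 0xAC.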