Starting with the 1.5.50 release of OpenAFS for Windows, the AFS Client Service, the AFS Explorer Shell Extension, and the command-line tools are all Unicode enabled. OpenAFS is no longer restricted to accessing file system objects whose names can be represented in the locale-specific OEM code page. This has significant benefits for end users. Most importantly, it permits non-Western languages to be used for file system object names in AFS from Microsoft Windows operating systems. Because Unicode names are supported, Roaming User Profiles and Folder Redirection no longer fail when a user attempts to store an object whose name cannot be represented in the OEM code page.
Unicode names are stored in AFS using UTF-8 encoding. UTF-8 is supported as a locale on MacOS X, Linux, Solaris, and most other operating systems. This permits non-Western object names to be exchanged between Microsoft Windows and other operating systems. The OpenAFS for Windows client also implements Unicode Normalization as part of the name lookup algorithm. This is necessary because Unicode does not provide a unique representation for each input string: the same visible string can be encoded by more than one sequence of code points. The use of normalization permits a file system object name created on MacOS X to be matched with the same string entered on Microsoft Windows even though the two operating systems may choose different representations.
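The effect of normalization on name lookup can be illustrated with a minimal Python sketch. This is not the OpenAFS for Windows implementation; the function name find_entry and the sample file names are hypothetical, and the choice to normalize both sides to NFC at comparison time is an assumption made for illustration.

```python
import unicodedata


def find_entry(directory_names, typed_name):
    """Return the stored name that matches typed_name once both are
    normalized to NFC, or None if nothing matches.

    Hypothetical helper: it only illustrates why normalization is
    needed for lookups, not how the AFS client performs them.
    """
    wanted = unicodedata.normalize("NFC", typed_name)
    for stored in directory_names:
        if unicodedata.normalize("NFC", stored) == wanted:
            return stored
    return None


# The same visible name in two representations.
decomposed = "re\u0301sume\u0301.txt"   # 'e' followed by a combining acute accent
precomposed = "r\u00e9sum\u00e9.txt"    # single precomposed 'é' code points

# A plain comparison fails even though the names look identical.
assert decomposed != precomposed

# A normalized lookup finds the entry regardless of representation.
assert find_entry([decomposed], precomposed) == decomposed
```

Without the normalization step, the lookup would have to compare raw code point sequences and the two spellings of the same name would never match.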
It is important to note that AFS file servers are character-set agnostic. All file system object names are stored as octet strings without any character set tagging. If a file system object is created using OEM Code Page 858 and its name is then interpreted as UTF-8, the name is likely to appear as gibberish. OpenAFS for Windows goes to great lengths to ensure that such a name is converted to a form that permits the user to rename the object using Unicode. Conversely, accessing UTF-8 names on UNIX systems whose locale is set to one of the ISO Latin character sets will result in the UTF-8 strings appearing as gibberish.
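Both failure modes described above can be reproduced in a few lines of Python. This is only a sketch of what happens when untagged octets are read with the wrong character set; the helper name reinterpret is hypothetical, and the use of replacement characters for undecodable bytes is an assumption chosen so the example does not raise an exception.

```python
def reinterpret(name, stored_as, read_as):
    """Encode name with one character set and decode the resulting
    octets with another, as happens when a server stores untagged
    bytes and a client guesses the encoding.

    Hypothetical helper for illustration only.
    """
    raw = name.encode(stored_as)
    # Undecodable bytes become U+FFFD instead of raising an error.
    return raw.decode(read_as, errors="replace")


# A name written under OEM Code Page 858 and read back as UTF-8:
# the single byte 0x82 is not valid UTF-8 on its own.
print(repr(reinterpret("\u00e9", "cp858", "utf-8")))

# A UTF-8 name read on a system using an ISO Latin-1 locale:
# the two UTF-8 bytes are shown as two unrelated Latin-1 characters.
print(repr(reinterpret("\u00e9", "utf-8", "latin-1")))
```

In both cases the octets stored on the file server are unchanged; only the interpretation applied by the reader differs.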
UNIX AFS cannot perform Unicode Normalization for string comparisons. Although Unicode object names can be stored and read, a user may not be able to open an object by typing its name at the keyboard, because the typed representation may differ from the stored one. GUI point-and-click operations should permit any object to be accessed.
MacOS X uses Unicode Normalization Form D (NFD), whereas Microsoft Windows and most other applications use Unicode Normalization Form C (NFC). The difference is that in NFD, character sequences containing diacritical marks are decomposed into a base character followed by combining marks, whereas in NFC the sequences use precomposed characters whenever possible. Microsoft Windows can display and manipulate files stored using NFD, but the MacOS X Finder has trouble with filenames that are NFC encoded. All file names stored by the OpenAFS Windows client use NFC.
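The difference between the two forms is easiest to see at the code point level. The following Python sketch uses the standard unicodedata module; the helper name code_points is hypothetical and the example character is chosen only for illustration.

```python
import unicodedata


def code_points(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# NFC: a single precomposed character where one exists.
nfc = unicodedata.normalize("NFC", "e\u0301")
# NFD: a base letter followed by a combining acute accent.
nfd = unicodedata.normalize("NFD", "\u00e9")

print(code_points(nfc), len(nfc.encode("utf-8")), "bytes in UTF-8")
print(code_points(nfd), len(nfd.encode("utf-8")), "bytes in UTF-8")
```

Because the two forms produce different UTF-8 octet strings for the same visible name, a character-set agnostic file server treats them as two distinct names.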