What is the meaning of platformID and encodingID in the cmap table of OpenType?


The spec (http://www.microsoft.com/typography/otspec/cmap.htm) only describes platformID 3 (Windows) and platformID 1 (Macintosh), and it only lists the encoding IDs for the Windows platform. What about other platforms, and encodings other than the Windows ones?

I used ttfdump to dump one of the Adobe fonts, and it gave me:

'cmap' Table - Character to Glyph Index Mapping Table
         'cmap' version: 0
         number of encodings: 5
         number of subtables: 3

Encoding   0.    PlatformID:  0
                 EcodingID:   3
                 SubTable: 0, Offset: 0x00000a0e

Encoding   1.    PlatformID:  0
                 EcodingID:   4
                 SubTable: 1, Offset: 0x0000cc26

Encoding   2.    PlatformID:  1
                 EcodingID:  25
                 SubTable: 2, Offset: 0x0000002c

Encoding   3.    PlatformID:  3
                 EcodingID:   1
                 SubTable: 0, Offset: 0x00000a0e

Encoding   4.    PlatformID:  3
                 EcodingID:  10
                 SubTable: 1, Offset: 0x0000cc26

So the font does use other platforms and encodings, but where can I find their definitions?

asked on Stack Overflow Aug 13, 2014 by Grissiom

2 Answers


The Microsoft documentation includes the following table:

 Platform ID  | Platform name | Platform-specific encoding IDs  | Language IDs
 0            | Unicode       | Various                         | Various 
 1            | Macintosh     | Script manager code             | Various
 2            | ISO [deprec]  | ISO encoding [deprecated]       | None 
 3            | Windows       | Windows encoding                | Various 
 4            | Custom        | Custom                          | None
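
To see where these IDs sit in the font itself, here is a minimal Python sketch that reads the encoding records from the start of a raw 'cmap' table. It assumes the layout from the cmap spec: a uint16 version, a uint16 record count, then one 8-byte record per encoding (uint16 platformID, uint16 encodingID, 32-bit offset), all big-endian. The names `read_encoding_records`, `PLATFORM_NAMES` and `sample` are made up for this illustration, and `sample` is a header rebuilt by hand from the ttfdump output in the question, not bytes taken from a real font.

```python
import struct

# Platform names copied from the table above.
PLATFORM_NAMES = {
    0: "Unicode",
    1: "Macintosh",
    2: "ISO (deprecated)",
    3: "Windows",
    4: "Custom",
}


def read_encoding_records(cmap: bytes):
    """Return (platformID, encodingID, offset) for each encoding record."""
    # Header: uint16 version, uint16 number of encoding records.
    _version, num_tables = struct.unpack_from(">HH", cmap, 0)
    records = []
    for i in range(num_tables):
        # Each record is 8 bytes and follows the 4-byte header.
        records.append(struct.unpack_from(">HHI", cmap, 4 + 8 * i))
    return records


# Hypothetical input: the five records shown by ttfdump in the question.
QUESTION_RECORDS = [
    (0, 3, 0x0A0E),
    (0, 4, 0xCC26),
    (1, 25, 0x002C),
    (3, 1, 0x0A0E),
    (3, 10, 0xCC26),
]
sample = struct.pack(">HH", 0, len(QUESTION_RECORDS)) + b"".join(
    struct.pack(">HHI", *rec) for rec in QUESTION_RECORDS
)

for platform_id, encoding_id, offset in read_encoding_records(sample):
    name = PLATFORM_NAMES.get(platform_id, "unknown")
    print(f"platformID {platform_id} ({name}), encodingID {encoding_id}, offset 0x{offset:08x}")
```

Records that share an offset (0/3 with 3/1, and 0/4 with 3/10 in the dump) point at the same subtable, which is why ttfdump reports five encodings but only three subtables.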

About non-standard Encoding IDs:

A new encoding ID for the Unicode platform will be assigned if a new version of Unicode moves characters, in order to properly specify character code semantics. (Because of Unicode stability policies, such a need is not anticipated.) The distinction between Unicode platform-specific encoding IDs 1 and 2 is for historical reasons only; the Unicode Standard is in fact identical in repertoire and encoding to ISO 10646. For all practical purposes in current fonts, the distinctions provided by encoding IDs 0, 1 and 2 are not important, thus these encoding IDs are deprecated.

A new encoding ID for the Unicode platform is also sometimes assigned when new cmap subtable formats are added to the specification, so as to allow for compatibility with existing parsers. For example, when cmap subtable formats 10 and 12 were added to the specification, encoding ID 4 was added as well, and when cmap subtable format 13 was added to the specification, encoding ID 6 was added. The cmap subtable formats listed in the table above are the only ones that may be used for the corresponding encoding ID.

answered on Stack Overflow Aug 13, 2014 by Kao • edited Aug 13, 2014 by Kao

You missed the text that says that more information is in the name table documentation, over at http://www.microsoft.com/typography/otspec/name.htm, which contains the (long) lists of which encodings and languages each platform supports.
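
As a rough illustration of how those lists are used, the sketch below looks up the five platformID/encodingID pairs from the question. The dictionaries are only a partial subset written out from my reading of the spec's lists (the Macintosh one in particular has many more script codes), so treat the labels as assumptions and check them against the name table documentation. The names `ENCODINGS` and `describe` are invented for this example.

```python
# Partial lookup tables; an assumed subset of the lists in the spec.
ENCODINGS = {
    0: ("Unicode", {
        0: "Unicode 1.0 semantics (deprecated)",
        1: "Unicode 1.1 semantics (deprecated)",
        2: "ISO/IEC 10646 semantics (deprecated)",
        3: "Unicode 2.0 and later, BMP only",
        4: "Unicode 2.0 and later, full repertoire",
        5: "Unicode Variation Sequences",
        6: "Unicode full repertoire",
    }),
    1: ("Macintosh", {
        0: "Roman",
        1: "Japanese",
        2: "Chinese (Traditional)",
        3: "Korean",
        25: "Chinese (Simplified)",
    }),
    3: ("Windows", {
        0: "Symbol",
        1: "Unicode BMP",
        2: "ShiftJIS",
        3: "PRC",
        4: "Big5",
        5: "Wansung",
        6: "Johab",
        10: "Unicode full repertoire",
    }),
}


def describe(platform_id: int, encoding_id: int) -> str:
    """Return 'platform / encoding', or a placeholder for IDs not in the subset."""
    if platform_id not in ENCODINGS:
        return f"unknown platform {platform_id} / encoding {encoding_id}"
    platform_name, encodings = ENCODINGS[platform_id]
    encoding_name = encodings.get(encoding_id, f"encoding {encoding_id} (not in this subset)")
    return f"{platform_name} / {encoding_name}"


# The five pairs reported by ttfdump in the question.
for pair in [(0, 3), (0, 4), (1, 25), (3, 1), (3, 10)]:
    print(pair, "->", describe(*pair))
```

Read this way, the dumped font carries a BMP-only Unicode mapping and a full-repertoire one, each exposed under both the Unicode and the Windows platform, plus a Macintosh mapping for script code 25.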

User contributions licensed under CC BY-SA 3.0