From 4814030bf696bf552c0ee8cb7ea00c62f46ff50e Mon Sep 17 00:00:00 2001 From: Werner Lemberg Date: Wed, 31 Aug 2005 07:13:27 +0000 Subject: [PATCH] * src/gxvalid/README: Revised. --- ChangeLog | 4 + src/gxvalid/README | 725 +++++++++++++++++++++++++-------------------- 2 files changed, 409 insertions(+), 320 deletions(-) diff --git a/ChangeLog b/ChangeLog index f3d4c91a5..54ef5eb3a 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,7 @@ +2005-08-30 Werner Lemberg + + * src/gxvalid/README: Revised. + 2005-08-29 Werner Lemberg * include/freetype/freetype.h, include/freetype/ftchapters.h: Add diff --git a/src/gxvalid/README b/src/gxvalid/README index ac5ad637c..a370cfc49 100644 --- a/src/gxvalid/README +++ b/src/gxvalid/README @@ -1,60 +1,91 @@ - gxvalid: TrueType GX validator - ============================== +gxvalid: TrueType GX validator +============================== - 1. What is this - --------------- - "gxvalid" is a module to validate TrueType GX tables: a collection of - additional tables in TrueType font which is used by "QuickDraw GX - Text", Apple Advanced Typography (AAT). In addition, gxvalid can - validates "kern" table which had been extended for AAT. Like otvalid, - gxvalid uses Freetype2's validator framework(ftvalid). + +1. What is this +--------------- + + `gxvalid' is a module to validate TrueType GX tables: a collection of + additional tables in a TrueType font which are used by `QuickDraw GX + Text', Apple Advanced Typography (AAT). In addition, gxvalid can + validate `kern' tables which have been extended for AAT. Like the + otvalid module, gxvalid uses FreeType 2's validator framework + (ftvalid). You can link gxvalid with your program; before running your own layout - engine, gxvalid validates a font file. As the result, you can reduce - error-checking code from the layout engine. You can use gxvalid as a - stand-alone font validator; ftvalid command included in ft2demo calls - gxvalid internally.
Stand-alone font validator may be useful for font - developers. + engine, gxvalid validates a font file. As a result, you can remove + error-checking code from the layout engine. It is also possible to + use gxvalid as a stand-alone font validator; the `ftvalid' test + program included in the ft2demo bundle calls gxvalid internally. + A stand-alone font validator may be useful for font developers. + + This document covers the following issues. - This documents contains following informations: - supported TrueType GX tables - - validation limitation in principle + - fundamental validation limitations - permissive error handling of broken GX tables - - "kern" table issue. + - `kern' table issues. - 2. Supported tables - ------------------- - Following GX tables are currently supported. - bsln feat just kern(*) lcar mort morx opbd prop trak +2. Supported tables +------------------- - Following GX tables are currently unsupported. - cvar fdsc fmtx fvar gvar Zapf + The following GX tables are currently supported. - Following GX tables won't be supported. - acnt(**) hsty(***) + bsln + feat + just + kern(*) + lcar + mort + morx + opbd + prop + trak - Undocumented tables in TrueType fonts designed for Apple platform. - CVTM TPNM addg umif + The following GX tables are currently unsupported. - *) "kern" validator includes both of classic kern (format supported - by both of Microsoft and Apple platforms) and new kern (a format - supported by Apple platform only). + cvar + fdsc + fmtx + fvar + gvar + Zapf - **) "acnt" tables is not supported by currently available Apple font + The following GX tables won't be supported. + + acnt(**) + hsty(***) + + The following undocumented tables in TrueType fonts designed for the + Apple platform aren't handled either.
+ + addg + CVTM + TPNM + umif + + + *) The `kern' validator handles both the classic and the new kern + formats; the former is supported on both Microsoft and Apple + platforms, while the latter is supported on Apple platforms only. + + **) `acnt' tables are not supported by currently available Apple font tools. - ***) There is one more Apple extension "hsty" but it is for Newton-OS, - not GX (Newton-OS is a platform by Apple, but it can use sfnt- - housed bitmap fonts only. Therefore, it should be excluded from - "Apple platform" in the context of TrueType. gxvalid ignores it - as Apple font tools do so. + ***) There is one more Apple extension, `hsty', but it is for + Newton-OS, not GX (Newton-OS is a platform by Apple, but it can + use sfnt-housed bitmap fonts only). Therefore, it should be + excluded from `Apple platform' in the context of TrueType. + gxvalid ignores it, as Apple font tools do. + + + We have checked 183 fonts bundled with MacOS 9.1, MacOS 9.2, MacOS + 10.0, MacOS X 10.1, MSIE for MacOS, and AppleWorks 6.0. In addition, + we have checked 67 Dynalab fonts (designed for MacOS) and 189 Ricoh + fonts (designed for Windows and MacOS dual platforms). The numbers of + fonts including TrueType GX tables are as follows. - We have checked 183 fonts bundled to MacOS 9.1, MacOS 9.2, MacOS 10.0, - MacOS X 10.1, MSIE for MacOS and AppleWorks 6.0. In addition, we have - checked 67 Dynalab fonts (designed for MacOS) and 189 Ricoh fonts - (designed for Windows and MacOS dual platforms). The number of fonts - including TrueType GX tables are listed in following: bsln: 76 feat: 191 just: 84 @@ -65,371 +96,425 @@ opbd: 4 prop: 114 trak: 16 - Dynalab and Ricoh fonts didn't have GX tables except of feat and mort. - 3. Validation limitations in principle - -------------------------------------- - TrueType GX provides layout information to font-rasterize/text-layout - libraries.
gxvalid can check whether layout information is stored as - TrueType GX format specified by Apple. But gxvalid cannot check how - QuickDraw GX/AAT renderer uses the stored information. + Dynalab and Ricoh fonts don't have GX tables except for `feat' and + `mort'. - 3-1. Validation of State Machine activity - ----------------------------------------- - QuickDraw GX/AAT has "State Machine" to provide "stateful" layout - features, and TrueType GX stores the state transition diagram of - "State Machine" in "StateTable" data structure. While State Machine - receives a series of glyph ID, State Machine starts from "start of - text" state, walks around various states and generates various - layout informations to renderer, and finally reaches to "end of - text". + +3. Fundamental validation limitations +------------------------------------- + + TrueType GX provides layout information to libraries for font + rasterizers and text layout. gxvalid can check whether the layout + data in a font is conformant to the TrueType GX format specified by + Apple. But gxvalid cannot check how a QuickDraw GX/AAT renderer uses + the stored information. + + 3-1. Validation of State Machine activity + ----------------------------------------- + + QuickDraw GX/AAT uses a `State Machine' to provide `stateful' layout + features, and TrueType GX stores the state transition diagram of + this `State Machine' in a `StateTable' data structure. As the + State Machine receives a series of glyph IDs, it starts in the + `start of text' state, walks through various states, generates + layout information for the renderer, and finally + reaches the `end of text' state.
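The walk described above can be sketched with a toy model. This is only an illustration: the nested lists below are a hypothetical stand-in and not the binary StateTable layout of the TrueType GX specification, and `find_static_errors' and `walk' are names invented for this sketch, not gxvalid functions. The point is that some errors are visible from the table alone, without feeding any text to the machine.

```python
# Toy model of a state transition diagram.  NOT the real StateTable
# binary layout; all names here are hypothetical illustrations.
START_OF_TEXT = 0


def find_static_errors(glyph_classes, transitions):
    """Report errors that are visible without running the machine.

    glyph_classes[gid] is the class index of glyph `gid`;
    transitions[state][cls] is the state reached from `state`
    when a glyph of class `cls` arrives.
    """
    errors = []
    num_states = len(transitions)
    num_classes = len(transitions[0]) if transitions else 0

    # A glyph whose class has no column cannot be handled at all.
    for gid, cls in enumerate(glyph_classes):
        if not 0 <= cls < num_classes:
            errors.append(f"glyph {gid}: unknown class {cls}")

    # Every entry must point to a state that actually exists.
    for state, row in enumerate(transitions):
        for cls, target in enumerate(row):
            if not 0 <= target < num_states:
                errors.append(
                    f"state {state}, class {cls}: undefined state {target}")
    return errors


def walk(glyph_classes, transitions, glyph_ids):
    """Run the toy machine over a glyph run; return the visited states."""
    state = START_OF_TEXT
    visited = [state]
    for gid in glyph_ids:
        state = transitions[state][glyph_classes[gid]]
        visited.append(state)
    return visited
```

The static check needs a number of steps bounded by the table size, whereas `walk' can be made arbitrarily long by supplying a longer glyph run.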
gxvalid can check essential errors like: - - possibility of state transition to undefined states - - existence of glyph ID that State Machine doesn't know how to - handle it - - State Machine cannot compute the layout information from given - diagram - these errors can be checked within finite steps, and without State - Machine itself, because these are errors of "expression" of state + + - possibility of state transitions to undefined states + - existence of glyph IDs that the State Machine doesn't know how + to handle + - the State Machine cannot compute the layout information from + given diagram + + These errors can be checked within finite steps, and without the + State Machine itself, because these are `expression' errors of state transition diagram. - There's no limitation about how long State Machine walks around, so - validation of the algorithm in the state transition diagram requires - infinite steps, even if we have State Machine in gxvalid. Therefore, - following "errors" cannot be checked. - - existence of states which State Machine never transits to. - - possibility that State Machine never reaches to "end of text". - - possibility of stack underflow/overflow in State Machine - (in ligature and contextual glyph substitution, State Machine - can store 16 glyphs onto its stack) + There is no limitation about how long the State Machine walks + around, so validation of the algorithm in the state transition + diagram requires infinite steps, even if we had a State Machine in + gxvalid. Therefore, the following errors and problems cannot be + checked. - In addition, gxvalid doesn't check "temporal glyph ID" used in the - chained State Machines (in "mort" and "morx" tables). When a layout - feature is implemented by single State Machine, glyph ID converted - by State Machine is passed to the glyph renderer, thus it should not - point to undefined glyph ID. 
But if a layout feature is implemented - by chained State Machines, the component State Machine (if it is not - final one) is permitted to generate undefined glyph ID for temporal - use, because it is handled by next component State Machine, instead - of the glyph renderer. To validate such temporal glyph ID, gxvalid - must stack all undefined glyph IDs which is possible in the output - of previous State Machine and search them in "ClassTable" of current - State Machine. It is too complexed work to list all possible glyph - IDs from StateTable, especially from ligature substitution table. + - existence of states which the State Machine never transits to + - the possibility that the State Machine never reaches `end of + text' + - the possibility of stack underflow/overflow in the State Machine + (in ligature and contextual glyph substitutions, the State + Machine can store 16 glyphs onto its stack) - 3-2. Validation of relationship among multiple layout features - -------------------------------------------------------------- - gxvalid does not validate the relationship among multiple layout + In addition, gxvalid doesn't check `temporary glyph IDs' used in the + chained State Machines (in `mort' and `morx' tables). If a layout + feature is implemented by a single State Machine, a glyph ID + converted by the State Machine is passed to the glyph renderer, thus + it should not point to an undefined glyph ID. But if a layout + feature is implemented by chained State Machines, a component State + Machine (if it is not the final one) is permitted to generate + undefined glyph IDs for temporary use, because it is handled by next + component State Machine and not by the glyph renderer. To validate + such temporary glyph IDs, gxvalid must stack all undefined glyph IDs + which can occur in the output of the previous State Machine and + search them in the `ClassTable' structure of the current State + Machine. 
It is too complex to list all possible glyph IDs from the + StateTable, especially from a ligature substitution table. + + 3-2. Validation of relationship between multiple layout features + ---------------------------------------------------------------- + + gxvalid does not validate the relationship between multiple layout features at all. - If multiple layout features are defined in TrueType GX tables, the - interactivity, overriding, and conflict among layout features are - defined in the font too. For example, there are several predefined - spacing control features: + If multiple layout features are defined in TrueType GX tables, + possible interactions, overrides, and conflicts between layout + features are implicitly given in the font too. For example, there + are several predefined spacing control features: + - Text Spacing (Proportional/Monospace/Half-width/Normal) - Number Spacing (Monospaced-numbers/Proportional-numbers) - Kana Spacing (Full-width/Proportional) - Ideographic Spacing (Full-width/Proportional) - CJK Roman Spacing (Half-width/Proportional/Default-roman /Full-width-roman/Proportional) - If all layout features are independently managed, we can set an - inconsistent typographic rule, as like "Text Spacing=Monospace" and - "Ideographic Spacing=Proportional", at the same time. - The combination of each layout feature is managed by 32bit integer - (1 bit for 1 selector setting), so we can define relationship among - features up to 32 settings, theoretically. But if setting of - a feature affects setting of another features, typographic priority - of each layout feature is required to validate the relationship. + If all layout features are independently managed, we can activate + inconsistent typographic rules like `Text Spacing=Monospace' and + `Ideographic Spacing=Proportional' at the same time. 
+ + The combination of layout features is managed by a 32bit integer + (one bit for each selector setting), so we can define relationships + between up to 32 features, theoretically. But if one feature + setting affects another feature setting, we need typographic + priority rules to validate the relationship. Unfortunately, the TrueType GX format specification does not give such information even for predefined features. - 4. Permissive error handling of broken GX tables - ------------------------------------------------ - When Apple's font rendering system finds an inconsistency, violation - of specification or unspecified value in TrueType GX tables, they do - not always return error. In most case, they silently ignore such wrong - values or whole of table. In fact, MacOS is shipped with fonts - including broken GX/AAT tables, but no harmful effects due to - officially broken fonts are observed by end-users. - gxvalid is designed to continue its validation as long as possible. - When gxvalid find wrong value, gxvalid warns it at least, and take a - fallback procedure if possible. The fallback procedure depends on the - debug level. +4. Permissive error handling of broken GX tables +------------------------------------------------ + + When Apple's font rendering system finds an inconsistency, like a + specification violation or an unspecified value in a TrueType GX + table, it does not always return an error. In most cases, the rendering + engine silently ignores such wrong values or even whole tables. In + fact, MacOS is shipped with fonts including broken GX/AAT tables, but + no harmful effects due to `officially broken' fonts are observed by + end-users. + + gxvalid is designed to continue the validation process as long as + possible. When gxvalid finds a wrong value, it at least emits a + warning, and takes a fallback procedure if possible. The fallback + procedure depends on the debug level.
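A level-dependent fallback of this kind might look like the sketch below. It assumes that the `debug level' corresponds to the FT_VALIDATE_DEFAULT, FT_VALIDATE_TIGHT, and FT_VALIDATE_PARANOID validation levels named in section 4, and it uses the reversed-segment case of section 4-3 (normalize, ignore, or abort) as the example. The `Level' enum, `ValidationError', and `check_segment' are hypothetical names for illustration only; they are not part of the gxvalid API.

```python
from enum import IntEnum


class Level(IntEnum):
    # Stand-ins for FT_VALIDATE_DEFAULT / _TIGHT / _PARANOID.
    DEFAULT = 0
    TIGHT = 1
    PARANOID = 2


class ValidationError(Exception):
    """Raised when validation has to be aborted."""


def check_segment(first_glyph, last_glyph, level, log):
    """Validate one (firstGlyph, lastGlyph) segment.

    Returns the segment to use, or None if it should be skipped.
    Every problem is appended to `log` as a warning, whatever the
    level is; only the fallback differs.
    """
    if first_glyph <= last_glyph:
        return (first_glyph, last_glyph)

    log.append(f"reversed segment: {first_glyph} > {last_glyph}")

    if level >= Level.PARANOID:
        raise ValidationError("reversed segment")  # abort
    if level >= Level.TIGHT:
        return None                                # ignore the segment
    return (last_glyph, first_glyph)               # normalize the order
```

With `log = []', calling `check_segment(9, 3, Level.DEFAULT, log)' yields the normalized pair, the same call at `Level.TIGHT' yields None, and at `Level.PARANOID' it raises; in all three cases a warning is recorded.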
+ + We used the following three tools to investigate Apple's error handling. - We used following 3 tools to refer Apple's error handling. - FontValidator (for MacOS 8.5 - 9.2) resource fork font - ftxvalidator (for MacOS X 10.1 -) dfont or naked-sfnt - ftxdumperfuser (for MacOS X 10.1 -) dfont or naked-sfnt - However, all tests are on PowerPC based Macintosh, we have not tested - on m68k-based Macintosh at all, at present. - We checked 183 fonts bundled to MacOS 9.1, MacOS 9.2, MacOS 10.0, - MacOS X 10.1, MSIE for MacOS and AppleWorks 6.0. These fonts are - distributed officially, but many broken GX/AAT tables are found by - Apple's font tools. In following, we list typical violation against GX - specification, in Apple official fonts. At least, gxvalid warns them, - and fallback method to continue + However, all tests were done on a PowerPC-based Macintosh; at present, + we have not checked those tools on an m68k-based Macintosh. - 4-1. broken BinSrchHeader ( 19/183) - ----------------------------------- - BinSrchHeader is a header of data array, for m68k platform to access - memory effectively. Although independent parameters for real use are - only 2 (unitSize and nUnits), BinSrchHeader has 3 additional - parameters which can be calculated from unitSize and nUnits, for - fast setup. Apple font tools ignore them silently, so gxvalid warns - inconsistency and always continues validation. The additional - parameters are ignored regardless of the consistency. + In total, we checked 183 fonts bundled with MacOS 9.1, MacOS 9.2, MacOS + 10.0, MacOS X 10.1, MSIE for MacOS, and AppleWorks 6.0. These fonts + are distributed officially, but many broken GX/AAT tables were found + by Apple's font tools. In the following, we list typical violations of + the GX specification in fonts officially distributed with those Apple + systems. - 19 fonts include inconsistent with calculated values - all breaks are in BinSrchHeader of "kern" table. + 4-1.
broken BinSrchHeader (19/183) + ---------------------------------- - 4-2. too-short LookupTable ( 5/183) - ----------------------------------- - LookupTable format 0 is simple array to get a value from given GID, - the index of array is GID. Therefore, the length of array is - expected to be same with max GID defined in "maxp" table, but there - is some fonts whose LookupTable format 0 is too short to cover all - GID. FontValidator ignores this error silently, ftxvalidator and - ftxdumperfuser warns and continues. Similar shortage is found in - format 3 subtable of "kern". - gxvalid warns always and abort at FT_VALIDATE_PARANOID. + `BinSrchHeader' is a header of a data array for m68k platforms to + access memory efficiently. Although there are only two independent + parameters in real use (`unitSize' and `nUnits'), BinSrchHeader has + three additional parameters which can be calculated from `unitSize' + and `nUnits', for fast setup. Apple font tools ignore them + silently, so gxvalid warns if it finds an inconsistency, and always + continues validation. The additional parameters are ignored + regardless of their consistency. - 5 fonts include too-short kern format 0 subtables. - 1 font includes too-short kern format 3 subtable. + 19 fonts include such inconsistencies; all of them are in the + BinSrchHeader structure of the `kern' table. + 4-2. too-short LookupTable (5/183) + ---------------------------------- - 4-3. broken LookupTable format 2 ( 1/183) - ----------------------------------------- - LookupTable format 2, 4 covers GID space by collection of segments - which specified by firstGlyph and lastGlyph. Some fonts stores - firstGlyph and lastGlyph in reverse order, so segment specification - is broken. Apple font tools ignores this error silently, broken - segment is ignored as if it did not exist. gxvalid warns and - normalize the segment at FT_VALIDATE_DEFAULT, or ignore the segment - at FT_VALIDATE_TIGHT, or abort at FT_VALIDATE_PARANOID.
+ LookupTable format 0 is a simple array to get a value from a given + GID (glyph ID); the index of this array is a GID too. Therefore, + the length of the array is expected to be the same as the maximum GID + value defined in the `maxp' table, but there are some fonts whose + LookupTable format 0 is too short to cover all GIDs. FontValidator + ignores this error silently; ftxvalidator and ftxdumperfuser both + warn and continue. Similar problems are found in format 3 subtables + of `kern'. gxvalid always warns, and aborts if the validation level + is set to FT_VALIDATE_PARANOID. - 1 font includes broken LookupTable format 2, in "just" table. + 5 fonts include too-short kern format 0 subtables. + 1 font includes too-short kern format 3 subtable. - *) It seems that all fonts manufactured by ITC for AppleWorks have - this error. + 4-3. broken LookupTable format 2 (1/183) + ---------------------------------------- - - 4-4. bad bracketing in glyph property ( 14/183) - ----------------------------------------------- - GX/AAT defines bracketing property of the glyphs by "prop" table, to - control layout functionalities for string closed in brackets and out - of brackets. Some fonts give inappropriate bracket properties to - glyphs. Apple font tools warn this error. gxvalid warns always and + LookupTable formats 2 and 4 cover the GID space by a + collection of segments which are specified by `firstGlyph' and + `lastGlyph'. Some fonts store `firstGlyph' and `lastGlyph' in + reverse order, so the segment specification is broken. Apple font + tools ignore this error silently; a broken segment is ignored as if + it did not exist. gxvalid warns and normalizes the segment at + FT_VALIDATE_DEFAULT, or ignores the segment at FT_VALIDATE_TIGHT, or aborts at FT_VALIDATE_PARANOID. - 14 fonts include wrong bracket properties. + 1 font includes broken LookupTable format 2, in the `just' table. + + *) It seems that all fonts manufactured by ITC for AppleWorks have + this error. + + 4-4.
bad bracketing in glyph property (14/183) + ---------------------------------------------- + + GX/AAT defines a `bracketing' property of the glyphs in the `prop' + table, to control layout features of strings enclosed inside and + outside of brackets. Some fonts give inappropriate bracket + properties to glyphs. Apple font tools warn about this error; + gxvalid warns too and aborts at FT_VALIDATE_PARANOID. + + 14 fonts include wrong bracket properties. - 4-5. invalid feature number (117/183) - ------------------------------------- - GX/AAT extension can include 255 different features for layout, but - popular layout features are predefined - (see http://developer.apple.com/fonts/Registry/index.html). - Some fonts include feature number which is incompatible with - predefined feature registry. + 4-5. invalid feature number (117/183) + ------------------------------------- - In our survey, there are 140 fonts including "feat" table. - a) 67 fonts uses feature number which should not be used. - b) 117 fonts set wrong feature range (nSetting). - this infraction is found in mort/morx. + The GX/AAT extension can include 255 different layout features, but + popular layout features are predefined (see + http://developer.apple.com/fonts/Registry/index.html). Some fonts + include feature numbers which are incompatible with the predefined + feature registry. - Apple font tools gives no warning, although they cannot recognize - what the feature is. At FT_VALIDATE_DEFAULT, gxvalid warns but + In our survey, there are 140 fonts including `feat' table. + + a) 67 fonts use a feature number which should not be used. + b) 117 fonts set the wrong feature range (nSetting). This is mostly + found in the `mort' and `morx' tables. + + Apple font tools give no warning, although they cannot recognize + what the feature is. At FT_VALIDATE_DEFAULT, gxvalid warns but continues in both cases (a, b). At FT_VALIDATE_TIGHT, gxvalid warns and aborts for (a), but continues for (b). 
At FT_VALIDATE_PARANOID, gxvalid warns and aborts in both cases (a, b). - 4-6. invalid prop version ( 10/183) - ----------------------------------- - As most TrueType GX tables, prop table must start with 32bit - version: 0x00010000, 0x00020000 or 0x00030000. But some fonts store - nonsense binary data in it. When Apple font tools find them, they - abort the processing at once, and following data are unhandled. - gxvalid does same always. + 4-6. invalid prop version (10/183) + ---------------------------------- - 10 fonts include broken prop version. + Like most TrueType GX tables, the `prop' table must start with a 32bit + version identifier: 0x00010000, 0x00020000 or 0x00030000. But some + fonts store nonsense binary data instead. When Apple font tools + find such data, they abort the processing immediately, and the data which + follows is unhandled. gxvalid does the same. - 10 fonts include broken prop version. + 10 fonts include a broken `prop' version. - All of these fonts are classic TrueType for Japanese script, - manufactured by Apple. + All of these fonts are classic TrueType fonts for the Japanese + script, manufactured by Apple. - 4-7. unknown resource name ( 2/183) - ------------------------------------ - NOTE: THIS IS NOT TRUETYPE GX ERROR - When TrueType font is stored in resource fork or dfont format, - the data must be tagged as "sfnt" in resource fork index, to invoke - TrueType font handler for the data. But the TrueType font data in - "Keyboard.dfont" is tagged as "kbd", and that in "LastResort.dfont" - is tagged as "lst". Apple font tools can detect the data is of - TrueType and successfully validate them. Possibly this because they - are known to be dfont. Current implementation of resource fork - driver of FreeType cannot do that, thus gxvalid cannot validate them. - 2 fonts use unknown tag for TrueType font resource. + 4-7. unknown resource name (2/183) + ---------------------------------- - 5.
"kern" table issue - --------------------- - In common terminology of TrueType, "kern" is classified to basic and - platform-independent table. But there are Apple extensions of kern, - and there is an extension which requires GX state machine for - contextual kerning. Therefore, gxvalid includes validator for kern. - Unfortunately, there is no exact algorithm to check Apple's extension, - so gxvalid includes pragmatic detector of data format and validator - for all possible data formats, including data format for Microsoft. - By calling classic_kern_validate() instead of gxv_validate(), you can - specify available "kern" format explicitly. However, current FreeType2 - uses Microsoft "kern" format only, others are ignored. + NOTE: THIS IS NOT A TRUETYPE GX ERROR. - 5-1. History - ------------ - Original 16bit version of "kern" had been designed by Apple in pre- - GX era, and it was also approved by Microsoft. Afterwards, Apple has - designed new 32bit version "kern". Apple has noted as the difference - between 16bit and 32bit version is only the size of variables in - "kern" header. In following, we call the original 16bit version as - "classic", and 32bit version as "new". + If a TrueType font is stored in the resource fork or in dfont + format, the data must be tagged as `sfnt' in the resource fork index + to invoke TrueType font handler for the data. But the TrueType font + data in `Keyboard.dfont' is tagged as `kbd', and that in + `LastResort.dfont' is tagged as `lst'. Apple font tools can detect + that the data is in TrueType format and successfully validate them. + Maybe this is possible because they are known to be dfont. The + current implementation of the resource fork driver of FreeType + cannot do that, thus gxvalid cannot validate them. - 5-2. Versions and dialects which should be discriminated - -------------------------------------------------------- - The "kern" table consists of the table header and several subtables. 
- The version "classic" or "new" is explicitly written in the table - header, but there are undocumented difference of font parser between - Microsoft and Apple. It is called as "dialect" in following. - There are 3 cases which should be discriminated: new Apple-dialect, - classic Apple-dialect, and classic Microsoft-dialect. Analysis and - auto detection algorithm of gxvalid is described in following. + 2 fonts use an unknown tag for the TrueType font resource. - 5-2-1. Version detection: classic and new kern - ---------------------------------------------- - According to Apple TrueType specification, the clarified - difference between classic and new version are only 2: - - "kern" table header starts with the version number. +5. `kern' table issues +---------------------- + + In common terminology of TrueType, `kern' is classified as a basic and + platform-independent table. But there are Apple extensions of `kern', + and there is an extension which requires a GX state machine for + contextual kerning. Therefore, gxvalid includes a special validator + for `kern' tables. Unfortunately, there is no exact algorithm to + check Apple's extension, so gxvalid includes a heuristic algorithm to + find the proper validation routines for all possible data formats, + including the data format for Microsoft. By calling + classic_kern_validate() instead of gxv_validate(), you can specify the + `kern' format explicitly. However, current FreeType2 uses Microsoft + `kern' format only, others are ignored (and should be handled in a + library one level higher than FreeType). + + 5-1. History + ------------ + + The original 16bit version of `kern' was designed by Apple in the + pre-GX era, and it was also approved by Microsoft. Afterwards, + Apple designed a new 32bit version of the `kern' table. According + to the documentation, the difference between the 16bit and 32bit + version is only the size of variables in the `kern' header. 
In the + following, we call the original 16bit version `classic' and the + 32bit version `new'. + + 5-2. Versions and dialects which should be differentiated + --------------------------------------------------------- + + The `kern' table consists of a table header and several subtables. + The version number which identifies a `classic' or a `new' version + is explicitly written in the table header, but there are + undocumented differences between Microsoft's and Apple's formats. + Each such variant is called a `dialect' in the following. There are three cases + which should be handled: the new Apple-dialect, the classic + Apple-dialect, and the classic Microsoft-dialect. An analysis of + the formats and the auto detection algorithm of gxvalid is described + in the following. + + 5-2-1. Version detection: classic and new kern + ---------------------------------------------- + + According to the Apple TrueType specification, there are only two + differences between the classic and the new version: + + - The `kern' table header starts with the version number. The classic version starts with 0x0000 (16bit), the new version starts with 0x00010000 (32bit). - - In the "kern" table header, the number of subtables follows to + + - In the `kern' table header, the number of subtables follows the version number. - In the classic version, it is stored in 16bit variable. - In the new version, it is stored in 32bit variable. + In the classic version, it is stored as a 16bit value. + In the new version, it is stored as a 32bit value. From Apple font tool's output (DumpKERN is also tested in addition - to 3 Apple font tools in above), there is another undocumented - difference. In new version, the subtable header includes a 16bit - variable named "tupleIndex" which does not exist in the classic - version. + to the three Apple font tools above), there is another + undocumented difference.
In the new version, the subtable header + includes a 16bit variable named `tupleIndex' which does not exist + in the classic version. - New version can store all subtable formats (0, 1, 2 and 3), but - Apple TrueType specification does not mention about subtable - formats available in classic version. + The new version can store all subtable formats (0, 1, 2, and 3), + but the Apple TrueType specification does not mention the subtable + formats available in the classic version. + 5-2-2. Available subtable formats in classic version + ---------------------------------------------------- - 5-2-2. Avaibale subtable format in classic version - -------------------------------------------------- - Although Apple TrueType specification recommends to use classic - version in the case if the font is designed for both of Apple and - Microsoft platforms, it does not note about the available subtable - formats in classic version. + Although the Apple TrueType specification recommends using the + classic version if the font is designed for both the + Apple and Microsoft platforms, it does not document the available + subtable formats in the classic version. - According to Microsoft TrueType specification, the subtable format - assured for Windows & OS/2 support is only subtable format 0. Also - Microsoft TrueType specification describes the subtable format 2, - but does not mention about which platforms support it. About - subtable format 1, 3 and later are noted as reserved for future - use. Therefore, the classic version can store subtable formats 0 - and 2, at least. ttfdump.exe, a font tool provided by Microsoft - ignores the subtable format written in the subtable header, and - parse as if all subtables are in format 0. + According to the Microsoft TrueType specification, the only + subtable format guaranteed to be supported on Windows and OS/2 is + format 0.
The Microsoft TrueType specification also describes + subtable format 2, but does not mention which platforms support + it. Subtable formats 1, 3, and higher are documented as reserved + for future use. Therefore, the classic version can store subtable + formats 0 and 2, at least. `ttfdump.exe', a font tool provided by + Microsoft, ignores the subtable format written in the subtable + header, and parses the table as if all subtables are in format 0. - kern subtable format 1 uses StateTable, so it cannot be utilized - without GX State Machine. Therefore, it is reasonable to assume - format 1 (and 3) is introduced after Apple have introduced GX and - moved to new 32bit version. + `kern' subtable format 1 uses a StateTable, so it cannot be + utilized without a GX State Machine. Therefore, it is reasonable + to assume that format 1 (and 3) were introduced after Apple had + introduced GX and moved to the new 32bit version. - 5-2-3. Apple and Microsoft dialects - ----------------------------------- - The kern subtable has 16bit "coverage" to describe kerning - attributions, but bit-interpretations by Apple and Microsoft are - reverse ordered: - e.g. Apple-dialect writes subtable format from 0x000F bit range, - Microsoft-dialect writes subtable format from 0x0F00 bit range). + 5-2-3. Apple and Microsoft dialects + ----------------------------------- - In addition, from the outputs of DumpKERN and FontValidator, - Apple's bit-interpretations of coverage in classic and new version - are incompatible. In summary, there are 3 dialects: classic Apple- - dialect, classic Microsoft-dialect, and new Apple-dialect. - The classic Microsoft-dialect and new Apple-dialect are documented - by each vendors' TrueType font specification, but the document for - classic Apple-dialect had been lost.
+ The `kern' subtable has a 16bit `coverage' field to describe + kerning attributes, but bit interpretations by Apple and Microsoft + are different: For example, Apple uses bits 0-7 to identify the + subtable format, while Microsoft uses bits 8-15. + + In addition, according to the output of DumpKERN and + FontValidator, Apple's bit interpretations of coverage in classic + and new version are also incompatible. In summary, there are three + dialects: classic Apple dialect, classic Microsoft dialect, and new + Apple dialect. The classic Microsoft dialect and the new Apple + dialect are documented by each vendor's TrueType font + specification, but the documentation for classic Apple dialect is + not available. - For example, in new Apple-dialect, the bit 0x8000 is documented as - "set to 1 when the kerning is vertical". On the other hand, in - classic Microsoft-dialect, the bit 0x0001 is documented as "set to - 1 when the kerning is horizontal". From the outputs of DumpKERN - and FontValidator, classic Apple-dialect recognizes the bit 0x8000 - as "set to 1 when the kerning is horizontal". From the results of - similar experiments, classic Apple-dialect is ein ndian-reverse of - classic Microsoft-dialect. + For example, in the new Apple dialect, bit 15 is documented as + `set to 1 if the kerning is vertical'. On the other hand, in + classic Microsoft dialect, bit 0 is documented as `set to 1 if the + kerning is horizontal'. From the outputs of DumpKERN and + FontValidator, classic Apple dialect recognizes bit 15 as `set to 1 + when the kerning is horizontal'. From the results of similar + experiments, classic Apple dialect seems to be the Endian reverse + of the classic Microsoft dialect. - It must be noted: no font tool can sense classic Apple-dialect or - classic-Microsoft dialect automatically. + As a conclusion it must be noted that no font tool can identify + classic Apple dialect or classic Microsoft dialect automatically. + + 5-2-4.
gxvalid auto dialect detection algorithm + ----------------------------------------------- + + The first 16 bits of the `kern' table are enough to identify the + version: + + - if the first 16 bits are 0x0000, the `kern' table is in + classic Apple dialect or classic Microsoft dialect + - if the first 16 bits are 0x0001, and next 16 bits are 0x0000, + the `kern' table is in new Apple dialect. + + If the `kern' table is a classic one, the 16bit `coverage' field + is checked next. Firstly, the coverage bits are decoded for the + classic Apple dialect using the following bit masks (this is based + on DumpKERN output): - 5-2-4. gxvalid auto dialect detection algorithm - ----------------------------------------------- - The first 16bit of kern table is enough to sense the version: - - if first 16bit is 0x0000, - kern table is in classic Apple-dialect - or classic Microsoft-dialect, - - if first 16bit is 0x0001, and next 16bit is 0x0000, - kern table is in new Apple-dialect. - If kern table is classic version, 16bit coverage is checked for in - next. For first, the coverage is decoded by classic Apple-dialect - as following (it is based on DumpKERN output): 0x8000: 1=horizontal, 0=vertical 0x4000: not used 0x2000: 1=cross-stream, 0=normal 0x1FF0: reserved 0x000F: subtable format - If any of reserved bits are set or subtable format is - interpreted as 1 or 3, we take it as "impossible in classic - Apple-dialect", and retry by classic Microsoft-dialect. + + If any of the reserved bits are set or the subtable format bits + are interpreted as format 1 or 3, we take it as `impossible in + classic Apple dialect' and retry, using the classic Microsoft + dialect. + The most popular coverage in new Apple-dialect: 0x8000, The most popular coverage in classic Apple-dialect: 0x0000, The most popular coverage in classic Microsoft dialect: 0x0001. - 5-3. Tested fonts - ----------------- - We checked 59 fonts bundled to MacOS which includes kern, and - 38 fonts bundled to Windows which includes kern.
- - fonts bundled to MacOS - * new Apple-dialect + 5-3. Tested fonts + ----------------- + + We checked 59 fonts bundled with MacOS and 38 fonts bundled with + Windows, where all fonts include a `kern' table. + + - fonts bundled with MacOS + * new Apple dialect format 0: 18 format 2: 1 format 3: 1 - * classic Apple-dialect + * classic Apple dialect format 0: 14 - * classic Microsoft-dialect + * classic Microsoft dialect format 0: 15 - - fonts bundled to Windows - * classic Microsoft-dialect + + - fonts bundled with Windows + * classic Microsoft dialect format 0: 38 + It looks strange that classic Microsoft-dialect fonts are bundled to MacOS: they come from MSIE for MacOS, except of MarkerFelt.dfont. ACKNOWLEDGEMENT --------------- - Some part of gxvalid is derived from both gxlayout module and otvalid - module. Development of gxlayout was support of Information-technology - Promotion Agency(IPA), Japan. - The detailed analysis of undefined glyph ID utilization in mort, morx - is provided by George Williams. + Some parts of gxvalid are derived from both the `gxlayout' module and + the `otvalid' module. Development of gxlayout was supported by the + Information-technology Promotion Agency(IPA), Japan. + + The detailed analysis of undefined glyph ID utilization in `mort' and + `morx' tables is provided by George Williams. ------------------------------------------------------------------------ @@ -437,10 +522,10 @@ Copyright 2004, 2005 by suzuki toshiya, Masatake YAMATO, Red hat K.K., David Turner, Robert Wilhelm, and Werner Lemberg. -This file is part of the FreeType project, and may only be used, -modified, and distributed under the terms of the FreeType project -license, LICENSE.TXT. By continuing to use, modify, or distribute this -file you indicate that you have read the license and understand and +This file is part of the FreeType project, and may only be used, +modified, and distributed under the terms of the FreeType project +license, LICENSE.TXT.
By continuing to use, modify, or distribute this +file you indicate that you have read the license and understand and accept it fully.
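
Illustration of the dialect detection described in section 5-2-4 of the revised README. This is a minimal sketch, not the gxvalid implementation (gxvalid itself is written in C). The function and constant names below are hypothetical and chosen for this example only. The bit masks are the ones the README lists for the classic Apple dialect, and the sketch assumes the table is read as big-endian data.

```python
import struct

# Bit masks for the classic Apple dialect, as listed in section 5-2-4
# (the README derives them from DumpKERN output).
CLASSIC_APPLE_RESERVED = 0x1FF0
CLASSIC_APPLE_FORMAT = 0x000F


def detect_kern_version(table: bytes) -> str:
    """Return 'classic' or 'new' from the first 32 bits of a `kern' table.

    Hypothetical helper: the name and return values are assumptions.
    """
    first, second = struct.unpack(">HH", table[:4])
    if first == 0x0000:
        return "classic"
    if first == 0x0001 and second == 0x0000:
        return "new"
    raise ValueError("unknown `kern' table version")


def guess_classic_dialect(coverage: int) -> str:
    """Guess the dialect of a classic subtable from its 16bit coverage.

    The coverage is first decoded as classic Apple dialect.  If any
    reserved bit is set, or the format field reads as 1 or 3, that
    reading is taken as impossible and the Microsoft dialect is assumed.
    """
    fmt = coverage & CLASSIC_APPLE_FORMAT
    if (coverage & CLASSIC_APPLE_RESERVED) or fmt in (1, 3):
        return "classic Microsoft"
    return "classic Apple"
```

With the most popular coverage values quoted in the README, 0x0000 is guessed as classic Apple dialect and 0x0001 as classic Microsoft dialect. A real validator would go on to validate the subtable under the guessed dialect; this sketch stops at the guess.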