Writing Font Feature Code in the FEZ Language

Now that we’ve seen what the FEZ language can do for us, let’s look in depth at how to write our rules in it.

Verbs and arguments

FEZ statements consist of a verb followed by arguments and terminated with a semicolon (;). Each verb defines its own set of arguments, and this means that there is considerable flexibility in how the arguments appear. For example, the Anchors statement takes a glyph name, followed by an open curly brace ({), a series of anchor name / anchor position pairs, and a close curly brace (}) - but no other verb has this pattern. (Plugin writers are encouraged to keep the argument syntax simple and intuitive.)

In general, the amount and type of whitespace is not significant so long as arguments are separated unambiguously, and comments may be inserted between statements in the usual form (# ignored to end of line).

Here is a simple FEZ file:

DefineClass @comma = [uni060C uni061B];
Feature ss08 {
  Substitute @comma -> @comma.alt;
};

DefineClass, Feature and Substitute are all verbs, and each has its own argument pattern: DefineClass takes a class name, an equals sign and then a glyph selector (which we’ll meet soon); Feature takes a feature name, an open curly brace, some statements and a closing curly brace; Substitute takes a number of glyph selectors, an arrow, and a number of glyph selectors; and so on.

The meaning of the above code should be fairly obvious to users of Adobe feature syntax, but some features (particularly the use of “dot suffixing” to create a synthetic class) will be unfamiliar. Here’s an AFDKO translation:

@comma = [uni060C uni061B];

feature ss08 {
  sub @comma by [uni060C.alt uni061B.alt];
} ss08;

We’ll begin by discussing how to address a set of glyphs within the FEZ language before moving on to the verbs that are available.

Glyph Selectors

A glyph selector is a way of specifying a set of glyphs in the FEZ language. There are various forms of glyph selector:

  • A single glyph name: a

  • A class name: @lc

  • An inline class: [a e i o u] (Inline classes may also contain class names, but may not contain ranges.)

  • A regular expression: /\.sc$/. All glyph names in the font which match the expression will be selected. (This is another reason why FEZ needs the font beforehand.)

  • A Unicode codepoint: U+1234 (This will be mapped to a glyph in the font. If no glyph exists, an error is raised.)

  • A range of Unicode points: U+30=>U+39, described below.

The final form of glyph selector, the Unicode range selector (=>), matches all glyphs between two Unicode codepoints. For example, digits in a font usually have the standard glyph names zero, one, two, three and so on. It’s a common task to refer to all the digits, but at the same time, it’s a pain to have to list them out by hand, and there isn’t a way to do it with a regular expression glyph selector. But they do have contiguous Unicode codepoints - zero is codepoint U+0030 and nine has codepoint U+0039 - so it’s easy enough to select them with a Unicode range selector:

DefineClass @digits = U+0030=>U+0039;

Using Unicode codepoints instead of glyph names helps to make your rules “portable” between different fonts targeting the same scripts.
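To make the range selector concrete, here is a minimal Python sketch of the lookup it implies. It is an illustration of the idea rather than FEZ's own code: the cmap dictionary and the unicode_range helper are hypothetical names, and skipping codepoints that have no glyph is an assumption, since the text above does not say how a range treats them.

```python
# Hypothetical character map: Unicode codepoint -> glyph name.
DIGIT_NAMES = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]
cmap = {0x30 + i: name for i, name in enumerate(DIGIT_NAMES)}
cmap[0x41] = "A"


def unicode_range(cmap, start, end):
    """Return the glyph names mapped from codepoints start..end inclusive."""
    # Assumption: codepoints with no glyph in the font are silently skipped.
    return [cmap[cp] for cp in range(start, end + 1) if cp in cmap]


# Mirrors "DefineClass @digits = U+0030=>U+0039;"
digits = unicode_range(cmap, 0x0030, 0x0039)
print(digits)
```

Because the selection is driven by codepoints, the same call would work for a font that names its digits differently, which is the portability benefit described above.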

Any of these glyph selector forms may be followed by zero or more suffixing operations or desuffixing operations. A suffixing operation begins with a period, and adds the period plus its argument to the end of any glyph names matched by the selector. In the example above, @comma.alt refers to the glyphs in @comma with the suffix .alt added; the glyphs in @comma are uni060C and uni061B, so the glyphs in @comma.alt are uni060C.alt and uni061B.alt.

A desuffixing operation, on the other hand, begins with a tilde, and removes the dot-suffix from the name of all glyphs matched by the selector. So if you happened to have a glyph class @comma_alternates consisting of uni060C.alt uni061B.alt, then @comma_alternates~alt would take the .alt off each glyph name, and refer to glyphs uni060C uni061B.

Desuffixing operations combine very nicely with regular expression glyph selectors. Above, we saw the example /\.sc$/ which finds all glyphs which end with the suffix .sc. Let’s say this resolves to [a.sc b.sc c.sc]. If you then desuffix that glyph selector - /\.sc$/~sc - you get the bare, non-small-caps forms of those glyphs: [a b c]. In other words, these are the glyphs in the font which have a corresponding small-caps form. Now you don’t have to keep track of which glyphs you have small-caps forms of; you can select the list of small caps glyphs, and turn that back into the unsuffixed form:

Substitute /\.sc$/~sc -> /\.sc$/;
# Equivalent to "Substitute [a b c ...] -> [a.sc b.sc c.sc ...];"
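The suffixing and desuffixing operations can be pictured as plain string manipulation on glyph names. The following Python sketch shows that reading; add_suffix, remove_suffix and the font_glyphs list are hypothetical, and passing through names that lack the suffix is an assumption rather than documented FEZ behaviour.

```python
import re


def add_suffix(glyphs, suffix):
    """Suffixing (.suffix): append a period plus the suffix to each name."""
    return [f"{g}.{suffix}" for g in glyphs]


def remove_suffix(glyphs, suffix):
    """Desuffixing (~suffix): strip a trailing '.suffix' from each name."""
    tail = "." + suffix
    # Assumption: names without the suffix are returned unchanged.
    return [g[: -len(tail)] if g.endswith(tail) else g for g in glyphs]


# Hypothetical glyph set for a small font.
font_glyphs = ["a", "b", "c", "d", "a.sc", "b.sc", "c.sc"]

# /\.sc$/ selects every glyph name ending in ".sc" ...
small_caps = [g for g in font_glyphs if re.search(r"\.sc$", g)]
# ... and /\.sc$/~sc turns those back into the bare names.
bare = remove_suffix(small_caps, "sc")

# @comma.alt from the first example in this chapter.
comma_alt = add_suffix(["uni060C", "uni061B"], "alt")
```

Here bare and small_caps line up pairwise, which is exactly what the Substitute rule above relies on.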

Standard verbs

In FEZ, all verbs are provided by Python plugins. There are no “built-in” verbs. However, the following plugins are automatically loaded and their verbs are always available.

Class Definitions

To define a named glyph class in the FEZ language, use the DefineClass verb. This takes three arguments: the first is a class name, which must start with the @ character; the second is the symbol =; the third is a glyph selector as described above:

# Create a class which consists of the members of @upper, with the .alt
# suffix added to each glyph.
DefineClass @upper_alts = @upper.alt;

# Create a class of all glyphs matching the regex /^[a-z]$/ - i.e.
# single character lowercase names "a", "b", "c" ... "z"
DefineClass @lower = /^[a-z]$/;

# Create a class of the named uppercase glyphs, plus the contents of
# @lower.
DefineClass @upper_and_lower = [A B C D E F G @lower];

In addition, glyph classes can be combined within the DefineClass statement using the union (|), intersection (&) and subtraction (-) operators.

The | operator combines two classes together:

# Equivalent to [@lower_marks @upper_marks]
DefineClass @all_marks = @lower_marks | @upper_marks;

Whereas the & operator returns only glyphs which are common to both classes:

DefineClass @uppercase_vowels = @uppercase & @vowels;

The - returns the glyphs in the first class which are not in the second class:

DefineClass @ABCD = A | B | C | D;

# Everything in @ABCD apart from D (i.e. A, B, C)
DefineClass @ABC = @ABCD - D;
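The three operators behave like ordinary set operations. The Python sketch below models them on ordered lists of glyph names; the helper names are hypothetical, and preserving order while dropping duplicates is an assumption, since the text above does not specify either.

```python
def union(a, b):
    """The | operator: everything in a, then anything in b not already seen."""
    return a + [g for g in b if g not in a]


def intersection(a, b):
    """The & operator: only glyphs which are common to both classes."""
    return [g for g in a if g in b]


def subtraction(a, b):
    """The - operator: glyphs in a which are not in b."""
    return [g for g in a if g not in b]


# DefineClass @ABCD = A | B | C | D;
abcd = union(union(union(["A"], ["B"]), ["C"]), ["D"])

# DefineClass @ABC = @ABCD - D;
abc = subtraction(abcd, ["D"])

# DefineClass @uppercase_vowels = @uppercase & @vowels;
# (hypothetical class contents)
uppercase = ["A", "B", "C", "D", "E"]
vowels = ["a", "e", "A", "E"]
uppercase_vowels = intersection(uppercase, vowels)
```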

Finally, within the context of a class definition, glyphs can also be selected based on certain predicates, which test the glyphs for various properties:

# All glyphs which start with the letters BE and whose advance width is
# less than 200 units.
DefineClass @short_behs = /^BE/ & width < 200;

There are a number of metric predicates:

  • width (advance width)

  • lsb (left side bearing)

  • rsb (right side bearing)

  • xMin (minimum X coordinate)

  • xMax (maximum X coordinate)

  • yMin (minimum Y coordinate)

  • yMax (maximum Y coordinate)

  • rise (difference in Y coordinate between cursive entry and exit)

  • fullwidth (xMax-xMin)

These predicates are followed by a comparison operator (>=, <=, =, <, or >) and then an integer. So:

DefineClass @overhands = rsb < 0;

Alternatively, instead of an integer, you may supply a metric name followed by the name of a single glyph in parentheses. For example, the following definition selects all members of the glyph class @alpha whose advance width is less than the advance width of the space glyph:

DefineClass @shorter_than_space = @alpha & width < width(space);
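A metric predicate amounts to a filter over a table of glyph metrics. This Python sketch shows one way to think about it, under the assumption that the metrics are already available as a dictionary; the widths values, the COMPARATORS table and the select helper are all hypothetical.

```python
import operator

# The comparison operators listed above ("=" tests equality).
COMPARATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "=": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}

# Hypothetical advance widths, in font units.
widths = {"space": 250, "i": 200, "l": 220, "m": 800, "w": 750}


def select(glyphs, metrics, comparator, value):
    """Keep the glyphs whose metric satisfies `metric <comparator> value`."""
    compare = COMPARATORS[comparator]
    return [g for g in glyphs if compare(metrics[g], value)]


alpha = ["i", "l", "m", "w"]

# @alpha & width < width(space): the right-hand side is looked up first,
# then used like any other integer.
shorter_than_space = select(alpha, widths, "<", widths["space"])
```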

As well as testing for glyph metrics, the following other predicates are available:

  • hasglyph(regex string)

This is true for all glyphs where, if you take the glyph’s name, and replace the regular expression with the given string, you get the name of another glyph in the font. For example:

DefineClass @small_capable = hasglyph(/$/ .sc);

So we look at the glyph A, for example, and test “If I replace the end of this glyph’s name with .sc - i.e. A.sc - do I get the name of a valid glyph?” If so, then A is small-cap-able and goes into our class. Next we look at B, and so on.

DefineClass @localizable_digits = @digits & hasglyph(/-arab/ "-farsi");

I have one-arab in my @digits class, but when I replace -arab with -farsi yielding one-farsi, I don’t see that in my font; so one-arab is not a localizable digit. But when I replace the -arab in four-arab to get four-farsi, I do see that in my font, so four-arab is a localizable digit.

  • hasanchor(anchorname)

This predicate is true if the glyph has the given anchor in the font source. (You will need to use either the LoadAnchors or Anchors verb before using this predicate!) Example:

DefineClass @topmarks = hasanchor(_top);

  • category(categoryname)

This predicate is true if the glyph has the given category in the font source. The category is expected to be base, mark or ligature.

DefineClass @stackable_marks = category(mark) & (hasanchor(_bottom) | hasanchor(_top));

This defines @stackable_marks to be all mark glyphs with either a _bottom or _top anchor.
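Of the predicates above, hasglyph is the least obvious, so here is a rough Python model of the test it describes: rewrite each glyph name with a regular expression and check whether the result names a glyph in the font. The hasglyph function and the font_glyphs list below are hypothetical, and this is a sketch of the described behaviour rather than the plugin's actual code.

```python
import re

# Hypothetical glyph set.
font_glyphs = ["A", "B", "A.sc", "one-arab", "four-arab", "four-farsi"]


def hasglyph(glyphs, font_glyphs, pattern, replacement):
    """Keep glyphs whose rewritten name is another glyph in the font."""
    font = set(font_glyphs)
    selected = []
    for name in glyphs:
        candidate = re.sub(pattern, replacement, name)
        # Assumption: a name the pattern leaves unchanged does not count
        # as "another glyph".
        if candidate != name and candidate in font:
            selected.append(name)
    return selected


# DefineClass @small_capable = hasglyph(/$/ .sc);
small_capable = hasglyph(font_glyphs, font_glyphs, r"$", ".sc")

# DefineClass @localizable_digits = @digits & hasglyph(/-arab/ "-farsi");
digits = ["one-arab", "four-arab"]
localizable_digits = hasglyph(digits, font_glyphs, r"-arab", "-farsi")
```

With this glyph set, only A has a small-caps partner and only four-arab has a Farsi partner, matching the walk-through above.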

Experience has shown that with smart enough class definitions, you can get away with pretty dumb rules.

Binned Definitions

Sometimes it is useful to split up a large glyph class into a number of smaller classes according to some metric, in order to treat them differently. For example, when performing an i-matra substitution in Devanagari, you would generally want to split your base glyphs by width, and apply the appropriate matra for each set of glyphs. FEZ calls the operation of organising glyphs into groups of similar metrics “binning”.

The ClassDefinition plugin also provides the DefineClassBinned verb, which generates a set of related glyph classes. The arguments of DefineClassBinned are identical to those of DefineClass, except that after the class name you must specify an open square bracket, one of the metrics listed above to be used to bin the glyphs, a comma, the number of bins to create, and a close bracket, like so:

DefineClassBinned @bases[width,5] = @bases;

This will create five classes, called @bases_width1 .. @bases_width5, grouped in increasing order of advance width.

Note that the size of the bins is not guaranteed to be equal, but bins are “smart”: glyphs are clustered according to the similarity of their metric. For example, if the advance widths are 99, 100, 110, 120, 500, and 510 and two bins are created, one bin will contain four glyphs (those with widths 99-120) and the other bin will contain two glyphs (those with widths 500-510).

(This is just an example for the purpose of explaining binning. We’ll show a better way to handle the i-matra question later.)
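One simple way to get this kind of clustering is to sort the glyphs by the metric and cut the list at its widest gaps. The Python sketch below uses that largest-gap strategy; it is not necessarily the algorithm FEZ uses, and the function name and glyph names are hypothetical. It does reproduce the six-width example given above.

```python
def bin_by_metric(metrics, bins):
    """Split glyphs into `bins` groups by cutting at the widest metric gaps."""
    ordered = sorted(metrics, key=metrics.get)
    if not 1 <= bins <= len(ordered):
        raise ValueError("bins must be between 1 and the number of glyphs")
    # Size of the gap before position i, paired with i (where a cut would go).
    gaps = [
        (metrics[ordered[i]] - metrics[ordered[i - 1]], i)
        for i in range(1, len(ordered))
    ]
    # Cut at the (bins - 1) widest gaps, then restore left-to-right order.
    cuts = sorted(i for _, i in sorted(gaps, reverse=True)[: bins - 1])
    groups, start = [], 0
    for cut in cuts + [len(ordered)]:
        groups.append(ordered[start:cut])
        start = cut
    return groups


# Hypothetical glyphs with the advance widths from the example above.
widths = {"g1": 99, "g2": 100, "g3": 110, "g4": 120, "g5": 500, "g6": 510}

# DefineClassBinned @bases[width,2] would yield @bases_width1, @bases_width2.
classes = {
    f"@bases_width{n}": group
    for n, group in enumerate(bin_by_metric(widths, 2), start=1)
}
```

The bins come out in increasing order of the metric, so the first class holds the narrowest glyphs.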

Glyph Class Debugging

The combination of the above rules allows for extreme flexibility in creating glyph classes, to the extent that it may become difficult to understand the final composition of glyph classes! To alleviate this, the verb ShowClass will take any glyph selector and display its contents on standard error.

Variables

FEZ allows you to give a name to any number or string in your feature file. These names, called variables, begin with a dollar sign, and help you to understand a rule more easily by acting as a form of documentation. They also allow you to place all the “magic numbers” together in your file so that they can be more easily tweaked later.

So instead of saying:

DefineClass @tall_bases = @bases & yMax > 750;

# ...

Position @tall_bases (@top_marks <yPlacement=+150>);

you can say:

Set $tall_base_height = 750;
Set $tall_base_mark_adjustment = 150;

DefineClass @tall_bases = @bases & yMax > $tall_base_height;

# ...

Position @tall_bases (@top_marks <yPlacement=$tall_base_mark_adjustment>);

You can also store a single glyph name in a variable and use it as a glyph selector:

Set $virama = "dvVirama";

Substitute $virama @consonants -> @consonants.conjunct;

This is most helpful when combined with the For loop.

Features

To group a set of rules into a feature, use the Feature verb. This takes a name and a block containing rules:

Feature rlig {
    ...
};

Note that in FEZ syntax you must not repeat the feature name at the end of the block, as is required in AFDKO syntax.

For some features, like the Stylistic Sets (ss01-ss20), you can specify a FeatureName. If supported by the software it’s displayed to the user:

Feature ss01 {
    FeatureName "Half-width hiragana";
    ....
};

LoadPlugin

The LoadPlugin verb is the means by which you can load additional plugins, both those you write yourself, and the Optional plugins which we will meet later.

For one of the plugins which comes with FEZ, you can just name it:

LoadPlugin LigatureFinder;
# Now the LigatureFinder verb is available!

For a plugin you write yourself, you will need to place it into a Python package (this basically means “creating a subdirectory and adding an empty file in it called __init__.py as well as your plugin file”) and give the full module name. For example, if you’re developing fonts for Telugu, you might create a package called “TeluguTools” which holds all your plugin files. Suppose one of those plugins is Conjuncts. You would place a file TeluguTools/Conjuncts.py somewhere in your Python load path, and then say:

LoadPlugin TeluguTools.Conjuncts;

A plugin may make available one or more verbs, so you need to read the plugin’s documentation to know which verbs are available.

Routine

To group a set of rules into a routine, use the Routine verb. This takes an optional name and a block containing rules:

Routine {
    ...
};

Note that in FEZ syntax you must not repeat the routine name at the end of the block, as is required in AFDKO syntax. Instead, any routine flags are added to the end of the block. These may be any combination of RightToLeft, IgnoreBases (AFDKO users, note the changed name), IgnoreLigatures, IgnoreMarks, and UseMarkFilteringSet followed by a glyph selector.

For example, in AFDKO:

lookup test {
    lookupflag IgnoreBaseGlyphs UseMarkFilteringSet @thing;
    # ... rules ...
} test;

becomes:

Routine test {
    # ... rules ...
} IgnoreBases UseMarkFilteringSet @thing;

As in AFDKO, a Routine may appear within a Feature block or outside one, in which case it defines a named routine to be accessed later. In simple cases, you do not need to wrap rules in a routine inside of a feature block; however, to combine rules with different flags, you must place the rules within a routine.

FEZ routines do not always correspond directly to OpenType lookups, although in many cases they will. FEZ routines are more flexible: they may contain a mixture of rule types, and may even contain rules targeting different languages:

Routine test {
    Substitute [four-arab five-arab] -> [four-urdu five-urdu] <<arab/URD>>;
    Substitute [four-arab five-arab] -> [four-farsi five-farsi] <<arab/FAR>>;
};

FEZ will resolve these routines into one or more OpenType lookups and alter the lookup references inside features accordingly when compiling to AFDKO syntax.

FEZ routines themselves may apply to certain script/language combinations, using the language syntax:

Routine test {
    Substitute [four-arab five-arab] -> [four-urdu five-urdu];
} <<arab/URD>>;

As with AFDKO, this syntax can only be used when inside a feature block.

Substitute

Substitution rules are created using the Substitute verb. There are two forms of this verb:

  • a simple substitution, which simply has a number of glyph selectors on each side of an arrow (->)

  • a contextual substitution, which wraps the main glyphs to be substituted in parentheses, and optionally surrounds them with prefix and/or suffix glyphs.

Examples:

Substitute f i -> f_i;

Substitute [CH_YEu1 BEu1] ( NUNu1 ) -> NUNf2;

Within the right hand side of a Substitute operation, you may use backreferences as glyph selectors to refer to glyph selectors in equivalent positions on the left hand side. For example, the following rule:

Substitute [a e i o u] comma -> $1;

is equivalent to:

Substitute [a e i o u] comma -> [a e i o u];
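Backreference resolution can be pictured as a simple positional lookup. The Python sketch below models a rule as two lists of selectors and swaps each $n on the right for the n-th selector on the left; the function name and the data representation are hypothetical and only illustrate the equivalence shown above.

```python
def resolve_backreferences(left, right):
    """Replace each '$n' on the right with the n-th (1-based) left selector."""
    resolved = []
    for selector in right:
        is_backref = (
            isinstance(selector, str)
            and selector.startswith("$")
            and selector[1:].isdigit()
        )
        if is_backref:
            resolved.append(left[int(selector[1:]) - 1])
        else:
            # Ordinary selectors are kept as they are.
            resolved.append(selector)
    return resolved


# Substitute [a e i o u] comma -> $1;
left = [["a", "e", "i", "o", "u"], ["comma"]]
right = ["$1"]
resolved = resolve_backreferences(left, right)
```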

The ReverseSubstitute verb is equivalent but creates reverse chaining substitution rules.

Substitution rules, as with any “basic” (substitute/position/attach/chain) rule, can be optionally followed by a list of script/language pairs in double angle brackets:

Substitute @letter (semicolon) -> space semicolon <<latn/FRA latn/DEU>>;

Position

Positioning rules are created using the Position verb. There are two forms of this verb:

  • a simple positioning, which simply has one or more glyph selectors each optionally followed by a value record.

  • a contextual positioning, which wraps the main glyphs and value records in parentheses, and optionally surrounds them with prefix and/or suffix glyphs.

A value record can be specified in three ways: as a bare integer, in which case it represents an X advance adjustment; as a tuple of four integers surrounded by angle brackets, representing X position, Y position, X advance and Y advance; or as a dictionary-like structure surrounded by angle brackets, taking the form:

'<' ( ("xAdvance"| "xPlacement" | "yAdvance" | "yPlacement") '=' integer)+ '>'

Here are examples of each form of the positioning verb:

# Above nuktas followed by GAF or KAF glyphs should drop down
# and to the right
Position @above_nuktas <30 -70 0 0> /^[KG]AF/;

# Initial forms will get more space if they have consecutive dotted glyphs
# and appear after a word-final glyph.
Position @endofword ( @inits 200 ) @below_dots @medis @below_dots;
# Equivalent to AFDKO:
#   pos @endofword @inits' 200 @below_dots @medis @below_dots;

# Move marks back and up.
Position @marks <xPlacement=-50 yPlacement=10>;
# Equivalent to AFDKO:
#   pos @marks <-50 10 0 0>;
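All three ways of writing a value record describe the same four fields. The Python sketch below normalizes each form into one dictionary, which makes the equivalences in the comments above easy to check; normalize_value_record and FIELDS are hypothetical names, and the sketch models the notation only, not FEZ's parser.

```python
# Field order of the four-integer form: X position, Y position,
# X advance, Y advance.
FIELDS = ("xPlacement", "yPlacement", "xAdvance", "yAdvance")


def normalize_value_record(value):
    """Turn any of the three value record forms into a full dictionary."""
    record = dict.fromkeys(FIELDS, 0)
    if isinstance(value, int):
        # A bare integer is an X advance adjustment.
        record["xAdvance"] = value
    elif isinstance(value, dict):
        unknown = set(value) - set(FIELDS)
        if unknown:
            raise ValueError(f"unknown value record fields: {sorted(unknown)}")
        record.update(value)
    else:
        if len(value) != 4:
            raise ValueError("tuple form needs exactly four integers")
        record.update(zip(FIELDS, value))
    return record


bare = normalize_value_record(200)                    # ( @inits 200 )
four = normalize_value_record((30, -70, 0, 0))        # <30 -70 0 0>
named = normalize_value_record({"xPlacement": -50, "yPlacement": 10})
```

The named form and the tuple (-50, 10, 0, 0) normalize to the same record, which is the equivalence stated in the last example above.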

Positioning rules, as with any “basic” (substitute/position/attach/chain) rule, can be optionally followed by a list of script/language pairs in double angle brackets.

Chain

Chaining rules are created using the Chain verb. Lookups are differentiated from glyph selectors by prepending a ^.

Examples:

Chain glyph1 ^lookup1 glyph2 ^lookup2;
Chain pre ( glyph1 ^lookup1,^lookup2 glyph2 glyph3 ^lookup3 ) post;

These correspond to the AFDKO syntax:

sub glyph1' lookup lookup1 glyph2' lookup lookup2;
pos pre glyph1' lookup lookup1 lookup lookup2 glyph2' glyph3' lookup lookup3 post;

Whether the rules are AFDKO sub or pos rules is resolved by examining the rules within the referenced lookups.

Chaining rules, as with any “basic” (substitute/position/attach/chain) rule, can be optionally followed by a list of script/language pairs in double angle brackets.

Anchor Management

The Anchors plugin provides the Anchors, LoadAnchors, Attach and PropagateAnchors verbs.

Anchors takes a glyph name followed by anchor names and positions, like so:

Anchors A top <679 1600> bottom <691 0>;

Note that there are no semicolons between anchors. The same thing happens for mark glyphs:

Anchors acutecomb _top <-570 1290>;

If you don’t want to define these anchors manually but instead are dealing with a source font file which contains anchor declarations, you can load the anchors automatically from the font by using the LoadAnchors verb.

Once all your anchors are defined, the Attach verb can be used to attach marks to bases:

Feature mark { Attach &top &_top bases; };

The Attach verb takes three parameters: a base anchor name, a mark anchor name, and a class filter, which is either marks, bases, cursive or a glyph selector.

The verb acts by collecting all the glyphs which have anchors defined, and filtering them according to their class definition in the GDEF table. In this case, we have asked for bases, so glyph A will be selected. Then it looks for anchor definitions containing the mark anchor name (here _top), which will select acutecomb, and writes an attachment rule to tie them together. As shown in the example, this is the most efficient way of expressing a mark-to-base feature.

This is equivalent to the AFDKO syntax:

markClass acutecomb <-570 1290> @topmarks;
pos base A <679 1600> @topmarks;
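The selection logic described above can be sketched in a few lines of Python. This is a simplified model, not the plugin's code: the anchors and categories dictionaries are hypothetical stand-ins for the anchor data and GDEF classes, the attach function is invented for illustration, and the cursive filter is left out.

```python
# Hypothetical anchor data, as recorded by the Anchors statements above.
anchors = {
    "A": {"top": (679, 1600), "bottom": (691, 0)},
    "acutecomb": {"_top": (-570, 1290)},
}

# Hypothetical GDEF-style glyph categories.
categories = {"A": "base", "acutecomb": "mark"}


def attach(anchors, categories, base_anchor, mark_anchor, wanted_category):
    """Collect the two sides of an attachment rule."""
    # Glyphs of the wanted category which carry the base anchor ...
    bases = {
        glyph: points[base_anchor]
        for glyph, points in anchors.items()
        if base_anchor in points and categories.get(glyph) == wanted_category
    }
    # ... and every glyph which carries the mark anchor.
    marks = {
        glyph: points[mark_anchor]
        for glyph, points in anchors.items()
        if mark_anchor in points
    }
    return bases, marks


# Attach &top &_top bases;
bases, marks = attach(anchors, categories, "top", "_top", "base")
```

For this data the result pairs A at (679, 1600) with acutecomb at (-570, 1290), the same two positions that appear in the AFDKO rule above.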

Writing a mark-to-mark feature is similar; you just need to define a corresponding anchor on the mark, and use the marks class filter instead of the bases filter:

Anchors acutecomb _top <-570 1290> top <-570 1650>;
Feature mkmk { Attach &top &_top marks; };

Writing a cursive attachment feature can be done by defining entry and exit anchors, and using an Attach statement like the following:

Feature curs {
    Routine { Attach &entry &exit cursive; } IgnoreMarks;
};

Conditional

Rules can be applied conditionally using the If statement. It is most useful in combination with variables, described above.

Examples:

If $dosub {
    Substitute a -> b;
};

Including Other Files

FEZ files may include other files; to load another file, use the Include verb:

Include anchors.fez;