Starred by 42 users

Issue metadata

Status: Fixed
Owner: ----
Closed: May 2014
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 3
Type: Feature




Do ligature substitution on web content

Reported by cryptooc...@gmail.com, Sep 18 2009

Issue description

Chrome Version       : 4.0.212.0 (26435)
OS + version : Linux 
CPU architecture (32-bit / 64-bit): 64-bit
window manager : Gnome

Behavior in Firefox 3.x, 
What is the expected result?:

Firefox automatically substitutes certain character combinations
by their ligatures. For example "fl" is automatically rendered
as the fl ligature of the font (fi->fi-ligature, ffl, ffi, ...).

What happens instead?

No ligatures are rendered with Chrome.

Please provide any additional information below. Attach a screenshot
and backtrace if possible.

I have attached two screenshots from the Atlantic website.
Notice how all ligatures are nicely rendered in Firefox;
the letter-spacing is also much better in general.
 
no_ligatures_chrome.png (587 KB)
ligatures_firefox.png (613 KB)

Comment 1 by mar...@chromium.org, Sep 18 2009

Labels: -Area-Misc Area-BrowserUI
Please attach smaller and clipped screen shots.
<octoploid> Search line 6 of the article for >>battlefield<<...
That was clearly not obvious to me.

Ok, new attempt. Watch out for »fi« of »battlefield« in the middle
of each picture.
ligatures_firefox_.png (5.9 KB)
no_ligatures_chrome_.png (5.7 KB)

Comment 3 by evan@chromium.org, Oct 9 2009

Labels: -Area-BrowserUI -Size-Medium -Type-Bug Area-WebKit Pri-3 Type-Feature
Status: Available
Summary: Do ligature substitution on web content
Not sure why sky was set as the owner -- this will be tricky to fix.
Since Behdad Esfahbod is currently redesigning HarfBuzz and because
Chromium is using it directly to render fonts, it would be wise to
simply wait until he's ready (sometime by the end of this year).
Using this new HarfBuzz release should fix this issue.

Comment 5 by karen@chromium.org, Oct 19 2009

Labels: Mstone-X
 

Comment 6 by ver...@gmail.com, Nov 2 2009

See also  bug #26487 .

Comment 7 by ver...@gmail.com, Nov 2 2009

Note that  bug #26487  is NOT a feature/enhancement request, it is really a Unicode 
conformance bug of Chrome (independent of HarfBuzz), that does not occur in IE8 or 
Firefox 3.5.

This bug page here is a feature request for implementing support for ligatures, and is 
separate. As long as ligature support is not there, there is still the need to correct 
 bug #26487 , and make sure that it will not be exhibited when ligature support is 
later integrated in HarfBuzz. 

Comment 8 by ver...@gmail.com, Nov 6 2009

Note that Unicode includes a few preencoded ligatures for compatibility with previous 
standards. These compatibility characters may have their own mappings within fonts.
How can you cope with them?

For example consider U+FB01 (LATIN SMALL LIGATURE FI) which has compatibility 
decomposition as <ligat>U+0066,U+0069.
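
The decomposition mentioned above can be checked with Python's standard `unicodedata` module. This is only a minimal sketch: the helper name `describe_ligature` is made up for illustration, and note that the Unicode Character Database tags this mapping as `<compat>`.

```python
import unicodedata


def describe_ligature(ch: str) -> dict:
    """Return the name, raw decomposition mapping and NFKD expansion of ch."""
    return {
        "name": unicodedata.name(ch),
        # Raw decomposition field from the UCD, including its tag.
        "decomposition": unicodedata.decomposition(ch),
        # Compatibility normalization replaces the ligature by its letters.
        "expanded": unicodedata.normalize("NFKD", ch),
    }


info = describe_ligature("\ufb01")
print(info)
```

Compatibility normalization only goes from the ligature to its letters; nothing in Unicode normalization composes "f" + "i" back into U+FB01, which is why the substitution has to happen at the glyph level.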

The font maps for example:
 - U+0066 to glyph id 0x0044, and
 - U+0069 to glyph id 0x0047
 - U+FB01 to glyph id 0xF8FE,

The Unicode properties imply that the font predefines (implicitly) this 
substitution rule for glyphs:

SUBST: <0x0044,0x0047> to <0xF8FE>

This rule just needs to be appended at end of the substitution rules defined in the 
font within its optional ligature feature. The font may still define its own 
preferred substitution for the pair <0x0044,0x0047>, which will take precedence.

The rule that takes precedence will work only when the ligatures are enabled, but if 
not, the font feature will be ignored, and the existing mapping of the compatibility 
character U+FB01 to font's glyph id 0xF8FE will still work (and U+FB01 will still be 
displayed with the assigned glyph). This will preserve the compatibility with Unicode 
compatibility characters for ligatures, while still enabling fonts that forget to 
define the substitution rule implied by Unicode properties to display the "fi" 
ligature appropriately when the input text contains the two letters "f" and "i" and 
when ligatures rendering is enabled.

Fonts can add their own sets of ligatures that have no distinct encoding within 
Unicode. These ligatures are under the control of the style defined in the ligature 
rendering mode, and of the ZWJ and ZWNJ format controls (when they are encoded in the 
text), and of the "visible controls" mode when it is also enabled (to disable the 
interpretation of ZWJ and ZWNJ as controls, but enable them to work as visible 
characters and be displayed with the glyphs assigned to them in the font)

Comment 9 by ver...@gmail.com, Nov 6 2009

Other sources of known ligatures can also be found within Adobe documentation, related 
to standard glyph names. When a font exposes glyph names, they can be parsed to see 
if they match the documented syntax for implying a ligature. These names are also 
implicitly taken as a valid source of ligatures that will be enabled when the ligature 
rendering mode is enabled. These ligatures have lower precedence than ligatures implied 
from the <ligat> font feature, but higher precedence than the few ligatures mappings 
implied by Unicode compatibility decomposition mappings (found in the UCD). 
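
The precedence order proposed in comments 8 and 9 (the font's own ligature feature first, then glyph names, then Unicode compatibility mappings) could be sketched roughly as follows. This is a toy model, not how any real shaper stores its rules: the glyph ids come from the example in comment 8, while `build_rules`, `shape` and the pair-only lookup are assumptions made for illustration.

```python
# Glyph ids from the example in comment 8 (hypothetical font).
CMAP = {"f": 0x0044, "i": 0x0047, "\ufb01": 0xF8FE}


def build_rules(font_liga, glyph_name_rules, unicode_compat_rules):
    """Merge ligature rules; sources listed earlier win on conflicts."""
    merged = {}
    for source in (font_liga, glyph_name_rules, unicode_compat_rules):
        for sequence, ligature in source.items():
            merged.setdefault(sequence, ligature)
    return merged


def shape(text, rules, ligatures_enabled=True):
    """Map characters to glyph ids, then substitute glyph pairs if enabled."""
    glyphs = [CMAP[ch] for ch in text]
    if not ligatures_enabled:
        return glyphs
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if len(pair) == 2 and pair in rules:
            out.append(rules[pair])  # two glyphs become one ligature glyph
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


# Rule implied by the compatibility decomposition of U+FB01.
unicode_rules = {(0x0044, 0x0047): 0xF8FE}
rules = build_rules({}, {}, unicode_rules)
print(shape("fi", rules))                           # ligature glyph
print(shape("fi", rules, ligatures_enabled=False))  # two separate glyphs
print(shape("\ufb01", rules, ligatures_enabled=False))
```

With ligatures disabled, the precomposed character U+FB01 still reaches its glyph through the character map, which is the compatibility behaviour comment 8 asks for.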

Comment 10 by ver...@gmail.com, Nov 6 2009

The few compatibility ligatures encoded in Unicode for the Latin script include:
 /ffl/ 'ffl'
 /ffi/ 'ffi'
 /fl/ 'fl'
 /fi/ 'fi'
 /ff/ 'ff'
 /ij/ 'ij'
 /IJ/ 'IJ'
 /st/ 'st'
The following ligatures are probably not really ligatures, enabling them can cause 
problems in various languages :
 /ae/ 'æ'
 /A[Ee]/ 'Æ'
 /oe/ 'œ'
 /O[Ee]/ 'Œ'
 /ue/ 'ᵫ'
There are similar ligatures (Dz, Dž, tſ, ...) Those used in IPA only may be disabled 
(unless they are mapped by default with glyphs in fonts). Enabling them may not be 
recommended (including /[oOaA][eE]/ in French, even French considers each ligature in 
[œŒæÆ] as a pair of letters, for collation purpose only, they are still considered 
orthographically, phonetically and morphologically as distinct from the associated 
letters composing them, as if these ligatures contained not just the two letters, but 
also an invisible implied diacritic which cannot be encoded separately, except 
possibly with ZWJ).

Those used in actual languages (like Central European languages that define for them 
three variants with lower case, uppercase and mixed titlecase) may be enabled by 
default (with the exception of /ſ[sz]/ to 'ß' in texts tagged as German with 
xml:lang="de" where they are considered distinct letters). For German, one can also 
disable the automatic creation of 'ß' ligatures by encoding ZWNJ between /ſ/ and 
/[sz]/.

Sidenote: Official German orthography prohibits ligatures across composition 
boundaries. For example »Auflage« should be set without the fl ligature.

But I guess it will take years before we'll see this feature implemented in 
browsers, if ever. :-)

Comment 12 by ver...@gmail.com, Nov 7 2009

I agree, but what is suggested is to implement several levels of ligatures, as those 
already implemented (with custom CSS properties) in Adobe Flex 4, which provides an 
excellent implementation of what typographers want.

See 
http://help.adobe.com/en_US/Flex/4.0/langref/spark/components/supportClasses/Slider.html

But fonts and renderers can still correctly honor the prohibition of automatic 
ligatures for specific languages like:
 - 'fl' in German between two morphemes : note that Unicode suggests using <f,ZWNJ,l> 
for this case, if the language cannot be identified by the browser or is not 
explicitly tagged.
 - 'oe' in French (there are oppositions like "œuf" where the ligature is required 
and "coexister" where the ligature is prohibited)

Even without such advanced ligatures support (such as the one already proposed by 
Adobe), the browser should still honor the ZWJ and ZWNJ correctly (without exhibiting 
the  bug #26487 ) and in a way conforming to Unicode (not showing by default glyphs for 
format controls).
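
The effect of the joiner controls described here can be illustrated with a character-level sketch. It is an assumption-heavy simplification: a real renderer substitutes glyphs rather than characters, the Unicode presentation forms below merely stand in for a font's ligature glyphs, and `apply_ligatures` is a hypothetical helper.

```python
ZWJ, ZWNJ = "\u200d", "\u200c"

# Presentation forms used as stand-ins for ligature glyphs.
LIGATURES = {
    "ffi": "\ufb03",
    "ffl": "\ufb04",
    "ff": "\ufb00",
    "fi": "\ufb01",
    "fl": "\ufb02",
}


def apply_ligatures(text: str) -> str:
    """Ligate letter sequences; ZWNJ blocks a ligature, neither control is shown."""
    # ZWJ only requests joining; ligatures are already on in this sketch,
    # so it is treated as transparent.
    text = text.replace(ZWJ, "")
    out = []
    i = 0
    while i < len(text):
        if text[i] == ZWNJ:
            i += 1  # format control: consumed, never rendered
            continue
        # Try the longest sequences first so "ffi" wins over "ff".
        for seq in sorted(LIGATURES, key=len, reverse=True):
            if text.startswith(seq, i):
                out.append(LIGATURES[seq])
                i += len(seq)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(apply_ligatures("Auflage"))          # ligated: wrong for German
print(apply_ligatures("Auf\u200clage"))    # ZWNJ keeps f and l apart
```

Because the ZWNJ sits between "f" and "l", the two letters are never adjacent in the input, so no ligature is formed, and the control itself produces no visible output.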

Comment 13 by ver...@gmail.com, Nov 7 2009

Note that the fact that Adobe implemented them in its proprietary Flex system (or 
custom AIR controls for browsers) demonstrates that the solution is technically 
viable. On the other hand, the custom Flex CSS properties are not necessarily the best 
suited for working with HTML (Flex is not HTML), or other W3C standards (including 
notably SVG, and other XML-based style languages), and they may need to be 
refined/enhanced/studied more appropriately to preserve compatibility and 
interoperability.
The W3C lags far behind, and not a lot has been done in this direction for more 
advanced typography on the web (and there is still a lot of work and there are questionable 
options for the future CSS3 specification). So advanced typography will probably not 
come before CSS4, unless there's agreement between the authors of major browsers to 
define some extensions earlier.

What is missing is a vendor-neutral specification (and agreements signed by the 
participants to release them for open use, without requiring royalties or specific 
licences) from all W3C participants (including notably Adobe, Microsoft, the Mozilla 
Foundation, Google and Apple...).

Anyway, the advanced support for ligatures in OpenType (and partly also in Unicode) 
is already released in an open way, this paves the way for such implementations and 
fast adoption.

Note that CSS specifications are now advancing mostly by the demonstrations made by 
each vendor (using custom CSS properties, whose names are prefixed by their vendor 
id, like "-moz-", "-webkit-", ...) before larger adoption of a standard CSS property 
name and behavior without those prefixes.

Comment 14 by evan@chromium.org, Jan 13 2010

BTW, I've been looking into font stuff recently.

Since existing Pango/Harfbuzz can already render ligatures, waiting for the new 
harfbuzz doesn't change anything.  (In fact, Behdad added ligature support to the new 
harfbuzz only in December, and only because I was sitting next to him at the time and 
nagged him about it.  :P)

The underlying problem is that WebKit has a "fast" font path (where it assumes one 
Unicode codepoint => one glyph) and then a "complex" font path (used for Arabic, 
etc.).  If you put normal text through the complex font path you already get 
ligatures out of WebKit.  The reason it's not on by default is that it is much 
slower.  So what really needs to be figured out is the performance situation.
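
The split between the two paths can be pictured with a toy model. This is not WebKit's code: the function names, the string "glyphs" and the `needs_complex_path` heuristic are all assumptions chosen only to show why text sent down the fast path never gets ligatures.

```python
# Toy ligature table; a real font would supply this via its 'liga' feature.
LIGATURES = {"fi": "glyph:fi_lig", "fl": "glyph:fl_lig"}


def fast_path(text):
    """One code point -> one glyph; neighbouring characters are never examined."""
    return ["glyph:" + ch for ch in text]


def complex_path(text):
    """Stand-in for full shaping: looks at context, so it can form ligatures."""
    glyphs = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LIGATURES:
            glyphs.append(LIGATURES[pair])
            i += 2
        else:
            glyphs.append("glyph:" + text[i])
            i += 1
    return glyphs


def needs_complex_path(text):
    # Hypothetical heuristic: anything beyond Latin-1 is sent through shaping.
    return any(ord(ch) > 0xFF for ch in text)


def render(text, force_complex=False):
    if force_complex or needs_complex_path(text):
        return complex_path(text)
    return fast_path(text)


print(render("battlefield"))                      # 11 glyphs, no ligature
print(render("battlefield", force_complex=True))  # 10 glyphs, "fi" ligated
```

The extra work in the second path (inspecting context for every run of text) is the performance cost the comment refers to.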

Comment 15 by ver...@gmail.com, Jan 15 2010

The "fast" font path is definitely bogus, and does not match what users and 
typographers expect. It also breaks the Unicode character encoding model. Fonts do not 
contain characters but glyphs. Fonts may contain character mapping tables, but they 
do not cover everything that is needed: a "character" is first mapped to a "glyph id", then 
series of glyph ids defined in that font are substituted and positioned using OpenType 
"features" (for example sets of ligatures, some of which can be specialized per 
language), then the final transformed series of glyph ids gets rendered into glyphs.
Before that, when just working at the "character" level, this means the "Unicode 
abstract character" (on which all standard Unicode algorithms operate, notably 
normalizations, case mappings, the BiDi algorithm, the shaping of contextual 
initial/medial/final/isolated forms commonly used in Arabic, the reordering from 
logical to visual order needed for Indian scripts and for some Hebrew and Arabic 
non-Unicode encodings mapped to Unicode, ...) cannot use the fast path. The same is true 
for Latin ligatures, and compatibility mappings.

So actually, the code needs first to apply the Unicode-based algorithms, then try to 
find character mappings within the font mapping tables and the relevant OpenType 
features. Note that "symbol" fonts are not mapped using Unicode (they have an 
implicit character-to-glyph mapping table which has to be selected automatically by 
an implementation of this table within the renderer itself).

The situation can be even more tricky, because some fonts can also make internal 
references to other fonts for characters that they don't support: you have to search 
through a list of fonts. Some fonts are supported by the native environment as 
"virtual fonts", i.e a logical name associated to a collection of related fonts and a 
set of selection rules (for example, code point ranges, and/or language/script 
subselections).

In all cases, the format control characters of Unicode MUST be handled and recognized 
specially by renderers: they should not display them at all and not attempt to map 
them to glyph ids within the selected fonts above, without using a specialized code 
path (for handling the default "invisible controls" mode, or the special "visible 
controls" mode when rendering into an HTML input form in a visible control, a feature 
that should not even be enabled by default, but that could be enabled by some custom 
CSS text style property).

As long as you won't have the necessary support (alternate rendering code path, and 
custom CSS controls usable in Javascript through DOM, or directly in HTML pages and 
CSS stylesheets) for supporting BOTH the "invisible controls" mode AND the "visible 
controls" mode, the browser SHOULD only implement the "invisible controls" mode by 
default.

And so, all the optional typographic ligatures will NOT be rendered by default, 
because, even before implementing the visible controls mode, the browser will have to 
support the ligature format control hints (joiners and disjoiners, already encoded 
in Unicode as format controls, and that themselves should also remain invisible by 
default).

You absolutely cannot predict which characters are candidates for ligatures as this 
all depends on the actual fonts that will be used, and so the "fast" code path is 
simply bogus. The "complex" path is needed and should be the default (it will even 
speed up the rendering of scripts that actually cannot work without it, and it will 
have a very minor impact on "simple" alphabets like Latin/Greek/Cyrillic: the 
assumption that the Latin script is simple is completely wrong, when in fact it is 
certainly the MOST complex script in Unicode; it is even MORE complex than Arabic or 
Indian Brahmic scripts, and MORE complex than Han sinograms even if it has far fewer 
characters because it has MUCH MORE character properties for correctly handling it: 
the number of characters in a script does not play a significant role in terms of 
actual complexity or performance).

So any progress in a better support of the Latin script will certainly help all other 
"complex" scripts. Just drop the "simple" code path (one character => one glyph) as 
it ONLY works with some legacy symbol encodings or with simple ASCII encodings. In 
the effective Unicode world (now standard in HTML and in all new protocols), such 
"simple" code path simply does not work reliably.

I read somewhere that Webkit already supports this (dynamic substitution of ligatures), but is disabled by default in the likes of Safari (and I suppose Chrome). Is this true? If so, is it possible to set a launch option to enable it? (I don't really care about performance hits...I've got an i7 and lots of RAM for that.)

I'm on Windows, FWIW.

Comment 17 by calebegg@gmail.com, Jun 29 2010

Labels: -OS-Linux OS-All
Not sure about Windows, but assuming from #16 it's an issue there too. It's definitely at least Mac and Linux though, not just Linux.
Hello?

Any news on this?
Is this dead?
The text-rendering css rule (http://www.aestheticallyloyal.com/public/optimize-legibility/) works for me in chromium 12

Comment 22 by ise...@gmail.com, Jul 5 2011

Nathan, kerning and ligature substitution are two completely different things. You're only seeing kerning.
I have just written a test page that also illustrates this bug (also, screenshot):
http://unifraktur.sourceforge.net/testcases/enable_opentype_features/

(For some OS X issues seen on that test page, I have opened  Issue 114235 .)

The ligature I have tested that is defined through the OpenType feature 'liga' is turned off by default on Chromium 16.0.912.77, Ubuntu 11.10 (thus violating the UI suggestion of the OpenType Layout tag registry). But at least, it can be enabled with ‘-webkit-font-feature-settings: "liga";’.

If ‘text-rendering: optimizeLegibility;’ is set, I see a strange thing happening: At first page load, there is no 'liga' ligature. But when I scroll down and up again, or when I navigate forward and back again, then the ligature appears ... That same thing happens with the 'ccmp' smarts I have tested, with the difference that it does not depend on the ‘text-rendering’ property.

The ligature I have tested that is defined through the OpenType feature tag 'rlig' (required ligatures!) cannot be turned on in the sample word I have chosen, ‘ſit‍zen’. I have seen it turned on in other words, though. This behaviour is mysterious to me.

Comment 24 by tkent@chromium.org, Feb 15 2012

Labels: WebKit-Fonts

Comment 25 by js...@chromium.org, May 13 2012

Cc: behdad@chromium.org js...@chromium.org bashi@chromium.org
@bashi : do you know why 'optimizeLegibility' leads to strange behaviors as described in comment #23. 

Comment 26 by ver...@gmail.com, May 13 2012

Your test case on the "rlig" feature is wrong. There's no such "required" ligature in the Latin script, even for the "tz" letter pair, EVEN if the font defines this feature, only because this is NOT inherent to the Latin script itself.

This means that no OpenType renderer will look for the "rlig" feature for Latin letter pairs.

To get the ligatures in the Latin script, you would need to use a joiner control, which Unicode documents only as a SUGGESTED hint for ligation: These are then just "discretionary" ligatures that an OpenType font will NOT implement using the "rlig" feature, but with the "dlig" feature, provided that it has been enabled (discretionary ligatures are disabled by default).

The "rlig" OpenType feature is mostly made for Arabic and a few other scripts that have required ligatures independent of font styles (e.g. for the Lam+Alef pair). That's why this feature should be enabled by default in all text rendering engines, and enabled as well in Chrome (I think it is, because Lam+Alef pairs are correctly rendered with a ligature; but the actual list of letter pairs in all scripts that the OpenType renderer needs to look up in the font should be scrutinized to see what may be missing for some scripts, but I don't think that the Latin script will ever need such an "rlig" feature in Latin fonts, or that OpenType engines will ever need to look for it in any fonts, even if this feature is enabled in the renderer).

Note that OpenType engines are different in their behavior from AAT/Graphite and SIL engines, because the latter engines will perform all font lookups that have been enabled. This is not the philosophy of OpenType: the OpenType renderer most often does not need to look in fonts to know that it does NOT need to perform any substitution and positioning.

So this means that any Latin pair defined in an OpenType font is clearly invalid (as it would be dependent on font styles) and it's normal that it is ignored when this style is only discretionary.

Comment 27 by bashi@chromium.org, May 14 2012

Comment #23, are you seeing the strange behaviors of optimizeLegibility on Ubuntu? I couldn't reproduce the behavior, but I'll investigate it.
I disagree with the assertion that rlig mustn't be used on Latin script. The OpenType Layout tag registry says no such thing. It says: "This feature covers those ligatures, which the script determines as required to be used in normal conditions. This feature is important for some scripts to insure correct glyph formation."

Here is the situation: Some variants of the Latin script (at least DE-Latf, that is, German language in fraktur style of the Latin script) require two different kinds of ligatures. In these variants, the normal means of text highlighting is not italics, but increased letterspacing. Now this is where two kinds of ligatures must be distinguished: The normal ligatures (like fl fi etc.) will be broken up when letterspacing is increased. For instance, a word like "find" with increased letterspacing will approximately look like  " f i n d " .  This is different, however, with the four ligatures ch ck ſt tz: When letterspacing is increased, these ligatures stay. For instance, a word like "blitzen" with increased letterspacing will approximately look like  " b l i tz e n ".
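
The two kinds of ligatures described above can be sketched as two sets that react differently to letterspacing. The ligature lists come from the comment itself; the function names and the pair-only matching are assumptions made for illustration, and plain strings stand in for glyphs.

```python
# Ligatures that stay joined even when letterspacing is increased.
REQUIRED = ("ch", "ck", "ſt", "tz")
# Ordinary ligatures that are broken up when letterspacing is increased.
STANDARD = ("fl", "fi")


def cluster(word, letterspaced):
    """Split a word into rendering units (single letters or ligatures)."""
    active = REQUIRED if letterspaced else REQUIRED + STANDARD
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in active:
            units.append(pair)  # rendered as one ligature glyph
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def letterspace(word):
    """Approximate emphasized (letterspaced) setting with spaces between units."""
    return " ".join(cluster(word, letterspaced=True))


print(letterspace("find"))     # f i n d
print(letterspace("blitzen"))  # b l i tz e n
```

Mapping `REQUIRED` to one OpenType feature and `STANDARD` to another is exactly the distinction the 'rlig' versus 'liga' proposal is trying to express.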

How do you propose implementing that? 'dlig' clearly does not apply, because this is a required behaviour, whereas dlig is turned off by default (same as 'hlig') and is intended "for special effect, at the user's preference". Recurring to ZWJ is highly undesirable because it means that for a correct usage of DE-Latf, a simple font change won't do but that you need additional characters (also, it doesn't work in Google Chrome). Asking for a new dedicated OpenType feature would postpone a solution for the problem and would bloat the already bloated OpenType Layout. The best suggestion I have seen so far is using 'rlig' (also, it works fine in Firefox). Please tell me if there are better implementations.

When you suggest that the 'rlig' feature be only switched on for the scripts it applies to, you get a logical problem because the OpenType Layout does not provide a list of script it applies to. While it explicitly mentions Arabic and Syriac, it concedes that the feature "(m)ay apply to some other scripts". Who is to know what scripts may have required ligatures? I think the only way to go is to enable it for all scripts.
#23, I can still see the "strange behaviour" of liga and ccmp ligatures only appearing after scrolling down (so the supposed ligature is no longer visible) and up again, on the following versions:

18.0.1025.151 (Developer Build 130497 Linux) Ubuntu 12.04
18.0.1025.168 (Developer Build 134367 Linux) Built on Ubuntu 11.10, running on LinuxMint 12

on the following page:

http://unifraktur.sourceforge.net/testcases/enable_opentype_features/optimizeLegibility.html

I don't see that strange thing on OS X or Windows, and I don't see it any more when navigating forward and back again, and I don't see it at all font sizes (on Linux Mint, I wouldn't see it at default font size, but it would turn up when increasing font size).

Comment 30 by ver...@gmail.com, May 14 2012

Re #28. "This feature is important for some scripts to insure correct glyph formation." A "tz" ligature may be important for some typographic styles and only for some languages. But the "rlig" tag description does not speak about languages or typographic styles. It is all about **scripts**.
The "tz" ligature (or absence of ligature) is not essential to the Latin script itself. That's why it has to remain discretionary.
If you want to apply it consistently for a given language (say German) this is not the correct OpenType feature to use, because the OpenType renderer will ignore it completely (the renderer does not know with which typographic style you are working, it does its work only based on the properties of the encoded Unicode characters, which are NOT dependent on the language).

And possibly it will tune the rendering by using the language tagging information present in the out-of-band markup or document metadata, but only as a way to enable features that are only enabled by a specific language code (fonts using language-specific tunings are still rare; for example they can tune the expected shape for the accent diacritics in Czech, or the preferred comma-like shape of the cedilla in Romanian, but Unicode now often offers alternatives using separately encoded characters so that these language-specific tunings are only needed to handle legacy texts).

Really, reconsider correctly the Unicode 6 Standard at the page you cite: the "rlig" feature will only apply to the examples for which the *absence* of any joiner will create by default a ligature and there's no alternative shown (no "or" word in the second column of the table), and where the absence of that default ligature normally *requires* the explicit use of a joiner (ZWJ, or ZWNJ, or <ZWJ,ZWNJ,ZWJ>, considered equivalent to ZWNJ in Arabic only for compatibility reasons).

The "fi" ligature for example does not match this case. The same applies to the "tz" ligature you are testing for the "rlig" OpenType feature. This is the wrong test. "rlig" would fit only for cases like LAM+ALEF, exactly like what is shown in TUS where the ligature is used by default and distinctive from the absence of ligature or from another simpler ligature/joining encoded differently using joiners explicitly.

To support the Fraktur style of the Latin script (which is unified in Unicode with the standard Latin script and only distinguished in ISO 15924 with the "Latf" code instead of "Latn") you need a separate font and the additional ligatures will NOT be part of the "rlig" feature because OpenType does not know that you are working with the Fraktur variant of the Latin script and only knows the unified Unicode code points.

This means that you need to use joiners explicitly in Fraktur texts. If you use joiners, you don't need "rlig": the behavior of joiner controls is enabled by default in OpenType renderers that absolutely don't need a "rlig" feature for them and not even any "dlig" feature (which is disabled by default).

Comment 31 by ver...@gmail.com, May 14 2012

Notes:

(1) The current incorrect support of joiner controls in Chrome is a separate bug, reported long ago.
    For now it has only been partly solved, at least on Windows, but still less completely on Linux where a very different shaping engine is used (instead of the Windows OpenType API).

(2) The equivalence of ZWNJ and <ZWJ,ZWNJ,ZWJ> applies to Arabic only because of the mandatory support for the contextual shaping of letters (it uses the standard characters properties defined in JoiningProperties.txt in the UCD, which also applies to Syriac and *may* apply to a few more scripts in the future).

(3) Read the word "may" that you found in TUS 6.0 page 558 in light of the "JoiningProperties", which do NOT apply to Latin but do apply to Arabic and Syriac, the same scripts cited in the documentation of the OpenType "rlig" feature. This "may" clearly does NOT apply to the Latin script!

Sorry, I have been hasty. The test page http://unifraktur.sf.net/testcases/enable_opentype_features/ does not illustrate the use of 'rlig' on <t, z>, but on <t, ZWJ, z>. Here, the question is which OpenType Layout feature should be used in order to match Unicode's implementation note about fonts providing ZWJ sequences for any extant ligature (p. 528 of The Unicode Standard 6.0 [p. 558 of the PDF], or p. 551 of 6.1). I chose 'rlig' mainly because I was already using it for that other complicated thing I just explained, but also because I didn't want to use plain 'liga', so switching off 'liga' ligatures would not affect the stronger ZWJ ligatures.

Now you have made me doubt the way the OpenType Layout works. My understanding was that on one hand -- as you well explained -- the rendering software/system will detect the script/language that is used, and based on that knowledge, will only render certain features of the OpenType Layout. But I thought that this was not enough for a feature to have an effect, but that additionally, on the font side, the feature must be encoded in specific rules (such as ligature substitutions). So if my font has a <Th> glyph, I still need to manually encode a <T, ZWJ, h> ligature in an appropriate feature (and, if I wish, also a <T, h> ligature). But now you are saying that "the behavior of joiner controls is enabled by default in OpenType renderers". How does that default work if there are no specific ZWJ ligature definitions (in a specific OpenType Layout feature)?

Speaking again of the Fraktur <tz> glyph, I agree that a rigid interpretation of what a script is will favour your point of view that the <tz> is not required by all of the Latin script and consequently not by the Fraktur variant. However, I believe that the other point of view is still possible that different variants of a script may have different requirements and so Latf may require ligatures while Latn doesn't. And you haven't addressed the logical problem that 'rlig' *may* occur in "some other scripts" and there is no way of knowing which ones (imagine for instance that more stenography systems will be added that require ligatures). So the safest thing, as I understand it, is enabling 'rlig' for all scripts. What harm can it do? It seems to be what the Firefox coders have already implemented.

Comment 33 by ver...@gmail.com, May 15 2012

OK now you start understanding the problem : <t, z> itself will not generate effectively any ligature from any OpenType feature.

The only good question to ask is in which substitution feature you can design such ligatures when using ZWJ.

For this you have to return to the Unicode definition of ZWJ, which clearly states that this is NOT a required ligature (except in a few Asian scripts for which this is documented, and sometimes enforced by listing some combinations that are assigned distinctive names in an additional UCD file), but a **hint** that renderers **may** use to know where and when a ligature is suggested.

For this reason, the ligatures built upon ZWJ will remain discretionary in the Latin script. OpenType renderers are supposed to honor the "dlig" feature: they honor these ligatures only **after** other more essential features (which have been enabled or that are enabled by default per the essential properties of the script) have been honored.

If you design a Fraktur font however, this font should still put these ligatures in a feature that does not need to be enabled explicitly, so you will put these ligatures in the "liga" feature which is normally enabled by default (but also honored only after more essential features).

In summary, ZWJ is not suitable for use in the "rlig" feature for the Latin script.

For better understanding about how features are structured and in which order they are honored, you need to read the document in the OpenType specifications that speaks about the Latin script specifically. Reading only the description of a single isolated feature is not enough, as it does not say completely how and when this feature will be looked for by OpenType renderers.

Note that the subset of OpenType features and their relative ordering when executing them in the renderer, is specific to each script (there are some differences of order sometimes depending on the script; additionally, some OpenType renderers may add their own features not universally recognized by others : you have to look in the documentation of each specific renderer about how to use them for more advanced typography).

Some fonts may ignore those requirements : you may have found a font that defines a "rlig" feature to try defining ligatures based on ZWJ, in my opinion these fonts are incorrect and may only work with some non conforming renderers. Don't expect those fonts to work with the Windows OpenType API (implemented in Uniscribe) that is used in Chrome for Windows.

Your interpretations of Unicode and OpenType do not convince me. You are saying that the ZWJ is a mere _hint_. However, Unicode consistently says _request_ (6.1, p. 549). I believe this is a significant difference.

The ZWJ mappings that should be added for the ligature of a font (according to Unicode's implementation notes, 6.1, p. 551) would make little sense if they were discretionary. Indeed, typing the ZWJ already is a discretionary enabling of ligatures, so there should be no second level of discretionary enabling of ligatures. Therefore, choosing 'dlig' for those ZWJ mapping seems a poor choice.

I have a very practical concern with regard to the fraktur script: As I have explained, fraktur requires a distinction between two different kinds of ligatures. I still have not seen any solution other than using 'rlig' vs. 'liga'. Your suggestion of marking the stronger ligatures with a ZWJ will not have any effect at all if the ZWJ ligatures are defined in the same 'liga' feature as the weaker ligatures. Are you saying that there is no way of correctly rendering the fraktur requirements? That does not help.

You do not address the problem that there is no way of knowing to what scripts the 'rlig' feature does apply. I fail to understand your point of view that 'rlig' should not be used in the Latin script. I have not found "the document in the OpenType specifications that speaks about the Latin script specifically" that you mentioned. I think Firefox' consistent rendering of 'rlig' is not a bug, but a valuable feature and a correct interpretation of the standards.

Comment 35 by ver...@gmail.com, May 15 2012

The default ligatures needed for representing the Fraktur style should be assigned in a feature using the OpenType script code "latf" (instead of "latn"), unless the font is specifically designed only for the Fraktur style, in which case the default script code can be used.

These default ligatures will be using the "liga" feature (not "rlig"). Then weak ligatures that may be optionally enabled on explicit request (such as with a CSS style in HTML) will be in "dlig" (there are several kinds of discretionary ligatures more specific to some "frequently" used styles, such as the variable width of digits, or digits with descenders, or swash styles).

All the discretionary ligatures are resolved and executed after all other mandatory features.

For a more detailed description and specification about how to use the features in specific scripts, look at this page :

https://www.microsoft.com/typography/SpecificationsOverview.mspx

Notably this subpage for the features used in the "standard" Latin, Cyrillic and Greek scripts :

https://www.microsoft.com/typography/OpenType%20Dev/standard/intro.mspx

which immediately links to this important page :

https://www.microsoft.com/typography/otfntdev/standot/features.aspx

You'll immediately realize that the Fraktur style will need to use the "liga" feature (or "clig" only for contextual ligatures)

Then you can add different features for the discretionary ligatures (those that you designate as being "weak").

The "rlig" feature is clearly not defined for the Latin script but really meant to be interpreted according to the Unicode standard, which speaks about the REQUIRED ligatures absolutely needed to correctly support the core of a script, such as the LAM+ALEF ligature. That's why it speaks about Arabic and Syriac.

The 'may' found in the OpenType specifications has to be interpreted in terms of the REQUIRED joining behavior, which DOES exist in those two scripts but may also be needed for scripts that will be encoded later in the Unicode standard (joining types are not optional character properties in Unicode, they are mandatory and fully part of the standard; they are not just informative).

It is relevant to speak about the LAM+ALEF ligature with the "rlig" feature because this is an exception to the normal shaping of the ligature that the two letters would adopt to create their joining ligature: the two glyphs really become a single one, very different from the simple joining form of the two letters (this alternate simpler ligature is encoded differently, using joiner controls, and an explicit disjoiner has to be encoded if you want to block the joining forms from being generated).

Comment 36 by ver...@gmail.com, May 15 2012

Don't forget to note the difference between this page for the standard features used in the Latin script:

https://www.microsoft.com/typography/otfntdev/standot/features.aspx

with these pages for the Arabic and Syriac scripts respectively (and where you find the effective usage of the "rlig" feature):

https://www.microsoft.com/typography/otfntdev/arabicot/features.aspx
https://www.microsoft.com/typography/otfntdev/syriacot/features.aspx

Also, note that not all scripts in Unicode have their recommendation for OpenType feature development. The main reason is that they are still not agreed upon and are still experimental/not standardized between the various implementations of shaping and rendering engines.

The scripts listed here are those that have been standardized as well in the equivalent ISO standard.

But the Fraktur style of Latin is fully standardized. You don't need different features for now to support it correctly.

You may want more advanced typographic features; they will be optional and supported only in specific layout/shaping/rendering engines, notably by Adobe, which documents many other optional features to support advanced typography.

Note: neither Chrome nor Firefox implements the shaping engine itself; each uses an API or a helper library which may be supported only on some platforms. Chrome chose to use the Microsoft Uniscribe API on Windows, the way it exists now. Mozilla chose to integrate into Firefox a helper API that does not use Uniscribe, which may not follow the standard and could support other things that are not standardized or are still experimental.

Note also that even Microsoft defines and uses some additional features still not documented in the OpenType specifications. Developers of layout/shaping/rendering engines however are discussing their implementation and trying to define a standard that will later be approved by ISO, and republished by the OpenType group.

Font authors have to work and contact the authors of those engines if they want their features to be supported across platforms. Otherwise they will just create fonts that will not work correctly across platforms and applications.

So I suggest you start with the core standard 1.6 specifications (published by Microsoft and ISO separately, and reproduced by Apple and Adobe), then go on to the documentation provided by Apple and Adobe for the specific features supported in their own platforms or applications.

Google can certainly work with the members of the OpenType team or with national bodies to have some features approved as part of the standard.

But the full register of existing features is not part of the standard, it is informative and just a registry to avoid conflicts between implementations. Registering a new feature there does not make it standard. That's why the way they are described is very summary and insufficient : you have to find the real authors of these registered features to know exactly how they are supposed to work.

Comment 37 by ver...@gmail.com, May 15 2012

You wrote "The ZWJ mappings that should be added for the ligature of a font (according to Unicode's implementation notes, 6.1, p. 551) would make little sense if they were discretionary. Indeed, typing the ZWJ already is a discretionary enabling of ligatures, so there should be no second level of discretionary enabling of ligatures. Therefore, choosing 'dlig' for those ZWJ mapping seems a poor choice."

That's true, but you're misinterpreting by assumption what I wrote. Yes, the encoding of joiners in texts is discretionary. Still, it is a hint that ZWJ *should* (not must) create a ligature. This means that *if* a font defines such a ligature for ZWJ, it should be within a feature that is enabled by default.

But then, it is perfectly possible to render the text with a font that cannot ligate the letters due to its typographic style, so that this ligature is still not mandatory. All depends on the font design, and also on the capability of the renderer to create such ligatures (OpenType is not a mandatory rendering technology, it is still legit to use bitmap fonts that have absolutely no encoded support for ligatures, or fonts that do not have any glyph for the ligature).

The situation is different for ligatures that are normally required to support a script correctly: a font without the necessary glyphs for the ligatures will be considered really poor. But still it will be possible to use it, even if the renderer is unable to render the ligature with its supported fonts. E.g. a console/terminal using fixed-width bitmap fonts will just do its best to represent the individual letters.

Being a "hint" (you say "request"; this is equivalent, since the request is not necessarily satisfied) does not mean that you need to put that ligature support in an OpenType feature designed to be discretionary: discretionary ligatures in OpenType features are ligatures that must be explicitly enabled and are not enabled by default (e.g. swash styles, or letters that are impossible to ligate, given the font's typographic style, into something that is readable, recognizable, and distinguishable from other similar-looking letters).

So yes, you'd define the ligatures created with ZWJ in the "liga" feature, or in the "clig" feature if they are limited to some previous context which is not itself substituted or repositioned.

Note also that I use the term "ligature" broadly: this can be realized either with glyph substitutions, or with glyph positioning, or both (when you enable a feature, you enable BOTH the GSUB and the GPOS, whichever is present in a font). A ligature of letters does not necessarily require separate glyphs: reducing the advance width so that the glyphs collide and join may be enough.

"dlig" for example is used to change the default rendering of <f, i> as separate letters; you can create a GSUB: (f,i) -> fi rule in the "dlig" feature, which is disabled by default. You can also create the GSUB: (f,ZWJ,i) -> fi rule in the "liga" feature, which is enabled by default. And to support the expected rendering of <f,ZWNJ,i> without the ligature, you just need to map <ZWNJ> to an empty glyph without using it in any GSUB rule.
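To make the three rules above concrete, here is a minimal Python sketch of a toy shaper. Everything in it is an assumption made for illustration: the `FONT_RULES` table, the glyph name `fi_lig` and the `shape` function are hypothetical, and real shaping engines apply GSUB lookups in a far more elaborate way. It only shows how putting (f,i) in "dlig" and (f,ZWJ,i) in "liga" would behave when "liga" is on by default.

```python
ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Hypothetical font: ligature rules grouped by OpenType feature tag.
FONT_RULES = {
    "liga": {("f", ZWJ, "i"): "fi_lig"},  # assumed to be enabled by default
    "dlig": {("f", "i"): "fi_lig"},       # only applied on explicit request
}
# Joiners are mapped to an empty glyph, so they never show up in the output.
EMPTY_GLYPHS = {ZWJ, ZWNJ}


def shape(text, enabled=("liga",)):
    """Return a list of glyph names for text, using only the enabled features."""
    rules = {}
    for tag in enabled:
        rules.update(FONT_RULES.get(tag, {}))
    # Try longer sequences first so (f, ZWJ, i) wins over shorter matches.
    ordered = sorted(rules.items(), key=lambda kv: -len(kv[0]))

    glyphs = []
    i = 0
    while i < len(text):
        for seq, glyph in ordered:
            if tuple(text[i:i + len(seq)]) == seq:
                glyphs.append(glyph)
                i += len(seq)
                break
        else:
            # No rule matched: emit the character itself unless it is invisible.
            if text[i] not in EMPTY_GLYPHS:
                glyphs.append(text[i])
            i += 1
    return glyphs
```

With only the default features, `shape("fi")` keeps the letters separate while `shape("f" + ZWJ + "i")` produces the ligature; passing `enabled=("liga", "dlig")` ligates plain "fi" as well, and the ZWNJ sequence stays unligated in both cases because no rule mentions ZWNJ.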

All those decisions are left to the typography designer. The renderer itself does not know, and will not inspect, the glyphs mapped in the font to find out what the expected result is. It just needs to see whether something is mapped or not, in the default font mapping, or in the features that it conditionally enables or disables by default or according to explicit requests from an external stylesheet. It will then just use the features that it knows it is required to honor in a script, and finally it *may* choose to honor some supplementary features requested by an external stylesheet.

When a renderer knows that a script requires the support of a "rlig" feature, it will inspect the font to see if there's one matching this requirement. If the ligature is not mapped, it may then try to look for that feature in another font to honor (for example) the expected substitution of the (LAM, ALEF) glyphs of Arabic into something else, either with substitution or by positioning or both. The result of this substitution is also not required to be a single glyph. But the renderer will expect to detect the presence of the "rlig" feature: that's all it needs to correctly use a font. But if the glyphs ALEF and LAM are not found in the same font, their ligature will never match any feature, and there will be no other choice than to just display the separate glyphs found in the separate fonts.
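The fallback behaviour described above can be sketched roughly as follows. This is a hedged illustration only: the font dictionaries, the `pick_font` and `shape_pair` helpers and the glyph name `lam_alef` are hypothetical, and no real renderer is claimed to work exactly this way.

```python
LAM = "\u0644"   # ARABIC LETTER LAM
ALEF = "\u0627"  # ARABIC LETTER ALEF


def pick_font(ch, fonts):
    """Return the first font in the fallback list that maps ch, or None."""
    for font in fonts:
        if ch in font["cmap"]:
            return font
    return None


def shape_pair(a, b, fonts):
    """Return (font name, glyph) tuples for the two-character sequence a, b."""
    # A required ligature can only match inside a single font that has
    # both glyphs and an "rlig" rule for the pair.
    for font in fonts:
        if a in font["cmap"] and b in font["cmap"]:
            lig = font["features"].get("rlig", {}).get((a, b))
            if lig is not None:
                return [(font["name"], lig)]
    # Otherwise fall back to separate glyphs from whichever fonts have them.
    result = []
    for ch in (a, b):
        font = pick_font(ch, fonts)
        result.append((font["name"] if font else None, ch))
    return result


# Hypothetical fonts used only for this example.
FULL = {"name": "Full", "cmap": {LAM, ALEF},
        "features": {"rlig": {(LAM, ALEF): "lam_alef"}}}
PLAIN = {"name": "Plain", "cmap": {LAM, ALEF}, "features": {}}
ONLY_LAM = {"name": "OnlyLam", "cmap": {LAM}, "features": {}}
ONLY_ALEF = {"name": "OnlyAlef", "cmap": {ALEF}, "features": {}}
```

In this model, `shape_pair(LAM, ALEF, [PLAIN, FULL])` skips the first font because it has no "rlig" rule and takes the ligature from the second, while `shape_pair(LAM, ALEF, [ONLY_LAM, ONLY_ALEF])` can only return the two separate glyphs.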

Thanks for the link to "Developing OpenType Fonts for Standard Scripts", especially to the page on "Features":

https://www.microsoft.com/typography/otfntdev/standot/features.aspx

I had ignored the documentation on "Script-specific Development" since I only had been looking at the "OpenType Specification" (v.1.6). Your link has helped a lot.

I am glad to read that the page on "Features" allows for two different ligature features, as is required for correctly supporting Fraktur typography:

1. There is 'liga' (also 'clig') for normal ligatures such as <fi>, <fl> etc. These are the ones I have called "weak" because they are supposed to be broken up when letter-spacing is increased, a behaviour that is not yet implemented in Chromium. They exactly correspond to ligatures of <fi>, <fl> etc. in other styles of the Latin script. Therefore, they are supposed to be encoded in 'liga'.

2. And then, there is also 'ccmp' one of the uses of which is "to compose a number of glyphs into one glyph". That fairly matches what the Fraktur ligatures <ch>, <ck>, <ſt> and <tz> are: They behave like single glyphs and are never broken up, no matter how much the letter-spacing is increased (that's why I have called them "strong"). There is nothing comparable in other styles of the Latin script unless you consider glyphs such as <œ> that are normally considered to be, and encoded as, a single character.
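The distinction between the two groups can be sketched as a feature-selection rule. This is only a minimal Python illustration under the assumptions stated in the list above: the names `STRONG`, `WEAK` and `features_for` are hypothetical, and, as noted, Chromium does not yet behave this way.

```python
# "Strong" Fraktur ligatures (<ch>, <ck>, <ſt>, <tz>) are assumed to live in
# 'ccmp' and are never broken up.
STRONG = ("ccmp",)
# "Weak" ligatures (<fi>, <fl>, ...) live in 'liga'/'clig' and are supposed
# to be broken up as soon as the letters are spaced out.
WEAK = ("liga", "clig")


def features_for(letter_spacing):
    """Return the ligature-related features to apply for a given letter-spacing."""
    if letter_spacing != 0:
        return list(STRONG)
    return list(STRONG + WEAK)
```

So `features_for(0)` keeps all three features, while any non-zero letter-spacing leaves only 'ccmp' active, which is exactly the behaviour Fraktur typesetting needs.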

I know that using 'ccmp' for the special Fraktur ligatures means stretching the definition of 'ccmp' since these ligatures are not explicitly mentioned. However, the Fraktur script being outdated and seldom used, it is rarely considered by standards' authors, so you're lucky if by coincidence the standards more or less match its requirements, as is the case with 'rlig' and, much more accurately as I have learnt just here, with 'ccmp'. Indeed, by using 'ccmp', the Fraktur ligatures are displayed correctly in Firefox (as with 'rlig').

I cannot judge whether Firefox' "helper API" does not follow the standards. I only see that with regard to displaying ligatures Firefox is, at the moment, by far superior to any other browser: it is consistent among different OSes, it has fewer bugs, and the features that are supposed to be activated by default, according to the OpenType Layout, really are activated by default.
Chipping in here without having thoroughly read all the comments.  The issue of how to translate ZWJ/ZWNJ into automatically enabling/disabling OpenType ligature features has been on my table to address in harfbuzz-ng.  I will consult the OpenType list and address it eventually...

Comment 40 Deleted

Cc: schenney@chromium.org
On May 16, 2012 behdad@chromium.org wrote:
> The issue of how to translate ZWJ/ZWNJ into automatically enabling/disabling OpenType ligature features has been on my table to address in harfbuzz-ng.  I will consult the OpenType list and address it eventually...

I'm trying to plan a release that requires ligature substitution in Chrome on Windows XP. If this is low priority for core developers, could you give some pointers so someone from the community could try to submit a patch?

Thanks for the great work on Chrome.
Doesn't setting text-rendering: optimizeLegibility in CSS work for you?

Comment 44 by mart...@gmail.com, Feb 19 2013

Doesn't seem to work for me
@behdad@chromium.org: I just went to http://kudakurage.com/ligature_symbols/ in Chrome on Windows XP, opened the developer tools, and added "text-rendering: optimizeLegibility" to the body element, but none of the font ligatures were substituted. Here is a screenshot:

http://i.imm.io/WKkG.png

Firefox is shown in the background correctly displaying the ligatures.
Anyone who knows the Windows code around?
Project Member

Comment 47 by bugdroid1@chromium.org, Mar 10 2013

Labels: -Area-WebKit -WebKit-Fonts Cr-Content Cr-Content-Fonts
Project Member

Comment 48 by bugdroid1@chromium.org, Apr 6 2013

Labels: -Cr-Content Cr-Blink
Project Member

Comment 49 by bugdroid1@chromium.org, Apr 6 2013

Labels: -Cr-Content-Fonts Cr-Blink-Fonts
Project Member

Comment 50 by bugdroid1@chromium.org, Feb 8 2014

The following revision refers to this bug:
    http://src.chromium.org/viewvc/blink?view=rev&rev=166751

------------------------------------------------------------------------
r166751 | efidler@blackberry.com | 2014-02-08T01:16:50.343629Z

Changed paths:
   M http://src.chromium.org/viewvc/blink/trunk/LayoutTests/fast/text/font-variant-ligatures.html?r1=166751&r2=166750&pathrev=166751
   M http://src.chromium.org/viewvc/blink/trunk/LayoutTests/TestExpectations?r1=166751&r2=166750&pathrev=166751
   M http://src.chromium.org/viewvc/blink/trunk/Source/platform/fonts/harfbuzz/HarfBuzzShaper.cpp?r1=166751&r2=166750&pathrev=166751
   M http://src.chromium.org/viewvc/blink/trunk/LayoutTests/fast/text/font-variant-ligatures-expected.html?r1=166751&r2=166750&pathrev=166751

Make -webkit-font-variant-ligatures actually work.

Right now, the complex text path always ligates, even with
-webkit-font-variant-ligatures:no-common-ligatures. Also, discretionary
and historical ligatures don't work.

This is technically tested by fast/text/font-variant-ligatures.html, but
the default font often doesn't have the GSUB ligature features.
I've changed the test to prefer some fonts that are widely available and
actually have GSUB liga for "fi".

BUG= 22240 ,  114235 

Review URL: https://codereview.chromium.org/77413003
------------------------------------------------------------------------
This bug is a bit all-over-the-place in terms of what the actual issue is, but ligation does work in general.

The fast text path doesn't ligate in Blink right now (that's  bug 322102 ), but "text-rendering: optimizeLegibility" should be good.

I'd say this bug should probably be closed, but I'm not 100% sure.
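
For reference, the ligature keywords that r166751 refers to correspond to OpenType features roughly as in the sketch below. The mapping follows the CSS Fonts Level 3 definition of font-variant-ligatures; the Python names (`DEFAULTS`, `KEYWORDS`, `ligature_features`) are hypothetical, and this is not Blink's actual code.

```python
# Feature state for font-variant-ligatures: normal (1 = on, 0 = off).
DEFAULTS = {"liga": 1, "clig": 1, "calt": 1, "dlig": 0, "hlig": 0}

# Each CSS keyword switches one group of OpenType features on or off.
KEYWORDS = {
    "common-ligatures": {"liga": 1, "clig": 1},
    "no-common-ligatures": {"liga": 0, "clig": 0},
    "discretionary-ligatures": {"dlig": 1},
    "no-discretionary-ligatures": {"dlig": 0},
    "historical-ligatures": {"hlig": 1},
    "no-historical-ligatures": {"hlig": 0},
    "contextual": {"calt": 1},
    "no-contextual": {"calt": 0},
}


def ligature_features(value="normal"):
    """Translate a font-variant-ligatures value into OpenType feature settings."""
    if value == "none":
        return {tag: 0 for tag in DEFAULTS}
    features = dict(DEFAULTS)
    if value == "normal":
        return features
    for keyword in value.split():
        features.update(KEYWORDS[keyword])  # raises KeyError on unknown keywords
    return features
```

For example, `ligature_features("no-common-ligatures")` turns 'liga' and 'clig' off but leaves 'calt' on, which is the case the commit message says the complex text path used to ignore.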

Comment 52 by 4mr.m...@gmail.com, Feb 28 2014

So I visited what is supposedly a showcase page for this feature:
http://awesomescreenshot.com/0ef2fckoce

Is it just me, or isn't it actually working?

Chrome 34.0.1847.14 Linux Lubuntu 13.10 amd64
@4mr.minj

That site (http://elliotjaystocks.com/blog/the-fine-flourish-of-the-ligature) doesn't have proper ligatures on any browser for me.

Comment 54 Deleted

yes, the site claims "Typekit’s version of Skolar doesn’t contain ligatures, but Fontdeck’s does, which is why I’m serving the body type for this site through Fontdeck," but the code to import the Fontdeck css is commented out.

I also verified that the Skolar WOFFs actually downloaded have no ligatures in them.
Project Member

Comment 56 by bugdroid1@chromium.org, Mar 11 2014

The following revision refers to this bug:
    http://src.chromium.org/viewvc/blink?view=rev&rev=168944

------------------------------------------------------------------------
r168944 | efidler@blackberry.com | 2014-03-11T20:00:19.971146Z

Changed paths:
   M http://src.chromium.org/viewvc/blink/trunk/Source/platform/fonts/FontDescription.cpp?r1=168944&r2=168943&pathrev=168944
   M http://src.chromium.org/viewvc/blink/trunk/LayoutTests/fast/text/font-variant-ligatures.html?r1=168944&r2=168943&pathrev=168944
   M http://src.chromium.org/viewvc/blink/trunk/Source/platform/fonts/FontDescription.h?r1=168944&r2=168943&pathrev=168944

Make any sort of ligatures, not just common ligatures, use the complex text path.

We can remove the incorrect hack in the layout test that used
common-ligatures, which forces the complex path.

BUG= 22240 

Review URL: https://codereview.chromium.org/193033003
------------------------------------------------------------------------

Comment 57 by n...@wlonk.com, Apr 7 2014

In response to #52, #53, #55: the site's author has fixed that blog series ("Tomorrow’s web type today"). Ligatures are working again.
Status: Fixed
cool. Let's close this then. Any specific ligature issues should get separate new bugs.
