JDK-4222930

Hebrew punctuation often not placed correctly and sometimes overwrites things

    • 2d
    • generic
    • generic

      In Cricket, Hebrew punctuation marks are displayed in unbalanced positions
      and sometimes coincide with the letters they modify. The dot in the shuruk is
      on the wrong side of the vav.

      There are so many individual problems with the punctuation display that I
      have written this report in general terms.

      It appears that the current placement of Hebrew punctuation marks is
      implemented by an algorithm that is not font-specific and does not take into
      account special case letters. The marks appear to be placed according to the
      following heuristic:

      The borders of a glyph are determined from the font metrics, without regard
      to the visual center or other visual characteristics of the glyph. Then,
      a) accent dots are placed at the vertical and horizontal center of the glyph
      b) marks under the baseline are horizontally centered

      This heuristic causes the punctuation marks to appear unbalanced or to
      coincide with the letters they modify.
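
      The heuristic inferred above can be sketched as follows. This is only an
      illustration of the behavior the report describes, not the actual JDK code;
      the class name, method names, and the `drop` parameter are assumptions made
      for this sketch. Coordinates follow the Java 2D convention (y grows downward).

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

/** Hypothetical sketch of the font-independent heuristic inferred in this report. */
final class NaiveMarkPlacement {

    /** (a) Accent dot: the center of the glyph's metric box on both axes. */
    static Point2D.Double accentDot(Rectangle2D metricBox) {
        return new Point2D.Double(metricBox.getCenterX(), metricBox.getCenterY());
    }

    /**
     * (b) Mark under the baseline: horizontally centered on the metric box,
     * at an assumed fixed distance (drop) below the baseline.
     */
    static Point2D.Double belowBaseline(Rectangle2D metricBox, double baselineY, double drop) {
        return new Point2D.Double(metricBox.getCenterX(), baselineY + drop);
    }
}
```

      Neither method looks at the glyph outline, which is why a letter whose ink
      is not centered in its metric box ends up with a misplaced mark.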

      In addition, the punctuation marks are drawn on top of the reader's marks
      (taame hamikra, the cantillation marks) when both are used together.

      The solution to this problem requires the use of a shaping engine similar to
      the shaping engine used to display Arabic text. In addition, the fonts must
      provide glyphs for the accented versions of the letters that can take accents.

      The following explains the requirements for correct display of punctuation
      marks in Hebrew text:

      1. marks must not coincide with or touch the accented letters
      2. marks must be positioned according to graphic expectations

      If the first requirement is not met and marks coincide with letters or with
      other marks, the text might not be readable as intended. If either requirement
      is not met, the text looks sloppy and unpresentable.
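
      The first requirement can be checked mechanically. The following is a
      minimal sketch, assuming the letter and mark outlines are available as
      `java.awt.Shape` objects (for example from `GlyphVector.getGlyphOutline`).
      The class name, method name, and the `minGap` parameter are assumptions of
      this sketch. The mark is approximated by its bounding box grown by `minGap`,
      so that marks that merely touch the letter are also rejected.

```java
import java.awt.Shape;
import java.awt.geom.Area;
import java.awt.geom.Rectangle2D;

/** Hypothetical check for requirement 1: a mark must not coincide with or touch a letter. */
final class MarkClearance {

    /**
     * Returns true if the mark keeps at least minGap of clear space from the
     * letter's ink. The letter is tested by its real outline, so a dot placed
     * inside the hollow of a letter can still pass.
     */
    static boolean isClear(Shape letterOutline, Shape markOutline, double minGap) {
        Rectangle2D b = markOutline.getBounds2D();
        Rectangle2D grown = new Rectangle2D.Double(
                b.getX() - minGap, b.getY() - minGap,
                b.getWidth() + 2 * minGap, b.getHeight() + 2 * minGap);
        Area shared = new Area(letterOutline);
        shared.intersect(new Area(grown));   // ink common to the letter and the grown mark box
        return shared.isEmpty();
    }
}
```

      The second requirement (graphic expectations) is a matter of typographic
      judgment and cannot be verified by a geometric test of this kind.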

      These placement requirements cannot be met without taking into account both
      the font and the special-case letters, which are the same for all fonts.
      That is, no single placement algorithm is correct for all fonts.

      There are several cases that must be considered.

      1. Accented letters (dagesh)

      In each font, the position of the accent dot (dagesh) in an accented peh (peh
      dagushah) depends on the degree to which the peh is closed. If it is highly
      closed, the dot must be centered inside the closed area. If the peh is
      relatively open, most fonts place the dot on the line that would close the
      peh if it were extended. The letter tet has a shape reminiscent of peh but
      rotated clockwise by 90 degrees, so the accent dot in the tet cannot always
      have the same vertical position relative to the baseline as the dot in the
      peh. A font with a highly closed peh will place the dot in the accented peh
      above the midline and the dot in the accented tet below the midline. The
      accent in the vav, by contrast, rarely varies from the midline in any font.
      It must, however, be placed to the left of the vav, which might or might not
      require widening the space allotted to the vav.
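
      One way to express this font dependence is a per-font, per-letter anchor
      table filled in by the font designer. The following is a minimal sketch
      under that assumption; the class, its methods, and the idea of storing the
      anchor as an {x, y} pair are hypothetical and not part of any existing JDK
      API. The lookup deliberately returns null when no anchor was supplied,
      rather than falling back to the center of the glyph.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-font table of dagesh anchor points, keyed by font name and base letter. */
final class DageshAnchors {
    private final Map<String, double[]> anchors = new HashMap<>();

    private static String key(String fontName, char letter) {
        return fontName + '/' + letter;
    }

    /** Registers the designer-chosen anchor (in glyph units) for one letter of one font. */
    void put(String fontName, char letter, double x, double y) {
        anchors.put(key(fontName, letter), new double[] {x, y});
    }

    /** Returns the {x, y} anchor, or null if this font supplies none for the letter. */
    double[] lookup(String fontName, char letter) {
        return anchors.get(key(fontName, letter));
    }
}
```

      With such a table, peh (U+05E4), tet (U+05D8), and vav (U+05D5) can each
      carry a different anchor in each font, which is what the cases above call for.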

      2. Special case letters

      For all fonts, the horizontal position of a holam-haser dot after the letter
      lamed must take into account the vertical rise of the lamed on its left side
      and not coincide with the lamed. The holam-haser after a lamed is therefore
      usually placed over the right corner of the letter following the lamed. For
      all other letters the holam-haser can be placed over the upper left corner of
      the letter.

      The shva after an unaccented chaf in word-final position is usually placed
      above the baseline.

      Below-baseline punctuation marks for an eyin that extends below the baseline
      are displayed to the right of the extension.
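
      The three special cases above can be sketched as a rule selector that runs
      before any font-specific positioning. This is an illustrative sketch only;
      the class, the enum constants, and the two boolean parameters are
      assumptions. Whether a letter extends below the baseline depends on the
      font, so the caller must supply that fact.

```java
/** Hypothetical selector for the special-case rules listed in this report. */
final class SpecialCasePlacement {
    static final char LAMED = '\u05DC';
    static final char FINAL_KAF = '\u05DA';
    static final char AYIN = '\u05E2';
    static final char HOLAM = '\u05B9';
    static final char SHEVA = '\u05B0';

    enum Rule {
        DEFAULT,
        HOLAM_OVER_NEXT_LETTER_RIGHT_CORNER,
        SHEVA_ABOVE_BASELINE,
        BELOW_MARK_RIGHT_OF_DESCENDER
    }

    /** Hebrew points normally drawn under the letter: U+05B0..U+05B8 and U+05BB. */
    static boolean isBelowBaselineMark(char mark) {
        return (mark >= '\u05B0' && mark <= '\u05B8') || mark == '\u05BB';
    }

    static Rule ruleFor(char base, char mark, boolean baseHasDagesh, boolean baseDescends) {
        if (base == LAMED && mark == HOLAM) {
            return Rule.HOLAM_OVER_NEXT_LETTER_RIGHT_CORNER;
        }
        if (base == FINAL_KAF && mark == SHEVA && !baseHasDagesh) {
            return Rule.SHEVA_ABOVE_BASELINE;
        }
        if (base == AYIN && baseDescends && isBelowBaselineMark(mark)) {
            return Rule.BELOW_MARK_RIGHT_OF_DESCENDER;
        }
        return Rule.DEFAULT;
    }
}
```

      The selector only names the rule that applies. The actual coordinates still
      have to come from the font.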

      3. Punctuation below the baseline

      The rest of the punctuation marks are usually displayed below the baseline.
      These marks must be horizontally centered under the visual center of the
      letters they modify. When reader's marks (taame hamikra) are also present,
      the punctuation and the reader's marks must be horizontally concatenated,
      and the resulting string must be horizontally centered under the visual
      center of the modified letter.
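
      The centering rule above amounts to simple arithmetic once the visual
      center of the letter is known. The following is a minimal sketch; the class
      name, the method name, and the `gap` parameter (space between concatenated
      marks) are assumptions. The widths are given in left-to-right visual order,
      and how the visual center is obtained from the font is left open.

```java
/** Hypothetical layout of one or more marks centered under a letter's visual center. */
final class BelowBaselineLayout {

    /**
     * Returns the left edge of each mark so that the whole run of marks,
     * separated by gap, is centered on visualCenterX.
     */
    static double[] leftEdges(double visualCenterX, double[] markWidths, double gap) {
        double total = gap * Math.max(0, markWidths.length - 1);
        for (double w : markWidths) {
            total += w;
        }
        double[] xs = new double[markWidths.length];
        double x = visualCenterX - total / 2;
        for (int i = 0; i < markWidths.length; i++) {
            xs[i] = x;
            x += markWidths[i] + gap;
        }
        return xs;
    }
}
```

      For example, marks of width 4 and 2 with a gap of 1 under a visual center
      at x = 10 occupy the span from 6.5 to 13.5, whose midpoint is 10.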

            dougfelt Doug Felt
            vrosenmasunw Victor Rosenman (Inactive)