
String width via fontmetrics calculation is very slow if there are arabic or persian letters in text

I have a problem: my application's interface works much slower when it displays text in Eastern languages. I noticed it especially in components such as JList, JComboBox, and JTable.

I found that the FontMetrics.stringWidth method is very slow (500+ times slower) if the text contains at least one Arabic or Persian letter. As far as I know, this method is used widely in various Swing components.

Is there a way to boost this method's performance?

Here is the example class which demonstrates the problem:

import java.awt.Font;
import java.awt.FontMetrics;
import java.awt.Graphics;
import java.awt.image.BufferedImage;

public class FontMetricsSpeedTest
{
    public static void main(String args[]) {
        String persian = "صصصصصصصصصصصصصصصصصصصصص";
        String english = "abcde()agjklj;lkjelwk";
        FontMetrics fm = createFontMetrics(new Font("dialog", Font.PLAIN, 12));
        int size = 50000;

        // Measure the Persian string repeatedly.
        long start = System.currentTimeMillis();
        for (int i = 0; i < size; i++) {
            fm.stringWidth(persian);
        }
        System.out.println("Calculation time for persian: " + (System.currentTimeMillis() - start) + " ms");

        // Measure an English string of the same length repeatedly.
        start = System.currentTimeMillis();
        for (int i = 0; i < size; i++) {
            fm.stringWidth(english);
        }
        System.out.println("Calculation time for english: " + (System.currentTimeMillis() - start) + " ms");
    }

    // Obtains a FontMetrics instance without a visible component,
    // using a throwaway 1x1 image.
    private static FontMetrics createFontMetrics(Font font)
    {
        BufferedImage bi = new BufferedImage(1, 1, BufferedImage.TYPE_INT_ARGB_PRE);
        Graphics g = bi.getGraphics();
        FontMetrics fm = g.getFontMetrics(font);
        g.dispose();
        return fm;
    }
}

For me it gives the following output:

Calculation time for persian: 5482 ms

Calculation time for english: 11 ms

asked Dec 02 '10 by Eugene Popovich

2 Answers

I've dug around a little and found the following.

From the source of FontDesignMetrics we can see the main sequence of actions:

public int stringWidth(String str) {
    float width = 0;
    if (font.hasLayoutAttributes()) {
        /* TextLayout throws IAE for null, so throw NPE explicitly */
        if (str == null) {
            throw new NullPointerException("str is null");
        }
        if (str.length() == 0) {
            return 0;
        }
        width = new TextLayout(str, font, frc).getAdvance();
    } else {
        int length = str.length();
        for (int i = 0; i < length; i++) {
            char ch = str.charAt(i);
            if (ch < 0x100) {
                width += getLatinCharWidth(ch);
            } else if (FontManager.isNonSimpleChar(ch)) {
                width = new TextLayout(str, font, frc).getAdvance();
                break;
            } else {
                width += handleCharWidth(ch);
            }
        }
    }
    return (int) (0.5 + width);
}

For Latin characters the method getLatinCharWidth(ch) is used; it caches all the character widths. But for Persian and Arabic characters TextLayout is used instead, mainly because Eastern characters may have different shapes and widths depending on context. It would be possible to add a method that caches character widths, but it would not give exact values: it would ignore the nuances of context-dependent character widths as well as the various ligatures. A rough sketch of such a cache follows below.
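Purely to illustrate that trade-off, here is a minimal sketch of such a per-character cache built on top of FontMetrics.charWidth. The class name and structure are my own invention; it just sums cached single-character advances, so it ignores contextual shaping and ligatures, and its result can differ from stringWidth for Arabic or Persian text.

import java.awt.FontMetrics;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: approximate string width by summing cached
// per-character advances. Fast, but inexact for scripts where glyph
// shape and width depend on the surrounding characters.
public class CachedWidthEstimator {
    private final FontMetrics fm;
    private final Map<Character, Integer> cache = new HashMap<>();

    public CachedWidthEstimator(FontMetrics fm) {
        this.fm = fm;
    }

    public int approximateWidth(String s) {
        int width = 0;
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            Integer w = cache.get(ch);
            if (w == null) {
                w = fm.charWidth(ch);   // measured once, then reused
                cache.put(ch, w);
            }
            width += w;                 // ignores ligatures and contextual forms
        }
        return width;
    }
}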

I've tested TextLayout separately, and it is slow for both languages, English and Persian. So the real cause of the poor performance is the slow work of the java.awt.font.TextLayout class, which is used to determine the string width when the string contains characters that are not "simple". Unfortunately, I don't know how to boost TextLayout's performance for now. A reconstruction of this separate test is sketched below.
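For reference, here is a minimal sketch of such a separate measurement (my own reconstruction, not necessarily the exact test described above): it times TextLayout.getAdvance() for the same two strings from the question.

import java.awt.Font;
import java.awt.font.FontRenderContext;
import java.awt.font.TextLayout;

// Rough timing of TextLayout.getAdvance() alone, to show that the
// layout step itself is costly regardless of the script.
public class TextLayoutSpeedTest {
    public static void main(String[] args) {
        String persian = "صصصصصصصصصصصصصصصصصصصصص";
        String english = "abcde()agjklj;lkjelwk";
        Font font = new Font("dialog", Font.PLAIN, 12);
        FontRenderContext frc = new FontRenderContext(null, false, false);

        time("persian", persian, font, frc);
        time("english", english, font, frc);
    }

    private static void time(String name, String text, Font font, FontRenderContext frc) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < 50000; i++) {
            new TextLayout(text, font, frc).getAdvance();
        }
        System.out.println("TextLayout time for " + name + ": "
                + (System.currentTimeMillis() - start) + " ms");
    }
}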

If somebody is interested, here is an article about various font and text layout nuances: http://download.oracle.com/javase/1.4.2/docs/guide/2d/spec/j2d-fonts.html

answered by Eugene Popovich


I performed some tests with other languages using your code. First, you are right: calculating the width of the Persian string took a lot of time.

I played with the font type and size and did not see significant differences, but the result definitely depends on the script you are using. Here are the results I got on my machine:

Calculation time for Persian: 2877 ms
Calculation time for English: 8 ms
Calculation time for Russian: 47 ms
Calculation time for Hebrew:  16815 ms

As you can see, Russian is about 6 times slower than English. I believe this is because of the Unicode representation of strings: in UTF-8, English characters occupy one byte, while all others take two or more bytes.

I am not sure this will satisfy you :) but the Hebrew test is almost 6 times slower than the Persian one. Both are slow, so I guess the right-to-left calculations are what kill it.
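As a side note, here is a small sketch (my own, written against the stringWidth() source quoted in the first answer, with sample words I picked) that prints the code-point range of a word from each script. Latin letters fall below 0x100 and hit the cached path; Cyrillic sits above 0x100 but apparently is still handled per character; Arabic and Hebrew appear to be treated as "non-simple" and force a full TextLayout, which would match the timings above.

// A small check of the code-point ranges involved, to relate the timings
// above to the branches in the quoted stringWidth() source:
//   ch < 0x100                        -> cached getLatinCharWidth()
//   ch >= 0x100 and "simple"          -> per-character handleCharWidth()
//   "non-simple" (Arabic, Hebrew, ...)-> full TextLayout of the string
public class ScriptRangeCheck {
    public static void main(String[] args) {
        print("English", "abcde");
        print("Russian", "привет");
        print("Persian", "صصصصص");
        print("Hebrew",  "שלום");
    }

    private static void print(String name, String s) {
        char min = Character.MAX_VALUE, max = 0;
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch < min) min = ch;
            if (ch > max) max = ch;
        }
        System.out.printf("%-8s U+%04X .. U+%04X%n", name, (int) min, (int) max);
    }
}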

It seems there is nothing we can do about this.

answered by AlexR