Loading PNGs into OpenGL performance issues - Java & JOGL much slower than C# & Tao.OpenGL

Question

I am noticing a large performance difference between Java & JOGL and C# & Tao.OpenGL when both loading PNGs from storage into memory, and when loading that BufferedImage (java) or Bitmap (C# - both are PNGs on hard drive) 'into' OpenGL.

This difference is quite large, so I assumed I was doing something wrong, however after quite a lot of searching and trying different loading techniques I've been unable to reduce this difference.

With Java I get an image loaded in 248ms and loaded into OpenGL in 728ms The same on C# takes 54ms to load the image, and 34ms to load/create texture.

The image in question above is a PNG containing transparency, sized 7200x255, used for a 2D animated sprite. I realise the size is really quite ridiculous and am considering cutting up the sprite, however the large difference is still there (and confusing).

On the Java side the code looks like this:

BufferedImage image = ImageIO.read(new File(fileName));
texture = TextureIO.newTexture(image, false);
texture.setTexParameteri(GL.GL_TEXTURE_MIN_FILTER, GL.GL_LINEAR);
texture.setTexParameteri(GL.GL_TEXTURE_MAG_FILTER, GL.GL_LINEAR);

The C# code uses:

Bitmap t = new Bitmap(fileName);

t.RotateFlip(RotateFlipType.RotateNoneFlipY);
Rectangle r = new Rectangle(0, 0, t.Width, t.Height);

BitmapData bd = t.LockBits(r, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);

Gl.glBindTexture(Gl.GL_TEXTURE_2D, tID);
Gl.glTexImage2D(Gl.GL_TEXTURE_2D, 0, Gl.GL_RGBA, t.Width, t.Height, 0, Gl.GL_BGRA, Gl.GL_UNSIGNED_BYTE, bd.Scan0);
Gl.glTexParameteri(Gl.GL_TEXTURE_2D, Gl.GL_TEXTURE_MIN_FILTER, Gl.GL_LINEAR);
Gl.glTexParameteri(Gl.GL_TEXTURE_2D, Gl.GL_TEXTURE_MAG_FILTER, Gl.GL_LINEAR);

t.UnlockBits(bd);
t.Dispose();

After quite a lot of testing I can only come to the conclusion that Java/JOGL is just slower here - PNG reading might not be as quick, or that I'm still doing something wrong.

Thanks.

Edit2:

I have found that creating a new BufferedImage with format TYPE_INT_ARGB_PRE decreases OpenGL texture load time by almost half - this includes having to create the new BufferedImage, getting the Graphics2D from it and then rendering the previously loaded image to it.

Edit3: Benchmark results for 5 variations. I wrote a small benchmarking tool, the following results come from loading a set of 33 pngs, most are very wide, 5 times.

testStart: ImageIO.read(file) -> TextureIO.newTexture(image)  
result: avg = 10250ms, total = 51251  
testStart: ImageIO.read(bis) -> TextureIO.newTexture(image)  
result: avg = 10029ms, total = 50147  
testStart: ImageIO.read(file) -> TextureIO.newTexture(argbImage)  
result: avg = 5343ms, total = 26717  
testStart: ImageIO.read(bis) -> TextureIO.newTexture(argbImage)  
result: avg = 5534ms, total = 27673  
testStart: TextureIO.newTexture(file)  
result: avg = 10395ms, total = 51979

ImageIO.read(bis) refers to the technique described in James Branigan's answer below. argbImage refers to the technique described in my previous edit:

img = ImageIO.read(file);
argbImg = new BufferedImage(img.getWidth(), img.getHeight(), TYPE_INT_ARGB_PRE);
g = argbImg.createGraphics();
g.drawImage(img, 0, 0, null);
texture = TextureIO.newTexture(argbImg, false);

Any more methods of loading (either images from file, or images to OpenGL) would be appreciated, I will update these benchmarks.

user337677 · Accepted Answer

Short Answer The JOGL texture classes do quite a bit more than necessary, and I guess that's why they are slow. I run into the same problem a few days ago, and now fixed it by loading the texture with the low-level API (glGenTextures, glBindTexture, glTexParameterf, and glTexImage2D). The loading time decreased from about 1 second to "no noticeable delay", but I haven't done any systematic profiling.

Long Answer If you look into the documentation and source code of the JOGL TextureIO, TextureData and Texture classes, you notice that they do quite a bit more than just uploading the texture onto the GPU:

Handling of different image formats
Alpha premultiplication

I'm not sure which one of these is taking more time. But in many cases you know what kind of image data you have available, and don't need to do any premultiplication.

The alpha premultiplication feature is anyway completely misplaced in this class (from a software architecture perspective), and I didn't find any way to disable it. Even though the documentation claims that this is the "mathematically correct way" (I'm actually not convinced about that), there are plenty of cases in which you don't want to use alpha premultiplication, or have done it beforehand (e.g. for performance reasons).

After all, loading a texture with the low-level API is quite simple unless you need it to handle different image formats. Here is some scala code which works nicely for all my RGBA texture images:

val textureIDList = new Array[Int](1)
gl.glGenTextures(1, textureIDList, 0)
gl.glBindTexture(GL.GL_TEXTURE_2D, textureIDList(0))
gl.glTexParameterf(GL.GL_TEXTURE_2D, GL.GL_TEXTURE_MIN_FILTER, GL.GL_LINEAR)
gl.glTexParameterf(GL.GL_TEXTURE_2D, GL.GL_TEXTURE_MAG_FILTER, GL.GL_LINEAR)
val dataBuffer = image.getRaster.getDataBuffer   // image is a java.awt.image.BufferedImage (loaded from a PNG file)
val buffer: Buffer = dataBuffer match {
  case b: DataBufferByte => ByteBuffer.wrap(b.getData)
  case _ => null
}
gl.glTexImage2D(GL.GL_TEXTURE_2D, 0, GL.GL_RGBA, image.getWidth, image.getHeight, 0, GL.GL_RGBA, GL.GL_UNSIGNED_BYTE, buffer)

...

gl.glDeleteTextures(1, textureIDList, 0)

James Branigan · Answer

I'm not sure that it will completely close the performance gap, but you should be able to use the ImageIO.read method that takes a InputStream and pass in a BufferedInputStream wrapping a FileInputStream. This should greatly reduce the number of native file I/O calls that the JVM has to perform. It would look like this:

File file = new File(fileName);
FileInputStream fis = new FileInputStream(file);
BufferedInputStream bis = new BufferedInputStream(fis, 8192); //8K reads
BufferedImage image = ImageIO.read(bis);

whatnick · Answer

Have you looked into JAI (Java Advanced Imaging) by any chance, it implements native acceleration for tasks such as png compressing/decompression. The Java implementation of PNG decompression may be the issue here. Which version of jvm are you using ?

I work with applications which load and render thousands of textures, for this I use a pure Java implementation of DDS format - available with NASA WorldWind. DDS Textures load into GL faster since it is understood by the graphics card.

I appreciate your benchmarking and would like to use your experiments to test out DDS load times. Also tweak the memory available to JAI and JVM to allow loading of more segments and decompression.

Loading PNGs into OpenGL performance issues - Java & JOGL much slower than C# & Tao.OpenGL

Tags:

java

performance

png

opengl

jogl

Ed Cresswell

3 Answers

user337677

James Branigan

whatnick

Recent Activity

Donate For Us

Loading PNGs into OpenGL performance issues - Java & JOGL much slower than C# & Tao.OpenGL

Tags:

java

performance

png

opengl

jogl

Ed Cresswell

3 Answers

user337677

James Branigan

whatnick

Related questions

Recent Activity

Donate For Us