Tuesday 27 March 2012

Monkey Tip: Relative Target Performance

I've previously posted about how Android (at least on my phone) is a platform that presents performance challenges when doing cross-platform development with Monkey*. This is true, but I wanted to post something more general about cross-platform performance. So, here it is.

 *This isn't specific to Monkey, of course, but the comfortable abstraction that Monkey provides means that it's more tempting to believe that all the platforms are the same than if you were writing your own cross-platform libraries.

Box2D


First, let's look at an example of an Android performance gap. Here are the timings, in milliseconds, for the Box2D domino pyramid performance test. They show the results for a run of 300 updates, ignoring all rendering and other per-frame costs.

HTML5(Chrome 17)
total: 3334, low: 8, high: 18, avg: 11.07641196013289

XNA(Windows)
total: 764, low: 0, high: 16, avg: 2.538206

Android(ZTE Blade running 2.3.7)
total: 43988, low: 109, high: 295, avg: 146.13954

That's right. No joke. The phone is roughly 13x slower than JavaScript in Chrome and nearly 60x slower than an XNA build running on my modest laptop (2GHz T4200, Intel GM45 chipset, if you feel the need to know). You really need to be careful about making any judgements about how fast your app will be on a phone if you're taking advantage of the rapid turnaround on another target.
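Those ratios come straight from the totals above. As a quick check, here is a minimal Python sketch (not the Monkey test code; the dictionary and the `slowdown` helper are names made up for illustration) that divides one total by another:

```python
# Totals in ms for the Box2D run, copied from the results above.
totals_ms = {
    "HTML5 (Chrome 17)": 3334,
    "XNA (Windows)": 764,
    "Android (ZTE Blade)": 43988,
}


def slowdown(slow: str, fast: str) -> float:
    """Return how many times longer `slow` took than `fast`."""
    return totals_ms[slow] / totals_ms[fast]


android = "Android (ZTE Blade)"
for target in ("HTML5 (Chrome 17)", "XNA (Windows)"):
    print(f"{android} vs {target}: {slowdown(android, target):.1f}x")
# Android (ZTE Blade) vs HTML5 (Chrome 17): 13.2x
# Android (ZTE Blade) vs XNA (Windows): 57.6x
```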

However, that Box2D test is all about crunching numbers, constructing objects and other CPU/VM-heavy activities. For many games the rendering performance impact is going to outweigh the computational costs and so it's useful to have an idea of how things relate there.

Rendering Comparison


Here's an example for my current project. I was interested in being able to "fog out" objects to simulate distance (or, hey, actual fog). The simplest mechanism I could think of for doing this was to render a rectangle with a suitable alpha over the top of something that I wanted to be dimmed.

Before I went with that as a technique I wanted to understand what the costs might be across the platforms, so I whipped up a test scenario and got out my measuring stick/timing class. The test was simply to render a 400x400 rectangle with a 0.1 alpha 1000 times, repeat that enough to get a decent average and, as far as possible, isolate the actual cost of the rectangle rendering from everything else.
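The shape of that kind of test can be sketched in a few lines. This is a Python illustration under stated assumptions, not the Monkey timing class actually used: `time_batches`, `fake_draw_rect` and `no_op` are hypothetical names, and the "draw call" is a stand-in that only burns a little CPU. Timing an empty batch alongside the real one gives a rough idea of how much of the figure is loop overhead rather than drawing.

```python
import time
from typing import Callable, Dict


def time_batches(work: Callable[[], None], calls_per_batch: int, batches: int) -> Dict[str, float]:
    """Time `batches` runs of `calls_per_batch` calls to `work`; all results in ms."""
    samples = []
    for _ in range(batches):
        start = time.perf_counter()
        for _ in range(calls_per_batch):
            work()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "total": sum(samples),
        "low": min(samples),
        "high": max(samples),
        "avg": sum(samples) / len(samples),
    }


def fake_draw_rect() -> None:
    # Stand-in for the real draw call (assumption: no actual rendering happens here).
    sum(range(50))


def no_op() -> None:
    # Empty workload, used to estimate the overhead of the timing loop itself.
    pass


drawn = time_batches(fake_draw_rect, calls_per_batch=1000, batches=20)
baseline = time_batches(no_op, calls_per_batch=1000, batches=20)
print(f"avg per 1000 calls: {drawn['avg']:.3f} ms (loop overhead about {baseline['avg']:.3f} ms)")
```

On a real target the draw calls would also need to be flushed to the screen before the clock is stopped, otherwise the timer may only measure how long it takes to queue the work.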

Here are the average results for the 1000 rectangles:

Flash Player 11 (in Chrome): 1360ms
HTML5 (IE9): 425ms
HTML5 (Firefox 12): 240ms
HTML5 (Chrome 17): 215ms

XNA (Windows): 250ms
GLFW (Windows): 260ms

Android (ZTE Blade): 794ms

On the web side that result for Flash was a bit of a shocker. I'm not sure if it represents the true poor performance of Flash in this case or it's a problem with Monkey's translation. The FF and Chrome JS/Canvas engines are reasonably close together, but IE seems to be off on its own somewhere more relaxed.

The XNA and GLFW figures give a view of a "proper" desktop target. Oddly, they're marginally slower than the fastest canvas targets. I honestly don't know why that would be the case. I guess it's possible that the canvas rendering is as fast as the windowed XNA and OGL paths and maybe v-synching explains the rest, but it doesn't really add up.

As could be expected, my little Android phone is some way behind the beefier platforms but it's only 3-4x slower, which is far, far closer than in the Box2D update comparison. With these numbers I can be reasonably confident that I'm not digging a hole for myself by using a few fog rectangles around the place.

The other thing I did was to quickly test some variations in the number and size of rectangles to see if these times could be generalised to fill-rates. I couldn't raise the enthusiasm to draw up some graphs but perhaps you'll take my word for it that they pretty much can. Render half as many rectangles? Half the time. Render twice as many at a quarter of the area? Half the time. So I can, as a rule of thumb, figure that I can fill a screen's worth of my 800x480 phone with alpha rectangles in under 2ms. That's a handy thing to be able to bear in mind.
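The arithmetic behind that rule of thumb, as a small Python sketch. It assumes the time scales linearly with the number of pixels filled, which is only what the informal tests above suggested; the constant names are made up for illustration.

```python
# Figures from the Android rectangle test above.
RECT_W, RECT_H = 400, 400
RECT_COUNT = 1000
ANDROID_MS = 794

# Screen size of the phone.
SCREEN_W, SCREEN_H = 800, 480

pixels_filled = RECT_W * RECT_H * RECT_COUNT  # 160,000,000 pixels in 794 ms
ms_per_pixel = ANDROID_MS / pixels_filled

# Assumption: cost is proportional to pixels filled (a pure fill-rate model).
screen_ms = ms_per_pixel * SCREEN_W * SCREEN_H
print(f"one full-screen alpha fill: about {screen_ms:.2f} ms")
# one full-screen alpha fill: about 1.91 ms
```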

As We're Here, What About Bitmaps?


While I had the code all set up I thought I might as well check to see what the relative cost of rendering a bitmap with embedded alpha was (like a smoke texture). So here's the same test but drawing a 400x400 semi-transparent image:

Flash Player 11 (in Chrome): 475ms
HTML5 (IE9): 2970ms
HTML5 (Firefox 12): 625ms
HTML5 (Chrome 17): 550ms

XNA (Windows): 273ms
GLFW (Windows): 275ms

Android (ZTE Blade): 1590ms

The fact that Flash is considerably faster at rendering this test makes me more suspicious that the rectangle rendering is doing something odd. Worth investigating some time.

Both FF and Chrome are about 2.5 times slower at rendering bitmaps than rectangles, but look at IE! That's really slow, about 7 times slower than its own rectangle result. So slow in fact that it declared the page unresponsive a few times. The world of JS engines and Canvas implementations continues to offer excitement, mystery and intriguing inconsistency to the unwary dev (and the wary ones, in fact).

The XNA and GLFW numbers barely change from the rectangle rendering. The Android time has doubled, but it's in line with the hit on HTML5, so we can still have a rough feel for what we can get away with if we're mostly building to HTML5 on Chrome or FF.
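To see the bitmap hit per target at a glance, the two sets of results can be put side by side. This is a minimal Python sketch using only the numbers listed above; `results_ms` and `bitmap_penalty` are illustrative names.

```python
# (rectangle ms, bitmap ms) per target, copied from the two result lists above.
results_ms = {
    "Flash Player 11 (in Chrome)": (1360, 475),
    "HTML5 (IE9)": (425, 2970),
    "HTML5 (Firefox 12)": (240, 625),
    "HTML5 (Chrome 17)": (215, 550),
    "XNA (Windows)": (250, 273),
    "GLFW (Windows)": (260, 275),
    "Android (ZTE Blade)": (794, 1590),
}

# Ratio above 1 means bitmaps are slower than rectangles on that target.
bitmap_penalty = {name: bitmap / rect for name, (rect, bitmap) in results_ms.items()}

for name, ratio in sorted(bitmap_penalty.items(), key=lambda item: item[1]):
    print(f"{name:<30} {ratio:.2f}x")
```

Sorted this way, Flash is the only target where bitmaps are cheaper than rectangles, the desktop targets sit just above 1x, Android lands at about 2x and IE9 at roughly 7x.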

The Conclusion Bit


What I take away from this is that it pays to test the impact of a specific technique across the platforms you care about before you tie yourself to it. You can't assume that relative performance between targets in one area will hold in another. In the case of HTML5 you can't even trust that the same target will behave consistently from one browser to the next. Measure twice, cut once and all that.

2 comments:

  1. One point: HTML5, Flash and GLFW are all running on the same machine (CPU+GPU+RAM+OS), I suppose, while Android is only one device.
    To me it's not very safe to say that 'Android is slower than'...
    It is slower on THAT device. You would need to add results from OTHER Android devices (with the same or different hardware specs) to get a complete picture of the situation.
    Then we/you could understand where the differences come from...

    Cheers

  2. I give the specs of the PC and the model of the phone so that people have some idea of the hardware/OS versions involved. I wasn't aiming or claiming to demonstrate that Android is slower as an OS on equivalent hardware. The intention of the post was to show the variation between targets and the need to be cautious when developing on one for release on another.

    At the time this was posted it was commonly claimed in the Monkey community that HTML5 was the "lowest performing" target so I was keen to show how wrong this view could be.
