When milliseconds matter.
by Jon Frisby · 05/01/2006 (12:33 am) · 5 comments
At my day job, I am the Software Architect in charge of Business Intelligence. Basically what this means is I gather millions of pieces of information ("facts") and count them up. It is very important that this process happen as quickly as possible, because these systems are used to make decisions about where to spend our money and where to send traffic that we've purchased. The difference between timely data and late data is the difference between making a profit and losing money.
In this scenario milliseconds matter, because they add up quickly. An operation that takes 20ms may happen 1,000,000 times in a day. But even so, I never look at the efficiency of an individual operation, because I'm working in terms of SQL and simply don't have much room to make a single operation faster. Instead I focus on doing less work: rather than doing an operation 1,000,000 times a day, I try to find ways to make it happen 100,000 times per day, through careful construction of queries, indexes, and so forth. I have never once in the past 5 years had to look at an operation in terms of *milliseconds* -- always in terms of minutes or hours.
In building my game, I have not paid particular attention to performance because that's something I can worry about later. As Donald Knuth put it: "Premature optimization is the root of all evil." This strategy worked fine. Until one of the composers working on my game got ahold of it.
As astonishing and incomprehensible as it may seem, an operation that is performed twice in the space of a couple of minutes was causing a horrible user experience because it took too long. Specifically, it was taking 50-65 milliseconds (on a 1.33GHz G4 PowerBook).
The composer offered to work around the problem by making the "danger" track (explanation in next paragraph) "less rhythmically sensitive", but I have two problems with that: 1) Every workaround imposes limitations on what the artist can do. This leads to work that is not necessarily as good as the artist could otherwise do. 2) Where does that leave the less experienced composers working on the project who don't do this professionally and don't know how to work around such limitations? So I had to do SOMETHING.
Here's the problem: I need to start two tracks playing simultaneously, with one essentially muted out. The second track gradually fades in and out as the board fills up/empties, and is meant to add tension. What's the big deal about 50ms? For a song in 4/4 time at 120BPM, that's almost a 32nd note (a quarter note is 500ms, so a 32nd note is 62.5ms). It would be fine if the difference in start times were fixed (just shift the main track back by a 32nd note), but in all likelihood it will be radically different from one machine to the next.
I made several attempts at improving things. First, I moved the two successive calls to alxPlay into C++ (reducing the work done by the CPU by removing the overhead of script-language interpretation), using my wrapper function for alxPlay that looks up the AudioProfile object by name. This made a slight improvement. Next, I rewrote it directly, doing the lookups and creating the transform first, then calling alxPlay twice in immediate succession. This made a very, very slight improvement. Finally, I sat down and actually looked at what alxPlay was doing that took so long, and zeroed in on alxCreateSource. This function does a HUGE amount of work. An absolutely IMMENSE amount of work (relatively speaking).
All of the overhead -- to the last millisecond -- is in this function. So, my last-ditch effort to fix the problem is to do the alxCreateSource calls first, then, when both are finished, call alxPlay with AUDIOHANDLEs instead of AudioProfiles. This doesn't reduce the amount of work done; it just shifts it so it's all up-front, not in between the events I care about. It's a cheat, but it should work.
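The split can be sketched roughly like this. The `alxCreateSource`/`alxPlay` functions here are hypothetical stand-ins stubbed out so the sketch is self-contained; the real engine calls do far more work (the log just records call order):

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for Torque's alxCreateSource/alxPlay.
using AUDIOHANDLE = int;
static std::vector<std::string> g_log;  // records call order for the sketch

AUDIOHANDLE alxCreateSource(const std::string& profile) {
    g_log.push_back("create:" + profile);          // the expensive step
    return static_cast<AUDIOHANDLE>(g_log.size());
}

void alxPlay(AUDIOHANDLE h) {
    (void)h;
    g_log.push_back("play");                       // the cheap step
}

void startTracksInSync() {
    // All the heavy lifting happens up-front...
    AUDIOHANDLE mainTrack   = alxCreateSource("MainTrack");
    AUDIOHANDLE dangerTrack = alxCreateSource("DangerTrack");
    // ...so the two play calls land back-to-back with a minimal gap.
    alxPlay(mainTrack);
    alxPlay(dangerTrack);
}
```

The total work is unchanged; only its placement relative to the two play calls moves.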
Of course, this leaves me with two problems:
1) My music now lags 1/16th note (2 1/32nd notes) behind the "sweeper" object. If this is noticeable, I may have to do something about it, but for now I am ignoring it.
2) My sound effects, which are supposed to be played on-beat, will lag 1/32nd note behind the music. Actually, possibly more because it may be cumulative with #1. I will very likely have no choice but to do something about this.
The solution to both problems will likely involve asynchronously preparing and caching AUDIOHANDLEs for all the sound effects for the next skin. I'm not entirely certain how feasible that will be, so for now I'm hoping against hope that it isn't a problem and I can ignore it.
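A sketch of what such a cache might look like. `EffectCache`, its `prepare` step, and the handle values are all hypothetical stand-ins for the engine's real calls; the point is only that the expensive create happens at load time and playback becomes a lookup:

```cpp
#include <map>
#include <string>

using AUDIOHANDLE = int;

// Hypothetical cache: prepare a handle per effect while the next skin
// loads, so playing an effect later is just a lookup, not a create.
class EffectCache {
    std::map<std::string, AUDIOHANDLE> handles;
    AUDIOHANDLE next = 1;
public:
    void prepare(const std::string& effect) {
        // Stand-in for the expensive alxCreateSource call.
        handles[effect] = next++;
    }
    // Returns 0 if the effect was never prepared.
    AUDIOHANDLE get(const std::string& effect) const {
        auto it = handles.find(effect);
        return it == handles.end() ? 0 : it->second;
    }
};
```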
Of course all of this is nothing compared to the problems I'm facing with Theora video.
Problem #1: Making it loop *smoothly* is proving to be problematic, to say the least. Tom tells me that it is likely a fundamental characteristic of the current Theora decoder code, making the problem intractable. (I cannot afford to pay someone to rewrite the Theora decoder!) If I cannot solve this problem, I will likely have to drop one of the skins we've been working on, and any future skins that also would have depended on it. I'd also be out hundreds of dollars (a non-trivial percentage of my budget!) for the art asset that has been produced already as a video, and Tom's time.
Problem #2: Even lowering the resolution to 320x240, it takes a HUGE amount of CPU time to play Theora video. Framerates on a MacBook Pro go from 200-300 FPS down to ~50 FPS when the Theora video is playing. Working around this may mean having an "enhanced" version of the game for higher-end systems, or a "lite" version for lower-end systems. On my 1.33GHz G4 PowerBook, it's about 15-20 FPS when the video is playing. Much less if I increase the resolution to something where you can actually see the fine details of the video.
On the plus side: "Jane" has had her "one small step for man" moment with the Level Builder! Soon she should be crankin' away on skins. Also, the aforementioned composer gave me a rough cut of his first track and it's really, really, really good. AND, a second composer on the project gave me a rough cut of her first track, along with the danger track, and it's also really good! Now if I could just whip *my* track into shape (or convince someone else to come up with a track for this particular skin) I'd be *almost* to where I need to be for a demo!
Oh yes: Looks like I'm going to be lucky to fit a 3-skin demo into a 15MB download, even after: 1) Properly packaging stuff so symlinks aren't followed, causing frameworks to triple in size, 2) Carefully compressing the audio as much as I can, and 3) Stripping darn near everything I'm not using out of the engine. It'll likely be closer to 12MB for the Windows version.
-JF
#2
Ian
05/01/2006 (4:17 am)
We do exactly this for Determinance: play two tracks simultaneously, fading one in and out depending on what's going on on-screen. It's a while since I wrote the code, but I just made a different alxPlay function which only starts playing once it's been called twice, loading everything up but only starting when everything's ready.
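The deferred-start wrapper Ian describes might look something like this; `alxPlayHandle` and `PairedPlay` are hypothetical names, and the stub only records what was started:

```cpp
#include <vector>

using AUDIOHANDLE = int;
static std::vector<AUDIOHANDLE> g_started;  // records what actually started

// Stand-in for the engine's play-by-handle call.
void alxPlayHandle(AUDIOHANDLE h) { g_started.push_back(h); }

// Each call queues a prepared track; nothing starts until the second
// call arrives, and then both start back-to-back.
struct PairedPlay {
    std::vector<AUDIOHANDLE> pending;
    void queue(AUDIOHANDLE h) {
        pending.push_back(h);
        if (pending.size() == 2) {
            for (AUDIOHANDLE p : pending) alxPlayHandle(p);
            pending.clear();
        }
    }
};
```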
#3
05/01/2006 (5:15 am)
The other weekend I spent some time working with embedded devices. Before this I'd never worked with tight timing constraints either, but that weekend was an eye-opener: 12.5ns per cycle, with most instructions taking 1-3 cycles to execute, and the need to generate a new pixel every 187.5ns leaves 15 cycles to decide whether a pixel needs plotting or not. Working within such a hard time constraint was certainly a new experience for me.
I read somewhere that you can perceive audio latency of 12ms and above. That's quite a target to have to aim for, considering that you may encounter extra latency through different types of sound card drivers. I know the older drivers for my Creative Live sound card had an unacceptable latency for realtime sounds (such as playing via a MIDI keyboard); I had to use third-party drivers that bypassed a lot of the buffering in order to get the latency to an inaudible level.
As James mentions, you can create/assign sources long before you use them. I'm pretty sure this is how the audio.cc code works, with the mSources array being pre-populated with a MAX_SOMETHING number of sources.
From what I've heard, your project is along similar lines to Lumines? I've never played that game but have heard a lot of people singing its praises, so I'll definitely be checking out your demo when you release it :)
#4
05/02/2006 (5:42 am)
@James: Thanks for the tip! I see myself using that in the VERY near future!
@Gary: I'm hoping to have a demo out Real Soon Now. I need to get rhythmic effects from one composer, music + rhythmic effects from another, make better menus, and get a Windows build going first though...
-JF
#5
09/14/2006 (4:47 am)
I need to synchronize multiple sound sources for a different reason: I want to play four-channel directional audio in Torque. Thanks to Robert Geiman's audio source modification to pause/seek audio, I am able to do the following trick:
// First initialize and play the sound sources
$sound1 = alxPlay(alxCreateSource(East, "0 -2 1.7"));
$sound2 = alxPlay(alxCreateSource(North, "2 0 1.7"));
$sound3 = alxPlay(alxCreateSource(West, "0 2 1.7"));
$sound4 = alxPlay(alxCreateSource(South, "-2 0 1.7"));
// Pause them all - unfortunately this causes an audible click since the sounds have just started playing :(
alxPauseAll();
// cue the tapes to the beginning
alxSeekToStreamPosition($sound1, 0);
alxSeekToStreamPosition($sound2, 0);
alxSeekToStreamPosition($sound3, 0);
alxSeekToStreamPosition($sound4, 0);
// start them all with a single script command
alxUnpauseAll();
I know this works with <2ms precision. I am using an Audigy II ZS with the latest OpenAL DLLs.
Robert Geiman's audio seek/pause mod can be found here:
http://www.garagegames.com/index.php?sec=mg&mod=resource&page=view&qid=9385

James Urquhart
I would suggest making a function that plays both of your sources at the same time, using alSourcePlayv.
e.g.:
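A minimal sketch of what that might look like. The `ALuint`/`ALsizei` types and the `alSourcePlayv` stub below mirror the real OpenAL declaration (which comes from `<AL/al.h>`) so the sketch compiles standalone; `playBoth` and the source IDs are hypothetical:

```cpp
#include <vector>

// Stubs mirroring OpenAL's types and alSourcePlayv signature.
using ALuint  = unsigned int;
using ALsizei = int;
static std::vector<ALuint> g_played;  // records what was started

void alSourcePlayv(ALsizei n, const ALuint* sources) {
    for (ALsizei i = 0; i < n; ++i) g_played.push_back(sources[i]);
}

// One call starts both sources, so the implementation can start them
// together instead of two script-level plays with a gap in between.
void playBoth(ALuint mainSrc, ALuint dangerSrc) {
    ALuint sources[2] = { mainSrc, dangerSrc };
    alSourcePlayv(2, sources);
}
```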
p.s. don't forget to read the OpenAL 1.1 Spec.