Reducing level load time
by Ian Omroth Hardingham · in Torque Game Engine · 11/30/2004 (1:34 pm) · 79 replies
Hey everyone.
To get a better understanding of what is loaded when in TGE, I'm trying to make a level load as quickly as possible. I've removed all the obvious datablocks, to the player shapes, sound etc, but the "Loading Objects" phase still takes a little time. Is this simply the time it takes to load a level, or is any kind of directory search for textures etc going on?
Any help much appreciated.
Ian
About the author
Designer and lead programmer on Frozen Synapse, Frozen Endzone, and Determinance. Co-owner of Mode 7 Games.
#62
09/01/2007 (8:40 pm)
Just to throw in a wild note: unfortunately we can't share the implementation, but Tim McClarren of Doppelganger came up with an amazing technique to speed up mission loading. The technique is to cache the datablock and mission information on the client. Missions which used to take over a minute to load now take twenty seconds or less. The high-level idea is to cache the bits which come over the wire for datablocks and scope-always objects, because they're always the same. But as always, the devil is in the details: to establish a checksum of the cache, anything sent in the cache must be 100% static, which means all your scope-always objects must be 100% static. This was a problem with several objects that changed state over the course of a server session, and with many objects which had uninitialized fields that were therefore initially random. It took a long time to track all of those down. In addition to that, there were many gotchas around the mission loading sequence and effects on packet window saturation. We had to get into some gnarly stuff. Then of course we pre-distribute the cache file with the client installer, so even the first-ever connection is speedy. It's pretty complex and not something I could undertake, but I thought I'd throw it out there as food for thought. This mechanism is currently in use in vSide.
Tim has also come up with another, orthogonal, technique for further speeding up mission load time, which is pre-loading assets for the mission before the user actually connects to the server, based on stochastic prediction of what the mission is likely to include. This obviously uses a bit more memory than typically required, and is only suitable when there's a small number of missions the user is likely to connect to. However, we're now anticipating that missions which used to take over a minute to load will come in at under ten seconds. Pretty cool. But it's all due to having a networking and general coding guru like Tim on the team.
Food for thought.
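To make the "100% static" requirement concrete, here is a minimal C++ sketch of why a cache checksum only stays stable when every packed field has a defined value. It is not the vSide implementation (which was not shared) and does not use any Torque API: StaticObjectState, packState and crc32Of are hypothetical names, and the checksum is a plain bitwise CRC-32 standing in for whatever the engine provides.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the networked state of a scope-always object.
// Every packed field has a default value, so nothing random gets serialized.
struct StaticObjectState {
    std::uint32_t shapeId = 0;
    std::uint8_t  flags   = 0;
};

// Append a 32-bit value in little-endian byte order.
void packU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
}

// Pack field by field, so struct padding never leaks into the stream.
std::vector<std::uint8_t> packState(const StaticObjectState& s) {
    std::vector<std::uint8_t> out;
    packU32(out, s.shapeId);
    out.push_back(s.flags);
    return out;
}

// Standard CRC-32 (reflected polynomial 0xEDB88320), bitwise version.
std::uint32_t crc32Of(const std::vector<std::uint8_t>& data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::uint8_t byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// Usage sketch: identical state packs to identical bytes, so checksums agree.
void exampleStableChecksum() {
    StaticObjectState serverRun1, serverRun2;
    assert(crc32Of(packState(serverRun1)) == crc32Of(packState(serverRun2)));
}
```

If a single packed field differed between two server runs (for example because it was never initialized), the two checksums would differ and every client cache would be rejected.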
#63
09/04/2007 (1:18 am)
Unfortunately the code from Daniel Eden doesn't work correctly. Like Ed Zavada here:
www.garagegames.com/index.php?sec=mg&mod=resource&page=view&qid=12879
I have exactly the same error.
Some missions load, some don't load at all and fail with a "Client .... packet error: Invalid packet.." error.
Has anyone figured out how to get it to work?
#64
09/04/2007 (3:40 pm)
The resource was put together for Torque 1.5. Using it for 1.5.2 or whatever is going to have issues.
#65
09/05/2007 (12:32 am)
I have now tested it with TGE 1.4.0 and a clean TGE 1.5.0 plus this resource. Ed Zavada uses 1.5.2 and has exactly the same issue, so it must be something else that has nothing to do with the TGE version.
Client 3408 packet error: Invalid packet.. ... CDROP: 3408 local
Edit: Ok, with this...
Quote:
This mostly works for me with TGE 1.5.2 (and several other mods applied). However I had to leave out the first change in sim/netGhost.cc, around line 958 in NetConnection::loadNextGhostAlwaysObject, where we were supposed to add isLocalConnection().
...it works now for both TGE 1.4 and 1.5 ;)
Are you sure this part should be in netGhost.cc?
#66
09/06/2007 (8:23 am)
Daniel's resource works fine for me on TGE 1.5 without any changes. I am still trying to find out how to speed up the client/server loading time and also save some traffic.
If I understand correctly, caching the ghost-always objects ("phase 2") is the more difficult part because they may change during runtime, and caching the "Phase 1 download datablocks & targets" should be easier?
In theory it could work like this:
* After server init, the server calls a function "save datablocks to binary file", which simulates the datablock transmission to a client; the result is used to build an MD5 key, for example.
* When a client connects, it sends the "MD5 key" of its current cache file and loads that file if the key matches the server version; otherwise it gets the current data and writes it to the cache file.
Would it work that way, or is it completely wrong? At first I thought this phase 1 would be managed in ConsoleMethod transmitDataBlocks, but it looks like only a small part of the packets is transmitted there.
#67
09/06/2007 (8:32 am)
I wrote a TDN article a while back that may be useful for exploring this area of code: TDN: Torque Game Engine Connection Sequence Overview
It's programmer-focused (it lists just about every single C++ and script function/method call executed during the "connecting to a server" process).
#68
09/06/2007 (12:04 pm)
Quote:
If I understand correctly, caching the ghost-always objects ("phase 2") is the more difficult part because they may change during runtime, and caching the "Phase 1 download datablocks & targets" should be easier?
In theory it could work like this:
* After server init, the server calls a function "save datablocks to binary file", which simulates the datablock transmission to a client; the result is used to build an MD5 key, for example.
* When a client connects, it sends the "MD5 key" of its current cache file and loads that file if the key matches the server version; otherwise it gets the current data and writes it to the cache file.
Would it work that way, or is it completely wrong? At first I thought this phase 1 would be managed in ConsoleMethod transmitDataBlocks, but it looks like only a small part of the packets is transmitted there.
You have it almost exactly right, Thomas (except MD5 is probably a little overkill, and using Torque's built-in CRC function on the saved cache file is simpler).
I found it simpler to just create a stream and save every Datablock and every GhostAlwaysEvent to it (each using its own "packUpdate()" function).
You will face two problems:
1) Anything uninitialized that is packed into the stream by the server to be sent to the client will differ from run to run... you'll want to fix all of those, because otherwise your client cache is always invalid. There are quite a few in stock Torque. Finding them involved using a binary file comparison tool to compare caches from run to run, getting the offsets of the differing bits, and then stepping through cache creation until reaching the offset that differs.
2) Make sure that objects that are ghosted down via GhostAlways are treated as static objects in your mission, and not changed during runtime (I think you'll discover that it might already be the case for you).
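The run-to-run comparison in problem 1 can be sketched in a few lines of C++. This is only an illustration of the idea, not the tool that was actually used: firstDifference and kNoDifference are hypothetical names, and the two buffers are assumed to be cache dumps from two separate server runs that were already read into memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel returned when the two dumps are byte-for-byte identical.
const std::size_t kNoDifference = static_cast<std::size_t>(-1);

// Returns the offset of the first byte that differs between two cache dumps.
// A length mismatch counts as a difference at the end of the shorter dump.
std::size_t firstDifference(const std::vector<std::uint8_t>& a,
                            const std::vector<std::uint8_t>& b) {
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < common; ++i)
        if (a[i] != b[i]) return i;
    return a.size() != b.size() ? common : kNoDifference;
}

// Usage sketch: byte 2 changed between runs, so that is where to set a
// breakpoint in the cache writer.
void exampleFindOffset() {
    const std::vector<std::uint8_t> run1 = {1, 2, 3, 4};
    const std::vector<std::uint8_t> run2 = {1, 2, 9, 4};
    assert(firstDifference(run1, run2) == 2u);
}
```

Once the offset is known, break in the cache writer when the stream position reaches it; the object being packed at that moment is the one with the unstable field.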
I also write a marker and a ghost count into the stream between the Datablocks and the GhostAlways objects to use for verification that the stream is valid.
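A marker plus ghost count between the two sections might look roughly like the following C++ sketch. The byte layout, the marker value and all names (buildCache, verifyCache, kGhostSectionMarker) are assumptions made for illustration; the post does not describe the actual format, and the real code would write through Torque's stream classes rather than a std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Hypothetical marker value, chosen only for this sketch.
const std::uint32_t kGhostSectionMarker = 0x47485354u;

void writeU32(Bytes& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
}

// Reads a little-endian u32 at pos; returns false if the buffer is too short.
bool readU32(const Bytes& in, std::size_t pos, std::uint32_t& v) {
    if (pos > in.size() || in.size() - pos < 4) return false;
    v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
    return true;
}

// Assumed layout:
// [datablock byte length][datablock bytes][marker][ghost count][ghost bytes]
Bytes buildCache(const Bytes& datablocks, std::uint32_t ghostCount,
                 const Bytes& ghosts) {
    Bytes out;
    writeU32(out, static_cast<std::uint32_t>(datablocks.size()));
    out.insert(out.end(), datablocks.begin(), datablocks.end());
    writeU32(out, kGhostSectionMarker);
    writeU32(out, ghostCount);
    out.insert(out.end(), ghosts.begin(), ghosts.end());
    return out;
}

// Returns true and fills ghostCount only if the marker sits exactly where
// the layout says it should.
bool verifyCache(const Bytes& cache, std::uint32_t& ghostCount) {
    std::uint32_t dbLen = 0, marker = 0;
    if (!readU32(cache, 0, dbLen)) return false;
    if (cache.size() - 4 < dbLen) return false;
    const std::size_t markerPos = 4 + static_cast<std::size_t>(dbLen);
    if (!readU32(cache, markerPos, marker)) return false;
    if (marker != kGhostSectionMarker) return false;
    return readU32(cache, markerPos + 4, ghostCount);
}

// Usage sketch: a freshly built cache verifies and reports its ghost count.
void exampleVerify() {
    std::uint32_t count = 0;
    assert(verifyCache(buildCache(Bytes(3, 7), 2, Bytes(2, 9)), count));
    assert(count == 2u);
}
```

The marker catches a cache whose datablock section is shorter or longer than expected, which a checksum mismatch alone would not localize.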
#69
09/08/2007 (4:44 am)
Edit: Nvm i just found out what i did wrong ;)
#70
09/09/2007 (8:43 am)
I made a little flow draft of how I will start to implement the datablock caching. Comments would be appreciated ;)
Datablock caching flow draft:
I.) In game.cs, after loading all the server stuff, call the console function CreateServerDBCache.
This will write all objects from Sim::getDataBlockGroup() to
a server cache file and build a CRC for client validation.
II.) At clientCmdMissionStartPhase1 the client checks whether the cache file exists:
1) If it does not exist, the client sets the variable (registered in the engine) $RecordDBCache = true
and sends commandToServer('MissionStartPhase1Ack', %seq);
Datablocks will be transferred as usual and recorded to the cache file.
2) If it exists, the client builds a CRC of the cache file and
sends commandToServer('VerifyDBCache', %crcDBcache);
The server verifies the CRC and sends a commandToClient(DBCacheOK, true/false)
a.) If false, the client sends commandToServer('MissionStartPhase1Ack', %seq) and sets $RecordDBCache = true as in II.1
b.) If true, the client loads the cache locally and after that sends commandToServer('MissionStartPhase2Ack', 1.5);
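The client-side branching in step II of the draft can be written down as a small decision function. This is a C++ sketch of the control flow only: ClientCacheFile, ClientAction and decidePhase1 are hypothetical names, and the real version would be split across TorqueScript commands and a network round trip rather than a single function call.

```cpp
#include <cassert>
#include <cstdint>

// What the client does after receiving clientCmdMissionStartPhase1.
enum class ClientAction {
    RecordFromServer,  // normal datablock transfer, recorded to the cache file
    LoadLocalCache     // skip the transfer and read datablocks from disk
};

// Hypothetical summary of the client's cache file.
struct ClientCacheFile {
    bool          exists;
    std::uint32_t crc;
};

// Server side of step II.2: compare the reported CRC with the server's own.
bool serverVerifiesCache(std::uint32_t serverCrc, std::uint32_t clientCrc) {
    return serverCrc == clientCrc;
}

// Steps II.1, II.2.a and II.2.b of the draft collapsed into one function.
ClientAction decidePhase1(const ClientCacheFile& local, std::uint32_t serverCrc) {
    if (!local.exists)
        return ClientAction::RecordFromServer;           // II.1
    if (!serverVerifiesCache(serverCrc, local.crc))
        return ClientAction::RecordFromServer;           // II.2.a
    return ClientAction::LoadLocalCache;                 // II.2.b
}

// Usage sketch: a matching CRC lets the client skip the datablock download.
void exampleHandshake() {
    const ClientCacheFile cached = {true, 0xABCD1234u};
    assert(decidePhase1(cached, 0xABCD1234u) == ClientAction::LoadLocalCache);
}
```

Both failure paths end in the same recording branch, so a stale cache is simply overwritten by the next normal transfer.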
#71
09/10/2007 (1:55 pm)
Thomas, this looks very interesting, and I'd love to try this out at any point you feel like you've got something that can be tested. I'd also be happy to help implement. Feel free to contact me ezavada at nospam-tenderfootgames.com if you want to discuss how we might coordinate.
#72
09/11/2007 (7:38 am)
Ed, at the moment I'm a bit stuck, so I have nothing that is testable. I was able to write a datablock cache file on a dedicated server and copy it to the client, which loaded it without errors. But it was not much faster than the normal read over the network. I'm also stuck at recording the data when the cache is invalid. The only thing that matches is the datablock count I write at the top of the cache file ;) I'm also not sure I write the server cache correctly, so it may take a while until I get it.
#73
09/19/2007 (2:02 am)
Another issue with this resource. For some reason fxFoliageReplicator doesn't work correctly with these changes :-/
Here is a snip from a log-file without "speed up":
*** fxFoliageReplicator sumary asked for: 25 items generated: 25 items due to restrictions Lev: 4 PotNodes: 341 Used: 208 Objs: 257 Time: 0.0000s.
fxFoliageReplicator - Approx. 0.01Mb allocated.
*** end fxFoliageReplicator sumary ***
fxFoliageReplicator - Culling has been disabled!
fxFoliageReplicator - rel_radiusX: 0.500000, rel_radiusY:0.500000
fxFoliageReplicator - area density: 0.003183 , area:7853.981445
*** fxFoliageReplicator sumary asked for: 25 items generated: 25 items due to restrictions Lev: 4 PotNodes: 341 Used: 110 Objs: 92 Time: 0.0000s.
fxFoliageReplicator - Approx. 0.01Mb allocated.
*** end fxFoliageReplicator sumary ***
fxFoliageReplicator - Culling has been disabled!
fxFoliageReplicator - rel_radiusX: 0.600000, rel_radiusY:0.600000
fxFoliageReplicator - area density: 0.002653 , area:11309.733398
*** fxFoliageReplicator sumary asked for: 30 items generated: 30 items due to restrictions Lev: 4 PotNodes: 341 Used: 191 Objs: 195 Time: 0.0000s.
fxFoliageReplicator - Approx. 0.01Mb allocated.
*** end fxFoliageReplicator sumary ***
fxFoliageReplicator - Culling has been disabled!
fxFoliageReplicator - rel_radiusX: 0.400000, rel_radiusY:0.400000
fxFoliageReplicator - area density: 0.004974 , area:5026.548340
*** fxFoliageReplicator sumary asked for: 25 items generated: 25 items due to restrictions Lev: 4 PotNodes: 341 Used: 221 Objs: 335 Time: 0.0160s.
fxFoliageReplicator - Approx. 0.01Mb allocated.
*** end fxFoliageReplicator sumary ***
and here with "speed up":
*** fxFoliageReplicator sumary asked for: 25 items generated: 0 items due to restrictions Lev: 4 PotNodes: 341 Used: 0 Objs: 0 Time: 0.0780s.
fxFoliageReplicator - Approx. 0.00Mb allocated.
*** end fxFoliageReplicator sumary ***
fxFoliageReplicator - Culling has been disabled!
fxFoliageReplicator - rel_radiusX: 0.500000, rel_radiusY:0.500000
fxFoliageReplicator - area density: 0.003183 , area:7853.981445
*** fxFoliageReplicator sumary asked for: 25 items generated: 0 items due to restrictions Lev: 4 PotNodes: 341 Used: 0 Objs: 0 Time: 0.0310s.
fxFoliageReplicator - Approx. 0.00Mb allocated.
*** end fxFoliageReplicator sumary ***
fxFoliageReplicator - Culling has been disabled!
fxFoliageReplicator - rel_radiusX: 0.600000, rel_radiusY:0.600000
fxFoliageReplicator - area density: 0.002653 , area:11309.733398
*** fxFoliageReplicator sumary asked for: 30 items generated: 0 items due to restrictions Lev: 4 PotNodes: 341 Used: 0 Objs: 0 Time: 0.0780s.
fxFoliageReplicator - Approx. 0.00Mb allocated.
*** end fxFoliageReplicator sumary ***
fxFoliageReplicator - Culling has been disabled!
fxFoliageReplicator - rel_radiusX: 0.400000, rel_radiusY:0.400000
fxFoliageReplicator - area density: 0.004974 , area:5026.548340
*** fxFoliageReplicator sumary asked for: 25 items generated: 0 items due to restrictions Lev: 4 PotNodes: 341 Used: 0 Objs: 0 Time: 0.0320s.
fxFoliageReplicator - Approx. 0.00Mb allocated.
*** end fxFoliageReplicator sumary ***
As you can see, after the "speed up" implementation fxFoliageReplicator doesn't generate any foliage, and there are no console errors. A few foliage replicators did work, though...
I'm currently trying to figure it out. Maybe someone has an idea why the "speed up" broke fxFoliageReplicator?
#74
03/17/2008 (9:03 pm)
I looked at this again, and realized what was causing the Invalid Packet errors. The code as Daniel had it assumes that any failed file load is fatal and generates an Invalid Packet error. The normal Torque behavior is to allow failed file loads if the server doesn't have the file either. To get that behavior with your patch, I added the following to NetConnection::loadNextGhostAlwaysObject in netGhost.cc:
   while(mGhostAlwaysSaveList.size())
   {
      if (isLocalConnection()) hadNewFiles = false;   // <-- added line
      // only check for new files if this is the first load, or if new
      // files were downloaded from the server.
... and I left out the first change in sim/netGhost.cc, around line 958 in NetConnection::loadNextGhostAlwaysObject, where we were supposed to add isLocalConnection().
Because mFilesWereDownloaded is set to the value of hadNewFiles (false), tsStatic will succeed in calling register object even if some textures were missing (see tsStatic::onAdd() for details).
My original change of just leaving out the isLocalConnection() call on line 958 had the same effect, but it sent a download request to the server and waited for the server to figure out that it didn't have the file before continuing, which was overall a lot less efficient.
#75
09/17/2008 (6:02 am)
I know this is an old thread, but I wanted to mention that I've combined some preload techniques with some of the local load enhancement code here to load all datablocks when the client starts, and not retransmit datablocks when joining servers.
I had problems originally with SOME particle textures not showing up, but using Daniel's version of transmitDatablocks and Ed Zavada's fix above this post, it works. This process works for me because I know datablocks will not change between mission loads.
The time to join a remote server or jump to another server while the game is running is now only about 3-5 seconds, and another 3-5 seconds while playgui is loading up and displaying. I know this could be further reduced by caching static objects as well, but I don't have need of that optimization yet.
If anyone still follows this thread and is interested in my changes more explicitly, I can compile them into a resource.
#76
09/17/2008 (6:07 am)
Hey Dave!
I think there are ppl interested in this (including myself), so the resource will be a good thingy.
I've done some experiments on that but was not able to build a working solution (lack of time damnit!) :-/ so I would love to see what you ended up with.
#77
09/17/2008 (9:27 am)
Dave, v. interesting. Doesn't that require that each server the client is connecting to have the same datablocks, declared in the same order? (a reasonable assumption, imo)
On top of datablock caching, the next release of vSide should include another feature we've been fooling with which further dramatically improves level load time: delayed texture loading. I'm not the guy implementing it (a smart guy named Randall Meyer is), but the gist is that for most textures, we preload a tiny (16x16 or 32x32) version of the texture from disk, and when the texture handle is first bound (I think), it goes into a queue for loading the full version, and we limit the number of textures which leave the queue per frame. The result naturally has some drawbacks: when you drop into the world, textures are lo-res and gradually pop into full-res, and there's also a bit of a framerate hit while textures are loading out of the queue. However, after about five seconds all is generally well. The benefit is that level load time is much faster. We might be able to get around the framerate chugs by using another thread, but that's speculative. Also there's still a couple of bugs, but hopefully they'll be ironed out in the next week or two. Another potential optimization would be to re-save the textures out to a local disk cache in the graphics-card-native format, so that the next time that texture is loaded, it doesn't go through JPG/PNG processing, it just plops right into RAM all ready to go.
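The queue-with-a-per-frame-limit part of that description can be sketched as follows. This is a guess at the general shape in C++, not vSide's code: DeferredTextureLoader and its methods are hypothetical names, textures are represented by their names only, and the actual disk read and upload are left as a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Upgrades low-res placeholder textures to full resolution a few per frame.
class DeferredTextureLoader {
public:
    explicit DeferredTextureLoader(std::size_t maxLoadsPerFrame)
        : mBudget(maxLoadsPerFrame) {}

    // Called the first time a placeholder texture is bound.
    void requestFullRes(const std::string& name) { mPending.push_back(name); }

    // Called once per frame; returns the textures upgraded this frame,
    // never more than the per-frame budget.
    std::vector<std::string> processFrame() {
        std::vector<std::string> loaded;
        while (!mPending.empty() && loaded.size() < mBudget) {
            // A real engine would read and upload the full-size image here.
            loaded.push_back(mPending.front());
            mPending.pop_front();
        }
        return loaded;
    }

    std::size_t pending() const { return mPending.size(); }

private:
    std::size_t             mBudget;
    std::deque<std::string> mPending;
};

// Usage sketch: with a budget of 2, three requests take two frames to drain.
void exampleDrain() {
    DeferredTextureLoader loader(2);
    loader.requestFullRes("rock");
    loader.requestFullRes("grass");
    loader.requestFullRes("sky");
    assert(loader.processFrame().size() == 2u);
    assert(loader.processFrame().size() == 1u);
}
```

The budget is the knob that trades pop-in duration against the framerate hit mentioned in the post: a smaller budget means smoother frames but a longer time in lo-res.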
#78
09/17/2008 (9:48 am)
Yes, it requires that each server has the same set of datablocks and that they are exec'd in the same order. This is a valid assumption for most projects I have been working on, but there's also no reason it couldn't be improved and conditionalized better.
Wow, you guys are way ahead of me hehe. It would be nice to work with such rocket scientists but alas I am just an indie ;)
vSide has lots of texture usage and lots of objects, which is probably a natural state for a close quarters and visually intensive mission environment.
For Cubekind I have far fewer objects by nature, but it is good to know there are further optimizations that can be done. I want to use this technique for my RPG projects as well, and the same assumptions stand.
#79
09/17/2008 (10:09 am)
I did a resource with the technique I described above, hope it's useful.
http://www.garagegames.com/index.php?sec=mg&mod=resource&page=view&qid=15427
Torque 3D Owner Thomas Huehn
On client/server I prefer to raise only the packet size to 450 and not change the packet rate, with traffic in mind. I also set PacketRateToServer=8 to reduce traffic, because by default the client sends 32 packets every second whether they are needed or not.
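As a rough worked example of the traffic argument, the C++ snippet below computes an upper bound on client-to-server bytes per second for the two packet rates mentioned. It assumes every packet is filled to the 450-byte limit and that the size is in bytes, which overstates real traffic; maxBytesPerSecond is a hypothetical helper, not an engine function.

```cpp
#include <cassert>

// Upper bound on bytes per second if every packet is completely full.
constexpr unsigned maxBytesPerSecond(unsigned packetsPerSecond,
                                     unsigned packetSizeBytes) {
    return packetsPerSecond * packetSizeBytes;
}

// Default rate of 32 packets/s versus the reduced rate of 8 packets/s.
static_assert(maxBytesPerSecond(32, 450) == 14400, "32 * 450 bytes");
static_assert(maxBytesPerSecond(8, 450) == 3600, "8 * 450 bytes");

// Usage sketch: dropping the rate from 32 to 8 cuts the bound to a quarter.
void exampleTrafficRatio() {
    assert(maxBytesPerSecond(32, 450) / maxBytesPerSecond(8, 450) == 4u);
}
```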