Game Development Community

BUG? Cumulative Ticking Difference

by Matias Kiviniemi · in Torque X 2D · 05/03/2007 (12:49 am) · 26 replies

It seems that there are some very slight differences in the number of ticks you get over time on different machines. When I have the game running over the network on two machines with the same speeds, they slowly fall out of sync. The rate is quite slow, about one tick every 1-2 seconds, as if a "fractional tick" were getting ignored. The other machine is less powerful and runs at about 20fps, which might affect things.

I guess this is something that can't be totally eliminated, and is something one should be prepared to handle. But in theory constant-time ticks should converge over time, e.g. a game running at 10 ticks per second for a thousand seconds should get 10000 ticks and not, for example, 9500.
#1
05/03/2007 (7:30 am)
First, I think you should use the Update method for all of your game-type updates,

and I think you can set up this update interval, though I haven't done it myself.

Secondly, you shouldn't code your game in such a way that a difference in updates would cause your network communications to go out of sync. You should probably look into different design patterns that would help solve this issue, such as using one of the machines as a server.
#2
05/03/2007 (9:51 am)
I'm not sure if I understood you correctly, but the problem is that (IMO) the ticking system does not work as it should. The idea that one tick represents a constant time period should mean that you get an equal number of them over time. Now I have one machine that ticks slightly faster than the other. The first thing that came to my mind was that there would be some cumulative rounding error when evaluating "is it time for a new tick".

I have a synchronization mechanism that fixes the difference, but it's still a workaround. Adjusting the ticks will always cause some side effects (depending on which direction you're doing it). And I don't really want to abandon the ticking altogether unless I have to, as it otherwise suits me relatively well.
#3
05/03/2007 (10:27 am)
My question would be are you taking into account the latency between connections, and how that affects your networking synchronization?

It's possible/probable that the ticking mechanism is throwing away partial ticks over a time period instead of aggregating them in some way, which would describe your root issue, but I think what Jason was getting at is that based on your post (which was a summary of course), it appears that you are relying on your two networked applications to always be on the same tick at the same time--and that's an unrealistic assumption for network synchronized applications.

Without going into painful detail, how TGE's networking system works is that every update sent from a server to a client is identified with a time stamp at which the update should be considered by the receiver, and there are mechanisms to unwind and rewind the receiver's simulation, apply/test the update, and correct back up to present state on that simulation.

The main reason for requiring a mechanism like this is that latency introduces an uncontrollable and unpredictable delay between when updates will arrive at two different simulations, and you cannot simply directly apply that update when it is received. A client with a very fast connection will receive the updates before a client with a very slow connection, and therefore by definition the updates will arrive at different ticks...and your networking/simulation interface needs to correct somehow for this.
#4
05/03/2007 (1:15 pm)
Let me try to elaborate. I'm taking the latency into account and have a similar mechanism for making sure the updates are applied "at the right times". To put it (hopefully not too) short, updates are stamped with tick numbers, and rewinding & re-ticking occurs if necessary. In addition to this, I periodically resynchronize the clients by sending ping-pong packets and comparing what ticks they're at. All in all it works pretty smoothly, I just need to resynchronize every once in a while. The problem is that resynchronizing every 1-5 seconds is a bit too often and causes its own problems.
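To make the ping-pong comparison concrete, here is a rough sketch of how a tick offset could be estimated from one exchange. TickSync and EstimateOffset are made-up names (not from Torque X or any networking library), and the sketch assumes the pong was stamped about halfway through the round trip, which is only an approximation.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Torque X.
public static class TickSync
{
    // Estimates how many ticks the remote peer is ahead of us (negative = behind).
    //   localTickAtSend    : our tick counter when the ping left
    //   localTickAtReceive : our tick counter when the pong came back
    //   remoteTickInPong   : the remote tick counter written into the pong
    public static double EstimateOffset(long localTickAtSend, long localTickAtReceive, long remoteTickInPong)
    {
        if (localTickAtReceive < localTickAtSend)
            throw new ArgumentException("Receive tick precedes send tick.");

        // Assumption: the one-way delay is half of the round trip.
        double localTickAtRemoteStamp =
            localTickAtSend + (localTickAtReceive - localTickAtSend) / 2.0;

        return remoteTickInPong - localTickAtRemoteStamp;
    }
}
```

Averaging the estimate over several exchanges would probably smooth out the jitter from individual packets.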

Now, if this is a bug, feature or just fact of life, I'm not sure. Intuitively it felt like the ticking system would automatically adjust this. For example consider:

if (TotalGameTicks * TickMS < TotalGameTimeMS)
DoTick();

Assuming TotalGameTimeMS stays synchronized (probably not a safe assumption) on both clients, an equal number of ticks should occur on both clients over time. I guess one of the fundamental questions is whether this approach of using the tick number is a bad one. It wouldn't actually be a big thing to change from tick numbers to TotalGameTime or something (as suggested), but naturally I'd like to know it's for a reason (and "select the right way(tm)").
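Spelled out a bit more, the idea above might look like the sketch below. TickClock is a made-up class, not Torque X's actual ticking code. It uses a loop rather than a single if, so that a slow frame can catch up by more than one tick, and it always compares against the running total so leftover fractions carry over.

```csharp
using System;

// Hypothetical sketch; not the engine's implementation.
public sealed class TickClock
{
    private readonly double tickMs;
    private long totalTicks;

    public TickClock(double tickMs)
    {
        if (tickMs <= 0) throw new ArgumentOutOfRangeException("tickMs");
        this.tickMs = tickMs;
    }

    public long TotalTicks { get { return totalTicks; } }

    // Runs every tick that is due at the given total time; returns how many ran.
    public int Advance(double totalGameTimeMs, Action doTick)
    {
        int ran = 0;
        // The next tick is due once the total time reaches (ticks so far + 1) * tickMs.
        while ((totalTicks + 1) * tickMs <= totalGameTimeMs)
        {
            doTick();
            totalTicks++;
            ran++;
        }
        return ran;
    }
}
```

With this shape the tick count depends only on the total time and not on how many updates it took to get there, so two machines with different frame rates should end up with the same count as long as their total times agree.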

But thank you for any comment or suggestions.
#5
05/03/2007 (1:23 pm)
I noticed this same behavior NOT on a network, just looking at the ticks on the individual machines independently. Try unhooking your network aspect and doing a tick count on each machine to see if it truly is a network-related problem or a tick-based problem in general, like I am experiencing.
#6
05/03/2007 (1:25 pm)
I'm not sure if Torque X has it, but in TGE you could compare Platform::getRealMilliseconds() to the tickcount to measure & confirm the drift.
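On the .NET side, a similar check could be sketched as below. TickDrift and DriftInTicks are made-up names, and the real elapsed time is assumed to come from something independent of the engine, such as System.Diagnostics.Stopwatch.

```csharp
using System;

// Hypothetical helper for measuring drift; not part of Torque X or TGE.
public static class TickDrift
{
    // Negative result: fewer ticks were issued than the wall clock calls for.
    // Positive result: more ticks were issued than expected.
    public static double DriftInTicks(long tickCount, double realElapsedMs, double tickMs)
    {
        if (tickMs <= 0) throw new ArgumentOutOfRangeException("tickMs");
        double expectedTicks = realElapsedMs / tickMs;
        return tickCount - expectedTicks;
    }
}
```

For the example from the first post (10 ticks per second for a thousand seconds, but only 9500 ticks counted), DriftInTicks(9500, 1000000, 100) comes out at -500.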
#7
05/03/2007 (2:23 pm)
My gut instinct tells me that Torque X is probably dropping partial ticks instead of managing them, which would explain both what you and Jonathon have reported.

I'm not sure and don't have time to do a code review, but this would explain the issue.
#8
05/03/2007 (6:57 pm)
Anything in particular we could do to test this theory for you, Stephen?
#9
05/03/2007 (6:59 pm)
Sorry I can't add any additional insight as to the mechanical solution (I am just starting my net code, so I do find this interesting).

One thing you may try is to use the gameTime variable (which is passed to both Update() and Draw(), so you should have it) and look at the gameTime.GetTotalRealTime (I think that's what it's called) variable. This number should be tick/lag independent, so you could use that for your network interpolation instead of the gameTime.gameTicks (or whatever it's called).

Hope that helps! It's 9am here in BKK, and I'm about to spend the entire day figuring out netcode, so I should be able to give a more informed post in the evening.

-Jason

2dMmo: my "Game in a year" entry.
#10
05/03/2007 (7:03 pm)
@Stephen and Jonathon:

Actually I would be incredibly amazed if gametime (ticks, as in the .Draw() method) were exactly synchronous.

Even "identical" systems are different (hard drive fragmentation, page files, memory usage, etc.),

and for non-identical systems, every part of the computer can conceivably affect your FPS (and therefore ticks), including changing the camera extent in the game.

Am I missing something? I honestly don't see how you could *ever* assume that the ticks would be the same.
#11
05/03/2007 (7:41 pm)
@Jason: the underlying concept of constant frequency ticks is that they are unrelated to frame speed. Torque X's ticking system has two modes, and the constant frequency ticks are designed to calculate the "amount of cpu time" needed for a constant tick frequency (shoots for 32ms frequency), and then pulses ticks to the objects at that rate.

Torque X also provides higher, variable frequency ticks for rendering/animation specifically, which is what interpolateTick() is for: finding positions at a very high frequency between constant time ticks to get smooth rendering and animation. Variable frequency ticks are frame rate dependent in a way, but these are used for non-physics-related subsystems as mentioned above.

Clark knows the theory behind ticks much more in depth than I do, and I'm going to ping him to see if he can add anything to the conversation, although I know he and the rest of the team are extremely busy right now!
#12
05/04/2007 (8:31 am)
Quote:Actually I would be incredibly amazed if gametime (ticks, as in the .Draw() method) were exactly synchronous.
- Correct, it would be crazy to expect ticks on different machines to be synchronous. However, the tick rate should be identical, so if you count the number of ticks which elapse in five minutes, it should be the same (plus or minus maybe two ticks) on two different machines. That's what Matias is seeing problems with.
#13
05/04/2007 (2:18 pm)
@ Matias - what you are reporting does sound wrong. When Stephen first mentioned this thread to me I was worried that you were seeing different results per tick between machines. That would be inexplicable. Instead it sounds like you are getting a slightly different number of ticks per given time period. That's not so bad as long as the error is small and you aren't doing a network game.

But you're doing a network game so it's a concern.

In TGE we had this same problem at one point because we were throwing out the ms remainder every time we got a time update. That was fixed a long time ago, but I could imagine the same bug manifesting itself here. The question is whether it is on the TorqueX side or the XNA side.

I'm looking at TorqueEngineComponent.cs and I think this might be what is going on. If you look at the update method it gets elapsed time via:
float elapsed = (float)gameTime.ElapsedGameTime.Milliseconds;

Problem is, the Milliseconds property is an int, so any fraction of a millisecond in each elapsed interval is dropped. If you wouldn't mind playing with this, I think you can fix it fairly easily. If you track TotalMilliseconds instead, and take a delta each time, then it should perform better. Of course, you have to worry about floating point precision now (since once TotalMilliseconds gets really big you will lose precision and time won't advance linearly). But at least we are dealing with a double, so this is probably not an issue. Note that what we'd really want is an int TotalMilliseconds.
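To illustrate the difference, here is a small standalone sketch using plain TimeSpan values. It is not the actual TorqueEngineComponent.cs code: ElapsedTimeDemo and its methods are made up, and the running TimeSpan total merely stands in for what I assume would be gameTime.TotalGameTime in the engine.

```csharp
using System;

// Hypothetical demo of the truncation; not engine code.
public static class ElapsedTimeDemo
{
    // Sums what the update loop "sees" when it reads the int Milliseconds property each frame.
    public static double SumTruncated(TimeSpan frame, int frames)
    {
        double sum = 0;
        for (int i = 0; i < frames; i++)
            sum += frame.Milliseconds; // whole-millisecond component only
        return sum;
    }

    // Sums per-frame deltas of a running TotalMilliseconds value instead.
    public static double SumFromTotal(TimeSpan frame, int frames)
    {
        TimeSpan total = TimeSpan.Zero;
        double previousTotalMs = 0;
        double sum = 0;
        for (int i = 0; i < frames; i++)
        {
            total += frame;                       // stand-in for the engine's total game time
            double totalMs = total.TotalMilliseconds;
            sum += totalMs - previousTotalMs;     // fractional milliseconds survive
            previousTotalMs = totalMs;
        }
        return sum;
    }
}
```

With a frame of TimeSpan.FromTicks(166667) (about 16.6667 ms, roughly 60 fps) and 600 frames, the truncated sum is 9600 ms while the delta-based sum is about 10000 ms. That is roughly 400 ms lost in ten seconds, or about a dozen 32 ms ticks.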

Another approach to the problem is to not request inter-tick updates. To do that, set UseFixedTimeStep to true (should already be true by default) and UseInterpolation to false (is true by default) in the TorqueEngineSettings.xml file.

The effect of doing this is to only give you updates on the tick. That might sound like a problem, since it means you can't properly render mid-tick frames. But it actually works out well, and saves a lot of effort interpolating. The reason it works is that even though your frame rate might be a little lower (since you have to wait till the next tick boundary) you are always rendering on time (less gpu variance).

In any case, turning off interpolation might work for you. If not, you can try making sure that the remainder isn't lost. In either case, let us know what works so we can be sure to roll whatever changes are required into the engine.
#14
05/04/2007 (2:20 pm)
BTW, how are you doing a network game in TorqueX (since XNA doesn't include networking yet)? I assume you are using .NET networking, but it sounds almost like you are writing a TGE-like network layer for TorqueX. Cool.
#15
05/04/2007 (8:34 pm)
I'm using the System.Net functionality, which is working pretty well for me right now (client and server can talk to each other on different ports, though both are currently running in the same logical app, so I haven't actually tested over the wire).

My biggest hesitation about spending more time on networking right now is if/when XNA comes out with something bigger and better. While it's cool to think about XBLA support for a game, that's really not a concern of mine at the moment, though I guess it could be for some people...

So what I'm saying is that System.Net has a lot of good functionality already. For people that "need" networking, it's a good crutch until better support gets integrated with XNA.
#16
05/06/2007 (12:39 am)
Hi Clark,

I have to admit I was a bit skeptical about your recommendation, but it did in fact correct the issue :) The two computers still eventually fall out of sync, but only after a couple of minutes or if there are performance issues. But this I think is "normal behaviour" and needs to be expected & handled.

I guess the 0.5ms error from using an integer ElapsedTimeMS will (at least in my circumstances) accumulate instead of canceling out (which could happen in other cases). This would correspond with the "1/2 tick per second" that I experienced. I also checked that TotalTimeMilliseconds stays fairly well in sync (within 200ms after several minutes), so it should also be feasible to use. But this solution is "good enough for now" (hoping that enhancements will find their way into the engine at some point), thank you.
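As a rough sanity check of that estimate, a back-of-the-envelope calculation could look like this. TruncationEstimate is a made-up helper, and the numbers plugged in are assumptions: an average loss of 0.5 ms per update and the 32 ms tick length mentioned earlier in the thread.

```csharp
using System;

// Hypothetical estimate; the inputs are assumptions, not measurements.
public static class TruncationEstimate
{
    // Ticks lost per second if every update drops avgLossMsPerUpdate on average.
    public static double LostTicksPerSecond(double updatesPerSecond, double avgLossMsPerUpdate, double tickMs)
    {
        if (tickMs <= 0) throw new ArgumentOutOfRangeException("tickMs");
        return updatesPerSecond * avgLossMsPerUpdate / tickMs;
    }
}
```

Under those assumptions, 20 updates per second loses about 0.31 ticks per second and 60 updates per second about 0.94, so "1/2 tick per second" is in the right ballpark.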

@Jason: I've used the C#/System.Net Lidgren Network Library for the transport layer. It's pretty easy to use and handles all the dirty work like sockets, transferring the bits, TCP/UDP and connection management. This way I can focus on my game logic and keeping things synchronized (which is challenging enough :). I've also wrapped it in my own Client/Server abstraction, so I can later replace it with XNA-legos or something else.

Matias
#17
05/06/2007 (1:24 am)
@Clark and Matias, interesting, I wouldn't expect that the rounding of ms to ints would matter that much, but I guess it does! Good to know.

As for Lidgren, I've taken a look at that, but due to its fairly invasive licensing (GPL-ish, read the header of some of its source code to see what I mean) I would rather just crux together "just enough" functionality myself for the time being, until the XNA networking is done (supposedly in a few months, I think). Granted, if I end up needing better networking before XNA is done, maybe I'll use Lidgren as a prototyping crutch, but I definitely can't release anything under those licensing terms.
#18
05/06/2007 (6:25 am)
I don't think the BSD license is that invasive. The main requirement is that you include the copyright notice, i.e. give credit to the author. There's no requirement to release code or anything. Granted, it might not suit all commercial scenarios, but it's not nearly as restrictive as GPL.
#19
05/06/2007 (8:55 am)
Sorry to hijack your post :)

IANAL, nor do I have a law background, but this is what I see in the source code:

Quote:
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom
the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or
substantial portions of the Software.

Perhaps I am interpreting that incorrectly, but I read it to be that the same permissions need to be given to anyone that the product is distributed to, hence my hesitation to use the library to any large degree......

Anyway, I think it's a moot point, as by the time I get a game done (hopefully about 8 months) there will be better net support built into XNA.
#20
05/06/2007 (9:00 am)
@Matias, and btw, I just read the BSD license link you posted, and the license terms in the Lidgren source code are definitely not the same.

The BSD license does look good, I agree.