Voice Comm using PortAudio, Speex and OpenAL

by Danni · 02/16/2007 (9:19 am) · 38 comments

This code is a bare-bones starting point for creating your own VoiceComm in your game. It uses the existing client/server connection to transmit Speex-encoded audio, which the server then relays to everyone connected, including the player who sent the voice stream in the first place. There are no channels in this implementation; you are on your own to code that part as you see fit.

If you use this code, please credit me and let me know where you are using it. Other than that, you are free to do with it as you see fit! :)

This code is recommended for people who are familiar with Speex, OpenAL and Torque in general. (Even though I am quite new to Torque.)

To use this code, simply copy voice.h and voice.cc from the zip into your engine/audio folder and add them to the project. You will need to link Torque.exe with libspeex.lib and portaudio.lib. If you do not know how, you should not be using this code. (Sorry!)

Next, add a voiceCommClient object called voiceComm to the GameConnection class. Don't forget to include audio/voice.h! (You would have if I didn't say anything wouldn't you! =P) You will also need to put the portaudio and speex headers in a place where the code can find them.

Also add the following in the GameConnection class in gameConnection.h.

voiceCommClient *getVoiceComm() { return &voiceComm; }


Now we need to initialize and deinitialize that object in GameConnection::onConnectionEstablished and GameConnection::onDisconnect respectively.

e.g.
void GameConnection::onConnectionEstablished(bool isInitiator)
{
   if(isInitiator)
   {
      
      setGhostFrom(false);
      setGhostTo(true);
      setSendingEvents(true);
      setTranslatesStrings(true);
      setIsConnectionToServer();
      [b]voiceComm.Initialize();[/b]
      mServerConnection = this;
      Con::printf("Connection established %d", getId());
      Con::executef(this, 1, "onConnectionAccepted");
   }
   else
   {
      setGhostFrom(true);
      setGhostTo(false);
      setSendingEvents(true);
      setTranslatesStrings(true);
      Sim::getClientGroup()->addObject(this);

      const char *argv[MaxConnectArgs + 2];
      argv[0] = "onConnect";
      for(U32 i = 0; i < mConnectArgc; i++)
         argv[i + 2] = mConnectArgv[i];
      Con::execute(this, mConnectArgc + 2, argv);
   }
}



void GameConnection::onDisconnect(const char *reason)
{
   if(isConnectionToServer())
   {
      Con::printf("Connection with server lost.");
      Con::executef(this, 2, "onConnectionDropped", reason);
   }
   else
   {
      Con::printf("Client %d disconnected.", getId());
      setDisconnectReason(reason);
   }
[b]   voiceComm.Deinitialize();[/b]
}


This will get everything ready once you connect to a server and kill it all when you disconnect. (Duh!)

Finally, you need one of those little bind thingies to make it all work.

function localVoice(%val)
{
	if (%val) {
		startVoiceRecord(1);
	} 
	else 
	{
		stopVoiceRecord();
	}
}

moveMap.bind(keyboard, f, localVoice);

Voilà, VoiceComm! BTW, the argument passed to startVoiceRecord is the channel number (AllTalk, Squad, Team, Admin, etc.). This is not implemented, but the baseline code is there for you to do so yourself.
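
Since channels are left to the reader, here is a minimal sketch of how a server might decide who receives a block on a given channel. Everything in it is an assumption for illustration: the VoiceChannel values, the Listener struct and the shouldRelay/recipients helpers are hypothetical and are not part of voice.h or voice.cc. It only filters by channel and leaves the sender in the list, as the resource does.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical channel IDs; the resource only passes a bare number.
enum VoiceChannel : std::uint8_t { AllTalk = 0, Squad = 1, Team = 2, Admin = 3 };

// Hypothetical per-client state the server would have to track.
struct Listener {
    std::uint32_t id;
    std::uint8_t  team;
    std::uint8_t  squad;
    bool          isAdmin;
};

// Decide whether a block sent by `from` on `channel` should reach `to`.
bool shouldRelay(const Listener& from, const Listener& to, VoiceChannel channel) {
    switch (channel) {
        case AllTalk: return true;
        case Team:    return from.team == to.team;
        case Squad:   return from.team == to.team && from.squad == to.squad;
        case Admin:   return to.isAdmin;
    }
    return false;   // unknown channel numbers are dropped
}

// Collect the IDs of every listener that should receive the block.
std::vector<std::uint32_t> recipients(const Listener& from,
                                      const std::vector<Listener>& all,
                                      VoiceChannel channel) {
    std::vector<std::uint32_t> out;
    for (const Listener& l : all)
        if (shouldRelay(from, l, channel)) out.push_back(l.id);
    return out;
}
```

In the real code the equivalent check would sit in the loop that posts the incoming voice events, with the channel number read from the block.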

Happy chatting! Let me know if you see anything funky; I have tested this code in this state and it worked for me.
#1
02/16/2007 (10:47 am)
I'm having trouble with: "Next, add a voiceCommClient object called voiceComm to the GameConnection class." and finding the right portaudio resource.
Any hints?
#2
02/16/2007 (5:03 pm)
You have to add it as a statically allocated member (held by value) and not a dynamically allocated one. E.g.

voiceCommClient  voiceComm;
and not
voiceCommClient *voiceComm;
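
To make the difference concrete, here is a stripped-down sketch of the by-value layout. Both classes are hypothetical stand-ins (the real voiceCommClient lives in voice.h and the real GameConnection has far more in it), and isReady() is invented purely so the example has something to check.

```cpp
// Stand-in for the real voiceCommClient from voice.h (hypothetical stub).
class voiceCommClient {
public:
    bool Initialize()    { ready = true; return ready; }
    void Deinitialize()  { ready = false; }
    bool isReady() const { return ready; }
private:
    bool ready = false;
};

// Stand-in for GameConnection, showing only the voice-related members.
class GameConnection {
public:
    // The address of a by-value member stays valid for as long as the
    // connection object itself exists, so no new/delete is needed.
    voiceCommClient *getVoiceComm() { return &voiceComm; }
private:
    voiceCommClient voiceComm;   // constructed and destroyed with GameConnection
};
```

With the pointer form, getVoiceComm() as written in the resource would return the address of the pointer rather than of the object, which is why the by-value member is needed.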


The GameConnection class is probably not the best place, as this would whack it into servers too. I just wanted to get the code working first, then identify where to put it later.

You should just need PortAudio (PA) v19; you can get it here: www.portaudio.com/download.html
You will need to compile it, add the include directory, link the .lib and add the .dll to the working directory of your Torque.exe.

You may also need to increase the default network settings. One person talking at a time works fine, but with two or three you could start hitting the default limits. Or you could use another transport method. :)
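
A rough way to see why a few simultaneous speakers add up is the back-of-the-envelope calculation below. It is only a sketch: the function name and all parameter values are assumptions, not figures taken from voice.cc. The one fixed point is that a Speex narrowband frame covers 20 ms, so each speaker produces 50 blocks per second.

```cpp
#include <cstdint>

// Rough downstream load, in bytes per second, for one listener.
// All parameters are illustrative assumptions, not values from voice.cc.
std::uint32_t voiceBytesPerSecond(std::uint32_t activeSpeakers,
                                  std::uint32_t encodedBytesPerFrame,
                                  std::uint32_t frameMs,
                                  std::uint32_t perBlockOverhead) {
    if (frameMs == 0) return 0;                              // avoid dividing by zero
    const std::uint32_t framesPerSecond = 1000 / frameMs;    // 20 ms -> 50
    return activeSpeakers * framesPerSecond *
           (encodedBytesPerFrame + perBlockOverhead);
}
```

For example, assuming 38 encoded bytes per frame (about 15 kbps) plus 6 bytes of per-block overhead, one speaker costs 2200 bytes per second per listener and three speakers cost 6600, because every stream is relayed separately.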
#3
02/18/2007 (10:20 pm)
Very useful tutorial. I needed exactly this.

However, I'm having trouble compiling portaudio into a .lib. It was looking for a bunch of headers that, when I searched around, I found I had to download the Vista SDK to get at, and even then I ran into problems. My only development environment is VS.NET 2005.

Any way you can send the .lib and .dll for speex and portaudio my way? Sly_Squash@hotmail.com
#4
02/19/2007 (4:40 pm)
Odd. I did have to hack a few things around in the library, specifically all the ASIO stuff, to make it compile out of the box on Windows, but it should not be that difficult.

So here is my hacked up portaudio library. Use at your own risk: server1.animelab.com/portaudio.zip

Also, I forgot to mention: there is a race condition where the thread activates before the Source is initialized, causing an out-of-bounds memory access. I need to debug and fix that. If it gives you trouble, initialize somewhere other than GameConnection, after OpenAL is initialized, and that should go away.
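
One common way to close that kind of start-up window is a "ready" flag that the worker thread checks before it touches the source. The sketch below assumes that shape; VoicePlayback, initSource, queueBlock and the int buffers are hypothetical stand-ins and not the actual members of voiceCommClient.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the playback side of the voice client.
class VoicePlayback {
public:
    // Call only after the audio device (OpenAL in the resource) is up.
    void initSource(std::size_t bufferCount) {
        buffers.assign(bufferCount, 0);
        sourceReady.store(true, std::memory_order_release);   // publish last
    }

    void shutdown() {
        sourceReady.store(false, std::memory_order_release);
    }

    // Called from the worker thread for every decoded block.
    // Returns false (and drops the block) if the source is not ready yet.
    bool queueBlock(std::size_t slot, int value) {
        if (!sourceReady.load(std::memory_order_acquire)) return false;
        if (slot >= buffers.size()) return false;   // bounds check instead of a bad write
        buffers[slot] = value;
        return true;
    }

private:
    std::vector<int>  buffers;
    std::atomic<bool> sourceReady{false};
};
```

This only covers start-up. On shutdown the worker thread still has to be stopped and joined before anything is freed, since a block could be in flight when the flag flips.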
#5
03/01/2007 (2:17 pm)
Hi Danni,

Great resource. Thanks for providing us with this great start for VoIP!

I found a typo that when corrected makes the transmitted audio closer in level to the other in-game audio (prior to the change, the level of the transmitted voice was quite low).

In voice.cc...
Change...
speex_decoder_ctl(encoder, SPEEX_GET_FRAME_SIZE, &frame_size);
to...
speex_decoder_ctl(decoder, SPEEX_GET_FRAME_SIZE, &frame_size);

Another change I made was with the key binding from...
GlobalActionMap.bind(keyboard, f, localVoice);
to...
moveMap.bind( keyboard, f, localVoice );

In order for this to work, you have to either modify or remove your config.cs file as well. The result of using the GlobalActionMap.bind approach is that the keybind will be global, so it'll prevent you from typing an 'f' into the World Editor or the console.

Take care,

Ben
#6
03/02/2007 (6:33 pm)
Good catch. I modified the resource and zip to reflect these fixes.
#7
03/12/2007 (6:59 am)
Compiled and working in TGE 1.5.
Thanks Danni, great resource.
#8
03/15/2007 (5:42 pm)
Thanks! Glad to know it worked well for you.

Also, something else I would like to point out before someone asks: I do not mix the streams on the server, as I plan to have various channels for proximity speech, squad, unit, team, and admin. Trying to break this down and mix streams for each unique client is just insane. I would rather waste bandwidth than server-side CPU.

If you use a simpler structure, you could combine streams server side using a simple mixing algorithm; there are various examples out there on the net.
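
For reference, the simplest mixing algorithm is "add the samples and clamp". The sketch below works on already-decoded 16-bit PCM frames; mixFrames is a hypothetical helper, not something in the resource. A server doing this would have to decode every incoming Speex block, mix, and re-encode per listener, which is the CPU cost mentioned above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mix several decoded 16-bit PCM frames into one by summing and clamping.
// Shorter frames are treated as silence past their end.
std::vector<std::int16_t> mixFrames(const std::vector<std::vector<std::int16_t>>& frames) {
    std::size_t length = 0;
    for (const auto& f : frames) length = std::max(length, f.size());

    std::vector<std::int16_t> mixed(length, 0);
    for (std::size_t i = 0; i < length; ++i) {
        std::int32_t sum = 0;   // wider type so a handful of streams cannot overflow
        for (const auto& f : frames)
            if (i < f.size()) sum += f[i];
        // Clamp to the 16-bit range; loud overlaps will clip rather than wrap.
        sum = std::min<std::int32_t>(32767, std::max<std::int32_t>(-32768, sum));
        mixed[i] = static_cast<std::int16_t>(sum);
    }
    return mixed;
}
```

Hard clamping is audible when several people shout at once; scaling each stream down before summing is a common alternative.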
#9
03/29/2007 (11:44 am)
Hi guys,

I was having the race condition issue frequently enough to need a fix. Thus, I simply pulled the calls to initialize and deinitialize the VoiceComm out of gameConnection.cc and added two console functions to voice.cc as follows...

ConsoleFunction(InitializeVoIP, bool, 0, 0, "")
{
	GameConnection *gc=GameConnection::getConnectionToServer();
	AssertFatal(gc,"Couldn't getConnectionToServer()");
	voiceCommClient *vcc=gc->getVoiceComm();
	AssertFatal(vcc,"Couldn't gc->getVoiceComm()");
	return vcc->Initialize();
}

ConsoleFunction(ShutdownVoIP, void, 0, 0, "")
{
	GameConnection *gc=GameConnection::getConnectionToServer();
	AssertFatal(gc,"Couldn't getConnectionToServer()");
	voiceCommClient *vcc=gc->getVoiceComm();
	AssertFatal(vcc,"Couldn't gc->getVoiceComm()");
	vcc->Deinitialize();

}

Then, in ~/client/scripts/game.cs, I added the following two script functions...

function clientCmdStartupVoIP()
{
   // Startup VoIP - BJS
   if (getWord(alGetString("AL_VERSION"), 0) $= "OpenAL")
      {
      // OpenAL is running - probably a better way but this will work for testing.
      if (!InitializeVoIP())
         echo("VoIP could not be initialized.");
      else
         echo("VoIP initialized.");
      }
   else
      echo("OpenAL not initialized");

}

function clientCmdShutdownVoIP()
{
   ShutdownVoIP();
   echo("Shutting down VoIP");
}

Then, in ~/server/scripts/game.cs...

At the end of the GameConnection::onClientEnterGame function, I added...
// Startup VoIP - BJS
   commandToClient(%this, 'StartupVoIP');

At the end of the GameConnection::onClientLeaveGame function, I added...
commandToClient(%this, 'ShutdownVoIP');

Though I haven't yet implemented it myself, another benefit of initializing and deinitializing via script is that if the player brings up the options dialog and changes audio drivers, etc., you can restart VoIP as needed.

Hope this is helpful,

Ben
#10
03/29/2007 (11:44 am)
Hi guys,

I'm pretty sure my approach to this isn't the best approach, but it may be of use to some of you...

First, a little background. As Danni stated earlier in the thread, the initial resource sends the stream to all clients, including the sending client. Danni has a start for supporting sending to all, team, squad, etc., but I didn't need that many options and I didn't have a clear understanding of how best to use the "channel" approach. Thus I went with a simple "don't send the stream to the client that created it" approach. If anyone has a better way (I'm certain they will), please share. Otherwise, this may help others.

Note: All of my changes are in bold and have a comment with (at minimum) my initials - BJS

In voice.h, both the voiceBlock_t and the decodedVoiceBlock_t structs have a player property. The outVoiceBlock_t did not. I found this was the easiest way to get the ID of the sending client so I added it.

struct outVoiceBlock_t {
	char  data[ENCODED_SIZE];
	S16	  dataSize;
	S8	  channel;
	[b]U32		player; // BJS[/b]
};

Then in voice.cc, I made the following mods...

In the pack function of the outgoingVoiceBlock class...
virtual void pack   (NetConnection *conn, BitStream *bstream)
	   {
		   [b]dataBlock.player = conn->getId(); // Put the ID in the datablock's player prop - BJS[/b]
		   bstream->write(dataBlock.dataSize);
		   bstream->write(dataBlock.dataSize, dataBlock.data);
		   [b]bstream->write(dataBlock.player); //  write out the ID - BJS[/b]
	   }

In the unpack function of the outgoingVoiceBlock class...
virtual void unpack (NetConnection *inConn, BitStream *bstream)
	   {
		   outVoiceBlock_t clientDataBlock;
		   voiceBlock_t	   serverDataBlock;	
		   bstream->read(&clientDataBlock.dataSize);
		   bstream->read(clientDataBlock.dataSize, clientDataBlock.data);
		   [b]bstream->read(&clientDataBlock.player); // grab the ID - BJS[/b]
		   serverDataBlock.dataSize = clientDataBlock.dataSize;
		   memcpy( serverDataBlock.data,  clientDataBlock.data, clientDataBlock.dataSize);
		   [b]serverDataBlock.player = clientDataBlock.player; // Make sure the original ID gets sent on later - BJS[/b]

		   for( NetConnection *conn = NetConnection::getConnectionList(); conn; conn = conn->getNext())
		   {
			   if(conn->isConnectionToServer())
				   continue;
			   if (dynamic_cast<GameConnection*>(conn) && static_cast<GameConnection*>(conn)->isAIControlled())      
				   continue;
			   
				conn->postNetEvent(new incomingVoiceBlock(&serverDataBlock));
		   }

	   }

In the pack function of the incomingVoiceBlock class...
virtual void pack   (NetConnection *conn, BitStream *bstream)
	   {
		   bstream->write(dataBlock.dataSize);
		   bstream->write(dataBlock.dataSize, dataBlock.data);
		   [b]bstream->write(dataBlock.player); // BJS[/b]
		   [b]//bstream->write(conn->getId()); // BJS[/b]
	   }

In the unpack function of the incomingVoiceBlock class...
virtual void unpack (NetConnection *conn, BitStream *bstream)
	   {
		   voiceBlock_t	   serverDataBlock;	
		   bstream->read(&serverDataBlock.dataSize);
		   bstream->read(serverDataBlock.dataSize, serverDataBlock.data);
		   bstream->read(&serverDataBlock.player);

		   [b]// BJS added to eliminate sending message to self
		   if (conn->getId() != serverDataBlock.player)
				{[/b]
				GameConnection::getConnectionToServer()->
					getVoiceComm()->decodeBlock(&serverDataBlock);
				[b]}[/b]
	   }
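
A possible alternative, sketched here under assumptions, is to drop the sender on the server so the block is never sent back at all. Connection and relayTargets are hypothetical simplifications of the relay loop shown earlier. In the real loop the equivalent would be a check like "if (conn == inConn) continue;", assuming inConn is the connection the block arrived on.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a server-side connection.
struct Connection {
    std::uint32_t id;
    bool          isAI;
};

// Return the IDs of the connections a voice block should be posted to,
// leaving out AI clients and the connection the block arrived on.
std::vector<std::uint32_t> relayTargets(const std::vector<Connection>& all,
                                        std::uint32_t senderId) {
    std::vector<std::uint32_t> targets;
    for (const Connection& c : all) {
        if (c.isAI)           continue;   // bots have no use for voice data
        if (c.id == senderId) continue;   // do not echo back to the speaker
        targets.push_back(c.id);
    }
    return targets;
}
```

Filtering on the server saves the return bandwidth and avoids comparing object IDs that were assigned on different machines.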

Hope this is useful.

Ben
#11
03/29/2007 (11:49 am)
Hi guys,

Okay, so I figured I should contribute before asking for help. ;-) Here's my question/problem...

I'm running on Mac OS X. I don't know if those running on Windows are experiencing the same issue. If not, just knowing that could be helpful. Anyway, here's what I find. If I don't turn the audio input level WAY down - almost completely down in my Mac's Audio/MIDI (read global) settings, the VoiceComm is really distorted (clips). Is this true on Windows as well?

I'm on an Intel Mac, so I don't think it's an issue of byte order when doing block copies, etc.

Any ideas?

Thanks,

Ben
#12
03/31/2007 (8:55 am)
Thanks Benjamin, that's a good solution for the race condition. I often forget about the scripting and the flexibility there. I will modify the resource to include that fix later.

As to the clipping, that is a normal experience with Windows and Speex too. Also, I am not 100% sure this code is endian safe. I think I used all the Torque functions for handling memory, so if those are, this code should be too.
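
To check whether the distortion is already present in the captured audio, before Speex ever sees it, one option is to count how many samples sit at the limits of the 16-bit range. This is just a diagnostic sketch; clippedFraction is a hypothetical helper and assumes the capture buffer holds signed 16-bit samples.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Fraction of samples in a captured frame that sit at the 16-bit limits.
// A value well above zero suggests the input level is too hot.
double clippedFraction(const std::vector<std::int16_t>& frame) {
    if (frame.empty()) return 0.0;
    std::size_t clipped = 0;
    for (std::int16_t s : frame) {
        if (s == std::numeric_limits<std::int16_t>::max() ||
            s == std::numeric_limits<std::int16_t>::min())
            ++clipped;
    }
    return static_cast<double>(clipped) / static_cast<double>(frame.size());
}
```

If this stays near zero while the decoded voice still sounds distorted, the problem is more likely in the encode, transport or playback path than in the input level.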

If we initialized that way, we could pull it out of GameConnection completely. I keep getting this nagging feeling it should not be there.
#13
04/12/2007 (4:43 pm)
I have implemented this very useful resource and I stumbled upon another typo.
In voice.cc, in voiceCommClient::Deinitialize(), it should be:
speex_decoder_destroy(decoder);
instead of:
speex_encoder_destroy(decoder);

Without this, the game would crash for me when you exit the game.

I also use Marcelo Oliveira's Sound Upgrade resource, and I found you have to change the include in voice.h from audioStreamSource.h to audioStream.h in order to make it play nice with each other.
#14
05/01/2007 (6:35 am)
Hi,
I am using OGRE as the rendering engine. Can I use the code without linking to TGE?
Thanks
#15
05/07/2007 (3:32 pm)
No, this uses API specific to Torque. Feel free to derive one from this one.
#16
05/08/2007 (2:43 am)
Sorry, I am totally new to Speex and PortAudio.

How can I modify your code so that I can use it as a voice chat application?
Can I simply use Speex and PortAudio/OpenAL to implement voice chat?

Sorry for my many questions.
#17
05/09/2007 (4:08 pm)
It is fine. You can use PortAudio for input and output. I used OpenAL for output because it is already used in Torque and I wanted positional voice in some instances.

All you need to do is initialize PortAudio and Speex as you need them, record chunks with PA, encode them with Speex and send them over the net. On the receiving end, decode and buffer, then play back similar to how I have done here. If you are using Ogre, you will need a network library (or your own code) to send the encoded blocks. ENet is a nice simple one. I have used RakNet before but I really dislike it.

You will also need a method of playback, since you already have portaudio, you can use that until you get something up and running in Ogre.
#18
05/09/2007 (8:30 pm)
Oh, I assumed PA could stream the encoded data to the other side, can't it?

I may have misunderstood the usage of PA. I assumed PA receives voice from the microphone, Speex compresses the voice, and PA then sends the encoded voice to the other side.

Please correct me if I am wrong.
Many thanks! I will have a look at ENet.
#19
05/09/2007 (8:42 pm)
Sorry,
also, I didn't see that your code has a client-server architecture. How can we select a person or a group of persons to talk to? Can ENet provide this kind of feature?
#20
05/11/2007 (4:43 pm)
Nope:

PA -> Speex Enc -> Sender Enet -> Receiver Enet-> Speex Dec -> PA

My code uses Torque's internal client/server architecture, so I don't need another port just for voice. ENet is just a simple lightweight protocol built on UDP. You will need to create a higher level protocol on top of this to transport and mix the voice data.
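
As a starting point for such a higher level protocol, here is a minimal sketch of framing one encoded block for the wire. The layout (2-byte size, 4-byte sender ID, then the payload, all little-endian) and the names VoicePacket, serialize and deserialize are assumptions made up for this example. They mirror the size/data/player fields discussed earlier in the thread but are not the format Torque's BitStream produces.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical wire format for one encoded voice block:
//   2 bytes payload size, 4 bytes sender ID (both little-endian), then the payload.
struct VoicePacket {
    std::uint32_t             player = 0;
    std::vector<std::uint8_t> payload;   // Speex-encoded bytes
};

// Returns an empty buffer if the payload does not fit the 16-bit size field.
std::vector<std::uint8_t> serialize(const VoicePacket& p) {
    std::vector<std::uint8_t> out;
    if (p.payload.size() > 0xFFFF) return out;
    const std::uint16_t size = static_cast<std::uint16_t>(p.payload.size());
    out.push_back(static_cast<std::uint8_t>(size & 0xFF));
    out.push_back(static_cast<std::uint8_t>(size >> 8));
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((p.player >> shift) & 0xFF));
    out.insert(out.end(), p.payload.begin(), p.payload.end());
    return out;
}

// Returns false if the buffer is truncated or the size field is inconsistent.
bool deserialize(const std::vector<std::uint8_t>& in, VoicePacket& p) {
    const std::size_t header = 6;
    if (in.size() < header) return false;
    const std::size_t size = in[0] | (static_cast<std::size_t>(in[1]) << 8);
    if (in.size() != header + size) return false;
    p.player = 0;
    for (int i = 0; i < 4; ++i)
        p.player |= static_cast<std::uint32_t>(in[2 + i]) << (8 * i);
    p.payload.assign(in.begin() + header, in.end());
    return true;
}
```

Writing the fields byte by byte keeps the format independent of the host's byte order, so the same code works between little-endian and big-endian machines.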