Audio Capture / Compositing
by Anthony Rosenbaum · in Torque 3D Professional · 06/25/2012 (7:58 am) · 1 replies
We at Plastic Games are working on a project that requires dynamically outputting an MPEG-4 video. We were happy to see that a framework for video capture is already in the engine; all we had to do was write a new video encoder (Revel, in this case).
However, I am having trouble figuring out how to extract sound. In-game sounds (ambient noise, etc.) are not required; we want to overlay the captured video with background music and pre-recorded voice-over.
Revel's approach to adding audio is to append a sound track after the video has been captured, which means we either need to capture the audio as it plays in-game OR composite two sounds together dynamically. Either way, we take the resulting audio and then append it to the video capture.
I am just now exploring the SFX system and was hoping to get some hints from the community. I see things like SFXVoice and SFXBuffer. Can anyone outline a process for this?
Associate Rene Damm
All of the SFX backend APIs except XAudio2 support setting up capture devices and then tapping the raw sample stream, so the quickest way would probably be to decide on one backend API and just add two quick script functions to start and stop recording. FMOD ships an example with its SDK. Making all of this stream to disk in the background is still a bit of work, though.
Worse, though, is that you're basically tapping into the raw mixer stream, which means you get everything that's playing in one final mix (which may include stuff that doesn't even come from your application). Capturing only select channels definitely seems a little more involved to me. Honestly, though, I don't really have any experience with sound capturing, so I'm more or less just relying on documentation and Google here.
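To give a rough idea of the "stream to disk in the background" part mentioned above, here is a minimal sketch in Python rather than engine code. It assumes the capture side hands over raw 16-bit PCM chunks from some callback; the class name BackgroundWavWriter and its push/stop methods are made up for illustration and are not part of Torque, FMOD, or any other backend API. Only the standard library wave, queue, and threading modules are used.

```python
import queue
import threading
import wave


class BackgroundWavWriter:
    """Hypothetical helper: writes PCM chunks to a WAV file on a worker thread."""

    def __init__(self, path, channels=2, sample_width=2, rate=44100):
        self._queue = queue.Queue()
        self._wav = wave.open(path, "wb")
        self._wav.setnchannels(channels)
        self._wav.setsampwidth(sample_width)  # bytes per sample, 2 = 16-bit
        self._wav.setframerate(rate)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def push(self, pcm_bytes):
        # Meant to be called from the capture side; it only enqueues the
        # chunk, so the caller never waits on disk I/O.
        self._queue.put(pcm_bytes)

    def _run(self):
        # Worker loop: drain the queue and append each chunk to the file.
        while True:
            chunk = self._queue.get()
            if chunk is None:  # sentinel pushed by stop()
                break
            self._wav.writeframes(chunk)

    def stop(self):
        # Flush everything queued so far, then finalize the WAV header.
        self._queue.put(None)
        self._thread.join()
        self._wav.close()
```

The "start recording" script function would then create one of these and the "stop recording" function would call stop(); how the chunks actually get out of the backend is the part that depends on the API you pick.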
If the audio doesn't really have to be a real-time in-game capture but can instead be constructed after the fact, it may be better to render the audio stream separately. If you need cues for the voice-overs, maybe you could save a temporary file with position markers so you know where to cue which pre-recorded voice-over. Then you could render it all together with the background audio into a final stream. An added benefit would be that you'd reduce the cost of capturing, as only the video would need to stream to disk live.
But then, it all seems mighty complicated this way.
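For what it's worth, the after-the-fact rendering idea above might be sketched like this in Python. Everything here is an assumption for illustration: the marker file format (one "seconds clip_name" pair per line), the function names parse_cues and mix_with_cues, and the use of mono 16-bit samples for both the music bed and the voice-over clips. Decoding the source audio and writing the final stream are left out.

```python
import array


def parse_cues(text):
    """Parse a hypothetical marker file: one '<seconds> <clip name>' per line."""
    cues = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        seconds, name = line.split(maxsplit=1)
        cues.append((float(seconds), name))
    return cues


def mix_with_cues(music, voice_clips, cues, rate=44100):
    """Add each cued voice-over clip onto a copy of the music bed.

    music       -- sequence of mono 16-bit samples (the background track)
    voice_clips -- dict mapping clip name to a sequence of 16-bit samples
    cues        -- list of (seconds, clip name) pairs, e.g. from parse_cues
    """
    out = array.array("h", music)
    for seconds, name in cues:
        start = int(seconds * rate)  # cue time converted to a sample index
        for i, sample in enumerate(voice_clips[name]):
            pos = start + i
            if pos >= len(out):
                break  # clip runs past the end of the music bed
            # Sum the two sources and clamp to the 16-bit range.
            out[pos] = max(-32768, min(32767, out[pos] + sample))
    return out
```

The result could then be written out as one audio file and handed to the encoder as the track to append, so only the video needs to be captured live.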