How does Discord hook into a specific process's audio?

Question

Going by Google search results, there is no widely known way to capture audio from a specific application on Microsoft Windows without resorting to workarounds such as routing that process's audio to a separate virtual audio loopback device. That approach also leaves you unable to hear the sound, unless you either use a hardware loopback playback device or "listen" to the emulated input through the main output.

These workarounds are clunky and require configuration for each application, and software will often misbehave, go silent, or stop working outright if its output device is changed while it is running. Meanwhile, starting a Discord "Live Streaming" session lets you easily and reliably share a single application's sound with a VoIP group call. Sound from other applications is completely removed. Looking at the audio devices, it appears that no virtual loopback routing is taking place, and there is no interruption in audio playback on the client side. The functionality isn't available on the macOS or Linux versions of the software, only on Windows. Capturing sound from a specific process is thus possible in Win32, so why isn't anyone else doing this? What would it take to implement something like this in a fork of software where it would be extremely useful, such as OBS or Audacity?

MarioMan22 · Accepted Answer

EDIT: Not sure if this is useful at all, but I found this page: https://obsproject.com/forum/threads/audio-sources.465/

In particular, this strikes me as useful information:

It's quite similar to hooking Direct3D. You hook the IAudioRenderClient interface, and intercept GetBuffer to read the audio samples.
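To make the quoted idea concrete, here is a minimal pure-Python model of it. It is not real WASAPI code: `FakeRenderClient` is a hypothetical stand-in for IAudioRenderClient, and its `get_buffer`/`release_buffer` methods only loosely mirror GetBuffer and ReleaseBuffer. One detail worth noting is that the application fills the buffer after GetBuffer returns, so the sketch remembers the buffer in the GetBuffer hook and copies the samples in the ReleaseBuffer hook. A real hook would be native code running inside the target process.

```python
class FakeRenderClient:
    """Hypothetical stand-in for IAudioRenderClient (not the real COM interface)."""

    def __init__(self, bytes_per_frame):
        self.bytes_per_frame = bytes_per_frame
        self.played = bytearray()  # what the pretend audio engine receives
        self._buffer = None

    def get_buffer(self, num_frames):
        # Hand the application a writable buffer, like GetBuffer.
        self._buffer = bytearray(num_frames * self.bytes_per_frame)
        return self._buffer

    def release_buffer(self, num_frames_written):
        # Submit the frames the application wrote, like ReleaseBuffer.
        n = num_frames_written * self.bytes_per_frame
        self.played += self._buffer[:n]
        self._buffer = None


def hook_render_client(client, captured):
    """Wrap both methods on one client and copy submitted audio into `captured`."""
    original_get = client.get_buffer
    original_release = client.release_buffer
    state = {"buffer": None}

    def hooked_get(num_frames):
        # Remember which buffer the application is about to fill.
        state["buffer"] = original_get(num_frames)
        return state["buffer"]

    def hooked_release(num_frames_written):
        # The samples are complete now, so copy them before passing the call on.
        if state["buffer"] is not None:
            n = num_frames_written * client.bytes_per_frame
            captured.extend(state["buffer"][:n])
        state["buffer"] = None
        original_release(num_frames_written)

    client.get_buffer = hooked_get
    client.release_buffer = hooked_release


# Demo: the "application" writes two frames of 16-bit stereo audio.
client = FakeRenderClient(bytes_per_frame=4)
captured = bytearray()
hook_render_client(client, captured)
buf = client.get_buffer(2)
buf[:8] = bytes(range(8))
client.release_buffer(2)
print(captured == client.played)
```

Playback is untouched because the original methods are still called. That matches the observation in the question that there is no interruption in audio on the client side.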

Beginner's reverse engineering time!

Also, I cannot give a definitive answer, but I can point you in the right direction.

Discord has a directory called \modules\discord_hook inside its root directory. There we find a JavaScript file named index.js, a JSON file named manifest.json, a .node file named discord_hook.node (a compiled binary, which I cannot read), a directory with .dll and .exe files, and a log file that it generates, named hook.log.

index.js appears to just load discord_hook.node and do some other things that aren't important to us.

Googling manifest.json brings me here: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/manifest.json

The manifest.json file is the only file that every extension using WebExtension APIs must contain.

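In the manifest.json file, we find that it references the .exe and .dll files, discord_hook.node, index.js, and itself, which suggests that here it is just a file list for the module rather than a WebExtension manifest.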

The .node file, as previously mentioned, is a compiled native Node.js addon and is for the most part unreadable by a human being.

hook.log doesn't contain anything obviously helpful, just messages about the graphics/video share.

That leaves the .exe and .dll files in the subdirectory. I have no knowledge of assembly, but we can look at some of the strings left inside these binaries.

I found a section of strings referencing audio from offset 1266B4 to offset 126EA6 in DiscordHook.dll. (These offsets may, and almost definitely WILL, change in future versions of Discord.)
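If you want to repeat this on your own copy, a rough sketch of the string search looks like the following. The function names and the keyword list are my own choices for illustration, and it only finds plain ASCII strings; Windows binaries often store text as UTF-16 too, which this would miss.

```python
import re


def find_ascii_strings(data, min_length=6):
    """Return (offset, text) pairs for runs of printable ASCII in `data`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]


def find_audio_strings(data, keywords=("wasapi", "iaudio", "ispatialaudio")):
    """Keep only strings that mention one of the (assumed) keywords, ignoring case."""
    return [
        (offset, text)
        for offset, text in find_ascii_strings(data)
        if any(keyword in text.lower() for keyword in keywords)
    ]


# Demo on a small made-up byte string instead of the real DLL.
sample = (
    b"\x00\x01HookWasapi failed to load audioses.dll\x00\xff\x00"
    b"IAudioRenderClient::GetBuffer\x00junk\x00"
)
for offset, text in find_audio_strings(sample):
    print(hex(offset), text)
```

On the real file you would pass in the bytes read from DiscordHook.dll (for example `open(path, "rb").read()`) and compare the printed offsets with the ones above.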

Here are some of the strings that seem worth posting:

Audio buffer stopped, WASAPI capture stopping
Failed to get format of WASAPI audio buffer, not capturing, error code [%d]
Failed to get WASAPI audio client from render client, not capturing
Starting capture of WASAPI buffer with sample rate %d, depth %d, %d channels
Starting capture of Windows Sonic stream with downmix sample rate %d, depth %d, %d channels
ISpatialAudioObjectRenderStream::Stop
ISpatialAudioObjectRenderStream::BeginUpdatingAudioObjects
ISpatialAudioObjectRenderStream::EndUpdatingAudioObjects
ISpatialAudioObject::GetBuffer
HookWasapi failed to load audioses.dll
WaveFormatFromRenderClient failed with error code [%d]
LoadWASAPIOffsets failed with error code [%d]
WASAPI module sizes don't match (expected: %lu, actual: %lu)
WASAPI offsets invalid (stop: %lu, getBuffer: %lu, releaseBuffer: %lu, clientOffset: %lu, endpointOffset: %lu)
WASAPI offsets out of bounds (size: %lu, stop: %lu, getBuffer: %lu, releaseBuffer: %lu)
IAudioClient::Stop
IAudioRenderClient::GetBuffer
IAudioRenderClient::ReleaseBuffer
HookWasapi: MH_ApplyQueued failed 0x%x
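
Reading between the lines of these messages, the hook seems to load precomputed offsets for Stop, GetBuffer and ReleaseBuffer inside audioses.dll, refuse to continue if the loaded module's size differs from the size those offsets were computed for, and then apply its hooks (MH_ApplyQueued is a function from the MinHook library). That is a guess from the strings alone. The sketch below only models the sanity checks the messages hint at; `WasapiOffsets`, its field names, and the rule that a zero offset counts as invalid are all my assumptions, not anything recovered from the binary.

```python
from dataclasses import dataclass


@dataclass
class WasapiOffsets:
    """Hypothetical record; field names are guesses based on the log strings."""
    expected_module_size: int
    stop: int
    get_buffer: int
    release_buffer: int


def check_offsets(offsets, actual_module_size):
    """Return None if the offsets look usable, else a short reason string."""
    if offsets.expected_module_size != actual_module_size:
        return "module sizes don't match"
    targets = (offsets.stop, offsets.get_buffer, offsets.release_buffer)
    if any(target <= 0 for target in targets):
        return "offsets invalid"  # assumption: zero means "not found"
    if any(target >= actual_module_size for target in targets):
        return "offsets out of bounds"
    return None


# Demo with made-up numbers.
good = WasapiOffsets(expected_module_size=0x1000, stop=0x100,
                     get_buffer=0x200, release_buffer=0x300)
print(check_offsets(good, 0x1000))
print(check_offsets(good, 0x2000))
```

Checks like these would explain why the hook can silently stop capturing after a Windows update replaces audioses.dll with a build the offsets were not made for, though again that is speculation on my part.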

Also, I googled "hook process audio" and found this: https://ywjheart.wordpress.com/2017/02/26/audio-captureapihook-based-for-obs-studio/

It doesn't give any code examples or downloads, but it describes doing this very thing in OBS. The author also links the material they used at the bottom.

Good luck, I hope all this information can help in some way!

Tags: reverse-engineering, winapi, hook, audio

Manchineel
