
For surround sound audio (5.1/7.1) extract only the center channel audio from the video file. #54

Open

@varenc varenc commented Oct 27, 2024

The center channel should contain all the dialogue with less background noise.

Made minor tweaks to detect a center channel and apply an additional ffmpeg -af 'pan=c0=FC' filter for center channel extraction. In limited testing this slightly increases speech detection accuracy.

Downside: this may cause worse detection in some cases, or very bad detection if for some reason the dialogue isn't on the FC (front center) channel. It will also cause ffmpeg errors for audio that has 6 or 8 channels but no center channel, e.g. 6.0(front).

TODO: Make this a configurable option in the future.

Extra PR note: I don't really think this should get merged as is; I'm just planting the seed that this improvement is reasonable. I think it needs to be a configurable option, and the audio detection should be more robust: it should check for actual 5.1, 5.1(side), 7.1, 7.1(wide), etc. channel layouts instead of just seeing if the audio has 6 or 8 channels. It also needs more confirmation that media really does always have all the subtitled dialogue on the FC channel.
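To illustrate the more robust check described above, here is a minimal sketch (not the code in this PR) that decides on the filter from the channel layout name rather than the channel count, and only when an opt-in flag is set. The function names, the `enabled` flag, and the layout whitelist are hypothetical; the whitelist only covers the layouts named in this PR plus 7.1(wide-side), and it assumes ffprobe reports the layout in its `channel_layout` field. It also uses the `pan=mono|c0=FC` form from the ffmpeg pan filter syntax rather than the shorthand written above.

```python
import json

# Assumption: layout names as reported in ffprobe's `channel_layout` field.
# Only layouts mentioned in this PR (plus 7.1(wide-side)) are listed; each
# of them contains an FC (front center) channel.
LAYOUTS_WITH_CENTER = {"5.1", "5.1(side)", "7.1", "7.1(wide)", "7.1(wide-side)"}


def probe_command(path):
    """Build an ffprobe command that prints the first audio stream's layout as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=channels,channel_layout",
        "-of", "json",
        path,
    ]


def center_filter_from_probe(probe_json, enabled=True):
    """Return a pan filter extracting FC, or None if it should not be applied."""
    if not enabled:
        return None
    streams = json.loads(probe_json).get("streams", [])
    if not streams:
        return None
    layout = streams[0].get("channel_layout", "")
    # Decide on the layout name, not on `channels == 6 or 8`, so that
    # layouts such as 6.0(front) (no FC) are left untouched.
    if layout in LAYOUTS_WITH_CENTER:
        return "pan=mono|c0=FC"
    return None


def audio_filter_args(probe_json, enabled=True):
    """Extra ffmpeg arguments for the extraction step (empty list if no filter)."""
    audio_filter = center_filter_from_probe(probe_json, enabled)
    return ["-af", audio_filter] if audio_filter else []
```

The output of `probe_command(path)` (run with `subprocess.run(..., capture_output=True, text=True)`) would be passed to `audio_filter_args` as `probe_json`. Anything not on the whitelist falls back to the existing behavior.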


varenc commented Oct 28, 2024

To confirm that media with 5.1 audio really does have all the dialogue on the FC channel, I used this helpful ffmpeg command:

ffmpeg -i <input_media> -af 'asplit[a1][a2];[a1]pan=mono|c0=FC[a1];[a2]pan=mono|c0=0.5*FL+0.5*FR+0*FC+0.707*LFE+0.5*BL+0.5*BR[a2];[a1][a2]amerge=inputs=2' -c:a aac -c:v copy -c:s copy out.mkv

That takes a media file with 5.1 sound and downmixes the audio to stereo so that only the FC (center) channel is on the left channel, while all the other channels are downmixed to mono and put on the right channel. The numbers in the second pan filter just downmix the 5.1 audio, with 0*FC making sure the FC channel isn't included.

The result is a stereo media file you can listen to with earbuds to confirm that all dialogue is in the left earbud while everything else is in the right earbud. In my brief testing I found all dialogue to always be on the FC channel. A file like this also demonstrates how FC alone is a cleaner source for dialogue, with less background noise.
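For anyone who wants to script this check, here is a small sketch that builds the same filtergraph from a table of downmix weights, which makes it easier to see what each part does and to adjust the weights. The helper names and the `WEIGHTS_5_1` table are just for illustration; the weights are the ones from the command above, and the channel names assume a plain 5.1 layout (with BL/BR rather than SL/SR).

```python
# Downmix weights for the "everything except dialogue" side, copied from the
# command above. FC is weighted 0 so the center channel is excluded.
WEIGHTS_5_1 = {"FL": 0.5, "FR": 0.5, "FC": 0, "LFE": 0.707, "BL": 0.5, "BR": 0.5}


def dialogue_check_filter(weights):
    """Build the filtergraph: FC alone on the left, the rest mixed on the right."""
    rest = "+".join(f"{gain:g}*{channel}" for channel, gain in weights.items())
    return (
        "asplit[a1][a2];"                # duplicate the input audio
        "[a1]pan=mono|c0=FC[a1];"        # copy 1: center channel only
        f"[a2]pan=mono|c0={rest}[a2];"   # copy 2: weighted mix of the channels
        "[a1][a2]amerge=inputs=2"        # merge the two mono streams into 2 channels
    )


def dialogue_check_command(src, dst):
    """Full ffmpeg argument list; video and subtitle streams are copied as-is."""
    return [
        "ffmpeg", "-i", src,
        "-af", dialogue_check_filter(WEIGHTS_5_1),
        "-c:a", "aac", "-c:v", "copy", "-c:s", "copy",
        dst,
    ]
```

`dialogue_check_command("in.mkv", "out.mkv")` can be passed straight to `subprocess.run`; for a 5.1(side) source the BL/BR entries would need to be swapped for SL/SR.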


varenc commented Oct 28, 2024

Update: I found some counterexample media where extracting just the center FC channel gives much worse, essentially nonsense, results. I haven't figured out exactly why yet, but my guess is that on this particular media some of the dialogue is missing from the FC channel. So I definitely wouldn't recommend this as the default behavior.
