google_rtc_audio_processing: Major rework
Lots of work on AEC, with an eye to getting it working on main, where
it had bitrotten, and pulling in various features (IPC3 pipeline state
management and DP scheduling) that had merged in other branches.

Dynamically configure stream formats (sample format/rate and channel
count) from the connected streams at prepare() time instead of relying
on build-time tuning.

Port the code to use the source/sink API, in as sophisticated a manner
as I can find.  Copies unroll cleanly into just a few instructions per
sample, including integer/float conversions and de-/interleaving.

Support both 16 and 32 bit sample formats, with a fairly clever
inlining scheme to share as much code as possible between them.  The
component will select "copy" function pointers at prepare() time.
This works by using the _float32 variant of the AEC API (which is
actually the core internal implementation) instead of the _int16 one
which involves a conversion.
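
A minimal sketch of how such a scheme might look, not the component's actual code. All names here (`deinterleave_common`, `select_deinterleave`, `enum sample_fmt`) are hypothetical. One inline body handles both sample widths; each thin wrapper passes a constant flag so the compiler can specialize it, and a prepare()-like hook picks the wrapper once:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format tags; a real component would read these from its streams. */
enum sample_fmt { FMT_S16, FMT_S32 };

/* Pointer type selected once at prepare() time (illustrative name). */
typedef void (*deinterleave_fn)(const void *src, float *const *dst,
				size_t frames, size_t chans);

/* Scale factors mapping full-scale integers onto [-1.0, 1.0). */
#define S16_TO_FLOAT (1.0f / 32768.0f)
#define S32_TO_FLOAT (1.0f / 2147483648.0f)

/*
 * Shared body: converts interleaved integer frames into one float array
 * per channel.  "is16" is a constant at every call site, so after inlining
 * the compiler can drop the branch that is never taken.
 */
static inline void deinterleave_common(const void *src, float *const *dst,
				       size_t frames, size_t chans, int is16)
{
	assert(src != NULL && dst != NULL);

	for (size_t f = 0; f < frames; f++) {
		for (size_t c = 0; c < chans; c++) {
			size_t i = f * chans + c;

			dst[c][f] = is16
				? ((const int16_t *)src)[i] * S16_TO_FLOAT
				: ((const int32_t *)src)[i] * S32_TO_FLOAT;
		}
	}
}

/* Format-specific entry points; each is a specialization of the shared body. */
static void deinterleave_s16(const void *src, float *const *dst,
			     size_t frames, size_t chans)
{
	deinterleave_common(src, dst, frames, chans, 1);
}

static void deinterleave_s32(const void *src, float *const *dst,
			     size_t frames, size_t chans)
{
	deinterleave_common(src, dst, frames, chans, 0);
}

/* Would be called once from a prepare()-like hook. */
static deinterleave_fn select_deinterleave(enum sample_fmt fmt)
{
	return fmt == FMT_S16 ? deinterleave_s16 : deinterleave_s32;
}
```

The per-sample work then reduces to a load, an integer-to-float conversion, a multiply and a store, and the format decision is made only once per stream setup.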

The large buffers required by this component (input and output staging
and an internally-managed pool/heap block) are now static symbols
instead of dynamic memory from the heap.  These are very large, taking
up about half of what is available to the linker on MTL.  Relying on
heap allocation is just dangerous in this context.

This fully decouples AEC from the playback stream.  It will run
without an active reference happily, feeding zeros to the processing,
and pick up in stride when the pipeline starts.  This requires adding
a trigger handler for pipeline control in IPC3, which will propagate
certain triggers across pipeline boundaries when shutting down
playback streams, breaking active capture.  Note that this feature is
unexercisable on IPC4, where the kernel automatically starts up
connected pipelines.
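
The zero-feeding behaviour can be pictured roughly as follows. This is an illustrative sketch only: `fill_reference` and its parameters are made-up names, and the real component works through the source/sink API rather than on bare float arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill one block of reference samples for the echo canceller.
 * When the playback reference is inactive, or delivers fewer frames than
 * requested, the remainder is padded with silence so processing can
 * continue.  Returns the number of real reference frames copied.
 */
static size_t fill_reference(float *dst, size_t frames,
			     const float *ref, size_t ref_frames,
			     bool ref_active)
{
	size_t copied = 0;

	assert(dst != NULL);

	if (ref_active && ref != NULL) {
		copied = ref_frames < frames ? ref_frames : frames;
		memcpy(dst, ref, copied * sizeof(*dst));
	}

	/* All-zero bytes represent 0.0f on IEEE 754 targets. */
	memset(dst + copied, 0, (frames - copied) * sizeof(*dst));
	return copied;
}
```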

Also fixes a few bugs and misfeatures:

+ Chunk the copies by full buffer strides between AEC processing calls
  instead of testing at each copied frame.

+ Copy the reference and mic streams in tandem, preventing them from
  becoming out of sync if the devices weren't themselves synchronized.

+ Copy the AEC results to the output stream after the call to
  ProcessCapture() instead of before.  This was a hidden latency bug
  in the original code, I think.

+ Clean up the Kconfig to remove stale variables and guard all
  the component-specific tunables under the top-level component
  variable.  Also use a default instead of a select to couple the
  mock to CONFIG_COMP_STUBS, so the setting can be changed manually.
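
The first three fixes above can be sketched together in a few lines. This is a simplified, single-channel illustration under stated assumptions: `CHUNK_FRAMES`, `struct aec_state`, `fake_process` and `aec_copy` are all hypothetical, and `fake_process` merely stands in for the real ProcessCapture() call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_FRAMES 4	/* stand-in for the AEC processing block size */

struct aec_state {
	float mic[CHUNK_FRAMES];
	float ref[CHUNK_FRAMES];
	float out[CHUNK_FRAMES];
	size_t fill;		/* frames currently staged in mic[] and ref[] */
};

/* Placeholder for the real echo canceller: subtracts ref from mic. */
static void fake_process(struct aec_state *s)
{
	for (size_t i = 0; i < CHUNK_FRAMES; i++)
		s->out[i] = s->mic[i] - s->ref[i];
}

/*
 * Consume "frames" frames from mic and ref in lockstep.  Copies run in
 * strides up to the next block boundary instead of testing per frame,
 * and results are written to the sink only after processing.
 * Returns the number of frames written to sink.
 */
static size_t aec_copy(struct aec_state *s, const float *mic,
		       const float *ref, size_t frames, float *sink)
{
	size_t written = 0;

	assert(s != NULL && mic != NULL && ref != NULL && sink != NULL);

	while (frames > 0) {
		size_t n = CHUNK_FRAMES - s->fill;

		if (n > frames)
			n = frames;

		/* Same stride for both streams keeps them aligned. */
		memcpy(&s->mic[s->fill], mic, n * sizeof(float));
		memcpy(&s->ref[s->fill], ref, n * sizeof(float));
		s->fill += n;
		mic += n;
		ref += n;
		frames -= n;

		if (s->fill == CHUNK_FRAMES) {
			fake_process(s);
			memcpy(&sink[written], s->out, sizeof(s->out));
			written += CHUNK_FRAMES;
			s->fill = 0;
		}
	}
	return written;
}
```

Frames that do not complete a block stay staged in the state and are processed on a later call.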

Signed-off-by: Andy Ross <[email protected]>
andyross committed Jan 10, 2024
1 parent 1cc7a4c commit bd12891
Showing 4 changed files with 386 additions and 309 deletions.
27 changes: 12 additions & 15 deletions src/audio/google/Kconfig
@@ -15,7 +15,6 @@ config COMP_GOOGLE_HOTWORD_DETECT
 config COMP_GOOGLE_RTC_AUDIO_PROCESSING
 bool "Google Real Time Communication Audio processing"
 select COMP_BLOB
-select GOOGLE_RTC_AUDIO_PROCESSING_MOCK if COMP_STUBS
 default n
 help
 Select for Google real-time communication audio processing. It
@@ -24,6 +23,8 @@ config COMP_GOOGLE_RTC_AUDIO_PROCESSING
 This component takes raw microphones input and playback reference
 and outputs an echo-free microphone signal.

+if COMP_GOOGLE_RTC_AUDIO_PROCESSING
+
 config COMP_GOOGLE_RTC_AUDIO_PROCESSING_SAMPLE_RATE_HZ
 depends on COMP_GOOGLE_RTC_AUDIO_PROCESSING
 int "Sample rate for Google Real Time Communication Audio processing"
@@ -32,21 +33,15 @@ config COMP_GOOGLE_RTC_AUDIO_PROCESSING_SAMPLE_RATE_HZ
 Sets the sample rate for the memory buffer for the Google real-time
 communication audio processing.

-config COMP_GOOGLE_RTC_AUDIO_PROCESSING_NUM_CHANNELS
-depends on COMP_GOOGLE_RTC_AUDIO_PROCESSING
-int "Number of channels to process for Google Real Time Communication Audio processing"
-default 1
-help
-Sets the number of channels to process in the Google real-time
-communication audio processing.
-
-config COMP_GOOGLE_RTC_AUDIO_PROCESSING_NUM_AEC_REFERENCE_CHANNELS
-depends on COMP_GOOGLE_RTC_AUDIO_PROCESSING
-int "Number of AEC reference channels for Google Real Time Communication Audio processing"
+config COMP_GOOGLE_RTC_AUDIO_PROCESSING_CHANNEL_MAX
+int "Max number of AEC channels"
 default 2
 help
-Sets the number AEC reference channels in the Google real-time
-communication audio processing.
+Sets the maximum number source/sink channels Google Real
+Time Communication Audio Processing will use for. This is a
+computation and memory budget tunable. Channel counts are
+retrieved at runtime, but channels higher than this number
+are ignored (on input) or cleared (output).

 config COMP_GOOGLE_RTC_AUDIO_PROCESSING_MEMORY_BUFFER_SIZE_BYTES
 depends on COMP_GOOGLE_RTC_AUDIO_PROCESSING
@@ -74,10 +69,12 @@ config COMP_GOOGLE_RTC_AUDIO_PROCESSING_MIC_HEADROOM_LINEAR

 config GOOGLE_RTC_AUDIO_PROCESSING_MOCK
 bool "Google Real Time Communication Audio processing mock"
-default n
+default y if COMP_STUBS
 depends on COMP_GOOGLE_RTC_AUDIO_PROCESSING
 help
 Mock Google real-time communication audio processing.
 It allows for compilation check and basic audio flow checking.

+endif # COMP_GOOGLE_RTC_AUDIO_PROCESSING
+
 endmenu
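
The help text for the new CHANNEL_MAX option says that channels above the limit are ignored on input and cleared on output. A rough sketch of what that could mean per frame is shown below. It is an assumption-laden illustration rather than the component's code: `CHANNEL_MAX` stands in for CONFIG_COMP_GOOGLE_RTC_AUDIO_PROCESSING_CHANNEL_MAX, and `usable_channels`, `read_frame` and `write_frame` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for CONFIG_COMP_GOOGLE_RTC_AUDIO_PROCESSING_CHANNEL_MAX. */
#define CHANNEL_MAX 2

/* Channel count actually processed for a stream with the given width. */
static size_t usable_channels(size_t stream_channels)
{
	return stream_channels < CHANNEL_MAX ? stream_channels : CHANNEL_MAX;
}

/* Input side: copy at most CHANNEL_MAX channels, ignoring the rest. */
static void read_frame(const float *in_frame, size_t in_channels, float *proc)
{
	size_t n = usable_channels(in_channels);

	assert(in_frame != NULL && proc != NULL);

	for (size_t c = 0; c < n; c++)
		proc[c] = in_frame[c];
}

/* Output side: write processed channels, clear any beyond the limit. */
static void write_frame(const float *proc, float *out_frame,
			size_t out_channels)
{
	size_t n = usable_channels(out_channels);

	assert(proc != NULL && out_frame != NULL);

	for (size_t c = 0; c < out_channels; c++)
		out_frame[c] = c < n ? proc[c] : 0.0f;
}
```

With a limit of 2, a four-channel capture stream would contribute only its first two channels, and a four-channel output frame would carry silence in channels 2 and 3.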