Notes on WebRTC echo cancellation, after twelve device families

AEC defaults look fine in the office and embarrassing on a Pixel 4a in a café. Here's what actually worked across the matrix.

April 28, 2026 · 5 min read · android, webrtc

Acoustic echo cancellation (AEC) is a wall a lot of voice apps run into around month three. The defaults look fine on the engineer’s desk in a quiet office and embarrassing on a Pixel 4a in a café, and you only find out when a user sends you a clip of their own voice echoing back at them, half a second late, mid-sentence.

I’ve shipped two production WebRTC products for Numrah, and the AEC story has been the same on both: pick a default, accept that it’ll fail on roughly a quarter of devices, and build the matrix that tells you what to ship on that quarter. This post is that matrix, sketched honestly.

What WebRTC gives you for free

Out of the box, libwebrtc on Android gives you a software AEC, toggled with WebRtcAudioUtils.setWebRtcBasedAcousticEchoCanceler(true), in addition to whatever hardware AEC the device claims to expose. On most modern Snapdragon devices the hardware path is good enough that you can leave the software path off entirely. On low-end MediaTek chips and on a surprising number of Samsung A-series devices, the hardware path lies: AcousticEchoCanceler.isAvailable() returns true and then the effect does nothing audible.

The decision tree we ship looks like this:

1. If Build.MANUFACTURER and Build.MODEL match our deny-list (MediaTek-heavy OEMs and model lines we’ve been burned by), force software AEC and ignore the hardware path.
2. Otherwise, trust hardware AEC and verify it with a one-second silence probe at call start.
3. If the probe sees energy above −45 dBFS during silence, add software AEC for the rest of the call and remember the device fingerprint.
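
The threshold check in the last step comes down to a peak-level measurement. Here is a minimal sketch, assuming the probe hands back 16-bit PCM samples from the silence window; peakDbfs and hardwareAecLeaks are illustrative names, not part of libwebrtc or the Android SDK, and a production probe may well measure energy differently.

```kotlin
import kotlin.math.abs
import kotlin.math.log10

// Hypothetical helper: peak level of a 16-bit PCM buffer in dBFS,
// where 0 dBFS is full scale (a sample magnitude of 32768).
fun peakDbfs(samples: ShortArray): Double {
    // Widen to Int first so Short.MIN_VALUE (-32768) has a valid absolute value.
    val peak = samples.maxOfOrNull { abs(it.toInt()) } ?: 0
    if (peak == 0) return Double.NEGATIVE_INFINITY // digital silence or empty buffer
    return 20.0 * log10(peak / 32768.0)
}

// Hypothetical decision: anything louder than the threshold during the
// silence window is treated as echo leaking past the hardware AEC.
fun hardwareAecLeaks(samples: ShortArray, thresholdDbfs: Double = -45.0): Boolean =
    peakDbfs(samples) > thresholdDbfs
```

With these definitions, a buffer peaking at 100 out of 32768 measures roughly −50 dBFS and passes, while a peak of 1000 is roughly −30 dBFS and triggers the fallback.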

The matrix

I keep a spreadsheet of devices, OS versions, and the AEC mode that works. It’s the most useful artifact on the team. The shape of it, simplified:

Device family	Hardware AEC	Software AEC	Mode we ship
Pixel 6+	Good	Unnecessary	HW only
Pixel 4a / 5	Marginal	Helps	HW + SW
Samsung S22+	Good	Adds artifacts	HW only
Samsung A-series	Lies	Required	SW only
Xiaomi Redmi	Variable	Helps	HW + SW
iPhone 12+	Excellent	Disabled by OS	HW only

The honest version of this table is twice as long and changes every quarter. Don’t take this one as gospel — take the shape of it: build the matrix from your own user base, and trust your own probes more than any vendor’s documentation.
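
One way to carry a matrix like this into the app is as an ordered list of rules where the first match wins. The sketch below assumes that matching on the manufacturer plus a model prefix is enough. The names Mode, Rule, and lookupMode are placeholders, and every model prefix except Samsung's "SM-A" is a guess for illustration; this is not our production table.

```kotlin
// Illustrative names only; none of this is libwebrtc or Android API.
enum class Mode { HardwareOnly, HardwarePlusSoftware, SoftwareOnly }

// One row of the matrix: an OEM, a model prefix, and the mode that works there.
data class Rule(val manufacturer: String, val modelPrefix: String, val mode: Mode)

// The first matching rule wins, so put more specific prefixes first.
// Model prefixes are assumptions for the sketch; verify them on real devices.
val matrix = listOf(
    Rule("samsung", "SM-A", Mode.SoftwareOnly),
    Rule("google", "Pixel 4a", Mode.HardwarePlusSoftware),
    Rule("google", "Pixel 5", Mode.HardwarePlusSoftware),
    Rule("xiaomi", "Redmi", Mode.HardwarePlusSoftware),
)

// Returns the pinned mode for a known device family, or null when the
// device is not in the matrix and a runtime probe should decide instead.
fun lookupMode(manufacturer: String, model: String): Mode? =
    matrix.firstOrNull { rule ->
        manufacturer.equals(rule.manufacturer, ignoreCase = true) &&
            model.startsWith(rule.modelPrefix, ignoreCase = true)
    }?.mode
```

On a device this would be called as lookupMode(Build.MANUFACTURER, Build.MODEL). A null result means the family is unknown, which is exactly the case a runtime probe is for.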

Code that survived contact with users

Here’s the Kotlin we run at call start. We launch configure() alongside ICE gathering, so the one-second silenceProbe overlaps with work that’s already happening and doesn’t add latency:

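import android.media.audiofx.AcousticEchoCanceler
import android.os.Build
import org.webrtc.audio.AudioDeviceModule

// AecMode, silenceProbe() and applyMode() are our own helpers, defined elsewhere.
object AecConfig {
    // OEMs whose hardware AEC we never trust, whatever the model.
    private val SW_FORCED_OEMS = setOf("oppo", "realme")

    suspend fun configure(audio: AudioDeviceModule): AecMode {
        val oem = Build.MANUFACTURER.lowercase()
        val hwAvailable = AcousticEchoCanceler.isAvailable()

        // Samsung is only denied for the A-series (models starting with "SM-A").
        // The old check AND-ed the OEM set with that prefix, so Oppo and Realme
        // could never match it.
        val denied = oem in SW_FORCED_OEMS ||
            (oem == "samsung" && Build.MODEL.startsWith("SM-A"))
        if (denied) {
            return AecMode.SoftwareOnly.also { audio.applyMode(it) }
        }

        // Suspends for about a second. The caller launches configure() in
        // parallel with ICE gathering, so the wait overlaps with that work.
        val noisy = silenceProbe(audio, durationMs = 1000).peakDbfs > -45.0

        return when {
            hwAvailable && !noisy -> AecMode.HardwareOnly
            hwAvailable && noisy  -> AecMode.HardwarePlusSoftware
            else                  -> AecMode.SoftwareOnly
        }.also { audio.applyMode(it) }
    }
}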
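
The caller's side of that overlap isn't shown in the snippet. Below is a minimal sketch of the pattern, assuming kotlinx.coroutines. Both configureAec and gatherIceCandidates are hypothetical stand-ins that only delay, and the durations are made up.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlin.system.measureTimeMillis

// Stand-in for AecConfig.configure(): the one-second silence probe dominates.
suspend fun configureAec(): String {
    delay(1000)
    return "HardwareOnly"
}

// Stand-in for ICE gathering; the duration is invented for the sketch.
suspend fun gatherIceCandidates(): List<String> {
    delay(1200)
    return listOf("candidate-1", "candidate-2")
}

// Starts both in parallel and waits for both, so setup takes about as long
// as the slower of the two rather than the sum.
suspend fun prepareCall(): Pair<String, List<String>> = coroutineScope {
    val aec = async { configureAec() }
    val ice = async { gatherIceCandidates() }
    aec.await() to ice.await()
}

fun main() = runBlocking {
    val elapsed = measureTimeMillis { prepareCall() }
    println("call setup took about $elapsed ms")
}
```

The point of the structure is that both async blocks start before either await, so the probe's second is hidden whenever ICE gathering takes at least that long.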

The whole thing is sixty lines of Kotlin and one server-side classifier. The first version of it took an afternoon. The current version took fourteen months — every line of that grew out of a real call recording from a real user.

What I’d do differently

Build the matrix before you ship, not after. The devices you’ll see in production are not the devices on your desk.

If I were starting this project over, I would buy the five worst-rated Android phones on Daraz before writing the first line of audio code. Not the flagships, not the dev kits — the Tecno Spark 8s and the Itel A56s. Those are the devices my users actually carry, and those are the devices the defaults fail on.

Everything else flows from that. The matrix is just bookkeeping; the work is taking the bookkeeping seriously.

If you’ve shipped a voice product on Android and your matrix looks different from this one, I’d genuinely like to hear about it — the more device families in the corpus, the better the defaults can be.