Acoustic echo cancellation (AEC) is a wall a lot of voice apps run into around month three. The defaults look fine on the engineer’s desk in a quiet office and embarrassing on a Pixel 4a in a café, and you only find out when a user sends you a clip of their own voice echoing back at them, half a second late, mid-sentence.

I’ve shipped two production WebRTC products for Numrah, and the AEC story has been the same on both: pick a default, accept that it’ll fail on roughly a quarter of devices, and build the matrix that tells you what to send to that quarter instead. This post is that matrix, sketched honestly.

What WebRTC gives you for free

Out of the box, libwebrtc ships with WebRtcAudioUtils.setWebRtcBasedAcousticEchoCanceler(true) on Android — a software AEC that runs on top of whatever hardware AEC the device claims to expose. On most modern Snapdragon devices the hardware path is good enough that you can leave the software path off entirely. On low-end MediaTek chips and on a surprising number of Samsung A-series devices, the hardware path lies — it returns true from AcousticEchoCanceler.isAvailable() and then does nothing audible.

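The decision tree we ship is short: known-bad device families go straight to software AEC, and everything else gets a one-second silence probe whose result picks between hardware only, hardware plus software, and software only.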

The matrix

I keep a spreadsheet of devices, OS versions, and the AEC mode that works. It’s the most useful artifact on the team. The shape of it, simplified:

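| Device family    | Hardware AEC | Software AEC   | Mode we ship |
|------------------|--------------|----------------|--------------|
| Pixel 6+         | Good         | Unnecessary    | HW only      |
| Pixel 4a / 5     | Marginal     | Helps          | HW + SW      |
| Samsung S22+     | Good         | Adds artifacts | HW only      |
| Samsung A-series | Lies         | Required       | SW only      |
| Xiaomi Redmi     | Variable     | Helps          | HW + SW      |
| iPhone 12+       | Excellent    | Disabled by OS | HW only      |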

The honest version of this table is twice as long and changes every quarter. Don’t take this one as gospel — take the shape of it: build the matrix from your own user base, and trust your own probes more than any vendor’s documentation.
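
One way to carry a matrix like this into code is a small lookup table that returns a mode for known families and nothing for everyone else. The sketch below is illustrative only: the AecMode enum is a stand-in for whatever type your app uses, MatrixRow and lookupMode are hypothetical names, and the manufacturer strings and model prefixes are assumptions about what Build.MANUFACTURER and Build.MODEL report, so check them against your own fleet before trusting any row.

```kotlin
// Stand-in for the app's own mode type (assumption, not a libwebrtc class).
enum class AecMode { HardwareOnly, HardwarePlusSoftware, SoftwareOnly }

// One row of the matrix: a lowercase manufacturer, a model prefix, and the mode to ship.
data class MatrixRow(val manufacturer: String, val modelPrefix: String, val mode: AecMode)

// Illustrative rows taken from the simplified table above.
// The prefixes are assumptions; "Pixel 6" covers only the Pixel 6 family,
// so later Pixels would need rows of their own.
val MATRIX: List<MatrixRow> = listOf(
    MatrixRow("samsung", "SM-A", AecMode.SoftwareOnly),
    MatrixRow("google", "Pixel 4a", AecMode.HardwarePlusSoftware),
    MatrixRow("google", "Pixel 5", AecMode.HardwarePlusSoftware),
    MatrixRow("google", "Pixel 6", AecMode.HardwareOnly),
    MatrixRow("xiaomi", "Redmi", AecMode.HardwarePlusSoftware),
)

// Returns the first matching row's mode, or null when the device is not in the matrix.
// A null result means "no opinion": fall back to probing the device at call start.
fun lookupMode(manufacturer: String, model: String): AecMode? {
    val oem = manufacturer.lowercase()
    return MATRIX.firstOrNull { it.manufacturer == oem && model.startsWith(it.modelPrefix) }?.mode
}
```

Returning null for unknown devices keeps the table honest: it only speaks for families you have actually measured, and everything else goes through the runtime probe.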

Code that survived contact with users

Here’s the Kotlin we run at call start. We launch it alongside ICE gathering, so the one-second silenceProbe doesn’t add latency:

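import android.media.audiofx.AcousticEchoCanceler
import android.os.Build
import org.webrtc.audio.AudioDeviceModule

// AecMode, silenceProbe and applyMode are app-level helpers not shown here.
object AecConfig {
    // OEMs whose hardware AEC we bypass on every model.
    private val SW_FORCED_OEMS = setOf("oppo", "realme")

    suspend fun configure(audio: AudioDeviceModule): AecMode {
        val oem = Build.MANUFACTURER.lowercase()
        val hwAvailable = AcousticEchoCanceler.isAvailable()

        // The "SM-A" model prefix applies to Samsung only (A-series);
        // Oppo and Realme are forced to software regardless of model.
        val samsungASeries = oem == "samsung" && Build.MODEL.startsWith("SM-A")
        if (oem in SW_FORCED_OEMS || samsungASeries) {
            return AecMode.SoftwareOnly.also { audio.applyMode(it) }
        }

        // Listen for one second; a peak above -45 dBFS counts as noisy.
        // The probe is awaited inline, so any overlap with ICE gathering comes from the caller.
        val noisy = silenceProbe(audio, durationMs = 1000).peakDbfs > -45.0

        return when {
            hwAvailable && !noisy -> AecMode.HardwareOnly
            hwAvailable && noisy  -> AecMode.HardwarePlusSoftware
            else                  -> AecMode.SoftwareOnly
        }.also { audio.applyMode(it) }
    }
}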
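
The silenceProbe itself isn’t shown above. As a rough sketch of the measurement it reduces to, assuming the probe captures 16-bit signed PCM, peak level in dBFS is 20 times the base-10 logarithm of the largest absolute sample divided by full scale (32768). The function names and the NOISY_THRESHOLD_DBFS constant here are hypothetical; only the -45.0 value comes from the code above.

```kotlin
import kotlin.math.abs
import kotlin.math.log10

// Threshold taken from the configure() code above; the constant name is an assumption.
const val NOISY_THRESHOLD_DBFS = -45.0

// Peak level of a 16-bit PCM buffer in dBFS (0.0 is full scale, quieter is more negative).
// Returns negative infinity for an empty or all-zero buffer.
fun peakDbfs(samples: ShortArray): Double {
    var peak = 0
    for (s in samples) {
        // Widen to Int first so Short.MIN_VALUE (-32768) has a representable magnitude.
        val magnitude = abs(s.toInt())
        if (magnitude > peak) peak = magnitude
    }
    if (peak == 0) return Double.NEGATIVE_INFINITY
    return 20.0 * log10(peak / 32768.0)
}

// True when the buffer's peak exceeds the threshold.
fun isNoisy(samples: ShortArray): Boolean = peakDbfs(samples) > NOISY_THRESHOLD_DBFS
```

At 16 bits, -45 dBFS corresponds to a sample magnitude of roughly 184, so a buffer peaking at 120 counts as quiet and one peaking at 1000 counts as noisy. Keeping this as a pure function over a sample buffer makes the threshold easy to unit test against real call recordings.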

The whole thing is sixty lines of Kotlin and one server-side classifier. The first version of it took an afternoon. The current version took fourteen months — every line of that grew out of a real call recording from a real user.

What I’d do differently

Build the matrix before you ship, not after. The devices you’ll see in production are not the devices on your desk.

If I were starting this project over, I would buy the five worst-rated Android phones on Daraz before writing the first line of audio code. Not the flagships, not the dev kits — the Tecno Spark 8s and the Itel A56s. Those are the devices my users actually carry, and those are the devices the defaults fail on.

Everything else flows from that. The matrix is just bookkeeping; the work is taking the bookkeeping seriously.


If you’ve shipped a voice product on Android and your matrix looks different from this one, I’d genuinely like to hear about it — the more device families in the corpus, the better the defaults can be.