Could you please clarify one more point? I assume your previous answer was based on the premise that background noise was present from the very start of the call.
Would this issue still occur if the call began in a quiet environment without background noise, but background noise was introduced later?
Also, background noise can be either continuous or consist of sudden, momentary sounds; if the issue occurs during the call, is there any difference in how it manifests depending on whether the noise is continuous or momentary?
Original Message:
Sent: 06-16-2026 08:33
From: Cesar Padilla
Subject: Audio quality degradation in softphones using the WebRTC SDK
hi @Masahiro Shioi,
Great questions - you're now digging into the exact layer where this behavior originates 👍. Let me break it down clearly.
✅ Q1 – Why Echo Cancellation can degrade OP audio in noisy scenarios
What you're observing is a known side-effect of how WebRTC Acoustic Echo Cancellation (AEC) works internally.
🔍 How AEC works (simplified)
AEC is designed to remove the remote audio that leaks back into the microphone (echo). To do this, it:
- Takes the far-end audio signal (customer audio) as a reference
- Builds an adaptive filter model of the acoustic environment
- Subtracts an estimated version of that signal from the microphone input
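To make those three steps concrete, here is a toy sketch using a textbook NLMS (normalized least-mean-squares) adaptive filter. This is an assumption chosen for illustration, not the actual WebRTC AEC, which is considerably more elaborate. All names (makeNoise, nlmsEchoCancel, runDemo) and the parameter values are hypothetical.

```javascript
// Toy NLMS echo canceller. Illustrative only: NOT the WebRTC AEC implementation.

// Small deterministic pseudo-random generator so the demo is repeatable.
function makeNoise(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 0x80000000 - 1; // roughly uniform in [-1, 1)
  };
}

// farEnd: reference signal (what the loudspeaker plays)
// mic:    microphone signal (echo of farEnd plus anything else in the room)
// Returns the residual left after the estimated echo has been subtracted.
function nlmsEchoCancel(farEnd, mic, taps = 16, stepSize = 0.5) {
  const weights = new Float64Array(taps); // adaptive filter, starts zeroed
  const history = new Float64Array(taps); // most recent far-end samples
  const residual = new Float64Array(mic.length);
  const eps = 1e-6; // avoids division by zero

  for (let n = 0; n < mic.length; n++) {
    // Shift the newest far-end sample into the history buffer.
    history.copyWithin(1, 0, taps - 1);
    history[0] = farEnd[n];

    // Estimate the echo and measure the reference power.
    let estimate = 0;
    let power = eps;
    for (let k = 0; k < taps; k++) {
      estimate += weights[k] * history[k];
      power += history[k] * history[k];
    }

    // Subtract the estimate from the microphone input.
    const error = mic[n] - estimate;
    residual[n] = error;

    // Nudge the filter towards a better model of the echo path.
    const gain = (stepSize * error) / power;
    for (let k = 0; k < taps; k++) {
      weights[k] += gain * history[k];
    }
  }
  return residual;
}

// Mean power of a slice of the signal.
function meanPower(signal, start, end) {
  let sum = 0;
  for (let i = start; i < end; i++) sum += signal[i] * signal[i];
  return sum / (end - start);
}

// Demo: the microphone hears only an echo (delayed by 3 samples, scaled by 0.6).
function runDemo() {
  const noise = makeNoise(42);
  const length = 4000;
  const farEnd = Float64Array.from({ length }, () => noise());
  const mic = Float64Array.from(farEnd, (_, n) => (n >= 3 ? 0.6 * farEnd[n - 3] : 0));
  const residual = nlmsEchoCancel(farEnd, mic);
  return {
    early: meanPower(residual, 0, 200),
    late: meanPower(residual, length - 200, length),
  };
}
```

Calling runDemo() compares the residual power in the first 200 samples with the last 200. In this idealized setup the early residual is much larger, because the filter starts from zero and needs time to learn the echo path.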
⚠️ Why it fails in your scenario
In your case:
- The customer is in a high-noise environment
- That noise is coming through the far-end audio stream
- The operator mic is also active and sending voice
👉 The AEC engine tries to determine:
"What part of the mic input is echo vs real speech?"
During this process:
- Background noise correlates only weakly, and inconsistently, with the reference signal
- The adaptive filter becomes unstable or over-aggressive
- It may mistakenly suppress parts of the OP's voice
This causes exactly what you saw:
- Volume drops
- Intermittent muting
- "Pumping" or muffled audio
🎯 Why disabling AEC fixes it
When you disable echo cancellation:
- No subtraction model is applied
- The mic signal is passed through without modification by AEC
So:
✅ No misclassification
✅ No aggressive suppression
✅ Stable OP audio
💡 Important nuance
This issue is more likely when:
- AEC + Noise Suppression + AGC run together
- AND the far-end signal is noisy and non-stationary
That combination increases the chances of cross-interference between processing modules.
✅ Q2 – Why only during the first ~10 seconds
This is the most interesting part - and it strongly confirms it's an adaptive filter convergence issue.
🔍 What happens at call start
When a call begins, AEC:
- Starts with a zeroed or generic filter
- Begins a training/convergence phase
- Continuously adjusts coefficients based on:
  - Room acoustics
  - Delay
  - Signal correlation
⏱️ First ~10 seconds = stabilization period
During this phase:
- The filter is not yet accurate
- Signal classification is uncertain
- The system may:
  - Over-subtract
  - Under-subtract
  - Misinterpret noise as echo
👉 This creates temporal instability in the processed audio.
✅ After convergence
After a few seconds:
- The filter adapts correctly to:
  - Delay
  - Acoustic path
  - Signal characteristics
Result:
✅ Stable echo estimation
✅ Proper separation of speech vs echo
✅ No more OP audio degradation
🧠 Why your workaround works so well
By controlling getUserMedia and effectively:
- Disabling AEC in the browser layer
- Keeping AGC (optionally)
- Limiting processing complexity
You are:
✅ Avoiding unstable adaptive filtering
✅ Reducing interactions between DSP modules
✅ Getting deterministic audio behavior
🚀 Practical recommendation
Your current approach is valid and production-safe, especially if:
- Your agents use headsets (low echo risk)
- Echo from speakers is not a major concern
Suggested configurations:
Option A (most stable)
Option B (maximum predictability)
🔎 Bonus tip
If you ever want to go deeper, check:
👉 chrome://webrtc-internals
You'll be able to see:
- Audio processing states
- AEC metrics
- Gain changes over time
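Similar data can also be sampled from code through the standard getStats() API. The sketch below is only an illustration: it assumes you can reach the call's RTCPeerConnection (the SDK may or may not expose it), and the fields it reads from "media-source" stats can be absent depending on the browser. The helper name summarizeAudioSource is hypothetical.

```javascript
// Pulls microphone-related fields out of an RTCStatsReport-like object.
// `report` only needs a forEach(callback) method, which RTCStatsReport and Map both have.
function summarizeAudioSource(report) {
  const summaries = [];
  report.forEach((stat) => {
    // Only the local audio source entry is of interest here.
    if (stat.type !== 'media-source' || stat.kind !== 'audio') return;
    summaries.push({
      timestamp: stat.timestamp,
      // Fields a browser does not report are returned as null.
      audioLevel: stat.audioLevel ?? null,
      echoReturnLoss: stat.echoReturnLoss ?? null,
      echoReturnLossEnhancement: stat.echoReturnLossEnhancement ?? null,
    });
  });
  return summaries;
}

// Browser usage (assumption: `peerConnection` is the call's RTCPeerConnection):
//   const report = await peerConnection.getStats();
//   console.table(summarizeAudioSource(report));
```

Polling this once per second during the first part of a call would let you line up level changes with the moments where the OP audio drops.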
------------------------------
Cesar Padilla
INDRA COLOMBIA
------------------------------
Original Message:
Sent: 06-16-2026 04:47
From: Masahiro Shioi
Subject: Audio quality degradation in softphones using the WebRTC SDK
Could you please explain the mechanism behind the echo cancellation feature within the advanced microphone settings?
[Background]
> After receiving this comment and modifying the program code as described below,
> "If the external party's environment contained significant background noise, the audio from our operator (OP) side during the initial 10 seconds of the call would intermittently drop in volume or cut out, making it difficult to hear."
> This issue has been resolved.
We found that the issue can be prevented by disabling the softphone's echo cancellation (while keeping automatic gain control and noise suppression enabled).
As we consider making this setting change permanent, we would like to understand the technical rationale, specifically the mechanism by which echo cancellation influences the occurrence of this issue.
[Question 1]
We can reproduce this issue with 100% consistency under the following conditions:
・Echo cancellation is enabled.
・It occurs during the first 10 seconds of the call.
・The customer initiates a hands-free call from a location with background noise (such as a server room or a street with traffic), causing that noise to be picked up by the operator.
Based on this, we infer that the echo cancellation process, while attempting to eliminate the feedback of the customer's own voice, was also interfering with the operator's voice.
Why does this echo cancellation function cause instability in the operator's call quality (such as audio dropouts or muffled sound)?
Could you please explain the mechanism behind this, as well as why disabling echo cancellation resolves the issue?
[Question 2]
This issue is limited to the first 10 seconds of the call; it does not occur after that point, even if background noise remains present on the customer's end.
Why does this issue occur only at the beginning of the call?
Could you please explain this in the context of any initialization sequences, such as calibration, that take place when the call starts?
We would appreciate your response.
------------------------------
Masahiro Shioi
None
------------------------------
Original Message:
Sent: 06-04-2026 11:14
From: Cesar Padilla
Subject: Audio quality degradation in softphones using the WebRTC SDK
@Masahiro Shioi, I'm glad it helped you. Ideally, you should monitor and check how it behaves to see if any further adjustments are needed.
------------------------------
Cesar Padilla
INDRA COLOMBIA
------------------------------
Original Message:
Sent: 06-04-2026 10:19
From: Masahiro Shioi
Subject: Audio quality degradation in softphones using the WebRTC SDK
Thanks, Cesar,
The situation has improved.
> 2. But more importantly → enforce it at getUserMedia level
> If the SDK allows overriding or injecting constraints, make sure you're effectively applying:
> :
> This is key because SDK-level settings alone may not override browser defaults.
After receiving this comment and modifying the program code as described below,
"If the external party's environment contained significant background noise,
the audio from our operator (OP) side during the initial 10 seconds of the call
would intermittently drop in volume or cut out, making it difficult to hear."
This issue has been resolved. Thank you very much for your cooperation.
The specific changes are as follows:
- Before change
const defaults = {
  micAutoGainControl: { exact: true },
  echoCancellation: { exact: false },
  noiseSuppression: { exact: false }
};
:
:
let webrtc_sdk = new window.GenesysCloudWebrtcSdk.GenesysCloudWebrtcSdk({
  environment: environment,
  accessToken: accessToken,
  defaults: defaults,
});
- After change
const audioOptions = {
  micAutoGainControl: true,
  echoCancellation: false,
  noiseSuppression: false
};
:
:
const audioStream = await navigator.mediaDevices.getUserMedia({
  audio: {
    autoGainControl: audioOptions.micAutoGainControl,
    echoCancellation: audioOptions.echoCancellation,
    noiseSuppression: audioOptions.noiseSuppression
  }
});
:
:
let webrtc_sdk = new window.GenesysCloudWebrtcSdk.GenesysCloudWebrtcSdk({
  environment: environment,
  accessToken: accessToken,
  defaults: {
    ...audioOptions,
    audioStream,
  },
});
That was a great help.
------------------------------
Masahiro Shioi
None
------------------------------
Original Message:
Sent: 06-03-2026 06:34
From: Cesar Padilla
Subject: Audio quality degradation in softphones using the WebRTC SDK
Hi Masahiro,
Great findings - that actually helps narrow it down a lot 👍
From what you describe, it looks like echoCancellation and noiseSuppression are not being fully controlled by the SDK defaults, which is a known limitation. In many cases, these constraints are ultimately enforced by the browser's WebRTC engine, not just the SDK config.
To properly disable micAutoGainControl and noiseSuppression, you should:
1. Set them explicitly in defaults
2. But more importantly → enforce it at getUserMedia level
If the SDK allows overriding or injecting constraints, make sure you're effectively applying:
👉 This is key because SDK-level settings alone may not override browser defaults.
Important insight from your test
What you observed strongly suggests:
- The issue is triggered when AGC + Noise Suppression interact
- When only AGC is ON → behavior is stable
- When multiple processors are ON → WebRTC overreacts to background noise (especially at call start)
Recommendation
Try forcing:
- autoGainControl: false
- noiseSuppression: false
- Leave echoCancellation depending on your use case
And validate in chrome://webrtc-internals to confirm the constraints are really applied.
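As a programmatic complement to that check, the requested flags can be compared with what the audio track reports through the standard MediaStreamTrack.getSettings() call. This is a minimal sketch with a hypothetical helper name (findMismatchedFlags); some browsers may not report every flag, in which case the flag shows up as a mismatch and needs a manual look.

```javascript
// Audio-processing flags that getUserMedia constraints can request.
const PROCESSING_FLAGS = ['autoGainControl', 'echoCancellation', 'noiseSuppression'];

// Returns the names of requested flags whose reported value differs from the request.
// Flags that were not requested are ignored.
function findMismatchedFlags(requested, settings) {
  return PROCESSING_FLAGS.filter(
    (flag) => flag in requested && settings[flag] !== requested[flag]
  );
}

// Browser usage (assumption: `audioStream` came from getUserMedia):
//   const [track] = audioStream.getAudioTracks();
//   const mismatched = findMismatchedFlags(
//     { autoGainControl: false, noiseSuppression: false },
//     track.getSettings()
//   );
//   if (mismatched.length > 0) console.warn('Not applied:', mismatched);
```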
You're very close - this looks more like browser-level audio processing behavior than a pure SDK issue.
Hope this helps 👍
------------------------------
Cesar Padilla
INDRA COLOMBIA
------------------------------
Original Message:
Sent: 06-03-2026 01:52
From: Masahiro Shioi
Subject: Audio quality degradation in softphones using the WebRTC SDK
Hi, Cesar,
Thank you for your comment.
I would like to share what I have discovered since my last update.
Currently, I am setting values during the SDK initialization phase as follows:
const defaults = {
  micAutoGainControl: { exact: true },
  echoCancellation: { exact: false },
  noiseSuppression: { exact: false }
};
:
:
async start(accessToken, environment) {
  this.accessToken = accessToken;
  try {
    // Initialize the Genesys Cloud WebRTC SDK
    // https://github.com/MyPureCloud/genesys-cloud-webrtc-sdk/blob/develop/doc/index.md#constructor
    let webrtc_sdk = new window.GenesysCloudWebrtcSdk.GenesysCloudWebrtcSdk({
      environment: environment,
      accessToken: accessToken,
      defaults: defaults,
    });
While changing the `micAutoGainControl` value between `true` and `false` does alter the audio output during a call,
it appears that changing the values for `echoCancellation` and `noiseSuppression` between `true` and `false` does *not* affect the audio output during a call.
Although I previously stated, "this issue does not occur when using the standard Genesys Cloud softphone, which operates directly within a web browser,"
I have since found that the phenomenon *does* occur when `micAutoGainControl`, `echoCancellation`, and `noiseSuppression` are all enabled;
conversely, the phenomenon does *not* occur when only `micAutoGainControl` is enabled.
Therefore, I hypothesized that if I were to disable both `micAutoGainControl` and `noiseSuppression` within the SDK settings,
the phenomenon would cease to occur.
Could you please tell me how I can go about disabling `micAutoGainControl` and `noiseSuppression`?
------------------------------
Masahiro Shioi
None
------------------------------
Original Message:
Sent: 06-02-2026 08:20
From: Cesar Padilla
Subject: Audio quality degradation in softphones using the WebRTC SDK
Hi Masahiro,
This behavior looks like a WebRTC audio processing "warm-up" issue, especially related to AGC/VAD reacting to strong background noise at the start of the call.
Key clues:
- Only happens during the first ~10 seconds
- Recorded audio shows the same issue → it's client-side processing
- Doesn't happen after using the Genesys softphone → likely better initialization
Some things you can try:
- Force explicit audio constraints in getUserMedia (enable/disable AGC/NS/EC explicitly)
- Always create a new media stream per call and fully stop tracks after each call
- Reset any AudioContext if used
- Add a short mic "warm-up" (2–3s) before starting the call
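A rough sketch of the first, second and last suggestions, in case it helps. The helper names (openCallStream, closeCallStream) and the warmUpMs parameter are hypothetical, and how the resulting stream is handed to the SDK depends on your integration. The mediaDevices object is passed in as an argument (in a page it would be navigator.mediaDevices) so the helpers can also be exercised outside a browser.

```javascript
// Acquire a new microphone stream with explicit processing flags.
// `mediaDevices` is expected to behave like navigator.mediaDevices.
async function openCallStream(mediaDevices, flags, warmUpMs = 0) {
  const stream = await mediaDevices.getUserMedia({
    audio: {
      autoGainControl: flags.autoGainControl,
      echoCancellation: flags.echoCancellation,
      noiseSuppression: flags.noiseSuppression,
    },
  });
  if (warmUpMs > 0) {
    // Optional warm-up: let the capture pipeline run briefly before the call starts.
    await new Promise((resolve) => setTimeout(resolve, warmUpMs));
  }
  return stream;
}

// Stop every track so the next call starts from a freshly initialised stream.
function closeCallStream(stream) {
  stream.getTracks().forEach((track) => track.stop());
}

// Browser usage (values are examples only):
//   const stream = await openCallStream(
//     navigator.mediaDevices,
//     { autoGainControl: true, echoCancellation: false, noiseSuppression: false },
//     2000
//   );
//   ... run the call ...
//   closeCallStream(stream);
```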
Most likely cause is audio processing ramp-up or incomplete media reinitialization between calls.
Hope this helps 👍
------------------------------
Cesar Padilla
INDRA COLOMBIA
------------------------------