Genesys Cloud - Developer Community!

  • 1.  Audio quality degradation in softphones using the WebRTC SDK

    Posted 21 days ago

    We have developed a softphone utilizing the `genesys-cloud-webrtc-sdk`.
    When making an external call using this softphone, we encountered a specific issue: if the external party's environment contained significant background noise, the audio from our operator (OP) side during the initial 10 seconds of the call would intermittently drop in volume or cut out, making it difficult to hear.
    After the initial 10-second period had elapsed, the audio from the OP side returned to normal; furthermore, the audio from the external party remained clear and problem-free throughout the entire call.
    Playback of the corresponding Genesys Cloud call recording files reveals the exact same audio anomalies.

    We tested the SDK's audio-related parameters using the following configurations:
    ・Automatic Gain Control: ON, Echo Cancellation: OFF, Noise Suppression: OFF
    ・Automatic Gain Control: OFF, Echo Cancellation: ON, Noise Suppression: OFF

    Notably, this issue does not occur when using the standard Genesys Cloud softphone (which operates directly within a web browser), as opposed to the custom softphone we developed.
    Furthermore, if we make a call using the standard Genesys Cloud softphone immediately before making a call with our custom softphone, the issue does not manifest. However, if we then proceed to make a second consecutive call using our custom softphone, the issue reappears.

    Is anyone familiar with similar occurrences or symptoms? Additionally, does anyone know of a potential solution to this problem?


    #EmbeddableFramework
    #PlatformSDK

    ------------------------------
    Masahiro Shioi
    None
    ------------------------------


  • 2.  RE: Audio quality degradation in softphones using the WebRTC SDK

    Posted 20 days ago

    Hi Masahiro,

    This behavior looks like a WebRTC audio processing "warm-up" issue, especially related to AGC/VAD reacting to strong background noise at the start of the call.

    Key clues:

    • Only happens during the first ~10 seconds
    • Recorded audio shows the same issue → it's client-side processing
    • Doesn't happen after using the Genesys softphone → likely better initialization

    Some things you can try:

    • Force explicit audio constraints in getUserMedia (enable/disable AGC/NS/EC explicitly)
    • Always create a new media stream per call and fully stop tracks after each call
    • Reset any AudioContext if used
    • Add a short mic "warm-up" (2–3s) before starting the call

    Most likely cause is audio processing ramp-up or incomplete media reinitialization between calls.
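
    The first two suggestions above (explicit constraints and a fresh stream per call) can be sketched as follows. This is a minimal sketch, assuming a browser environment. The helper names `buildAudioConstraints`, `acquireCallStream`, and `stopStream` are hypothetical and are not part of the Genesys SDK; only `getUserMedia`, `getTracks`, and `stop` are standard browser APIs.

```javascript
// Build explicit audio constraints so that no processing flag is left
// to the browser's defaults.
function buildAudioConstraints({ autoGainControl, echoCancellation, noiseSuppression }) {
  return {
    audio: {
      autoGainControl: Boolean(autoGainControl),
      echoCancellation: Boolean(echoCancellation),
      noiseSuppression: Boolean(noiseSuppression),
    },
    video: false,
  };
}

// Acquire a fresh stream for one call. `mediaDevices` is injected so the
// function can be exercised outside a browser; in a page you would pass
// navigator.mediaDevices.
async function acquireCallStream(mediaDevices, flags) {
  return mediaDevices.getUserMedia(buildAudioConstraints(flags));
}

// Stop every track so the browser releases the microphone before the
// next call. Returns the number of tracks that were stopped.
function stopStream(stream) {
  if (!stream) return 0;
  const tracks = stream.getTracks();
  tracks.forEach((track) => track.stop());
  return tracks.length;
}
```

    In a page, one would call `acquireCallStream(navigator.mediaDevices, flags)` when a call starts and `stopStream(stream)` when it ends, so that no capture state carries over between calls.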

    Hope this helps 👍



    ------------------------------
    Cesar Padilla
    INDRA COLOMBIA
    ------------------------------



  • 3.  RE: Audio quality degradation in softphones using the WebRTC SDK

    Posted 19 days ago

    Hi, Cesar,
    Thank you for your comment.

    I would like to share what I have discovered since my last update.

    Currently, I am setting values during the SDK initialization phase as follows:

    const defaults = {
        micAutoGainControl: { exact: true },
        echoCancellation: { exact: false },
        noiseSuppression: { exact: false }
    };
        :
        :
    async start(accessToken, environment) {
        this.accessToken = accessToken;
        try {
            // Initialize the Genesys Cloud WebRTC SDK
            // https://github.com/MyPureCloud/genesys-cloud-webrtc-sdk/blob/develop/doc/index.md#constructor
            let webrtc_sdk = new window.GenesysCloudWebrtcSdk.GenesysCloudWebrtcSdk({
                environment: environment,
                accessToken: accessToken,
                defaults: defaults,
            });

    While changing the `micAutoGainControl` value between `true` and `false` does alter the audio output during a call,
    it appears that changing the values for `echoCancellation` and `noiseSuppression` between `true` and `false` does *not* affect the audio output during a call.

    Although I previously stated that this issue does not occur when using the standard Genesys Cloud softphone (which operates directly within a web browser),
    I have since found that the phenomenon *does* occur when `micAutoGainControl`, `echoCancellation`, and `noiseSuppression` are all enabled;
    conversely, the phenomenon does *not* occur when only `micAutoGainControl` is enabled.

    Therefore, I hypothesized that if I were to disable both `micAutoGainControl` and `noiseSuppression` within the SDK settings,
    the phenomenon would cease to occur.

    Could you please tell me how I can go about disabling `micAutoGainControl` and `noiseSuppression`?



    ------------------------------
    Masahiro Shioi
    None
    ------------------------------



  • 4.  RE: Audio quality degradation in softphones using the WebRTC SDK

    Posted 19 days ago
    Edited by Cesar Padilla 19 days ago

    Hi Masahiro,

    Great findings - that actually helps narrow it down a lot 👍

    From what you describe, it looks like echoCancellation and noiseSuppression are not being fully controlled by the SDK defaults, which is a known limitation. In many cases, these constraints are ultimately enforced by the browser's WebRTC engine, not just the SDK config.

    To properly disable micAutoGainControl and noiseSuppression, you should:

    1. Set them explicitly in defaults

    JavaScript
    const defaults = {
        micAutoGainControl: { exact: false },
        echoCancellation: { exact: false },
        noiseSuppression: { exact: false }
    };

    2. But more importantly → enforce it at getUserMedia level

    If the SDK allows overriding or injecting constraints, make sure you're effectively applying:

    JavaScript
    audio: {
        autoGainControl: false,
        noiseSuppression: false,
        echoCancellation: false
    }

    👉 This is key because SDK-level settings alone may not override browser defaults.
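
    One way to keep the SDK-level and browser-level settings in step is to derive both from a single set of flags. The sketch below rests on an assumption that should be checked against the constructor documentation linked earlier in the thread: that the SDK's `defaults` object names all three options with a `mic` prefix (`micAutoGainControl`, `micEchoCancellation`, `micNoiseSuppression`). If that reading is right, unprefixed `echoCancellation` and `noiseSuppression` keys in `defaults` may simply be ignored. The helper names `toSdkDefaults` and `toBrowserConstraints` are hypothetical.

```javascript
// Single source of truth for the three processing flags.
const processing = {
  autoGainControl: false,
  echoCancellation: false,
  noiseSuppression: false,
};

// ASSUMPTION: the mic-prefixed key names below are how the SDK's
// `defaults` option is understood here; verify them in the SDK docs.
function toSdkDefaults(flags) {
  return {
    micAutoGainControl: flags.autoGainControl,
    micEchoCancellation: flags.echoCancellation,
    micNoiseSuppression: flags.noiseSuppression,
  };
}

// Standard MediaTrackConstraints names, as accepted by getUserMedia().
function toBrowserConstraints(flags) {
  return {
    audio: {
      autoGainControl: flags.autoGainControl,
      echoCancellation: flags.echoCancellation,
      noiseSuppression: flags.noiseSuppression,
    },
  };
}

// Browser usage (not executed here):
// const sdk = new window.GenesysCloudWebrtcSdk.GenesysCloudWebrtcSdk({
//   environment,
//   accessToken,
//   defaults: toSdkDefaults(processing),
// });
// const stream = await navigator.mediaDevices.getUserMedia(toBrowserConstraints(processing));
```

    Deriving both objects from `processing` avoids the situation where the two layers silently disagree about which processors are enabled.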


    Important insight from your test

    What you observed strongly suggests:

    • The issue is triggered when AGC + Noise Suppression interact
    • When only AGC is ON → behavior is stable
    • When multiple processors are ON → WebRTC overreacts to background noise (especially at call start)

    Recommendation

    Try forcing:

    • autoGainControl: false
    • noiseSuppression: false
    • Leave echoCancellation depending on your use case

    And validate in chrome://webrtc-internals to confirm the constraints are really applied.
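
    The same check can also be done from code. A minimal sketch, assuming you hold the microphone `MediaStreamTrack` in use for the call: `getSettings()` is a standard method that reports the values the browser actually applied, while `findUnappliedFlags` is a hypothetical helper name.

```javascript
// Compare the flags that were requested with what the track reports.
// Returns the names of flags whose applied value differs from the request.
// A flag the browser does not report at all (undefined) is also listed.
function findUnappliedFlags(track, requested) {
  const applied = track.getSettings();
  return Object.keys(requested).filter((name) => applied[name] !== requested[name]);
}

// Browser usage (not executed here):
// const [track] = stream.getAudioTracks();
// const mismatches = findUnappliedFlags(track, {
//   autoGainControl: false,
//   echoCancellation: false,
//   noiseSuppression: false,
// });
// if (mismatches.length > 0) console.warn("Not applied:", mismatches);
```

    An empty result means the track reports exactly what was requested; anything else points at a constraint that was dropped or overridden somewhere between the application and the browser.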


    You're very close - this looks more like browser-level audio processing behavior than a pure SDK issue.

    Hope this helps 👍



    ------------------------------
    Cesar Padilla
    INDRA COLOMBIA
    ------------------------------



  • 5.  RE: Audio quality degradation in softphones using the WebRTC SDK

    Posted 18 days ago
    Thanks, Cesar,
    The situation has improved.
     
    > 2. But more importantly → enforce it at getUserMedia level
    > If the SDK allows overriding or injecting constraints, make sure you're effectively applying:
    >  :
    > This is key because SDK-level settings alone may not override browser defaults.
     
    After receiving this comment, I modified the program code as described below,
    and the issue I originally reported has been resolved:
    "If the external party's environment contained significant background noise,
    the audio from our operator (OP) side during the initial 10 seconds of the call
    would intermittently drop in volume or cut out, making it difficult to hear."
    Thank you very much for your cooperation.
     
    The specific changes are as follows:
     
    - Before change 
        const defaults = { 
            micAutoGainControl: { exact: true }, 
            echoCancellation: { exact: false }, 
            noiseSuppression: { exact: false } 
        }; 
            : 
            : 
        let webrtc_sdk = new window.GenesysCloudWebrtcSdk.GenesysCloudWebrtcSdk({ 
            environment: environment, 
            accessToken: accessToken, 
            defaults: defaults, 
        });
     
    - After change 
        const audioOptions = { 
            micAutoGainControl: true, 
            echoCancellation: false, 
            noiseSuppression: false 
        }; 
            : 
            : 
        const audioStream = await navigator.mediaDevices.getUserMedia({
            audio: {
                autoGainControl: audioOptions.micAutoGainControl,
                echoCancellation: audioOptions.echoCancellation,
                noiseSuppression: audioOptions.noiseSuppression
            }
        });
            : 
            : 
        let webrtc_sdk = new window.GenesysCloudWebrtcSdk.GenesysCloudWebrtcSdk({ 
            environment: environment, 
            accessToken: accessToken, 
            defaults: { 
                ...audioOptions, 
                audioStream, 
            }, 
        });

    That was a great help.


    ------------------------------
    Masahiro Shioi
    None
    ------------------------------



  • 6.  RE: Audio quality degradation in softphones using the WebRTC SDK

    Posted 18 days ago

    @Masahiro Shioi, I'm glad it helped you. Ideally, you should monitor and check how it behaves to see if any further adjustments are needed.



    ------------------------------
    Cesar Padilla
    INDRA COLOMBIA
    ------------------------------



  • 7.  RE: Audio quality degradation in softphones using the WebRTC SDK

    Posted 6 days ago
    Could you please explain the mechanism behind the echo cancellation feature within the advanced microphone settings?
     
    [Background]
    > After receiving this comment and modifying the program code as described below,
    > "If the external party's environment contained significant background noise, the audio from our operator (OP) side during the initial 10 seconds of the call would intermittently drop in volume or cut out, making it difficult to hear."
    > This issue has been resolved.
     
    We found that the issue can be prevented by disabling the softphone's echo cancellation (while keeping automatic gain control and noise suppression enabled).
    As we consider making this setting change permanent, we would like to understand the technical rationale, specifically the mechanism by which echo cancellation influences the occurrence of this issue.
     
    [Question 1]
    We can reproduce this issue with 100% consistency under the following conditions:
    ・Echo cancellation is enabled.
    ・It occurs during the first 10 seconds of the call.
    ・The customer initiates a hands-free call from a location with background noise (such as a server room or a street with traffic), causing that noise to be picked up by the operator.
    Based on this, we infer that the echo cancellation process, while attempting to eliminate the feedback of the customer's own voice, was also interfering with the operator's voice.
    Why does this echo cancellation function cause instability in the operator's call quality (such as audio dropouts or muffled sound)?
    Could you please explain the mechanism behind this, as well as why disabling echo cancellation resolves the issue?
     
    [Question 2]
    This issue is limited to the first 10 seconds of the call; it does not occur after that point, even if background noise remains present on the customer's end.
    Why does this issue occur only at the beginning of the call?
    Could you please explain this in the context of any initialization sequences, such as calibration, that take place when the call starts?
     
    We would appreciate your response.


    ------------------------------
    Masahiro Shioi
    None
    ------------------------------



  • 8.  RE: Audio quality degradation in softphones using the WebRTC SDK

    Posted 6 days ago
    Edited by Cesar Padilla 6 days ago

    Hi @Masahiro Shioi,

    Great questions - you're now digging into the exact layer where this behavior originates 👍. Let me break it down clearly.


    Q1 – Why Echo Cancellation can degrade OP audio in noisy scenarios

    What you're observing is a known side-effect of how WebRTC Acoustic Echo Cancellation (AEC) works internally.

    🔍 How AEC works (simplified)

    AEC is designed to remove the remote audio that leaks back into the microphone (echo). To do this, it:

    1. Takes the far-end audio signal (customer audio) as a reference
    2. Builds an adaptive filter model of the acoustic environment
    3. Subtracts an estimated version of that signal from the microphone input
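
    The three steps above can be illustrated with a textbook normalized LMS (NLMS) filter. This is a toy sketch only: browser echo cancellers are considerably more elaborate, and the echo path, step size, and noise generator here are invented for the illustration. The simulated microphone contains only echo, with no operator speech and no extra noise.

```javascript
// Hypothetical echo path (the "room") that the filter has to learn.
const ECHO_PATH = [0.6, -0.3, 0.2, 0.1, -0.05, 0.03, 0.02, -0.01];

// Deterministic pseudo-random source so that runs are repeatable.
function makeNoise(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return (state / 4294967296) * 2 - 1; // uniform in [-1, 1)
  };
}

function runNlms(samples, mu) {
  const taps = ECHO_PATH.length;
  const rand = makeNoise(1);
  const weights = new Array(taps).fill(0); // the filter starts zeroed
  const history = new Array(taps).fill(0); // recent far-end samples, newest first
  const residual = [];
  for (let n = 0; n < samples; n++) {
    // Step 1: take the far-end signal as the reference.
    history.unshift(rand());
    history.pop();
    let mic = 0;
    let estimate = 0;
    let power = 1e-6; // small constant avoids division by zero
    for (let k = 0; k < taps; k++) {
      mic += ECHO_PATH[k] * history[k]; // echo reaching the microphone
      estimate += weights[k] * history[k]; // the filter's echo estimate
      power += history[k] * history[k];
    }
    // Step 3: subtract the estimated echo from the microphone input.
    const error = mic - estimate;
    // Step 2: adapt the model of the acoustic path (NLMS update).
    for (let k = 0; k < taps; k++) {
      weights[k] += (mu * error * history[k]) / power;
    }
    residual.push(error);
  }
  return { weights, residual };
}

// Sum of squares, used to compare residual echo early and late in a run.
function energy(values) {
  return values.reduce((sum, v) => sum + v * v, 0);
}
```

    Comparing `energy(residual.slice(0, 100))` with `energy(residual.slice(-100))` after `runNlms(2000, 0.5)` shows the residual shrinking as the weights approach `ECHO_PATH`. In this idealized setup the filter simply converges; the sketch does not reproduce the misbehaviour discussed in the thread, it only shows that the subtraction is inaccurate until the filter has adapted.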

    ⚠️ Why it fails in your scenario

    In your case:

    • The customer is in a high-noise environment
    • That noise is coming through the far-end audio stream
    • The operator mic is also active and sending voice

    👉 The AEC engine tries to determine:

    "What part of the mic input is echo vs real speech?"

    During this process:

    • Background noise correlates weakly and inconsistently with the reference signal
    • The adaptive filter becomes unstable or over-aggressive
    • It may mistakenly suppress parts of the OP's voice

    This causes exactly what you saw:

    • Volume drops
    • Intermittent muting
    • "Pumping" or muffled audio

    🎯 Why disabling AEC fixes it

    When you disable echo cancellation:

    • No subtraction model is applied
    • The mic signal is passed through without modification by AEC

    So:

    ✅ No misclassification
    ✅ No aggressive suppression
    ✅ Stable OP audio


    💡 Important nuance

    This issue is more likely when:

    • AEC + Noise Suppression + AGC run together
    • AND the far-end signal is noisy and non-stationary

    That combination increases the chances of cross-interference between processing modules.


    Q2 – Why only during the first ~10 seconds

    This is the most interesting part - and it strongly confirms it's an adaptive filter convergence issue.

    🔍 What happens at call start

    When a call begins, AEC:

    1. Starts with a zeroed or generic filter
    2. Begins a training/convergence phase
    3. Continuously adjusts coefficients based on:
      • Room acoustics
      • Delay
      • Signal correlation

    ⏱️ First ~10 seconds = stabilization period

    During this phase:

    • The filter is not yet accurate
    • Signal classification is uncertain
    • The system may:
      • Over-subtract
      • Under-subtract
      • Misinterpret noise as echo

    👉 This creates temporal instability in the processed audio.


    After convergence

    Once the filter has converged (in this case, after roughly the first 10 seconds):

    • The filter adapts correctly to:
      • Delay
      • Acoustic path
      • Signal characteristics

    Result:

    ✅ Stable echo estimation
    ✅ Proper separation of speech vs echo
    ✅ No more OP audio degradation


    🧠 Why your workaround works so well

    By controlling getUserMedia and effectively:

    • Disabling AEC in the browser layer
    • Keeping AGC (optionally)
    • Limiting processing complexity

    You are:

    ✅ Avoiding unstable adaptive filtering
    ✅ Reducing interactions between DSP modules
    ✅ Getting deterministic audio behavior


    🚀 Practical recommendation

    Your current approach is valid and production-safe, especially if:

    • Your agents use headsets (low echo risk)
    • Echo from speakers is not a major concern

    Suggested configurations:

    Option A (most stable)

    JavaScript
    autoGainControl: true,
    noiseSuppression: true,
    echoCancellation: false

    Option B (maximum predictability)

    JavaScript
    autoGainControl: false,
    noiseSuppression: false,
    echoCancellation: false

    🔎 Bonus tip

    If you ever want to go deeper, check:

    👉 chrome://webrtc-internals

    You'll be able to see:

    • Audio processing states
    • AEC metrics
    • Gain changes over time
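
    The outgoing level can also be tracked from code, which makes it easier to compare the first 10 seconds of a call with the rest. A minimal sketch, assuming you can reach the call's `RTCPeerConnection` (how the SDK exposes it is not covered here) and that the browser reports `totalAudioEnergy` and `totalSamplesDuration` on its audio `media-source` stats. The helper names are hypothetical, and whether these figures reflect the audio after AEC/NS/AGC is browser-dependent and should be verified.

```javascript
// Find the audio "media-source" entry in an RTCStatsReport (a Map-like object).
function findAudioSource(report) {
  for (const stat of report.values()) {
    if (stat.type === "media-source" && stat.kind === "audio") return stat;
  }
  return null;
}

// Average audio level between two snapshots, using the formula from the
// WebRTC statistics specification:
//   sqrt(delta totalAudioEnergy / delta totalSamplesDuration)
function averageLevel(previous, current) {
  const duration = current.totalSamplesDuration - previous.totalSamplesDuration;
  if (duration <= 0) return 0;
  const energyDelta = current.totalAudioEnergy - previous.totalAudioEnergy;
  return Math.sqrt(Math.max(0, energyDelta) / duration);
}

// Browser usage (not executed here): sample once per second and log the level.
// let previous = null;
// setInterval(async () => {
//   const current = findAudioSource(await peerConnection.getStats());
//   if (previous && current) console.log(averageLevel(previous, current).toFixed(3));
//   previous = current;
// }, 1000);
```

    A level that repeatedly dips during the first seconds of a call and then stabilizes would match the symptom described in this thread.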



    ------------------------------
    Cesar Padilla
    INDRA COLOMBIA
    ------------------------------



  • 9.  RE: Audio quality degradation in softphones using the WebRTC SDK

    Posted 4 days ago

    Hi, Cesar,
    Thank you for the detailed explanation.
    The points that were unclear to me have now been clarified.

    Could you please clarify one more point? I assume your previous answer was based on the premise that background noise was present from the very start of the call.
    Would this issue still occur if the call began in a quiet environment without background noise, but background noise was introduced later?
    Also, background noise can be either continuous or consist of sudden, momentary sounds; if the issue occurs during the call, is there any difference in how it manifests depending on whether the noise is continuous or momentary?



    ------------------------------
    Masahiro Shioi
    None
    ------------------------------