Thanks, Kevin. But don't worry, if you haven't already done it. There was one already there on this subject.
That being said, it isn't the installation of Wireshark that's the problem - it's identifying the right machines to install it on and then actually capturing the packets if / when the issue reoccurs. We get that a pcap is possibly going to tell us why the calls drop, but getting the right pcap is like herding cats 😲
Original Message:
Sent: 08-23-2023 12:36
From: Kevin Brown
Subject: WebRTC Calls dropping
Paul,
I will add a document to the portal that describes, step by step, how to run Wireshark traces behind the scenes with no GUI or interaction with the end user. It's the dumpcap function of Wireshark. I've run this many times over the years to capture Wireshark logs without agents even knowing that it is running. You just need access to their hard drive (I map their drive over the network, pull the logs and drop the mapping).
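For anyone following along, a headless dumpcap capture of the kind described above might look roughly like the sketch below. It only builds the command line and (optionally) launches it. The interface index, output folder, file sizes and the "udp" capture filter are all assumptions for illustration, not values from the portal document.

```python
import subprocess
from pathlib import Path


def build_dumpcap_command(interface, out_dir, file_kb=51200, max_files=20,
                          capture_filter="udp"):
    """Return the argument list for a headless dumpcap ring-buffer capture.

    All defaults are illustrative assumptions; adjust them to the site.
    """
    out_file = Path(out_dir) / "webrtc.pcapng"
    return [
        "dumpcap",
        "-i", str(interface),          # interface name or index (list with `dumpcap -D`)
        "-f", capture_filter,          # libpcap capture filter (assumed: UDP only)
        "-b", f"filesize:{file_kb}",   # roll over to a new file after this many kB
        "-b", f"files:{max_files}",    # ring buffer: keep only the newest N files
        "-q",                          # do not print the running packet count
        "-w", str(out_file),           # base name for the capture files
    ]


def start_capture(cmd):
    """Launch dumpcap in the background with no console output."""
    return subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(" ".join(build_dumpcap_command("1", "C:/captures")))
```

The ring buffer is the useful part for a problem that shows up once every few weeks: the capture can be left running without filling the disk, and the files are pulled only after a bad day.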
------------------------------
Kevin Brown
Miratech, Inc
Original Message:
Sent: 08-23-2023 10:19
From: Paul Simpson
Subject: WebRTC Calls dropping
I just wanted to thank everyone for their responses so far.
Whilst it pains me to hear that other people are having issues (I wouldn't wish it on anyone!) it's also somewhat comforting to know that we are not alone.
Just to emphasize a couple of things, for anyone following this thread....
- The customer uses IP Whitelisting, so direct connection to GC from the user's home PC is specifically blocked.
- The customer's VPN (required due to the above point) has Split Tunneling in place. My gut is telling me this is related and I suspect it's going to be about what traffic goes where. If you currently don't use Split Tunneling, think carefully before putting it in, is my advice!
- The sporadic nature of the issue is one of the most perplexing aspects. The problem has occurred three times so far that we are aware of. On each occasion, it lasted for one business day and then apparently cleared up on its own. If you are interested, the dates were July 11th, August 8th and August 18th. Do those dates ring any bells for anyone? Anyone aware of other issues occurring on the interwebs on those days?
- We cannot determine anything that the affected users have in common, that is not shared by the unaffected ones. We do know that on-site users are not affected, so it has to be something to do with VPN / Home internet / Split Tunneling. It also would seem from reports that affected users share the same ISP, although there are other (unaffected) users using the same one.
When (if) we figure this one out, I will obviously update this thread. In the meantime, if anyone has any bright ideas....
------------------------------
Paul Simpson
Eventus Solutions Group
Original Message:
Sent: 08-21-2023 17:26
From: Paul Simpson
Subject: WebRTC Calls dropping
Thanks, Robert.
The first problem we have is that it's random. It's not even "intermittent" as such. It lasts for a day and then goes away. (The customer normally sees fewer than 3% of calls marked as "dropped"; on the days in question, this rises to 10-15%.) So any tests / attempted fixes are difficult to verify! The fact that this most recent occurrence coincided with Genesys' "August Mental Health Fridays" really didn't help either....
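To make "a bad day" something you can check mechanically, the daily drop rate can be compared against a threshold. This is only a sketch: the 10% cut-off is taken from the figures above, and the day labels and counts in the usage note are made-up placeholders, not the customer's data.

```python
def drop_rate(dropped, total):
    """Fraction of calls marked as dropped; 0.0 when there were no calls."""
    return dropped / total if total else 0.0


def flag_bad_days(daily_counts, threshold=0.10):
    """Return the days whose drop rate is at or above the threshold.

    daily_counts maps a day label to a (dropped_calls, total_calls) tuple.
    """
    return sorted(
        day for day, (dropped, total) in daily_counts.items()
        if drop_rate(dropped, total) >= threshold
    )
```

For example, with placeholder counts `{"day-1": (3, 100), "day-2": (12, 100)}`, only "day-2" is flagged.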
We have supplied the logs you mention to Genesys and it was they who requested Wireshark captures. I think they are hoping to have verifiable proof of the path the packets are taking. Of course, they requested captures from various places along the way, but given the RTP is going over the public internet, that's not happening any time soon! The only thing they can tell us from the logs is that they get an "ICE Disconnect Error", but they are unable to say if that's due to RTP being interrupted, or an issue with the signaling. Beyond "it's the network", they haven't been able to shed any light on the matter.
Unfortunately, we are unable to bypass the VPN for Gen Cloud entirely. They have IP whitelisting in place, which means the client will only run on their site, or via the VPN. It's the WebRTC where it starts to get more murky. From what I am told (yeah, I have to rely on information from their team on the ground!) the STUN traffic goes out both directly and via the VPN. Now, I'm thinking that when it goes over the VPN, that would cause problems, but even if it is related, it doesn't explain the 1-day peaks, followed by normal operations for a couple of weeks. I suggested blocking STUN from the VPN (just to be sure) but they are not keen to do that without some evidence that it will help.
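One way to reason about "what traffic goes where" is to take the destination addresses seen in a capture and check them against the split-tunnel exclusion list. The sketch below assumes the exclusions are expressed as CIDR ranges. The ranges shown in the usage note are documentation addresses, not Genesys or customer ranges, and the real decision is made by the VPN client's route table.

```python
import ipaddress


def route_for(dest_ip, direct_networks):
    """Return "direct" if dest_ip is inside any split-tunnel exclusion, else "vpn".

    direct_networks is a list of CIDR strings that bypass the VPN
    (a hypothetical representation of the split-tunnel policy).
    """
    addr = ipaddress.ip_address(dest_ip)
    for cidr in direct_networks:
        if addr in ipaddress.ip_network(cidr):
            return "direct"
    return "vpn"


def group_by_route(dest_ips, direct_networks):
    """Group destination addresses by the path they would be expected to take."""
    groups = {"direct": [], "vpn": []}
    for ip in dest_ips:
        groups[route_for(ip, direct_networks)].append(ip)
    return groups
```

For instance, `group_by_route(["203.0.113.10", "198.51.100.7"], ["203.0.113.0/24"])` puts the first address under "direct" and the second under "vpn". If STUN destinations show up in both groups, that would match what the team on the ground is reporting.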
I agree about load balancers etc., however this does not affect office-based users at all, so the issue has to be something to do with either the users' home networks / ISP or the VPN. One thing we have noted is that it would seem all affected users share the same ISP. Of course, they deny any responsibility for this!
I haven't mentioned Audio Profiles to them, but I will (can't hurt, right?) but again, that doesn't account for the sporadic nature of it all. The only thing I can think of is some localized "event" that is delaying traffic. Maybe a heavy internet user who is on the same ISP? I have asked if they were pushing out Windoze updates, or performing backups, or anything else at the time, but apparently not. No unusual or different applications being run.
We have requested involvement from Genesys' PS to perform a network review. I'm hoping that if I can get some folks who thoroughly understand the networking from Genesys' side to talk to the customer's network team, we might make some progress! (For example, we know we have IP address whitelisting in place, but it isn't clear exactly which services / ports are affected by that.) My gut is telling me the Split Tunnel isn't helping, but the only alternative is to push everything over the VPN, which will overload their firewalls and probably introduce an unacceptable level of latency.
Oh, and to cap it all, Micro$oft Teams' WebRTC phones work without any issues! One thing I've learned from this is that every vendor's implementation of WebRTC is slightly different - so much for standards!!!
It's a head-scratcher, that's for sure! 😲
------------------------------
Paul Simpson
Eventus Solutions Group
Original Message:
Sent: 08-20-2023 17:51
From: Robert Wakefield-Carl
Subject: WebRTC Calls dropping
Some of this sounds like an issue with the local device, but if it is intermittent, it could be other applications conflicting. The best way to see this would be the console and network logs on the local machine. Also, make sure they are all using audio profiles in Genesys to define the device. The other thing it sounds like is a web proxy or load balancer like an F5 causing redirects in the SIP traffic. SIP ALGs can also cause this. I don't think the Wireshark capture will show too much outside of what the console log or WebRTC internals would show. The HTML5 issue is suspiciously like a problem with a button press not getting through to Genesys Cloud. One thing you could try is to allow direct signaling and HTTPS to split out from the VPN to mypurecloud.com or usw2.pure.cloud (or your home region) and see if the problem goes away. If it does, it is the router, firewall or VPN fabric.
------------------------------
Robert Wakefield-Carl
ttec Digital
Sr. Director - Innovation Architects
Robert.WC@ttecdigital.com
https://www.ttecDigital.com
https://RobertWC.Blogspot.com
Original Message:
Sent: 08-18-2023 14:16
From: Paul Simpson
Subject: WebRTC Calls dropping
Hi!
I have a somewhat perplexing issue with one of my customers and I'm hoping someone may be able to shed some light on it / suggest something we haven't tried.
On three separate occasions, they have experienced a sharp rise in WebRTC calls just randomly disconnecting. On the first two occasions (beginning of July and beginning of August) this lasted for 1 day then went away. We are in the middle of the third occurrence right now. We have a ticket open with Genesys, but of course today is Mental Health Friday and they are essentially closed. I have been on hold for over 30 minutes trying to call them!
So, specifics (not sure which of the following is relevant, but here you go!)
- All affected users are remote.
- Customer has a very strict security policy, so connection to Genesys is from Whitelisted IPs only, therefore users connect over a VPN. They use Split Tunneling to divert the WebRTC audio stream directly to AWS. Signaling goes over the VPN though.
- Until about half an hour ago, they were blocking access to the Google STUN/TURN service, relying on Genesys' own only (which Care said should not be an issue). They do, however, see a lot of error 701s for both services.
- We are seeing a lot of warnings "HTML5 Audio pool exhausted, returning potentially locked audio object".
- We have seen some errors from Phone Integration service "Request was rejected because user is not permitted to perform this operation"
- It's intermittent.
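If the 701 errors mean the STUN/TURN servers could not be reached (an assumption on my part), reachability can be probed from an affected PC without a browser. The sketch below sends a plain STUN Binding Request over UDP and checks for a matching success response. The message layout follows the STUN RFC; the default port 3478 is the standard STUN port, and the host name in the usage note is a placeholder, not a real Genesys or Google server.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined by the STUN specification
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def build_binding_request(transaction_id=None):
    """Build a 20-byte STUN Binding Request with no attributes."""
    tid = os.urandom(12) if transaction_id is None else transaction_id
    if len(tid) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    # Header: message type, message length (0, no attributes), magic cookie.
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def is_binding_success(data, transaction_id):
    """True if data is a Binding Success Response for our transaction."""
    if len(data) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", data[:8])
    return (msg_type == BINDING_SUCCESS
            and cookie == MAGIC_COOKIE
            and data[8:20] == transaction_id)


def probe(host, port=3478, timeout=2.0):
    """Send one Binding Request; return True only on a valid success reply."""
    tid = os.urandom(12)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_binding_request(tid), (host, port))
            data, _addr = sock.recvfrom(2048)
        except OSError:          # timeout, DNS failure, unreachable network
            return False
    return is_binding_success(data, tid)
```

Calling `probe("stun.example.invalid")` (placeholder host) with the VPN up and again with it down would give a rough indication of which path is dropping the STUN traffic. It is UDP, so a single lost packet will also show up as a failure; run it a few times before drawing conclusions.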
I am awaiting them installing Wireshark on remote PCs to get a packet capture. Anyone have any ideas?
#Telephony
------------------------------
Paul Simpson
Eventus Solutions Group
------------------------------