|
|
Hi, I've had PlusNet 500Mb FTTP for a few months. When the full 500Mb is being used the connection suffers 5-6% packet loss.
I've tried multiple different routers: the PlusNet one and an OpenWrt one. I've even tried a dial-up PPPoE connection from a PC and all suffer the same. MJ Quinn did the install and struggled to splice the connection at the CSP. Would a poor splice cause these sort of issues?
At light usage the connection is fine and speeds are fine. But anything time-critical suffers, such as VoIP and gaming. I'd expect latency issues when downloading etc., but over 5% packet loss doesn't seem right.
A few pics of the issue showing packet loss when downloading at 500mb.
https://www.thinkbroadband.com/broadband/monitoring/...
https://www.thinkbroadband.com/broadband/monitoring/...
I've had an altnet provider at my old house and never saw these issues.
Any ideas?
|
|
|
Would a poor splice cause these sort of issues?
I seriously doubt it.
54-46 was my number
|
|
|
Try this packet loss site https://packetlosstest.com/
Make sure to select the server in the UK
|
|
|
|
|
So I started an update on Battle.net (Call of Duty) on one of the PCs. It's hard wired direct to the router. Once the update starts downloading at 60-64MB/s, any device starts getting packet loss to the outside. Pinging the router itself shows no issues and no drops.
Just did a Packetlosstest.com test and it shows 6.9% packet loss over a 60 second test.
Think broadband monitor also shows this loss.
BTW all tests are done hard wired.
Edited by LIGHTFAST (Wed 23-Oct-24 13:32:04)
|
|
|
Who was the old altnet? Curious if it was PPPoE or IPoE (DHCP).
BT Wholesale will apply a rate limit as they don't like buffer bloat on their network. If that gets hit, then you have indiscriminate dropped packets.
|
|
|
|
It sounds like you are getting packet loss on other devices when you are fully utilising the bandwidth on the game download. If that is the case, it is exactly what you would expect. You can't get more data through the connection when it's already saturated, so loss of data will happen.
|
|
|
Just did a Packetlosstest.com test and it shows 6.9% packet loss over a 60 second test.
Did you do the Packetlosstest.com test while nothing else was using your broadband connection?
|
|
|
|
The old altnet was brsk (WAN was DHCP) and was very good.
At idle I get no packet loss. I would expect latency, not packet loss, on the connection when downloading though?
If anyone else can try starting an update on Battle.net or Epic Games, then ping 8.8.8.8 and see if you get issues. I had a temp 30Mb SOGEA line for a while and even that managed better, without packet loss, just latency.
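For anyone repeating the ping test during a download, a minimal Python sketch for turning the ping summary line into a loss percentage is below. The function name and the sample summary strings are illustrative assumptions, written to match the usual Linux/macOS and Windows `ping` summary formats rather than captured from this line.

```python
import re


def parse_ping_loss(summary: str) -> float:
    """Return the packet loss percentage from a ping summary line."""
    # Linux/macOS style: "100 packets transmitted, 94 received, ..."
    match = re.search(
        r"(\d+) packets transmitted, (\d+) (?:packets )?received", summary
    )
    if match is None:
        # Windows style: "Packets: Sent = 100, Received = 94, Lost = 6 (6% loss)"
        match = re.search(r"Sent = (\d+), Received = (\d+)", summary)
    if match is None:
        raise ValueError("unrecognised ping summary")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were sent")
    return 100.0 * (sent - received) / sent


if __name__ == "__main__":
    # Illustrative summary text, not a real capture.
    sample = "100 packets transmitted, 94 received, 6% packet loss, time 99123ms"
    print(f"{parse_ping_loss(sample):.1f}% loss")
```

Running one ping at idle and one during the download, then comparing the two figures, separates loss caused by load from loss that is always there.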
|
|
|
Possibly simply that the router is not a very high spec. Under heavy download, routers that are pinged remotely can just drop the ping rather than respond to it.
You can often see this when doing tracerts: the entries that have an asterisk instead of a response time but do have valid time entries on the same line, meaning the router concerned had more important things to do.
We know that the organized workers of the country are our friends. As for the rest, they don’t matter a tinker’s cuss - Manny Shinwell
Connections: Pixel 9 on Three 4+ (LTE)/5G, Pixel 6a on EE in reserve. At home Three Mobile, with (Three)ZTE MC888 router giving 5G on a good day.
Edited by pluralist (Wed 23-Oct-24 15:04:39)
|
|
|
The buffer bloat sounds possible after googling it.
It's like anything above a certain latency gets dropped. The cap seems to be very low.
https://www.thinkbroadband.com/broadband/monitoring/...
My usual router is a Belkin RT1800 running OpenWrt. I've also tried a decent gaming PC as a dial-up PPPoE connection with the same result. I'm currently on the Plusnet router again; they want to send one of their engineers to check it out (not Openreach).
It would be great if someone could test epic or battle.net downloading at full speed to see if they have similar results.
|
|
|
This is actually caused by the opposite of buffer bloat: under-buffering. In essence, the buffer on the provider side is too small to absorb the bursts of packets the CDNs are sending. This causes every round of packets they send to overflow the buffer. The LNS server on the ISP side can't enter and exit the dropping state quickly enough to pass the full line rate. The 'fix' on the ISP side is to increase the size of the buffer; however, the problem with this is that it increases hardware requirements on their side.
This behaviour is usually accompanied by lower than expected throughput on the line while flows from a bad source are happening. I've worked in depth with BT Business on this and I believe that buffers on at least certain BT Business products have been adjusted, which has helped in my cases. I encourage you to reach out to your provider to discuss this, as you are the first person I've run into 'in the wild' and 'outside of my customer base' that is impacted.
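The under-buffering effect can be sketched with a toy tail-drop queue in Python. This is a simplified illustration and not a model of any real LNS: time is split into ticks, the queue drains a fixed number of packets per tick, and anything arriving when the queue is full is dropped. The queue limits, drain rate and arrival patterns are all made-up numbers, and there is no TCP retransmission or back-off.

```python
def simulate_tail_drop(arrivals, queue_limit, drain_per_tick):
    """Run a FIFO tail-drop queue over per-tick arrival counts.

    Returns (delivered, dropped) packet totals.
    """
    queue = 0
    delivered = 0
    dropped = 0
    for burst in arrivals:
        # Accept what fits; the rest of the burst is tail-dropped.
        accepted = min(burst, queue_limit - queue)
        dropped += burst - accepted
        queue += accepted
        # Drain at the line rate for this tick.
        sent = min(queue, drain_per_tick)
        queue -= sent
        delivered += sent
    return delivered, dropped


if __name__ == "__main__":
    # Both patterns offer 1000 packets over 100 ticks (same average rate).
    smooth = [10] * 100
    bursty = [100 if tick % 10 == 0 else 0 for tick in range(100)]

    print(simulate_tail_drop(smooth, queue_limit=50, drain_per_tick=10))   # (1000, 0)
    print(simulate_tail_drop(bursty, queue_limit=50, drain_per_tick=10))   # (500, 500)
    print(simulate_tail_drop(bursty, queue_limit=100, drain_per_tick=10))  # (1000, 0)
```

With the small queue, the bursty source loses half its packets and the link sits idle between bursts, so throughput ends up below the line rate even though the average offered load matches it. Doubling the queue limit absorbs the same bursts without loss.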
Edited by agoodm (Wed 23-Oct-24 15:41:10)
|
|
|
Presume BT still heavily using Cisco gear like I know they were years ago?
Only asking as I had to spend a day or two recently tweaking the buffer settings on a spanking C9300X which, although it has hardware specs that make you blush, out of the box is like a blank canvas and the default buffer sizes are terribly undersized. Performance was woeful at full tilt until I upped the port buffer limits. A well known Cisco issue.
Possibly of tangential help to the OP.
Edited by Pheasant (Wed 23-Oct-24 15:45:06)
|
|
|
|
|
|
|
Accidentally deleted my other reply, but in essence I don't think it's exactly a secret that BT use Nokia SRS devices heavily throughout their network, including in router and LNS roles. By default the queue length on the LNS is/was 1000 packets, which on the gig service = 5ms. Sadly, common software applications create 10-15 flows from 10-15 distinct CDN nodes, which all appear to dump 64KB onto the network at likely 100Gbps per round. When this arrives at the LNS the buffer overflows and the queue enters the dropping state. Crucially, with the buffer being so small, the LNS simply can't enter and exit the dropping state quickly enough, which results in lower than expected throughput. The 'fix' is for the provider to increase the size of the buffer, and indeed this is/was being trialled, at least for BT Business customers. Unfortunately it's a million dollar question how large the buffer should be. Too large hurts performance under load; too small risks under-buffering. The size of the buffer also dictates hardware sizing on the LNS side, which is a big deal in the race-to-the-bottom market they operate in... Ultimately, in my opinion, pressure should be applied to the CDNs to stop acting like a DoS attack. I'm running into situations where one user hitting one of their services can be impossible to QoS because even dropping 10+Mbps does not cause them to back off...
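The queue and burst figures above can be put side by side with a little arithmetic. The Python sketch below is only a rough illustration: the packet sizes are assumptions (the time a 1000-packet queue represents depends on the packet size assumed, roughly 12ms for full 1500-byte packets at 1Gbps and about 5ms for an average nearer 625 bytes), and the helper names are made up.

```python
def queue_drain_ms(packets: int, avg_packet_bytes: int, rate_bps: int) -> float:
    """Time in milliseconds to drain a full queue at the given line rate."""
    return packets * avg_packet_bytes * 8 * 1000 / rate_bps


def burst_packets(flows: int, burst_bytes: int, payload_bytes: int = 1460) -> int:
    """Packets arriving if every flow sends one burst at the same moment."""
    per_flow = -(-burst_bytes // payload_bytes)  # ceiling division
    return flows * per_flow


if __name__ == "__main__":
    gig = 1_000_000_000
    print(queue_drain_ms(1000, 1500, gig))  # 12.0 ms with full-size packets
    print(queue_drain_ms(1000, 625, gig))   # 5.0 ms with 625-byte packets
    # 15 flows each dumping 64KB at once, assuming 1460-byte TCP payloads.
    print(burst_packets(15, 64 * 1024))     # 675 packets
```

On these assumptions a single synchronised round from 15 flows fills about two thirds of a 1000-packet queue by itself, so it only takes a partly occupied queue, or a second round arriving before the first has drained, for the tail of the burst to be dropped.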
|
|
|
Thanks for confirming. PPPoE is a more demanding protocol than IPoE, so you have two possible causes now: either BTw anti-buffer-bloat measures causing it, or the router not being able to handle PPPoE at those sorts of speeds.
|
|
|
Did you / do you see clients with packet loss issues using Openreach-based FTTP networks though?
I've been on again, off again using an Openreach FTTP service at my place in Suffolk. I've variously used Cerberus and TalkTalk Biz, and since the middle of this year I'm back using it again, but this time with EE as the ISP on the 1.6Gbps tier.
I've not really seen any packet loss. I'm back there tomorrow and could probably run some deeper tests, but it's not something I've ever noticed, even running at full tilt using a PPPoE connection.
I'm inclined to think the issue is not so much with the provider, but potentially with the OP's router.
|
|
|
Thanks for confirming. PPPoE is a more demanding protocol than IPoE, so you have two possible causes now: either BTw anti-buffer-bloat measures causing it, or the router not being able to handle PPPoE at those sorts of speeds.
I'm also inclined to think it's the latter.
Even Windows-based laptops, I've found, can struggle running a PPPoE client, especially if they are a few years old, hence aren't necessarily an indicator of rude network health.
The OP is better off testing using either a Mac or Linux machine direct into the ONT.
|
|
|
I've got personal experience with BT Business FTTP. I've also tested ICUK G.Fast with similar results. They use the same or similar buffer sizes on FTTC; however, with the lower speeds on FTTC the buffer sizing is adequate. I have still run into difficulties providing QoS though, as the CDNs are so aggressive and do not necessarily respond to packet loss.
The acid test I've been using is installing Fortnite in the Epic Games Launcher. I would see below line rate in the download but would not be able to use the remainder of the connection. E.g. the Fortnite download might run at 650Mbps, but starting another download alongside it would not result in throughput increasing to the line rate. BQM would show a graph like the OP's, with base latency going up 5ms and significant packet loss, upwards of 10%. In order to get the problem to go away I would need to QoS this download to around 450Mbps. I was trialling a longer buffer size at one point; I don't know if this has been rolled out across BT Business.
In my case the 'router' in question is an HP ProLiant server which is comfortably capable of 10Gbps of PPPoE without breaking a sweat. Indeed, even a BT Business Hub 5 can do 1Gbps of PPPoE assuming it's otherwise fairly idle.
|
|
|
|
Thanks for the info, very interesting to hear. I do work for a business telecom provider and have connected lots of Gamma and Zen connections, but I haven't come across this issue on FTTP. I've just swapped the router out for a DrayTek 2766. Had the same result with packet loss. If I bandwidth limit the connection to 440Mb the packet loss stops. I do see the occasional single 1000ms+ response in the mix, which is odd, but no packet loss.
Can't imagine the Plusnet guy coming will have any idea on this one. Unless they can escalate it.
|
|
|
Can't imagine the Plusnet guy coming will have any idea on this one. Unless they can escalate
Can’t imagine that the guy will work for Plusnet.
54-46 was my number
|
|
|
Just out of interest, on the PC side, what speeds do you see in Task Manager when the limit is enforced? Is it more than 440, or does it not respect it?
I have noticed the issue you raise when a bandwidth limit is applied, as some UDP game updates don't adhere to it continuously.
For me it's not a big issue as I limit each device and VLAN well under the gig symmetric to stop one device or VLAN using it all.
Many Thanks,
RR-THE-IT-GUY
YouFibre 1Gbps symmetric
Talktalk 2014-2018 ADSL → Virgin Media Vivid 50 13/10/2018-2019 → Virgin Media M100 2020-05/2022 → Virgin Media M500 2022-05/10/2023 → IDNET 110x20 (FTTP) 20/11/2023 → YouFibre 1Gbps Symmetric with Static IP 2023-Current
|
|
|
The 'engineer' visit is probably from a useless outfit called Circet who Plusnet use to sort out complicated issues like not connecting the router up correctly so they will be a complete waste of time. Might be worth telling them not to bother.
|
|
|
Not plusnet so can't really help in this regard, but I did do a bit of tuning on my own OPNsense firewall when we switched to Youfibre 1000/1000 earlier this year in order to reduce the impact of multiple connections on the overall connection latency.
In effect, I apply some shaping rules to cap the overall bandwidth slightly below the line speed to ensure that there is typically additional bandwidth available for other services. This is visible on a speedtest when you look at the contended vs uncontended latency figures which for me are typically 3ms (uncontended) and maybe 6-10ms (contended). Without shaping, the contended latency can easily jump over 100ms or more which means that a single heavy user can significantly impact other activities in the house:
https://www.speedtest.net/result/16926798950
(it's the blue "download latency" figure - 10ms here)
With shaping disabled, the speed is slightly higher, but the contended latency goes through the roof and would certainly have other users complaining about performance if you were maxing out the connection:
https://www.speedtest.net/result/16926805762
There is a small, peak throughput cost to this, but in the real world it's well worth doing. Perhaps something available to you in your openwrt router? Perhaps also worth checking the performance of your router too, to be sure it's not the cause. Mine can easily shift 10G+ so I know that horsepower is not a limitation here.
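The idea of capping traffic slightly below the line speed can be illustrated with a token bucket, sketched here in Python. This is only a conceptual sketch and not how any particular firewall implements shaping: real shapers queue and schedule packets rather than simply refusing them, and the class name, rate and burst size are illustrative assumptions.

```python
class TokenBucket:
    """Allow traffic up to a sustained rate, with a bounded burst allowance."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # timestamp of the previous check, in seconds

    def allow(self, size_bytes: int, now: float) -> bool:
        """Return True if a packet of size_bytes may pass at time now."""
        elapsed = now - self.last
        self.last = now
        # Refill tokens for the time that has passed, up to the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


if __name__ == "__main__":
    # Tiny made-up numbers so the behaviour is easy to follow by hand.
    bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
    print(bucket.allow(1500, now=0.0))  # True: the bucket starts full
    print(bucket.allow(1500, now=0.5))  # False: only 500 bytes refilled
    print(bucket.allow(1500, now=1.5))  # True: refilled back to 1500 bytes
```

Setting the rate a few percent under the provisioned line speed moves the bottleneck onto your own router, where you control which packets wait or get dropped, instead of leaving that decision to the provider's queue.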
|
|
|
|
TCP will attempt to find the fastest rate the path can manage, so packet loss under full load is basically guaranteed to be happening somewhere. QoS and other AQM systems manipulate the flows to take advantage of this behaviour. (I literally make my living selling managed networks that make use of these technologies.)
The issue you see is that the connection is under-buffered and the LNS server can't enter and exit the dropping state quickly enough. Combined with the bursty nature of the CDN traffic (which is probably an artifact of efficiency/power saving measures on the CDN side called generic segmentation offload), this leads to an odd situation where you can't use your full connection speed when downloads from certain sources are happening.
There is no local network issue happening. The issue is at the LNS within the provider's data centre. Do not order an engineer to your house for this!
To fix this one of the following needs to happen:
1) The CDN be less bursty, which could be achieved by the game download systems using fewer TCP threads for their download or adjusting parameters on the CDN.
2) The buffer in the ISP network be increased to a size such that it can accommodate the CDN traffic bursts
3) The LNS at the ISP be tweaked to allow it to enter and exit the dropping state more rapidly, so it can still pass the full line rate even when dropping heavily
The only realistic option in my opinion is 2, because the CDNs are a law unto themselves. However, as yet nobody I've run into, apart from select people high up at BT Group, really understands what's going on here. End users just blame the provider for being 'slow'; however, as you can see, the situation is far more complicated. This isn't an end user router throughput issue, nor an issue with PPPoE vs IPoE (DHCP) etc. This is simply a case of the bottleneck being under-buffered.
The major headache here though is that the providers are running at such huge scale that providing, say, a 50ms buffer for a 1Gbps service needs around 6MB of RAM, or thereabouts, plus overheads, and I suspect the buffer capacity for the LNS is extremely expensive. That, coupled with the race-to-the-bottom price point, means they can't/won't do it. Bit of a tricky one?!
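For a sense of scale, the memory needed to hold a given depth of buffer is just line rate multiplied by time. The Python sketch below is a plain rate-times-time calculation that ignores per-packet overheads; the subscriber count is a made-up number, and treating every subscriber's buffer as full at once is a deliberately pessimistic assumption.

```python
def buffer_bytes(rate_bps: int, depth_ms: float) -> float:
    """Bytes of memory needed to hold depth_ms of traffic at rate_bps."""
    return rate_bps * depth_ms / 1000 / 8


if __name__ == "__main__":
    per_line = buffer_bytes(1_000_000_000, 50)
    print(per_line / 1e6)  # 6.25 (MB for 50ms at 1Gbps, before overheads)

    # Hypothetical worst case: 10,000 such lines all buffering at once.
    subscribers = 10_000
    print(per_line * subscribers / 1e9)  # 62.5 (GB)
```

The per-line figure is small, but buffer memory on carrier line cards is shared between everything terminating on the card, which is where the cost pressure comes from.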
|
|
|
Thanks for the help. I've told Plusnet it's likely a waste of time sending the in-house engineer and asked if it would be better to raise a ticket with their network team, but they don't seem to be able to deviate from the script: that if the line test passes it's an in-house problem, even though I've sent images and explained the issue.
I wonder if an upgrade to 900Mb would help?
Really is a pain, and I'm about 5 months into a 2 year contract.
|
|
|
|
Slower services have the same buffer size as far as I am aware, in effect meaning their buffer is larger so for this purpose the slower services are actually 'better'. On the plus side, with the gig (sold as 900m) services your 100+GB game downloads in about 15 minutes so not the end of the world.
I had the same experience trying to get support through the business as usual processes on this. Thankfully I had contacts in other departments for various reasons.
|
|
|
|
Out of interest what service did you subscribe to with your altnet previously?
|
|
|
|
Did have 500/500 on brsk which was much better even with CGNAT. Then when moving house I ordered plusnet (no other altnets around). They put a SOGEA in as a temp line until some civils could be done for FTTP (which got installed after a month or 2).
|
|
|
I think agoodm explained it very well.
CDNs can be quite aggressive, as I expect they are constantly dealing with "it's too slow" complaints, so threads get bumped up and the window size gets pumped up to a very high value.
CDNs also bring the internet closer to us, but there is a downside: a low RTT in my opinion aggravates this problem. Because Steam allows you to choose your mirror, I found as an example that choosing a mirror further away was quite effective at mitigating the problem.
In my personal experience I had the problem quite bad when I was on VDSL. I never resolved it in what I consider a proper way, but instead did things like apply a speed limit in the Steam download client, traffic shape downloads on my games consoles, and eventually started traffic shaping my ACKs on Steam. What I noticed on Steam is that if you restrict the download speed, it still remains very bursty, as agoodm described. So if, as an example, I set it to 30Mbit/sec, instead of being constantly around 30Mbit/sec it would burst to 60, then stop for a second, then burst to 60 again. It averaged out slower, but when it burst it was prone to harming the quality of my connection. I discovered that shaping the ACKs upstream would naturally slow it down to a more constant speed by reducing the congestion window on the CDN sending the data. The most effective shaping was using an intermediate VPN and shaping the output to myself from that. Later on I also artificially added latency with dummynet for Steam and console traffic.
I considered all sorts of potential causes of the problem, including very small buffers at the ISP or backhaul provider, but of course none of this was ever proven; it would be very difficult to do so. However, one day when I had issues on the VDSL connection I started running my entire LAN from 4G over my phone and noticed the behaviour completely changed. It no longer had these symptoms, and I could push the connection to 99% with less of an issue than if I was pushing the VDSL to 80%. Once I witnessed this I started the process of changing to a different fixed line connection.
I then moved to gig1 on Virgin Media which is carried out over DHCP not PPPoE, and had much more bandwidth, VM do buffer a lot more data as well. The problems completely vanished, I removed all configurations related to managing downstream traffic, and only left a basic upload QoS in place.
I am now on a gigabit both-ways connection to CityFibre via AAISP. Initially I had a regression, although it was nowhere near as bad as VDSL was. I then moved to more powerful hardware so the PPPoE wasn't saturating the CPU and made sure I had working max-mss headers. The connection also uses baby jumbo frames to allow a 1500 byte MTU. With all these changes, it's now almost as good as things were on VM. It is also worth mentioning I can't fully saturate the connection from a single device due to having a gigabit LAN. This may be helping things, as it kind of acts as a built-in QoS that keeps a bit of unutilised bandwidth at all times.
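The MTU and MSS numbers behind the baby jumbo frames point can be worked through in a few lines of Python. This is just the standard header arithmetic, assuming IPv4 and TCP headers without options; the constant and function names are illustrative.

```python
PPPOE_OVERHEAD = 8  # 6-byte PPPoE header plus 2-byte PPP protocol field
IPV4_HEADER = 20    # assumes no IP options
TCP_HEADER = 20     # assumes no TCP options


def ppp_mtu(ethernet_payload: int) -> int:
    """IP MTU left over once PPPoE framing is carried in the Ethernet payload."""
    return ethernet_payload - PPPOE_OVERHEAD


def tcp_mss(ip_mtu: int) -> int:
    """Largest TCP segment that fits in one unfragmented IPv4 packet."""
    return ip_mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    # Standard Ethernet payload: PPPoE squeezes the IP MTU down to 1492.
    print(ppp_mtu(1500), tcp_mss(ppp_mtu(1500)))  # 1492 1452
    # Baby jumbo frames (1508-byte payload) restore a full 1500-byte MTU.
    print(ppp_mtu(1508), tcp_mss(ppp_mtu(1508)))  # 1500 1460
```

If the path only supports the 1492-byte case, the MSS advertised on new TCP connections needs to come down to 1452 to avoid fragmentation, which is what MSS clamping on the router takes care of.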
Since gigabit is harder to saturate, I think that alone would improve things. I do think that, providing you have enough hardware grunt to drive the PPPoE and make sure you don't have fragmentation issues, upgrading to a faster connection alone might kill off the problem.
My previous VDSL issues are why I took an interest in your post.
Edited by Chrysalis (Sat 26-Oct-24 18:15:51)
|
|
|
The memory for buffers on the Nokia SRs lives on the line cards and is shared and contended: when it runs out, packets get dropped. However, I can't see it being full at 2pm on a weekday, so that seems unlikely.
Plusnet punters signing up any time recently would've had the LAC and LNS the same way EE, BT Consumer and BT's business services run - the BT Wholesale Nokia SR in the headend exchange doing everything.
Can see a reasonable amount of buffer there, 10ms or so. It's behaving like either the other side is super inconsiderate with tons of TCP flows using an aggressive stack, UDP with poor or no congestion control over the top or there's a configuration mismatch.
Another thing that comes to mind is if OP is near a BT Wholesale POP where there are CDN caches so the latency to those is really low: they will respond super fast to loss and ramp the windows back up really fast from the tiny round trip times.
I've no idea if it'd be useful, but it'd be interesting to see the IP Profile the BT Wholesale speed checker shows for this to make sure it's correct, and to rebuild the service on the BT Wholesale network, then, if still bad, on the Openreach OLT. The rate limiting happens in two places: egress on the SR facing the Openreach network, then egress on the Openreach OLT PON port. The SR should be taking care of it: no Openreach customer should ever be sending more to the OLT than the customer is provisioned for.
Edited by XGS_Is_On (Sat 26-Oct-24 18:16:50)
|