

Standard User ByteTraveller
(newbie) Sun 23-Jun-13 14:14:15

Understanding packet loss on the first hop

 
I am making a second pass at investigating local packet shaping, to prioritise certain traffic (SSH etc.) over other traffic so that it always gets the bandwidth and latency it needs.

I downloaded a popular torrent whilst seeding the usual stuff, in order to test how bad the latency got with the results of my first pass at packet shaping 2 years ago, by watching the delay when pinging the first hop (62.3.84.33 in this case).

I ran mtr alongside, and I was shocked to see that ICMP packets were being dropped! 8% loss during the ~4MB/Sec d/l ~800KB/Sec u/l - see screenshot. I sync with the cabinet at 5588/966KB/Sec currently, so such a test doesn't even max my local connection.

Is it correct that I'm seeing congestion at Zen's gateway server? I'm too n00b to conclude much from this, but I would expect ICMP packets to always get through, and that if congestion was actually happening then the flows carrying lots of data (i.e. TCP) would be the ones to start getting dropped.

Amazingly in an earlier test I recorded 0.6% packet loss to my gateway!! I haven't been able to recreate this, including when maxing the local Gbit network connection, so am not sure what to make of this... (presumably in passing this test, the network hardware, wires and switch must be fine).

Thanks for any help.

ZeN / Zen Fibre Office Plus (VDSL2)
ISP Representative SkyFire
(isp) Sun 23-Jun-13 14:29:55

Re: Understanding packet loss on the first hop

[re: ByteTraveller]
 
Hi there,

We don't prioritise responding to ping on our network. What matters is whether data is getting back from the source reliably. I posted this recently, which may help your understanding: Packet loss or latency at intermediate hops.

kind regards,
Phil Long

--
Phil Long
ZeN Performance and Process Improvement Manager
The above post has been made by an ISP REPRESENTATIVE (although not necessarily the ISP being discussed in the post).
Standard User ByteTraveller
(newbie) Sun 23-Jun-13 14:59:39

Re: Understanding packet loss on the first hop

[re: SkyFire]
 
Thank you for such a fast response :) You don't have to respond to this, although ofc I would be delighted - I just need to get through this thinking somehow.

Reading what you have linked, it sounds like ICMP is a bad proxy for overall connection performance under load then (which is suspicious as this has always been used, including back when I was on Windows with cfosspeed).

This is interesting, because when there is 'load', there is clearly a difference in behaviour on the Zen hop - dropping packets as compared to responding to them all - which still suggests it's a proxy for something, but I guess not for what I want, which is the connection limit.

Basically, based on my first packet shaping work, I want an indicator that shows when I have reached the limit of my connection - I then take that throughput as my max bandwidth, rate limit to it, and prioritise the bandwidth in child queues contained within that limit.

But what you are getting at from the link is not to treat my connection as the bottleneck, but to think of each actual connection individually, and go from there. This VDSL2 connection is awe-inspiring, but I don't think I've reached the point at which I can discount it as the bottleneck, particularly for uploading.

So perhaps my proxy for connection use should be the 'ping times' of a TCP connection to some server that is considered unlimited and therefore not a bottleneck itself - time to look into hping.

ZeN / Zen Fibre Office Plus (VDSL2)
Standard User Acidic
(learned) Sun 23-Jun-13 21:50:20

Re: Understanding packet loss on the first hop

[re: ByteTraveller]
 
If you're only experiencing the packet loss when running the torrents then in all likelihood you are saturating your connection. When it comes to dropping packets due to congestion there's no differentiation between TCP packets and ICMP packets. If the queue's full the packets just get dropped regardless of size/type.
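
To make the tail-drop point concrete, here is a minimal sketch of a single FIFO queue that drops whatever arrives once it is full. It is a toy under stated assumptions, not a model of any real router: the function name `simulate_drop_tail`, the queue limit, the drain rate and the traffic mix are all made up for illustration.

```python
import random
from collections import Counter, deque


def simulate_drop_tail(arrivals, queue_limit, drain_per_tick):
    """Feed packets through a single FIFO queue with tail drop.

    arrivals: one list of protocol labels per tick, in arrival order.
    Returns (sent, dropped) counters keyed by protocol label.
    """
    queue = deque()
    sent, dropped = Counter(), Counter()
    for tick_arrivals in arrivals:
        for proto in tick_arrivals:
            if len(queue) >= queue_limit:
                dropped[proto] += 1  # queue full: dropped whatever the type
            else:
                queue.append(proto)
        # The link drains a fixed number of packets per tick.
        for _ in range(min(drain_per_tick, len(queue))):
            sent[queue.popleft()] += 1
    return sent, dropped


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    arrivals = []
    for _ in range(200):
        tick = ["tcp"] * 12 + ["icmp"]  # offered load exceeds the drain rate
        rng.shuffle(tick)
        arrivals.append(tick)
    sent, dropped = simulate_drop_tail(arrivals, queue_limit=20, drain_per_tick=10)
    for proto in ("tcp", "icmp"):
        offered = sum(tick.count(proto) for tick in arrivals)
        print(f"{proto}: {dropped[proto]} of {offered} dropped")
```

The queue never looks at the protocol label when deciding what to drop, so once the offered load exceeds the drain rate both kinds of packet are at risk.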

If you run tcpdump/Wireshark to do a packet capture you'll be able to see the TCP retransmits due to packet loss.
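
Very roughly, a retransmit in a capture is a data segment covering sequence space that the flow has already sent. The sketch below counts those, assuming you have already exported `(flow_id, seq, payload_len)` tuples from the capture in arrival order. The function name and tuple layout are hypothetical, it ignores sequence number wraparound, and it will also count out-of-order segments, which Wireshark's own analysis tells apart.

```python
def count_retransmissions(segments):
    """Count data segments that resend already-seen sequence space.

    segments: iterable of (flow_id, seq, payload_len) in capture order.
    Returns a dict mapping flow_id to the number of suspected retransmits.
    """
    highest_end = {}   # per flow: one past the highest byte seen so far
    retransmits = {}
    for flow, seq, length in segments:
        if length == 0:
            continue  # pure ACKs carry no data, so they cannot be retransmits
        end = seq + length
        if flow in highest_end and end <= highest_end[flow]:
            retransmits[flow] = retransmits.get(flow, 0) + 1
        else:
            highest_end[flow] = max(end, highest_end.get(flow, 0))
    return retransmits


if __name__ == "__main__":
    capture = [
        ("ssh", 1000, 100),
        ("ssh", 1100, 100),
        ("ssh", 1000, 100),  # same bytes again: counted as a retransmit
        ("ssh", 1200, 0),    # pure ACK, ignored
    ]
    print(count_retransmissions(capture))
```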

On a path with no QoS applied, ICMP packets are as good as any other for testing performance, so long as the end destination you're pinging is capable of responding promptly.
Standard User ByteTraveller
(newbie) Mon 24-Jun-13 14:58:55

Re: Understanding packet loss on the first hop

[re: Acidic]
 
Ah, but it isn't my connection in this case - see my earlier post with the rate I'm syncing at - there's a good 1MB/Sec of headroom (such luxuries these days!).

Thinking about this, I remember coming across kitz.co.uk's BT 21CN diagram - as a normal user, when I ping Zen's server I think that I'm literally pinging it direct, because it's the first hop - however the diagram shows that there is much magic implementing that first hop, and I can only demystify the VDSL2 modem part at my end.

So the packet loss I'm seeing is either Zen's gateway not having the capacity to serve my connection (pretty unlikely I'd imagine?), or some hop on BT's underlying network not having the capacity (e.g. my local exchange) - I guess as a user there's nothing I can do to get visibility into that.

I'll dust off wireshark/tcpdump tomorrow - aside from torrenting, I imagine I can grep tcpdump output to see retransmits happening on the SSH connection VNC is running over, which I'm particularly interested in.

Thanks for your help.

ZeN / Zen Fibre Office Plus (VDSL2)
Standard User therioman
(knowledge is power) Mon 24-Jun-13 19:52:07

Re: Understanding packet loss on the first hop

[re: ByteTraveller]
 
In reply to a post by ByteTraveller:
Reading what you have linked, it sounds like ICMP is a bad proxy for overall connection performance under load then (which is suspicious as this has always been used, including back when I was on Windows with cfosspeed).


I don't think "proxy" is the right word, but no, ICMP is not a reliable or useful measure in a traceroute for determining the reliability of the link, intermediate hops and so on.

Routers don't have to respond to ICMP at all in this situation, and many don't. Others rightly put it at the back of the queue. There are lots of sound technical reasons, but in broad terms, processing these requests uses more resources on the router CPU than dealing with many other requests, so to avoid the CPU being consumed by unimportant ICMP, routers are often configured in very much a "when you have a moment, could you" style.
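
One common way to get that "when you have a moment" behaviour is to rate limit the replies a router generates itself, while forwarded traffic is left alone. Here is a minimal sketch assuming a token bucket limiter. `TokenBucket`, `hop_reply_loss` and the rates used are hypothetical and not taken from any particular router.

```python
class TokenBucket:
    """Allow at most `rate_per_sec` events per second, with a small burst."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst
        self.last = 0.0

    def allow(self, now):
        # Refill for the time elapsed since the previous probe, up to the cap.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def hop_reply_loss(probe_times, rate_per_sec, burst):
    """Fraction of probes left unanswered purely because of the limiter."""
    bucket = TokenBucket(rate_per_sec, burst)
    answered = sum(bucket.allow(t) for t in probe_times)
    return 1 - answered / len(probe_times)


if __name__ == "__main__":
    probes = [i * 0.25 for i in range(40)]  # four probes a second for ten seconds
    print("limiter at 2 replies/s:", hop_reply_loss(probes, rate_per_sec=2, burst=1))
    print("limiter at 100 replies/s:", hop_reply_loss(probes, rate_per_sec=100, burst=1))
```

In this toy the "loss" seen at the hop depends only on how fast it is probed against the limiter setting, which says nothing about whether traffic passing through the hop is being dropped.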

In reply to a post by ByteTraveller:
This is interesting, because when there is 'load', there is clearly a difference in behaviour on the Zen hop - dropping packets as compared to responding to them all - which still suggests it's a proxy for something, but I guess not for what I want, which is the connection limit.

Basically, based on my first packet shaping work, I want an indicator that shows when I have reached the limit of my connection - I then take that throughput as my max bandwidth, rate limit to it, and prioritise the bandwidth in child queues contained within that limit.


The limit of your connection is when you exhaust its bandwidth, which would be when it is at 100% utilisation of that link (until then it's not at the limit). So the limit in your case is going to be the sync rate less the overhead.

I would recommend setting it at 95% so you always have a little spare, then set priority based on the protocols you want leaving your side first. That priority doesn't carry beyond you, as Zen/BT won't be obeying your requests for priority (although assuming they've got more capacity available than you have in use, which they do, it isn't a big deal), but it does mean you can control your flow. Remember that you can only really control the outbound flow - you can't do much about inbound, as the traffic has already arrived once your router sees it (I believe BT's packet shaping, at least on the older network, had some harsh controls, mind).
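
As a worked example of the 95% suggestion, the sketch below turns a sync rate into a shaping ceiling and splits that ceiling between child classes. The 966 figure is the upstream sync rate quoted earlier in the thread (in KB/s); the class names and their shares are purely hypothetical.

```python
def shaping_plan(sync_rate, headroom=0.95, shares=None):
    """Return the shaping ceiling and per-class guaranteed rates.

    sync_rate can be in any unit (KB/s here); results use the same unit.
    shares maps a class name to its fraction of the ceiling and must sum to 1.
    """
    if shares is None:
        # Hypothetical split, highest priority first.
        shares = {"interactive": 0.2, "default": 0.5, "bulk": 0.3}
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("class shares must sum to 1")
    ceiling = sync_rate * headroom
    classes = {name: ceiling * share for name, share in shares.items()}
    return ceiling, classes


if __name__ == "__main__":
    ceiling, classes = shaping_plan(966)  # upstream sync rate in KB/s
    print(f"ceiling: {ceiling:.1f} KB/s")
    for name, rate in classes.items():
        print(f"  {name}: {rate:.1f} KB/s")
```

Keeping the ceiling below the real line rate means the queue builds up in your own shaper, where the priorities apply, rather than in the modem.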

In reply to a post by ByteTraveller:
But what you are getting at from the link is not to treat my connection as the bottleneck, but to think of each actual connection individually, and go from there. This VDSL2 connection is awe-inspiring, but I don't think I've reached the point at which I can discount it as the bottleneck, particularly for uploading.


I don't know what your ultimate issue is, but if you exhaust your upstream, you will rapidly see an increase in latency, and packet loss is going to start happening (because there's quite literally nowhere for the packet to go).
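
To put rough numbers on that, the added latency is simply the backlog waiting in the upstream buffer divided by the upstream rate, and loss starts once the buffer is full. A minimal sketch, assuming one-second steps and a hypothetical buffer size (the real buffer in any given modem or router is unknown here):

```python
def queue_delay_over_time(offered_rates, capacity, buffer_limit):
    """Track queueing delay and drops for an upstream link.

    offered_rates: data offered in each one-second step (e.g. KB).
    capacity: data the link can send per second, same unit.
    buffer_limit: most data the buffer can hold, same unit.
    Returns a list of (delay_seconds, dropped) per step.
    """
    backlog = 0.0
    history = []
    for offered in offered_rates:
        # Whatever the link cannot send this second stays queued.
        backlog = max(backlog + offered - capacity, 0.0)
        # Anything beyond the buffer has nowhere to go and is dropped.
        dropped = max(backlog - buffer_limit, 0.0)
        backlog -= dropped
        history.append((backlog / capacity, dropped))
    return history


if __name__ == "__main__":
    # Upstream of 966 KB/s offered 1100 KB/s for five seconds, 256 KB buffer.
    for delay, dropped in queue_delay_over_time([1100] * 5, 966, 256):
        print(f"delay {delay * 1000:.0f} ms, dropped {dropped:.0f} KB")
```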

In reply to a post by ByteTraveller:
So perhaps my proxy for connection use should be the 'ping times' of a TCP connection to some server that is considered unlimited and therefore not a bottleneck itself - time to look into hping.


You're approaching this the wrong way. You don't ping/test other hosts to determine if your link is saturated, you look at the link capacity. There's no meaningful way I can think of in a consumer broadband setup to (real time) manage your shaping based on a remote ping. The number 1 cause of latency increases and packet loss on standard dsl (including vdsl fibre) is going to be you running out of upstream.
Standard User ByteTraveller
(newbie) Tue 25-Jun-13 14:38:55

Re: Understanding packet loss on the first hop

[re: therioman]
 
In reply to a post by therioman:
I don't think "proxy" is the right word, but no, ICMP is not a reliable or useful measure in a traceroute for determining the reliability of the link, intermediate hops and so on.

Routers don't have to respond to ICMP at all in this situation, and many don't. Others rightly put it at the back of the queue. There are lots of sound technical reasons, but in broad terms, processing these requests uses more resources on the router CPU than dealing with many other requests, so to avoid the CPU being consumed by unimportant ICMP, routers are often configured in very much a "when you have a moment, could you" style.


OK. I could respond and say that I have an unlimited server that I could ping and guarantee it responds, but I don't anymore, and I guess you would say that intermediate hops could still slow the ICMP packets down.


In reply to a post by therioman:
The limit of your connection is when you exhaust its bandwidth, which would be when it is at 100% utilisation of that link (until then it's not at the limit). So the limit in your case is going to be the sync rate less the overhead.

I would recommend setting it at 95% so you always have a little spare, then set priority based on the protocols you want leaving your side first. That priority doesn't carry beyond you, as Zen/BT won't be obeying your requests for priority (although assuming they've got more capacity available than you have in use, which they do, it isn't a big deal), but it does mean you can control your flow. Remember that you can only really control the outbound flow - you can't do much about inbound, as the traffic has already arrived once your router sees it (I believe BT's packet shaping, at least on the older network, had some harsh controls, mind).


Indeed. I will have the current sync rate to hand, and then set the limit to that (I'll see if this rate is real or actually doesn't take the overhead into account, which presumably is all IP packets now rather than ATM cells). This doesn't deal with the other possibility, though, that my local connection is no longer the limiting factor - hence pinging the first real internet host and seeing how it responds - but it sounds like there isn't a reliable way to judge that yet (even if I can't automagically shape my connection to it, it's a good single number to keep tabs on to ascertain the quality of the connection).

About shaping incoming bandwidth: I've seen that said a few times before, but I don't think it's as simple as this - I can drop packets coming in at the gateway, and the ends of the TCP connections will detect that and throttle back (hence looking for the retransmits that Acidic mentioned to detect congestion earlier).
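
A toy illustration of that feedback loop, assuming the remote sender behaves like textbook additive-increase/multiplicative-decrease and that any round above the inbound limit results in a detected loss. Real TCP stacks are considerably more involved, and `aimd_trace` and its parameters are hypothetical.

```python
def aimd_trace(policer_rate, start_rate, increase, rounds):
    """Sender rate per round trip when the receiver side drops above a limit.

    Each round the sender either saw a drop (rate was above policer_rate)
    and halves its rate, or saw none and adds `increase`.
    """
    rate = start_rate
    trace = []
    for _ in range(rounds):
        if rate > policer_rate:
            rate = rate / 2          # loss detected: multiplicative decrease
        else:
            rate = rate + increase   # no loss: additive increase
        trace.append(rate)
    return trace


if __name__ == "__main__":
    # Hypothetical inbound limit of 5000 KB/s, sender probing upward in 250 KB/s steps.
    trace = aimd_trace(policer_rate=5000, start_rate=4000, increase=250, rounds=30)
    print(trace)
    print("average rate:", sum(trace) / len(trace))
```

The sender does back off, but only after the excess packets have already crossed the line and been dropped at the gateway, so this controls the average inbound rate rather than preventing short bursts from filling the downstream.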

In reply to a post by therioman:
I don't know what your ultimate issue is, but if you exhaust your upstream, you will rapidly see an increase in latency, and packet loss is going to start happening (because there's quite literally nowhere for the packet to go).


Yes, that's part of the basics. The packet loss finding may not be valid given that ICMP is unreliable, so I'll be wiresharking/tcpdumping shortly to see if there is actual packet loss on TCP connections - if there is, well, there's pinging vindicated ;)

In reply to a post by therioman:
You're approaching this the wrong way. You don't ping/test other hosts to determine if your link is saturated, you look at the link capacity. There's no meaningful way I can think of in a consumer broadband setup to (real time) manage your shaping based on a remote ping. The number 1 cause of latency increases and packet loss on standard dsl (including vdsl fibre) is going to be you running out of upstream.


I spent some years doing this on my normal DSL connection - pinging worked well in that it clearly showed the connection choking, even when the used bandwidth was below what the line was supposed to accomplish (modem stats were available then as well).

ZeN / Zen Fibre Office Plus (VDSL2)
Standard User therioman
(knowledge is power) Tue 25-Jun-13 21:42:35

Re: Understanding packet loss on the first hop

[re: ByteTraveller]
 
In reply to a post by ByteTraveller:
OK. I could respond and say that I have an unlimited server that I could ping and guarantee it responds, but I don't anymore, and I guess you would say that intermediate hops could still slow the ICMP packets down.


...or the upstream peer(s) for that server have an issue, or its connected switch dies, or any number of other reasons why something in between you and that server gets busy. It's just not the way to control QoS, period.



In reply to a post by ByteTraveller:
Indeed. I will have the current sync rate to hand, and then set the limit to that (I'll see if this rate is real or actually doesn't take the overhead into account, which presumably is all IP packets now rather than ATM cells). This doesn't deal with the other possibility, though, that my local connection is no longer the limiting factor - hence pinging the first real internet host and seeing how it responds - but it sounds like there isn't a reliable way to judge that yet (even if I can't automagically shape my connection to it, it's a good single number to keep tabs on to ascertain the quality of the connection).


There's overhead in the protocol: I "sync" at a full 80000/20000 but I can't get exactly 80000/20000 of throughput once protocol overhead is accounted for, hence the previous suggestion.

In reply to a post by ByteTraveller:
About shaping incoming bandwidth: I've seen that said a few times before, but I don't think it's as simple as this - I can drop packets coming in at the gateway, and the ends of the TCP connections will detect that and throttle back (hence looking for the retransmits that Acidic mentioned to detect congestion earlier).


The point I'm making is that the most likely reason you'll suffer quality issues is a congested upstream. I can run my line on downloads at full tilt and notice zero difference - as long as my upstream has capacity.

Tonight for a time I had zero upstream available -- result... my streaming audio (all of 256kbps from an 80Mbps fibre service) started having buffering issues. Enabled QoS to keep my upstream to no more than 18.5Mbps and the problem went away, as I had plenty of headroom to keep things flowing.


In reply to a post by ByteTraveller, itself replying to therioman:
You're approaching this the wrong way. You don't ping/test other hosts to determine if your link is saturated, you look at the link capacity. There's no meaningful way I can think of in a consumer broadband setup to (real time) manage your shaping based on a remote ping. The number 1 cause of latency increases and packet loss on standard dsl (including vdsl fibre) is going to be you running out of upstream.

--

I spent some years doing this on my normal DSL connection - pinging worked well in that it clearly showed the connection choking, even when the used bandwidth was below what the line was supposed to accomplish (modem stats were available then as well).


How did the "ping" to "somewhere" clearly show your broadband connection (which has any number of endpoints for each connection made) had an issue?

If I had an 80 meg broadband link and I ping a host reliably, but my line is idle and all of a sudden my pings aren't returned, it could mean a million things, including congestion somewhere on the route to the destination, for example an intermediate peer or transit provider issue. My connection wouldn't be congested, just my route to a particular end point.

As I said before, I'm not sure your "ping" concept to a single endpoint gives a useful measure on which I would like to make decisions affecting my entire connection to all destinations.
Standard User ByteTraveller
(newbie) Thu 27-Jun-13 19:39:18

Re: Understanding packet loss on the first hop

[re: therioman]
 
Firstly, yes, I know that upload bandwidth is critical - I learnt that years ago through experience and reading. The question atm was about apparent packet loss when torrenting while upload bandwidth was not exhausted.


In reply to a post by therioman:
How did the "ping" to "somewhere" clearly show your broadband connection (which has any number of endpoints for each connection made) had an issue?

If I had an 80 meg broadband link and I ping a host reliably, but my line is idle and all of a sudden my pings aren't returned, it could mean a million things, including congestion somewhere on the route to the destination, for example an intermediate peer or transit provider issue. My connection wouldn't be congested, just my route to a particular end point.

As I said before, I'm not sure your "ping" concept to a single endpoint gives a useful measure on which I would like to make decisions affecting my entire connection to all destinations.


Because it's not a question of the endpoints - when I used that method, my eye was on the DSL step being the bottleneck, not even Zen's gateway server.

Anyway, enough on that... I have finally fought GTK3, Wireshark and LibreOffice enough to conclude on the tests I did where I would download and upload a popular torrent, and without saturating the connection's download and upload bandwidth, see ICMP packet loss (no ping replies).

I wanted to see a significant increase in 'bad' TCP events (expert.severity on warning or error, which represents DUP ACKS, retransmits, etc etc) when the pings were not being serviced, which would then show that ping reflected actual TCP packet loss.

However! Unfortunately for me, I did not see this - congratulations ;) Ping does well to indicate the line is 'busy', but I can't conclude from my tests that it reflects TCP packet loss. See graphs 1 and 2 from the two tests a day apart - the orange line represents 'ping reply received' or 'not received', a 1/0 value that has been scaled to be visible against the other data.
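
The comparison behind those graphs can also be reduced to two numbers: the average count of 'bad' TCP events per second while pings were answered, against the same average while they were lost. A minimal sketch, assuming the capture has already been boiled down to one ping result per second and a list of the seconds in which bad events occurred (both input shapes and the function name are hypothetical):

```python
from collections import Counter


def bad_event_rate_by_ping(ping_ok, bad_event_seconds):
    """Average bad TCP events per second, split by whether the ping was answered.

    ping_ok: dict mapping a second (int) to True (reply seen) or False (lost).
    bad_event_seconds: one entry per bad TCP event, giving the second it occurred in.
    """
    per_second = Counter(bad_event_seconds)
    totals = {True: 0, False: 0}
    seconds = {True: 0, False: 0}
    for second, ok in ping_ok.items():
        totals[ok] += per_second[second]
        seconds[ok] += 1
    return {
        "ping_answered": totals[True] / seconds[True] if seconds[True] else None,
        "ping_lost": totals[False] / seconds[False] if seconds[False] else None,
    }


if __name__ == "__main__":
    # Made-up data purely to show the shape of the inputs.
    ping_ok = {0: True, 1: True, 2: False, 3: True}
    bad_events = [0, 2, 2, 2, 3]
    print(bad_event_rate_by_ping(ping_ok, bad_events))
```

If the two averages come out similar, ping loss is not tracking TCP trouble; a clear gap would suggest that it is.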

It's clear that the bad events increase, but that's probably the torrent going faster/involving more peers in general. Annoyingly, I noticed that far more data goes through UDP than TCP when torrenting these days - and I will not be able to detect the associated packet loss in this test. Saying that though, there were clearly a number of TCP connections still in use, so I don't want to discard the conclusion just because of that.

I need to make my conky script more complicated so it doesn't freeze when Zen doesn't service the pings... and now on to the final test - VNC over SSHing in during torrenting and making sure there's no suspicious interference with the TCP connection.
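
One way to stop a monitoring script from hanging on an unanswered ping is to put a hard time limit on the probe. This is only a sketch and says nothing about how the actual conky script is written: `run_probe` and `ping_once` are hypothetical helpers, and the `-c` and `-W` options are those of the Linux iputils `ping`.

```python
import subprocess


def run_probe(command, timeout_s):
    """Run a probe command with a hard time limit.

    Returns True or False from the exit status, or None if the command
    did not finish in time or could not be started.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None  # the child is killed by subprocess.run before this is raised
    except OSError:
        return None  # command missing or not executable
    return result.returncode == 0


def ping_once(host, timeout_s=2.0):
    """Send one echo request; never block for longer than timeout_s."""
    # -c 1: send a single request; -W 1: wait at most one second for the reply.
    return run_probe(["ping", "-c", "1", "-W", "1", host], timeout_s)
```

A call such as `ping_once("62.3.84.33")` then always returns within the time limit, with `None` or `False` standing for "no reply".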

ZeN / Zen Fibre Office Plus (VDSL2)
Standard User ByteTraveller
(newbie) Fri 28-Jun-13 17:35:52

Re: Understanding packet loss on the first hop

[re: ByteTraveller]
 
I have now done some testing with VNC over SSH from a different connection to my SSH server while downloading the popular torrent, with at most 50% upload bandwidth used - see graph - there is a packet loss effect, but aside from the suspicious blip it's probably minimal and not enough to get concerned about.

Now that I've rediscovered Wireshark's IO Graph functionality, such stuff is easier to look into.

Overhead-wise, thinking about this, I have been pretty brainless taking the sync rate values as the truth - the modem has no idea about my connection, from what I can tell it just bridges BT's network with mine - so my real line throughput must take into account the fact that all the IPv4 packets are encapsulated in a PPPoE connection; presumably it is real Ethernet and not ATM below that as well. I'll just use your 95% value :/
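
For a rough idea of where a figure like 95% comes from, the sketch below estimates TCP goodput from a sync rate. The header sizes are the standard ones (8 bytes for PPPoE plus PPP, 20 each for IPv4 and TCP without options, 18 for the Ethernet header and FCS). Whether the reported sync rate already excludes the VDSL2 64/65 line coding, and whether a VLAN tag adds a further 4 bytes on this particular line, are assumptions left as parameters.

```python
def tcp_goodput(sync_rate, mtu=1500, pppoe=8, ip_header=20, tcp_header=20,
                ethernet_framing=18, line_coding=64 / 65):
    """Estimate TCP payload throughput for full-size frames.

    sync_rate can be in any unit; the result uses the same unit.
    mtu is the Ethernet payload size that the PPPoE frame must fit into.
    """
    payload = mtu - pppoe - ip_header - tcp_header  # TCP bytes per frame
    on_wire = mtu + ethernet_framing                # frame bytes before line coding
    return sync_rate * line_coding * payload / on_wire


if __name__ == "__main__":
    # Upstream and downstream sync rates quoted earlier in the thread, in KB/s.
    for label, rate in (("up", 966), ("down", 5588)):
        print(f"{label}: sync {rate} KB/s -> roughly {tcp_goodput(rate):.0f} KB/s of TCP payload")
```

Under these assumptions the efficiency comes out at about 94% for full-size packets, and lower for small ones, so a 95% ceiling is in the right region but slightly optimistic if the assumptions hold.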

ZeN / Zen Fibre Office Plus (VDSL2)

Edited by ByteTraveller (Fri 28-Jun-13 18:19:56)
