User comments on ISPs
  >> AAISP




Standard User E300
(committed) Fri 12-Jan-24 13:18:53

Re: Poor uptime and reliability


[re: perlen] [link to this post]
 
Looks like another drop, just cut off a video call. Not good.

Standard User jimbof
(committed) Fri 12-Jan-24 13:21:52

Re: Poor uptime and reliability


[re: jimbof] [link to this post]
 
Even worse than I thought...
Standard User perlen
(newbie) Fri 12-Jan-24 17:35:37

Re: Poor uptime and reliability


[re: jimbof] [link to this post]
 
Z.Witless is now in service and on new hardware - I was moved back on to it at 16:33 today.

Unfortunately, base latency has increased by 2ms (was 9ms, now 11ms) :(



Standard User perlen
(newbie) Fri 12-Jan-24 17:43:41

Re: Poor uptime and reliability


[re: perlen] [link to this post]
 
Y.Witless crashed today also...


INITIAL
4¼ hours ago by Andrew
At 13:16 Customers on Y.Witless dropped and reconnected a few minutes later. The cause is being investigated as a matter of high priority.

UPDATE
55¾ minutes ago by Andrew
Y.Witless crashed again at 16:30.

RESOLUTION
28 minutes ago by Andrew
We've updated the main post regarding these drops: https://aastatus.net/42577


------------------

UPDATE
28¾ minutes ago by Andrew
An update of where we are (Friday 12th January).

Some customers have had interruption to their service this week as we have seen a number of crashes on both Z.Witless and Y.Witless.

Today we replaced the hardware of Z.Witless.

Our developers have been working on investigating each crash we have. We have been saying in recent updates that progress had been made on the crashes we have seen, and this week we applied the software update to two of our three 'Witless' LNSs. In our test lab we have never seen this updated software crash during 3 weeks of testing. However, we have had crashes this week since applying the updated software.

Usually with a crash, our developers are sent a crashlog with details specifying exactly where in the code the crash happened. However, the crashes that have been affecting us are different in that the hardware locks up and restarts. With this type of crash we have less forensic information to work with, which is making it that much harder to get to the bottom of the problem.

We are still working hard to resolve this. We have various avenues of investigation to take, and during the next week we will be planning more overnight work as well as datacentre trips.

We know how disruptive this has been for those customers affected, and we are doing all we can to work towards a stable service for everyone.
Standard User jpm
(fountain of knowledge) Fri 12-Jan-24 18:11:33

Re: Poor uptime and reliability


[re: perlen] [link to this post]
 
In reply to a post by perlen:
Z.Witless is now in service and on new hardware - I was moved back on to it at 16:33 today.

Unfortunately, base latency has increased by 2ms (was 9ms, now 11ms) :(

Have they explained why they are doing this sort of work at 5pm instead of scheduling overnight changes?
Standard User E300
(committed) Fri 12-Jan-24 18:14:35

Re: Poor uptime and reliability


[re: perlen] [link to this post]
 
I see the same thing: a 1.5 to 2ms increase in latency if I connect to Z.Witless. I think this is because it is in a different data-centre to X and Y, so the routing changes.

Also I've found on BT Wholesale that latency can vary by plus or minus 2ms for me, as there is some variation in the back-haul routes, but a few drops of PPP will usually see it come back up on the shorter route.

Sometimes when we get these overnight drops, I can see a 4 or 5ms increase in latency the next morning because I've landed on Z.Witless plus got a longer routing on BT backhaul, which just doesn't feel right going in the wrong direction even though it doesn't make any noticeable difference.

I've seen the same couple of ms of variation in latency via BT Wholesale with my previous ISPs as well, so it's just one of those things: some sort of load balancing.

Standard User E300
(committed) Fri 12-Jan-24 18:19:11

Re: Poor uptime and reliability


[re: jpm] [link to this post]
 
In reply to a post by jpm:
Have they explained why they are doing this sort of work at 5pm instead of scheduling overnight changes?


This was another crash of an LNS, so connections fell over to Z.Witless. It's easy to spot these on the blip graph: https://aastatus.net/index.cgi#blip. If it were a controlled move over, we would see a red blip below the green blip, but because the LNS has just crashed, it doesn't update the graph with any disconnections; only re-connections show up.

So that's two crashes today a few hours apart.

At least they are open about the issues and we know what's going on, so as customers we aren't wondering whether it's our kit or wasting time rebooting our own routers etc.

Edited by E300 (Fri 12-Jan-24 18:32:58)

Standard User perlen
(newbie) Fri 26-Jan-24 08:18:50

Re: Poor uptime and reliability


[re: E300] [link to this post]
 
And another:

Jan 26, 07:00 AM
https://aastatus.net/42612

6AM: Z.Witless LNS had a hardware lock-up, causing lines on it to drop and reconnect
Standard User E300
(committed) Fri 26-Jan-24 09:31:44

Re: Poor uptime and reliability


[re: perlen] [link to this post]
 
In reply to a post by perlen:
And another:

Jan 26, 07:00 AM
https://aastatus.net/42612

6AM: Z.Witless LNS had a hardware lock-up, causing lines on it to drop and reconnect


I had a drop overnight due to a "Lost Carrier", which suggests it was BT work, as I'm on X.Witless. I think Z.Witless is on new hardware now with extra debug logging, so having that crash on Z.Witless might be good news in a way, as they may find out what is causing it.

Standard User serichards
(regular) Fri 26-Jan-24 10:13:13

Re: Poor uptime and reliability


[re: E300] [link to this post]
 
I had a lost carrier just before 1am. Couple of minutes outage.

I'm on gormless then aimless judging by the traceroute.

I do like their naming scheme. I wonder if they have a feckless?!