You mention you administer many circuits, of a business critical nature (I presume). I think it would be fair to say that any business which is relying on the stability of broadband run over a rock bottom standard infrastructure known as 'the phone line' is, for the most part, probably not a business that is serious about paying proper money for a solid connection to the internet. If it really was that important to them, they would be going the SDSL/leased line route.
I never said that it was absolutely end of the earth critical. But clearing up lots of circuits that just don't work anymore and cannot authenticate again after many hours DOES cause major issues, especially when you cannot get updates from the ISP because its own phone system is down... Yes Zen, I'm looking at you.
I know, such technologies are way more expensive, but the reason is simple. Different infrastructure, away from the rest of the commoners and peasants like myself who use the cheap (phone line) based infrastructure.
Actually if you knew that much, you'd also know that leased lines and other similar technologies are generally delivered over very similar infrastructure and go down. Ask AAISP what happened to its ethernet circuits last week when BT lost the DSL, that's right, the ethernet stuff died too.
That digger goes right through the lines, and goodbye leased line. Same infrastructure. If the exchange serving both my DSL and my Leased Line loses power (happened recently), both are dead. All that extra money, same outcome.
In real terms, having extensively tested this, I can get the same overall level of reliability by having 2 or 3 DSL Lines from different ISPs as I can by having a LL + DSL backup. Except I save a lot of money.
Unless we start talking about true diverse Leased Lines (and often they're not as truly diverse as you believe - I've seen them have common points of failure too)
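To put rough numbers on that comparison, here is a minimal sketch. It assumes each line fails independently apart from one shared component (the exchange, a duct, a power feed) that takes everything down with it. The function name and every availability figure below are made up for illustration, not measurements from any ISP.

```python
from math import prod


def combined_availability(link_availabilities, shared_availability=1.0):
    """Availability of redundant links that all depend on one shared component.

    The service is up when the shared component is up AND at least one link is up.
    Links are assumed to fail independently of each other (a simplification).
    """
    all_links_down = prod(1.0 - a for a in link_availabilities)
    return shared_availability * (1.0 - all_links_down)


# Hypothetical figures, purely illustrative.
DSL = 0.995       # a single DSL line
LEASED = 0.999    # a single leased line
EXCHANGE = 0.9995 # shared exchange / duct / power that everything rides on

three_dsl = combined_availability([DSL, DSL, DSL], shared_availability=EXCHANGE)
ll_plus_dsl = combined_availability([LEASED, DSL], shared_availability=EXCHANGE)

print(f"3 x DSL       : {three_dsl:.6f}")
print(f"LL + DSL      : {ll_plus_dsl:.6f}")
print(f"shared ceiling: {EXCHANGE:.6f}")
```

Under these assumptions neither setup can do better than the shared component, which is the point about common infrastructure: once you have a couple of lines, the common point of failure dominates, not the type of line.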
I understand your frustration in saying it should be Zen sorting this out, but you surely realised what you were signing up to when setting up the accounts. An SLA was probably not even in the terms of service.
It is in the spirit of the service offered that reasonable competence will be used. Killing lines just before the end of the working week to do rebalancing is not the best plan. Having a phone service for support as an ISP that died is pretty poor too. No resilience by Zen there. Wouldn't matter if I did have a Leased Line from them, I'd STILL have been unable to reach them for support. Great stuff.
So to complain and rant here about that is futile. You would be better protesting with your feet, and moving, either to another provider which doesn't use the BT 21CN network, or SDSL/leased line where appropriate.
Funnily enough I moved many lines away from Zen 2-3 years ago. I do this as a matter of routine, but not every organisation has the money for lots of lines, and although not critical, reading a status page saying issues are resolved, then finding it is far from resolved and even a 20 minute turn it off and on again session leaves lines dead for another hour is just not acceptable - it is massively misleading and could have been avoided.
Plus, most of the reason these load balancing efforts lead to lines dying and failing to come back lies with BT's RADIUS system and how it handles session reconnects - Zen should (as should other ISPs) be hassling BT on this, since the way it handles this scenario is substandard and 21CN comes up fairly regularly as a problem.
If the businesses you are managing can't or are unwilling to do this, then they must pay for that decision by making do with a less reliable connection. Someone within those organisations needs to assess what such a decision would cost them in the long run (in terms of money they would make/lose for the time they don't have connectivity) when faced with such outages as this one.
It isn't always about money lost.
It is your problem though.
No, it is Zen's problem. Right now I pay the money to them. If I choose to spend money with someone else it does cease to be Zen's issue, but right now it is down to Zen. You're sidestepping the responsibility.
[quote]
For me, it seems A&A provide the best service they can given the flaws of the infrastructure they have to work with.
[/quote]
Actually I'd disagree from personal experience. As much as I like Adrian and co, I don't think A&A offers the best service they can.
I think Zen are the same breed. Although, if load balancing is one of the aftermath issues caused by such outages as these, I agree they should be doing their best to obviate these issues where possible. I think (but maybe a Zen representative can confirm this) that the load balancing is actually done at BT's side, not Zen's, which is why Zen then have to re-distribute users after such an event, and why it is so disruptive to their customers when they have to knock people off and let them reconnect again to spread the load. I'm sure this is something that can be addressed, but maybe it's not a high priority. Either way, if it bothers you that much, maybe it is time to walk and join another provider.
Thanks for the lesson - but seriously, read my historical posts. I do point out that if you want to have a "guarantee" you need LLs etc. (but this is not the same as reality - an SLA can say anything, but it doesn't make it happen, and it provides pretty poor compensation when it doesn't). An SLA is for the most part a box ticking exercise.
As I say, I have significant experience in this field, and have successfully resolved some customers' long-term reliability issues by moving them AWAY from Leased Lines sometimes - not because they couldn't afford it, but because the service wasn't actually offering any enhanced reliability over DSL - by running them side by side for 6 months, I proved it. They moved to our Dual DSL service and since then I've never heard from them on support. It just works.
My issues are not ultimately with the "mission critical" debate, it is that:
(a) Zen made poor decisions to load balance when they did
(b) This was doubly poor knowing LBs cause stale sessions every single time, and knowing that leads to dead connections for extended periods. The 20 minute claim is nonsense. All our circuits came back at EXACTLY the same time - that's not symptomatic of a BT stale session issue
(c) Zen's phone service disappeared so you couldn't get through. Bad lack of resilience there on Zen's part, and it makes you wonder if in fact the issues are related.
As for an SLA, actually there is an implied SLA of sorts - anyone providing services is obliged to provide the service with reasonable competence and skill (a pretty common and well accepted legal term). That is what I expect. I'm not talking about stupid percentage figures and complicated documents full of get out clauses.
On a lighter note, my favourite SLA was one that excluded any sort of failure caused by a third party in any way. When I pointed out that this meant pretty much everything was excluded from the SLA, they didn't get it. Took us a lot of effort to get them to remove this clause. We had to remind them that we have no influence over the choice of suppliers they use, and that the way it had been worded meant absolutely everything was a third party clause. They clearly knew this and hadn't ever paid out on the SLA and nobody had bothered to read the SLA properly. That's why they're basically worthless.