Opening up SSH to the outside world is obviously a security risk as some folks have recently discovered.
There are many things going on here and without context this is disingenuous. My day job is running a High Performance Computer (HPC), or supercomputer to the layman. I have been running public-facing servers for over 25 years now and have never had a machine of mine compromised.
Firstly, HPC sites like ARCHER, run by the EPCC, have to make themselves available to remote login. To not do so would make them largely pointless.
Secondly, EPCC have to some extent been hoist by their own petard in this case. There is a lot of "tail wagging the dog" going on in account management IMHO at the EPCC. Basically, if I have more than one project I have to have a separate account for each project, because that makes it easier for EPCC to manage things, and then each account has to have a different password. As you can imagine, under such awkward conditions bad practice by the end user prevails.
In my case, appropriate firewall rules means allowing inbound access from specific IP addresses to these servers. All my servers run SSH but not all are accessible from any IP address.
For HPC sites this is basically impossible and anyway would not have helped the EPCC.
I do run some SSH servers accessible from everywhere (except for countries I've decided are rogue) and for these I run SSH on a non-standard port. I know people will say "but if people scan you they will find your SSH servers anyway" but that's highly unlikely.
Most SSH scans target port 22 and any IP address attempting to connect to me on this port gets autoblocked for a significant period of time. Also, a port scan results in an automatic block for a long time.
And I have logs that show non-standard SSH ports being found on never-before-used IP addresses within 36 hours. If you think that a non-standard SSH port helps, I have a slightly used bridge with only a few snapped cables and brand new truss end links to sell you.
If someone does find my SSH server by chance (which has never happened, and I've got the logs to prove it) then they get three attempts at the password before Fail2ban locks them out - again for a significant time. Even if they get the username and password correct, they still need to use multifactor authentication - which is tied to devices in my possession.
Good idea, but this would not have helped the EPCC, and multifactor authentication is not trivial to implement when you have hundreds of users. If you only have a handful, or it's just yourself, then MFA is pointless: just pick a decent random password and don't write it down.
So what happened at the EPCC and other HPC sites across the UK and Europe back in April? One thing to bear in mind is that users of HPC machines often have accounts on multiple machines across countries and continents. Certainly the machine I look after has users with accounts on ARCHER and on other systems that were compromised, including the ones in Munich and Dresden. These are sites that absolutely would be on your allow list of IP addresses. Saying "use a VPN" is impractical because an end user on one system does not get to set up a VPN to another system.
Well firstly an account belonging to a user in Poland (or at least a user account on a system in Poland) was compromised. The method for this compromise has not been published.
They were able to use this local access to gain root on the systems (more on this later). Once you have root on a system it is trivial to scan all the accounts looking for private SSH keys that have not been secured with a passphrase (which turns out to be lots). On the HPC system I run it takes under one second to scan just shy of 900 user accounts. You can then take a look at the user's known_hosts file (if it is not hashed) to look for potential hosts to try those keys on. Note that even if known_hosts is hashed I could just scan your Bash history file, which is also trivial to do. In a small amount of defence, if you Google setting up SSH keys, a lot of tutorials on the web tell you not to supply a passphrase for your private SSH key. This includes "trusted" sources like Red Hat, for crying out loud.
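To show how little effort that scan takes, here is a rough Python sketch of the passphrase check. It is not the attackers' tooling and not what I run. It is a guess at the simplest thing that works: it looks at the PEM header and, for the newer OpenSSH key format, at the cipher name stored near the start of the blob. The function names and the /home/*/.ssh layout are my own assumptions, and it only sees keys stored on the machine being scanned.

```python
import base64
import struct
from pathlib import Path

OPENSSH_MAGIC = b"openssh-key-v1\0"


def private_key_has_passphrase(key_text):
    """Return True/False, or None if the text is not a recognised private key."""
    lines = [ln.strip() for ln in key_text.strip().splitlines()]
    if not lines or not lines[0].startswith("-----BEGIN ") or "PRIVATE KEY" not in lines[0]:
        return None
    header = lines[0]
    if header == "-----BEGIN ENCRYPTED PRIVATE KEY-----":
        return True  # PKCS#8 container that is encrypted by definition
    if header == "-----BEGIN OPENSSH PRIVATE KEY-----":
        body = "".join(ln for ln in lines[1:] if not ln.startswith("-----"))
        try:
            blob = base64.b64decode(body)
        except ValueError:
            return None
        offset = len(OPENSSH_MAGIC)
        if not blob.startswith(OPENSSH_MAGIC) or len(blob) < offset + 4:
            return None
        # First field after the magic is a length-prefixed cipher name.
        (length,) = struct.unpack(">I", blob[offset:offset + 4])
        cipher = blob[offset + 4:offset + 4 + length]
        return cipher != b"none"
    # Legacy PEM keys (RSA/DSA/EC) announce encryption in a Proc-Type header.
    return any(ln.startswith("Proc-Type:") and "ENCRYPTED" in ln for ln in lines[1:4])


def find_unprotected_keys(home_root="/home"):
    """Yield paths under <home_root>/*/.ssh/ holding keys with no passphrase."""
    for path in Path(home_root).glob("*/.ssh/*"):
        try:
            if path.is_file() and private_key_has_passphrase(
                path.read_text(errors="replace")
            ) is False:
                yield path
        except OSError:
            continue  # unreadable file, skip it
```

Run as root, list(find_unprotected_keys()) would give an admin the same list an intruder would build, which is at least a starting point for chasing the users concerned.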
Using this method the hackers (who were coming from IP addresses originating in China) were able to move from HPC system to HPC system across Europe. They were not able to gain root on all systems which, with the lack of new local privilege escalation vulnerabilities published in the last two months, would indicate they were using a pre-existing vulnerability that had not been patched. HPC systems are notorious for not applying patches regularly, for a range of reasons, a good few of which revolve around patches changing the results of computations, which screws up your research...
My guess is that https://nvd.nist.gov/vuln/detail/CVE-2019-18634 was the mechanism used for the local to root privilege escalation. It is the most likely candidate, and the deafening silence when I have asked the sites affected is noticeable. I want to know: is there a zero day I need to be worried about on my HPC site, or is it down to a lack of patching? Of course HPC sites don't want to admit to a lack of patching...
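For anyone wanting a quick check of their own exposure: as I read the NVD entry, that sudo overflow is only reachable when the pwfeedback option is enabled in sudoers, which is not the upstream default. Below is a rough Python sketch that looks for the option in sudoers text. The function name is mine, and the parsing is deliberately naive: it ignores included files and scoped Defaults entries, and it says nothing about whether your sudo package has been patched.

```python
def pwfeedback_enabled(sudoers_text):
    """Naive check: does the last Defaults setting seen turn pwfeedback on?"""
    enabled = False
    for raw in sudoers_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (and #include lines)
        if not line.startswith("Defaults"):
            continue
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue
        for setting in parts[1].split(","):
            setting = setting.strip()
            if setting == "pwfeedback":
                enabled = True
            elif setting == "!pwfeedback":
                enabled = False
    return enabled
```

Feed it the contents of /etc/sudoers (read as root). A True result on an old sudo is a reason to patch now, and a False result is not proof of safety.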
You might say, well, ban IP addresses in China or Russia from connecting to your system. The problem with that is we have users from those countries as students and researchers at our University who have accounts on our system for their research. At the moment a few of them are stuck back in their home countries due to COVID-19 lockdowns....
A good password that only exists in your head, with a non-standard account name (root, admin or a first name are all bad; something like ahd12935 is much better), combined with rate limiting of connection attempts, is way more secure than SSH keys. We have actually disabled SSH key based access because we cannot secure it. We have just had to [censored] a couple of users for sharing passwords....
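By rate limiting I mean something along the lines of the Python sketch below. It is an illustration of the idea only, not how Fail2ban or our own setup is implemented. The class name and the default thresholds (three failures in ten minutes, one hour ban) are made-up values.

```python
import time
from collections import defaultdict, deque


class AttemptLimiter:
    """Ban a source address after too many failed logins in a time window."""

    def __init__(self, max_failures=3, window=600.0, ban_time=3600.0):
        self.max_failures = max_failures  # failures tolerated per window
        self.window = window              # seconds over which failures count
        self.ban_time = ban_time          # seconds a ban lasts
        self._failures = defaultdict(deque)
        self._banned_until = {}

    def is_banned(self, ip, now=None):
        now = time.monotonic() if now is None else now
        until = self._banned_until.get(ip)
        if until is None:
            return False
        if now >= until:
            del self._banned_until[ip]  # ban has expired
            return False
        return True

    def record_failure(self, ip, now=None):
        """Record a failed login; return True if this one triggers a ban."""
        now = time.monotonic() if now is None else now
        recent = self._failures[ip]
        recent.append(now)
        # Forget failures that have fallen out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self._banned_until[ip] = now + self.ban_time
            recent.clear()
            return True
        return False
```

With a random password of any decent length, three guesses an hour per address makes brute forcing a non-starter, which is the whole point.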
The takeaway from all this is that the HPC systems were hacked due to the stupidity of users over their handling of SSH keys, which as an admin you have absolutely no way of checking up on, and probably a lack of timely patching by admins.