From the PUG (corrected from PUNG MrSaffron) site: http://usergroup.plus.net/forum/index.php/topic
As with my initial posting on the loss of customer email data - http://usergroup.plus.net/forum/index.php/topic,2472.0.html, what follows is an open and honest account of the events surrounding the attempt to recover the data.
Firstly you need to understand that, in an attempt to recover the data swiftly, the engineer who deleted the 3 volumes in the first place followed up his error by immediately trying to create a volume of the same size as the 1st of the volumes in the same place. This is an old sysadmin "trick" that on some file systems could have revealed the lost data; however, in this case it did not work, and in fact caused us more problems, as you will see later on in this account.
Within 2 hours of the data being deleted, a data recovery company had been contacted, and within 3 hours the NAS was in transit to them. By 14:00 on that day the specialists were racking the NAS and had begun the process of copying all the 1s and 0s from our equipment to their own. This is standard operating procedure for anyone working in the field of data recovery, and is simply about ensuring that there is always an untouched copy of the information in case something further goes wrong while working on the recovery. Due to the volume of data that was being dealt with, the copy took until the early hours of the following morning.
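For readers unfamiliar with this step, the idea of taking an untouched copy before any recovery work can be sketched roughly as follows. This is only an illustration and not the tooling the recovery company used: the function names `copy_image` and `sha256_of` and the chunk size are assumptions made up for this example, and a real recovery job would read from raw disks with specialist hardware and software.

```python
import hashlib


def copy_image(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src to dst byte for byte and return the SHA-256 of the bytes read.

    Hypothetical helper for illustration: the source is only ever opened
    read-only, so the original data is not modified by the copy.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 of a file, read in chunks to keep memory use small."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest returned by `copy_image` matches `sha256_of` run against the copy, the copy is a faithful duplicate, and all further recovery attempts can be made against it rather than against the original.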
At that point, based on their initial investigations, the data recovery specialists set the expectation with us that we would recover some of the data, though possibly not all of it, and that it could take 4-5 days. From that point forward we have a tale of increasing woe, as each new deadline set by the data recovery people was broken as they discovered more and more problems. In the following paragraphs I will briefly cover off the main problems that have been encountered.
The Sun NAS that we had selected for the mail storage platform is from the first series of products to emerge from Sun since their purchase of StorageTek, and as such does not run the usual Sun OS, Solaris. It uses StorageTek's own proprietary OS, which is a heavily modified FFS2 (Fast File System 2). The modifications are all aimed at delivering enterprise-level performance.
As the kit is fairly new to market, the data recovery specialists had not worked on this specific OS before, though they do have a lot of experience with NASs in general. They therefore had to significantly rewrite the tools that they use for analysing and recovering data. They utilised their engineering departments in both the UK and the US to work around the clock to produce a re-worked set of tools.
Apart from the tools issue, the proprietary OS stores the master inode table on the 1st volume it has access to. Essentially this is the table that tells the system where all the other files on the system are. For more information about inodes, take a look at the Wikipedia article: http://en.wikipedia.org/wiki/Inode. As I mentioned earlier, the PlusNet engineer involved had attempted to recover the data by creating a volume of the same size in the same place as the 1st volume. That action, more than any other, has caused us problems. By creating a new volume, the existing inodes were wiped and all data that was on that volume was essentially gone. Without that master inode table, and with no knowledge of where the system stored its back-up copy of this table, it has proved very difficult to work out what the data on the relatively undamaged 2nd and 3rd volumes actually is.
We have received a partial file list from the 2nd and 3rd volumes. This list amounts to a list of inodes and the data in them, not the list of complete files. Without even a partial directory structure it becomes vastly more complex to work out which inodes are associated with which other inodes and therefore piece together the complete files. Without the data on the 1st volume we do not believe we are ever going to get the directory structure. Without the directory structure it becomes vastly more complex trying to work out which file from the partial list belongs to which user.
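To illustrate why losing the directory structure and the inode table matters so much, here is a deliberately simplified toy model. It is not how the StorageTek OS or FFS2 actually lays out data; the dictionaries, paths and the function names `read_file` and `contents_by_inode` are all invented for this sketch.

```python
# Toy model only: a real file system is far more complex than three dicts.

# Raw data blocks on disk: block number -> bytes stored there.
blocks = {10: b"Dear ", 11: b"Bob,", 20: b"Hi ", 21: b"Ann"}

# Inode table: inode number -> ordered list of the blocks making up a file.
inode_table = {1: [10, 11], 2: [20, 21]}

# Directory structure: path (and so the owner) -> inode number.
directory = {"/mail/bob/msg1": 1, "/mail/ann/msg1": 2}


def read_file(path, directory, inode_table, blocks):
    """Normal operation: path -> inode -> blocks -> file contents."""
    inode = directory[path]
    return b"".join(blocks[b] for b in inode_table[inode])


def contents_by_inode(inode_table, blocks):
    """Directory lost: data can be rebuilt per inode, but with no path or owner."""
    return {
        ino: b"".join(blocks[b] for b in block_list)
        for ino, block_list in inode_table.items()
    }


# With everything intact, a user's message can be found by name.
print(read_file("/mail/bob/msg1", directory, inode_table, blocks))

# With the directory gone, we only have anonymous contents keyed by number.
print(contents_by_inode(inode_table, blocks))
```

In this toy model, losing `directory` leaves the data readable but unattributed, which is roughly the position described above. Losing `inode_table` as well would leave only the unordered `blocks`, with nothing to say which blocks belong together or in what order.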
So, here we are: it is almost a month since the 700GB of email and mailing lists were lost and we still have no recovered data to return to you. This is of course upsetting for us, and even more so for the customers whose data has been affected. The longer we wait for the equipment to be returned to us, the greater the risk we run of hitting other capacity issues that we know are ahead of us, and we do not feel that we can justify waiting any longer while still taking the appropriate action for our customers.
What this really means is that we are currently arranging for the return to us of the Sun NAS head unit and disk arrays, so that we can push forward with the implementation of the new email storage platform.
When it is implemented, the new platform will provide us and you with a vastly scalable, site-resilient mail storage set-up, with 6-hourly checkpoints to ensure we can roll back the majority of the changes that happen on the system within a four hour period.
Network Improvement Consultant
[MrSaffron fixed broken links]
Edited by MrSaffron (Thu 03-Aug-06 16:04:24)