Following on from his bulletin on backups in December, this week Charlie looks at them in more detail and provides some useful points to consider in our own organisations.
I did my first bulletin about backups before Christmas and thought I would continue today by sharing what I have learned since. I must say thanks to James Watts, the Managing Director of Databarracks, who was very patient with me and spent time answering my long list of questions. It should be noted that this bulletin is written from a non-technical point of view, so it does not go into the technological solutions. Instead, it looks at what you must think about and what questions you must ask when preparing a backup strategy. The list here might also be useful for a business continuity manager to use as a checklist for discussing backups with their IT department and understanding what level of IT resilience is in place. When I started my conversations with James, I thought backups were, conceptually, easy: you simply backed up your IT to the industry standard. But as I have learned more, I have found that, like many things in life, it is not that simple!
Figure 1 shows all the different considerations when looking at your backup strategy, which, as I said earlier, you could use as a checklist for checking your understanding of your organisation’s level of preparedness.

Why are we backing up?
The first of the considerations is why we are backing up. It seems obvious: if we lose our main systems, then we could lose all our data. Anyone who has had that cold, sinking feeling of thinking all the family's electronic photos could be lost knows that some data cannot be replaced, and for many organisations, if they lost all their data, they wouldn't have a business. In some cases, you can reproduce the data. I understand SEPA (the Scottish Environment Protection Agency), after their cyber attack in 2020, had to recreate their entire financial records from bank statements. There can be a number of different reasons for backing up your data, which will then have a knock-on effect on how long you retain data for. It could be for general operational reasons, so you are covering scenarios such as total loss of your production system due to some sort of disaster, or loss of a file due to deletion or corruption which needs to be restored. There may be compliance reasons for the storage of your data, and so you may have to keep it for seven years or longer. Recovery after a ransomware attack may be another consideration, which may call for an air-gapped solution geared around recovering your systems within a required Recovery Time Objective (RTO) to ensure the continuity of the organisation's operations. Spending some time answering this question will help drive the answers to some of the other considerations.
What are we backing up?
I always thought that this was a simple question (you back up the data), but the more I looked, the more complex it got. You can just back up the application's data, but then in terms of recovery, you need to find somewhere to rebuild the replacement systems: you need a server, in the cloud or in a data centre, you need the application which processes the data, and you need the configuration of the application, as it will most likely have been customised to your organisation. If you back up only the data, there are a number of steps you have to go through to recover the system as a whole. If you back up the application as a complete system, with its data, software, and configuration, your recovery will be much faster. This type of backup is not so convenient for recovering a single corrupted file, so sometimes you may back up the same application in two different ways: once as just the data and again as a total working system.
In talking to James, the single most important part of your IT system, which should always be backed up, is your Active Directory. This is the directory of your users and the systems they are allowed to access, and it underpins the single sign-on which allows users to access internal as well as external systems. Without Active Directory working, nobody can log on and users cannot access any systems. There is a brilliant article from Wired magazine on how Maersk dealt with an Active Directory issue during the NotPetya cyber incident.
The other backup you have to look at is your SaaS data. Although the SaaS provider is responsible for keeping the application working, up to date, and patched, they are not always responsible for the backup of the data, or they may do this in a limited way which may or may not suit your organisation's requirements. You don't want to be affected by an incident only to find that your data was not backed up and your SaaS provider has lost it forever. Therefore, you need to look at the small print of your contract to check what backup is provided. You also need to look at the data deletion policy. Microsoft retains Microsoft 365 Outlook data for 30 days; after this, the data is lost forever. Companies like Databarracks would love to help you back up your Microsoft data, but you need to decide whether the out-of-the-box service provision is enough, coming back to 'why are we backing up', or whether you need to make additional arrangements. We must also remember, when backing up SaaS, that big providers such as Microsoft and Salesforce are wealthy companies who can afford top-of-the-range security, hosting, and maintenance. Smaller niche SaaS providers may not have the same money to spend on these items and may be more susceptible to a major disaster, which could result in the loss of your data. Read the small print and check whether you are happy with their existing backup provision or whether you want to take additional precautions yourself.
Cost and risk appetite
As in all things business continuity, there is always a balance to be struck between the impact of losing what you are protecting and the cost of protecting it. You can spend an almost infinite amount of money on backups, backing up in different ways, providing redundancy, and splitting your risk. The 3-2-1 approach we talked about in the previous bulletin, having three backup copies, on two different media, with one offsite, is a basic provision, and some organisations might decide to keep more copies offsite. You may want to use two different cloud providers to host your backups to mitigate the risk of an outage or a company issue such as bankruptcy, but then this would double your cost. As in business continuity generally, there is always a balance between cost and the speed of recovery: the faster you want to recover, the more it costs, and it is the same with backups. When this cost-benefit discussion is being had, the senior managers signing off the solution and providing the budget should understand the risks they are running. You may not be able to afford all the recovery you want, but at least, if there is a disaster and you have to recover, you know your timescales and can be comfortable that the organisation will survive. You understand the pain the organisation will have to suffer before it recovers.
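To make the 3-2-1 idea a little more concrete, here is a minimal sketch, in Python, of the sort of check you could run over a list of backup copies. The copy names, media types, and offsite flags are all hypothetical; in practice, this inventory would come from your backup tooling rather than a hand-written list.

```python
from dataclasses import dataclass

@dataclass
class BackupCopy:
    name: str       # hypothetical label for the copy
    media: str      # e.g. "disk", "tape", "cloud"
    offsite: bool   # is the copy stored away from the production site?

def meets_3_2_1(copies: list[BackupCopy]) -> bool:
    """True if there are at least 3 copies, on at least 2 media, with at least 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    one_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and one_offsite

# Hypothetical example: an on-site disk copy, an on-site tape copy, and an offsite cloud copy.
copies = [
    BackupCopy("primary-disk", "disk", offsite=False),
    BackupCopy("tape-library", "tape", offsite=False),
    BackupCopy("cloud-vault", "cloud", offsite=True),
]
print(meets_3_2_1(copies))  # True
```

This is only an illustration of the rule itself; it says nothing about whether the copies are actually recoverable, which is where the testing point below comes in.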
Once you have established why you are backing up, what you are going to back up, and your risk appetite, there are a number of considerations you need to make decisions on.
- How long will you keep your backup files? This could be driven by business need, or it could be a regulatory requirement, such as keeping files for seven years.
- Volume of data. The cost of your backup service is directly driven by the volume of data you are backing up, so keeping multiple versions of the same databases will cost more.
- Number of versions of the same file. Again, this is rather obvious: if you keep multiple versions of the same file, then you increase your data volume. If you are going for the 3-2-1 strategy, then you are keeping three copies, but keeping sets of data in different places decreases the risk of a single point of failure causing all your backups to be lost.
- An additional cost is your testing and validation. You need to test your backups regularly to check that they can be recovered, and within the time agreed. I worked for an organisation a while ago that had been diligently backing up its systems to tape for years, only to discover that there was nothing on the tapes and no backups were actually being written. Luckily, they found this out before they had to use the backups for a serious recovery. Testing confirms that your IT people know how to recover a system and that they can do so within the set RTO; a minimal sketch of this kind of restore check appears after this list. When IT departments are understaffed and under pressure, this is an easy element not to carry out, but it is vital in ensuring your recovery strategy will work as it is meant to.
- When we carry out business impact analysis for clients, we capture the RTOs (Recovery Time Objectives) and RPOs (Recovery Point Objectives) for the activities the organisation carries out. We also capture the applications that are essential for them to operate at an agreed level. We then speak to IT and see what RTOs and RPOs they have for their applications. Often there are large gaps between what the organisation wants and what IT can provide. The RTOs and RPOs may be set by the IT department without consultation with the users, or in many cases, they are not set at all. The first stage for me is for both sides to acknowledge the gap; then a strategy can be put in place to close it, and a simple illustration of this gap comparison also appears after this list. Often, due to cost, the RTOs and RPOs wanted by the business cannot be met, and compromises have to be made. I always think it is vital that organisations know their gaps and their capabilities and have a managed risk position, rather than finding out on the day of the disaster that they can't recover as they thought they could.
- Again, with RPOs, as with RTOs, we often see a gap between what the organisation wants and what IT can provide. In a ransomware attack, the hackers will try to reach and encrypt your online backups, so the recovery will have to be from offline backups. Understanding the RPOs is vital for departments to understand what data they could lose in a cyber attack.
- The locations of storage have to be considered, along with the associated costs. You can have air-gapped backups in the same data centre as your main systems, but this does present a single point of failure. Taking backups to an offsite data centre removes that single point of failure, but then the recovery of an application can take time due to the bandwidth needed to transfer the files from the data centre where they are stored to the data centre you are recovering into.
- The more security you want, the higher the cost, so there is again a balance to be struck, and this is where you have to rely on technical advice from your IT department to ensure there is appropriate security on your backups. Backups should be encrypted when they are moving around the network and encrypted at rest. I learnt the acronym WORM, which means write once, read many: backups stored this way can't be altered or encrypted by an attacker, making them more secure.
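On the testing and validation point above, this is a minimal sketch, in Python, of the kind of automated check a restore test might include: confirming that every restored file matches its source checksum and that the restore finished within the agreed RTO. The directory paths, the four-hour RTO, and the measured restore time in the usage line are all hypothetical, and a real test would be driven by your backup tooling rather than a standalone script.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore(source_dir: Path, restored_dir: Path,
                   rto_hours: float, restore_seconds: float) -> bool:
    """Check every source file has an identical restored copy and the restore met the RTO."""
    for src in source_dir.rglob("*"):
        if src.is_file():
            restored = restored_dir / src.relative_to(source_dir)
            if not restored.exists() or sha256_of(src) != sha256_of(restored):
                print(f"Mismatch or missing file: {restored}")
                return False
    return restore_seconds <= rto_hours * 3600

# Hypothetical usage: a restore that took 2.5 hours against a 4-hour RTO.
# ok = verify_restore(Path("/data/finance"), Path("/restore/finance"),
#                     rto_hours=4, restore_seconds=2.5 * 3600)
```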
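And on the RTO/RPO gap between what the business asked for in the business impact analysis and what IT can currently deliver, here is a minimal sketch, again in Python, of that gap comparison. The application names and the hour figures are invented purely for illustration.

```python
# Hours the business needs (from the BIA) versus hours IT can currently deliver.
# All applications and figures below are hypothetical examples.
requirements = {
    "finance-system": {"rto": 4,  "rpo": 1},
    "email":          {"rto": 8,  "rpo": 24},
    "crm":            {"rto": 24, "rpo": 4},
}
it_capability = {
    "finance-system": {"rto": 24, "rpo": 24},
    "email":          {"rto": 4,  "rpo": 1},
    "crm":            {"rto": 24, "rpo": 24},
}

for app, needed in requirements.items():
    delivered = it_capability.get(app)
    if delivered is None:
        print(f"{app}: no recovery capability recorded, a gap to investigate")
        continue
    rto_gap = delivered["rto"] - needed["rto"]
    rpo_gap = delivered["rpo"] - needed["rpo"]
    if rto_gap > 0 or rpo_gap > 0:
        print(f"{app}: RTO gap {max(rto_gap, 0)}h, RPO gap {max(rpo_gap, 0)}h, acknowledge and plan")
    else:
        print(f"{app}: requirements met")
```

However the comparison is done, the point is the same as in the list above: both sides acknowledge the gap, and the residual risk is signed off rather than discovered on the day of the disaster.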
I think it is important for practitioners to have a basic understanding of backups. They will be hugely important in recovering from an incident affecting your IT, especially a cyber incident. As you have seen, there is a whole technical layer, a science to it, and a multitude of products and services which vendors can provide to meet your requirements. I am just learning about this aspect of business continuity, so I might not have got everything right, and I'm happy to receive any comments correcting any mistakes!
If you would like to speak to someone more technically qualified about backups, email contact@databarracks.com or look on their website.