Should I use differential or incremental backups?

While not strictly a Backup Exec question, this gets asked a lot. So what are the advantages and disadvantages of each?

For simplicity, we’ll assume you are backing up files based on the archive bit. This is a marker on each file that gets “turned on” when the file is created or modified. In Windows you can see it by viewing a file’s advanced properties (it’s called “File is ready for archiving”).

When you perform a full backup, every file included in your selection lists (apart from those excluded by active file exclusion) is backed up, and the archive bit is “turned off” on each file.

When you perform an incremental backup, every file included in your selection lists (apart from those excluded by active file exclusion) that has the archive bit “turned on” is backed up, and the archive bit is “turned off” on each file.

When you perform a differential backup, every file included in your selection lists (apart from those excluded by active file exclusion) that has the archive bit “turned on” is backed up, but the archive bit is not changed at all.
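
The three rules above can be sketched as a small Python model. This is only an illustration of the archive-bit logic described here, not how Backup Exec is implemented; the function name run_backup and the file names are made up for the example, and file exclusions are ignored.

```python
def run_backup(files, kind):
    """Simulate one backup run over a selection list.

    files: dict mapping file name -> archive bit (True means "turned on").
    kind:  "full", "incremental" or "differential".
    Returns the sorted list of file names that were backed up.
    """
    if kind not in ("full", "incremental", "differential"):
        raise ValueError("unknown backup kind: " + kind)

    if kind == "full":
        # A full backup takes every selected file, whatever its archive bit.
        selected = sorted(files)
    else:
        # Incremental and differential take only files with the bit on.
        selected = sorted(name for name, bit in files.items() if bit)

    if kind in ("full", "incremental"):
        # Full and incremental turn the archive bit off afterwards.
        for name in selected:
            files[name] = False
    # A differential leaves the archive bit untouched.
    return selected


files = {"a.doc": True, "b.xls": True, "c.txt": True}

full_run = run_backup(files, "full")        # all three files
files["a.doc"] = True                       # a.doc is modified
diff_1 = run_backup(files, "differential")  # ['a.doc']
files["b.xls"] = True                       # b.xls is modified
diff_2 = run_backup(files, "differential")  # ['a.doc', 'b.xls']
incr_1 = run_backup(files, "incremental")   # ['a.doc', 'b.xls']
incr_2 = run_backup(files, "incremental")   # [] (bits were turned off)
```

Note how the second differential still includes a.doc, because the first differential did not turn its archive bit off, while the second incremental finds nothing left to back up.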

This means that instead of performing full backups each night, you can avoid backing up the data that has not changed. This saves time and storage space.

Let’s look at three typical examples:
Company A, Company B and Company C each have 100GB of data to back up, and each day 1GB of that data changes.
Company A performs a full backup each night.
Company B performs a full backup on each Monday night, and an incremental backup each other night.
Company C performs a full backup on each Monday night, and a differential backup each other night.

Company A will back up 100GB of data each night.
Company B will back up 100GB of data each Monday night, and 1GB of data (the changed files) each other night.
Company C will back up 100GB of data each Monday night, and 1GB of data on the Tuesday night. On the Wednesday night, between 1GB and 2GB of data will be backed up, depending upon whether the files changed on Wednesday were the same files changed on Tuesday. By Friday, between 1GB and 4GB of data will be backed up.
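
The arithmetic for the three companies can be checked with a rough Python sketch. It assumes a Monday to Friday schedule and only the two extreme cases for the differential (the same files change every day, or entirely different files change every day); the function name nightly_backup_gb and its parameters are invented for this example.

```python
def nightly_backup_gb(strategy, total_gb=100, daily_change_gb=1,
                      nights=5, same_files_each_day=False):
    """Return the amount of data (in GB) backed up on each night.

    Night 0 is the Monday full backup; the later nights follow `strategy`.
    """
    if strategy not in ("full", "incremental", "differential"):
        raise ValueError("unknown strategy: " + strategy)

    sizes = []
    for night in range(nights):
        if strategy == "full" or night == 0:
            sizes.append(total_gb)
        elif strategy == "incremental":
            # Only the changes since the previous night.
            sizes.append(daily_change_gb)
        elif same_files_each_day:
            # Differential, best case: the same 1GB changes every day.
            sizes.append(daily_change_gb)
        else:
            # Differential, worst case: everything changed since Monday.
            sizes.append(daily_change_gb * night)
    return sizes


company_a = nightly_backup_gb("full")          # [100, 100, 100, 100, 100]
company_b = nightly_backup_gb("incremental")   # [100, 1, 1, 1, 1]
company_c_worst = nightly_backup_gb("differential")   # [100, 1, 2, 3, 4]
company_c_best = nightly_backup_gb("differential",
                                   same_files_each_day=True)  # [100, 1, 1, 1, 1]
```

Over the week this comes to 500GB for Company A, 104GB for Company B, and between 104GB and 110GB for Company C.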

So Companies B and C will spend far less time (and resources) backing up than Company A. Whether Company B spends much less time backing up than Company C depends upon their data usage.

So how does this affect the restore process?

Let’s assume Companies A, B and C all have a catastrophic failure on a Monday morning.

Company A can restore from the Friday night media. Nothing else is needed. If Friday’s tape won’t restore for some reason, they can try Thursday’s tape.
Company B will need to restore from the previous Monday night backup. Then the Tuesday, Wednesday, Thursday and Friday backups will each need to be restored (in order).
Company C will need to restore from the previous Monday night backup, then restore the Friday night backup.

Obviously Company A’s recovery will be quickest, followed by Company C, then Company B.
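
As a sketch of why the restore effort differs, the following Python function lists which backups each company would need, in order. It assumes the first entry in the list is the full backup and the rest follow the given strategy; restore_chain and the day labels are hypothetical names used only for illustration.

```python
def restore_chain(strategy, backups):
    """Return the backups that must be restored, in restore order.

    backups: list of backup labels in the order they were taken;
             the first one is assumed to be the full backup.
    """
    if not backups:
        raise ValueError("no backups available")

    if strategy == "full":
        # Every night is a full backup, so the latest one is enough.
        return [backups[-1]]
    if strategy == "incremental":
        # The full backup plus every incremental since, in order.
        return list(backups)
    if strategy == "differential":
        # The full backup plus only the most recent differential.
        if len(backups) == 1:
            return [backups[0]]
        return [backups[0], backups[-1]]
    raise ValueError("unknown strategy: " + strategy)


week = ["Mon", "Tue", "Wed", "Thu", "Fri"]

chain_a = restore_chain("full", week)          # ['Fri']
chain_b = restore_chain("incremental", week)   # ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
chain_c = restore_chain("differential", week)  # ['Mon', 'Fri']
```

The length of each chain also shows how many pieces of media have to be readable: one for Company A, two for Company C and five for Company B.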

There remains a potential problem, however, for Companies B and C. If a file has been deleted, this information does not get stored on the backup tape (unless you use the Advanced Disk-based Backup Option, but we’ll ignore that).
If a file existed on the Monday night for Company C, but was deleted during the week, it will be restored from the Monday night backup, and when the Friday backup has been restored, the restored file will still exist, even though it wasn’t there when the Friday night backup was taken.
Similarly, if a file existed on any weeknight for Company B but was deleted later in the week, it will be restored from the Monday night backup or from the backup of the day it was created, and when the Friday backup has been restored, the restored file will still exist, even though it wasn’t there when the Friday night backup was taken.

As a result, Company C may have some files around which they don’t expect to be there. Company B may have many more! If you have an application which relies on the presence or absence of files, this may cause issues.
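
The deleted-file effect can be reproduced with a toy Python model in which each backup is just a dictionary of file names and contents. This is a simplified sketch of Company C’s case under the assumptions above (deletions are not recorded on the backup); the file names and the restore function are invented for the example.

```python
def restore(backup_sets):
    """Restore backups in order; later copies overwrite earlier ones.

    Nothing in a backup set records a deletion, so a file that appears
    in any set will exist after the restore.
    """
    restored = {}
    for backup in backup_sets:
        restored.update(backup)
    return restored


# Monday night full backup: both files exist.
monday_full = {"report.doc": "v1", "temp.lock": "v1"}

# During the week temp.lock is deleted and report.doc is modified,
# so the Friday differential only contains report.doc.
friday_differential = {"report.doc": "v2"}

# What was really on disk when the Friday night backup was taken.
actual_friday_state = {"report.doc": "v2"}

restored = restore([monday_full, friday_differential])
unexpected = sorted(set(restored) - set(actual_friday_state))  # ['temp.lock']
```

Here report.doc comes back at its Friday version, but temp.lock comes back too, even though it no longer existed on Friday.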