For the first part of this series, please read “Website Backup: Part 1: Planning”.

With the continuous growth of data stored within web applications, several free and paid backup options are now available in the market, and one can be chosen based on requirements and budget.

Methods to back up the files

Unless explicitly mentioned, the methods listed and explained below are generally used to back up all the required files except the database.

RAID (Redundant Array of Inexpensive Disks) – This is one of the standard backup methods, provided as a standard feature by most web hosting companies. In a mirrored RAID setup (RAID 1), the data on the main hard disk is replicated onto one or more additional hard disks. It generally protects only against hard disk failures: when the main hard disk fails, the data is still available on the other hard disks and can be easily recovered. However, it does not help in the case of accidental deletions or complete server crashes, because a deletion is mirrored along with everything else. Since the complete hard disk is mirrored, RAID also covers the database.

Backup server – A backup server is a server dedicated solely to taking backups, and it is usually used when the amount of data is huge. It may or may not be in the same location as the main server, and it may take backups from several servers (for various applications). Having and maintaining a backup server is expensive, and a dedicated one is not required when the amount of data being backed up is relatively small.

Offsite backup using FTP – This is a service now offered by most web hosting companies to their customers. The web hosting company sets up backup servers in locations far away from the actual servers (off site). Data is backed up automatically via FTP from the main server to the backup server at intervals set by the customer. When the customer purchases the offsite backup via FTP option, the data transfer, and sometimes the space used on the backup server, is unlimited, giving the customer the freedom to take multiple backups and snapshots whenever necessary without worrying about space or data transfer. Since the backup server is in a remote location, the backup is not affected by natural disasters or accidents (e.g. fire) in which the main server and its data may be lost.
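As a rough illustration of the automated transfer described above, the following Python sketch uploads one backup archive over FTP using the standard ftplib module. The function name upload_backup, the remote directory "backups", the host name and the credentials are all hypothetical placeholders, and the sketch assumes the remote directory already exists.

```python
from ftplib import FTP
from pathlib import Path


def upload_backup(ftp: FTP, archive_path, remote_dir="backups"):
    """Upload one archive over an already-connected, logged-in FTP session."""
    archive = Path(archive_path)
    ftp.cwd(remote_dir)  # assumption: the remote directory already exists
    with archive.open("rb") as fh:
        # STOR writes the file on the server under the same name
        ftp.storbinary(f"STOR {archive.name}", fh)
    return archive.name


# Example use (placeholder host and credentials, not real ones):
# with FTP("backup.example.com") as ftp:
#     ftp.login("backup_user", "secret")
#     upload_backup(ftp, "/var/backups/site.tar.gz")
```

A scheduler such as cron could run a script like this at whatever interval the customer chooses.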

Backup and send to E-mail – This method is useful only when the amount of data is very small, so it suits small blogs and content management websites. Using freely available custom scripts, daily backups of website files can be sent as an email attachment. The backups are stored in the inbox, which makes this the cheapest (free!) method of backup.
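A script of this kind might look roughly like the Python sketch below, which attaches an existing archive to a message and sends it through an SMTP server. The function names, the addresses and the default SMTP host "localhost" are assumptions made for illustration; a real mail provider will usually also require authentication and imposes an attachment size limit.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_backup_email(archive_path, sender, recipient):
    """Create an email with the backup archive attached."""
    archive = Path(archive_path)
    msg = EmailMessage()
    msg["Subject"] = f"Website backup: {archive.name}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("The latest website backup is attached.")
    msg.add_attachment(
        archive.read_bytes(),
        maintype="application",
        subtype="octet-stream",
        filename=archive.name,
    )
    return msg


def send_backup_email(msg, smtp_host="localhost"):
    """Send the message; assumes an SMTP server that accepts it without login."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```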

Backup via FTP / SCP – FTP is the standard protocol through which files are uploaded to and downloaded from the web host. Using free software such as FileZilla or WinSCP, one can connect to the web server via FTP or SCP, download the required files and store them on a hard disk, CD, DVD or any other storage medium. Even though this method of backup is primitive, it is very effective for small websites, especially static ones, where a weekly backup of the files serves the purpose.

Amazon S3 – Amazon S3 (a service of Amazon Web Services) is a cheap and effective pay-as-you-go online storage solution that can be used as a backup medium for a web application. Using freely available scripts, the data that needs to be backed up can be copied regularly into Amazon S3 storage, and it can be downloaded again quickly when a restore is needed. This is a very good option if the amount of data that needs to be backed up is huge.
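One possible shape for such a script is sketched below in Python. It assumes the third-party boto3 library is installed and that AWS credentials are already configured on the server; the bucket name, the "site-backups" prefix and both function names are hypothetical.

```python
from datetime import date
from pathlib import Path


def backup_key(archive_path, day, prefix="site-backups"):
    """Build a dated object key, e.g. site-backups/2024-01-31/site.tar.gz."""
    return f"{prefix}/{day.isoformat()}/{Path(archive_path).name}"


def upload_to_s3(archive_path, bucket, day=None):
    """Copy one archive into the given bucket under a dated key."""
    # Third-party dependency, imported here so backup_key works without it.
    import boto3

    key = backup_key(archive_path, day or date.today())
    boto3.client("s3").upload_file(str(archive_path), bucket, key)
    return key


# Example use (hypothetical bucket name):
# upload_to_s3("/var/backups/site.tar.gz", "my-backup-bucket")
```

Putting the date in the key keeps each day's copy separate, which makes it easier to purge old backups later.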

How to back up a MySQL database?

A database backup is different from a backup of website files. MySQL, the most widely used database for web applications, and most other database platforms can be easily backed up by taking a logical dump of the database, where the database structure and all the data stored in the database are converted to SQL statements and stored in a SQL file (a text file with a .sql extension). When the data needs to be restored, the SQL statements stored in the text file are run against the database, thereby restoring it.

The various methods available for backing up the database are:

mysqldump – This is a command line utility that ships with MySQL. As mentioned earlier, this tool exports the database data along with the database structure as SQL text files. Since mysqldump converts the database to a normal text file, the text file can be backed up using any of the methods used to back up the required files. There are widely available scripts that export the database to SQL and store it in specific folders. The stored files are in turn backed up using the method used for regular files.
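A minimal sketch of such an export script in Python is shown below. It assumes mysqldump is on the PATH and that the login details live in a MySQL option file (the path is a placeholder), so that no password appears on the command line. The function names and the database name in the example are hypothetical.

```python
import subprocess
from pathlib import Path


def mysqldump_command(database, output_file, defaults_file):
    """Build the mysqldump argument list for one database."""
    return [
        "mysqldump",
        f"--defaults-extra-file={defaults_file}",  # must be the first option
        "--single-transaction",  # consistent snapshot for InnoDB tables
        f"--result-file={output_file}",
        database,
    ]


def dump_database(database, backup_dir, defaults_file):
    """Export one database to <backup_dir>/<database>.sql."""
    output_file = Path(backup_dir) / f"{database}.sql"
    subprocess.run(
        mysqldump_command(database, output_file, defaults_file), check=True
    )
    return output_file


# Example use (placeholder paths and database name):
# dump_database("shop", "/var/backups/db", "/etc/mysql/backup.cnf")
```

The resulting .sql file can then be picked up by whichever file backup method is already in place.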

phpMyAdmin – This is similar to mysqldump in a way. The difference is that it is a web-based, menu-driven utility through which the database can be exported as a SQL file, downloaded and stored on a hard drive, CD, DVD or any other storage medium as required. The drawback is that this method cannot be used if the database is huge, i.e. contains a lot of data. Other tools similar to phpMyAdmin, such as SQLyog and SQL Manager, can be used for the same purpose.

Automatic MySQL backup and email script – Using freely available scripts, databases can be backed up as SQL text files, compressed and emailed as an attachment. This method is very cheap (free!); however, it can be used only when the database is very small.

Backup by Copying Table Files – This is a physical backup method (it requires shutting down the MySQL server) in which a raw copy is taken of the MySQL data directory (/var/lib/mysql), which contains the .frm and .myd files. These are the files MySQL uses to store data internally, so backing them up is equivalent to backing up the MySQL databases.

However, problems may occur in some cases during database restoration. Say the MySQL database server crashes, a new MySQL database server is installed and the backed up .myd and .frm files are restored. If the version of the MySQL server that crashed and the version that has been re-installed do not match, or other configurations do not match exactly, restoring the .myd and .frm files may not actually work, resulting in data loss.
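For illustration, the raw copy could be taken with a short Python script like the sketch below, which packs the data directory into a compressed archive. It assumes the MySQL server has already been stopped and that the script has permission to read the directory; the function name and the archive path are hypothetical.

```python
import tarfile
from pathlib import Path


def archive_data_dir(data_dir, archive_path):
    """Pack a (stopped) MySQL data directory into a gzip-compressed tar file."""
    data_dir = Path(data_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store paths relative to the directory name, e.g. mysql/shop/users.frm
        tar.add(data_dir, arcname=data_dir.name)
    return archive_path


# Example use (placeholder archive path):
# archive_data_dir("/var/lib/mysql", "/var/backups/mysql-raw.tar.gz")
```

Given the restoration problems described above, it may be worth noting the MySQL server version alongside each archive.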

MySQL Replication Slaves – This method is expensive but highly effective when the database of the application is huge and has several transactions happening at any given time (high traffic). It is similar to the mirroring seen in RAID, but the backup data is stored on another MySQL server. Here a replication slave MySQL server maintains a mirror copy of the main database. The mirror is updated in near real time, so data that is current almost to the second is saved on the replication slave.

Since the replication is near real time, the slave server can, if required, be exported as a SQL file and backed up, which amounts to backing up the master database.

Types of Backup

Backed up files can be stored in different ways. Each type of backup stores files differently and has its own uses, advantages and disadvantages.

Full Backup – A full backup can be termed a normal backup. In a full backup all the files are copied from the original location to the backup location, so at the end of the backup a complete copy of the original location is available in the backup location. The disadvantage of this method is that the space required grows by the full size of the source every time a backup is made. In practice a full backup is often taken only the first time a backup operation is done, and subsequent backups are incremental.

Incremental Backup – An incremental backup backs up only the files that were modified or changed after the last backup (which may have been either a full backup or an incremental backup). This saves time and cost because each backup takes only the space of the modified files, not of all files.

Differential Backup – This is slightly different from an incremental backup. A differential backup backs up the files that were modified or changed since the last full backup (not the last incremental backup). A restore therefore needs only the last full backup and the latest differential backup.

Mirror Backup – A mirror backup is similar to a full backup, with slight differences. In a full backup the files are generally compressed, whereas a mirror backup copies them without compression. Also, an incremental or differential backup of only the modified files is not possible after a mirror backup.
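The difference between full, incremental and differential backups comes down to which cutoff time is used when choosing files. The Python sketch below shows that selection rule only; the function name, the plain-number timestamps and the file names in the example are hypothetical, and real backup tools may track changes in other ways.

```python
def files_to_back_up(mtimes, last_full, last_backup, kind):
    """Return the paths a backup run of the given kind would copy.

    mtimes: mapping of path to last-modified timestamp
    last_full: timestamp of the most recent full backup
    last_backup: timestamp of the most recent backup of any type
    kind: "full", "incremental" or "differential"
    """
    if kind == "full":
        return sorted(mtimes)  # everything, regardless of modification time
    if kind == "incremental":
        cutoff = last_backup  # changes since the last backup of any type
    elif kind == "differential":
        cutoff = last_full  # changes since the last full backup
    else:
        raise ValueError(f"unknown backup type: {kind}")
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)


# Example: full backup at time 10, an incremental backup at time 20.
mtimes = {"index.php": 5, "logo.png": 12, "style.css": 25}
print(files_to_back_up(mtimes, 10, 20, "incremental"))   # ['style.css']
print(files_to_back_up(mtimes, 10, 20, "differential"))  # ['logo.png', 'style.css']
```

In the example, logo.png changed after the full backup but before the last incremental one, so only the differential run picks it up again.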

Backup Strategy along with backup schedule

Backups become outdated very quickly as the live data keeps changing: data is added, removed or modified. A backup that is several weeks old may be obsolete and of no use at all. It is absolutely necessary to formulate a backup strategy and schedule so that, at the time of data loss, the most recent data is available in the backup, resulting in minimal or no loss of data.

The application owners, the implementation team or the person in charge must have a clear backup strategy. The strategy can be formed by taking the following factors into account.

  • Priorities: Rank the data by its importance. The database and source files should be the top priority.
  • Data retention policy: How long an old backup should be stored before being purged.
  • Size estimate: The size of the backup required has to be estimated before the type and medium of backup are decided.
  • Backup type planning: For example, after how many incremental or differential backups a full backup would be executed.
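The size estimate can be approached with a small calculation such as the Python sketch below. It measures the current size of a directory and then estimates the storage needed when each backup cycle consists of one full backup followed by a number of incremental backups. The function names and the figures in the example are hypothetical, and the formula ignores compression.

```python
import os


def directory_size_bytes(root):
    """Total size of all regular files under root, skipping symbolic links."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def storage_needed(full_size, change_per_backup, incrementals_per_cycle, cycles_retained):
    """Estimate storage for retained cycles of one full plus N incremental backups."""
    per_cycle = full_size + change_per_backup * incrementals_per_cycle
    return per_cycle * cycles_retained


# Example: a 1000 MB site, 50 MB changing per day, a weekly full backup with
# 6 daily incrementals, and 4 weeks retained.
print(storage_needed(1000, 50, 6, 4))  # 5200
```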

The most important part of the backup strategy is the schedule: how often and when the backup will be executed, and who will execute it (if it’s not automated).

As a general rule:

  • Important application data, such as user-uploaded files and databases, must be backed up every day or night or both, but less important source files which don’t often change can be backed up every 2 days or once a week.
  • Apart from the regular backups on one storage medium, a secondary backup of the primary/regular backup must be taken on a weekly or bi-weekly basis, at least for business-critical data. This adds a second layer of protection in case the backup medium or server fails.
  • At least the 3 most recent versions of a backup should be retained before purging. For example, the 1st, 2nd and 3rd daily backups must remain until the end of the 4th.
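The retention rule in the last point can be expressed as a small purge check, sketched below in Python. It assumes backup file names contain a sortable date, so that alphabetical order matches age; the function name and the file names in the example are hypothetical.

```python
def backups_to_purge(backup_names, keep=3):
    """Return the backups that may be deleted, keeping the newest `keep`.

    Assumes names sort chronologically, e.g. site-2024-01-05.tar.gz.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(backup_names)
    return ordered[:-keep]  # everything except the newest `keep` entries


# Example: once the 4th daily backup exists, only the 1st may be purged.
names = [
    "site-2024-01-01.tar.gz",
    "site-2024-01-02.tar.gz",
    "site-2024-01-03.tar.gz",
    "site-2024-01-04.tar.gz",
]
print(backups_to_purge(names))  # ['site-2024-01-01.tar.gz']
```

With three or fewer backups present, the function returns an empty list, so nothing is purged too early.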

Several software products, free and commercial, are widely available that automate the whole process of backing up data: everything from the type of backup to what should be backed up, when it should be backed up and even the purging schedule. It is advisable to have such a setup in place to ensure trouble-free and smooth backups. No one ever knows when data loss can occur, so for data, like everything else in life, “it’s always necessary to have a backup”.