Backups and Monitoring

Host Server Backup Considerations

The two most important assets of a TeamDrive Host Server are the storage volumes that host the actual TeamDrive Spaces as well as the MySQL database that stores the related meta data.

The backup schedule depends on the number of users, their activity and your recovery point objective. We recommend running a backup at least once a day. The backups should be stored safely on another system.

Ideally, the time and frequency of the Host Server backup should be synchronized with the backup schedule used on the associated Registration Server — this ensures that the information about Users and their Space Depots is consistent across these systems.

In a virtualized environment, the usage of VM snapshots is highly recommended, as these provide atomic and instant full-system copies across multiple instances that can be backed up offline.

The backup of the Host Server’s Space Volume(s) can be performed by any given file system backup tool.

When planning a backup of the volume containing the TeamDrive Spaces, keep in mind that the last.log files, located in the protolog directory of each Space directory, are frequently updated by the TeamDrive Clients. New Space data and events are constantly appended to these files. When a log file reaches a certain size (currently 8MB, but this value is not fixed and could change in later versions of TeamDrive), it is renamed and a new last.log file is created. This operation is initiated by the Clients. The naming scheme is to rename last.log to <number>.log, where <number> is the next free number, starting from 0. Previously renamed log files are no longer modified, but they must remain available to the Clients, since these logs are read when a Space is joined.
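
The naming scheme described above can be illustrated with a small shell function. This is only a sketch for inspecting a backup or a live volume: the rename itself is performed by the TeamDrive Clients, and the function name next_log_name is hypothetical.

```shell
#!/bin/sh
# Sketch: print the name that a full last.log would be renamed to, i.e. the
# lowest <number>.log that does not exist yet in the given protolog directory.
# This only inspects the directory; it never renames anything.
next_log_name() {
    protolog_dir=$1
    n=0
    while [ -e "$protolog_dir/$n.log" ]; do
        n=$((n + 1))
    done
    printf '%s.log\n' "$n"
}
```

For a protolog directory that contains 0.log, 1.log and last.log, the function prints 2.log. Every numbered file below that name is closed and will not change between two backup runs; only last.log is still being written.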

To create a consistent backup, the best approach is to take a snapshot of the entire Space Volume file system, preferably after shutting down the Apache HTTP Server. If you are using an incremental backup method like rsync, keep in mind that some Spaces may change while the rsync job is still running. For consistency, we suggest performing a full rsync run while the service is running (to sync the bulk of the changes), then briefly changing the volume’s status to Standby or shutting down the Apache HTTP Server and running rsync once more to transfer the changes that occurred in the meantime. Once the second rsync run has finished, the Apache HTTP Server can be started again, or the volume switched back from Standby.
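
The two-pass rsync approach could be scripted roughly as follows. This is a minimal sketch under several assumptions: the source and destination paths are placeholders, Apache is assumed to be controlled via apachectl (your distribution may use a service manager instead), and the helper names run and two_pass_backup are hypothetical. With DRY_RUN=1 the commands are only printed, which allows checking the sequence before running it on a production system.

```shell
#!/bin/sh
# Two-pass rsync sketch. Paths and the apachectl calls are assumptions;
# adjust them to your installation.
run() {
    # Print the command instead of executing it when DRY_RUN=1.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

two_pass_backup() {
    src=$1     # e.g. /spacedata/vol01/
    dest=$2    # e.g. /backup/vol01/ (hypothetical)
    # Pass 1: copy the bulk of the data while the service keeps running.
    run rsync -a "$src" "$dest" || return 1
    # Stop Apache so that no Client can modify the Spaces during pass 2.
    run apachectl stop || return 1
    # Pass 2: transfer only what changed while pass 1 was running.
    run rsync -a "$src" "$dest"
    rc=$?
    # Always restart Apache, even if the second pass failed.
    run apachectl start
    return "$rc"
}
```

A call such as DRY_RUN=1 two_pass_backup /spacedata/vol01/ /backup/vol01/ prints the four commands in order. The second pass is usually short, because rsync only transfers files that changed since the first pass, which keeps the downtime small.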

The MySQL databases must also be backed up periodically, ideally at the same time the Space Volume(s) are being backed up. This ensures a consistent snapshot of the file system and the related meta data included in the MySQL database.

The Host Server’s MySQL databases that need to be backed up are named pspace and (optionally) hostapilog. They use MySQL’s InnoDB storage engine to provide transaction support, fast recovery and consistency. Unless API logging is enabled, the Host Server’s MySQL databases are usually quite small.

The MySQL backup can be performed using any established MySQL backup method, e.g. running a mysqldump via a cron job, or using more sophisticated tools like Percona XtraBackup or Oracle’s MySQL Enterprise Backup. Other commercial backup solutions usually offer MySQL-specific plugins or extensions as well.

An example MySQL backup job using mysqldump could look as follows. The SQL dump is piped through gzip for compression before it is written to the directory /backup, using a time stamp in the file name:

[root@hostserver ~]# mysqldump -u root -p --single-transaction \
--databases pspace hostapilog \
| gzip > /backup/td-hostserver-mysql-$(date +%Y-%m-%d_%H.%M).sql.gz
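
The command above prompts for a password, which does not work from a cron job. A non-interactive variant could look like the following sketch. It assumes that the MySQL credentials are stored in an option file (the path /root/.my.cnf is an assumption), and the names dump_hostserver_db, MYSQLDUMP and DEFAULTS_FILE are hypothetical. The dump is written uncompressed first so that a failing mysqldump is detected reliably; in a plain pipeline the shell would only report the exit status of gzip.

```shell
#!/bin/sh
# Cron-friendly dump sketch. DEFAULTS_FILE is an assumed location of a MySQL
# option file holding the credentials; MYSQLDUMP can be overridden.
MYSQLDUMP=${MYSQLDUMP:-mysqldump}
DEFAULTS_FILE=${DEFAULTS_FILE:-/root/.my.cnf}

dump_hostserver_db() {
    backup_dir=$1
    stamp=$(date +%Y-%m-%d_%H.%M)
    sql="$backup_dir/td-hostserver-mysql-$stamp.sql"
    # --defaults-extra-file must be the first option on the command line.
    # Remove hostapilog from the list if API logging is not in use.
    if ! "$MYSQLDUMP" --defaults-extra-file="$DEFAULTS_FILE" \
            --single-transaction --databases pspace hostapilog > "$sql"; then
        rm -f "$sql"    # never keep a partial dump
        return 1
    fi
    # Compress, verify the archive, then print its name for logging.
    gzip -f "$sql" && gzip -t "$sql.gz" && printf '%s\n' "$sql.gz"
}
```

The function could then be called from a script that cron runs once a day, for example with /backup as the target directory.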

See the MySQL documentation at https://dev.mysql.com/doc/refman/5.1/en/backup-and-recovery.html for more details and hints on how to define a MySQL backup strategy.

If the I/O overhead introduced by running the backup job on the production database is a concern, we recommend setting up a MySQL replication slave on another host and performing the backup there. This second MySQL instance can also function as a hot standby server for high-availability purposes.

More details about MySQL replication and high availability can be found in the MySQL reference manual at https://dev.mysql.com/doc/refman/5.1/en/replication.html and https://dev.mysql.com/doc/refman/5.1/en/ha-overview.html.

In addition to the Space Volumes and MySQL databases, we recommend creating backup copies of the Server’s configuration files. Please refer to the TeamDrive Host Server Installation Guide for details on the relevant configuration files.

These files should be backed up at least every time you change them. The backups can be performed using any file-based backup method, such as tar or rsync, or more sophisticated backup tools like Amanda or Bacula.
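
A tar-based configuration backup could be as simple as the following sketch. The function name backup_configs and the archive name are hypothetical, and the list of files is deliberately left to the caller, since the relevant files are listed in the Installation Guide and are not assumed here.

```shell
#!/bin/sh
# Sketch: archive a caller-supplied list of configuration files into a
# timestamped, gzip-compressed tarball and print the archive name.
backup_configs() {
    dest_dir=$1
    shift    # the remaining arguments are the files or directories to save
    archive="$dest_dir/td-hostserver-config-$(date +%Y-%m-%d_%H.%M).tar.gz"
    tar -czf "$archive" "$@" && printf '%s\n' "$archive"
}
```

The first argument is the backup directory and all further arguments are the configuration files to include. Running the function after each configuration change keeps a dated history of the settings.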

Restoring individual Spaces or Volumes

If the data of a single Space or a complete Volume is corrupted or lost, it is possible to restore the Space or Volume data from a previously created backup.

An example scenario would be a Space that was entirely deleted by a user by accident, or the recovery of a file that was moved to the Space’s Trash Folder and the Trash was then emptied.

Note

Note that it is not possible to restore an individual file from a particular Space on the Host Server: due to the client-side encryption, the correct file cannot be identified on the server side. However, it is possible to restore the entire state of a Space and all of its files to a previous version, which will allow the user to extract the missing file(s) on the client side.

An additional challenge is identifying the Space(s) you want to restore; by default, Space names are not stored on the Host Server and are only referenced by their ID. Take extra caution and double check you’re working on the correct Space.

The process of restoring a Space or Volume involves the following steps:

  1. Identify the ID of the Space or Volume you want to restore.
  2. Deactivate the Space or all Spaces of a Volume by setting the “Deactivated for restore” status, to prevent TeamDrive Clients from accessing the affected TeamDrive Spaces.
  3. Restore the Space(s) by restoring the necessary Space directory or entire volume directory from your backup to the corresponding location.
  4. Reactivate the Space(s) to make it/them available to the TeamDrive Clients again.
  5. The Clients will be notified that a Space recovery is required, which should be performed according to the procedure outlined in the TeamDrive Client documentation.

If a restore of a single Space is required, the task of (de-)activating it can be performed via the Host Server’s Administration Console. Open the Space Details page, check the Deactivated for restore checkbox and click Save to change the state.

After the restore has finished, click Complete Restore to re-enable the Space.

Figure: Host Server Administration Console: Restore Space

Alternatively, the Space deactivation and reactivation can be performed on the command line, as outlined below.

If you need to restore an entire volume, all Spaces contained in this volume need to be marked for restore. You can use a script included in the Host Server installation, which performs the tasks of deactivating and reactivating all Spaces on the affected volume.

In any case, the actual task of restoring the Space(s) from backup has to be performed manually by the administrator, and the TeamDrive Clients will have to perform a local Space recovery to get the local Spaces back into a consistent state.

The examples describe the usage of the “Restore Script” used to restore an individual Space as well as an entire Space Volume.

Load and start the Restore Script

The Restore Script has to be executed using yvva, the command-line shell of the Yvva Runtime Environment:

[root@hostserver ~]# yvva
Welcome to yvva shell (version 1.2.0).
Enter "go" or end the line with ';;' to execute submitted code.
For a list of commands enter "help".

> execute file 'RestoreSpace.pbt' location 'setup/scripts/Restore';;

Usage:
-------
[] List volumes available for restore:
        RestoreSpace:volumes();
[] List spaces availbale for restore:
        RestoreSpace:spaces();
[] Deactivate volume before restoring:
        RestoreSpace:deactivateVolume(<ID>);
        i.e. 'RestoreSpace:deactivateVolume(5)'; to deactivate volume with ID 5.
[] Deactivate space before restoring:
        RestoreSpace:deactivateSpace(<ID>);
        i.e. 'RestoreSpace:deactivatSpace(7);' to deactivate space with ID 6.
[] Reactivate volume after restoring:
        RestoreSpace:reactivateVolume(<ID>);
        i.e. 'RestoreSpace:reactivateVolume(5);' to reactivate volume with ID 5.
[] Reactivate space after restoring:
        RestoreSpace:reactivateSpace(<ID>);
        i.e. 'RestoreSpace:reactivateSpace(7);' to reactivate space with ID 7.

Loading the Restore Script does not execute any task yet; it only displays the short usage information shown above. You can execute a task by entering its name and providing the respective Volume or Space ID in parentheses. Finish the command with two consecutive semicolons to execute the submitted code immediately.

Identify the Space you want to Restore

Usually, the ID of the Space to restore should be obtained from the TeamDrive Client. The Space Information displayed in the Client window for each Space contains a Space ID field that holds the ID used on the Host Server. The actual Space data is stored in a subdirectory below the Space volume, using the Space ID as the directory name.
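
Given a Space ID, the matching directory can be located with a short shell function. The sketch below assumes the layout <root>/<volume>/<Space ID>, as in the path /spacedata/vol01/3 used in this section; the function name find_space_dir is hypothetical.

```shell
#!/bin/sh
# Sketch: print the directory of a Space, searching all volumes below a
# common root. Assumes the layout <root>/<volume>/<Space ID>.
find_space_dir() {
    root=$1
    space_id=$2
    for dir in "$root"/*/"$space_id"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1    # no volume contains a directory with this ID
}
```

For example, find_space_dir /spacedata 3 would print /spacedata/vol01/3 if that directory exists. The same lookup can be run against the backup location to confirm that a backup copy of the Space is available before it is deactivated.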

To list all available Spaces on the Hosting Service you can execute the command RestoreSpace:spaces():

> RestoreSpace:spaces();;
.----------------------------------------------------------------------------.
| Spaces                                                                     |
|----------------------------------------------------------------------------|
| Volume | ID     | Title                         | ResID | Status           |
|--------|--------|-------------------------------|-------|------------------|
| vol01  | 3      |                               | 2     | 0                |
| vol01  | 4      |                               | 3     | 0                |
'----------------------------------------------------------------------------'

A list of all active Spaces will be displayed which should look similar to the list shown above. Identify the ID of the Space you want to restore.

Deactivate the Space to Restore

After identifying the Space you want to restore, you have to deactivate it by providing the ID to the command RestoreSpace:deactivateSpace(<ID>):

> RestoreSpace:deactivateSpace(3);;
140610 13:49:56 [Note] Deactivate space [3]
140610 13:49:56 [Note] Space with ID '3' deactivated.

In the example above the Space with the ID ‘3’ has been deactivated. After a short while, the Client will notice this change and mark the Space accordingly on its side.

Restore Backup

After deactivating a Space, you can now restore its data by copying the backup of that Space into the corresponding location on the Space volume, e.g. /spacedata/vol01/3 in our case.
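
The copy step could be scripted as shown below. This is a sketch, not a prescribed procedure: it moves the current (damaged) Space directory aside instead of deleting it, so that the step can be undone, and then copies the backup into place. The function name restore_space_dir and all example paths other than /spacedata/vol01/3 are hypothetical.

```shell
#!/bin/sh
# Sketch: replace a deactivated Space directory with its backup copy.
restore_space_dir() {
    backup_dir=$1    # e.g. /backup/vol01/3 (hypothetical)
    space_dir=$2     # e.g. /spacedata/vol01/3
    aside_dir=$3     # where the current data is parked; must not exist yet
    if [ ! -d "$backup_dir" ]; then
        echo "no backup found at $backup_dir" >&2
        return 1
    fi
    if [ -e "$aside_dir" ]; then
        echo "$aside_dir already exists" >&2
        return 1
    fi
    # Park the current data instead of deleting it.
    if [ -e "$space_dir" ]; then
        mv "$space_dir" "$aside_dir" || return 1
    fi
    # -a preserves ownership, permissions and timestamps of the backup.
    cp -a "$backup_dir" "$space_dir"
}
```

Run the function only while the Space is deactivated, and remove the parked copy once the Clients have completed their local Space recovery.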

Reactivate Space

After copying the backup to the deactivated Space, you have to reactivate the Space which makes it available to the TeamDrive Clients again. To reactivate a certain Space you have to execute the command RestoreSpace:reactivateSpace(<ID>):

> RestoreSpace:reactivateSpace(3);;
140610 13:54:07 [Note] Reactivate space [3]
140610 13:54:07 [Note] /spacedata/vol01/3
140610 13:54:07 [Note] Space '3' reactivated successful. Restore ID: 3.
Restore Log No: 0, Restore Log Offset: 855

In the example above the Space with the ID ‘3’ has been reactivated.

The Client will now notify the user that the Space has been reactivated and that a local Space recovery operation has to be performed.

Identify the Volume you want to Restore

To identify all available volumes on the Hosting Service you have to execute the command RestoreSpace:volumes():

> RestoreSpace:volumes();;
.--------------------------------------------------------.
| Volumes                                                |
|--------------------------------------------------------|
| ID    | Name  | Status                                 |
|-------|-------|----------------------------------------|
| 1     | vol01 | Operational                            |
| 2     | vol02 | Operational                            |
'--------------------------------------------------------'

A list with the volumes will be displayed which should look similar to the list shown above. Identify the ID of the volume you want to restore.

Deactivate the Volume to Restore

After identifying the volume you want to restore, you have to deactivate the volume with all its Spaces. To deactivate a volume with a certain ID you have to execute the command RestoreSpace:deactivateVolume(<ID>):

> RestoreSpace:deactivateVolume(1);;
140603 17:13:21 [Note] Deactivate volume [1]
140603 17:13:21 [Note] 2 Spaces of Volume 'vol01' left to deactivate.
140603 17:13:21 [Note] Space with ID '3' deactivated.
140603 17:13:21 [Note] Space with ID '4' deactivated.

In the example above the volume with the ID ‘1’ and all the Spaces within that volume have been deactivated.

Restore Backup

After deactivating a volume, you can now restore its data by copying the backup of that volume into the corresponding location.

Reactivate Volume

After copying the backup to the deactivated Volume, you have to reactivate the Volume which makes the Spaces available to the TeamDrive Clients again. To reactivate a certain Volume you have to execute the command RestoreSpace:reactivateVolume(<ID>):

> RestoreSpace:reactivateVolume(1);;
140603 17:15:37 [Note] Reactivate volume [1]
140603 17:15:37 [Note] Reactivate space [3]
140603 17:15:37 [Note] /spacedata/vol01/3
140603 17:15:37 [Note] Space '3' reactivated successful. Restore ID: 2.
Restore Log No: 0, Restore Log Offset: 855
140603 17:15:37 [Note] Reactivate space [4]
140603 17:15:37 [Note] /spacedata/vol01/4
140603 17:15:37 [Note] Space '4' reactivated successful. Restore ID: 2.
Restore Log No: 0, Restore Log Offset: 384

In the example above all Spaces on the Volume with the ID ‘1’ have been reactivated.

Exit the yvva session

You can close the yvva session by typing quit or pressing Ctrl+D at the > prompt.

Setting up Server Monitoring

It’s highly recommended to set up some kind of system monitoring, to receive notifications in case of any critical conditions or failures.

Since the TeamDrive Host Server is based on standard Linux components like the Apache HTTP Server and the MySQL database, almost any system monitoring solution can be used to monitor the health of these services.

We recommend using Nagios or a derivative like Icinga or Centreon. Other well-established monitoring systems like Zabbix or Munin will also work. Most of these offer standard checks to monitor CPU usage, memory utilization, disk space (especially the file systems providing the TeamDrive Space Volumes) and other critical server parameters.
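
Where no standard check fits, a custom one is easy to add. The following sketch shows a Nagios-style disk check for a Space Volume. It follows the usual plugin convention for exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function names and the default thresholds of 80 and 90 percent are assumptions to be adapted.

```shell
#!/bin/sh
# Nagios-style plugin sketch: report the fill level of a file system.
classify_usage() {
    pct=$1; warn=$2; crit=$3
    if [ "$pct" -ge "$crit" ]; then
        echo CRITICAL; return 2
    elif [ "$pct" -ge "$warn" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}

check_volume() {
    mount_point=$1
    # In "df -P" output the fifth column of the second line is the
    # capacity in percent, e.g. "42%".
    pct=$(df -P "$mount_point" 2>/dev/null |
        awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    if [ -z "$pct" ]; then
        echo "UNKNOWN - cannot read usage of $mount_point"
        return 3
    fi
    state=$(classify_usage "$pct" "${2:-80}" "${3:-90}")
    rc=$?
    echo "DISK $state - $mount_point is ${pct}% full"
    return "$rc"
}
```

A wrapper script would call check_volume with the mount point of the Space Volume, for example /spacedata/vol01, and exit with the function's return code so that the monitoring system can evaluate it.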

In addition to these basic system parameters, the existence and operational status of the following services/processes should be monitored: