.. _tshs:

TeamDrive Scalable Hosting Storage
==================================

If you require a scalable hosting system that grows with the number of
users, you have two alternatives:

* Use a scalable file system that allows multiple access points, or,
* Use TSHS, which stores the data in a cluster of MySQL databases.

The TeamDrive Scalable Hosting Storage, TSHS, is a scalable storage system
for the TeamDrive Hosting Service based on the MySQL database. It is an
alternative to the standard file system based TeamDrive Hosting Service.

If you are not sure of your requirements, you can begin with file system
based storage. If your requirements outgrow a file system based
configuration, you can upgrade to TSHS at any time. How to do this is
described in the section :ref:`upgradetshs`.

.. note:: TSHS is only supported on version 3.0.013 (or later) of the
   TeamDrive Hosting Service.

By default, the TeamDrive Hosting Service stores files in the local file
system. When TSHS is enabled, the Hosting Service stores files in a cluster
of MySQL Servers (not to be confused with MySQL Cluster NDB). Each server in
the cluster stores a partition (known as a "shard") of the data. TSHS refers
to such a database server as a "Storage Node".

.. figure:: ../images/TeamDrive-HostServer-Overview-TSHS.png

   TeamDrive Scalable Hosting Storage (TSHS)

One of the databases in the cluster contains the administration data for the
cluster, and is known as the "Admin Node". The Admin Node contains a list of
the Storage Nodes in the cluster, and is the starting point for connecting
to the cluster.

Scale-out can be achieved by placing each database (Storage Node) in the
cluster on a different machine. When a Storage Node exceeds its capacity
(either in terms of storage or computing power), the shard that it contains
can be split, which creates a new Storage Node.

Whenever a Storage Node (or the Admin Node) is created, the system
administrator must first create an empty MySQL database that will be used by
the Node. In this way the administrator controls where the data for the Node
is stored, and how scale-out is achieved.

TSHS and S3 Compatible Object Storage
-------------------------------------

In general, the TeamDrive Hosting Service is capable of using an S3
compatible object store as secondary storage. This is also the case when
using TSHS. An object store is used as secondary storage to provide
unlimited capacity if this is not provided by the primary storage (the file
system or the TSHS database cluster).

.. figure:: ../images/TeamDrive-HostServer-Overview-TSHS-S3.png

   TeamDrive Scalable Hosting Storage (TSHS) with S3 object store

When using the file system for storage, ``s3d`` (the TeamDrive S3 Daemon) is
responsible for transferring files to the object store. When TSHS is
enabled, this job is done by the ``tshs`` command line tool (see below).

The object store also serves as a backup for the data in the primary
storage. If the primary storage is lost for some reason, the data can be
restored from the object store. How up to date the restored data is depends
on the frequency with which data is moved to the object store. The frequency
of this operation can be set by the administrator in the configuration of
the ``s3d`` or the ``tshs`` tool. Nevertheless, it is likely that data in
the most active Spaces will be lost on restore. This eventuality is handled
by the TeamDrive Client if the correct restore procedure is followed, as
described previously.
The ``tshs`` Command Line Tool
------------------------------

The TSHS cluster is managed using the ``tshs`` command line tool. The
command line tool is all you need to create, maintain and expand the TSHS
cluster. Alternatively, you can use the Host Server Administration Console,
which calls ``tshs`` in the background to perform the requested task.

The ``tshs`` command line tool can be used in three different modes:

#. **One-off commands**: Run ``tshs`` with command line arguments to perform
   various one-off tasks, such as creating a new Storage Node, splitting a
   Storage Node or adding a reference to an S3 compatible object store.

#. **Interactive "shell" mode**: To inspect a cluster and perform a number
   of commands on it, the easiest way is to start ``tshs`` in interactive
   mode, also known as shell mode. In this mode you can run all command line
   operations, in addition to a few shell-only commands.

#. **Background "daemon" mode**: A number of tasks need to be run regularly.
   These are basically TSHS commands that are executed repeatedly by
   ``tshs`` running in daemon mode. For example, if a Storage Node is split,
   then data must be copied from one node to the other. The ``copy-shards``
   maintenance command does this.

All maintenance tasks can be run in parallel, and from any machine. This
makes it possible to scale out the maintenance operations. Additional
``tshs`` commands can be run to increase the throughput of any task that
needs to be performed. For example, when a node split occurs, a lot of data
may need to be transferred from the old to the new Storage Node. Each
``copy-shards`` command only transfers one file at a time. So, starting (for
example) 10 ``tshs`` instances running the ``copy-shards`` command will
transfer 10 files simultaneously.

Enter ``tshs help`` to get a full list of commands provided by the ``tshs``
command line tool.

Creating a TSHS-based TeamDrive Hosting Service
-----------------------------------------------

The first step is to create a ``tshs_admin`` database, set up the ``[tshs]``
MySQL group and enable TSHS in the Hosting Administration Console, as
described in the section :ref:`initializetshs` below.

Following this, go to the menu item **Host -> TSHS** and create a Storage
Node as described in the section :ref:`createstoragenode`.

After you have successfully created a Storage Node, you need to restart all
Apache instances. TSHS cluster usage should then begin immediately. Check
the Apache error log (e.g. ``/var/log/httpd/error_log``) to ensure that the
module has loaded correctly, and that no errors occurred.
You should see something like this::

   [notice] mod_pspace 1.5.04 Loaded; Build Jun 18 2014 18:15:23; Crash-Reporting-Disabled
   [notice] Admin API booted: TSHS Enabled; Importing Volumes; S3 n/a; Path: /spacedata

You can confirm that the TeamDrive Apache module has connected to the TSHS
cluster by running the ``tshs list-access-points`` command::

   $ tshs list-access-points
   ID       Ack Access time         Description
   -------- --- ------------------- -----------
   205      1   2014-06-20 15:57:41 pspace module;host=www.teamdrive.com;pid=75744
   206      1   2014-06-20 15:57:41 pspace module;host=www.teamdrive.com;pid=75742
   207      1   2014-06-20 15:57:41 pspace module;host=www.teamdrive.com;pid=75741
   208      1   2014-06-20 15:57:41 pspace module;host=www.teamdrive.com;pid=75743
   209      1   2014-06-20 15:57:41 pspace module;host=www.teamdrive.com;pid=75745
   210      1   2013-03-26 21:24:42 pspace module;host=www.teamdrive.com;pid=75746
   211      1   2014-06-20 15:57:41 tshs shell;host=www.teamdrive.com;pid=75765

This command lists all processes connected to the TSHS cluster. In the
example above, the "pspace module" is listed a number of times, once for
each process started by Apache.

.. _initializetshs:

Initializing a TSHS Cluster
---------------------------

A TSHS cluster is initialized by creating an "Admin Node", which is simply a
MySQL database within the cluster with the name (by default) ``tshs_admin``.

As mentioned above, a TSHS cluster consists of a group of MySQL databases.
The topology (i.e. the configuration of databases, MySQL servers and
hardware) of the database cluster is determined by the administrator.
However, maximum scalability is achieved by placing each database on its own
hardware.

The databases used by TSHS must always be created by the administrator. The
administrator determines the name of the user that will be used by TSHS to
access the database, and grants the TSHS user the required rights. In order
to create a node (Admin or Storage), TSHS requires the right to create its
own tables in the database. After that point it only needs complete access
to the tables it has created.

Observe the following steps to initialize a TSHS cluster:

#. **Create the tshs_admin database**: Start by creating a database called
   ``tshs_admin``. Add a user to the database (for the purpose of this
   documentation we will call the user ``teamdrive``) and grant the user the
   right to create tables (a sketch of this step is shown after this list).

#. **Configure the MySQL connection**: Create an options group called
   ``[tshs]`` in your MySQL options file (the ``/etc/td-hostserver.my.cnf``
   file). The ``[tshs]`` group contains the MySQL connection parameters used
   to connect to the ``tshs_admin`` database::

      [tshs]
      database=tshs_admin
      user=
      password=
      host=localhost
      port=3306
      socket=/var/lib/mysql/mysql.sock

   How to set up a MySQL connection is explained in the MySQL documentation:
   http://dev.mysql.com/doc/refman/5.6/en/option-files.html. Use the
   ``--mysql-cfg-file`` option to specify the location of the MySQL options
   file on the ``tshs`` command line, if you are not using the default
   location (``/etc/td-hostserver.my.cnf``).

#. **Create the Admin tables**: This is done by using the Hosting
   Administration Console (menu item **Settings -> TSHSEnabled**) to set the
   configuration setting ``TSHSEnabled`` to ``True``. When TSHS is enabled,
   the Console will run the ``tshs create-admin-node`` command, which
   creates and initializes the tables used by TSHS in the ``tshs_admin``
   database. The command will succeed if TSHS is able to access the
   ``tshs_admin`` database you have already created. If the tables already
   exist, the command has no effect.
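As a minimal sketch of the first step, the ``tshs_admin`` database and the
``teamdrive`` user could be created with the ``mysql`` client as follows.
The password and the use of ``GRANT ALL`` are examples only; TSHS requires
the right to create tables and complete access to the tables it creates::

   CREATE DATABASE tshs_admin;
   CREATE USER 'teamdrive'@'localhost' IDENTIFIED BY 'a-secret-password';
   -- TSHS must be able to create its own tables in the database and
   -- afterwards needs full access to the tables it has created:
   GRANT ALL PRIVILEGES ON tshs_admin.* TO 'teamdrive'@'localhost';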
The ``[tshs]`` group described above is the entry point for accessing the
entire TSHS cluster. Connection information for the other nodes is stored in
the ``tshs_admin`` database itself.

After TSHS has been enabled, the Administration Console will display an
additional entry **TSHS** below the **Host** section of the left navigation
bar. Use this page to manage TSHS.

.. _createstoragenode:

Creating a Storage Node
-----------------------

Before data can be stored in a TSHS cluster you need to create one or more
Storage Nodes.

To create a Storage Node, you must first create an empty MySQL database. The
location of the database depends on the topology of your cluster but, as
mentioned before, the database should run on its own machine for maximum
scalability. The database may have any name you choose and must contain a
user that has the right to create tables. We will call this user
``teamdrive`` for the purpose of this documentation.

Under the menu item **Host -> TSHS**, choose the ``create-storage-node``
command, and enter the connection options for accessing the database you
have just created in the **Parameters** text field.

The connection options are a list of MySQL connection options separated by a
semicolon (the ';' character). For example::

   user=teamdrive;password=;host=127.0.0.1;database=tshs_storage_1

You can use any options that you would otherwise use in a MySQL options
file. A complete list of options is provided in the table on this page:
http://dev.mysql.com/doc/refman/5.6/en/mysql-options.html

Now press the **Execute** button to run this command using ``tshs``. This
executes the command ``tshs create-storage-node <connection-options>``,
which creates the TSHS Storage Node.

This command may only be executed while all Storage Nodes in the cluster are
empty. As soon as the cluster contains data, you need to use the
``split-shard`` command to create new Storage Nodes.

Under **Host -> TSHS**, the Hosting Administration Console displays a list
of the Storage Nodes you have created. The list of Storage Nodes displayed
by the Console is obtained using the ``tshs list-storage-nodes`` command::

   $ tshs list-storage-nodes
   ID  Phase        Boundary (size)         Connection options (totals)
   --- ------------ ----------------------- ---------------------------
   1   ACTIVE_SHARD 0 (2147483648)          user=td;port=3306;database=tshs_storage_1 (InDB=4558 OnS3=2677 R/O=3013 Bytes=4651260)
   2   MIRROR_SHARD 2147483648 (2147483648) user=td;port=3306;database=tshs_storage_2 (InDB=5812 OnS3=3908 R/O=3102 Bytes=4523606)

Data can be stored in the cluster as soon as you have one Storage Node.
However, it is possible to start with several Storage Nodes if you are
expecting the cluster to grow large. This will save splitting shards later,
which is an expensive operation. Create additional Storage Nodes by calling
``create-storage-node`` several times. Each time you must provide a new,
empty MySQL database. As long as the cluster is empty you can create and
delete Storage Nodes until you have the desired start configuration.
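Since the Console simply calls ``tshs`` in the background, a Storage Node
can also be created directly from the command line. The following invocation
is a sketch only; the host name, password and database name are
placeholders, and the semicolon-separated option string must be quoted in
the shell::

   # create a Storage Node using a new, empty database (placeholder values)
   tshs create-storage-node "user=teamdrive;password=secret;host=10.1.0.12;database=tshs_storage_2"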
Ideally, you should not need to change the topology of your TSHS cluster
(other than adding new Storage Nodes) once it is in use. Changing the
topology effectively requires changing the connection options stored for a
particular Storage Node, as a result of moving one of the storage databases
to a different machine or MySQL server. Although this is possible, it
currently requires the entire cluster to be shut down before the connection
options are changed (shutting down the cluster is done by stopping all the
Hosting Service Apache instances). If the database needs to be copied, then
the cluster must remain shut down during this time. You can shorten the
downtime by using MySQL replication to create a copy of a Storage Node
database before changing the topology of the cluster.

.. _upgradetshs:

Upgrading to TSHS
-----------------

An existing file system based TeamDrive Hosting Service can easily be
upgraded to use TSHS. This process is automatic, and runs while the system
is online.

.. note:: If your existing Hosting Service uses an S3 object store, you must
   **stop and disable the TeamDrive S3 Daemon background task** (``s3d``)
   before you enable TSHS::

      [root@hostserver ~] service s3d stop
      [root@hostserver ~] chkconfig s3d off

   However, the setting ``S3SyncActive`` must remain set to ``True``.

When you enable TSHS, the Hosting Administration Console will check whether
file system storage is active. If so, it will automatically set the
configuration setting ``TSHSImportVolumes`` to ``True``. When
``TSHSImportVolumes`` is enabled, the data in the file system volumes will
be transferred in the background to the TSHS cluster. This transfer occurs
automatically, while the system is online.

The main work of importing the data into TSHS is done by the ``tshs
perform-import`` command. TSHS is configured to perform this command by
default. The command checks whether an import job is active, and if so,
executes the import. You can retrieve the status of import jobs by listing
all "jobs" in the system using ``tshs list-jobs``::

   $ tshs list-jobs
   ID     Job    Parameters
   ------ ------ ----------
   236887 IMPORT path=/spacedata;s3-host-id=6

The example shows that an import job is in progress. If an S3 object store
is in use, then the job contains a reference to an S3 host (see ``tshs
list-s3-hosts`` below).

Although it can be run while the Hosting Service is online, the
``perform-import`` command places quite a load on the system. If necessary,
the command can be run at non-peak times, for example, for a few hours every
night until the import is complete. Using the ``--time-limit`` option you
can limit the time a maintenance task will run.

As soon as the ``perform-import`` task has completed successfully, it is
recommended to disable ``TSHSImportVolumes`` (set the setting to ``False``).
The name of the job is changed from IMPORT to IMPORTED when all data has
been transferred from the file system volumes to TSHS.

Disabling ``TSHSImportVolumes`` is an important optimisation, because TSHS
then no longer needs to search the file system (or S3 storage, if enabled)
in order to find files requested by the TeamDrive Client. When the import is
complete, all the required information on the location of files is stored in
the TSHS cluster.

.. note:: Disabling ``TSHSImportVolumes`` before the import is complete will
   cause Space corruption, because data not yet imported into TSHS will be
   reported as missing to the TeamDrive Clients.
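If the import should only run during off-peak hours, the ``--time-limit``
option described above can be used. The following is a sketch only; the
placement of the option before the command is an assumption (check ``tshs
help`` for the exact syntax)::

   # run the import for at most four hours (14400 seconds), for example
   # from a nightly cron job
   tshs --time-limit=14400 perform-import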
Scaling Out the Cluster
-----------------------

When a Storage Node reaches its capacity, either in terms of storage or
computing power, the data shard on the node needs to be split. The ``tshs
split-storage-node <node-id> <connection-options>`` operation splits the
shard on a given node, and creates a new Storage Node at the same time.

For example::

   tshs split-storage-node 2 user=teamdrive;password=;database=tshs_storage_3

The ``<node-id>`` parameter is a reference to the existing Storage Node that
you want to split. Approximately half of the files that reside on this
Storage Node will be moved to the new Storage Node (using copy and delete
operations).

As described above, before you can create a new Storage Node, you must
create an empty MySQL database. The ``<connection-options>`` parameter
specifies the MySQL options required to connect to the new, empty MySQL
database.

After you have split a Storage Node shard, you need to run the ``tshs
copy-shards`` maintenance command. This command copies all files as required
from the old Storage Node to the new Storage Node. This operation can be
sped up by starting a number of ``tshs copy-shards`` commands in parallel.
Of course, the more copy operations running, the greater the load on the
MySQL databases involved. As a result, it is recommended that you monitor
the database load during this time and start or stop copy operations as
appropriate. The new Storage Node remains in the MIRROR_SHARD state until
the copy operation is complete.

Note that, by default, the ``tshs`` command is set up to run in daemon mode
and perform all outstanding tasks. This includes the ``copy-shards``
command.

A Storage Node shard can be split while the cluster is online. Neither the
``split-storage-node`` nor the ``copy-shards`` command interrupts normal
operation of the cluster, other than causing increased load due to the copy
operation. The copy operation must be complete (i.e. the Storage Node must
be in the ACTIVE_SHARD phase) before the node can be split again.
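As a sketch, the copy phase could be scaled out by starting a few extra
workers from a shell on any machine in the cluster. The number of workers
and the use of the shell's ``&`` background operator are illustrative only::

   # each copy-shards instance transfers one file at a time, so three
   # extra workers transfer three files in parallel
   tshs copy-shards &
   tshs copy-shards &
   tshs copy-shards &

Monitor the load on the MySQL servers involved and start or stop workers
accordingly, as recommended above.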
Connecting TSHS to an S3 Compatible Object Store
------------------------------------------------

When TSHS is connected to an S3 compatible object store, it periodically
transfers files from the Storage Node database(s) to the object store. This
is done by the ``tshs move-to-s3`` maintenance command.

In order to enable the transfer of data to the object store, TSHS needs to
be configured to access it. This is done by taking the S3-specific settings
from the Host Server settings and adding the necessary access details to the
``tshs_admin`` database using the command ``tshs add-s3-host``.

Ensure that the settings ``S3Brand``, ``S3Server``, ``S3DataBucketName``,
``S3AccessKey`` and ``S3SecretKey`` contain the correct information to
access the S3 bucket that will be used to store the Hosting Service data.
See chapter :ref:`configuring_s3d` for details. ``add-s3-host`` will ping
the S3 service before actually adding the host details.

The Hosting Administration Console will then take these settings to create
the TSHS-specific S3 host entry when you enable S3 by setting the
configuration setting ``S3SyncActive`` to ``True``. You can verify this by
running ``tshs list-s3-hosts`` on the command line or via the Hosting
Administration Console. If any of these settings are incorrect, enabling S3
will fail with an appropriate error message.

.. note:: You need to restart the Apache HTTP Server using ``service httpd
   restart`` after enabling ``S3SyncActive``.

Check the log file ``/var/log/httpd/error_log`` for the following notice::

   [notice] Admin API booted: TSHS Enabled; S3 Enabled (s3d n/a)

The Hosting Administration Console sets up the S3 host in the cluster by
executing the following command::

   add-s3-host <brand> <server> <bucket> <public-key> <private-key> [<options>]

For example::

   tshs add-s3-host AMAZON s3.amazonaws.com my-bucket 022QF06E7MXBSAMPLE kWcrlUX5JEDGM/SAMPLE/aVmYvHNif5zB+d9+ct MaxAge=1;MinSize=1;MinBlockSize=5M;MaxUploadParts=1000;MaxUploadThreads=10

Currently, ``<brand>`` must be set to AMAZON or OPENSTACK. ``<server>`` is
the host name of the server (for Amazon this is ``s3.amazonaws.com``).
``<bucket>`` is the name of the S3 bucket. ``<public-key>`` and
``<private-key>`` are the keys required to access the bucket. The Hosting
Administration Console takes these values from the associated configuration
settings.

The ``<options>`` string has the form ``<name>=<value>;<name>=<value>;...``
The options are:

``MinSize``
   The minimum size of a file that will be uploaded to S3. Use K=Kilobytes,
   M=Megabytes, etc.

``UrlTimeout``
   The time in seconds that a redirect for reading is valid. The default is
   10.

``MinBlockSize``
   The minimum block size that can be used for multi-part upload. The
   default is 5MB. The valid range of this parameter is determined by the
   implementation of the object store being used.

``MaxUploadParts``
   The maximum number of parts a file can be divided into for upload. The
   default is 1000. The valid range of this parameter is determined by the
   implementation of the object store being used.

``MaxFileUploadSize``
   Files larger than this will not be transferred to S3. The default, 0,
   means all files are uploaded.

``MaxUploadThreads``
   The thread pool size to use for multi-part uploads. The default, 0,
   means single threaded.

The options value used by the Hosting Administration Console is stored in
the ``S3Options`` configuration setting.

When an S3 host is added, it becomes the current S3 host. All files uploaded
to S3 storage automatically go to the current S3 host. You can set the
current S3 host using the command ``tshs set-current-s3-host <s3-host-id>``.
You can list all S3 hosts with the ``list-s3-hosts`` command::

   $ tshs list-s3-hosts
   ID  Cur Server           Bucket               Public Key           Private Key                              Options
   --- --- ---------------- -------------------- -------------------- ---------------------------------------- -------
   6   yes s3.amazonaws.com mybucket             022QF06E7MXBSAMPLE   kWcrlUX5JEDGM/SAMPLE/aVmYvHNif5zB+d9+ct  MinSize=1;UrlTimeout=10;MinBlockSize=5K;MaxUploadParts=1000;MaxUploadThreads=2

To check the file transfer progress, you can use the command ``tshs
transfer-to-s3-info``, which displays how many files, and their total size,
are ready to be transferred.

These commands can be executed from the command line, or by using the
**Host -> TSHS** page of the Hosting Administration Console.

.. note:: If you wish to change the S3 parameters, first set
   ``S3SyncActive`` to ``False``, then change the parameters and set
   ``S3SyncActive`` back to ``True``. Note that if the S3 parameters are
   changed, data already stored in S3 will continue to use the old
   parameters! If you wish to change the S3 parameters of existing S3 data,
   you must use the ``tshs update-s3-host`` command.

Running Maintenance Tasks
-------------------------

The following maintenance tasks must be run periodically:

``copy-shards``
   If a Storage Node has been split, this task copies files from the old
   Storage Node to the new node, which contains the new shard.

``remove-deleted``
   This task removes files from the cluster that have been marked for
   deletion. This includes removing files from S3 storage.
   In general, the TeamDrive Scalable Hosting Storage does not delete files
   in the cluster immediately, but rather marks them for deletion. This can
   also take the form of a DELETE job, if an entire Space is deleted. DELETE
   jobs that have been scheduled are displayed using ``tshs list-jobs``. If
   a file is marked for deletion, this can be seen when you use the ``tshs
   list-files`` command.

``perform-import``
   This task imports existing file system based Hosting data into the TSHS
   cluster.

``move-to-s3``
   This task moves files to S3 storage that have been scheduled for
   transfer. The TeamDrive Hosting Apache module schedules files to be
   moved to S3 storage as soon as the writing of the file is concluded.
   Files marked for transfer are copied to S3 by the ``move-to-s3`` task and
   then removed from the cluster.

``copy-to-s3``
   This task performs a backup of files that need to remain in the TSHS
   cluster. Currently this means all files called ``last.log``. The
   ``last.log`` files are constantly updated and may therefore never be
   moved to S3 permanently. Whenever a ``last.log`` file is modified, it is
   marked to be copied to S3. In order to prevent ``last.log`` files from
   being constantly copied to S3, the ``copy-to-s3`` task should run less
   frequently than the other tasks (the default is once every 4 hours).

You can run each maintenance task as a separate command, or you can use the
``tshs do-tasks`` command::

   tshs do-tasks [all] [copy-shards] [remove-deleted] [perform-import] [move-to-s3] [copy-to-s3] [all-not-s3-copy]

Generally it is recommended to use the ``tshs do-tasks all`` or ``tshs
do-tasks all-not-s3-copy`` command, which starts all maintenance tasks at
once. Use ``all-not-s3-copy``, which omits the ``copy-to-s3`` task, to avoid
backing up too frequently. The advantage of using ``do-tasks all`` is that
any tasks added in future versions of TSHS will automatically be executed.

In order to run maintenance tasks regularly, you can start ``tshs`` in
daemon mode with the ``--start`` option, and use the ``repeat`` command to
execute a command or a list of commands separated by ``and``. By default,
``tshs`` is set up as a service to run as follows::

   tshs repeat 60 do-tasks all-not-s3-copy repeat 14400 tshs do-tasks all

This means that all maintenance tasks except the copy task will be run once
a minute (60 seconds), and the copy task, which backs up the ``last.log``
files, will be run every 4 hours (14400 seconds).

A number of options allow you to control how maintenance tasks are run:

``--max-threads=<number>``
   Specifies the maximum number of threads to be started by a command. By
   default, the maximum number of threads is 10. This is the maximum number
   of threads that will run, no matter how many tasks are started. Note
   that this does not include the threads used to upload files to S3, as
   specified using the ``MaxUploadThreads`` parameter.

``--pause-on-error=<seconds>``
   Specifies the time in seconds to pause after an error occurs during a
   maintenance task. The default is 2 minutes. When running in daemon mode,
   ``tshs`` ignores this option.

``--retries=<count>``
   The number of times to retry if an error occurs during a maintenance
   task. Set this to 'unlimited' if you want ``tshs`` to retry indefinitely.
   The default is 0, which means that the task quits immediately if an
   error occurs. When running in daemon mode, ``tshs`` ignores this option.

``--time-limit=<seconds>``
   Specifies the time limit in seconds for running maintenance tasks.
   ``tshs`` will stop running the tasks after the specified time, whether
   the tasks are finished or not.
   The default is 0, which means unlimited. When a task is restarted, it
   will continue from where it left off, so you do not lose anything by
   stopping a task and restarting it again later. Use this parameter to
   prevent tasks from running during high load times, when they could
   disturb normal operation.

``--node-affinity=<storage-node-id>``
   This option tells ``tshs`` to only transfer data from a specific Storage
   Node. This option may be used if you have a ``tshs`` service running on
   each Storage Node. Such a configuration reduces the amount of network
   traffic when data is transferred between nodes, or to and from S3
   storage.

These options may be specified on the command line, or placed in the
``tshs`` options file, ``/etc/tshs.conf``, which is consulted by the
``tshs`` background service.

When running in daemon mode, errors and other messages are written to the
``tshs`` log file, which is ``/var/log/tshs.log`` by default.
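For example, a one-off maintenance run could combine the options above with
the ``do-tasks`` command. This is a sketch only; the placement of the
options before the command is an assumption (check ``tshs help`` for the
exact syntax)::

   # run all maintenance tasks except the last.log backup for at most two
   # hours, using at most four worker threads
   tshs --max-threads=4 --time-limit=7200 do-tasks all-not-s3-copy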