Auto-sync is one of the data replication mechanisms provided by NexentaStor. It is a schedulable, fault-managed, fully configurable (tunable) NexentaStor Data Replication service that can be used in a variety of backup, archiving, and DR scenarios.
- For general and background information, please see Data Replication.
- For the NexentaStor Automatic Continuous Data Protection (CDP) extension, please see AutoCDP.
Over the last several releases, auto-sync replication has been significantly improved and a number of new features have been added. These include:
- Services can be set up to run only once, at a given scheduled time.
- auto-sync can execute in a daemon mode and run incremental replications every second or every few seconds.
- non-recursive auto-sync (the -N option) will replicate only a given source folder (filesystem). Note that by default auto-sync replicates nested folders and zvols.
- Zvol replication: auto-sync can now replicate zvols directly. Previously, auto-sync required a folder (filesystem) as its source; this restriction has been removed.
- auto-sync can be used to replicate, locally or remotely, the appliance's system folder (a.k.a. root filesystem), which contains the appliance's Operating System and configuration. The replication destination may or may not be another NexentaStor appliance and, when it is an appliance, may or may not reside on the appliance's system volume.
- Reverse direction: auto-sync can now be run in the reverse direction. This new capability, which addresses a variety of Disaster Recovery and "multi-office workplace" scenarios, currently has EXPERIMENTAL status. All directions are supported: local-to-local, local-to-remote, and remote-to-local. For details, please see the "Auto-Sync in reverse" PDF document on the website.
- Logging verbosity and logging improvements. The auto-services (including auto-sync) can now be configured, at creation time and at runtime, to produce a detailed (verbose) log.
- Bandwidth throttling. Added the ability to throttle data replication based on specific deployment requirements. The corresponding "rate_limit" option and property limits the network and I/O bandwidth allocated for (consumed by) auto-sync replication. Bandwidth throttling is supported by auto-tier replication as well.
- Added the 'last_replic_time' property, which records and shows the duration of the last auto-sync run, in seconds.
- New high-performance auto-sync transport. See section "zfs send/recv over netcat" below.
- The auto-snap, auto-tier, and auto-sync services create a '-latest' snapshot by default. This feature can now be turned off in NMC by specifying the '-X' option. For example:
nmc$ create auto-sync -X
- Improved and stabilized automated recovery. Auto-sync has been deployed in a great variety of scenarios and environments. We have "translated" this experience into improved and robust mechanisms to handle (and recover from) network failures or sudden power outages.
- Improved mechanism to mount/unmount replication destination, based on ZFS 'canmount' property.
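As a rough illustration of what a rate limit such as "rate_limit" does to a replication stream, the Python sketch below paces chunks of data so that the average rate stays under a cap. It is not NexentaStor's implementation: the function name throttle, its parameters, and the choice of bytes per second as the unit are assumptions made for this example only.

```python
import time
from typing import Callable, Iterable, Iterator


def throttle(
    chunks: Iterable[bytes],
    rate_limit: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield chunks so that the average rate does not exceed rate_limit.

    rate_limit is in bytes per second here (an assumption of this sketch,
    not the documented unit of the appliance's "rate_limit" property).
    clock and sleep are injectable so the pacing can be tested without
    actually waiting.
    """
    if rate_limit <= 0:
        raise ValueError("rate_limit must be positive")
    start = clock()
    sent = 0
    for chunk in chunks:
        yield chunk
        sent += len(chunk)
        # Earliest moment at which `sent` bytes are allowed to have gone out.
        due = start + sent / rate_limit
        delay = due - clock()
        if delay > 0:
            sleep(delay)
```

With a limit of 1024 bytes per second, four 1 KiB chunks take about four seconds to pass through, regardless of how quickly the source can produce them.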
Note that auto-sync over zfs send/receive transfers not only the data but all metadata as well, including ACLs. This could be one of the reasons to consider using the service, even though it may perform more slowly than other available and supported data replication options.
zfs send/recv over netcat
The default auto-sync transport mechanism is native ZFS send/recv, which replicates both data and filesystem metadata. When executed between two ZFS systems (for instance, between two NexentaStor appliances), ZFS send/recv replication is transported over an SSH "pipe". Until version 2.0.1 of the appliance, the SSH pipe was the only mechanism to "extend" ZFS send/recv over distance.
Starting with version 2.0.1, we are introducing another piping mechanism: netcat. As per the referenced Wikipedia article:
"""In 2000 according to www.insecure.org Netcat was voted the second most functional network security tool. Also, in 2003 and 2006 it gained fourth place in the same category. Netcat is often referred to as a "Swiss-army knife for TCP/IP" , and for a good reason..."""
The primary motivation for an SSH alternative is performance. SSH encryption takes a toll on both the local and remote CPUs and increases the latency of network transfers, sometimes significantly. The end result may be poor auto-sync performance in local-to-remote or remote-to-local replications.
We have run some tests to compare netcat performance against various over-SSH encryption methods on the one hand, and against the raw local disk read speed on the other. The following table summarizes the results:
The tests were run on a quad-core x64 server over a 1G Ethernet link. As you can see, SSH encryption reduces the throughput by a factor of 2 to 4.
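The difference between the two transports described in this section can be sketched as the command pipelines one might run by hand. The Python snippet below only builds the command strings and does not execute anything; it is an illustration, not the way auto-sync invokes the transport internally. The function names, dataset names, host name, and port are hypothetical, and the netcat listen syntax shown (nc -l PORT) is an assumption that varies between netcat variants (some require nc -l -p PORT).

```python
import shlex
from typing import Tuple


def ssh_pipeline(snapshot: str, host: str, dest: str) -> str:
    """Sender-side command: stream a snapshot through an encrypted SSH pipe."""
    remote = f"zfs receive {shlex.quote(dest)}"
    return (
        f"zfs send {shlex.quote(snapshot)} | "
        f"ssh {shlex.quote(host)} {shlex.quote(remote)}"
    )


def netcat_pipeline(
    snapshot: str, host: str, dest: str, port: int = 9000
) -> Tuple[str, str]:
    """Return (receiver_command, sender_command) for a plain TCP pipe.

    The receiver must be started first so that it is listening when the
    sender connects. The default port is an arbitrary example value.
    """
    receiver = f"nc -l {port} | zfs receive {shlex.quote(dest)}"
    sender = f"zfs send {shlex.quote(snapshot)} | nc {shlex.quote(host)} {port}"
    return receiver, sender
```

Calling ssh_pipeline("tank/data@snap1", "backup-host", "backup/data") yields a single command for the sending side, while netcat_pipeline returns one command for each side. Note that netcat neither encrypts nor authenticates the stream, so this trade of security for throughput is best suited to trusted networks.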