The following use case details the steps for backing up a large data set from a server to a remote machine.
First, create and name your job. Select the "Backup" option to specify a one-way replication and click "Ok".
Click the left folder to specify your source location and select "My Computer" to access the local file system.
Select the desired folder that contains the data set to be replicated.
When selecting the remote destination, choose GoodSync Connect for the best transfer speed between remote devices.
Select the target folder that will receive the data set on the destination device.
Once the source and destination folders have been set, several configurations can further optimize speed and reduce replication time:
- Running parallel threads
- Direct Connection
- Disabling Safe Copy
- Disabling versioning
- Seeding the destination
- Splitting a large job
Running parallel threads
The "Run Parallel Threads in Sync, this many" option allows GoodSync to replicate multiple small files at once, rather than sequentially one at a time. Increase the value to let GoodSync use multiple threads for parallel replication.
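As an illustration of why parallelism helps with many small files, here is a minimal Python sketch (not GoodSync's implementation) that copies a folder tree with a thread pool so per-file latency overlaps instead of accumulating:

```python
# Minimal sketch of parallel small-file replication using a thread pool.
# This illustrates the general technique, not GoodSync's internal code.
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def copy_tree_parallel(src: Path, dst: Path, threads: int = 8) -> int:
    """Copy every file under src to the same relative path under dst."""
    files = [p for p in src.rglob("*") if p.is_file()]

    def copy_one(p: Path) -> None:
        target = dst / p.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(p, target)  # copy file contents and metadata

    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(copy_one, files))  # each worker copies one file at a time
    return len(files)
```

With many small files, most of the elapsed time is per-file setup rather than raw throughput, which is why overlapping those operations shortens the total run.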
Direct Connection
By default, the GoodSync Connect protocol uses a mediator server to facilitate the connection between remote source and destination devices, providing convenient, out-of-the-box usage. This intermediary network hop adds overhead to the total replication time. To maximize speed, use a direct connection instead.
If the destination machine can be pinged by the source machine, replace the .goodsync FQDN with the private IP of the destination machine.
If the destination machine cannot be pinged by the source machine and is on a separate network, replace the .goodsync FQDN with the public IP of the destination network, and port forward TCP port 33333 to the internal destination machine.
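Before editing the job, it can help to confirm that the candidate IP actually accepts connections on TCP port 33333 (the port forwarded for direct connection). A small Python sketch, where the host value is an illustrative assumption:

```python
# Quick reachability check (a sketch, not a GoodSync utility) for the
# direct-connection port. Returns True if a TCP connection succeeds.
import socket

def can_reach(host: str, port: int = 33333, timeout: float = 3.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

If the check fails against the private IP but succeeds against the public IP, the port-forwarding path is the one to use.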
Disabling Safe Copy
"Safe Copy using temporary files" is an option, checked by default, that protects against partial copies during replication by writing each file to a temporary file before moving it into place. Unchecking this option removes the overhead this extra step produces.
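The trade-off can be seen in the common temp-file-then-rename pattern sketched below (this illustrates the general technique, not GoodSync's internal code). The extra write and rename are the overhead that unchecking the option avoids:

```python
# Sketch of the "safe copy" pattern: write to a temporary file, then
# rename it over the destination so readers never see a partial file.
import os
import shutil
import tempfile

def safe_copy(src: str, dst: str) -> None:
    dst_dir = os.path.dirname(os.path.abspath(dst))
    fd, tmp = tempfile.mkstemp(dir=dst_dir)  # temp file on the same filesystem
    os.close(fd)
    try:
        shutil.copy2(src, tmp)   # the full copy goes to the temp file first
        os.replace(tmp, dst)     # atomic rename into the final location
    except BaseException:
        os.remove(tmp)           # clean up the partial temp file on failure
        raise
```

Skipping the pattern means one write instead of a write plus a rename, at the cost of losing the partial-file protection.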
"Save deleted/replaced files to Recycle Bin, last version only" and "Save deleted/replaced files to History folder, multiple versions are options that allow GoodSync to store the previous version of a file before it was updated, modified, or deleted. Unchecking these options will resolve the overhead produced by these actions.
Seeding the destination
Physically preloading a copy of the source data onto the destination machine is known as seeding. This allows GoodSync to skip the initial large backup and focus on incremental changes moving forward.
Company XYZ needs to back up 20 terabytes of data to a remote file server. The initial backup would take an unreasonable amount of time to complete over the network, so the data is copied to a hard drive and shipped to the remote office, where it is copied onto the file server. Source and destination now hold the exact same 20 terabytes of data. GoodSync sees that both sides are equivalent and can focus on processing only the files that change in the future.
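The arithmetic behind this example can be worked out quickly; the 100 Mbit/s link speed below is an assumed figure for illustration only:

```python
# Rough transfer-time estimate for the 20 TB example.
# Assumes decimal terabytes and a sustained link speed with no overhead.
def transfer_days(terabytes: float, mbit_per_s: float) -> float:
    bits = terabytes * 1e12 * 8           # data set size in bits
    seconds = bits / (mbit_per_s * 1e6)   # time at the given link speed
    return seconds / 86400                # convert seconds to days

print(round(transfer_days(20, 100), 1))   # prints 18.5
```

Even before protocol overhead, 20 TB over a 100 Mbit/s link is on the order of two and a half weeks, which is why seeding by shipping a drive is attractive.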
Splitting a large job
If backing up the data set is infeasible via a single job - due to a network bottleneck, hardware bottleneck, or other variables - it may be beneficial for performance to split the large job into multiple smaller jobs.
To do so, right-click the existing job and select "Clone".
Name the first cloned job after the first subset of the data, the second after the next subset, and so on.
Once the original job has been split into its desired quantity, remove the original by right-clicking and selecting "Delete".
Click on the first job and choose the Left Folder. In the Left Folder window, click on the "Select multiple Folders" checkbox and select the first subset of the total data to replicate. Click "Ok".
Do the same for the second job and select the next subset.
For the final job, select the remaining subset of data to replicate.
The right/destination folder may remain the same without additional configuration, as GoodSync will know to only process the respective subsets on either side.
Once the jobs are fully configured, an Analyze and Sync may be performed.
Each subset will be processed in its own job run, and all data will be replicated.
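Deciding which top-level folders belong in each cloned job can be done by balancing the subsets by size. The sketch below (a hypothetical helper, not a GoodSync feature) greedily assigns the largest folders to the currently lightest job:

```python
# Hypothetical helper for planning split jobs: greedily balance folders
# across N jobs by total size so each job run takes roughly equal time.
import os

def folder_size(path: str) -> int:
    """Total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def split_into_jobs(folders: dict[str, int], jobs: int) -> list[list[str]]:
    """folders maps folder name -> size in bytes; returns one folder list per job."""
    buckets: list[list[str]] = [[] for _ in range(jobs)]
    loads = [0] * jobs
    for name, size in sorted(folders.items(), key=lambda kv: -kv[1]):
        i = loads.index(min(loads))  # place next-largest folder in the lightest job
        buckets[i].append(name)
        loads[i] += size
    return buckets
```

Each returned list then becomes the "Select multiple Folders" selection for one cloned job.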