TorpedoSync (works on pi and ubuntu) takes a different approach from other file sync applications like resilio and syncthing in that it does not compute a cryptographic hash for each file, and is consequently much faster and uses fewer CPU cycles.
Introduction
They say that "necessity is the mother of invention"; well, necessity and sometimes an itch to do the same thing yourself. This is how TorpedoSync came about: from using other software to sync "My Documents" across my machines and finding them troubling. Hence the immortal cry "I can do better!".
I started out using bitsync, which then became resilio sync (a freemium product), which I liked at first but which then became a real memory and disk space hog. After searching a lot, I came across syncthing, which is open source and written in go, but I found it seemed to be connecting to the internet and it was not as nice to use as resilio.
At this point I gave up and decided to write my own.
The source code can also be found on GitHub at: https://github.com/mgholam/TorpedoSync
Main Features
TorpedoSync has the following features:
- Sync folders between computers in a local network
- Safe by default : changed and deleted files are first copied to the "archive" folder
- Based on .net 4 : so it will work on old Windows XP computers, good for backup machines.
- Built-in Web UI
- Run as a console app or as a Service
- Single small EXE file to copy deploy
- No database required
- Minimal memory usage
- Ability to zip copy small files : for better network roundtrip performance (syncing pesky folders like node_modules)
- Read Only shares : one way syncing of data from master to client
- Read Write shares : two way syncing
- Ignorable files and folders
- Raspberry Pi support
- Linux support
Installation
To run TorpedoSync, just execute the file; it will start up and bring up the Web UI in your browser.
Alternatively, you can install TorpedoSync as a Windows service via torpedosync.exe /i and uninstall it with torpedosync.exe /u.
Working with TorpedoSync
On the machine that you want to share a folder from, click on the add button under the shares tab:
Enter the required values, and copy whichever of the tokens you want other machines to use to connect.
On the client machines, open the connections tab and click on the connect to button:
Enter the required values and save, the client will wait for confirmation from the master:
Under the connections list, click on the confirm button next to the client's machine name, after which the client will start syncing.
To see the progress you can click on the connection share name to see the connection information or you can switch to the logs tab.
You can view and change the TorpedoSync configuration via the settings tab.
The Differences to Competitors
TorpedoSync takes a different approach from other file sync applications like resilio and syncthing in that it does not compute a cryptographic hash for each file, and is consequently much faster and uses fewer CPU cycles.
No relational database is used to store the file information, so again performance is higher and the storage footprint is much smaller.
TorpedoSync only works in a local network environment and does not sync across the internet, so you are potentially much safer, and your data is contained within your own network.
Definitions
To understand what is implemented in TorpedoSync, you must be familiar with the following definitions (a rough sketch of these as data structures follows the list):
- Share : the folder you want to share
- State : the list of files and folders for a share
- Delta : the result of comparing two states, which gives the lists of added, deleted and changed files and of deleted folders
- Queue : the list of files to download for a delta
- Connection :
  - Master machine : the computer which defined the share; the delta computation happens here, even in read/write mode
  - Client machine : the computer connecting to a master, which initiates a sync
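To make these concrete, here is a minimal sketch of how a State and a Delta could be represented (the type and field names are illustrative, not the actual TorpedoSync classes):

```csharp
using System;
using System.Collections.Generic;

// Illustrative only -- not the actual TorpedoSync types.
public class FileRec
{
    public string Path;        // path relative to the share root
    public long Size;
    public DateTime LastWrite; // timestamps are compared instead of hashes
}

public class State
{
    public List<FileRec> Files = new List<FileRec>();
    public List<string> Folders = new List<string>();
}

public class Delta
{
    public List<string> Added = new List<string>();
    public List<string> Changed = new List<string>();
    public List<string> DeletedFiles = new List<string>();
    public List<string> DeletedFolders = new List<string>();
}
```

Note that a file carries a last-write timestamp rather than a hash; this is the basis for the speed difference mentioned above.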
Sync Rules
- Master is the machine that defined the "share"
- Defaults to moving older or deleted files to the "old" folder for backup
- First time sync in Read/Write mode will copy all files to both ends
- Only clients initiate a sync, even in read/write mode
- Delta processing is done on the master
- If a file is OLDER or LOCKED -> it is put in the error queue and skipped, to be retried in a later sync
- If NOTFOUND -> it is put in the error queue
How it works
For sync to work, the machines on both ends of the connection should be roughly time synchronized, especially when 2 way or read/write syncing is performed.
In the case of syncing My Documents between your own machines, this is not very important since you would not be changing files at the same time on both machines. If this does happen, you can recover your files from the .ts\old folder in your share directory.
As stated, there are two types of synchronization: read only and read/write.
Read only / 1 way Sync
The client in this mode does the following:
- Generates a State for the folder, i.e. gets the list of files and folders for the share.
- Sends this State to the server and waits for a Delta from the server.
- Given the deleted files and deleted folders in the Delta, it moves them to .ts\old.
- Given the added and changed files in the Delta, it queues them for downloading.
The server in this mode does the following:
- Waits for a State from a client.
- Gets the current State of the share.
- Computes a Delta for the client based on the two states above and sends it to the client.
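As a rough sketch of the master-side comparison, building on the illustrative types above: files are matched by path and compared by last-write time and size, since no cryptographic hashes are involved.

```csharp
using System.Linq;

// Hypothetical sketch of the master-side Delta computation for a read only share.
static class DeltaSketch
{
    public static Delta ComputeDelta(State client, State master)
    {
        var delta = new Delta();
        var clientFiles = client.Files.ToDictionary(f => f.Path);
        var masterFiles = master.Files.ToDictionary(f => f.Path);

        foreach (var mf in master.Files)
        {
            FileRec cf;
            if (!clientFiles.TryGetValue(mf.Path, out cf))
                delta.Added.Add(mf.Path);            // client does not have it
            else if (mf.LastWrite != cf.LastWrite || mf.Size != cf.Size)
                delta.Changed.Add(mf.Path);          // differs on the master
        }

        foreach (var cf in client.Files)
            if (!masterFiles.ContainsKey(cf.Path))
                delta.DeletedFiles.Add(cf.Path);     // deleted on the master

        foreach (var folder in client.Folders)
            if (!master.Folders.Contains(folder))
                delta.DeletedFolders.Add(folder);

        return delta;
    }
}
```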
Read Write / 2 way Sync
The client in this mode does the following:
- Generates a State for the folder, i.e. gets the list of files and folders for the share.
- Creates a Delta from the above State and the last stored State, if it exists.
- Clears the added and changed files in the above Delta.
- Sends the current State and the Delta to the server and waits for a Delta from the server.
- Given the deleted files and deleted folders in the Delta, it moves them to .ts\old.
- Given the added and changed files in the Delta, it queues them for downloading.
The server in this mode does the following:
- Removes the files defined in the Delta sent from the client.
- Compares the current State and the last saved State on the master to create the Delta of deleted files and folders for the client.
- Processes the client State against the master State to get the changed and added Delta for the master.
- Processes the master State against the client State to get the changed and added Delta for the client.
- Returns the client Delta to the client.
- Queues the master Delta on the master for downloading.
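Condensed into code, the client side of a two-way sync might look roughly like this; the helper names are made up for illustration, and ComputeDelta is the sketch from the previous section:

```csharp
using System.Collections.Generic;

// Illustrative sketch of the read/write client steps; the helper methods
// are placeholders for the prose steps above, not real TorpedoSync code.
static class TwoWaySketch
{
    public static void Sync(string sharePath)
    {
        State current = ScanFolder(sharePath);   // 1. generate the current State
        State last = LoadLastState(sharePath);   // 2. last stored State (null on first sync)

        // 3. keep only the local deletions; added/changed are cleared before sending
        Delta local = last == null ? new Delta() : DeltaSketch.ComputeDelta(last, current);
        local.Added.Clear();
        local.Changed.Clear();

        // 4. send State + Delta, receive the Delta this client must apply
        Delta remote = SendToMaster(current, local);

        ArchiveToOld(remote.DeletedFiles, remote.DeletedFolders); // 5. move to .ts\old
        QueueForDownload(remote.Added, remote.Changed);           // 6. queue downloads
    }

    // placeholder stubs so the sketch compiles
    static State ScanFolder(string path) { return new State(); }
    static State LoadLastState(string path) { return null; }
    static Delta SendToMaster(State s, Delta d) { return new Delta(); }
    static void ArchiveToOld(List<string> files, List<string> folders) { }
    static void QueueForDownload(List<string> added, List<string> changed) { }
}
```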
Downloading files
Once a Delta is generated, the deleted files and folders defined in it are moved to .ts\old (preserving the exact folder structure of the original files) and the added and changed files are queued for downloading.
The downloader processes the queue in one of two ways:
- If the files in the queue are smaller than 200kb (a configuration) and the total size of these files is less than 100mb (another configuration), they are grouped together and downloaded as a single zip file from the master; this gives better roundtrip performance for smaller files.
- If a file is larger than 200kb, it is downloaded individually, chunked in 10mb blocks.
The download block size is configurable and should reflect the bandwidth of the network connection, i.e. for wired networks 10mb should be fine, while for wireless networks 1mb would be better (1 block per second).
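A sketch of this decision logic, with the thresholds from the text (the helper methods are hypothetical placeholders):

```csharp
using System.Collections.Generic;
using System.Linq;

// Sketch of the batching decision; DownloadAsZip/DownloadFile are placeholders.
static class DownloaderSketch
{
    public static void ProcessQueue(List<FileRec> queue)
    {
        const long smallFile = 200 * 1024;          // 200kb (a configuration)
        const long zipLimit = 100L * 1024 * 1024;   // 100mb (BatchZipFilesUnderMB)
        const int blockSize = 10 * 1024 * 1024;     // 10mb (DownloadBlockSizeMB)

        var small = queue.Where(f => f.Size < smallFile).ToList();

        if (small.Count > 0 && small.Sum(f => f.Size) < zipLimit)
            DownloadAsZip(small);                   // many small files, one round trip
        else
            foreach (var f in small)
                DownloadFile(f, blockSize);

        foreach (var f in queue.Where(f => f.Size >= smallFile))
            DownloadFile(f, blockSize);             // individually, in 10mb blocks
    }

    static void DownloadAsZip(List<FileRec> files) { /* request one zip from the master */ }
    static void DownloadFile(FileRec f, int blockSize) { /* chunked download */ }
}
```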
The download outcome can be one of the following:
- Ok : the data is ok and is written to disk.
- Not found : the requested file was not found on the master, and the request is ignored.
- Older : the file on the master is older than the one requested; the request is put on the error queue.
- Locked : the file on the master is locked and cannot be downloaded; the request is put on the error queue.
The error queue is no more than a container for failed files and is reset, since the files in question will probably be picked up in the next sync anyway.
The internals
The source code is composed of the following:

| Class | Description |
|---|---|
| TorpedoSyncServer | The main class that loads everything and processes TCP requests |
| TorpedoWeb | The Web API handler derived from CoreWebServer |
| ServerCommands | Static class for server side handling of TCP requests like syncing, zip creation, etc. |
| ClientCommands | Static class for sending TCP client requests |
| CoreWebServer | Web server code based on HttpListener, designed as an abstract class |
| DeltaProcessor | Static class for State and Delta handling and computation |
| QueueProcessor | The main class which handles a share connection, queueing downloads, downloading files and new sync initialization |
| UDP | Contains the UDP network discovery code which maintains the currently connected machine addresses |
| NetworkClient, NetworkServer | Modified TCP code from RaptorDB for lower memory usage |
| Globals | Global configuration settings |
The files and folders
Once a Share is created on the master, TorpedoSync creates a .ts folder in the shared folder; the "old archive" files go into a subfolder of it named old. Also in the .ts folder there is a file named .ignore, which you can use to define which files and folders to ignore in the sync process (i.e. they will not be transferred).
Besides the EXE, TorpedoSync creates a log folder for logs and a Computers folder for connection information. Inside the Computers folder, a folder is created for each connected machine (named after that machine), which stores the state, config and download queue for each share that the computer is connected to.
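For example, with a hypothetical machine named SOMEMACHINE connected to a share named docs, the layout on disk would look roughly like this:

```
TorpedoSync.exe
settings.config
log\                      <- log files
Computers\
    SOMEMACHINE\          <- one folder per connected machine
        docs.config       <- share connection configuration
        docs.state        <- last file/folder list (read/write mode)
        docs.queue        <- pending downloads
        docs.queueerror   <- failed downloads
        docs.info         <- share statistics and metrics
```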
.ignore file
Within the share folder, inside the special .ts directory, you have this special file which contains the list of files and folders you want to ignore when syncing. Each line is an ignore item and follows the rules below:
- # : anything after this in the line is a comment
- * or ? : indicates a wildcard, which will be applied to match filenames
- \ : indicates a directory separator character
- anything else will match the path
For example:
# comment line
obj # any files and folders under 'obj'
*.suo # any file ending in .suo
code\cache # any files and folders inside 'code\cache'
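A plausible reading of these rules in code (this is an illustration, not the actual TorpedoSync matcher):

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical interpretation of one .ignore line against a relative path.
static class IgnoreSketch
{
    public static bool IsIgnored(string relativePath, string line)
    {
        int hash = line.IndexOf('#');               // strip comments
        if (hash >= 0) line = line.Substring(0, hash);
        string pattern = line.Trim();
        if (pattern.Length == 0) return false;

        if (pattern.IndexOf('*') >= 0 || pattern.IndexOf('?') >= 0)
        {
            // wildcard rule: matched against the filename
            string rx = "^" + Regex.Escape(pattern)
                                   .Replace(@"\*", ".*")
                                   .Replace(@"\?", ".") + "$";
            return Regex.IsMatch(Path.GetFileName(relativePath), rx, RegexOptions.IgnoreCase);
        }
        // anything else matches the path
        return relativePath.IndexOf(pattern, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```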
settings.config file
The settings.config file resides beside the TorpedoSync EXE file and stores the following configuration parameters in JSON format:
| Parameter | Description |
|---|---|
| UDPPort | UDP port for network discovery (default 21000) |
| TCPPort | TCP port for syncing (default 21001) |
| WebPort | Web UI port (default 88) |
| LocalOnlyWeb | Web UI only accessible on the same machine (default true) |
| DownloadBlockSizeMB | Block size for breaking up file downloads, i.e. if you are on a wireless network set this to 1 to match the transfer rate of the medium (default 10) |
| BatchZip | Batch zip download files (default true) |
| BatchZipFilesUnderMB | Batch zip download files if their total size is under this value (default 100) |
| StartWebUI | Start the Web UI if running from the console (default true) |
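Putting the defaults from the table together, a settings.config left at its defaults would look roughly like this (the exact formatting may differ):

```json
{
    "UDPPort": 21000,
    "TCPPort": 21001,
    "WebPort": 88,
    "LocalOnlyWeb": true,
    "DownloadBlockSizeMB": 10,
    "BatchZip": true,
    "BatchZipFilesUnderMB": 100,
    "StartWebUI": true
}
```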
sharename.config file
This file stores the configuration for the share connections in JSON format:
| Parameter | Description |
|---|---|
| Name | Name of the share |
| Path | Local path to the share folder |
| MachineName | Machine name to connect to |
| Token | Token for the share |
| isConfirmed | On the master, indicates that the share is confirmed and ready to send data |
| ReadOnly | Indicates that this share is read only |
| isPaused | Whether this share connection is paused |
| isClient | Whether this share connection is a client |
sharename.state file
In 2-way read/write mode, this file stores the last list of files and folders, from which TorpedoSync derives which files and folders were deleted since the last sync.
If this file does not exist (i.e. first time sync), then TorpedoSync assumes that it is the first time and all missing files are transferred to both ends of the connection (no missing files or folders are deleted on either end of the connection).
This file is in a binary json format.
sharename.queue and .queueerror files
These files store the list of files to download from the other machine, and the list of failed downloads.
These files are in a binary json format.
sharename.info file
These files store the share statistics and metrics information, and are in binary json format.
The Development Process
TorpedoSync repurposes some code from RaptorDB, which I initially included as raptordb.common.dll but then decided to copy into the solution so I get one EXE to deploy, which makes things easier for users.
The project is divided into two parts:
- The main c# code
- The Web UI, which builds to a folder in the main code folder
The Web UI is built using vue.js.
So you must have the following to compile the source code:
- Visual Studio 2017 (I use the Community edition)
- Visual Studio Code
- node and npm
- npm install run in the WebUI folder
You can compile the c# code in Visual Studio.
You can debug the Web UI by running WebUI\run.cmd, which will start up a node development server for debugging and fast code compilation in webpack.
Once you are happy with the UI code, do the following:
- WebUI\build.cmd to build a production version of the UI
- WebUI\deploy.cmd to copy the production files to the c# source folder
- Rebuild the c# solution to include the UI code
The pains
The following are the highlights of the pain points encountered when writing this project.
Long filenames
While Windows supports long filenames (>260 characters), .net seems to be lacking in this area (especially .net 4). Such paths especially arise when working with node_modules folders, which nest very deeply and frequently go over the 260 character limit.
To handle this, there are special longfile.cs and longdirectory.cs source files which use the underlying OS directly.
One other pain point I encountered was that even Path.GetDirectoryName() suffers from this, so I had to write a string parser to handle it.
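A minimal sketch of the string-parser idea (illustrative, not the actual longfile.cs code):

```csharp
// Sketch of the string-parsing idea: Path.GetDirectoryName() throws on very
// long paths in .net 4, so the directory part is cut out of the string directly.
static class LongPathSketch
{
    public static string GetDirectoryNameLong(string path)
    {
        if (string.IsNullOrEmpty(path)) return null;
        int i = path.LastIndexOfAny(new[] { '\\', '/' });
        return i <= 0 ? null : path.Substring(0, i);
    }
}
```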
UDP response from multiple machines
My initial UDP code didn't seem to get responses from all the other machines in the network, which took me a really long time to figure out; the problem was that I was closing the UDP connection and not reading all the responses in the stream.
The solution was to stay in a while loop and process not just the first response but any further responses, until a timeout.
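A minimal sketch of the fix, assuming a UdpClient that has already sent its discovery broadcast:

```csharp
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;

// Sketch of the fix: keep reading replies until a timeout, instead of
// taking only the first one. (Illustrative, not the actual UDP class.)
static class DiscoverySketch
{
    public static List<IPEndPoint> CollectResponses(UdpClient udp)
    {
        var found = new List<IPEndPoint>();
        udp.Client.ReceiveTimeout = 2000;           // give up after 2s of silence
        try
        {
            while (true)
            {
                var from = new IPEndPoint(IPAddress.Any, 0);
                udp.Receive(ref from);              // blocks for the next reply
                found.Add(from);
            }
        }
        catch (SocketException)
        {
            // timeout -> no more machines are responding
        }
        return found;
    }
}
```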
Brain dead dates in the zip format
Now this one took a long, long time to figure out. The problem started when the 2-way sync was downloading files that seemed to have already been downloaded and, worse, they seemed to be going back to the master in a loop.
After much head scratching, the problem seemed to appear when the batch zip feature was used, and going through the code by Jaime Olivares I found this gem in the comments:
DOS Date and time:
MS-DOS date. The date is a packed value with the following format:
- Bits 0-4 : Day of the month (1..31)
- Bits 5-8 : Month (1 = January, 2 = February, and so on)
- Bits 9-15 : Year offset from 1980 (add 1980 to get the actual year)
MS-DOS time. The time is a packed value with the following format:
- Bits 0-4 : Second divided by 2
- Bits 5-10 : Minute (0..59)
- Bits 11-15 : Hour (0..23 on a 24-hour clock)
The problem is that the seconds are divided by 2 when stored. After much research, I found this was a limitation of FAT in MS-DOS back in the day, carried through to the original zip format; later extensions to zip support the full date and time, but unfortunately the code from Jaime didn't have this. It did, however, have a comment section for each zipped file, and I used that to store the date and time of each file until Jaime gets round to supporting the extended zip headers.
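To see why this bites, here is the packing from the comment above in code: the seconds field only has 5 bits, so odd seconds are rounded away and a round-tripped timestamp can differ from the original, making the file look changed.

```csharp
using System;

// DOS time: bits 0-4 = second/2, bits 5-10 = minute, bits 11-15 = hour.
static class DosTimeSketch
{
    public static ushort ToDosTime(DateTime t)
    {
        return (ushort)((t.Hour << 11) | (t.Minute << 5) | (t.Second / 2));
    }

    public static DateTime FromDosTime(DateTime date, ushort dos)
    {
        return date.Date.AddHours(dos >> 11)
                        .AddMinutes((dos >> 5) & 0x3F)
                        .AddSeconds((dos & 0x1F) * 2); // 10:15:31 comes back as 10:15:30
    }
}
```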
Webpack
I'm not a web developer and am only recently getting into it, so figuring out webpack and npm was tricky for me, especially the build process, which takes a long time. Also, configuring webpack for special functionality is really hard and frustrating since I always got errors from the snippets I found on the web, so I basically gave up for the most part.
After some time I found out that if you use npm run dev you get hot code changes in your browser and the development cycle is much shorter and nicer (i.e. you don't have to build after each code change and wait and wait...).
The problem with npm run dev is that the web code runs on port 8080 while my server runs on port 88, so I came up with the following workaround in the debug.js file:
import './main.js'
window.ServerURL = "http://localhost:88/";
This sets a variable pointing to my own web server in the c# code; in the production build I ignore it and just package the main.js file, so the webpack.config.js has the following:
var path = require('path')
var webpack = require('webpack')
module.exports = {
entry: './src/debug.js',
...
if (process.env.NODE_ENV === 'production') {
module.exports.entry = './src/main.js'
module.exports.devtool = '#source-map'
...
The only problem I encountered with this approach is that I can't POST json to my server: I get CORS (cross origin resource sharing) errors in the browser and my requests don't make it to the server in development mode.
Since I only had one place in the code where I posted data, I could work around it, but in a larger project this would be a problem, so a better solution is needed.
The pleasures
The following is a list of what I enjoyed using while working on this project.
Visual Studio Code
At first I used Visual Studio Community 2017 for the Web UI code, but I found that the vue.js support was lacking and not very helpful. So I decided to use Visual Studio Code with the help of some extensions for vue.js, and I have to say that it is phenomenal and very enjoyable.
The general usability and productivity is very high and I absolutely recommend it for web development.
Vue.js
After getting webpack working and support for compiling .vue files in place, I have to say that vue.js is outstanding and simple to code with (.vue files are a great help and simplify the development process).
I shopped around a lot for a "framework" and found vue.js powerful and, most importantly, simple to learn and use for someone who has never worked in JavaScript. I also looked at angular.js, but I found it too complex and difficult.
Appendix v1.1 - pi and ubuntu
The refactoring of the code is done and TorpedoSync can now run on the raspberry pi and on the Ubuntu OS. The animated GIF above is on my pi b2+, which is syncing data from my development machine running Win 10.
To run TorpedoSync on pi and Ubuntu, you must install mono with the following command (the EXE file is the same one that runs on Windows):
sudo apt-get install mono-complete
Appendix v1.2 - npm3
I recently came upon npm3, which is a newer version that can optimize the dependencies for the web ui project; in doing so, my node_modules folder size came down from 76Mb to around 56Mb, saving me 20Mb.
So I switched the build and run scripts accordingly (the usage is the same, just add a 3 to the end).
Appendix v1.5.0 - Parcel Packager
With regular notifications from GitHub regarding security vulnerabilities in some javascript libraries in the node modules for TorpedoSync's web UI, I decided to upgrade webpack from v3 to v4, and what a pain that was: the build scripts failed, and trying to fix them was beyond me, given that the scripts were mostly scrounged from various sources and were beyond me in the first place.
Frustrated by the overly complex webpack packager, I searched around and came across the parcel packager, which is simple and configuration-free and mostly did what I wanted out of the box.
The only problem I have with parcel is that it appends a hash value to the filenames of its outputs; the documentation states this is for the benefit of content delivery networks. I haven't been able to turn this off, and it seems you can't.
This name mangling is a pain when trying to embed the files in TorpedoSync, since you have to edit the project file each time a filename changes.
Other than the above, everything else is wonderful with parcel and I like it a lot since I don't have to fiddle with knobs and configurations.
Appendix v1.6.0 - Rewrite the Web UI in Svelte
Continuing my love affair with svelte, which started with rewriting the RaptorDB Web UI, I decided to do the same with TorpedoSync. While the structure and mentality of vue and svelte are similar, the process of converting required me to create some svelte components from scratch (I could not find existing ones while searching). These components were:
- Movable/resizable modal forms
- A message box replacement for sweet alert; I'm quite proud of the message box control because it is tiny, powerful and really easy to use.
  - It supports Ok, Ok/Cancel, Yes/No, single line input, multi line input and a progress spinner.
The result is quite remarkable:
- vue distribution build:
  - js size = 147 kb
  - total size = 201 kb
- svelte distribution build:
  - js size = 83 kb
  - total size = 91 kb
Some of the highlights:
- I changed the icons from the glyphicons font to the svg font awesome component. While the overall size of the output was smaller, it is not really a good option if you have a lot of icons in your app (the 14 icons used took ~13kb of space while the entire glyph font is ~23kb).
- The node_modules for the svelte project is approximately half the size of vue even with the font awesome icons.
- The rollup javascript packager, while not configuration free like parcel (it has a manageable config file), is only 4mb on disk compared to nearly 100mb for parcel.
- I changed the CSS colours scattered around the source files to use the global CSS var() which allows for themes and on the fly colour changes (currently disabled in the UI).
Previous Releases
You can download previous versions here:
History
- Initial Release : 17th January 2018
- Update v1.1 : 6th February 2018
- code refactoring
- working with mono on pi and ubuntu
- Update v1.2 : 14th February 2018
- bug fix: web page refresh extracting server address
- GetServerIP() retry logic
- changed build/run scripts for npm3
- Update v1.3 : 2nd July 2018
- streamlined tcp network code
- removed retry if master returned null -> redo in next sync (edge case queue stuck and not syncing)
- check if file exists when zipping (no more exceptions in logs)
- Update v1.5.0 :
- moved to using parcel packager instead of webpack
- updated to new zipstorer.cs
- Update v1.6.0 : 20th August 2019
- rewritten the web ui with svelte
- Update v1.7.0 : 10th December 2019
- WEBUI: refactored tabs out of app into own component
- upgrade to fastJSON v2.3.1
- upgrade to fastBinaryJSON v1.6.1
- bug fix zipstorer.cs default not utf8 encoding
- added deleting empty folders
- updated npm packages
- Update v1.7.5 : 18th February 2020
- updated npm packages
- added support for env path variables (%LocalAppData% etc.)
- Update v1.7.6 : 23rd May 2020
- WEBUI using fetch()
- js code cleanup
- modal form default slot error
- network retry logic (thanks to ozz-project)