Monitoring a Folder for Changes in Files and Folders with Node.js

20 Dec 2013 | CPOL | 11 min read
Getting Started with Node.js Part 2

George's Getting Started with Node.js Series

Introduction

My dive into the world of Node.js continues with this article.

I have to say that using JavaScript for more than just working with the DOM has been interesting. Like many developers I know, I primarily use JavaScript for such things as AJAX calls and making web pages a little bit better and more responsive. Up until now, I've never really viewed JavaScript as something I might consider useful for anything beyond UI. I'm really starting to re-evaluate that position.

In this particular article, I start delving into the fs (File System) module of Node.js, which to me is a complete departure from anything I've ever done in JavaScript. I usually do file handling on the server side with some other technology like C# or even PowerShell.

So the requirements are: I need some way to monitor a directory (and possibly all of its child directories) for changes such as files added, files deleted, files modified, folders added, folders removed, etc. This article will focus on that functionality.

Reporting from the Trenches (thoughts 'til now)

While this article doesn't fulfill all the requirements of Sprint 2 of my self-assigned "getting started" project, I think I tried to cover too much in the first article. I'm going to try to correct that going forward by focusing on one item at a time. Hey, you never know, my writing skills might improve as well as my JavaScript/Node skills... stranger things have happened.

I've learned a great deal with just the few modules I've written so far, but there is still so much to learn. Sadly, when using JavaScript as I do now, I rely extremely heavily on jQuery and other frameworks. Doing this has really limited my knowledge of the basics of JavaScript, so, as with the first article, I continue to avoid frameworks until I get a strong handle on both Node and JavaScript. That way, when I do use a framework, it's a time-saver and not a crutch. I'm still not an expert and don't pretend to be, so for those of you reading this article: realize this is done in the "Learn Along" kind of format and not as an expert trying to teach newbies... I am the newbie when it comes to JavaScript/Node, so keep that in mind. Need help with C++, C#, SQL, or Assembly? I'm your man and can answer a lot of questions off the top of my head... JavaScript/Node? I'm going to scratch my head and answer "I don't know, but let's find out together."

One of the things you may notice is that my coding style is changing between articles. I'm trying to align my style and conventions with the majority of JavaScript code out there, but that's fighting against practices and styles I've been using for many, many moons. I've discovered Douglas Crockford's Code Conventions for the JavaScript Programming Language, and I've taken some hints from there... but my own preferences shine through... probably more than some style cops would like.

I've also started running into issues of variable scope, closures, and the fact that JavaScript isn't targeted at OOP in the form I'm currently comfortable with: I'm working on it. Another issue is the realization that Node/JavaScript is single threaded, and that you handle background processes with non-blocking async methods and callback functions. Working on this article's accompanying code made it important to get my head around that pretty quickly. I don't think I've even come close to "mastering" the concept yet, but I'm working on it.

Lastly, before we jump into monitoring directories, I've just started working with GitHub for the first time (I'm used to TFS/SourceSafe). I've put the code for this article on GitHub so people can collaborate or branch it. I'm hoping to get more familiar with git as well.

So... on to the code!

Monitoring (watching) a Directory with Node

The fs module of Node.js provides a method, fs.watch(filename, [options], [listener]) (http://nodejs.org/docs/latest/api/fs.html#fs_fs_watch_filename_options_listener), that is fairly simple to use. The documentation (see link) notes that it's unstable, but basic usage really is straightforward.

Note: Since I'm using the Node.js Tools for Visual Studio (hence Windows; I know, I know, I can hear the hisssssss from some purists), I'm going to assume that you have a folder on your machine (like I do on mine) named "C:\sim". You can change the path to something like "/usr/whatever/whatever/" on Linux, or point to any folder you wish on your machine. The example code and the module I've written work on both CentOS and Windows; you'll just have to modify the paths a tiny bit.

So, using the built-in Node.js fs.watch method to monitor the target folder would look something like this...

JavaScript
// Require the file system module
var fs = require("fs");

// Watch the sim directory; the listener runs on every reported change
fs.watch("c:\\sim", { persistent: true }, function (event, fileName) {
  console.log("Event: " + event);
  console.log(fileName + "\n");
});

Pretty simple right? When something changes in the sim directory, the listener function will log to the console the event that happened and the file name that was affected. So the return is pretty simple as well. If you want more, then you'll have to go get information about the file yourself (which is pretty handy that they provide you the file name on change). Let's look at the output of the above code under the following scenarios.
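Going to get that information yourself could look roughly like the sketch below. It is not part of the article's download: describeChange is a hypothetical helper name, and the only real APIs it relies on are fs.stat and path.join from Node's core modules.

```javascript
var fs = require("fs");
var path = require("path");

// Hypothetical helper: look up details for a name reported by fs.watch.
// Calls back with (err, info); info.exists is false when the path is gone.
function describeChange(dir, fileName, callback) {
  // fs.watch does not always supply a file name, so there may be nothing to stat
  if (!fileName) {
    return callback(null, { name: null });
  }
  fs.stat(path.join(dir, fileName), function (err, stats) {
    if (err && err.code === "ENOENT") {
      // The path no longer exists: it was deleted or renamed away
      return callback(null, { name: fileName, exists: false });
    }
    if (err) {
      return callback(err);
    }
    callback(null, {
      name: fileName,
      exists: true,
      isDirectory: stats.isDirectory(),
      size: stats.size,
      modified: stats.mtime
    });
  });
}
```

Inside the fs.watch listener, you would call describeChange with the watched directory and the fileName argument, then log or act on the info object in the callback.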

1. Adding a File to the Folder

Let's add a file named test.txt to the sim folder and see the results.

Event: rename
test.txt

Event: change
test.txt

Event: change
test.txt

2. Now Let's Rename test.txt to blah.txt

Event: rename
null

Event: rename
blah.txt

Event: rename
blah.txt

3. Now Let's Delete that File

Event: rename
null

4. Now Let's Add a Folder "New Folder"

Event: rename
New Folder

5. Now Let's Rename the Folder to "Test Folder"

Event: rename
null

Event: rename
Test Folder

6. And Finally, Let's Go Ahead and Delete the Folder

Event: rename
null

As you can see, we don't get a lot of useful information. We know that something has happened, the general event, and sometimes the name of the file or folder that was affected. This works for a lot of uses, and you get slightly better results when you watch a single file and not a folder, but it's still not what I need. I need a little bit better information.

Enter DirectoryWatcher

So... seeing as the following conditions apply,

  1. The fs.watch method doesn't exactly do what I want,
  2. I'm attempting to learn how to do this stuff in JavaScript/Node instead of other technologies,
  3. This seems like the perfect learning experience opportunity, and
  4. I'm slightly off my nut... I decided to "roll my own" solution.

The DirectoryWatcher module (attached to this article, and available on GitHub) uses the fs module heavily. It uses a timer in the form of setInterval to control how often the Directory you're monitoring is checked, and can recursively check any child folders. I wouldn't recommend using it in any production application as it is a module written by an admitted newbie, but I think it's a good starting point. You can take a look at the heavily commented code to see how I built it in detail at your leisure.
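The general idea behind a polling approach like this can be sketched as "take a snapshot, wait, take another, and compare." The code below is only an illustration of that idea, not the DirectoryWatcher source: takeSnapshot and diffSnapshots are hypothetical names, and it uses the synchronous fs calls for brevity where a real watcher would prefer the non-blocking versions.

```javascript
var fs = require("fs");
var path = require("path");

// Record the size and modified time of each file directly inside dir.
// Sub-directories are skipped; synchronous calls keep the sketch short.
function takeSnapshot(dir) {
  var snapshot = {};
  fs.readdirSync(dir).forEach(function (name) {
    var stats = fs.statSync(path.join(dir, name));
    if (stats.isFile()) {
      snapshot[name] = { size: stats.size, modified: stats.mtime.getTime() };
    }
  });
  return snapshot;
}

// Compare two snapshots and list which names were added, removed, or changed.
function diffSnapshots(before, after) {
  var result = { added: [], removed: [], changed: [] };
  Object.keys(after).forEach(function (name) {
    if (!before.hasOwnProperty(name)) {
      result.added.push(name);
    } else if (before[name].size !== after[name].size ||
               before[name].modified !== after[name].modified) {
      result.changed.push(name);
    }
  });
  Object.keys(before).forEach(function (name) {
    if (!after.hasOwnProperty(name)) {
      result.removed.push(name);
    }
  });
  return result;
}
```

Calling takeSnapshot from a setInterval callback and diffing the result against the previous snapshot gives a crude version of the behavior described above. Note that a rename shows up as one removal plus one addition, since the snapshots are keyed by name.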

Describing DirectoryWatcher

The DirectoryWatcher module exports an object (named creatively enough) DirectoryWatcher. This object scans a given directory (and children if recursive is set to true) and raises six events based on what it finds. The DirectoryWatcher also exposes a directoryStructure object that represents a view of the directory being monitored as it's scanned.

DirectoryWatcher Constructor

The DirectoryWatcher has a constructor, DirectoryWatcher(root, recursive) where root is the path to the directory you want to monitor, and recursive determines if it monitors child folders (true = monitors children, false = it doesn't).

So when you create a new object of DirectoryWatcher, you would do so like this.

JavaScript
// Imports / Requires
var dirwatch = require("./modules/DirectoryWatcher.js");

// Create a monitor object that will watch a directory
// and all its sub-directories (recursive). In this case
// we'll assume you're on a Windows machine with a folder
// named "sim" on your c: drive.
// Should work on both Linux and Windows; update the path
// to some appropriate test directory of your own.
// You can monitor only a single folder and none of its child
// directories by simply changing the recursive parameter
// to false.
var simMonitor = new dirwatch.DirectoryWatcher("C:\\sim", true);

DirectoryWatcher Properties

The DirectoryWatcher object has the following properties:

  1. root: The base directory path of the directory being monitored
  2. recursive: Determines if the children folders of that directory are monitored as well as the root
  3. directoryStructure: An object representing the scanned structure and details about its files
  4. timer: The current timer for this instance of the object
  5. suppressInitialEvents: Determines if events are fired the first time the DirectoryWatcher scans a directory (when .start() is called)

Exposed Methods of DirectoryWatcher

The DirectoryWatcher object exposes the following methods:

  1. scanDirectory(dir, suppressEvents): The primary scanning method. Tries to be as non-blocking as possible. Scans a given directory, then attempts to record each file in the directory.
    • dir = the directory to scan
    • suppressEvents = Suppress any events that would be raised this scan iteration
      • true = Events will be suppressed
      • false = Events will be raised
      Note: I don't really see a reason to use this method directly in most circumstances; use start(interval) and stop() instead.
  2. start(interval): Starts this instance of the DirectoryWatcher monitoring the given root path (set when the object was created) and defines the interval to check for changes.
    • interval = Time (in milliseconds) between checks for update for the given monitored directory.
  3. stop(): Stops this instance of the DirectoryWatcher from watching for changes
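A start/stop pair built on setInterval and clearInterval can be sketched as follows. IntervalScanner is a hypothetical stand-in written for this illustration and is not the DirectoryWatcher code; the scan function it receives is whatever work should run on each tick.

```javascript
// Hypothetical stand-in for a watcher: runs the supplied scan function on a timer.
function IntervalScanner(scan) {
  this.scan = scan;
  this.timer = null;
}

// Begin calling scan every `interval` milliseconds.
// Any timer that is already running is cancelled first.
IntervalScanner.prototype.start = function (interval) {
  this.stop();
  this.timer = setInterval(this.scan, interval);
};

// Cancel the pending timer, if there is one, so no further scans run.
IntervalScanner.prototype.stop = function () {
  if (this.timer !== null) {
    clearInterval(this.timer);
    this.timer = null;
  }
};
```

Keeping the timer handle on the object is what lets stop() cancel it later, and it also means a second call to start() does not leave two timers running.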

Events the DirectoryWatcher will Raise

The DirectoryWatcher currently raises six events, those events are:

  1. fileAdded: Raised when a file is added to a monitored folder. It also passes back a FileDetail object that describes the file that has been added.
  2. fileChanged: Raised when a file has changed. It also passes back a FileDetail object describing the file that was changed and an associative array object that describes exactly what has changed.
  3. fileRemoved: Raised when a file has been removed. Also passes back the full path to the file that was removed.
  4. folderAdded: Raised when a folder has been added to a monitored directory, also passes back the path to the folder that was just added. (only fires in recursive mode)
  5. folderRemoved: Raised when a folder has been removed from a monitored directory or one of its children, will also pass back the path to the folder that was removed. (only fires in recursive mode)
  6. scannedDirectory: Raised every time a directory has been completely scanned (this one fires a lot)
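One common way for an object to raise events like these in Node is to inherit from the core EventEmitter. The sketch below assumes that approach purely for illustration; TinyWatcher and reportAdded are hypothetical names, and the real module may wire its events differently.

```javascript
var events = require("events");
var util = require("util");

// Hypothetical minimal watcher that only knows how to raise events.
function TinyWatcher(root) {
  events.EventEmitter.call(this);
  this.root = root;
}
util.inherits(TinyWatcher, events.EventEmitter);

// Raise fileAdded with a small detail object, as a scan might
// do after finding a file it has not seen before.
TinyWatcher.prototype.reportAdded = function (fullPath) {
  this.emit("fileAdded", { fullPath: fullPath });
};
```

Anything created with new TinyWatcher(...) then supports .on("fileAdded", ...), and each listener receives whatever arguments were passed to emit after the event name.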

Associated Object: FileDetail

There is an associated object (included in the module) called FileDetail. This object is passed back from the fileAdded and fileChanged events, and is also contained within the directoryStructure object to represent files in the directory. The FileDetail object has the following properties:

  1. directory: parent directory of the file
  2. fullPath: the entire path including directory of the file.
  3. fileName: just the name of the file without the path
  4. size: the size in bytes of the file
  5. extension: the extension of the file (.js, .txt, etc)
  6. accessed: the last accessed date of the file
  7. modified: the last modified date of the file
  8. created: the creation date of the file
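Most of these properties can be derived from a path plus the fs.Stats object that fs.stat returns. Here is a minimal sketch under that assumption; describeFile is a hypothetical name, and mapping created to stats.birthtime is my own choice for the example, so the module itself may use a different field.

```javascript
var path = require("path");

// Build a plain object with the same property names as the list above,
// from a full path and an fs.Stats-like object.
function describeFile(fullPath, stats) {
  return {
    directory: path.dirname(fullPath),
    fullPath: fullPath,
    fileName: path.basename(fullPath),
    size: stats.size,
    extension: path.extname(fullPath),
    accessed: stats.atime,
    modified: stats.mtime,
    created: stats.birthtime // assumption: birthtime stands in for "created"
  };
}
```

With a real file, the second argument would come from fs.stat or fs.statSync on the same path.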

The FileDetail object also has a method compareTo(fileDetail) that compares the current FileDetail object to the one that is passed in and returns information about if it's different and HOW it's different. The object it returns has a property different which is a boolean indicating if the files are different and a property differences which is an associative array of differences found.
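A comparison of this shape can be sketched as a stand-alone function. compareDetails is a hypothetical name, and the choice of which properties to compare is an assumption made for this example; the baseValue and comparedValue property names follow the way the changes object is read in the sample application later in this article.

```javascript
// Hypothetical stand-alone comparison of two FileDetail-like objects.
function compareDetails(base, compared) {
  var result = { different: false, differences: {} };
  ["size", "accessed", "modified", "created"].forEach(function (key) {
    var a = base[key];
    var b = compared[key];
    // Dates are compared by timestamp, because two distinct Date
    // objects are never === even when they hold the same time.
    var same = (a instanceof Date && b instanceof Date)
      ? a.getTime() === b.getTime()
      : a === b;
    if (!same) {
      result.different = true;
      result.differences[key] = { baseValue: a, comparedValue: b };
    }
  });
  return result;
}
```

Only the properties that actually differ end up in differences, so a listener can loop over its keys to report exactly what changed.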

Putting It All Together (Using the Code)

Ok, now that I've described the module above, let's illustrate how this module can be used in a very simple example application. The example will mimic the application we started with that just used the fs.watch method.

The major differences in use are that we'll have to capture events instead of passing through to a listener function... see below code.

Note: I assume that the DirectoryWatcher module is in a folder named "modules" in the same directory as the app.js code. (This example is contained in the attached code files and GitHub link.)

JavaScript
// Imports / Requires
var dirwatch = require("./modules/DirectoryWatcher.js");

// Create a monitor object that will watch a directory
// and all its sub-directories (recursive). In this case
// we'll assume you're on a Windows machine with a folder
// named "sim" on your c: drive.
// Should work on both Linux and Windows; update the path
// to some appropriate test directory of your own.
// You can monitor only a single folder and none of its child
// directories by simply changing the recursive parameter
// to false.
var simMonitor = new dirwatch.DirectoryWatcher("C:\\sim", true);

// start the monitor and have it check for updates
// every half second.
simMonitor.start(500);

// Log to the console when a file is removed
simMonitor.on("fileRemoved", function (filePath) {
  console.log("File Deleted: " + filePath);
});

// Log to the console when a folder is removed
simMonitor.on("folderRemoved", function (folderPath) {
  console.log("Folder Removed: " + folderPath);
});

// log to the console when a folder is added
simMonitor.on("folderAdded", function (folderPath) {
  console.log("Folder Added: " + folderPath);
});

// Log to the console when a file is changed.
simMonitor.on("fileChanged", function (fileDetail, changes) {
  console.log("File Changed: " + fileDetail.fullPath);
  for (var key in changes) {
    console.log("  + " + key + " changed...");
    console.log("    - From: " + ((changes[key].baseValue instanceof Date) ? 
    changes[key].baseValue.toISOString() : changes[key].baseValue));
    console.log("    - To  : " + ((changes[key].comparedValue instanceof Date) ? 
    changes[key].comparedValue.toISOString() : changes[key].comparedValue));
  }
});

// log to the console when a file is added.
simMonitor.on("fileAdded", function (fileDetail) {
  console.log("File Added: " + fileDetail.fullPath);
});

// Let us know that directory monitoring is happening and where.
console.log("Directory Monitoring of " + simMonitor.root + " has started");

Now, let's run the above sample application and perform the same operations in the sim directory as we did before...

1. Add a File named test.txt to the sim Folder

Directory Monitoring of C:\sim has started
File Added: C:\sim\test.txt

2. Now Let's Rename test.txt to blah.txt

File Added: C:\sim\blah.txt
File Deleted: C:\sim\test.txt

3. Now Let's Delete that File

File Deleted: C:\sim\blah.txt

4. Now let's add a folder "New Folder"

Folder Added: C:\sim\New Folder

5. Now Let's Rename the Folder to "Test Folder"

Folder Removed: C:\sim\New Folder
Folder Added: C:\sim\Test Folder

6. Let's Go Ahead and Delete the Folder

Folder Removed: C:\sim\Test Folder

7. Finally Let's Add test.txt Back to the sim Folder and then Modify it

File Changed: C:\sim\test.txt
  + size changed...
    - From: 51
    - To  : 112
  + modified changed...
    - From: 2013-12-21T03:15:02.000Z
    - To  : 2013-12-21T03:15:31.000Z

I think that's a little more informative than before. The performance hit might be a little larger, but for a first try... I can live with that.

From here, I could build an application that monitors a directory, reports when something is changed, or use this object to build an application that waits for a file to be dropped in a directory before processing it.
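The "wait for a file to be dropped" idea might be wired up roughly like this. onFileDropped is a hypothetical helper written for this sketch; it only assumes that the watcher exposes .on and raises fileAdded with an object carrying the extension and fullPath properties described earlier.

```javascript
// Hypothetical helper: call handler for each added file with the given extension.
// watcher can be anything that emits "fileAdded" with a FileDetail-like object.
function onFileDropped(watcher, extension, handler) {
  watcher.on("fileAdded", function (fileDetail) {
    if (fileDetail.extension === extension) {
      handler(fileDetail.fullPath);
    }
  });
}
```

With the sample application above, that would be something like onFileDropped(simMonitor, ".csv", processFile), where ".csv" and processFile are placeholders for whatever you actually want to handle.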

I can think of a great number of uses. I'm sure you can too. The module would have to be refactored a little bit and error handling added, but it could work.

Final Thoughts

I enjoyed building the DirectoryWatcher and I hope that this article describing it was helpful to you and that the code attached helps you learn along with me.

I still plan on trying to do two of these articles a month (work and children permitting), and the next article will still be in the fs module "space". Even this small module took a week and a half of my free time, so I won't commit to more than two a month. However, publicly stating my release commitments should keep my nose to the grindstone.

Enjoy, and [merry, happy, undefined] $holidayName to all of you.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)