Using diff -e Option to Create a Baseline diff File

23 Jun 2018, CPOL, 2 min read
How to use the diff -e option to create a baseline diff file

Introduction

For a service I am working on, I need to maintain a baseline: a CSV file of demographic information in which each row represents one person. A cron job periodically creates a temporary CSV file containing updated and newly added persons. My service processes that temporary file and updates the baseline so that it reflects any changed data, added persons, and removed persons.

This article shows how to use the diff -e option to generate an ed script, and how to combine that script with ed and a few other commands to create a new baseline file.

Bash Script & Variables

The script is run in a bash shell, and I create variables to reference the baseline and temporary files:

Bash
#!/bin/bash

NOW=$(date +"%Y%m%d%H%M") # Timestamp for creating files.
BASELINE=baseline/baseline.csv
TEMP=temp/compare.csv

In the above script, the variable $NOW holds a timestamp in the format yyyymmddhhmm (year, month, day, hour, minute), which is handy whenever you want to record when a file was created. The variable $BASELINE is the path to the baseline file and $TEMP is the path to the temporary file.
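
Since every later step depends on both files being present, it can help to check for them up front. The following is a minimal sketch, not part of the original script: require_file is a hypothetical helper name, and the paths are assigned directly on the assumption that the directory layout matches the one used in this article.

```shell
#!/bin/bash
# Sketch: fail early if either input file is missing or unreadable.
# require_file is a hypothetical helper, not part of the original script.
require_file() {
    if [ ! -r "$1" ]; then
        echo "Missing or unreadable file: $1" >&2
        return 1
    fi
}

NOW=$(date +"%Y%m%d%H%M")       # 12 digits: yyyymmddhhmm
BASELINE=baseline/baseline.csv  # path assumed from the article's layout
TEMP=temp/compare.csv           # path assumed from the article's layout

if require_file "$BASELINE" && require_file "$TEMP"; then
    echo "Inputs found, timestamp $NOW"
else
    echo "Aborting: inputs not ready" >&2
fi
```

In a real run, the else branch would typically exit with a non-zero status so that the cron job's failure is visible.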

Creating ed Script

The following line uses the -e option with diff to create an ed script:

Bash
diff -e "$BASELINE" "$TEMP" > ed-script

The resulting file, “ed-script”, contains the ed commands that transform the baseline's contents into those of the temporary file.
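
To get a feel for what such a script contains, here is a small sketch that runs diff -e on two throwaway files. The sample rows and the file names old.csv and new.csv are made up for illustration; they are not the real demographic data.

```shell
#!/bin/bash
# Sketch: build two tiny sample files (made-up data) and show the ed script
# that diff -e produces for them.
workdir=$(mktemp -d)
printf '%s\n' '1,Alice,30' '2,Bob,41' '3,Carol,25' > "$workdir/old.csv"
printf '%s\n' '1,Alice,30' '2,Bob,42' '3,Carol,25' '4,Dave,50' > "$workdir/new.csv"

# diff exits 0 when the files match, 1 when they differ, and 2 on trouble,
# so capture the status instead of treating 1 as a failure.
diff -e "$workdir/old.csv" "$workdir/new.csv" > "$workdir/ed-script"
status=$?

cat "$workdir/ed-script"
echo "diff exit status: $status"
```

The script lists its commands from the bottom of the file upward (here the append after line 3 comes before the change to line 2), so that earlier edits do not shift the line numbers used by later ones. Note also that it contains no w command, which is why one has to be supplied separately when the script is fed to ed.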

Creating New Baseline

Then, to create a new baseline with the ed script, run the following commands:

Bash
cp "$BASELINE" baseline/new_baseline.csv
(cat ed-script && echo w) | ed - baseline/new_baseline.csv

I’ve shown two commands. The first copies the original baseline so that the original is not overwritten yet; the second creates the new baseline. The (cat ed-script && echo w) part writes the contents of ed-script to standard output, followed by a w command that tells ed to write the file. The w has to be added because the output of diff -e contains only editing commands. The combined stream is piped into ed, which applies it to baseline/new_baseline.csv. The - argument is the older spelling of -s and suppresses ed's byte-count output.
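
Because the patched copy should end up identical to the temporary file, the result can be verified before it is trusted. The following is a sketch under that assumption; apply_ed_script is a hypothetical helper name, and it uses ed -s and an explicit q command where the commands above use ed - and rely on end of input.

```shell
#!/bin/bash
# Sketch: apply an ed script to a copy of a file and confirm the result.
# apply_ed_script is a hypothetical helper name, not from the original script.
apply_ed_script() {
    local script=$1 original=$2 target=$3 expected=$4

    # Work on a copy so the original stays untouched.
    cp "$original" "$target" || return 1

    # Feed the script, then "w" (write) and "q" (quit), to ed.
    # -s suppresses the byte counts ed would otherwise print.
    { cat "$script"; echo w; echo q; } | ed -s "$target" || return 1

    # cmp -s prints nothing and returns 0 only if the files are identical.
    cmp -s "$target" "$expected"
}
```

With the variables from this article, a call might look like apply_ed_script ed-script "$BASELINE" baseline/new_baseline.csv "$TEMP", with the swap in the next step performed only if it returns success.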

Backing Up Old Baseline

It’s a good idea to archive (or back up) the old baseline in case anything goes wrong:

Bash
mv "$BASELINE" "archive/baseline_$NOW.csv"
mv baseline/new_baseline.csv "$BASELINE"

The first command moves the original baseline to the archive folder and adds a timestamp to its filename. The second mv renames the new file to baseline.csv, which completes the creation of the new baseline.
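
If the first mv succeeds and the second does not, the service is left without a baseline, so the swap is worth guarding. Below is a minimal sketch; promote_baseline is a hypothetical helper name, and the check for a non-empty candidate assumes that an empty baseline is never valid, which may not hold for every data set.

```shell
#!/bin/bash
# Sketch: archive the old baseline and promote the new one,
# stopping at the first failure.
# promote_baseline is a hypothetical helper name, not from the original script.
promote_baseline() {
    local baseline=$1 candidate=$2 archive_dir=$3 stamp=$4

    # Assumption: a missing or empty candidate is never a valid baseline.
    if [ ! -s "$candidate" ]; then
        echo "No usable candidate: $candidate" >&2
        return 1
    fi

    # Create the archive folder if it does not exist yet.
    mkdir -p "$archive_dir" || return 1

    # Archive the old baseline, then move the candidate into its place.
    mv "$baseline" "$archive_dir/baseline_$stamp.csv" || return 1
    mv "$candidate" "$baseline"
}
```

With the variables from this article, the call would be promote_baseline "$BASELINE" baseline/new_baseline.csv archive "$NOW".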

Entire Script

For reference, here is the entire script:

Bash
#!/bin/bash

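NOW=$(date +"%Y%m%d%H%M") # Timestamp for creating files.
BASELINE=baseline/baseline.csv
TEMP=temp/compare.csv

# Generate the ed script that turns the baseline into the temporary file.
diff -e "$BASELINE" "$TEMP" > ed-script

# Apply the ed script to a copy of the baseline.
cp "$BASELINE" baseline/new_baseline.csv
(cat ed-script && echo w) | ed - baseline/new_baseline.csv

# Archive the old baseline and promote the new one.
mv "$BASELINE" "archive/baseline_$NOW.csv"
mv baseline/new_baseline.csv "$BASELINE"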

Enjoy!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)