For a service I am working on, I need to maintain a baseline: a CSV file containing demographic information. The baseline file has many rows, each of which represents one person's demographic information. A cron job periodically creates a temporary CSV file with updated information or added persons. My service processes the new temporary file and updates the baseline to include any changed data or added persons, and to drop any persons who should be removed from the baseline.
This article shows how to use the diff -e option to generate a script for the ed editor, and how to combine that script with ed and a few other commands to create a new baseline file.
Bash Script & Variables
The script is run in a bash shell, and I create variables to reference the baseline and temporary files:
#!/bin/bash
NOW=$(date +"%Y%m%d%H%M") # Timestamp for creating files.
BASELINE=`ls baseline/baseline.csv`
TEMP=`ls temp/compare.csv`
In the above script, the variable $NOW is used for timestamping files. The timestamp format is “yyyymmddhhmm”, which is handy whenever you want to keep track of when a file was created. The variable $BASELINE is the baseline file and $TEMP is the temporary file.
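As a small illustration of the variables above, the following sketch builds the same timestamp and shows one way to stop early when an input file is missing. The helper names make_timestamp and require_file are hypothetical and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helpers (assumptions, not part of the original script).

# Print a timestamp in the same yyyymmddhhmm format used for $NOW.
make_timestamp() {
    date +"%Y%m%d%H%M"
}

# Return non-zero (with a message on stderr) if the given file is missing.
require_file() {
    if [ ! -f "$1" ]; then
        echo "missing file: $1" >&2
        return 1
    fi
}

NOW=$(make_timestamp)
echo "timestamp: $NOW"   # 12 digits, for example 202401151230

# Possible use in the real script (paths taken from the article):
#   require_file baseline/baseline.csv || exit 1
#   require_file temp/compare.csv || exit 1
```

Checking the inputs up front avoids running the later steps with an empty variable, which is what the backtick ls form yields when a file does not exist.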
Creating ed Script
The following line uses the
-e option with diff to create an ed script:
diff -e $BASELINE $TEMP > ed-script
The file “ed-script” now contains the ed commands (append, change and delete) that turn the baseline into the temporary file.
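To make the contents of such a script concrete, here is a minimal sketch that runs diff -e on two tiny files. The rows and the temporary directory are made up for illustration; only the diff -e call itself comes from the article.

```shell
#!/bin/bash
# Illustrative data only: the rows below are invented for this example.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf '%s\n' '1,Alice,30' '2,Bob,41' '3,Carol,25' '4,Dan,52' > "$workdir/baseline.csv"
printf '%s\n' '1,Alice,31' '2,Bob,41' '4,Dan,52' '5,Eve,29' > "$workdir/compare.csv"

# diff exits with 0 when the files are identical, 1 when they differ,
# and 2 on trouble (for example, a missing file).
status=0
diff -e "$workdir/baseline.csv" "$workdir/compare.csv" > "$workdir/ed-script" || status=$?

if [ "$status" -gt 1 ]; then
    echo "diff failed" >&2
    exit 1
fi

cat "$workdir/ed-script"
# Prints:
#   4a
#   5,Eve,29
#   .
#   3d
#   1c
#   1,Alice,31
#   .
```

The commands appear from the bottom of the file upwards, so applying one command never shifts the line numbers that the following commands refer to. Note also that an exit status of 1 from diff only means the files differ, which matters if the script is ever run with set -e.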
Creating New Baseline
To create a new baseline from the ed script, run the following commands:
cp $BASELINE baseline/new_baseline.csv
(cat ed-script && echo w) | ed - baseline/new_baseline.csv
I’ve shown two commands: the first makes a copy of the original baseline so that the original is not overwritten yet, and the second creates the new baseline. The (cat ed-script && echo w) part writes the ed script to standard output and then adds a w command, because the output of diff -e does not include one and ed would otherwise never save its changes. All of this is piped into ed, which applies the commands to the copy, baseline/new_baseline.csv.
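The copy-then-apply step can be wrapped in a function that also verifies the result. This is only a sketch: the name apply_ed_script and its argument order are hypothetical, and it assumes the -s option, which POSIX documents as the way to suppress ed's byte counts.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the original script):
# apply an ed script to a copy of the baseline, then check the result.
# Usage: apply_ed_script ED_SCRIPT BASELINE TEMP NEW_BASELINE
apply_ed_script() {
    local script=$1 baseline=$2 temp=$3 output=$4

    # Work on a copy so the original baseline stays untouched.
    cp "$baseline" "$output" || return 1

    # diff -e emits no write command, so append 'w' (write) and 'q' (quit).
    { cat "$script"; echo w; echo q; } | ed -s "$output" || return 1

    # After a successful run the copy should match the temporary file exactly.
    cmp -s "$output" "$temp"
}

# Possible use with the article's variables:
#   apply_ed_script ed-script "$BASELINE" "$TEMP" baseline/new_baseline.csv || exit 1
```

The final cmp is a cheap sanity check: since the ed script was generated from the baseline and the temporary file, the patched copy should be byte-for-byte identical to the temporary file.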
Backing Up Old Baseline
It’s a good idea to archive (or backup) things in case anything goes wrong:
mv $BASELINE archive/baseline_$NOW.csv
mv baseline/new_baseline.csv $BASELINE
The first command moves the original baseline to an archive folder and adds a timestamp to the filename. The second mv renames the new file to baseline.csv, which completes the creation of the new baseline.
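The archive-and-rename step could be guarded so that the old baseline is only moved once a new one actually exists. The function rotate_baseline below is a hypothetical sketch under that assumption, not the article's own code.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the original script):
# archive the current baseline under a timestamped name, then promote the new file.
# Usage: rotate_baseline BASELINE NEW_BASELINE ARCHIVE_DIR TIMESTAMP
rotate_baseline() {
    local baseline=$1 new_baseline=$2 archive_dir=$3 stamp=$4

    # Refuse to touch anything if the new baseline was never produced.
    if [ ! -f "$new_baseline" ]; then
        echo "missing: $new_baseline" >&2
        return 1
    fi

    # Create the archive folder if needed, then do the two moves.
    mkdir -p "$archive_dir" || return 1
    mv "$baseline" "$archive_dir/baseline_$stamp.csv" || return 1
    mv "$new_baseline" "$baseline"
}

# Possible use with the article's variables:
#   rotate_baseline "$BASELINE" baseline/new_baseline.csv archive "$NOW"
```

Stopping before the first mv when the new file is missing means a failed ed run cannot leave the service without a baseline.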
For reference, here is the entire script:
#!/bin/bash
NOW=$(date +"%Y%m%d%H%M") # Timestamp for creating files.
BASELINE=`ls baseline/baseline.csv`
TEMP=`ls temp/compare.csv`
diff -e $BASELINE $TEMP > ed-script
cp $BASELINE baseline/new_baseline.csv
(cat ed-script && echo w) | ed - baseline/new_baseline.csv
mv $BASELINE archive/baseline_$NOW.csv
mv baseline/new_baseline.csv $BASELINE