Programming In Shell Script: Merge Directories

A couple of years ago I was tasked with writing a script that merges the files within two directories into a new directory. This could be accomplished in a variety of scripting languages, such as Perl, but I thought it would be a good time to work on some Linux/Unix shell scripting.

Here is the program in its entirety broken into sections:


#!/bin/sh

# The script merges the regular files in directory1 and directory2 into destdir (subdirectories are excluded).
# Files that are in only one of the directories are copied directly.
# However, if a file is in both directories, the newer file is copied.
# The source directories must exist.
# The destination directory is created if it does not exist.
# A "Usage" message is printed if the user did not provide 3 arguments.

# Syntax:   merge.sh directory1 directory2 destdir

dir1=$1; dir2=$2; desdir=$3

# Usage
[ $# -ne 3 ] && { printf '%s\n' "Usage: merge.sh requires 3 directory parameters." "Syntax: merge.sh directory1 directory2 destdir" "merge.sh takes all regular files from two directories and copies them into one destination directory." "If both directories contain the same file, the newer one is copied." >&2; exit 1; }

# make sure Dir 1 and Dir 2 exist
exs=0 # 0 = both present | 1 = dir 1 missing | 2 = dir 2 missing | 3 = both dir 1 and dir 2 missing
[ ! -d "$dir1" ] && exs=$((exs + 1))
[ ! -d "$dir2" ] && exs=$((exs + 2))

[ $exs -eq 1 ] && { echo "Directory one cannot be found"; exit 1; }
[ $exs -eq 2 ] && { echo "Directory two cannot be found"; exit 1; }
[ $exs -eq 3 ] && { echo "Directory one and directory two cannot be found"; exit 1; }

This initial part of the program validates the arguments and checks that the two directories we wish to merge actually exist as directories.
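
The same check can also be written as a small function. The following is only a sketch under my own assumptions: the name check_sources is hypothetical and not part of merge.sh, and it reports each missing directory by name on standard error instead of encoding the result in a counter.

```shell
#!/bin/sh

# check_sources is a hypothetical helper, not part of the original merge.sh.
# It returns 0 when both arguments are existing directories; otherwise it
# prints one message per missing directory to stderr and returns 1.
check_sources() {
	rc=0
	for dir in "$1" "$2"; do
		if [ ! -d "$dir" ]; then
			printf 'Directory not found: %s\n' "$dir" >&2
			rc=1
		fi
	done
	return "$rc"
}

# Example call with two directories that certainly exist.
check_sources / . && echo "both source directories exist"
```

Quoting the arguments keeps the test working for directory names that contain spaces.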


# make the destdir if it does not exist
# Check if a file with this name exists
# if so:
	# check if it is a directory
	# if so:
		# continue
	# if not:
		# error, we would overwrite another file with that name
# if not:
	# create this directory

if [ -e "$desdir" ]; then
	if [ ! -d "$desdir" ]; then
		printf '%s\n' "Your specified destination directory exists as a file that is not a directory." "This program will not overwrite this file." "Program end." >&2
		exit 1
	fi
else
	mkdir "$desdir"
fi

The above portion checks the target directory that the merged files will be copied to. If the path exists but is not a directory, the script stops rather than overwrite it; if it does not exist, the directory is created.
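
As a variation, the destination check can be folded into one function. This is a sketch rather than the script's own code: ensure_destdir is a hypothetical name, and it uses mkdir -p, which the original does not, so that missing parent directories are created as well.

```shell
#!/bin/sh

# ensure_destdir is a hypothetical helper, not part of the original merge.sh.
# It refuses to touch an existing non-directory and otherwise makes sure
# the directory exists.
ensure_destdir() {
	if [ -e "$1" ] && [ ! -d "$1" ]; then
		printf '%s exists but is not a directory; refusing to overwrite it.\n' "$1" >&2
		return 1
	fi
	# mkdir -p creates missing parents and succeeds if the directory is already there.
	mkdir -p -- "$1"
}
```

A call such as ensure_destdir "$desdir" || exit 1 would then replace the nested if block.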


# Loop through Directory 1

	# check if current file is a FILE
	# if so:
		# check for existence of file in Directory 2
		# if it exists:
			# compare last modified time of each file
			# copy the file with the latest modification into the Destination Directory
		# if it does not exist:
			# copy file directly
	# if not:
		# ignore file

for d1file in "$dir1"/* # $d1file is like: dir1/file
do
	curFile=$(basename "$d1file")
	if [ -f "$d1file" ]; then
		if [ -f "$dir2/$curFile" ]; then
			if [ "$d1file" -nt "$dir2/$curFile" ]; then
				cp "$d1file" "$desdir"
			else
				cp "$dir2/$curFile" "$desdir"
			fi
		else
			cp "$d1file" "$desdir"
		fi
	fi
done

# at this point we have taken all unique files from directory one and copied them into the destination directory.
# files which were in both have been copied from either directory 1 or directory 2, depending on which one is newer
# directory 2 still has files which are unique to it, and they must be copied

We then loop through directory one. Files found only there are copied straight into the destination directory. For a file that exists in both directories, the copy from directory one is used only if it is strictly newer; otherwise the copy from directory two is used.
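
The selection rule inside that loop can be isolated into a helper, which makes it easier to test on its own. This is a sketch: pick_newer is a hypothetical name, and it relies on the same -nt test the script already uses.

```shell
#!/bin/sh

# pick_newer is a hypothetical helper, not part of the original merge.sh.
# It prints the path that should be copied: the first path if the second is
# not a regular file or the first is strictly newer, otherwise the second.
pick_newer() {
	if [ ! -f "$2" ] || [ "$1" -nt "$2" ]; then
		printf '%s\n' "$1"
	else
		printf '%s\n' "$2"
	fi
}
```

Inside the loop, the three cp calls would then collapse into a single cp "$(pick_newer "$d1file" "$dir2/$curFile")" "$desdir".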


# Loop through Directory 2

	# check if current file is a FILE
	# if so:
		# Check if file has been copied into Destination Directory already (from Dir 1 loop)
		# if so:
			# ignore file
		# if not:
			# copy file into Destination Directory
	# if not:
		# ignore file

for d2file in "$dir2"/*
do
	curFile=$(basename "$d2file")
	if [ -f "$d2file" ]; then
		if [ ! -e "$desdir/$curFile" ]; then
			cp "$d2file" "$desdir"
		fi
	fi
done

# at this point all files which were in both directories have already been copied by Loop 1 (newest version)
# unique files in Dir 2 have been copied into Destination Directory

exit 0;

The same is then done for directory two, skipping any file that is already in the destination, which leaves us with all the files of both directories copied into a new directory of our choosing. Let's see it in action:

jordan@jordan-VirtualBox:~/Desktop$ mkdir dir1;
jordan@jordan-VirtualBox:~/Desktop$ touch dir1/A;
jordan@jordan-VirtualBox:~/Desktop$ touch dir1/B;
jordan@jordan-VirtualBox:~/Desktop$ touch dir1/X;
jordan@jordan-VirtualBox:~/Desktop$ mkdir dir2;
jordan@jordan-VirtualBox:~/Desktop$ touch dir2/C;
jordan@jordan-VirtualBox:~/Desktop$ touch dir2/D;
jordan@jordan-VirtualBox:~/Desktop$ touch dir2/X;
jordan@jordan-VirtualBox:~/Desktop$ echo "dir1 X file" > dir1/X;
jordan@jordan-VirtualBox:~/Desktop$ echo "dir2 X file" > dir2/X;
jordan@jordan-VirtualBox:~/Desktop$
jordan@jordan-VirtualBox:~/Desktop$ ls -l
total 12
drwxr-xr-x 2 jordan jordan 4096 2011-12-14 10:58 dir1
drwxr-xr-x 2 jordan jordan 4096 2011-12-14 10:58 dir2
-rwxrwxrwx 1 jordan jordan 3321 2009-04-04 15:18 merge.sh
jordan@jordan-VirtualBox:~/Desktop$

Above I have created a test environment for the script. We have two directories, each with unique files and each with one file “X” with the same name but differing content. Now I will run the merge.sh shell script:

jordan@jordan-VirtualBox:~/Desktop$ ./merge.sh dir1 dir2 dest
jordan@jordan-VirtualBox:~/Desktop$
jordan@jordan-VirtualBox:~/Desktop$ ls -l
total 16
drwxr-xr-x 2 jordan jordan 4096 2011-12-14 11:00 dest
drwxr-xr-x 2 jordan jordan 4096 2011-12-14 10:58 dir1
drwxr-xr-x 2 jordan jordan 4096 2011-12-14 10:58 dir2
-rwxrwxrwx 1 jordan jordan 3321 2009-04-04 15:18 merge.sh
jordan@jordan-VirtualBox:~/Desktop$ cd dest
jordan@jordan-VirtualBox:~/Desktop/dest$ ls -l
total 4
-rw-r--r-- 1 jordan jordan  0 2011-12-14 11:00 A
-rw-r--r-- 1 jordan jordan  0 2011-12-14 11:00 B
-rw-r--r-- 1 jordan jordan  0 2011-12-14 11:00 C
-rw-r--r-- 1 jordan jordan  0 2011-12-14 11:00 D
-rw-r--r-- 1 jordan jordan 12 2011-12-14 11:00 X
jordan@jordan-VirtualBox:~/Desktop/dest$
jordan@jordan-VirtualBox:~/Desktop/dest$ cat X
dir2 X file
jordan@jordan-VirtualBox:~/Desktop/dest$

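As you can see, both directories merged as expected. The matched file “X” was taken from “dir2” rather than “dir1” because the “dir1” copy was not newer than the “dir2” copy: “dir2/X” was written last, and the -nt test is also false when the two timestamps are identical, so the script falls through to the “dir2” branch. The second loop does not overwrite a file that is already in the destination.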
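
To make the outcome for “X” reproducible, the timestamps can be set explicitly instead of depending on how fast the commands were typed. The following is a sketch under my own assumptions: which_wins is a hypothetical helper, the scratch directory comes from mktemp (widely available but not required by POSIX), and touch -t sets the modification times.

```shell
#!/bin/sh

# which_wins is a hypothetical helper that applies the same -nt test
# the script uses to decide between the two copies of a file.
which_wins() {
	if [ "$1" -nt "$2" ]; then echo "dir1 wins"; else echo "dir2 wins"; fi
}

demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2"
echo "dir1 X file" > "$demo/dir1/X"
echo "dir2 X file" > "$demo/dir2/X"

# Identical timestamps: -nt is false, so the else branch picks dir2.
touch -t 201112141058 "$demo/dir1/X" "$demo/dir2/X"
tie=$(which_wins "$demo/dir1/X" "$demo/dir2/X")

# Make dir1/X one minute newer: now the dir1 copy is picked.
touch -t 201112141059 "$demo/dir1/X"
newer=$(which_wins "$demo/dir1/X" "$demo/dir2/X")

echo "tie: $tie, dir1 newer: $newer"
rm -rf "$demo"
```

In other words, “dir2” wins ties, and “dir1” wins only when its copy has a later modification time.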

Though the program is simple in nature, it fits the concept of shell scripting very nicely. Shell scripting is not well suited to developing large applications, but small tools that can be reused consistently, like many of the standard Unix tools, are of great benefit to the programmer.

Download the script and try it out yourself.

Download: merge.zip
