For some time now, I’ve been trying to figure out how to mirror a directory recursively, so that every file in the source directory (at any depth) also exists in the destination directory. In addition, if any files in the source directory change, the destination directory can be updated just by running the program again.
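For the program to run, you need to set the variables from_dir and to_dir to the source and destination directories respectively. Both paths should end with a trailing slash, as in the script below, because the script builds file paths by simple concatenation.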
At the moment, the script works well and is commented to explain how it works and what I intended. On the other hand, I believe there is room for improvement, for example:
- Accepting input in the Python script, so that the user only needs to enter the values of from_dir and to_dir.
- Making it cleaner by commenting more.
- Or even packaging it into a class.
- Or, if we (or you or I) feel very challenged (or less lazy), we/you/I could distribute it as a third-party Python package.
Please modify as you please! This code is in the public domain.
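As a rough sketch of the first suggestion above, the two directories could be read from the command line with the standard argparse module instead of being edited in the source. The helper names parse_dirs and with_trailing_slash are made up for this illustration and are not part of the script; the sketch assumes the script keeps joining paths by concatenation, so it adds the trailing slash itself.

```python
import argparse
import os


def with_trailing_slash(path):
    """Return path as an absolute path that ends with the path separator."""
    # os.path.join(p, "") appends exactly one separator to p.
    return os.path.join(os.path.abspath(path), "")


def parse_dirs(argv=None):
    """Read from_dir and to_dir from argv (hypothetical helper).

    With argv=None, argparse falls back to the real command line.
    """
    parser = argparse.ArgumentParser(
        description="Mirror from_dir into to_dir recursively.")
    parser.add_argument("from_dir", help="source directory")
    parser.add_argument("to_dir", help="destination directory")
    args = parser.parse_args(argv)
    # The script concatenates directory and file names, so both
    # directories are normalised to end with a separator.
    return (with_trailing_slash(args.from_dir),
            with_trailing_slash(args.to_dir))


# Example with an explicit argument list, using the paths from the post.
from_dir, to_dir = parse_dirs(["/home/daniel/jquery",
                               "/home/daniel/jquery-alias"])
print("from_dir = %s" % from_dir)
print("to_dir   = %s" % to_dir)
```

In the script itself, main() could call parse_dirs() with no argument and pass the two results to updateDirs, so the client would run something like "python mirror.py SOURCE DEST" (the file name here is only an example).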
#!/usr/bin/env python
# Python script to compare two directories, from_dir and to_dir,
# and copy files from from_dir that are missing from to_dir
# or newer than the copy in to_dir
# it does this recursively as it walks on directories
# author Daniel Alabi
# date

import os
import os.path
import re
import subprocess

# change from_dir and to_dir as appropriate
# (both must end with a trailing "/")
from_dir = "/home/daniel/jquery/"
to_dir = "/home/daniel/jquery-alias/"

# getExtDir gets the remaining path of the full
# path when "from_dir" has been removed from
# dir
def getExtDir(dir):
    # re.escape makes sure the path is matched literally
    # and not interpreted as a regular expression
    pattern = re.escape(from_dir)
    if re.search(pattern, dir):
        return re.sub(pattern, "", dir)

# getUpperLevelDir gets the full path of the directory
# that contains dirString (with a trailing "/")
def getUpperLevelDir(dirString):
    # handle the case where dirString already has a
    # trailing "/"
    if dirString[-1] == "/":
        dirString = dirString[0:-1]
    to = dirString.rfind("/")
    return dirString[0:to] + "/"

# updateDirs recursively updates to_here until
# it has the same files (up-to-date ones) and
# structure as from_here
def updateDirs(from_here, to_here):
    os.chdir(from_here)  # cd to from_here
    # walk the from_here directory
    for file_or_dir in os.listdir(from_here):
        to_file_or_dir = to_here + file_or_dir
        from_file_or_dir = from_here + file_or_dir
        # check if to_file_or_dir is a dir by first determining if it
        # is supposed to be a dir
        # if it is a dir, first of all attach a trailing forward slash
        # this will make it easier for the remaining part of
        # the program to deal with the directories, from_here and to_here
        if os.path.isdir(file_or_dir):
            to_file_or_dir += "/"
            from_file_or_dir += "/"
            # check if to_file_or_dir (as a dir) exists
            # if it doesn't, make the directory
            if not os.path.exists(to_file_or_dir):
                os.mkdir(to_file_or_dir)
            upperleveldir = getUpperLevelDir(from_file_or_dir)
            # now we are sure that to_file_or_dir
            # exists and is a dir; recurse into it
            updateDirs(from_file_or_dir, to_file_or_dir)
            # cd back to where you came from
            os.chdir(upperleveldir)
        else:
            # it is a file
            # check if the file exists
            if os.path.exists(to_file_or_dir):
                # get the times when they were modified last
                modified_to = os.stat(to_file_or_dir).st_mtime
                modified_from = os.stat(from_file_or_dir).st_mtime
                # Since we just want to make sure that the file
                # in to_here (the directory we are going to) mirrors
                # the one in from_here, we check whether the file in
                # from_here was modified more recently
                # if it was, we copy the one from from_here
                # over the one in to_here
                if modified_from > modified_to:
                    # print useful messages to screen about the update
                    # i'm about to make
                    print("%s has been modified and %s not updated"
                          % (from_file_or_dir, to_here))
                    print("Copying %s from %s to %s"
                          % (from_file_or_dir, from_here, to_here))
                    # subprocess.call waits for cp to finish, and cp
                    # overwrites the old file, so no os.remove is needed
                    # (removing after an unfinished Popen could delete
                    # the fresh copy)
                    subprocess.call(["cp", from_file_or_dir,
                                     getUpperLevelDir(to_file_or_dir)])
            else:
                # the file to_file_or_dir does not exist
                # so we copy the file from from_here
                # to to_here
                # print useful message to screen about copying
                # from_file_or_dir to to_here
                print("%s does not exist in %s" % (from_file_or_dir, to_here))
                print("Copying %s from %s to %s"
                      % (from_file_or_dir, from_here, to_here))
                subprocess.call(["cp", from_file_or_dir,
                                 getUpperLevelDir(to_file_or_dir)])

# traditional main that calls
# updateDirs -- the recursive function
def main():
    print("***Updating files in %s to mirror the ones in %s ***\n"
          % (to_dir, from_dir))
    updateDirs(from_dir, to_dir)
    print("***FINISHED***")

if __name__ == "__main__":
    main()
End of Program
Note: There might be some indentation mistakes in the above script, introduced while copying the source from my editor into the WordPress post text area. Bear with me!
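One of the improvements suggested earlier was packaging the script into a class. Here is a minimal sketch of what that might look like, not a finished replacement. The class name DirMirror and its methods are hypothetical, and the sketch swaps two things from the script above: it walks the tree with os.walk instead of recursing with os.chdir, and it copies with shutil.copy2 instead of calling cp. The rule for deciding what to copy is the same: a file is copied when it is missing from the destination or when the source was modified more recently.

```python
import os
import shutil


class DirMirror(object):
    """Mirror from_dir into to_dir (hypothetical class, not in the script)."""

    def __init__(self, from_dir, to_dir):
        self.from_dir = from_dir
        self.to_dir = to_dir

    def needs_copy(self, src, dst):
        """Return True if dst is missing or older than src."""
        if not os.path.exists(dst):
            return True
        return os.stat(src).st_mtime > os.stat(dst).st_mtime

    def update(self):
        """Copy new or changed files and return the list of copied paths."""
        copied = []
        for root, dirs, files in os.walk(self.from_dir):
            # Path of the current directory relative to the source root,
            # re-rooted under the destination.
            rel = os.path.relpath(root, self.from_dir)
            target_root = os.path.normpath(os.path.join(self.to_dir, rel))
            if not os.path.isdir(target_root):
                os.makedirs(target_root)
            for name in files:
                src = os.path.join(root, name)
                dst = os.path.join(target_root, name)
                if self.needs_copy(src, dst):
                    # copy2 copies the contents and the modification time.
                    shutil.copy2(src, dst)
                    copied.append(dst)
        return copied
```

Usage would be along the lines of DirMirror("/home/daniel/jquery/", "/home/daniel/jquery-alias/").update(). Because copy2 keeps the source's modification time on the copy, a second run with no changes should copy nothing. Like the original script, this sketch never deletes files from the destination that no longer exist in the source.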