
NAME

imagecluster - clusters and renames image files based on exif tags

SYNOPSIS

imagecluster -v -d /home/myuser/imagegallery -t 72 *JPG

DESCRIPTION

The inspiration for this program came from recently getting a new Canon SD500 camera to replace the Canon S30 that I'd had for years. The upside: the Canon SD500 rocks! The downside: I now have 2 cameras burning through the same sequence numbers, so my previous solution of just putting all the files into directories by the first 2 digits of the sequence number was no longer going to work.

Imagecluster solves this problem, plus another grouping problem that I'd been thinking about, by extracting the CreateDate and FileNumber EXIF tags from the images and using them as the basis of a new filename (typically YYYY:mm:dd_HH:MM:SS_FileNumber.jpg). Two images can then only collide if they were taken in the same second and also have the same camera sequence number.
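The naming scheme can be sketched in a few lines. This is a Python illustration only, not the program's own code: the function name build_filename, its parameters, and the sample FileNumber value are made up for the example, and it assumes CreateDate arrives in the usual EXIF "YYYY:MM:DD HH:MM:SS" form and that the separator applies to both the date and the time fields.

```python
from datetime import datetime


def build_filename(create_date, file_number, sep=":", ext="jpg"):
    """Build a name like YYYY:mm:dd_HH:MM:SS_FileNumber.jpg (hypothetical helper)."""
    # Assumption: CreateDate is in the standard EXIF form "YYYY:MM:DD HH:MM:SS".
    taken = datetime.strptime(create_date, "%Y:%m:%d %H:%M:%S")
    # Assumption: the separator is used inside both the date and the time part.
    date_part = sep.join([f"{taken.year:04d}", f"{taken.month:02d}", f"{taken.day:02d}"])
    time_part = sep.join([f"{taken.hour:02d}", f"{taken.minute:02d}", f"{taken.second:02d}"])
    # FileNumber is treated as an opaque string and appended unchanged.
    return f"{date_part}_{time_part}_{file_number}.{ext}"


if __name__ == "__main__":
    print(build_filename("2005:10:09 14:30:05", "1000042"))
    print(build_filename("2005:10:09 14:30:05", "1000042", sep="-"))
```

Two images taken in the same second by different cameras would differ only in the trailing FileNumber part, which is what keeps them from colliding.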

But that is just the first step. I have noticed that I am an occasional photographer, so I take pictures in bursts, often for a weekend of hanging out with folks, though sometimes for a vacation as well. This got me thinking. What I really needed was a tool that also creates directories, clustering images whose CreateDate values fall within some tolerance of each other. For me, the optimum value seems to be 36 hours, though this is configurable via the command line.

This took me an afternoon to pull together. I'm sure it could be smarter, but it is useful enough to post for others to use.

OPTIONS

-d directory

Set the target directory for images. Defaults to /tmp/photos, which is probably not what you want.

-D

Dry run. Tells you what the program would have done, without actually doing it.

-h

Print out the help message.

-s

Separator character. It defaults to : (e.g. 2005:10:09...), but is user configurable because my friend Clemens wants to use - (e.g. 2005-10-09) instead.

-t

Set the tolerance for image clustering. This is the maximum time allowed between two consecutive pictures in a cluster; a larger gap causes a new cluster to be created. The name of the cluster will be YYYY:MM:DD of the first image in the cluster, even if it spans multiple days. Because the tolerance applies to the gap between neighbouring images rather than to the cluster as a whole, it is possible that all images you have ever taken could end up in 1 cluster, if you took a picture every day of your life. Hence, this feature isn't useful to everyone. If you are that kind of person, set the tolerance to 16 hours or something, and you'll tend to get 1 day sized buckets.
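The clustering rule described for -t can be sketched as follows. This is a Python illustration under stated assumptions, not the program's actual implementation: the names cluster_by_gap and cluster_name are hypothetical, the tolerance is taken to be in hours, and a gap exactly equal to the tolerance is assumed to stay in the same cluster.

```python
from datetime import datetime, timedelta


def cluster_by_gap(timestamps, tolerance_hours=36.0):
    """Group timestamps so that consecutive members are at most tolerance_hours apart."""
    tolerance = timedelta(hours=tolerance_hours)
    clusters = []
    for ts in sorted(timestamps):
        # Compare against the most recent image in the current cluster only.
        # Assumption: a gap exactly equal to the tolerance does not split.
        if clusters and ts - clusters[-1][-1] <= tolerance:
            clusters[-1].append(ts)
        else:
            clusters.append([ts])
    return clusters


def cluster_name(cluster, sep=":"):
    """Name a cluster after the date of its first image, e.g. 2005:10:08."""
    first = cluster[0]
    return sep.join([f"{first.year:04d}", f"{first.month:02d}", f"{first.day:02d}"])


if __name__ == "__main__":
    shots = [
        datetime(2005, 10, 9, 20, 0),
        datetime(2005, 10, 8, 10, 0),
        datetime(2005, 10, 15, 12, 0),
        datetime(2005, 10, 9, 9, 0),
    ]
    for group in cluster_by_gap(shots, tolerance_hours=36):
        print(cluster_name(group), len(group))
```

Because each image is only compared with the one before it, a long unbroken run of daily pictures never hits a gap larger than the tolerance and stays in a single cluster, which is the behaviour the text above warns about.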

-v

Prints verbose output.

TODO

* See how useful this actually is. In my brief tests reorganizing my own images, it was incredibly useful.

* Create other types of groupings. My friend Mike takes a picture of his son every day, so really wants some other kind of grouping.

* Add some tests. There really aren't any at this point. This would require some small fake JPEGs with EXIF data, which should be easy enough.

BUGS

None known at this time. I've only tested with images from Canon cameras though, so reports of it working with other cameras would be good.

LICENSE

GPL v2

AUTHOR

Sean Dague <sean at dague dot net>