User:InverseHypercube/flickr_exif.py

From Wikimedia Commons, the free media repository

A Python script for downloading files from Flickr with their EXIF metadata intact. Flickr already allows this for images from Pro users who have not hidden their original file; for all other images, this script downloads the largest available size and writes the EXIF data reported by Flickr back into it. It has problems with photos that have numbered EXIF tags; Flickr seems to have produced these in the past.

You must have exiftool and the Python Flickr API (available on Ubuntu as the package python-flickrapi) installed. Run the script with the Flickr image ID as the argument; for example, for http://www.flickr.com/photos/dahlstroms/4083220012/ run python flickr_exif.py 4083220012. You can pass several IDs as arguments, and the images are downloaded one after the other. The --auth option signs you in to Flickr, which is required to turn SafeSearch off.
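If you only have a photo page URL, the ID is the numeric path segment after the user name. A minimal sketch of extracting it follows; the helper name photo_id_from_url is hypothetical and not part of the script, and the pattern assumes URLs of the /photos/<user>/<id>/ form shown in the example.

```python
import re


def photo_id_from_url(url):
    """Return the numeric photo ID from a Flickr photo page URL, or None."""
    # Assumes the /photos/<user>/<id>/ layout of the example URL.
    match = re.search(r'/photos/[^/]+/(\d+)', url)
    return match.group(1) if match else None


if __name__ == '__main__':
    print(photo_id_from_url('http://www.flickr.com/photos/dahlstroms/4083220012/'))
```

The returned string can be passed to flickr_exif.py as an argument.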

Run it in an empty directory, since it creates files (including a temporary tags.xml) and overwrites existing ones that have the same name.

Each output file is named after its image ID, with the extension taken from the image URL.
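The naming rule, and a check for files that a run would replace, can be sketched as follows. The helper names output_name and would_overwrite are hypothetical, and the URL in the usage lines is made up for illustration; real URLs come from flickr.photos.getSizes.

```python
import os


def output_name(photo_id, url):
    """Build '<photo_id>.<extension>' from the image URL, ignoring any query string."""
    extension = url.split('.')[-1].split('?')[0]
    return '%s.%s' % (photo_id, extension)


def would_overwrite(name, directory='.'):
    """Return True if a file called name already exists in directory."""
    return os.path.exists(os.path.join(directory, name))


if __name__ == '__main__':
    # Illustrative URL only, not a real Flickr address.
    name = output_name('4083220012', 'http://example.com/images/4083220012_abc_o.jpg?zz=1')
    if would_overwrite(name):
        print('%s already exists and would be replaced' % name)
    else:
        print('%s is free to use' % name)
```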

#!/usr/bin/env python
# Written for Python 2 and the legacy (1.x) Python Flickr API.
import argparse
import os
import subprocess
import urllib
from xml.sax.saxutils import escape

import flickrapi

api_key = '367aed513876077c1cdcadb29d88ef02'
api_secret = '9b6e223653519900'

parser = argparse.ArgumentParser()
parser.add_argument('--auth', action='store_true')
parser.add_argument('photos', nargs='+')
args = parser.parse_args()

flickr = flickrapi.FlickrAPI(api_key, api_secret)

if args.auth:
    # Sign in so that results are not filtered by SafeSearch.
    (token, frob) = flickr.get_token_part_one(perms='write')
    if not token:
        raw_input("Press ENTER after you have authorized this program")
    flickr.get_token_part_two((token, frob))

for photo in args.photos:
    # Take the last size listed, on the assumption that it is the largest.
    sizes = flickr.photos_getSizes(photo_id=photo).getiterator('size')
    url = sizes[-1].attrib['source']

    # Name the file after the photo ID, keeping the extension from the URL.
    filename = '%s.%s' % (photo, url.split('.')[-1].split('?')[0])

    urllib.urlretrieve(url, filename)

    tags = flickr.photos_getExif(photo_id=photo).getiterator('exif')

    # Write the tags as RDF/XML for exiftool to read.
    with open('tags.xml', 'w') as tags_file:
        tags_file.write("<?xml version='1.0' encoding='UTF-8'?>\n<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>\n")
        for tag in tags:
            name = '%s:%s' % (tag.attrib['tagspace'], tag.attrib['tag'])
            # The first child of each exif element holds the value; it may be empty.
            value = (tag.getchildren()[0].text or '').strip()
            # Escape &, < and > so that the file stays well-formed XML.
            line = u'<%s>%s</%s>\n' % (name, escape(value), name)
            tags_file.write(line.encode('utf-8'))
        tags_file.write('</rdf:RDF>\n')

    # Pass the arguments as a list so the file name never goes through a shell.
    subprocess.call(['exiftool', '-overwrite_original', '-tagsfromfile', 'tags.xml', filename])

os.remove('tags.xml')
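
The script takes the last size element returned by flickr.photos.getSizes, which relies on Flickr listing sizes from smallest to largest. A more defensive variant, sketched here under assumptions, compares pixel areas instead. The helper name largest_source is hypothetical, and the sample response is made-up data that only imitates the width, height and source attributes of a getSizes reply.

```python
import xml.etree.ElementTree as ET

# Made-up sample shaped like a flickr.photos.getSizes response (not real data).
SAMPLE = """
<rsp stat="ok">
  <sizes>
    <size label="Large" width="1024" height="768" source="http://example.com/1_b.jpg" />
    <size label="Square" width="75" height="75" source="http://example.com/1_s.jpg" />
    <size label="Medium" width="500" height="375" source="http://example.com/1.jpg" />
  </sizes>
</rsp>
"""


def largest_source(response):
    """Return the source URL of the size element with the greatest pixel area."""
    sizes = list(response.iter('size'))
    if not sizes:
        raise ValueError('response contains no size elements')
    biggest = max(sizes, key=lambda s: int(s.attrib['width']) * int(s.attrib['height']))
    return biggest.attrib['source']


if __name__ == '__main__':
    print(largest_source(ET.fromstring(SAMPLE)))
```

In the sample the largest entry is deliberately listed first, so a positional lookup would pick the wrong image while the area comparison does not.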