I recently did a European cycle trip and, as described here, made some GPX files of geocaches along my intended route.
The trouble is that most of the cache descriptions were in German, and my German is practically non-existent. I wanted to do all my caching offline, so I wrote a small Python script that uses Google Translate to translate all the descriptions and logs in the GPX file.
The code uses the minidom module in the Python standard library to parse the XML in the GPX file. It uses the goslate library to query Google Translate. The codecs module, also in the Python standard library, is used to write out the Unicode characters correctly.
The script attempts to translate the description and logs for each geocache into English and append them after the original text.
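To make the append step concrete without calling any translation service, here is a minimal sketch. The GPX fragment, the `fake_translate` stand-in and the `append_translations` helper are all hypothetical names made up for illustration; only the minidom calls are the same ones the script relies on.

```python
from xml.dom import minidom

# Hypothetical, minimal GPX fragment; real files carry many more elements.
SAMPLE_GPX = u"""<?xml version="1.0" encoding="utf-8"?>
<gpx xmlns:groundspeak="http://www.groundspeak.com/cache/1/0">
  <wpt lat="48.1" lon="11.5">
    <groundspeak:cache>
      <groundspeak:long_description>Der Cache liegt unter der Br\u00fccke.</groundspeak:long_description>
    </groundspeak:cache>
  </wpt>
</gpx>"""


def fake_translate(text, target_lang):
    """Stand-in for a real translator call; it only tags the text."""
    return u"[%s] %s" % (target_lang, text)


def append_translations(doc, tag_name, translate):
    """Append a translation after the original text of each matching element."""
    changed = 0
    for node in doc.getElementsByTagName(tag_name):
        child = node.firstChild
        if child is None or child.nodeValue is None:
            continue  # empty element, or first child is not text
        original = child.nodeValue
        child.nodeValue = original + u" <hr> " + translate(original, "en")
        changed += 1
    return changed


doc = minidom.parseString(SAMPLE_GPX.encode("utf-8"))
count = append_translations(doc, "groundspeak:long_description", fake_translate)
result = doc.getElementsByTagName("groundspeak:long_description")[0].firstChild.nodeValue
print(count)
print(result)
```

Passing the translator in as a function makes it easy to swap the stand-in for a real call once the XML handling behaves as expected.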
WARNING: This code only translates some descriptions and logs. If there is a lot of formatting in a description, it fails to translate and just returns a second copy of the original text. I haven't spent much time trying to resolve this bug. Feel welcome to improve the script for your own use. Also, use it at your own discretion after reading Google Translate's terms and conditions.
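One possible workaround, which I have not tested against Google Translate, is to strip the markup from a description before sending it off. Whether the formatting is really what trips up the translation is an assumption. The sketch below uses Python 3's html.parser (the module is called HTMLParser on Python 2), and `strip_markup` is a hypothetical helper name.

```python
from html.parser import HTMLParser  # Python 3 module name


class _TextExtractor(HTMLParser):
    """Collects only the text content and ignores all tags."""

    def __init__(self):
        HTMLParser.__init__(self, convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(html_text):
    """Return the plain text of an HTML snippet with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html_text)
    parser.close()
    # Join the text pieces, then squeeze runs of whitespace into single spaces.
    return " ".join(" ".join(parser.parts).split())


plain = strip_markup("<p>Dose &amp; <b>Logbuch</b> im Baum</p>")
print(plain)
```

The plain text could then be translated and appended after the untouched original description, so none of the original formatting is lost.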
from xml.dom import minidom
import codecs
import time

import goslate

gs = goslate.Goslate()
fn = 'route-caches.gpx'
xmldoc = minidom.parse(fn)

# Translate each cache description and append the English text after the original.
descs = xmldoc.getElementsByTagName('groundspeak:long_description')
n = len(descs)
for i, node in enumerate(descs):
    try:
        desc = node.firstChild.nodeValue
        time.sleep(0.2)  # short pause between requests
        endesc = gs.translate(desc, 'en')
        node.firstChild.nodeValue = desc + ' <hr> ' + endesc
    except Exception:
        # Skip entries that are empty or that fail to translate.
        pass
    print('%d of %d' % (i + 1, n))

# Do the same for the text of each log entry.
texts = xmldoc.getElementsByTagName('groundspeak:text')
n = len(texts)
for i, node in enumerate(texts):
    try:
        text = node.firstChild.nodeValue
        time.sleep(0.2)  # short pause between requests
        entext = gs.translate(text, 'en')
        node.firstChild.nodeValue = text + ' <hr> ' + entext
    except Exception:
        # Skip entries that are empty or that fail to translate.
        pass
    print('%d of %d' % (i + 1, n))

# Write the result as UTF-8 so non-ASCII characters survive.
with codecs.open('route-caches-en.gpx', 'w', 'utf-8') as out:
    xmldoc.writexml(out)
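If you want to check that non-ASCII characters survive the write, a quick round trip is enough. This is only a sketch: the tiny document and the temporary file path are made up for illustration and it does not touch the real GPX file.

```python
import codecs
import os
import tempfile
from xml.dom import minidom

# Hypothetical one-element document containing a non-ASCII character.
doc = minidom.parseString(u"<gpx><name>Br\u00fccke</name></gpx>".encode("utf-8"))

path = os.path.join(tempfile.mkdtemp(), "out.gpx")
with codecs.open(path, "w", "utf-8") as out:
    # Declaring the encoding in the XML header keeps it in step with the file.
    doc.writexml(out, encoding="utf-8")

# Parse the file back in and read the same element again.
reparsed = minidom.parse(path)
name = reparsed.getElementsByTagName("name")[0].firstChild.nodeValue
print(name == u"Br\u00fccke")
```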