Sunday, August 19, 2012

Lightning talk at Cluj.PM


The slides from my Cluj.PM lightning talk:

It was a stressful (but fun!) experience. Thanks to the organizers!

Running pep8 and pylint programmatically


Tools like pep8 and pylint are great, especially given the huge amount of dynamism in Python, which leaves plenty of opportunities to shoot yourself in the foot. Sometimes, however, you want to invoke these tools in more specialized ways, for example only on the files which changed since the last commit. Here is how you can do this from a Python script and capture their output for later post-processing (maybe you want to merge the output from both tools, or maybe you want to show only the lines which changed since the last commit, etc.):

import sys
from StringIO import StringIO

import pep8

# pep8 prints its report to stdout, so swap in a buffer while it runs
sys.stdout = StringIO()
pep8_checker = pep8.StyleGuide(config_file=config, format='pylint')
pep8_checker.check_files(paths=[ ...path to files/dirs to check... ])
output = sys.stdout.getvalue()
sys.stdout = sys.__stdout__
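If the checker raises an exception, the stdout swap above is never undone. A minimal sketch of a safer variant follows; the `captured_stdout` helper is a hypothetical name introduced here for illustration, not part of pep8, and the `print` call merely stands in for the checker run:

```python
import sys
from contextlib import contextmanager

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3


@contextmanager
def captured_stdout():
    """Temporarily replace sys.stdout with a buffer and restore it afterwards."""
    buffer = StringIO()
    previous = sys.stdout
    sys.stdout = buffer
    try:
        yield buffer
    finally:
        # runs even if the code inside the "with" block raises
        sys.stdout = previous


with captured_stdout() as buffer:
    print("stand-in for the checker run")
captured = buffer.getvalue()
```

Restoring the previous value rather than `sys.__stdout__` keeps the helper usable when stdout was already redirected by something else.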

from StringIO import StringIO

from pylint.lint import Run
from pylint.reporters.text import ParseableTextReporter

reporter = ParseableTextReporter()
result = StringIO()
# send the report to the buffer instead of stdout
reporter.set_output(result)
Run(['--rcfile=pylint.config'] + [ ...files.., ], reporter=reporter, exit=False)
output = result.getvalue()
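Once both outputs are captured, the post-processing mentioned earlier (merging the reports, or keeping only changed lines) could look roughly like the sketch below. It assumes both tools emit lines shaped like `path:line: [code] message`, which is what the `pylint` format of pep8 and the parseable pylint reporter are meant to produce. The function names and the `changed_lines` parameter are made up for this example.

```python
import re

# Assumed line shape: "path:line: [code] message"
LINE_RE = re.compile(
    r'^(?P<path>.+?):(?P<line>\d+): \[(?P<code>[^\]]+)\] (?P<text>.*)$')


def parse_report(output, tool):
    """Turn raw checker output into (path, line, tool, code, text) tuples."""
    entries = []
    for raw in output.splitlines():
        match = LINE_RE.match(raw)
        if match is None:
            continue  # skip anything that is not a message line
        entries.append((match.group('path'), int(match.group('line')),
                        tool, match.group('code'), match.group('text')))
    return entries


def merge_reports(pep8_output, pylint_output, changed_lines=None):
    """Merge both reports, sorted by path and line.

    changed_lines is an optional set of (path, line) pairs; when given,
    only messages on those lines are kept.
    """
    entries = (parse_report(pep8_output, 'pep8') +
               parse_report(pylint_output, 'pylint'))
    if changed_lines is not None:
        entries = [e for e in entries if (e[0], e[1]) in changed_lines]
    return sorted(entries)
```

How the set of changed lines is obtained depends on your version control system and is left out here.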

It is recommended that you use pylint/pep8 installed through pip/easy_install rather than from the Linux distribution repositories, since the latter are known to contain outdated versions. You can check for this via code like the following:

import logging

import pkg_resources
from pkg_resources import parse_version

if pkg_resources.get_distribution('pep8').parsed_version < parse_version('1.3.3'):
    logging.error('pep8 too old. At least version 1.3.3 is required')
if pkg_resources.get_distribution('pylint').parsed_version < parse_version('0.25.1'):
    logging.error('pylint too old. At least version 0.25.1 is required')
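The reason for going through `parse_version` instead of comparing the version strings directly is that string comparison works character by character. The sketch below illustrates the difference with a deliberately simplified stand-in that only understands purely numeric dotted versions. The helpers `version_tuple` and `is_too_old` are hypothetical and not part of pkg_resources.

```python
def version_tuple(version):
    """Convert '1.3.3' into (1, 3, 3); only numeric dotted versions are handled."""
    return tuple(int(part) for part in version.split('.'))


def is_too_old(installed, required):
    """Return True if the installed version is lower than the required one."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    # pad with zeros so that '1.3' and '1.3.0' compare as equal
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a < b


print(is_too_old('0.9', '0.25.1'))  # True: 9 < 25 when compared as numbers
print('0.9' < '0.25.1')             # False: '9' sorts after '2' as a character
```

For real version strings (release candidates, dev builds and so on) the `parse_version` based check is the better choice.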

Finally, if you have to use an old version of pep8, the code needs to be modified as shown below. Such an old version probably won't be of much use and will most likely annoy you, so you should really try to use an up-to-date one (for example, by installing it in its own virtualenv):

import pep8

result = []
# replace pep8's message() so that reports are collected rather than printed
pep8.message = lambda msg: result.append(msg)
for code_dir in [ ...files or dirs... ]:

Wednesday, August 15, 2012

Clearing your Google App Engine datastore


Warning! This is a method to erase the data from your Google App Engine datastore. There is no way to recover your data after you go through with this! Only use this if you're absolutely certain!

If you have a GAE account used for experimentation, you might want to clean it up from time to time (erase the contents of the datastore and blobstore associated with the application). Doing this through the admin interface can become very tedious, so here is an alternative method:

  1. Start your Remote API shell
  2. Use the following code to delete all datastore entities (the loop stops with an error once nothing is left to delete):
    while True: keys=db.Query(keys_only=True).fetch(500); db.delete(keys); print "Deleted %d entries, the last of which was %s" % (len(keys), keys[-1].to_path())
  3. Use the following code to delete all blobstore entities (this loop stops the same way):
    from google.appengine.ext.blobstore import *
    while True: blobs=BlobInfo.all().fetch(500); delete([b.key() for b in blobs]); print "Deleted %d elements, the last of which was %s" % (len(blobs), blobs[-1].filename)

The above method is inspired by this Stack Overflow answer, but has the advantage that it does the deletion in smaller steps, so a deadline exceeded or over quota error interrupts at most one batch instead of aborting the entire operation.
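For reference, the batching idea can be written as a regular function that stops cleanly when nothing is left. This is only a sketch: `delete_in_batches` and its two callable parameters are names introduced here, and an in-memory set stands in for the datastore so the example runs anywhere. In the Remote API shell the callables would presumably be `lambda n: db.Query(keys_only=True).fetch(n)` and `db.delete`.

```python
def delete_in_batches(fetch_batch, delete_batch, batch_size=500):
    """Fetch up to batch_size keys at a time and delete them.

    fetch_batch(n) must return a list of at most n keys (empty when done),
    delete_batch(keys) must delete exactly those keys.
    Returns the total number of deleted keys.
    """
    total = 0
    while True:
        keys = fetch_batch(batch_size)
        if not keys:
            break  # nothing left, stop without raising
        delete_batch(keys)
        total += len(keys)
        print("Deleted %d entries so far" % total)
    return total


# In-memory stand-in for the datastore, for illustration only
store = set(range(1234))
deleted = delete_in_batches(
    lambda n: sorted(store)[:n],
    lambda keys: store.difference_update(keys),
)
```

If a batch fails, only that batch is lost, and calling the function again continues with whatever is still stored.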

Final caveats:

  • This can be slow
  • This consumes your quota, so you might have to do it over several days or raise your quota
  • The code is written in a very non-Pythonic way (multiple statements on one line) for ease of copy-pasting