While working on FMN's new architecture, I wanted to profile the application a little to see where it spends most of its time.

I knew about cProfile, the classic profiler built into Python, but it didn't quite fit my needs: I wanted to profile a very specific part of my code, preferably without refactoring it just so that I could use cProfile.

While searching for a solution using cProfile (or something else), I ran into A. Jesse Jiryu Davis's PyCon presentation, 'Python performance profiling: The guts and the glory'. It is a really interesting talk, and if you have not seen it, I would encourage you to watch it (on YouTube).

The talk presents yappi, which stands for Yet Another Python Profiler and was written by Sümer Cip, together with some code that makes it easy to use and to write the output in a format compatible with callgrind (which lets us use KCacheGrind to visualize the results).

To give you an example, this is how it looked before (without profiling):

t = time.time()
results = fmn.lib.recipients(PREFS, msg, valid_paths, CONFIG)
log.debug("results retrieved in: %0.2fs", time.time() - t)

And this is the same code, integrated with yappi:

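import yappi

yappi.set_clock_type('cpu')  # measure CPU time rather than wall-clock time
t = time.time()
yappi.start(builtins=True)  # also profile builtin functions
results = fmn.lib.recipients(PREFS, msg, valid_paths, CONFIG)
stats = yappi.get_func_stats()
# write the per-function stats in callgrind format, for KCacheGrind
stats.save('output_callgrind.out', type='callgrind')
log.debug("results retrieved in: %0.2fs", time.time() - t)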

As you can see, all it takes is five lines of code to profile the function fmn.lib.recipients and dump the stats in callgrind format.

And this is what the output looks like in KCacheGrind :) (screenshot: kcachegrind_fmn.png)