Archive of articles classified as' "threading"

Back home

goless 0.7 released, with Python3 support and bug fixes


goless version 0.7.0 is out on PyPI. goless facilitates writing Go language style concurrent programs in Python, including functionality for channels, select, and goroutines.

I forgot to blog about 0.6 at the start of July, which brought Python 3.3 and 3.4 support to goless (#17). I will support pypy3 as soon as Travis supports it.

Version 0.7 includes:
- A “fix” for a gevent problem on Windows (socket must be imported!). #28
- Errors in the case of a deadlock will be more informative. For example, if the last greenlet/tasklet tries to do a blocking send or recv, a DeadlockError will be raised, instead of the underlying error being raised. #25
- goless now has a small exception hierarchy instead of exposing the underlying errors.
- Better PyPy stackless support. #29
- can be called with (case1, case2, case3), etc., in addition to a list of cases (ie, ([case1, case2, case3])). #22

Thanks to Michael Az for several contributions to this release.

Happy concurrent programming!

No Comments

goless now on PyPI


goless is now available on the Python Package Index: . You can do pip install goless and get Go-like primitives to use in Python, that runs atop gevent, PyPy, or Stackless Python. You can write code like:

channel = goless.chan()

def goroutine():
    while True:
        value = channel.recv()
        channel.send(value ** 2)

for i in xrange(2, 5):
    squared = channel.recv()
    print('%s squared is %s' % (i, squared))

# Output:
# 2 squared is 4
# 3 squared is 9
# 4 squared is 16

I’ve also ported the goless benchmarks to Go, for some basic comparisons to using goless on various Python runtimes (PyPy, CPython) and backends (gevent, stackless):

Thanks to Rui Carmo we have more extensive examples of how to use goless (but if you’ve done or read any Go, it should be relatively straightforward). Check them out in the examples folder:

And just a compatibility note (which is covered in the docs, and explained in the exception you get if you try to use goless without stackless or gevent available): goless works out of the box with PyPy (using in its stdlib) and Stackless Python. It works seamlessly with gevent and CPython 2.7. It works with PyPy and gevent if you use the tip of gevent and PyPy 2.2+. It will support Python 3 as soon as gevent does.

Thanks and if you use goless, I’m eager to hear your feedback!

No Comments

goless Benchmarks


I benchmarked how goless performs under different backends (goless is a library that provides a Go-like concurrency model for Python, on top of stackless, PyPy, or gevent). Here are the results, also available on the goless readthedocs page:

Platform Backend   Benchmark      Time
======== ========= ============== =======
PyPy     stackless chan_async     0.08400
CPython  stackless chan_async     0.18000
PyPy     gevent    chan_async     0.46800
CPython  gevent    chan_async     1.32000
~~~~~~~~ ~~~~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~
PyPy     stackless chan_buff      0.08000
CPython  stackless chan_buff      0.18000
PyPy     gevent    chan_buff      1.02000
CPython  gevent    chan_buff      1.26000
~~~~~~~~ ~~~~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~
PyPy     stackless chan_sync      0.04400
CPython  stackless chan_sync      0.18000
PyPy     gevent    chan_sync      0.44800
CPython  gevent    chan_sync      1.26000
~~~~~~~~ ~~~~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~
PyPy     stackless select         0.06000
CPython  stackless select         0.38000
PyPy     gevent    select         0.60400
CPython  gevent    select         1.94000
~~~~~~~~ ~~~~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~
PyPy     stackless select_default 0.00800
PyPy     gevent    select_default 0.01200
CPython  stackless select_default 0.19000
CPython  gevent    select_default 0.25000

The trends are that PyPy with its built-in stackless support is fastest, then Stackless Python (2-5x), then PyPy with gevent (5-10x), and finally CPython with gevent (15-30x).

Remember that these are benchmarks of goless itself; goless performance characteristics may be different in real applications. For example, if you have lots of C extensions, CPython/stackless may pull ahead due to stack switching.

If you’re using goless for anything, I’d love to get some benchmarks, and find out how it’s working, so please comment below, or create a GitHub issue.

Disclaimers: It’s possible that goless is inefficiently using gevent, but the backend-specific code is so simple I doubt it (see goless/goless/ These benchmarks are all done against Python 2.7. Also I am no benchmark master (especially with PyPy) so there may be problems, look over goless/ for benchmark code. These were done mostly for curiosity.

No Comments

goless- Golang semantics in Python


The goless library provides Go programming language semantics built on top of Stackless Python or gevent.* Here is a Go example using channels and Select converted to goless:

c1 = goless.chan()
c2 = goless.chan()

def func1():

def func2():

for i in range(2):
    case, val =[goless.rcase(c1), goless.rcase(c2)])

While I am not usually a Go programmer, I am a big fan of its style and patterns. goless provides the familiarity and practicality of Python while better enabling the asynchronous/concurrent programming style of Go. Right now it includes:

  • Synchronous/unbuffered channels (send and recv block waiting for a receiver or sender).
  • Buffered channels (send blocks if buffer is full, recv blocks if buffer is empty).
  • Asynchronous channels (do not exist in Go. Send never blocks, recv blocks if buffer is empty).
  • The select function (like reflect.Select, since Python does not have anonymous blocks we could not replicate Go’s Select statement).
  • The go function (runs a function in a tasklet/greenlet).

goless is pretty well documented and tested, but please take a look or give it a try and tell us what you think here or on GitHub’s issues. I’m especially interested in adding more Go examples converted to use goless, or other Go features replicated to create better asynchronous programs.**

*. goless was written at the PyCon 2014 sprints by myself, Carlos Knippschild, and Simon Konig, with help from Kristjan Valur Jonsson and Andrew Francis (sorry the lack of accents here, I am on an unfamiliar computer). Carlos and myself were both laid off while at PyCon- if you have an interesting job opportunity for either of us, please send me an email:

**. We are close to getting PyPy support working through its implementation. There are some lingering issues in the tests (though the examples and other ‘happy path’ code works fine under PyPy). I’ll post again and bump the version when it’s working.


Why I love blogging


I started to write a post about how I missed multithreading and speed going from C# to Python. I ended up realizing the service we built which inspired the post was poorly designed and took far more effort than it should have. The speed and multithreading of C# made it easier to come up with an inferior design.

The service needed to be super fast, but the legacy usage pattern was the problem. It needed to run as a service, but then we realize it’ll be running as a normal process. This is what happens when you focus too much on requirements and architecture instead of delivering value quickly. You create waste.

I wouldn’t have realized all of this if I didn’t sit down to write about it.

No Comments

A story of simplification and abstraction (stackless timeouts)


Someone was asking me the other day how to implement a timeout in a thread. His initial implementation used two background threads: one to do the work (making requests to a web service and updating a counter), and the other in a loop polling the counter and sleeping. If the first thread stopped updating the counter, the second should report some sort of error.

I helped him simplify the design in a couple ways. First I had him use stackless instead of threads and taught him how threading and microthreads work. Based on that, I suggested that instead of a counter and loop/sleep, there is a parent tasklet that kicks off a child tasklet which does the actual work.* The parent tasklet recvs on a channel with a timeout, and the child tasklet sends on the channel to act like a heartbeat. If the parent recv times out, it means the child tasklet hasn’t reported in and the user can be alerted. This simplified the code considerably.

I then asked a colleague (Kristján Valur) how to do a timeout with stackless, and he told me about the stacklesslib.util.timeout context manager. Doh! It ended up being as simple as:

    for item in items:
        with stacklesslib.util.timeout(200):
except stackless.util.TimeoutError:

It’s pretty amazing what sort of power you’re able to wield with a good language and framework. It’s so important to have the right abstractions, but you need to know how to use it. Even with documentation, nothing beats a little help from your friends.

*Instead of a channel, we probably could have used an Event.


A use for killing tasklets


A few weeks ago, I posted about Killing Tasklets, a feature of Stackless Python that allows you to abort a tasklet at any point. And it turns out I had a perfect use case for them just last week and things went swimmingly well.

We have a client program that controls the state of 3D objects, and a server program that does the rendering. The client calculates a serialized version of the server’s state (based on the client’s state) and sends it to the server. It does this through a publish/subscribe. The server receives the state and applies it to the current scene, moving objects and the like (of course we have other mechanisms for rebuilding the entire scene, this is just for ‘updating’ the attributes of the current object graph).

This causes a problem when the server takes longer to apply the new state to its scene than it does for the client to calculate it (maybe the client is super fast because it is caching everything). The server lags further and further behind the client. So when the server receives the ‘update’ command, it kicks off and stores the tasklet to do the updating. If another ‘update’ comes in while the previous update’s tasklet is still alive, it kills that tasklet and starts a new one. This way we get as smooth an updating as possible (dropping updates would cause more choppiness). This does require that updates are ‘absolute’ and not relative to other updates, and can be aborted without corrupting the scene.

Killing tasklets turned this into very straightforward code. In fact none of it other than the few lines that handle subscriptions on the server know anything about it at all. This sort of “don’t think about it too much, it just works like you’d expect” promise of tasklet killing is exactly why I like it and exactly what was fulfilled in my use case.

No Comments

Killing Tasklets


Today at work we had a presentation from the venerable Kristján Valur Jónsson about killing Tasklets in Stackless Python, a technique he and partner-in-crime Matthías Guðmundsson started using for their work on DUST514. As someone who’s done some asynchronous programming, this idea sounded blasphemous. It took a little while to stew but I see the value in it now.

The core idea is really simple. Code invokes the kill method on a tasklet. All kill does is synchronously raise a TaskletExit exception (which inherits from BaseException so as not to be caught by general error handling, same idea as SystemExit) on the tasklet. This bubbles up to the Stackless interpreter, and is caught and swallowed.

There are details of course, but that’s the gist. There are a few reasons I like this so much.

First, it uses standard Python exception handling. It’s really easy to explain and understand, like no question about finally blocks being executed (they are). The fact that the killed code runs and dies as I expect makes this at least something to look at.

Second, it can be synchronous. There’s no ‘kill and spin until it’s not alive’ type of thing. When you tell the tasklet to die, it faithfully obeys. You are Thulsa Doom. You can kill it asynchronously but the fact that synchronous behavior works in such a straightforward way is a big thing.

Third, it actually cancels the tasklet where it is. You cannot replicate this with some sort of cancellation token or flag, which always has some delay if it supports cancellation at all. It not only simplifies things but makes them work totally as desired. So even if you are theologically opposed to Tasklet.kill and prefer other disciplined techniques for writing async code, you can’t argue with the superior results.

In the end, you still need to write good code and maintain discipline about shared state and side effects, but I see not only the value in killing tasklets but the superiority of the choice. I hope Kristján (and also Christian Tismer, the other Christian primarily responsible for Stackless) do a more thorough talk about killing tasklets at some conference.

1 Comment

Python Singletons


In my opinion, good python libraries and frameworks should spend effort guiding you towards the ‘pit of success’, rather than trying to keep you from failing. They do this by spending most effort on things related to the critical path- clear interfaces, simple implementations, thorough documentation.

Which is why singletons are, to me, the worst form of framework masturbation in python. You will never be able to stop people from doing something stupid if they’re determined (in pure python). In the case of a singleton, that means instantiating more than one instance of a type. So spending effort on ‘designing’ singletons is not just a waste of effort, but actively harmful. Just provide a clear way to use a single instance, and your system should fail clearly if it detects an actual problem due to multiple instances (as opposed to, trying to detect multiple instances to keep said problem from happening).

The best method for singletons in python, then, is- whatever is simplest!

  1. Some form of module or class state is, to me, the clearest. It requires someone reading or using your code to know nothing more than the most basic python. Just prefix your class def with an underscore, and expose an accessor function to an instance stored on the module (or on the class). The capacity for failure is minimal and the behavior is clear (it requires no behavior modification to the type itself).
  2. Overriding __new__ is pretty bad but OK. It requires someone to understand the subtleties of __new__, which is a useful thing to teach someone but, are singletons really the time and place?
  3. Using a metaclass is a terrible solution. It has a higher likelihood of failure (how many people understand the nuances of metaclasses!?). Misdirection even for people just reading your code, trying to understand your type’s behavior. Avoid.
The question to ask yourself before doing any of this is, “is a singleton a technical requirement or an architectural preference?” Ie, a single instance of an application event loop (QApplication, etc) I’d consider a technical requirement and make it foolproof (in C?). But technical requirements are few and far between and should be driven by underlying system/OS requirements rather than your code’s design or architecture. If it’s an architectural preference- “there should only be one instance of this manager/window/cache”- there’s absolutely no reason to confuse your code (especially you object’s behavior!) to achieve it. Just use design, documentation, and examples, to show people the right way to use it.

Why GUI’s Lock Up


This is a post for the Tech Artists and new programmers out there. I am going to answer the common question “why do my GUI’s lock up” and, to a lesser extent, “what can I do about it.”

I’m going to tell a story of a mouse click event. It is born when a user clicks her mouse button. This mouse click goes from the hardware, to the OS. The OS determines that the user wants to interact with your GUI, so it sends the mouse click event to your GUI’s process. Here, the mouse click sits in a queue of other similar events, just hanging out. At some point, usually almost immediately, your process (actually the main thread in the process) goes through and, one by one, dispatches the items (messages) in the queue. It has determined our mouse click is over a button and then tells the button it’s been clicked. The button then usually repaints itself so it looks pressed, and then invokes any callbacks that are hooked up. The process (main thread) goes into one such callback you hooked up, that will look at 1000 files on disk. This takes a while. In the mean time, the user is clicking, but the messages are just piling up in the queue. And then someone drags a window over your GUI, because they’re tired of their clicks not doing anything and want to see what’s new on G+. The OS sends a message to your UI that it needs to repaint itself, but that message, too, just sits in the queue. At some point, your OS may even realize your window is not responding, and fade it out and change the title bar. Finally your on-button-click callback finishes, the process (thread) is done processing our initial mouse click, and then goes back to processing the messages that may have accumulated in the queue, and your UI will refresh and start responding again.

All this happens because the thread that processes messages to draw the UI was also responsible for looking at 1000 files on disk, so it wasn’t around to respond to the paint and click messages. A few pieces of info:

  1. You can’t just ‘update the UI’ from the middle of your code. In addition to being terrible form code-wise, clearing the message queue would just cause other things to block the main thread, and it’d all get into one giant asynchronous mess. Some programs may have their own UI framework that supports this. Don’t trust it. You really just need the main/GUI thread clear as much as possible to respond to events.
  2. Your GUI process has a single ‘main thread.’ A thread roughly corresponds to, and I’m being not nuanced here, the software concept of a hardware CPU core. Your GUI objects can only be created and manipulated by the main thread.

This means, you want to keep your main thread free so it can act on GUI stuff (paint events, mouse clicks) only. The processing, such as your callback that looks at 1000 files, should happen on another thread (a background thread). When the processing is complete, it can tell the GUI thread that it is finished, and the GUI thread can update the UI. Your background thread can also fire events or invoke a callback that will be picked up by the GUI thread, so the GUI can update a progress bar or whatever.

How you actually do this varies with each UI framework. .NET, including WinForms and WPF, is quite easy to use (look at the BackgroundWorker class, but the Tasks Parallel Library and Async CTP make that less necessary). Python GUI frameworks are a bit worse off- multithreading in python in general is worse off- so it’ll be different for each one, and probably not as simple as .NET. There’s no excuse for python GUI’s to lock up, it just takes a little more effort to get it completely right (like callbacks to update a UI are a bit tricky).

There is one other vital thing to keep in mind- DCC programs generally require you to interact with the API or run all their script on the main thread, which as discussed should also be kept clear. Bummer! So the best thing we can do is block while we get our data from the scene, put the processing on a background thread, and report back to the main thread when done, applying the new data back to the scene if necessary. Unfortunately, if your processing interacts with the API in any way, you probably need to put it in the main thread as well. So, right now, your GUI’s in DCC apps may need to lock up, by design. There are, in theory, ways to avoid this, but they’re well outside of the scope of what you can handle if you’re learning anything from this article.

Whatever your language and program, those are the essentials of why your GUI locks up.

Note: This info is not nuanced (and is less accurate the lower down things go), may not be terminologically perfect (though it should be vulgarly comprehensible), and is Windows-only, though it should be enough to know how any higher-level GUI framework (such as Qt) would work on a non-Windows system).