Agile project management versus agile development

by Rob Galanakis on 24/02/2014

I have a saying I like to use when discussing Scrum: Scrum is an Agile project management methodology, not an Agile development methodology. Scrum delivers tools for managing the project (planning, scheduling), but very little for how to develop (design, program, test) it. To do Agile properly, you really need both. This is why eXtreme Programming (XP) and Scrum “fit together” so nicely, with XP telling you how to build your product and Scrum usually taking a higher-level view (and the overlaps are usually the same practices).

The complementary nature of project management and development methodologies is important to understand and embrace. It is also one reason I don’t believe you can implement Scrum effectively without a programming background. Ultimately this comes down to a very simple thing for me:

If you do not have comprehensive automated tests, you cannot be Agile.(1) I consider this a fundamental truth. Yet rarely do books about Scrum spend more than a few pages talking about how vital this is, rarely is it discussed enough in Scrum Master training courses, rarely do the project managers now running most Agile implementations seem to understand this.

Automated tests are the fundamental building blocks from which all other Agile practices flow. This also informs how I treat Agile as a whole: Agile project management should be free-flowing and not rigid, but Agile development should be rigorously adhered to. That is sure to be inflammatory so let me elaborate.

Rigorous Agile development

  1. If automated tests are fundamental, and Test Driven Development (TDD) is the only way to get comprehensive test coverage, that means TDD is not optional. You don’t need to use it everywhere, but TDD should be the rule, not the exception.
  2. In the same way, you cannot be Agile if one person ends up being a bottleneck, so collective code ownership is required. I think it’s less important whether this is done through pairing or code reviews (probably both), and more that collective code ownership is a rule and any exceptions are seen as a big problem. Automated tests are required to collectively own code, as it’s the only way people can reliably make changes.
  3. Continuous integration is equally important, and depends on automated tests to be reliable and problems to be quickly fixed (it relies on the other two practices). You must always have a high quality build available and the code people are writing should be quickly (once a day or more) and easily getting into that build.

These three practices I consider absolute.(2) Maybe you can add some, but you are not allowed to decide you want to exclude any of those three from your Agile implementation.(3) To do so invalidates the principles on which Agile rests. So it follows that you should not be allowed to have zero experience in them if you’re an Agile leader. You cannot be Agile without them, no matter how briefly Scrum literature covers these topics; I would bet most of the writers of that literature would agree.

Flexible Agile project management

Opposite the rigorous adherence to specific development practices is experimentation with general project management practices. In this area, things are much more about principles (primarily feedback and continuous improvement) and less about practices or processes. Your sprint should be as long as you need to get good feedback, which varies depending on project/team/technology. Your retrospectives should be run so you can continuously improve. Your planning should be run so it becomes more accurate and insightful over time. Ten solutions can work well at ten different workplaces. Even more, ten solutions can work well at the same workplace(4). Just make sure you are continually improving, and keep trying new things!

Having your cake and eating it too

This distinction between development and project management is how I navigate the rift between Agile nihilists and Agile purists. The former say with disdain, “Whatever works for you!” The latter chant fervently, “You cannot pick and choose!” It turns out they can both be right, but it’s important to understand how and why. The nihilists end up floating, never realizing the transformative power of Agile because they refuse to adhere to the three vital, but initially taxing, processes. The purists can drive change, but not transform, because they do not create new practices that fully embrace each unique situation.(5) Rather than trying for a happy medium between the poles, I find Agile is best done by being at the extremes simultaneously.

  1. I don’t know how to rebut arguments against this point, other than asking “have you worked on a project that was well-developed with TDD?” If not, I would try it out before you make excuses for a compromised form of Agile without comprehensive automated tests.
  2. More accurately, what they achieve I consider absolute. If you wanted to get rid of any of them, you’d need to replace them with something suitable. For example, if you didn’t want to use TDD, you’d need to demonstrate some other way to reliably build comprehensive automated tests. And actually I have great hopes for some automated alternative to TDD one day…
  3. Uncle Bob recently posted about how software projects need this sort of discipline, perhaps by having a “foreman”. I don’t agree on the solution, but I do agree that we do need to rigorously adhere to certain practices.
  4. This is probably attributable to the Hawthorne effect.
  5. To be fair, many purists (especially outside of programming) overlook the three vital development practices because they are so keen on implementing the easier Scrum project management tools that require less training and invasive changes.

Using code metrics effectively

by Rob Galanakis on 19/02/2014

As the lead of a team and then a director for a project, code metrics were very important to me. I talked about using Sonar for code metrics in my previous post. Using metrics gets to the very core of what I believe about software development:

  • Great teams create great software.
  • Most people want to, and can be, great.
  • Systematic problems conspire to cheat people into doing less than their best work.

It’s this last point that is key. Code metrics are a conversation starter. Metrics are a great way to start the conversation that says, “Hey, I notice there may be a problem here, what’s up?” In this post, I’ll go through a few cases where I’ve used metrics effectively in concrete ways. This is personal; each case is different and your conversations will vary.

Helping to recognize bad code

A number of times, I’ve worked with one of those programmers who can do amazing things but write code that is unintelligible to mortal minds. This is usually a difficult situation, because they’ve been told what an amazing job they’ve been doing, but people who have to work with that code know otherwise. Metrics have helped me have conversations with these programmers about how to tell good code apart from bad.

While we may not agree what good code is, we all know bad code when we see it, don’t we? Well, it turns out we don’t. Or maybe good code becomes bad code when we aren’t looking. I often use cyclomatic complexity (CC) as a way to tell good code from bad. There is almost never good code with a high CC. I help educate programmers about what CC is and how it causes problems, giving ample references for further learning. I find that because metrics have a basis in numbers and science, they can counteract the bad behaviors some programmers have that are continually reinforced because they get their work done. These programmers cannot argue against CC, and without exception have no desire to. They’re happy to have learned how they can keep themselves honest and write better code.

It’s important to help these programmers change their style. I demonstrate basic strategies for reducing CC. Usually this just means helping them split up monolithic functions or methods. Eventually I segue into more advanced techniques. I’ve seen lightbulbs go off, and people go from writing monolithic procedures to well-designed functions and classes, just because of a conversation based in code metrics and followup mentoring.

I use CC to keep an eye on progress. If the programmer keeps writing code with high CC, I have to work harder. Maybe we exclusively pair until they can stand on their own feet again. Bad code is a cancer, so I pay attention to the CC alarm.
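To make the metric concrete, here is a minimal sketch of how cyclomatic complexity can be approximated for Python functions with the standard `ast` module. It is not how Sonar computes CC. The name `approx_complexity` and the list of counted node types are assumptions made for illustration: one point per function, plus one per branch point.

```python
import ast

# Node types treated as branch points. This list is a simplification chosen
# for illustration; real tools count a few more constructs.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.IfExp, ast.comprehension)


def approx_complexity(source):
    """Return {function_name: approximate cyclomatic complexity}."""
    results = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1  # a function with no branches has exactly one path
            for child in ast.walk(node):
                if isinstance(child, BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds two extra decision points
                    score += len(child.values) - 1
            results[node.name] = score
    return results


SAMPLE = '''
def simple(x):
    return x + 1

def tangled(items, flag):
    total = 0
    for item in items:
        if item > 0 and flag:
            total += item
        elif item < 0:
            total -= item
    return total
'''

scores = approx_complexity(SAMPLE)
```

In this sample `simple` scores 1 and `tangled` scores 5 (the loop, the `if`, the `and`, and the `elif` each add one). Splitting `tangled` into smaller functions is the kind of basic strategy that brings the number down.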

Writing too much code

A curious thing happens in untested codebases: code grows fast. I think this happens because the code cannot be safely reused, so people copy and paste with abandon (also, the broken windows theory is alive and well). I’ve used lines of code (LoC) growth to see where it seems too much code is being written. Maybe a new feature should grow a thousand lines a week (based on your gut feeling), but if it has grown 3000 lines a week for the last few weeks, I must investigate. Maybe I learn about some deficiency in the codebase that caused a bunch of code to be written, maybe I find a team that overlooked an already available solution, maybe I find someone who copied and pasted a bunch of stuff because they didn’t know better.

Likewise, bug fixing and improvements are good, so I expect some growth in core libraries. But why are a hundred lines a week consistently added to some core library? Is someone starting to customize it for a single use case? Is code going into the right place, do people know what the right place is, and how do they find out?

LoC change is my second favorite metric after CC, especially in a mature codebase. It tells me a lot about what sort of development is going on. While I usually can’t pinpoint problems from LoC like I can with CC, it does help start a conversation about the larger codebase: what trends are going on, and why.
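The LoC check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `flag_unexpected_growth`, the weekly snapshot format, and the `tolerance` multiplier are all hypothetical, and the expected growth is still your gut feeling.

```python
def flag_unexpected_growth(weekly_loc, expected_per_week, tolerance=2.0):
    """Return (week_index, growth) pairs where growth exceeds the allowance.

    weekly_loc is a list of total line counts, one per week, oldest first.
    The allowance is expected_per_week * tolerance.
    """
    flagged = []
    for week in range(1, len(weekly_loc)):
        growth = weekly_loc[week] - weekly_loc[week - 1]
        if growth > expected_per_week * tolerance:
            flagged.append((week, growth))
    return flagged


# A feature expected to grow about a thousand lines a week.
feature_loc = [10000, 11000, 14000, 17000, 18000]
flagged = flag_unexpected_growth(feature_loc, expected_per_week=1000)
```

Here weeks 2 and 3 are flagged with 3000 lines of growth each. The output is a prompt for a conversation, not a verdict.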

Tests aren’t being written

A good metrics collection and display will give you a very clear overview of which projects or modules have tests and which do not. Test count and coverage numbers and changes can tell you loads about not just the quality of your code, but how your programmers are feeling.

If coverage is steadily decreasing, there is some global negative pressure you aren’t seeing. Find out what it is and fix it.

  • Has the team put themselves into a corner at the end of the release, and are now cutting out quality?
  • Is the team being required to constantly redo work, instead of releasing and getting feedback on what’s been done? Are they frustrated and disillusioned and don’t want to bother writing tests for code that is going to be rewritten?
  • Are people writing new code without tests? Find out why, whether it’s due to a lack of rigor or a lack of training. Work with them to fix either problem.
  • Is someone adding tests to untested modules? Give them a pat on the back (after you check their tests are decent).
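A "steadily decreasing" coverage trend can be detected mechanically before anyone goes looking for the cause. The sketch below assumes a plain list of coverage percentages, one per nightly run; the function name `is_steadily_decreasing` and the `min_runs` window are hypothetical choices, not something taken from Sonar.

```python
def is_steadily_decreasing(coverage_history, min_runs=4):
    """Return True if coverage dropped at every step of the last min_runs runs."""
    recent = coverage_history[-min_runs:]
    if len(recent) < min_runs:
        return False  # not enough data to call it a trend
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


nightly_coverage = [82.0, 81.5, 80.9, 79.0, 78.2]
needs_a_conversation = is_steadily_decreasing(nightly_coverage)
```

A single dip is noise; several consecutive drops are a reason to start asking the questions in the list above.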

Driving across-the-board change

I’ll close with a more direct anecdote.

Last year, we ‘deprecated’ our original codebase and moved new development into less coupled Python packages. I used all of the above techniques along with a number of (private) metrics to drive this effort, and most of them went up into some visible information radiators:

  • Job #1 was to reduce the LoC in the old codebase. We had dead code to clean up, so watching that LoC graph drop each day or week was a pleasure. Then it became a matter of ensuring the graph stayed mostly flat.
  • Job #2 was to work primarily in the new codebase. I used LoC to ensure the new code grew steadily; not too fast (would indicate poor reuse), and not too slow relative to the old codebase (would indicate the old codebase is being used for too much new code).
  • Job #3 was to make sure new code was tested. I used test count and coverage, both absolute numbers and of course growth.
  • Job #4 was to make sure new code was good. I used violations (primarily cyclomatic complexity) to know when bad code was submitted.
  • Job #5 was to fix the lowest-hanging debt, whether in the new or old codebase. Sometimes this was breaking up functions that were too long, more often it was merely breaking up gigantic (10k+ lines) files into smaller files. I was able to look at the worst violations to see what to fix, and work with the programmers on fixing them.

Aside from the deleting of dead code, I did only a small portion of the coding work directly. The real work was done by the project’s programmers. Code metrics allowed me to focus my time where it was needed in pairing, training, and mentoring. Metrics allowed the other programmers to see their own progress and the overall progress of the deprecation. Having metrics behind us seemed to give everyone a new view on things; people were not defensive about their code at all, and there was nowhere to hide. It gave the entire effort an air of believability and achievability, and made it seem much less arbitrary than it could have been.

I’ve used metrics a lot, but this was certainly the largest and most visible application. I highly suggest investing in learning about code metrics, and getting something like Sonar up on your own projects.

Using Sonar for static analysis of Python code

by Rob Galanakis on 15/02/2014

I’ve been doing static analysis for a while, first with C# and then with Python. I’ve even made an aborted attempt at a Python static code quality analyzer (pynocle, I won’t link to it because it’s dead). About a year ago we set up Sonar to analyze the Python code on EVE Online. I’m here to report it works really well and we’re quite happy with it. I’ll talk a bit about our setup in this post, and a future post will talk more about code metrics and how to use them.

Basic Info and Setup

Sonar consists of three parts:

  • The Sonar web interface, which is the primary way you interact with the metrics.
  • The database, which stores the metrics (Sonar includes a demonstration DB, production can run on any of the usual SQL DBs).
  • The Sonar Runner, which analyzes your code and sends data to the database. The runner also pulls configuration from the DB, so you can configure it locally and through the DB.

It was really simple to set up, even on Windows. The web interface has some annoyances which I’ll go over later, and sometimes the system has some unintuitive behavior, but everything works pretty well. There are also a bunch of plugins available, such as for new widgets for the interface or other code metrics checks. It has integrations with many other languages. We are using Sonar for both C++ and Python code right now. Not every Sonar metric is supported for Python or C++ (I think only Java has full support), but enough are supported to be very useful. There are also some worthless metrics in Python that are meaningful in Java, such as lines in a file.

The Sonar Runner

I’ll cover the Runner and then the Website. Every night, we have a job that runs the Runner over our codebase as a whole, and each sub-project. Sonar works in terms of “projects” so each code sub-project and the codebase as a whole have individual Sonar projects (there are some misc projects in there people manage themselves). This project setup gives higher-level people the higher-level trends, and gives teams information that is more actionable.

One important lesson we learned was to configure a project either on the Runner side or on the web site, not both. One example is exclusions: Sonar will only respect exclusions from the Runner or from the Web, so make sure you know where things are configured.

We also set up Sonar to collect our Cobertura XML coverage and xUnit XML test result files. Our Jenkins jobs spit these out, and the Runner needs to parse them. This caused a few problems. First, due to the way files and our projects were set up, we needed to do some annoying copying around so the Runner could find the XML files. Second, sometimes the files use relative or incomplete filenames, so parsing of the files could fail because the Python code they pointed to was not found. Third, the parsing errors were only visible if you ran the Runner with DEBUG and VERBOSE, so it took a while to track this problem down. It was a couple days of work to get coverage and test results hooked into Sonar, IIRC. Still, these were two of the most useful metrics and essential to integrate, even though we already had them available elsewhere.
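The filename problem can be checked for before the Runner ever sees a report. The following is a rough sketch, not part of Sonar: it assumes the usual Cobertura layout where each `<class>` element carries a `filename` attribute, and the function name `find_unresolved_files` and the `source_roots` parameter are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def find_unresolved_files(cobertura_xml, source_roots):
    """Return sorted filenames from the report that match no file on disk."""
    report = ET.fromstring(cobertura_xml)
    missing = set()
    for cls in report.iter("class"):
        filename = cls.get("filename")
        if not filename:
            continue
        if os.path.isabs(filename):
            candidates = [filename]
        else:
            # Relative names are tried against every known source root.
            candidates = [os.path.join(src, filename) for src in source_roots]
        if not any(os.path.isfile(path) for path in candidates):
            missing.add(filename)
    return sorted(missing)


SAMPLE_REPORT = """<coverage>
  <packages><package name="pkg"><classes>
    <class name="found" filename="pkg/found.py" line-rate="1.0"/>
    <class name="gone" filename="pkg/gone.py" line-rate="0.5"/>
  </classes></package></packages>
</coverage>"""
```

Running something like this in the Jenkins job right after the tests finish would surface unresolvable paths directly, instead of leaving them buried in debug output.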

The Sonar Website

The Website is slick but sometimes limited. The limitations can make you want to abandon Sonar entirely :) Such as being able to view metrics for only three time periods; you cannot choose a custom period (in fact you can see the enum value of the time period in the URL!). Or that the page templates cannot be configured differently for different projects (ie, the Homepage for the ‘Entire Codebase’ project must look the same as the Homepage for the ‘Tiny Utility Package’ project). Or that sometimes things just don’t make sense.

In the end, Sonar does have a good deal of configuration and features available (such as alerts for when a metric changes too much between runs). And it gets better each release.

The Sonar API

Sonar also has an API that exposes a good deal of metrics (though in traditional Sonar fashion, does not expose some things, like project names). We hook up our information radiators to display graphs for important trends, such as LoC and violations growth. This is a huge win; when we set a goal of deleting code or having no new violations, everyone can easily monitor progress.


If you are thinking about getting code metrics set up, I wholeheartedly recommend Sonar. It took a few weeks to build up expertise with it and configure everything how we wanted, and since then it’s been very little maintenance. The main struggle was learning how to use Sonar to have the impact I wanted. When I’ve written code analysis tools, they have been tailored for a purpose, such as finding the methods/functions with the highest cyclomatic complexity. Sonar metrics end up giving you some cruft, and you need to separate the wheat from the chaff. Once you do, there’s no beating its power and expansive feature set.

My next post will go into more details about the positive effects Sonar and the use of code metrics had on our codebase.


Can you grok Agile without a programming background?

by Rob Galanakis on 12/02/2014

The last fourteen years have been strange. The Agile movement, which was spawned out of the controversy over Extreme Programming, skyrocketed into success, and was subsequently taken over by the project managers who all but pushed the programmers out. We’ve seen the creation, the wild success, and the corresponding (and predictable) impotence, of certifications… We’ve experienced continuous and vocal process churn as consultants and authors split and competed over Kanban, Lean, and every new project-management prefix-of-the-day…

Taken from Extreme Programming, a Reflection, by Uncle Bob

Well, I’m glad I’m not alone in those feelings! But the core question, which will continue recurring, is: can you truly understand Agile, or any development methodology, without having done that most fundamental development work: programming? My gut and experience tell me absolutely not, yet we continue to hand over control of our development methodology to people who have never done programming, personally or professionally. I doubt I am alone in this sentiment.

And then there’s W. Edwards Deming, the stepfather of the modern Japanese automotive industry and Lean movement, who was not in any way initially a car guy, who was a statistician by education and practice. I could dismiss his profound success and influence as due to his brilliance or individual abilities, but I feel it’d be unfair and simplistic. I suspect serious insight can be made by having a “process expert” who is not an expert in the subject matter. A voracious reader, experimenter, learner, statistician. But rare. Not many are needed; those that are should be coaches and not managers.

Deming was disruptive, and unaccepted in America. “No man is a prophet in his own land.” Is the project manager running your standups Deming-like? Is he applying the principles of Lean and Agile, or just the tools of Scrum? Do retrospectives involve questioning the entire system, or have they degenerated to complaints about other teams?

I’m not saying your project manager is doing a bad job, but he’s likely no Deming. So should he be responsible for the implementation of Agile? Agile is, at its core, about creating tools that support principles that govern the way programming and development is done. Why should this be up to someone that doesn’t program, and never has? What outcome do you expect?

Every original signatory of the Agile Manifesto has experience as a software developer (or in one case, tester).

Programmers: take back ownership over Agile and the way you work.


Teaching TDD: The importance of expectations

by Rob Galanakis on 8/02/2014

I read this interesting article from Justin Searls about the failures of teaching TDD, and his proposed improvements. Uncle Bob wrote up an excellent response on how Justin’s grievances are valid but his solutions misguided.

Justin’s article included an excellent image which he calls the “WTF now, guys?” learning curve. I think it sums up the problems with teaching TDD perfectly, and its existence is beyond dispute. I’ll call the gap in learning curve the WTF gap.

[Image: the “WTF now, guys?” learning curve]

Uncle Bob, in his post linked above, touches on a very important topic:

Learning to refactor is a hill that everyone has to climb for themselves. We, instructors, can do little more than make sure the walking sticks are in your backpack. We can’t make you take them out and use them.

Certainly, giving a student the tools to do TDD is essential; but it is equally essential to stress the inadequacies of any tool an instructor can provide. A walking stick is not enough. A good instructor must lay down a set of expectations about TDD for students, managers, and organizations so they can deal with the frustration that arises once the WTF gap is hit.

First expectation: TDD describes a set of related and complicated skills. TDD involves learning the skills of writing code, test design, emergent design, refactoring, mocking, new tools, testing libraries, and more. You cannot teach all of these in a week, or even on their own. Instructors must introduce them via TDD, slowly. Until the student has proficiency with all of these topics, they cannot get over the WTF gap. This fundamentally changes how TDD must be practiced; students need a mentor to pair with them on work assignments when new TDD skills are introduced. If an organization or team is starting TDD, they need to have a mentor, or budget considerable time for learning.

Second expectation: Your early TDD projects will suck. Just like you look back in horror on your projects from when you first learned programming, or learned a new language, your first projects you use TDD on will have lots of problems. Tests that are slow, brittle, too high-level, too granular, code that has bugs and is difficult to change. This is normal. The expectation must be set that the first couple libraries or projects you use TDD for are going to be bad; the goal is to learn from it. The WTF gap is real; people must expect it and persevere past it.

Third expectation: If your organization isn’t supportive, you will fail. If you are one lonely person using TDD on a team of people who are not, you will fail. If you are one lonely team trying to use TDD in part of a larger codebase where others are not, you will fail. If your organization sets ridiculous deadlines and does not allow you to learn the TDD skillset over time, being slower initially, you will fail. If you want to do TDD and are not enabled, go somewhere you will be, budget time for changing the culture, or live with endless frustration and broken dreams. You need help to bridge the WTF gap.

Fourth expectation: TDD for legacy code is considerably more difficult. Learning environments are pristine; our codebases are not. There is a different set of strategies for working with legacy code (see Michael Feathers’ book) that require a pretty advanced TDD skillset. Beginners should stay away; use TDD for new things and do not worry about refactoring legacy code at first. At some point, pulling apart and writing tests for tangled systems will get easier, and may even become a hobby. The WTF gap is big enough; don’t make it more difficult than it need be by involving legacy code.

My feeling on teaching TDD is that no matter how you teach it, whether with some of Justin’s flawed ideas or Bob’s proven ones, you need to set proper expectations for students, managers, and teams.

What a powerful thing metaprogramming is!

by Rob Galanakis on 5/02/2014

While editing a chapter of my book, I was introducing the concept of metaprogramming using Python’s type function. It occurred to me that I had already introduced metaprogramming several chapters earlier when introducing decorators.

Defining a function within another function is as important to my programming as bread is to French cuisine. I began thinking of all those cultures without wheat; all those languages without, or with newly added support for, metaprogramming. I have never done serious development in a language without anonymous functions, closures, and reflection.
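For readers who have not met these features, here is a small illustrative sketch (not taken from the book) of the two ideas mentioned so far: a decorator built from a function defined inside another function, and a class created at runtime with `type`. The names `logged`, `add`, `describe`, and `Point` are made up for the example.

```python
import functools


def logged(func):
    """Decorator: record the arguments of every call in wrapper.calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # wrapper is a closure: it remembers func from the enclosing call.
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper


@logged
def add(a, b):
    return a + b


def describe(self):
    return "%s(%d, %d)" % (type(self).__name__, self.x, self.y)


# type(name, bases, namespace) builds a class at runtime, much as a
# class statement would.
Point = type("Point", (object,), {"x": 1, "y": 2, "describe": describe})

result = add(2, 3)
description = Point().describe()
```

Both pieces are code that produces or modifies other code at runtime, which is why decorators already count as metaprogramming.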

It was exciting to think of a coming generation of programmers who are in my shoes (I started programming relatively late in life), who would be inherently comfortable with passing functions. It was exciting to realize where languages are going, keeping static typing but removing the explicit part. It was exciting to think of how flexible, expressive, and powerful languages have become.

It also allowed me to think of less flexible languages and what they’ve been able to achieve. I am lucky to be programming now, but surely each programmer before me felt the same about those before them. More will feel the same after and about me. Really my luck is to be part of what is still such a new and remarkable part of the human endeavor.

All of this feeling from type and decorators. What a powerful thing metaprogramming is!


Agile Game Development is hard

by Rob Galanakis on 2/02/2014

I’ve spent the last few weeks trying to write a blog post about why Agile software development is inherently more difficult for games than other software. I searched for some fundamental reason, such as games being works of art, or being entertainment, or being more difficult to test, or anything about their very nature that makes game development different from other types of software development.

I couldn’t find one. Instead, I came up with reasons that are purely circumstantial, rooted in business models and development environments. Nonetheless, it is the situation we are in; the good news is, we can change it.

4+ Reasons Agile Game Dev is Tricky

Number one: the insane business model based on packaged games. Develop a game for years, market the hell out of it, ship it, profit, repeat. Crunching hard is probably in there, as is going bankrupt. Each year fewer and fewer games garner a larger share of the sales, and budgets are often reaching into the hundreds of millions of dollars to continue this model. This is pure insanity, so development methodologies of greater sanity, like those based on Agile principles, simply cannot thrive. Often they struggle to even take hold. Don’t underestimate the depth of this problem. We have a generation of executives and marketers (and developers) who know only this model, and trying to explain to them how you need to be flexible and iterative with releases and develop with tests can feel like a losing battle.

Number two: We’ve equated Scrum with Agile. Agile embodies a set of principles, but we’ve equated those principles with a (limited) set of tools: the Scrum project management methodology (you can substitute Lean and Six Sigma in the previous example; this phenomenon is not unique to games). If you’ve ever tried to impose Scrum on an art team, you can see how much of a disaster it is. Rather than take Agile or Lean principles and ask “what is a good way to work that values these principles?”, we just institute some form of Scrum. I’ve seen many people dismiss Agile because Scrum failed, which is a shame. And like Scrum, I’ve also seen forms of soulless Kanban implemented (soulless because it doesn’t support the principles of Kanban, like limiting work in progress, managing flow, and understanding constraints).

Number three: Game development was late to the Agile party. Software has had about 15 years to figure out how to apply Agile to business and consumer applications and websites. While “flaccid Scrum” now seems common in games, that’s relatively recent; combined with multi-year development cycles in these so-called “Agile” shops, there hasn’t been much of the learning and reflection that underpins Agile. On top of this, Agile is in a period of maturity right now and is being appropriated by project management, so it is difficult to innovate in the methodology space to come up with an alternative to something like eXtreme Programming that works in game development.

Number four is pretty interesting: Game sequels are not iterations. It is very common to build up mountains of debt to ship a game, and then throw away and rewrite those mountains for the sequel. This worked okay because sequels were usually much more disruptive than innovative so there were more opportunities for rewrites. In contrast, consider that the MS Office UI stayed basically the same from 1993 to 2006. Now as games are entering a loosely defined “software as a service” model, our development priorities must change. We need to be able to iterate month-by-month on the same codebase and pull things forward. This is a new set of skills we need to develop.

There are a number of smaller items that are less important but still should be pointed out:

  • Game development hasn’t embraced open source and is on Windows. Many developers and executives I’ve met have a distrust of OSS (CCP’s use and support of Python and other OSS is a source of pride for me and others here) and the majority of game development is on Windows. The Agile movement has strong roots in OSS and Linux, so aside from the cultural differences between the two communities (which should not be underestimated), there was just a lack of engagement between game developers on Windows and Agile evangelists on Linux.
  • Game development reinvents wheels. The availability of lots of excellent open source middleware has given non-game developers a leg up on focusing on their core product. If you had to develop your product and webserver, you’d incur not just the cost of developing both but of splitting focus. Game development has historically done a poor job of using middleware and has often reinvented the wheel; this has probably been due to the desire for maximum performance and to ridiculous deadlines and business models. With more hardware to spare, I suspect this will change and we’ll see things like HTTP used between client/server instead of custom RPC stacks.

Reasons Agile Game Dev is not Tricky

Finally, there are a number of arguments I have thought over and rejected, including:

  • Games are art and art cannot be iterated on like other software.
  • Games require too much ‘infrastructure’ to make anything playable.
  • Games want users to spend time, not save time.
  • Games are impossible, or at least significantly more difficult, to test.
  • Fat clients are difficult to distribute.
  • Frequent releases are confusing for games, which are traditionally content-heavy.

Call to Action

There are solutions to all of these problems, but they require getting to the core of Agile’s principles, and even more importantly, the Lean principles those are based on. What game development needs is a new set of practices and tools, better suited to our technological problems, that fulfill the same principles and can be mixed and matched with existing Agile practices and methodologies. These are some ideas and topics for discussion in future posts.


A story of simplification and abstraction (stackless timeouts)

by Rob Galanakis on 11/01/2014

Someone was asking me the other day how to implement a timeout in a thread. His initial implementation used two background threads: one to do the work (making requests to a web service and updating a counter), and the other in a loop polling the counter and sleeping. If the first thread stopped updating the counter, the second should report some sort of error.

I helped him simplify the design in a couple ways. First I had him use stackless instead of threads and taught him how threading and microthreads work. Based on that, I suggested that instead of a counter and loop/sleep, there is a parent tasklet that kicks off a child tasklet which does the actual work.* The parent tasklet recvs on a channel with a timeout, and the child tasklet sends on the channel to act like a heartbeat. If the parent recv times out, it means the child tasklet hasn’t reported in and the user can be alerted. This simplified the code considerably.
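The heartbeat design can be sketched roughly as follows. This is only an illustration, not the code we wrote: it swaps in standard-library threads and a queue.Queue as stand-ins for tasklets and a channel, and the names worker and supervise, along with the delays, are made up for the example.

```python
import queue
import threading
import time

_DONE = object()  # sentinel the child sends when all work is finished


def worker(items, heartbeat, delay):
    """Child: do the work and send a heartbeat after each item."""
    for item in items:
        time.sleep(delay)      # stands in for a web service request
        heartbeat.put(item)    # report in: "still alive"
    heartbeat.put(_DONE)


def supervise(items, timeout, delay):
    """Parent: start the child, then wait on its heartbeats.

    Returns True if the child finished, False if it went quiet
    for longer than `timeout` seconds.
    """
    heartbeat = queue.Queue()
    child = threading.Thread(target=worker, args=(items, heartbeat, delay))
    child.daemon = True        # a stalled child must not keep the process alive
    child.start()
    while True:
        try:
            beat = heartbeat.get(timeout=timeout)
        except queue.Empty:
            return False       # no heartbeat in time: alert the user here
        if beat is _DONE:
            return True
```

The parent never polls a counter or sleeps in a loop; the blocking receive with a timeout does both jobs at once.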

I then asked a colleague (Kristján Valur) how to do a timeout with stackless, and he told me about the stacklesslib.util.timeout context manager. Doh! It ended up being as simple as:

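    try:
        for item in items:
            with stacklesslib.util.timeout(200):
                process(item)  # hypothetical: the per-item work
    except stacklesslib.util.TimeoutError:
        alert_user()  # hypothetical: report the stalled work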

It’s pretty amazing what sort of power you’re able to wield with a good language and framework. It’s so important to have the right abstractions, but you need to know how to use them. Even with documentation, nothing beats a little help from your friends.

*Instead of a channel, we probably could have used an Event.


Why CCP is still using Python 2

by Rob Galanakis on 9/01/2014

We at CCP are maybe the heaviest users of Python in videogames, though I have no data to back that up. (I’ll also use this opportunity to say this is a personal blog post; I am not writing in any official capacity here.) We use it in the client and server of both EVE Online (PC) and DUST 514 (PS3), and nearly all of our internal infrastructure uses it. What’s stopping us from upgrading at least some portions to Python 3?

Everything. Even if Python 2 weren’t good enough (it is), even if a hundred compelling features were added to Python 3 (there aren’t), even if Stackless was available for Python 3 (it isn’t), we still probably wouldn’t switch. Because literally everything conspires against enterprise employees who want to upgrade any significant codebase. Let’s go over some of those things.

We have our own localization solution inside EVE and the unicode/str bugs have been worked out. Oh, the solution is a nightmare, and our string handling is often a mess, but that just means changing it would be even more difficult. So there’s no real external product need, and internal products and tools usually aren’t localized. But wouldn’t it be great to change it and get rid of that technical debt and simplify things?

Sure, it would be great to get rid of that tech debt! As it would for literally every area of our 11+ year old product. There is a limited amount of technical debt we can clean up, and none of it has to do with string handling or any Python 3 features. We just removed a custom importer we’ve wanted to remove for years, which paves the way for other technical debt cleanup. But we’re at least a year from another codebase-wide cleanup, of which there are many to do (let’s remove our dozen remaining builtins, please!). When low-level or non-value-adding work involves convincing people all the way up to the corporate/business level, there are very few people who can organize that sort of thing. I generally prefer that energy is spent on activities that generate more value for our engineers.

We have relatively few automated tests. We’ve made some great progress on testing in the past couple years, especially for new or refactored/rewritten code, but there’s absolutely no way to uncover and fix Python 3 upgrade bugs easily. We have “extensive” manual regression tests, but it takes time to get builds to those testers and I imagine the regression test cycle would take months to get everything worked out. It would be a hard sell to QA and the turnaround time on bugs would mean the upgrade process would take months, not weeks.

The only place a Python 3 upgrade gets traction is with the core of the Python community here. And unfortunately we spend most of our capital by trying to improve our own systems, training our Python programmers, and even keeping other developers using Python.*

We can get backports of libraries developed for or in the stdlib of Python 3. If they weren’t provided in a sustainable way by others, we’d just copy them in and keep them locally.

We keep a critical eye on performance. While Python 3 seems fast enough for us now, that’s a pretty new development, and we absolutely will not regress on performance. So that could mean upgrading our codebase and still not being able to use Python 3, since we won’t fully know until we do it. Lovely!

Middleware, the scourge of OSS! We use Autodesk Maya, which uses 2.6 (soon 2.7) for our art pipeline. We would need to write a large chunk of our code to support Python 2 and 3 (an ongoing inconvenience), but also need infrastructure to test them in both (added work but not a huge deal, since we already do this in some cases for 2.6/2.7). But middleware is an easy excuse to stop an upgrade. Even upgrading middleware like Maya, which carries little risk, can take over a year.

Ultimately we’ve been too sloppy and monolithic** (like a large number of enterprise users, I’d imagine). Our products make money. Our developers are supposed to be working on those products and thus making money. We didn’t do a good job balancing technical and business needs, and besides, our codebase is really old (EVE Online’s codebase is still a descendant of the original codebase that EVE released with in 2001).

All of this means the “default” Python for me is Python 2. It is my “go to” language when I’m working on something where I have the option. I haven’t used Python 3 for any real work, so there’s still a threshold to cross. Familiarity and comfort. I don’t even have Python 3 installed at work. And most of the people I know are working in Python 2, so there’s a synergy of inertia. I have no problem with Python 3*** and plan to roll it out at work somewhere (it is a 2014 goal of mine), though I am intrigued with all the Python 2.8 (both vanilla and Stackless) discussions going on recently.

* There are some defections to other languages, and causing inconvenience to these customers on the fence is only more likely to push them into other languages. I have nothing against using other languages inhouse (in fact I wholly support it), I just want to make sure they’re used for the right reason. There’s value in being able to work in other people’s code without having to learn a new language and environment, and there’s value to doing as much as possible in high level languages.

** This has in many ways changed, with much better automated testing and code quality overall, but legacy code is still the majority. We’re just producing less new legacy code.

*** I also understand that upgrading applications like EVE Online was not part of the Python 3 adoption strategy.


Multiple return values and errors

by Rob Galanakis on 31/12/2013

I was pointed at a rather simplistic article on Ars Technica about why most programming languages only return a single value from a function, and thought about my “first class tuples” post from a few weeks ago. Python can effectively support multiple return values due to its built-in tuple unpacking, and doesn’t need to resort to heavyweight or unnecessarily verbose custom types or dictionaries (see tuples post for my rationale). I use “multiple return values” in this way often for the internals of a package where the API can be more fluid and implicit.
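As a small illustration of what I mean (min_max is a made-up helper, not code from any real package), a function returns a plain tuple and the caller unpacks it in place:

```python
def min_max(values):
    """Return (lowest, highest) as a plain tuple."""
    values = list(values)  # allow any iterable, including generators
    return min(values), max(values)


# The caller unpacks the tuple directly into two names.
lowest, highest = min_max([3, 1, 4, 1, 5])
```

No custom result type or dictionary is needed, which is what keeps this convenient for package internals.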

This wouldn’t be interesting but I was rereading some articles by the most excellent Laurence Tratt, including this one: A Proposal for Error Handling, which is about his experience writing very fault-tolerant software in C with error codes and comparing that to languages with exceptions. His proposal basically comes down to language support for (return value, error code), and turning error codes not explicitly handled by the caller into exceptions that should only rarely be caught. I know that sounds *just* like how exception handling works, but there are subtle yet vital differences he provides in his article (which is worth a read, as I rarely see people who “get” both error-code programming and exception-based programming so thoroughly).

The talk of multiple return values and Laurence’s proposal reminded me of GoLang, and its support for (return value, error code) and panic/recover (like exceptions that should only rarely be caught). The design has always intrigued me, though I have never written a large enough Go program (or feel I understand Go idioms enough) to have a strong opinion. But given Go’s primary use for service programming, in light of Laurence’s article it seems like a very solid design.

But back to Python. I’m curious if anyone ever implemented a system where (return value, error code) was used by convention instead of exceptions as the primary error-handling mechanism? Because we have this built-in tuple unpacking, we can use (return value, error) without explicit Go-style language support, but I’m curious if anyone has done it and what their experience has been.
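To make the question concrete, here is a rough sketch of what such a convention might look like. Everything in it is hypothetical: parse_port and must are names invented for the example, and must is only my loose approximation of "unhandled errors become exceptions", which without language support has to be an explicit call.

```python
def parse_port(text):
    """Return (port, error); exactly one of the two is None."""
    try:
        port = int(text)
    except ValueError:
        return None, "not a number: %r" % (text,)
    if not 0 < port < 65536:
        return None, "out of range: %d" % port
    return port, None


def must(result):
    """Unwrap a (value, error) pair, raising if the error was not handled."""
    value, err = result
    if err is not None:
        raise RuntimeError(err)
    return value


# Caller that handles the error explicitly, Go-style.
port, err = parse_port("8080")
if err is not None:
    port = 80  # fall back to a default

# Caller that does not want to handle it: the error becomes an exception.
ssh_port = must(parse_port("22"))
```

The obvious weakness is that nothing stops a caller from ignoring err entirely, which is exactly the gap that explicit language support is meant to close.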