Too ignorant to know better

My first big python project last year was yet another feed aggregator (taogreggator). Before I started, I looked around at what other aggregators were available, and wasn’t happy with any of them in terms of features, complexity, or trying to get each working.

Of course, 9 months later, that project is dead and I’ve successfully got the python ‘planet’ module up and running at www.tech-artists.org/planet.

Note, this blog post probably reveals what a big programming phony I am ;) Remember though that this sort of thing is well outside my usual domain of expertise.

So what happened? Why did it take so long to realize I was doing something stupid, regroup, and adopt something that actually works?

I was too ignorant to know better. Well, to be fair, I didn’t undertake this project out of hubris or to build something better, I built it mostly as a significant project I could train my python skills with.

I’m not interested in why it failed. There are 100 reasons why it failed, none of them unexpected or interesting. I’m interested in why I undertook it in the first place and took so long to trash it.

1. I didn’t know anything about the web

I still know barely anything, but trying to take an existing package and get it running was incredibly difficult, because I was so far out of my depth. I didn’t even have the vocabulary, and was unfamiliar with everything I was supposed to do and with the concepts of how things worked. My own project allowed me to get into it gradually.

2. Too inexperienced to know the challenges ahead of me

It wasn’t actually that difficult to get the app running locally. I even opened up a router port and ran my PC as a server, for remote connections. But I had an Ubuntu server to deploy to, and knew nothing about Linux. I had never created a web app before. So at every step, I thought I was almost there. Every known was an unknown unknown to me, because I had no idea what to expect.

3. Too inexperienced with the commandline and the python environment

I talked about it in my Relearning Python series. When I started out, I didn’t really get how python works, because I came from .NET where I didn’t have to worry about any of that. I have a much, much better understanding now, and the environment is one of the early things I teach any new python programmer, because once you start importing code, or writing complex scripts, you need to know how it works. I didn’t understand the environment so I had a very difficult time getting any third-party systems set up.

4. Pythonic is more than a coding style

When I came to python, I was indoctrinated in the ways of a .NET programmer. It took me a long time to understand that ‘pythonic’ applies to more than just lines of code. It has to do with how you run your entire application. The way I run planet I’d consider entirely pythonic- I have a very thin script that generates and uploads some files. The planet module itself is pythonic- there’s some straightforward documentation, commented ini files, and templates, and you’re supposed to customize things and build a few wrapper scripts to run the stuff you need. This looseness was foreign, as I was used to a much more data-driven, rigid way of customizing an app. Being data driven is not great in all circumstances, especially when developing frameworks and apps like this, where the programmer is the user. When I saw what I ended up with using planet, I was embarrassed by how confusing my own design was (though, to be fair, it had more features planned). Without understanding how I should use modules like planet, I couldn’t use them. Such basic stuff is not covered in a readme.

So, several weeks ago, I finally made an effort to deploy my custom aggregator on an AWS windows server. I still couldn’t get it working. And I was having even more questions about why I did stuff a certain way (I don’t think the code or design is particularly bad, but it made it difficult to use on a server). It was a huge failure. So three days later, after an awful day at work, I regrouped, and spent the entire evening figuring out existing aggregators, and after struggling with various ones, chose ‘planet’, and got pretty much everything working.

The lessons are pretty clear. You need some minimum knowledge to be able to make an informed decision. Attempt something of a very limited scope to give you that knowledge before making your decision. You will have plenty of options to reinvent the wheel when you know what you’re doing. On the other hand, if you’re pursuing a project only for educational purposes, do whatever you want :)

Next time I’m going to follow some tutorial end to end. It was fun hacking away on something way too complex, but I failed to deliver a server to the community, and, tbh, the time could have been better spent.

Python logging best practices

Logging is one of those things that, being orthogonal to actually getting something done, many developers fail to learn the nuances of. So I want to go over a few things I had to learn the hard way:

We are blessed in the python community because we have the wonderful ‘logging’ module in our standard library, so there is no barrier to entry or excuse to not use proper logging mechanisms. There are often reasons to roll your own of something, but that something will probably never be logging. Don’t do it (this goes for all major languages).

The logging module is incredibly flexible. The ‘handlers’ are the key to leveraging the power of the logging module. Handlers can do pretty much whatever you want them to do. Once you get past the most basic logging, you should start reading up on Handlers. Understanding handlers is the key to understanding logging, in my experience.

Root-level configuration should generally only be done by the application, not any library modules. Ie, ‘logging.basicConfig’ should only be (and usually can only be) called very early on. Examples of root-level configuration are setting the format of the logs, setting the logs to print to stdout/stderr, etc. Anything that has to do with global state (and streams are examples of global state), should be handled by the application, never by a library. Rarely should you add a StreamHandler. A FileHandler for a single logger can be useful in some cases (like, if you have a server that is part of a larger application) but should generally be avoided.
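Here is a minimal sketch of that split, assuming a hypothetical library called "mylib" and an application that owns the root configuration. The names and the format string are mine, for illustration only. The library only asks for a logger and attaches a NullHandler; the application decides format, level, and destination.

```python
import io
import logging

# --- library side (hypothetical module "mylib") ---
lib_logger = logging.getLogger("mylib")
# A NullHandler keeps the library silent unless the application configures logging.
lib_logger.addHandler(logging.NullHandler())


def do_work():
    lib_logger.info("doing work")
    return 42


# --- application side ---
def configure_logging(stream):
    # Root-level configuration: destination, level, and format, done once and early.
    # force=True (Python 3.8+) replaces handlers already on the root logger; it is
    # used here only so the sketch behaves the same when it is re-run.
    logging.basicConfig(
        stream=stream,
        level=logging.INFO,
        format="%(name)s %(levelname)s %(message)s",
        force=True,
    )


app_stream = io.StringIO()  # stand-in for sys.stderr so the output can be inspected
configure_logging(app_stream)
do_work()
print(app_stream.getvalue(), end="")
```

The library never touches a stream or a format; swapping the StringIO for sys.stderr or a file is purely the application's call.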

If you have multiple classes in a file, give them each their own logger. Do not use a single module logger for many classes. Identify the logger by the class name so you know what logger produced what log.

Putting self.logger = logging.getLogger(type(self).__name__) on a base class is a good way to get a unique logger for each subclass, without each subclass having to set up their own logger.
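A small sketch of the base class trick. The class names are made up for the example.

```python
import logging


class Base:
    def __init__(self):
        # type(self) is the concrete subclass, so every subclass gets a logger
        # named after itself without writing any setup code of its own.
        self.logger = logging.getLogger(type(self).__name__)


class Exporter(Base):
    pass


class Importer(Base):
    pass


exporter = Exporter()
importer = Importer()
print(exporter.logger.name)
print(importer.logger.name)
```

Since logging.getLogger returns the same object for the same name, every instance of a given class shares one logger.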

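logger.<methodname>('spam, eggs, and %s', myvar) should be used instead of logger.<methodname>('spam, eggs, and %s' % myvar), as it defers the string formatting until a handler actually emits the record, and skips it entirely when the message is filtered out by level.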
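To make the difference visible, here is a sketch with a hypothetical object that counts how often it gets converted to a string. The logger name and the Expensive class are assumptions for the demo.

```python
import logging


class Expensive:
    """Counts how many times it is turned into a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive value"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.WARNING)  # DEBUG messages are dropped by this logger

myvar = Expensive()

# Arguments passed separately: the message is never formatted, because the
# DEBUG call is rejected before a record is even created.
logger.debug("spam, eggs, and %s", myvar)
lazy_calls = myvar.str_calls

# Formatted with %: the string is built first, then thrown away.
logger.debug("spam, eggs, and %s" % myvar)
eager_calls = myvar.str_calls - lazy_calls

print(lazy_calls, eager_calls)
```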

Make a module with your commonly used log format strings, so each developer doesn’t have to come up with their own, and you achieve some standardization.
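Such a module can be tiny. A sketch, assuming a hypothetical shared module (call it logformats.py) with two formats I made up:

```python
import logging

# Hypothetical shared module, e.g. logformats.py. The names and formats are
# examples only; pick whatever your team agrees on.
SIMPLE = "%(levelname)s %(name)s: %(message)s"
VERBOSE = "%(asctime)s %(levelname)-8s %(name)s [%(filename)s:%(lineno)d] %(message)s"


def make_formatter(fmt=SIMPLE):
    """Return a Formatter built from one of the shared format strings."""
    return logging.Formatter(fmt)


# Format a hand-built record to show what SIMPLE produces.
record = logging.LogRecord(
    "tools.export", logging.INFO, "export.py", 10, "saved %s", ("scene.ma",), None
)
formatted = make_formatter().format(record)
print(formatted)
```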

Almost never use printing. Use logging, and set your logger(s) up to log to stdout with a StreamHandler while you are debugging. Then you can leave your ‘prints’ in, which will make life easier when you need to go back in to find bugs.
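A sketch of what that looks like in practice, with a hypothetical logger name and a made-up function. The StringIO buffer stands in for stdout so the result can be inspected.

```python
import io
import logging
import sys

logger = logging.getLogger("mytool.debugging")  # hypothetical logger name


def enable_debug_output(stream=None):
    """Attach a StreamHandler so DEBUG messages show up while debugging."""
    handler = logging.StreamHandler(sys.stdout if stream is None else stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return handler


def disable_debug_output(handler):
    """Detach the handler again; the debug calls stay in the code."""
    logger.removeHandler(handler)
    logger.setLevel(logging.NOTSET)


def halve(x):
    logger.debug("halve(%r)", x)  # this is the 'print' that gets left in
    return x / 2


buffer = io.StringIO()  # stand-in for sys.stdout
handler = enable_debug_output(buffer)
halve(4)
disable_debug_output(handler)
halve(6)  # no output to the buffer: the handler is gone
captured = buffer.getvalue()
print(captured, end="")
```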

You almost never want to catch, log, and re-raise. Let the caller be responsible for logging and handling the error, at the level it can be handled properly. Imagine if at every level, every exception was logged and re-raised. Your log would be a mess!
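Roughly what I mean, as a sketch with invented function names. The lower levels just raise; only the top level, which can actually do something about the failure, logs it. The ListHandler is only there so the demo can count records.

```python
import logging

logger = logging.getLogger("reraise_demo")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collects records in a list so the demo can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def load_config(path):
    # Lowest level: raise a meaningful exception, no logging.
    raise FileNotFoundError(path)


def build_scene(path):
    # Middle level: no try/except at all, the exception simply propagates.
    return load_config(path)


def main(path):
    # Top level: the one place where the error is handled, so it is logged once.
    try:
        build_scene(path)
    except FileNotFoundError:
        logger.exception("Could not build scene from %s", path)
        return 1
    return 0


handler = ListHandler()
logger.addHandler(handler)
exit_code = main("missing.cfg")
print(exit_code, len(handler.records))
```

logger.exception attaches the traceback to the record, so nothing is lost by not logging further down.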

I consider the levels as follows- DEBUG only for developers, INFO for general internal usage, WARNING for deployment (I don’t know why you’d have your log level set higher than WARNING). Another way of thinking about them is, DEBUG has all information which only developers care about, INFO has little enough information that the stuff there is relevant and enough that problems can be diagnosed by a technical person, and WARNING will just tell you when something goes wrong. I wouldn’t make any more fine-grained levels than this, but it is up to you and your team to figure out where to use what. For example, do you log every server and client send/recv as DEBUG or INFO? It depends, of course.

The more library-like your code, the less you generally log. Your library should be clear, working, and throw meaningful exceptions, so generally your real library-libraries shouldn’t even need to log.

Logging is not a replacement for raising exceptions. Logging is not a way to deal with exceptions, either.

Remember these are guidelines only (and my guidelines). There are always exceptions to these rules (no pun intended).

I have a feeling those of you writing web/server apps are more familiar with logging best practices than those of us writing code in client apps. But these are all things I’ve seen in the real world so I thought them worth giving my two cents about them. What are your logging guidelines?

Branching strategy is not a remedy for instability

4 years, 5 branching strategies. First we worked all in one branch. Then we became hyper-branched. Then we consolidated into a couple branches. Switched companies. First we were all in one branch. Now we’re splitting into branches.

This has all been in Perforce since it is the de-facto SCM system for the games industry. But if we were using DVCS we’d probably have the same issues. The problem has not been merging changes. So DVCS is not the answer here (though I love DVCS).

I’ve been through this at two companies and have read about the experiences and strategies of other companies. I’ve found one constant across the differences in companies and strategies:

Branching strategy changes are in response to the instability that follows fast growth.

You cannot simply take a working model of how some project manages its branches, apply it to your studio, and be done with it. In fact, you cannot seek out or design an “ideal” branching strategy for your studio that is going to fix your instability problems. Why?

Branching is not designed to fix code instability.

Branching is a way to isolate changes and manage a release. It allows a much more flexible and intuitive use of version control by both developers and the studio, and allows sane release management. The DVCS branching model has proven itself and now we’re stuck trying to figure out how to get something similar in SCM systems like Perforce. But this is largely orthogonal to the problem of code instability.

You can keep unstable code in a branch, but it does nothing to fix the instability. You can require developers to run smoke tests, but they’re still going to integrate broken stuff, and they even get less ‘free QA’ while in their branch. We can put everyone on their own branch, or group teams on branches, or whatever strategy you want to come up with, and I don’t think any are guaranteed to work for your studio. Furthermore, studios change people and size, so what works one year may not work the next.

Yet we put so much effort into branching strategy as a way to solve these problems. We design a system for how the branches are laid out. We make some tools for creating and managing branches. We focus communication and training on how people are supposed to work. Yet branching is not and should not be the way we actually fix the problems that caused the instability that caused us to change our strategy.

How do I know this? Because with every change in strategy, there is a much less prominent component at work.

Infrastructure and automated testing are coincidentally improved when we change branching strategies.

I don’t think anyone doesn’t consider these two things important for improving code stability. It is just that I think they’re almost totally responsible. I think that if you were to trace the successes of people’s branching experiments, they’d be completely dependent upon when their automated testing and infrastructure (like continuous integration and better messaging) turned a corner and became robust. So the fact that Strategy D worked is because the improvements to testing and infrastructure made from A to B, B to C, and C to D, have accumulated to where you have far fewer instability problems.

So what’s my beef with branching, or more specifically, changing strategies?

I don’t have any. I think there are, definitely, better and worse ways to do things. My problem is when we focus on branching strategies as the most important part of the instability solution. My problem is that we document, educate, build in order to support branching. We talk about “how we are going to be working in branches,” rather than “how we are going to build testable systems and get legacy code under test.”  We put our resources behind developing tools and fixing the fallout of branching, instead of making a focused education and cleanup effort towards getting things into a more testable state (which often includes the testing infrastructure as much as it means the application code).

Imagine if every time you heard ‘branching’ it was replaced with ‘testing/infrastructure.’ My guess is you’ve never heard managers talking about testing and infrastructure that much. Unfortunately you are unlikely to, because branching is an easy problem to think about. It is a chess board. No real work, personalities, real-world spikes. Just figuring out how to best move around your pieces in a theoretical way.

When you’re creating infrastructure, it isn’t a chess board. It is a world of incremental changes, no glamour, making do with the bare minimum, all on mission-critical systems that have countless tentacles. It isn’t the world of a plumber, it is the world of a septic tank diver.

But the real reason you’re not likely to see branching effort replaced with testing and infrastructure effort is because to do so can require a huge cultural and educational shift at a studio. Good luck teaching dozens of really smart developers who have decades of experience on successful projects that their code isn’t sufficient anymore, that you want to use your new fangled techniques that have actually proven successful in the rest of the development world. Those conversations aren’t why people become managers.

But mark my words, if you have a studio where testing is a fact of life, where it is not just an ideal but a requirement, where your infrastructure and developer systems are well understood, documented, extensible, and reliable, you are going to see very little code instability, regardless of what your branching strategy looks like.

If you’re thinking about changing how you branch, consider instead what would happen if all of that effort were spent on turning your codebase into something testable, and your infrastructure and systems into something widely usable and reliable. If you want to achieve stability, you are going to have to do it anyway. The question is, do you do it as a side effect and keep taking the painful medicine of changing branching strategies to keep getting the side effect, or do you do the much more difficult thing in the short term and approach your instability problem head-on, by building testing and infrastructure and creating a culture around them?

Tabs vs. Spaces

A friend asked on G+ recently about tabs vs. spaces. A lot of people agreed with what I said so I thought I’d turn it into a proper post.

There’s a good summary here: http://www.jwz.org/doc/tabs-vs-spaces.html. This is also a link Jeff Atwood has in his post on the subject.

So why are spaces preferred over tabs? Tabs have the nice features of being more compact and of letting each person customize how the code displays in their IDE (I prefer shorter indents, some prefer larger). Spaces are more verbose in a lot of ways. But I’m not going to go over pros and cons with using them because, frankly, they’re not the reason.

Spaces are preferable to tabs because, like the Zen of Python says, explicit is better than implicit. Explicit in the sense that it is more compatible in more places.

PEP8 tells us to limit lines to 79 columns, because our code may be running on fixed-width terminal windows, and python is a scripting language, so people would be looking at the actual code on those terminal windows. As opposed to compiled code, where you’re generally not going to look at or edit the code on those terminal windows.

Speaking of terminals. There are a lot of times we’re editing code in unfamiliar places. That’s not just something like a terminal window. It is an unfamiliar text editor. It is an editor embedded into some program. It is a diff tool. It is any number of places we may need to write or debug code outside of our primary editor/IDE. Who knows what happens when you hit ‘tab’? How are things configured? Why bother with the ambiguity?

Well nothing is stopping you from requiring tabs for your studio, and breaking python’s PEP rules, and educating and configuring everyone’s editors to use tabs. However, the first time you need to go in and edit some code you find on the internet or download through pip or easy_install, you’re going to screw up and create a syntax error. Not only that but nearly every IDE can be easily configured to use spaces instead of tabs for both indenting and dedenting. And where you’re not sure of the default, or don’t want to configure it, you can just use spaces and backspace.
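The syntax error is easy to reproduce. A sketch for Python 3, with a made-up snippet where one line of a block is indented with a tab and the next with eight spaces. They may look identical in your editor, but the interpreter refuses to guess.

```python
# Line 2 is indented with a tab, line 3 with eight spaces.
MIXED = "if True:\n\tx = 1\n        y = 2\n"
SPACES_ONLY = "if True:\n    x = 1\n    y = 2\n"


def try_compile(source):
    """Return 'ok' if the source compiles, else the name of the error raised."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:  # TabError is a subclass of SyntaxError
        return type(exc).__name__
    return "ok"


mixed_result = try_compile(MIXED)
spaces_result = try_compile(SPACES_ONLY)
print(mixed_result, spaces_result)
```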

So for python, there’s no reason to use tabs. Just don’t do it. You’re using a language that is dependent upon whitespace for code structure. You need to take it seriously and remember you and your code are part of the larger python community. It isn’t about preference, it is about compatibility.

If you aren’t using a whitespace-dependent language, feel free to establish a standard and enforce it. Just never do it with python.

Whining, and Tech Art

A recent Facebook thread prompted some discussion about how many programmers (especially in games) have awful development environments. So many studios don’t know how to properly use source control, or understand its benefits. We work with proprietary or handicapped tools because we work with some frankensteined engine where standard tools are helpless. We don’t practice techniques like TDD because our labyrinthine legacy codebases make it almost impossible to hook up properly. We don’t educate ourselves on modern literature, tools, and techniques, because we’re stuck with these sorry conditions on projects that last for several years at a time.

And this is why I’m a Tech Artist, and not a Tools Programmer, or Game Programmer. Because I want to be a whiny bitch and be able to fix what I whine about.

I wouldn’t be able to put up with the conditions described above. I know myself and I wouldn’t. I wouldn’t put up with it, so I’d whine, and alienate myself or burn out, and the scope of the changes is so drastic (this is an endemic cultural problem at many places) that I wouldn’t get anything done. So I am a Tech Artist: I can operate outside of the fold, trying out new things, talking about what’s broken, changing things, and demonstrating why and how they’re better.

Of course, there is a time to buck up and get down to making a game. Complaining, advocating, and causing change is good, but using all your skills and brute force to ship the best thing you can is important too.

But if you’re not being disruptive, if you’re not using the flexibility of your role to learn all the new stuff you can, and then striving to show people a better way of doing things, you’re not living up to your potential as a Tech Artist.

Internal tools only require the critical path

I always try to remember how easy developing internal tools is. We have a captive audience. We can quickly deploy fixes. We are largely independent of rigid processes in place to support the customer base.

Our job, as Tech Artists and tools programmers, is easy. Well, easier, at least.

I think it comes down to this: Dev tools only require the critical path to work. We don’t have to worry about security. We don’t have to worry about cheating. We don’t have to worry about billing. We don’t have to worry about making optimizations that make our code more confusing. We don’t have to worry about the one thousand little things you should if you are developing an external product.

Remember this, and you will be a more effective Tech Artist and Tools Programmer, because you will spend your time where it matters. Forget this, and you will waste your time working on and worrying about things that provide no value, manufacturing theoretical problems.

Ensure the critical path works at all times. At all times. Ensure people are directed down the critical path through intuitive design and documentation. Restrict them from veering off the critical path where you must. Ensure if they veer off your tool’s critical path, it does not cause havoc and they can pick the path back up.

On the flip side, make sure you are being ambitious. Experiment. Have fun. Don’t be afraid to try new things. You are developing internal tools, you have that freedom. Try a new framework. Use a different issue tracker. Introduce a new coding or development style. Install a new tool. Your life is easy because you are working on non-critical software (that’s the truth of the matter, sorry folks, that’s why so many of us have shitty tools). Make up for that lack of prestige by having more fun at work.

Just keep the critical path in mind.

80/20 sometimes- Good Enough vs. Perfection

The 80/20 rule is generally a good one to understand. 80% of the effects come from 20% of the causes. So how does it apply to software, and to product development in general?

  • 80% test coverage. Higher coverage is notoriously difficult to achieve.
  • Fix the top 20% of your bugs as 80% of the problems come from them.
  • Make 80% of your code side-effect free.
  • 20% of your features are used 80% of the time, focus on those.

The list goes on. The 80/20 rule is a good one to keep in your head when you find yourself spending too much time on something.

But if you apply the 80/20 rule to everything, you will only end up with a product or code that is ‘good enough.’

I was looking at some trailers for a game I hadn’t looked at in a while, and the in-game cinematics were awful. Why? Because everything was good enough. The animation system was good enough. The combat was good enough. The tools were good enough. The character customization was good enough. It all amounted to a mediocre result (not the game, but the cinematics).

The 80/20 rule only works when you strive to hit 100%. If you aim for 80% test coverage, you’re going to be happy with 60. If you only care about 20% of your bugs, you’re going to become distant from your users. If you aim to keep 80% of your code side-effect free, you’re going to litter it with ‘one off’ side effects. If you only care about 20% of your features, you’ll probably only take each one to 80%.

It is obviously dangerous to demand 100% in everything, but 80/20 works as a way to balance priorities. 80/20 does not work as a goal. If you are phrasing your objectives in terms of 80/20, you’re never going to excel.

Everything can be a server/client!

We Tech Artists can get intimidated when talking about servers and clients. They remind us of a world of frameworks and protocols we’re not familiar with, run by hardcore server programmers who seem to have a very demanding job. Fortunately, that needn’t be the case, and understanding how to turn anything into a server/client can open limitless possibilities.

You can think of server/client as a way to get two processes to communicate with each other using sockets, one that is more flexible than other means of IPC such as COM or .NET marshalling. Your server can be local, or it can be remote, and very little usually has to change. Moreover, you can define much more flexible protocols/mechanisms, so you can communicate across literally any programming language or platform.

The practical reason everything can be a server/client is because we don’t have to understand much of how anything works under the hood. You follow some examples of how to set up a server and client using the framework of your choice (I’m a huge, huge fan of ZeroMQ which has bindings for pretty much everything including python and the CLR). Once you get comfortable, you just design your interface, and implement one on the server and on the client (the client just usually sends data over to the server and returns the response). Actually I really like how WCF recommends you build your server and client types, even though I am not a big fan of the framework. And I do the same for Python even though it’s not strictly necessary ;)
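A rough sketch of the "one interface, two implementations" idea in Python. Everything here is hypothetical: the AssetService interface, the JSON message shape, and the transport, which is just a callable that takes request bytes and returns reply bytes. In real use the transport would be a socket (ZeroMQ or otherwise); here it calls the server object directly so the example runs on its own.

```python
import json
from abc import ABC, abstractmethod


class AssetService(ABC):
    """Hypothetical interface shared by the server and the client."""

    @abstractmethod
    def get_asset_path(self, name):
        ...


class AssetServiceImpl(AssetService):
    """Server side: does the real work."""

    def __init__(self, paths):
        self._paths = paths

    def get_asset_path(self, name):
        return self._paths[name]


class AssetServiceClient(AssetService):
    """Client side: sends each call over the transport and returns the response."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> reply bytes

    def get_asset_path(self, name):
        request = json.dumps({"method": "get_asset_path", "args": [name]})
        return json.loads(self._transport(request.encode("utf-8")))


def dispatch(impl, raw_request):
    """Server-side glue: decode the request, call the implementation, encode the reply.

    Looking the method up by name is acceptable for a trusted internal tool only.
    """
    message = json.loads(raw_request)
    result = getattr(impl, message["method"])(*message["args"])
    return json.dumps(result).encode("utf-8")


server = AssetServiceImpl({"hero": "characters/hero.ma"})
client = AssetServiceClient(lambda raw: dispatch(server, raw))
path = client.get_asset_path("hero")
print(path)
```

Calling code only ever sees AssetService, so it does not care whether it holds the real implementation or the client proxy.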

So your server just needs to poll for messages in a loop, and the client sends requests to it, and the server sends back replies. So driving one app with another is as simple as creating a server on the slave and polling in a (usually non-blocking) loop, and having the client send commands to it. You can invert the relationship on a different port and now you have bi-directional communication (hello live-updating in your engine and DCC!).
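Here is a bare-bones sketch of that loop. To keep it runnable without installing anything I use the standard library's socket module rather than ZeroMQ, and the newline-terminated JSON messages and the "add"/"echo" commands are assumptions made up for the example. The server accepts with a short timeout, so a host application could do its own work between polls.

```python
import json
import socket
import threading


def handle(request):
    # Hypothetical command set; this is the interface both sides agree on.
    if request["command"] == "echo":
        return {"ok": True, "result": request["args"]}
    if request["command"] == "add":
        return {"ok": True, "result": sum(request["args"])}
    return {"ok": False, "error": "unknown command"}


def serve(listener, stop_event):
    listener.settimeout(0.1)  # short timeout: poll instead of blocking forever
    while not stop_event.is_set():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue  # nothing this tick; a host app would run its own update here
        with conn, conn.makefile("r") as reader:
            reply = handle(json.loads(reader.readline()))
            conn.sendall((json.dumps(reply) + "\n").encode("utf-8"))


def send_command(address, command, args):
    """Client: send one request and return the decoded reply."""
    message = json.dumps({"command": command, "args": args}) + "\n"
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(message.encode("utf-8"))
        with sock.makefile("r") as reader:
            return json.loads(reader.readline())


listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks a free port
address = listener.getsockname()
stop = threading.Event()
thread = threading.Thread(target=serve, args=(listener, stop), daemon=True)
thread.start()

reply = send_command(address, "add", [1, 2, 3])

stop.set()
thread.join()
listener.close()
print(reply)
```

Running a second listener in the other process on a different port gives you the bi-directional setup; nothing else about the pattern changes.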

The real power of this, I’ve found, is that I really have full control over how I want things to work. No more going through shitty COM or .NET interop, no more Windows messaging. I define the interface that declares what functionality I need, and can implement it in a sensible and native way (ie, not COM, etc.).

For example, we use this for:

  • Recreating a Maya scene in our editor, and interactively updating our editing scene by manipulating things in Maya, even though their scene graphs and everything else are nothing alike.
  • Running a headless version of our editor, so we can interact with libraries that only work inside the editor/engine, from any other exe (like regular python, Maya, or MotionBuilder).
  • Having a local server that caches and fetches asset management information, so data between tools is kept in sync for the entire machine and there are no discrepancies per-app.

If we had a need, we could easily extend this so any other programs could talk to each other. In fact this is generally how it’s done when apps talk to each other: I’m not presenting anything new, just trying to convince you it becomes really really easy.

If you’re anything like me, thinking about things in a server/client scenario can give you an entirely new perspective on how you develop tools and pipelines.

“Make it work”

I know managers who use ‘make it work’ as an implicit demand, knowing they’re asking you to do the impossible with inadequate resources and forcing you to deal with it- as if it isn’t their responsibility. I know developers who are all too eager to say they’ll ‘make it work’, as a way to justify delivering a mediocre result- it is a way to ignore the real problems they don’t have the will to deal with.

I know managers who refuse to ‘make it work’, holding back progress because it isn’t perfect, but being forced to release something far under expectations in the end. I know developers who refuse to ‘make it work’, and don’t realize how their selfish whining hurts the team.

Telling someone to ‘make it work’ is not an acceptable course of action. Find out what they need and reconcile what they can deliver with what resources are available. You should be deciding on something clearly achievable, and executing.

‘Making it work’ is not an aspiration. If you are the ‘make it work’ guy, you are, by definition, delivering consistently mediocre work, and shortchanging your teammates who need to deal with your shortcomings and are perceived as less productive.

A few blog site guidelines

Adding feeds to Planet Tech Art, it became clear that not everyone studies great bloggers like Scott Hanselman or Dave Winer. Here are some rules:

  • Your full name should appear somewhere on your main page. Prominently if you are advertising yourself, but at least somewhere in the footer or header. There were some blogs where even a first name was absolutely nowhere to be found. Unacceptable.
  • A link to your feed should be somewhere on your main page. In order of preference- top/top of sidebar, sidebar, bottom. I visited several blogs that did not have an RSS feed anywhere. Again, unacceptable.
  • Make your first few words count. Many more people will see an excerpt than read your entire post, so make sure the first couple sentences don’t say ‘Sorry I haven’t posted for a while’ or something similar.
  • Make sure your name is in your blog feed. Or something to identify you. For Planet, this is taken care of automatically, but subscribe to your own feed and make sure it is recognizable.
  • Speaking of subscribing to your own feed- subscribe to your own feed and make sure your posts are formatted in RSS properly. I’ve seen more than a few with missed code samples and other plugin-dependent data.

Happy blogging.
