Code metrics, requiring a culture of quality

by Rob Galanakis on 18/06/2011

Last time I went over how adhering to things like code quality metrics that are objective and ‘scientific’ is the key to creating and sustaining a strong codebase.  The difficulty comes with actually implementing that process and behavior wherever you work.  There is no shortage of obstacles:

1.  Convoluted process.  The unfortunate truth is that many of us work at a place with a convoluted submit/build/deploy process.  This is either so brittle that it is difficult to augment, or so complex it has a large dedicated build team.  Either is a problem because the hurdles to making process changes like setting up code analysis as a part of the submit/review/build process are very high.

2. Shame.  If more senior or lead developers are not willing to do this, it is unlikely it will be done.  This is compounded by the fact that the implicit blame of what the analysis reveals is on the shoulders of the more senior devs, so they may be less willing to do it.

3. Disagreement.  Fundamentally, there is a breed of developer that is opposed to Augmented Coding.  They tend to endorse (and exhibit) high proficiency with simple tools (text editors, the command line), and actively oppose more sophisticated GUI programs or tools.  It will be very difficult to get this type of programmer to change his or her view.

4. Scheduling.  No discussion of any change would be complete without talking about scheduling issues.  Someone has to do this work, train people, etc.  So like all things, this needs to be scheduled, or ninja’d in if possible (impossible if you have a bad process).
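To make the first obstacle a bit more concrete, here is a minimal sketch of what "code analysis as part of the build" could look like at its very simplest. It is only an illustration: the metric (function length), the 40-line budget, and the names long_functions and gate are all assumptions, and a real tool like the ones discussed in these posts measures far more.

```python
import ast

MAX_FUNCTION_LINES = 40  # hypothetical budget; pick whatever your team agrees on


def long_functions(source, limit=MAX_FUNCTION_LINES):
    """Return (name, line_count) for every function longer than limit lines."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > limit:
                offenders.append((node.name, length))
    return offenders


def gate(paths, limit=MAX_FUNCTION_LINES):
    """Return 0 if every file passes and 1 otherwise, like a process exit code."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for name, length in long_functions(handle.read(), limit):
                print("%s: %s is %d lines (limit %d)" % (path, name, length, limit))
                failed = True
    return 1 if failed else 0
```

A build step could pass the changed files to gate() and hand the return value to sys.exit(), so a submit fails when the budget is exceeded.  The hard part, as the list above says, is getting a step like this into the process at all.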

These problems combine in pretty frustrating ways.  And unfortunately I really don’t have a solution.  There’s no glorious ending to this blog post that will tell you how to overcome these problems.  Even if I’ve overcome all of these problems personally, these are cultural problems, and cultural problems are notoriously specific.  Ultimately I think it comes down to hoping that you can get key people- leads, build engineers- on board with the necessity of having code quality metrics as part of your pipeline.  That’s the most important thing you can do to make sure that this time, I’m going to do it right.

Code metrics, the only ‘right constant’

by Rob Galanakis on 17/06/2011

I wrote recently about the experience of running a code analysis tool on a codebase and hinted at the difficulties involved with refactoring the problems.  There are far smarter people than me who have given much more thought to the technical problems and strategies involved.  I want to explore, instead, the cultural and human problems involved.

I doubt there’s a developer who wrote the first line of code in a codebase without thinking, ‘this time I’m going to do it right.’  And I also doubt there are many developers who are working in a codebase who aren’t thinking, “If I get a chance to start from scratch, I’m going to do it right.”  So how is it possible that these two sentiments exist simultaneously?

The answer is another paradox- early development is done without enough rigor and is done with too strict adherence to early established principles.  That is, the rigor that is used is applied towards principles that fail in the long run.  Over several years, languages change, technologies become available or obsolete, developers grow and evolve, etc.- and the codebase becomes larger.

The way to ‘do it right,’ then, is to establish what is right as a constant and what is correct right now.  In all of software development, the only thing that I can think of that is ‘right as a constant’ is code quality metrics- things that are not subjective (the way code reviews are), and that are backed up with empirical evidence about their effects.  If code quality metrics are not part of your process, your codebase is likely to fail.  As a codebase grows, so does the likelihood that future development follows the paradigms already existing in the codebase.  The problem is, there is no certainty that these paradigms will yield good code.  In fact, chances are they will be directly at odds with more widely established and accepted principles and paradigms that have evolved or appeared after the codebase started.  This is the nature of the myopia and bubble that forms at any sizeable development house.

The only way to fight this is to apply the steady force of the ‘right as a constant’ factors to a codebase.  If you can do this, you’ll always be at a more agile place, so you can refactor more easily.  Anecdotal evidence would indicate that any other strategy is futile.

Have I missed any other possible ‘right as a constant’ things that can be implemented?

Next up: What implications does this have for culture?

Blog roll: CodeBetter.com

by Rob Galanakis on 16/06/2011

I am going to start making some blog posts about other blogs when I don’t have time for bigger posts.  The first blog up is www.codebetter.com, which covers a variety of code quality and .NET topics.  It is contributed to by a number of people, so there’s a pretty good flow of excellent topics and posts.  It has quickly become one of my favorite blogs to read, and though it focuses on .NET, the lessons are applicable to any language.

Here are some recent highlights:

LINQ Intersect 2.7 times faster with HashSet

db4o’s no primary keys

On partitioning .NET code

Back to basics: Usage of static members

Relearning python, part 8: Over the hump

by Rob Galanakis on 15/06/2011

I did it.  As I was finishing yesterday’s blog post, I finally got my project working, and exposed on the internet.  Now that things are finally figured out, I can document and test it.

I ended up writing a process that runs a socket/ZeroMQ based service, which is long-running and persistent.  I have my web UI, written with pyjamas, that uses jsonrpc through CGI.  The CGI service/handler (which runs on the server, obviously) opens a brief connection to the persistent service to run whatever method call it was asked to call, and return the result.  Until I deploy it on the actual tech-artists.org server, I have my router port-forward incoming connections to my machine.  So I’ve used my service from my Droid successfully ;)  I have no idea if this is a terrible design, but it serves my needs well enough.
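For a rough picture of that shape, here is a minimal sketch.  It is not the project’s code: it uses plain standard-library sockets in place of ZeroMQ, a newline-delimited JSON message loosely modeled on JSON-RPC, and hypothetical names (METHODS, start_service, forward_call).  The point is only the division of labor: one long-running process owns the state, and a short-lived handler connects, forwards one call, and goes away.

```python
import json
import socket
import socketserver
import threading

# Hypothetical methods the long-running service exposes.
METHODS = {"add": lambda a, b: a + b}


class RpcHandler(socketserver.StreamRequestHandler):
    """Handles one newline-terminated JSON request per connection."""

    def handle(self):
        request = json.loads(self.rfile.readline().decode("utf-8"))
        try:
            result = METHODS[request["method"]](*request["params"])
            reply = {"id": request.get("id"), "result": result, "error": None}
        except Exception as exc:  # report failures to the caller instead of dying
            reply = {"id": request.get("id"), "result": None, "error": str(exc)}
        self.wfile.write((json.dumps(reply) + "\n").encode("utf-8"))


def start_service(host="127.0.0.1", port=0):
    """Start the persistent service on a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), RpcHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def forward_call(address, method, params, request_id=1):
    """What a short-lived CGI handler would do: connect, ask, return, disconnect."""
    with socket.create_connection(address, timeout=5) as conn:
        payload = {"id": request_id, "method": method, "params": params}
        conn.sendall((json.dumps(payload) + "\n").encode("utf-8"))
        with conn.makefile("rb") as reader:
            return json.loads(reader.readline().decode("utf-8"))
```

With the service started via start_service(), a handler would call something like forward_call(server.server_address, "add", [2, 3]) and write the reply back to the browser.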

I’ve been really impressed with pyjamas.  I think I’ve gotten over the learning hurdle, and am starting to compose together a pretty nice UI.

Hopefully I can finish this project in the next couple weeks, and move on to other things as I just tighten things up and improve it.

Once I got over the hump, I went back to enjoying things again.  I could write code with confidence, and feel like I was learning and making progress, rather than just trying things arbitrarily.

That should wrap up the real work for this ‘relearning python’ series- I’m not sure that I’ll reach any more epiphanies, and I’m now pretty comfortable with the switch from C# to python.  I’ll make sure to wrap things up with a conclusion post or two, as promised.

Relearning python, part 7: Which way is up?

by Rob Galanakis on 14/06/2011

Weeks after beginning python, I have hit a forest on a plateau- the speed of learning and discovery has slowed, and I’m getting confused and discouraged.  I’m beginning to think I won’t finish my current project- I am sure deploying it on a server is going to be another exercise in frustration and hope I can even make it to that point before my life gets busy.  Here’s what I’m struggling with.

1. Changing my mind.  In the static-typing world, I was very much in the habit of declaring simple immutable data types.  This seems unnecessary and uncommon in python- instead, use dictionaries.  This makes sense- the onus of immutability is on the caller in python (in general), and the dynamic nature means having those simple data types doesn’t buy anything over a dictionary (especially if you use a bunch-like object, though you will probably want to write your own).  The problem is that this makes refactoring more difficult- this sort of linking is where static typing shines, so changing- and then removing- these data types is never ‘foolproof’, just quick and sometimes with minor breakages that unit testing hopefully finds.

2. Unittesting.  I am writing tests and actually enjoy it.  And I am getting better.  Mostly my problem is with running tests.  I am struggling along the spectrum between doing everything manually (setting up the .py file so the tests run when it is executed as __main__) and having it completely done through the IDE w/ nose (pycharm -> Python nosetests).  I actually found out a lot of my tests weren’t being run, somehow.  I’ve now switched to the manual setup, so I can have better control and learn more about what’s going on.  That’s a problem with convention-driven systems, I guess: the implications can get too confusing for noobs like me.

3. Web programming.  I didn’t approach this project correctly.  I bit off way more than I can chew.  New technologies are fine with new concepts, usually, but I’m using new technologies that use ‘old’ technologies, so the beginner-oriented tutorial material I need is not there.  So it’s been incredibly frustrating.  I now have my ZeroMQ-based service running, with a JSONRPC CGI service to field requests from my pyjamas-created web UI.  I feel I’ve acquired a dangerously unfocused lack of information about way too many things (socket programming, CGI, RPC, javascript, etc.), and the system is held together with bubblegum.  Maybe not, but I feel that way.
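On the first point above, the trade-off is easier to see side by side.  This is a small sketch with made-up names (Point, Bunch), not code from my project: a plain dictionary, the standard library’s namedtuple as the closest thing to a declared immutable type, and a hand-rolled bunch of the kind you would probably end up writing yourself.

```python
from collections import namedtuple

# Option 1: a plain dictionary.  No declaration needed, but a mistyped key
# only fails when that line actually runs.
point_dict = {"x": 1, "y": 2}

# Option 2: namedtuple, an immutable record from the standard library.
Point = namedtuple("Point", ["x", "y"])
point_tuple = Point(x=1, y=2)


# Option 3: a hand-rolled "bunch": a dict whose keys are also attributes.
class Bunch(dict):
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


point_bunch = Bunch(x=1, y=2)
point_bunch.z = 3  # bunches stay mutable, unlike the namedtuple
```

None of the three gives you the compile-time check that made renaming a field safe in C#; that job falls to the unit tests.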

I am struggling more from not having people to ask questions to than I am because of anything to do with python or the web.  Part of the problem is I’m so unfamiliar with certain things that I don’t even know what questions to ask.

I look forward to starting at CCP and having people to field more general python questions, and I also feel myself getting over the hump and getting the hang of the web stuff enough to actually make something.

Relearning python, day 6: Please serve me!

by Rob Galanakis on 10/06/2011

I wrote last time about all the difficulty I had with deciding on a UI framework.  Well, it turns out that was nothing compared to trying to get a server up and running.

I suppose I should say first that I have no experience doing web programming.  None.  I have some experience writing server/client apps, that were all WCF based.  I expected difficulties with python, but nothing like this.

My goal was clear- develop an RPC-like service interface that I can call from pyjamas.  So, first thing, I implement an XMLRPC server, based on SimpleXMLRPCServer.  Wow, that was easy!
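For reference, this is roughly how little it takes.  The sketch below is written for Python 3, where the old SimpleXMLRPCServer module lives at xmlrpc.server; the add function and the helper names start_server and call_add are hypothetical, and this is not the code from my project.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Hypothetical method exposed over XML-RPC."""
    return a + b


def start_server(host="127.0.0.1", port=0):
    """Serve on a background thread; port 0 asks the OS for a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(add)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_add(server, a, b):
    """Call the remote add() the way a client script would."""
    host, port = server.server_address
    with ServerProxy("http://%s:%d" % (host, port)) as proxy:
        return proxy.add(a, b)
```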

Wait.  pyjamas cannot use XMLRPC because it isn’t a supported module.  They suggest JSONRPC.  Alright, well, no biggie.  Right?

Well unfortunately, there’s nothing so simple as SimpleXMLRPCServer for jsonrpc.  Not even SimpleJSONRPC, because nothing I tried on the client seemed to work with it.  So I began evaluating a dozen modules, looking for what should have been very simple.

I looked at the pyjamas JSONRPC example.  But it used CGI, and I need a persistent service.  I am not a Unix programmer and do not know how to, and don’t care to, set up mod_python, which apparently allows you to run python in Apache or some shit that I don’t want to care about.

So I looked at anything with jsonrpc in the name.  There are well over 20 modules that hit ‘jsonrpc’ at pypi.python.org.  Including ones that seem to conflict (how many are used with ‘import jsonrpc’?  At least two, from what I saw).  Keep trying, nothing works.

Alright, let’s up the ante.  Let me try stackless- can you even make a server in that?  Well I couldn’t, and I can’t remember the problems.  Well let’s look at Twisted.  Wait, zope.interfaces needs to be installed.  OK, got that.  Alright, following directions, and… fuck, that example didn’t work.  Why?  Can’t remember.  Alright, let me look at PyZeroMQ.  Easy install doesn’t work on Windows.  Shocker.  I’ll use the MSI.  There we go.  Whoops, that example didn’t work.  Wait, I need to download and build ZeroMQ myself?  Are you fucking kidding me?  Here we go.  Wow, build problems.  So surprised.  Huh?  Wait, this example’s working.  Did I have to build/install?  I don’t know.  Sort of working now… ok, lots of good examples that seem to be mostly working.  I can do this.  I am scared to see what’ll happen when I move from a local to a remote server, though.  Well, let’s clean up some of this crap… wait, I am scared to think about the dependency tree here, and have no idea what is or is not safe to remove.

Ultimately I have no idea if ZeroMQ is what I want or will do just what I need, but it was the only thing that ended up working reasonably well (I only struggled with it for 1-2 hours before using it successfully!).  My guess is I will have several more hours trying to figure out how to make it work with pyjamas.  But at least this whirlwind of problems, so rapid and severe I can’t even recall all the issues, has subsided for now.

This situation is unforgivable.  It is representative of a huge ‘lie’ about python in particular and a problem with the OSS situation in general.  I’ll have more to say about it soon.

Code quality metrics are king

by Rob Galanakis on 9/06/2011

If you want to induce a bout of anger and depression, run a tool like NDepend on your codebase (or Resharper’s code analysis, or VS2010’s code metrics, or any other similar tool).  I would guess, if you’re a competent developer who knows what ‘good code’ looks like, you’ll find a few things:

  1. The areas that you knew were problems show the worst.
  2. The areas that you think are good code have good quality ratings.
  3. Your dependency graph is a mess.
  4. The number of problems overall is so large it is debilitating to think about and hopeless to try to fix.
  5. The number of problems in areas people don’t think are problems will put you into a rage.

This has been my experience, at least.  Code quality in most places is pretty crappy overall, and it isn’t uncommon to find the official practices are not what they should be- yet they are vigorously defended.  Code quality metrics provide an unbiased, unequivocal judgement on specific pieces of code, and the codebase in general.  Metrics can be gamed, but I’d much rather have a codebase that has good quality metrics than one that doesn’t.
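To show what ‘unbiased’ means in practice, here is a deliberately crude sketch of a metric.  It is an assumption-laden toy, not what NDepend or Visual Studio compute: it counts a handful of branching constructs per function (a rough stand-in for cyclomatic complexity), and the names BRANCH_NODES, complexity_by_function, and SAMPLE are made up for the example.  It is in python only because that is what I have been writing lately.

```python
import ast

# Node types counted as decision points in this simplified metric.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)


def complexity_by_function(source):
    """Return {function_name: 1 + number of branch nodes inside it}."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, BRANCH_NODES)
                           for child in ast.walk(node))
            # Same-named functions overwrite each other; fine for a toy.
            scores[node.name] = 1 + branches
    return scores


SAMPLE = '''
def simple(x):
    return x + 1

def tangled(x, y):
    if x and y:
        for i in range(x):
            if i % 2:
                y += i
    return y
'''
```

Calling complexity_by_function(SAMPLE) scores simple at 1 and tangled at 5.  The number doesn’t care who wrote the function or how vigorously the practice behind it is defended, which is the whole appeal.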

Running the analysis is the first part.  Fixing the problems is the last.  The real challenge is in between: the why and how of fixing the problems, and not just from a scheduling perspective.  The difficulties involved in creating a plan to improve code quality are less technical and more difficult, and I’ll go into some potential strategies in a future post.

Also, if you want a quick tutorial for running Visual Studio’s Code Metrics, take a look at Zain Naboulsi’s Visual Studio Tips and Tricks blog, ‘Is this thing on?’.  He recently finished running a few posts about using Visual Studio’s Code Metrics tools.  And if you want to do some serious codebase analysis, of course take a look at NDepend.

Relearning python, day 5: UI Hell

by Rob Galanakis on 8/06/2011

Last week, I started writing the client and UI portions of my data aggregation service.  I spoke about client/service protocol frustrations in my last post, but the most frustrating part so far has been figuring out the UI.  There are three parts of this:

1. Oh my god, so many frameworks!

Bindings for Tkinter, Wx, Qt, Gtk+, oh my!  I’m used to two choices- WinForms if I want old, shitty, and familiar, or WPF if I want new, powerful, and intimidating.  After research, my understanding is that Tk isn’t much used other than to leverage the fact that it is installed by default; Gtk+ is much less used; Qt is used in Autodesk Maya but I cannot determine whether it or Wx is a better fit for me as they both seem well regarded.

2. No intellisense really sucks!

I’ve said before that I’m fine with no intellisense in python.  And I was.  Until I started UI work.  The problem here is that UI frameworks are extremely large, stateful, and have tons of options.  Intellisense helps me see all those options at a glance and greatly contributes towards understanding the framework.  Even worse is the fact that so many UI signatures are *args and **kwargs, so even the limited intellisense available isn’t very useful- I need to use the API docs just to see what’s available.  That really sucks.

3. And I don’t want to use the fucking designer!

I have been a vocal critic of Visual Studio’s WinForms designer.  Basically, it creates incredibly bloated code and encourages very poor encapsulation, organization, and practices.  It can be extremely powerful, but for people who don’t understand GUI programming in the first place, it just results in godawful code that practically gives me an aneurysm thinking about it and makes me write run on sentences.  With python, I was hoping I wouldn’t have to use a fucking designer, because I honestly barely needed it in WinForms.  Except that going without one, combined with the lack of intellisense, makes designing and laying out a UI most difficult.  So you’re basically forced to use a visual designer to work, which feels like I’m shitting into my mouth, unless you have the fucking API memorized, just because the alternative is searching the docs for everything, which feels like getting fucked in the ass.
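To illustrate the intellisense complaint from point 2, here is a small sketch.  The two make_button functions are hypothetical stand-ins, not taken from any real UI framework; the only real API used is inspect.signature, which reports what a tool can learn from a function’s signature alone.

```python
import inspect


def make_button_opaque(*args, **kwargs):
    """Hypothetical widget constructor in the style complained about above."""
    label = kwargs.get("label", "")
    width = kwargs.get("width", 80)
    return {"label": label, "width": width}


def make_button_explicit(label="", width=80):
    """The same constructor with its options spelled out in the signature."""
    return {"label": label, "width": width}


def visible_options(func):
    """Names a completion tool can read straight from the signature."""
    return list(inspect.signature(func).parameters)
```

visible_options(make_button_opaque) gives ['args', 'kwargs'], while visible_options(make_button_explicit) gives ['label', 'width'].  Both functions behave identically; only the second tells you anything without a trip to the docs.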

 

Is this what is ultimately holding python back from greater visibility outside of the script kiddies and knowing experts?  That it takes a glutton for punishment, or a genius with a lot of free time, to build a UI effectively.  I wonder, if building UIs in python wasn’t so shitty, would those arrogant static typing pussies think more of it?

Anyway, there’s a solution: PYJAMAS!  Pyjamas is a framework that allows you to build your UI in python, and compile it to javascript and html.  This is AWESOME!  It means you have a cross-browser, cross-platform (incl. mobile!) UI you can write in python, only having to understand python- the rest of the API is as simple or simpler than the desktop UI framework alternatives.  Html/JavaScript UIs are where things are headed, and technologies like pyjamas are going to cause a paradigm shift.  In hindsight, having chosen it, every other choice seems profoundly stupid.

Unfortunately, installing pyjamas wasn’t seamless, especially for a Windows programmer.  And developing in it is a little difficult, because the iteration loop isn’t quite as fast as normal GUI programming.  And the nature of the python-compiled-to-JS concept means only certain python modules are supported; which I see as a good thing in many ways, because it forces you to split all business logic out of the client.  Good but takes a little while to get used to.  It is pretty well documented, because it includes pretty excellent examples- much better and more coherent than any other UI framework, though that’s probably because they are one of the only sources of documentation.  That said, the devs are obviously top-notch, and committed to using pyjamas to run the site (even though they moved the mailing list onto a pyjamas-run frontend that I find really lacking right now).  And it is frustrating not having all the python libraries available- you can really only use what’s written in pure python (hopefully), or modules the pyjamas team provides (which seem to be good enough in most cases).

So, look at pyjamas if you haven’t, and consider it if you’re writing any UIs.  And if you are still writing rich client apps (very common in the games industry) and are afraid to move into client/server land, I don’t blame you- but you’re going to have to learn it sooner or later.

Relearning python, day 4: The ecosystem

by Rob Galanakis on 6/06/2011

First, sorry about not blogging- all of the last 3.5 days were spent working on my wedding invitations, and making 100 jars of peach jam.  I’ve just gotten home and am very thankful to have inlaws-to-be with a very large kitchen that I was able to use to make so much goddamn jam, and I don’t want to look at a peach for a few days.

I’ve mostly finished my service architecture and have moved onto developing the client and UI.  As this has moved from the realm of ‘normal’ programming to more ‘framework driven’ programming, it has exposed me to a completely new part of python and one I wasn’t too familiar with- and one I certainly didn’t fully understand the repercussions of- the python/open source ecosystem.

I’ve worked in two other ecosystems- the 3ds Max/Autodesk system, which is to say it is all build-it-yourself because the quality of ‘modules’ you find in the wild blows really hard.  And the Microsoft/.NET ecosystem, in which most things are provided by MSFT, with a smattering of open-source projects (NLog, NUnit, NHibernate, I think that’s all I’ve really used other than codeproject code we ‘adopted’).  There just weren’t many options available on what you used for X- you used what Microsoft provided, augmenting it if needed.

Python is not at all like that.  I need to turn my service project into an actual service that can be accessed by a client.  If this were .NET, I’d use (and have used) WCF.  For an internal application (no outside compatibility issues), there’s no decision chart- just use WCF with configurations optimized for .NET.  With python, I have a dozen modules to choose from- a full framework like Twisted, simpler systems like CherryPy, xmlrpc servers, jsonrpc servers, etc., and each one is a package with unique considerations.  I need to also build a UI.  If this were .NET, I’d choose either WPF for a desktop app or ASP.NET MVC for an internet app.  With python, I have to choose from Tk, Wx, Qt, and more.  I don’t even know what to do for web UIs.  I eventually decided on pyjamas, a framework that compiles your python into javascript/html.

I still haven’t gotten everything working.  Which is not surprising, because of two unfamiliar attributes of the python/OSS community- 1) They are *unix based.  I have never used a Unix system, and have no desire to yet.  So things like pyjamas, which is still alpha/beta and developed on Linux, have some hiccups getting set up on a Windows machine.  It also means lots of the documentation on modules reads as a somewhat foreign language, though I’m sure I’ll learn how to translate Linux->Windows more and more (I figured out I couldn’t install pyjamas to a path with a space in it!).

And 2) the projects are run by developers.  So what I find is incredibly innovative and generally well written software with little external documentation, and more commonly, documentation written for people just like them.  That is to say, people with much more familiarity with python, Linux, and dependent/related frameworks and systems.  So there is often a lot of trial and error and head banging trying to get things working.  Compare this to Microsoft, where any released feature is fully documented including a full battery of tutorials and supporting information written, usually, in the simplest terms possible.

For example, when I wrote my first WCF project, it took maybe 25 minutes to get a server/client running.  It took several more days to figure out the configuration issues and server/client issues like how things are sent over the pipe, but it was all in all pretty simple.  Doing the same in python, it took me several hours just to decide what framework to use! And because the documentation and examples are sparser, and more foreign, evaluating the frameworks was more difficult.  I will probably pick a jsonrpc system for easy use with pyjamas.

Which all seems to gel pretty well with my python hypothesis so far about the actually steep ramp-up and advanced-level requirement to use it effectively.

Next time, I’ll go over my UI frustrations, and why I chose pyjamas.

Relearning python, day 3

by Rob Galanakis on 31/05/2011

The last few days were spent re-organizing my code and writing unit tests.  Here’s what I learned:

I was still spending too much time thinking about namespaces, privateness, interfaces, and organization.  Once I got rid of some ‘abstract’ classes that served no purpose (they’re not meant to be subclassed outside the library- so why bother?), stopped trying to hide things behind so many layers of indirection, and organized my code differently (didn’t have all my modules start with an underscore because I was so concerned about what I’d expose), something magical happened- the API was simplified.  It’s funny- creating simple APIs in C# often involves a ton of substructure that is all internal/protected/private- in fact, I think I excelled at making great public APIs, but they always involved a lot of private implementation (and that’s not just me- look at something like Enum.TryParse).  Creating a simple public API in python has the opposite effect- it seems to streamline the code and make it more explicit (as you’d expect).

Unittesting is awesome.  I still have a ways to go to learn how to write tests well, but that’ll come in time.  Unit testing gives me the benefits of static typing (ensuring what needs to be called is callable), and more (it tests actual functionality).  Doctesting is wonderful as well- I tend to test library/utility functions with doctest and unittest for everything more complex or things that have side effects.  I enjoy documenting in python far more than C#- I’m not sure exactly why, but having a much simpler usage and the ability to test/demo via documentation is great.  I enjoy the more flexible arbitrary string style of docstrings, rather than the heavier xml-style documentation in C#.  Almost everything I’ve written is tested, which is great (because it also means I am writing code that is actually testable and modular).
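Here is a small sketch of that split, with hypothetical names (clamp, Counter, run_doctests) rather than anything from my project: a pure utility function checked by the examples in its own docstring, and a stateful class checked with unittest.

```python
import doctest
import unittest


def clamp(value, low, high):
    """Limit value to the range [low, high].

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


class Counter(object):
    """Hypothetical stateful class; side effects are easier to check in unittest."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount


class CounterTests(unittest.TestCase):
    def test_add_accumulates(self):
        counter = Counter()
        counter.add(2)
        counter.add(3)
        self.assertEqual(counter.total, 5)


def run_doctests(func):
    """Run the examples in func's docstring; return the doctest results."""
    runner = doctest.DocTestRunner(verbose=False)
    for test in doctest.DocTestFinder().find(func, globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    print(run_doctests(clamp))
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CounterTests)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

In a normal module, doctest.testmod() and unittest.main() do the same job with less ceremony; the explicit version is used here so that the number of examples and tests actually run is visible.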

I can’t decide whether to use camelCasing or lower_with_underscore.  I prefer C#’s casing style, objectively (I liked it more even though I came from using camel-cased)- and MS is much more decisive on style, which means even if I didn’t agree, I’d adapt.

I want to start looking into multithreading and will miss .NET’s Task Parallel Library and ThreadPool.  I’ll start looking into stackless.

The biggest thing I realized is that python is a big paradox.  It is supposedly easy to learn and relatively simple; a great language for beginners.  But so much of good python programming is understanding convention, in that the language doesn’t force a certain way of doing things.  This makes python awful for beginners, and it has been my biggest criticism since I have been coding C#.  There is an awful amount of awful code written by people who have not gotten over the education hump.  C# and .NET force the programmer to do a large number of things.  They are good for teaching basics because you must follow these basics.  There is a lot more overhead to writing .NET than there is to python, and that overhead implies some more rigorous study.  There’s no convention- just static typing, and everything is an intellisense dropdown away.  Python requires more thought but less work, which means I feel incredibly liberated working in it.

 
