Posts Tagged ‘.NET’

Why GUI’s Lock Up

This is a post for the Tech Artists and new programmers out there. I am going to answer the common question “why do my GUI’s lock up” and, to a lesser extent, “what can I do about it.”

I’m going to tell a story of a mouse click event. It is born when a user clicks her mouse button. This mouse click goes from the hardware, to the OS. The OS determines that the user wants to interact with your GUI, so it sends the mouse click event to your GUI’s process. Here, the mouse click sits in a queue of other similar events, just hanging out. At some point, usually almost immediately, your process (actually the main thread in the process) goes through and, one by one, dispatches the items (messages) in the queue. It has determined our mouse click is over a button and then tells the button it’s been clicked. The button then usually repaints itself so it looks pressed, and then invokes any callbacks that are hooked up. The process (main thread) goes into one such callback you hooked up, that will look at 1000 files on disk. This takes a while. In the mean time, the user is clicking, but the messages are just piling up in the queue. And then someone drags a window over your GUI, because they’re tired of their clicks not doing anything and want to see what’s new on G+. The OS sends a message to your UI that it needs to repaint itself, but that message, too, just sits in the queue. At some point, your OS may even realize your window is not responding, and fade it out and change the title bar. Finally your on-button-click callback finishes, the process (thread) is done processing our initial mouse click, and then goes back to processing the messages that may have accumulated in the queue, and your UI will refresh and start responding again.

All this happens because the thread that processes messages to draw the UI was also responsible for looking at 1000 files on disk, so it wasn’t around to respond to the paint and click messages. A few pieces of info:

  1. You can’t just ‘update the UI’ from the middle of your code. In addition to being terrible form code-wise, clearing the message queue would just cause other things to block the main thread, and it’d all get into one giant asynchronous mess. Some programs may have their own UI framework that supports this. Don’t trust it. You really just need the main/GUI thread clear as much as possible to respond to events.
  2. Your GUI process has a single ‘main thread.’ A thread roughly corresponds to, and I’m being not nuanced here, the software concept of a hardware CPU core. Your GUI objects can only be created and manipulated by the main thread.

This means, you want to keep your main thread free so it can act on GUI stuff (paint events, mouse clicks) only. The processing, such as your callback that looks at 1000 files, should happen on another thread (a background thread). When the processing is complete, it can tell the GUI thread that it is finished, and the GUI thread can update the UI. Your background thread can also fire events or invoke a callback that will be picked up by the GUI thread, so the GUI can update a progress bar or whatever.

How you actually do this varies with each UI framework. .NET, including WinForms and WPF, is quite easy to use (look at the BackgroundWorker class, but the Tasks Parallel Library and Async CTP make that less necessary). Python GUI frameworks are a bit worse off- multithreading in python in general is worse off- so it’ll be different for each one, and probably not as simple as .NET. There’s no excuse for python GUI’s to lock up, it just takes a little more effort to get it completely right (like callbacks to update a UI are a bit tricky).

There is one other vital thing to keep in mind- DCC programs generally require you to interact with the API or run all their script on the main thread, which as discussed should also be kept clear. Bummer! So the best thing we can do is block while we get our data from the scene, put the processing on a background thread, and report back to the main thread when done, applying the new data back to the scene if necessary. Unfortunately, if your processing interacts with the API in any way, you probably need to put it in the main thread as well. So, right now, your GUI’s in DCC apps may need to lock up, by design. There are, in theory, ways to avoid this, but they’re well outside of the scope of what you can handle if you’re learning anything from this article.

Whatever your language and program, those are the essentials of why your GUI locks up.

Note: This info is not nuanced (and is less accurate the lower down things go), may not be terminologically perfect (though it should be vulgarly comprehensible), and is Windows-only, though it should be enough to know how any higher-level GUI framework (such as Qt) would work on a non-Windows system).

The doubly-mutable antipattern

I have seen this in both python and C#:

class Spam:
    def __init__(self):
        self.eggs = list()
    ...more code here...
    def clear():
        self.eggs = list()

What do I consider wrong here? The variable ‘eggs’ is what I call ‘doubly mutable’. It is a mutable attribute of a mutable type. I hate this because I don’t know a) what to expect and b) what the usage pattern for the variable is. This confusion applies to both people using your class (if they take references to the value of the attribute, either knowingly or unknowingly), as well as coders who are maintaining your class who will have to look every time to know which pattern you follow (clearing the list, or reassigning to a new list).

IMO you should have one of the following. This first example is preferred but as it involves an immutable collection, isn’t always practical as a replacement for this antipattern (which usually involves mutating a collection).

class Spam:
    def __init__(self):
        self.eggs = tuple() #Now immutable, so it must be reassigned to change
    ...more code here...
    def clear():
        self.eggs = tuple()

in which case, you know it is a mutable attribute of an immutable type. So you can’t change the value of the instance, you must reassign it.

Or you can do:

class Spam:
    def __init__(self):
        self.eggs = list()
    ...more code here...
    def clear():
        del self.eggs[:] #Empty the list without reassigning the attribute.

in which case, you understand it is to be considered an immutable attribute of a mutable type. So you would never re-set the attribute, you always mutate it. If you can’t use the tuple alternative, you should always use this alternative- there’s never a reason for the original example.

In C#, we could avoid this problem entirely by using the ‘readonly’ field keyword for any mutable collection fields. In python, since all attributes are mutable, we must do things by convention (which is fine by me).

I always fix or point out this pattern when I find it and I consider it an antipattern. Does it bother you as much, or at all?

Large initializers/ctors?

With closures (and to some extent with runtime attribute assignments), I find the signatures of my UI types shrink and shrink. A lot of times we have code like this (python, but the same would apply to C#):

class FooControl(Control):
  def __init__(self, value):
    super(FooControl).__init__()
    self.value = value
    self._InitButtons()    

  def _InitButtons(self):
    self.button = Button('Press Me!', parent=self)
    btn.clicked.addListener(self._OnButtonClick)

  def _OnButtonClick(self):
    print id(self.button), self.value

However we can easily rewrite this like so:

class FooControl(Control):
  def __init__(self, value):
    super(FooControl).__init__()
    btn = Button('Press Me!', parent=self)
    def onClick():
      print value
    btn.clicked.addListener(onClick)

Now this is a trivial example. But I find that many types, UI types in particular, can have most or all of these callback methods (like self._OnButtonClick) removed by turning them into inner functions. And then as you turn them into inner functions in init, you can get rid of stored state (self.value and self.button).

But as we take this to the extreme, we end up with very simple classes (and in fact I could replace FooControl with a function, it doesn’t need to be a class at all), but very long init methods (imagine doing all your sub-control creation, layout, AND all callback functionality, inside of one method!).

I’ve decided I’d rather have a long init method, usually broken up into several inner functions, rather than a larger signature on the class with layout, callbacks, and stored state. In my mind, it is easier to pull something out into a type attribute, rather than remove it, as anything on the type is liable to be used externally. And breaking up your layout into instance methods that can really only be called once (_InitButtons), from the init, adds a cognitive burden for me.

So I can justify this decision to eliminate extra attributes rationally, but what seals the deal is, I’m not unit testing any of this code anyway. So whether it is in one long method, or broken up into several methods, it isn’t getting tested.

I started out as very much in the ‘break into small methods’ camp but have wholesale moved into the ‘one giant __init__ with inner functions’ camp. I’m curious what you all prefer and why?

There’s idiomatic, and there’s just being respectful

I work in mixed language environments. Python, C#, C++, and more, can all make their rounds. It isn’t uncommon to have someone focused on C++ have to write something in another language, and it isn’t uncommon that I come across their code some point in the future.

It is easy to learn a language’s syntax but difficult to learn its idioms. Good luck trying to explain what ‘pythonic’ means to someone who is new to python or programming! So I forgive the transgressor when I see non-idiomatic code.

Usually.

There are some errors I find unforgivable. Errors that indicate a complete lack of understanding of the platform you are writing on. Errors like this (C#):

var foo = new Foo()
if (foo != null) {...}

Creating an instance is probably the most basic operation you can perform in an OO language, and the author clearly did not understand it.

Another unforgivable type of error is when someone tries to fix a bug but does not bother to understand what’s actually going on.

class Foo {
private bool _somevar;

...
}

There was some bug in the code somewhere, I can’t remember what. A developer changed ‘private bool _somevar’ to ‘private bool _somevar = False’ and declared the bug fixed (spoiler: it wasn’t).

Probably the best example comes from memory management, as the least understood things in programming tend to:

try { someUIControl.SetText(someGiantString); }
except OutOfMemoryException {
someUIControl.Clear();
GC.Collect()
someUIControl.SetText(someGiantString);
}

The only thing this did is change the stack trace. The problem was due to a .NET garbage collection implementation detail- the Large Object Heap and huge strings- and the ‘fixer’ just tried something every authority tells you not to do, which is catch an OOME.

If you’re going to leave your domain to write code in another language- I applaud you. It can show an endeavouring personality! But please have some respect for the language you are writing in- read a book, read a blog, ask for help. It’ll make you a better programmer, I promise.

Thank you, Rico Mariani, for reminding me how bad I was

A little while ago I read two great articles by Rico Mariani, a MS employee who usually blogs about performance in .NET (though python being an OO language the same advice applies there). The articles in question were these:

Performance Guidelines for Properties

Performance and Design Guidelines for Data Access Layers

I’d suggest at least skimming over them. He talks about, for property accessors, not allocating memory, locking, doing IO, having side effects, and being fast. For the DAL article, you should really read it, but the part that was especially relevant is “Whatever you do don’t create an API where each field read/write is remoted to get the value.”

It was a shocking reminder of my early days programming. Every point mentioned in those two articles, I was hands down guilty of. I don’t mean, I’ve done that sort of thing occasionally. I mean, I designed entire systems around everything you shouldn’t do with regards to properties and DAL design. To be fair, this was years ago, I was new to programming, in way over my head, and didn’t have people to turn to (no one at the studio could have told me what an ORM was or given me these suggestions about properties), so I don’t feel much guilt. And I learned better relatively quickly, well before reading those posts.

I work with a lot of new programmers, and experienced programmers who aren’t focused on higher level languages. The articles, most of all, reminded me how far I’ve come and how lucky I am. The new programmers haven’t had a chance to make the epic mistakes I have. The experienced programmers trained in a world without such useful managed languages, high quality bloggers, and sites like Stack Overflow; a world I’ve never known and I’ve benefited by learning best practices and new skills, and finding and breaking bad habits.

I remember at the time thinking how great some of their features were, the same features that, as Rico points out, are really terrible ideas. I felt fortunate that I already followed his guidelines, and even more fortunate that few people were around to witness the hideous abuses of them!

Don’t use global state to manage a local problem

Just put this up on altdevblogaday: http://altdevblogaday.com/2011/09/25/dont-use-global-state-to-manage-a-local-problem/


I’ve ripped off this title from a common trend on Raymond Chen of MSFT’s blog.  Here are a bunch of posts about it.

I can scream it to the heavens but it doesn’t mean people understand.  Globals are bad.  Well, no shit Sherlock.  I don’t need to write another blog post to say that.  What I want to talk about is, what is a global.

It’s very easy to see this code and face-palm:

global spam = list()
global eggs = dict()
global lastIndex = -1

But I’m going to talk about much more sinister types of globals, ones that mingle with the rest of your code possibly unnoticed. Globals living amongst us. No longer! Read on to find out how to spot these nefarious criminals of the software industry.

Environment Variables

There are two classes of environment variable mutation: acceptable and condemning.  There is no ‘slightly wrong’, there’s only ‘meh, I guess that’s OK’, and ‘you are a terrible human being for doing this.’

  1. Acceptable use would be at the application level, where environment variables can be get or set with care, as something needs to configure global environment.  Acceptable would also be setting persistent environment variables in cases where that is very clearly the intent and it is documented.  Don’t go setting environment variables willy-nilly, most especially persistent ones!
  2. Condemning would be the access of custom environment variables at the library level.  Never, ever access environment variables within a module of library code (except, perhaps, to provide defaults).  Always allow those values to be passed in.  Accessing system environment variables in a library is, sometimes, an Acceptable Use.  No library code should set an environment variable, ever.

Commandline Args

See everything about Environment Variables and multiply by 2.  Then apply the following:
  1. Commandline use is only acceptable at the entry point of an application.  Nothing anywhere else should access the commandline args (except, perhaps to provide defaults).
  2. Nothing should ever mutate the commandline arguments.  Ever!

Singletons

I get slightly (or more than slightly) offended when people call the Singleton a ‘pattern.’  Patterns are generally useful for discussing and analyzing code, and have a positive connotation.  Singletons are awful and should be avoided at all costs.  They’re just a global by another name- if you wouldn’t use a global, don’t use a singleton!  Singletons should only exist:
  1. at the application level (as a global), and only when absolutely necessary, such as an expensive-to-create object that does not have state.  Or:
  2. in extremely performance-critical areas where there is absolutely no other way.  Oh, there’s also:
  3. where you want to write code that is unrefactorable and untestable.
So, if you decide you do need to use a global, remember, treat it as if it weren’t a global and pass it around instead (ie, through dependency injection).  But don’t forget: singletons are globals too!

Module-level/static state

Module-level to you pythonistas, static to your C++/.NET’ers.  It’s true- if you’re modifying state on a static class or module, you’re using globals.  The only place this ever belongs is generally for caching (and even then, I’d urge you to reconsider).  If you’re modifying a module’s state- and then you’re acknowledging what you’re doing by, like, having to call ‘reload’ to ‘fix’ the state, you’re committing a sin against your fellow man.  Remember, this includes stuff like ‘monkeypatching’ class or module-level methods in python.

The Golden Rule

The golden rule that I’ve come up with with globals is, if I can’t predict the implications of modifying state, find a way not to modify state.  If something else you don’t definitely know about is potentially relying on a certain state or value, don’t change it.  Even better, get rid of the situation.  This means, you keep all globals and anything that could be considered a global (access to env vars, singletons, static state, commandline args) out of your libraries, entirely.  The only place you want globals is at the highest level application logic.  This is the only way you can design something where you know all the implications of the globals, and rigorously sticking to this design will improve the portability of your code greatly.

Agree?  Disagree?  Did I miss any pseudonymous globals that you’ve had to wrangle?

The skinny on virtual static methods

Today at work, I saw the following pattern in python:

class Base(object):
    @classmethod
    def Spam(cls):
        raise NotImplementedError()
class Derived(Base):
    @classmethod
    def Spam(cls):
        return 'Eggs'

After thinking it over and looking at the alternatives, I didn’t have an objection. I asked for some verbose documentation explaining the decision and contract, as this is really not good OOP and, some (me) would say, polymorphism through convention- which in dynamically typed languages like python, is how polymorphism works. But it is unfamiliar and generally considered bad practice, especially in statically typed languages.

Of course this sparked some disagreement, but fortunately, this is a topic I’ve read deeply into a number of times- first, when I was thinking about how to implement the pattern and realized the error of my ways.  Second, when I was working in another codebase that was a case study in antipatterns where this was used quite heavily.  I read into it yet again today so I could write this blog post and send it to the persons who disagreed with my judgement of the pattern in statically typed compiled languages.  So read on.

Let me reiterate- I’m not opposed to this pattern per-se in python or other dynamically typed languages.  I think it is a sure sign of code smell, but there are legitimate reasons, especially if you’re working on, say, a Maya plugin and have some restricted design choices, and this can smell better than the alternative, which is a bunch of other shitty code to deal with it.  In fact, it @abc.abstractstaticmethod was added to the abstract base class module in python 3.2, with Guido’s consent.  Which doesn’t mean it is right, and I’d still avoid it, but it’s not terrible.

My problem is with the pattern in statically typed languages.  When you make a method abstract or virtual on an interface of class, you are saying, ‘you can pass around instances of me or derived from me and they will have a method named Foo.’  Instance methods are bound to the instance.  When you define a method as static, you are binding it to the class.  So there are two reasons this doesn’t work- one technical/semantic, the other conceptual.

The technical/semantic cause is that virtual and static are opposite!  Virtual tells the compiler, ‘look up what method to call from the vtable of the object at runtime.’  Static tells it, ‘use this method at this address.’  Virtual can only be known at runtime, while static must be known at compile time.  I mean, you need a class to invoke the static method on!  The only way around this is through either dynamic typing, or some sort of reflection/introspection/codegen- in which case, you aren’t working with static typing, you’re emulating a dynamically typed environment in a statically typed one!

So, clearly the concept of ‘virtual static’ is impossible in static languages like C#, C++, and Java.  It doesn’t stop people from trying!  However, does the fact that the languages don’t support the feature make it a bad idea (the same way that just because python 3 supports it doesn’t necessarily make it a good idea)?

Yes, yes, a thousand times yes.  Let me restate the above: in order for ‘static virtual’ to be supported in a statically typed language, you need to use its dynamic typing faculties.  Look, I’m all for ‘multi-paradigm’ languages (even though some artard on Reddit berated an article I wrote because that phrase doesn’t actually make sense), but we need to be very careful when we start using patterns that go against the foundation of our languages.  Like I said- I’m not fundamentally opposed to the pattern, I am just opposed to using it in statically typed languages.

But that’s like saying, I prefer tomato juice to evolution.  I don’t even know what that means.  You cannot have virtual static methods in a statically typed language.  They are incompatible.

So much for the technical (vtable)/semantic (dynamic) reason (were those 2 reasons, or one reason?).  What about the conceptual one?

Well like I said earlier, virtual or abstract methods are part of a contract that says, I or a subclass will provide an implementation of this method.  Classes and interfaces define contracts, and at their core, nothing else (especially not how they’re laid out in memory!).  So if you’re passing around a type, and you’re saying, this type will have these methods- well, what does that look like?  I can hardly fathom what the class declaration would look like:

class Base {
    public virtual static Base Spam() { return Base(); }
    public virtual string Ham() { return "Base Ham"; }
}
class Derived : staticimplements Base {
    public override static Base Spam() { return Derived(); }
}

Well, shit. We’re saying here that Derived’s type implements the contract of Base’s type- well, Base has a ‘regular’ instance contract, the Ham method.  What happens to this on Derived?  Is Ham part of Base’s contract?  It must be, because otherwise I have no idea what the Spam() method is going to return for Derived.  Alright, so if you ‘staticimplements’ something, you get get all static and instance methods as part of your contract (and this is how python works, too).

So how do we use this?

void Frob(Base obj) { ... }

Wait. Shit. This says we’re passing in an instance of Base, whereas we want to pass in the Type of an object that staticimplements Base. So:

void Frob(BaseType obj) { ... }

So now let’s jump back to our class definitions:

class BaseType : Type { public virtual static Base Spam() { return Base(); }
class Base : staticimplements BaseType { public virtual string Ham() { return "Base Ham"; } }
class Derived : staticimplements Base { public override static Base Spam() { return Derived(); } }

Now we’re getting somewhere. We can define types that inherit from the Type object (that is some class like .NET’s Type class), and we can staticimplements those (and if you staticimplements, that implies you also get all instance methods).

Well shit, wait. If Base inherits from Type, then instances of Base will also get all instance methods from the Type object? Well ok, I can deal with that, we don’t have to use Type- what if Type inherits from RootType, and BaseType inherits from RootType, and RootType is just an empty definition so instances of objects that inherit from BaseType don’t have all of Type’s instance methods?

void Frob(BaseType baseType) {
    Base obj = baseType.Spam();
    //Well, how do we get an instance of BaseType from an instance of Base?  We can't.
    //RootType rt = obj.GetType(); //What good is RootType here?
    //BaseType bt = obj.GetBaseType(); //Wait, so we put a method on the instance that needs 
                                //to be overridden on the subclass to return an instance of the actual type?
}

I’m not going to go any further because I doubt anyone has even read this far. The question of virtual static functions in statically typed languages is pointless- much easier, then, to just throw up your hands and hack together whatever using reflection, dynamic, templating, or any other form of introspection. You can, for sure, come up with workarounds that are often quite specific and span hundreds of lines. I’ve read and gagged at a hundred of them.  But given that it is currently not just technically impossible, but conceptually brain-melting, why would you?

The problem at its core is, I think, that people learn a golden hammer in one language (in this case, python, Delphi, etc.) and try to apply it to another (C#, Java, C++), as well as people coming up with a design and then figuring out how to shoehorn the idea into the language.  Well guess what- not every language can execute any arbitrary patterns or designs well.  Learning python made this patently clear.  The language is so concise, each line (and character!) so meaningful, it is patently clear when I am doing something unpythonic- the meaningless lines of code, the extra characters, the redundant patterns in the code, the roundabout way to achieve something.  Static languages don’t have this concision, they don’t have the same economy, or flexibility.  There is too often a legitimate workaround or over-engineering, so when illegitimate workarounds are devised- well, who even considers it?  I certainly didn’t (hello, class Blah<T> where T : Blah<T>!).

So what are your choices here?  If you really think that this is your only option, your design stinks.  Stinks in that it has a foul code smell.  You have a few options:

  1. Singletons, which is a pretty big copout because you’re really just creating a static class by another name (but to be clear, are still a potential solution),
  2. Just create a new instance and call an instance method.  Suck it up, it really isn’t a big deal, though you basically require generics and a new() constraint for it, or the conceptual opposite:
  3. Dependency injection.  My guess is, if you’ve made it this far (congrats and thanks!), and you disagree, you’re not familiar with the idea of Dependency Injection or Inversion of Control.  I’d encourage you to read up on it, and realize it isn’t nearly as complicated as you may think it is, but far too interesting to get into here.

Good luck!


I’d encourage you to do your own googling and reading about this problem. It’s quite interesting and really highlights conceptual differences between languages, and understanding the problems or benefits to the approach will make you a better programmer and designer in your preferred language.

Optional parameters can be harmful

I’ve come around on optional parameters after being an opponent of adding them to .NET.  They can be very helpful, clean up the code by not needing overloads, and inform the caller what the defaults are.  They are an integral part of python and why it is easy to use.  This is great.

Except when you abuse them.
And you may be abusing them without knowing it.

Parameters should only be optional if a valid default value can be determined by looking only at the contents of the method.

What do I mean?  Well, if your default value is ‘None’, and you call another module to create the actual default value, this is not good practice.  It negatively impacts two important software metrics: it increases coupling by adding a dependency on your module, and it increases cyclomatic complexity by creating an ‘if’ statement.

It is better, in these cases, to just force the user to pass in a valid value.  If you’re jumping through hoops to determine a default value, odds are it is too specific and also breaks the reusability and dependability of the method.  The caller has a higher chance of already having a dependency to what you would be depending on for your default value (any chance is higher than the 0% chance your method has of needing it).  To demonstrate:

def exportFile(filename, branch=None, depotRoot=None):
    if branch is None:
        branch = os.environ['EXPORT_BRANCH']
    if depotRoot is None:
        depotRoot = someScmModule.singleton.depotRoot
    ...

This code is not unusual but it is really difficult to understand and maintain.  The default values determined for branch and depotRoot are entirely outside the logic and goal of the method.

Don’t do stuff like this and you (and especially others!) will have a much easier time maintaining and supporting code.

WTFunctional: Be Declarative

Functional programming is one of the most important developments in programming, but one that has been understandably slow to be adopted and understood by many programmers and tech artists.  Over a few posts, I’m going to try to go into the how and why of using a more functional style in your daily programming activities.

First up is demonstrating that functional programming is declarative: it makes your code more expressive and optimized.

Most programmers are used to seeing this:

list = []
for i = 0 to 10 do
  if i % 2 == 0:
    list.append(i)
//list is now [0,2,4,6,8,10]

Less familiar would be:

list = range(0, 10).filter(lambda i: i % 2 == 0)

The first focuses on the how: increment i from 0 to 10, and append every even item and 0 to a list.  This is an imperative style.  The second focuses on the what:  for each item from 0 to 10, select all even items.  This is a declarative style, which is an aspect of functional programming.  In this trivial case, the difference is, well, trivial.  But the key differences are:

  1.  The declarative style does not specify the enumeration mechanism- it uses the ‘range’ function, rather than incrementing explicitly (as a regular foreach loop does).
  2. The declarative style does not specify the filtering mechanism- it uses a ‘filter’ function, rather than an explicit ‘if’ statement.
  3. The declarative style does not specify the storage mechanism- it usually just returns any type that can be enumerated/iterated over, not a concrete type like a list/array/etc.

These differences create three key benefits:

  1. The abstracted enumeration mechanism means the enumeration mechanism can be optimized, and doesn’t have to be considered by the user.
  2. The abstracted filtering means the filtering can be optimized because its implementation is hidden from the user, and its intention is more explicit- this is the declarative part of it.  We’ll see how to read a more complex statement next.
  3. The abstracted storage mechanism grows out of the other two abstractions- there may not be a storage mechanism at all, but possibly just generators- it really depends on what is expedient for the statement.

Let’s try out a more concrete example.  In this case, we’ll be doing some complex enumeration- grouping, sorting, and projejcting.  We want to get a collection of MyObject from active table rows that are ordered by date and then by ID.

dateAndItemsMap = dict()
for row in myTable.rows:
    if row.isActive:
        if row.date not in dateAndItemsMap:
            dateAndItemsMap[row.date] = list()
        dateAndItemsMap[row.date].append(new MyObject(row))
sortedDates = dateAndItemsMap.values()
sortedDates.sort()
itemsSortedByDateThenId = list()
for date in sortedDates:
    items = dateAndItemsMap[date]
    items.sort(lamba obj: obj.id)
    itemsSortedByDateThenId.extend(items)

Wow, that’s a lot of code!  And not at all clear when reading it.  Let’s read it: Create a dictionary, and for each row, if it is active, make sure the map has a list for the row’s date, and append a new MyObject to the list at row.date in the map.  Then sort the keys, then iterate over the sorted keys, get the sorted list value, and keep extending the result list.  That’s a mouthful, and I think that was pretty brief.

Let’s compare this to the declarative style:

myTable.rows.filter(lambda r: r.isActive).select(lambda r: MyObject(r)).order_by(lambda o: o.date).then_by(lambda o: o.id)

One line?  One stinking line?  Let’s read it: For each row that is active, select a new MyObject, and order those by dates, and then by id.  Notice a) how the explanation expresses what you want, not how you want to get it, and b) the explanation reads very similar to the code.

This is why declarative programming rocks, right now.  It is worth its weight in gold to learn how to use LINQ in C#, itertools in python, or whatever declarative querying mechanism your language hopefully has.  Your code will become infinitely clearer.

The reason to be declarative will be even more awesome in the future, is when we can ‘prove’ software to be side-effect free (pure), and the compiler or runtime can automatically parallelize it and optimize it.  This is one reason languages like SQL have been so effective- the software/hardware can actually reorder or adjust your query to optimize it, and those algorithms or optimizations can change because the language itself has no notion of how the algorithms for JOIN, GROUP_BY, etc. are implemented.

That makes sense, I hope, and it is just one benefit of learning about functional programming.  Next up will probably be closures.

I booted up VS2010 today…

And I’m still waiting for it to load.  Even though I love all of Visual Studio’s features, my god, I forgot how big and heavy it is.  Having a light IDE and a system where I don’t need to create a new project (several more minutes) to just scribble some code like python has is nice.

Though I have faith that IDEs like VS will evolve somewhat more rapidly going forward, we’ll see if they get larger and more monolithic, or more feature-rich but componentized.

Return top
 

Switch to our mobile site