I just read through a great Twitter blog, New Tweets per second record, and how! It covers in some depth the changes to their engineering organization over the last three or so years. As CCP is undergoing similar technical stresses (we hit our PCU record recently), and responding with similar actions, I thought I’d write down my takeaways and personal feelings.
Interpreted languages are fast enough or too slow. Ruby helped Twitter get where it is, and without such insane peak performance needs I’m sure they would have stayed in Ruby. We’re actually about to embark down rewriting some core systems from Python into C++, because we can’t always throw (more|better) hardware at the problem.
Architect your organization the way your want your software to be architected. This is derived from Conway’s law (like, “4 teams working on a compiler will develop a 4 pass compiler”), and I feel is central. If you want to rearchitect your technology, you need to reform your organization. There’s no way around it. The basis of discussion for those organizational changes should be how the technology should look.
Prefer interfaces at the service level. This isn’t always possible (core libraries or frameworks, fat clients, etc.), but should be preferred. It is a natural boundary. Though, interfaces at the module/package/class level can work fine for a certain scale (up to quite large, I imagine). Twitter has more open Software Engineer positions than most projects have programmers! But SOA is a great thing, for technology and for organizations.
Self-organized teams around services are effective. Self organizing teams have been one of the keys to Agile’s effectiveness (even when Agile development principles aren’t followed). It is difficult to scale Agile up to multiple teams, though, so dividing at service boundaries is a convenient way to reduce how large Agile must scale up. (As an aside, I don’t mean to say “Agile doesn’t scale” or that other development methodologies scale better, it’s just really difficult to scale up software development)
Monolithic DBs will be a bottleneck. Twitter came up with some clever solutions for sharding and balancing that worked for Twitter. If you have a monolithic DB under great strain, you will need to get creative with your own solution.