During a discussion today about some necessary-but-scary infrastructural changes, someone (a Lead, mind you) described the status quo as a warm, comforting blanket he wasn’t ready to leave. Horrified, I replied: Great men don’t swaddle themselves in warm blankets.
Archive of published articles, March 2011
On an almost weekly basis, I run into some example of naive programming regarding threading. These examples generally have the following in common:
1- They use Thread.Start directly
2- They show no understanding of CPU-bound vs. IO-bound operations
3- They show no understanding of how a computer manages threads
Take this pseudocode for some widely-used routines I ran into today (our custom File.Copy method actually uses this internally!):
function CopyFiles(fromFilenames, toFilenames):
    for i = 0 to fromFilenames.Count - 1:
        System.IO.File.Copy(fromFilenames[i], toFilenames[i])

function FastCopyFiles(fromFilenames, toFilenames):
    # Bucket filenames into arrays, one for each core
    bucketedFromFiles = BucketFiles(fromFilenames)
    bucketedToFiles = BucketFiles(toFilenames)
    for i = 0 to bucketedFromFiles.Count - 1:
        Thread.Start(CopyFiles(bucketedFromFiles[i], bucketedToFiles[i]))
    waitForAllThreadsToFinish()  # implemented with some counter system
There are so many things wrong with this. I totally understand the idea: each thread waits during IO, so just new up more threads to issue more IO while each one waits for its operation to complete. Here are the major problems:
- Newing up threads is expensive! Each thread requires a 1MB stack (the Windows default) and takes a significant amount of time to create and destroy.
- Managing threads is expensive! Each core on your computer can only run 1 thread at a time (basically). There are other programs running on your computer, as well as possibly other threads in your program. Windows performs ‘context switching’, which means a core switches to a different thread. That requires saving one thread’s state, loading another’s, refilling the CPU caches, and a host of other work. Creating more threads than you have cores means context switches happen more often. More threads get created in your program when the CLR detects a thread is blocked and there is work to do, or when you request one with Thread.Start.
- Your threads are doing NOTHING! While each thread waits for its IO to complete, it is doing absolutely nothing. It is just killing time, and your performance.
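To make the thread-reuse point concrete, here’s a small Python sketch (Python standing in for the .NET code above, and `count_worker_threads` being my own illustrative helper): a fixed-size pool services a hundred tasks with at most a handful of threads, instead of paying creation and teardown cost per task.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def record_thread(seen, lock):
    # Each task just notes which worker thread ran it.
    with lock:
        seen.add(threading.current_thread().name)

def count_worker_threads(num_tasks=100, num_workers=4):
    seen = set()
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(record_thread, seen, lock) for _ in range(num_tasks)]
        for f in futures:
            f.result()  # propagate any exceptions
    return len(seen)

# 100 tasks, but no more than 4 threads are ever created
# (possibly fewer, since the pool spins threads up lazily).
print(count_worker_threads())
```

The key observation is that the number printed never exceeds the worker count, no matter how many tasks you submit.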
Naive programming parallelizes a process without making it asynchronous. Parallelization is good (preferably using the built-in ThreadPool/Threading.Tasks.Parallel.ForEach/PLINQ/etc., not custom algorithms), but you NEED to be wary of IO-bound operations (or threads that launch a separate process, etc.).
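As a Python stand-in for the pooled approach (think Parallel.ForEach), here’s what the copy routine looks like when you hand scheduling to a pool. Note that the whole BucketFiles business disappears; the pool does the partitioning for you. The function name and worker count here are my own choices for illustration:

```python
import shutil
from concurrent.futures import ThreadPoolExecutor

def fast_copy_files(from_filenames, to_filenames, max_workers=4):
    # Let a fixed-size pool schedule the copies instead of
    # hand-bucketing files onto manually created threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(shutil.copyfile, src, dst)
            for src, dst in zip(from_filenames, to_filenames)
        ]
        for f in futures:
            f.result()  # re-raise any copy errors
```

This still has blocked threads during the IO itself (see below), but at least thread creation and context-switch overhead stay bounded.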
The correct approach here is basically to have a single thread (well, just let the ThreadPool manage the threads) begin an asynchronous write operation, and wait for the tasks to finish. The ideal is that a thread gets a ‘BeginWrite’ request, runs to the HDD, drops off the request, then comes back up and does more work (probably running back to the HDD to drop off another request). As the HDD finishes the requests, a thread (the same or a different one) can pick up the notification and run a callback, signal that the original request has finished, etc. So no threads sit idle waiting for the HDD: they are running around frantically doing work. Which is fine. What we want to avoid is 1) creating new threads, 2) context switches, and 3) inactive threads while there’s other CPU work to do (which means wasted resources).
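That “drop off the request and keep working” model is exactly what async IO frameworks give you. A minimal Python asyncio sketch of the scheduling pattern (with `asyncio.sleep` standing in for the disk servicing a request, since the point here is the scheduling, not real disk IO): one event-loop thread keeps twenty “writes” in flight at once, so total wall time is roughly one request’s worth, not twenty.

```python
import asyncio
import time

async def fake_write(i, duration=0.05):
    # Stand-in for an asynchronous disk write: the coroutine
    # yields control while the "hardware" does the work,
    # freeing the thread to start more requests.
    await asyncio.sleep(duration)
    return i

async def copy_all(n=20):
    # All n "writes" are in flight at the same time, on one thread.
    return await asyncio.gather(*(fake_write(i) for i in range(n)))

start = time.perf_counter()
results = asyncio.run(copy_all())
elapsed = time.perf_counter() - start
# Serially this would take ~1.0s (20 x 0.05s); concurrently it
# takes roughly the duration of a single request.
print(sorted(results) == list(range(20)), elapsed < 0.5)
```

The same shape applies to .NET’s BeginWrite/EndWrite (or, later, async/await): the waiting happens in the OS, not in a parked thread.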
I’ll go more into the explanation and an example of the proper way to implement that FastCopyFiles method in a future post (probably after I actually rewrite the one at work). There are already lots of examples of asynchronous IO out there, so you should be able to figure it out yourself. Which you must do if you want to write multithreaded programs. Because you don’t want to be a smart person doing naive programming.
Here’s a video interview I did with Bill Crosbie, a member of the IGDA’s Education SIG. He was interviewing tech artists at GDC to get ideas for a curriculum the IGDA can give to educators, to help grow and raise awareness of the tech art discipline.
The last part of the video is where I talk about how tech artists need to be ‘ruthless.’ It certainly caught Bill off guard, and gave me a focal attribute to talk about for the rest of the week. “Ruthlessness” will of course need to be a topic for a future post.
Thanks a lot to Bill for doing all this!
I’ve uploaded my GDC slides (with full notes/narration). Here’s the link:
The main point of the presentation is understanding how to get your Tech Art and Tools Engineering teams to work together effectively (and why they aren’t working effectively now). I go over each team’s strengths (have you ever considered how differently TA and Engineering are set up?!), how to turn adversarial relationships into positive structures (how are tools supposed to get made when all three departments are competing for the same people’s time?), and actual technical strategies for working together (defining strong data interfaces and laying boundaries for a common codebase).
Please download it, read it over, and tell me what you think!
So, GDC 2011 was fantastic. Really, truly fantastic, on an industry, discipline, and personal level.
This was the largest GDC yet, with something like 18,000 attendees and 600+ speakers.
Our Tech Art Bootcamp was a huge success: an almost-full room from 10am to 6pm on Tuesday. Incredible. The Tech Art Roundtables were packed every day, and the Tech Animator Roundtables seemed to go better than last year’s.
I think my speech, entitled “Ending the Culture War: Uniting Tech Art and Engineering to Create a Better Pipeline,” went incredibly well. Not a repeat of last year’s disastrous presentation.
I’ll get the bootcamp slides up this week.
I’m going to be posting about some of the trends I’ve noticed and challenges we’re facing as a discipline.
Tech-artists.org will be moving into the mobile/social age soon with some much needed site improvements.