Popfly - creating games the easy way
I can't believe I missed this video! Watch till the end for an awesome Master Chief moment
Firefox 3 virus scanning
I consider myself reasonably familiar with the APIs that ship with Windows and I'm surprised whenever I come across some new API that I had never heard of before.
In a lunch discussion with Aarthi yesterday, I mentioned Firefox 3's virus scanning. Browsers have done this in the past (IE7 does it silently and prompts if it finds something evil) but seeing the explicit notification gave me a sense of security.
We got into a debate over how Firefox must have implemented it. My first guess was that Firefox must be using some online service which checks URLs - this would be similar to how all modern browsers support anit-phishing. However, that doesn't make sense when you consider how compute intensive virus scanning is. Doing it as a service for a popular browser would be a non-trivial effort.
I did some digging later and found this bugzilla bug which pointed me to the right path. It turns out that Windows has *2* in-built APIs to scan files for viruses. Relatively unknown (atleast to me), they wrap around the installed virus scanner.
The first one if IOfficeAntiVirus and the other is IAttachmentExecute. IAttachmentExecute, which is supported on XP SP2 and upwards, IAttachmentExecute (which is used by IE6 and 7 ) does a lot of magic behind the scenes - from supporting NTFS alternate streams to enumerating the installed virus scanners for you.
I'm amazed at the breadth of APIs you find across Windows!
Recursive generators are elegant
I just wrote a piece of code which drove home how simple and elegant generators in Python can be. I needed to loop through a *very large number* (lots and lots of digits) and though I needed only the string representation of the number, I couldn't write the loop in the typical fashion since the number was too large for Python to handle.
This is what I came up with. Does anyone have suggestions on how I can make this prettier?
def generate(prefix, remaining):
if remaining ==0:
yield prefix
else:
for i in range(0,10):
for number in generate(prefix + str(i), remaining-1):
yield numberfor number in generate("", 25): #25 digit number
print number
Zune and ZunePass will change the way you listen to music
Disclaimer: I work for Microsoft but these opinions are my own, yada, yada. And I have nothing to do with the Zune team.
Here's the short version - paying a low monthly fee and being able to download as many songs as you want is killer. I've downloaded over 1000 songs just in my first week and I'm going strong. And the 'social' works.
ZunePass is the Zune's subscription service. You pay $14.99 per month and you can download any number of songs from the Zune marketplace. Since it is a subscription service, you lose access to the songs once you stop paying the monthly fee. Sounds DRM (it is) and evil, doesn't it? Not quite.
I'm staunchly anti-DRM. And I'm a huge fan of Apple's devices. I did all my music shopping on Amazon's MP3 store and listened to all my songs on my iPhone. When Mel Sampat asked to get a Zune, saying that it will change my life, I thought he was joking. But Mel was persistent and I went and got myself a red Zune 80 custom engraved from Zune Originals. I will post a review on the device soon but I wanted to talk about ZunePass and the Zune software first.
Exploring is a lot easier
Buying songs on Amazon/iTunes is cheap but not free. I bought a lot of songs but I would buy them only if I really wanted them. I would rarely take a chance on an unknown artist or song or album.
With ZunePass, since I'm not paying per song, I find myself spending several hours randomly browsing through the Zune marketplace. I'm trying out *way* more artists and songs. Example? Someone on the internal Zune mailing list gave a thumbs up to William Shatner's (yes - *that* William Shatner) 'Has Been'. That's an album I would have never risked money on - but guess what - it's actually quite nice.
The closest analogy I can come to the exploring I do is how we browse through Wikipedia - where you click on an article and find yourself reading a totally unrelated article a couple of hours later.
If the Zune marketplace gains popularity, I'm willing to bet that we would see some interesting changes in how artists become popular. When you have to pay nothing to try out a new artist or a new album, it removes ton of 'impedance'.
I went crazy this morning on ZunePass downloading old Tamil and Hindi movie songs from ZunePass. My Tamilian readers will understand why I'm so excited about getting songs from movies like 'Mr.Bharath' or Duet or more recent movies like Bombay. The selection of world music, atleast the Indian section, is quite deep and is very up-to-date.
I can hear a lot of folks (especially all my college friends reading this post) saying "You already get this from Bittorrent". Apart from the legal aspect and quality aspect, you're going to have a lot of trouble finding seeds for rare or unpopular content.
Under the right circumstances, DRM isn't so bad
There. I said the unthinkable. Here are the limits the Zune DRM puts on you
- You can listen to your music on three given machines at any time, For me, that's my home machine, my work machine and my laptop. I stream content from my home machine (over UPnP) to my XBox 360 hooked up to my home theater system.
- You can take it on your Zune device if you have one
- You lose access to your music once you stop paying the subscription fee. Of course, if you wanted to, there's an option to buy the song outright as well (ala iTunes or Amazon's mp3 store)
I've been using it for a week now and I don't find any of it limiting. The biggest blocker for a lot of people will be lack of native Mac or Linux support but I guess Parallels/Bootcamp/Fusion can help out in that regard. Another big blocker is that these songs don't work with iPods - so you'll have to listen to these songs from your computer alone or get a Zune. As I've found out in the last one week, the new Zunes aren't bad at all.
This made me think - is all DRM evil? Without DRM, there's really only one business model - buying each song or album. DRM, *implemented correctly*, lets you 'rent' content and doesn't necessarily mean a bad user experience. I have a feeling that there are a lot of people out there who will be happier with the renting model (due to the dramatically lower prices) for some forms of content.
'The social' works
Finding songs and artists to listen to based on what your friends are listening turns out to be an exceptionally good model. In my case, a lot of my friends are doing the same sort of exploration I talked about above so I get to reap the benefits of their effort. At the top of this post, you can see my Zune card with a live updated list of the songs I listen to, my favorites, etc.
It's incredibly cheap
I've already downloaded over 1000 songs in the short time I've had a Zune. I'm sure I'm going to get several thousand more over the next year or so. Paying $1 or so per song for thousands of songs would be *very expensive* when compared to paying a flat $15 for per month .
What Twitter and everyone else needs right now - distributed programming tools

We've gone through our various databases, caches, web servers, daemons, and despite some increased traffic activity across the board, all systems are running nominally. The truth is we're not sure what's happening.
The quote and image above are from the much talked-about Twitter blog post. Alex Payne has a longer post detailing their architecture issues here (Techmeme discussion).
The problem Twitter's dev team is doing right now in diagnosing their problems is something I've been thinking about for some time. As a programmer, your top diagnostic tools are probably your debugger, your profiler and a logging system. Now, take a look at the simplified architecture diagram I created below of a typical high traffic web application (note my complete lack of artistic skill).

Notice anything wrong? The tools that we just spoke about help us only on each individual node. They do a terrible job of traversing program flow across machines. Given that any self-respecting service is running on multiple machines, most dev teams find out the hard way that tracking down distributed bugs is very hard. Take Twitter for example - by their own admission, their problems lie not in the boxes themselves but in the arrows between them. This problem is compounded due to the fact that your app code is a very small part of this system. It may be possible to build extensive logging into your code but having to trace values across memcached/Cacheman and MySql/Sql Server and your load balancer is non-trivial, if not impossible.
I think there's a tremendous opportunity in solving this problem. Unfortunately, I don't know what exact shape or form it will take (I do have a few guesses). Here's my wishlist for what such a solution should have
- Distributed debugging I want the ability to seamlessly debug from one machine to another, moving between web frontend code, SQL stored procedures, any messaging infrastructure and back. For places where I can't debug, I want to see all data coming and coming out.
- Establishing causality For post-mortem distributed debugging, you are always trying to establish causality - what event lead to what other event. Such a system will log enough data to establish causality relationships between various events in the system. For a particular request or stack trace, I should be able to go back and look at code flow across machines for that request.
- Distributed profiling Anyone who has maintained an public facing site has seen unexplained slowdowns which are hard to debug. In the very early days of Popfly, we used to see an unexplained slowdown everyday at lunchtime (we used to joke about the servers going out to lunch). A tool that constantly monitors server performance and in case of a slowdown, shows you where time is spent across the system will be critical. Tools like Nagios do a great deal of monitoring for you but I haven't seen any tool do the latter.
I haven't found any current system which does all the above to my liking. I'm writing some code in my spare time to get some of the above but it is a daunting task. The closest I've seen anyone come is the XTrace project from Berkeley's RAD labs. The following excerpt from the paper explains how it works
A user or operator invokes X-Trace when initiating an application task (e.g., a webrequest), by inserting X-Trace metadata with a task identifier in the resulting request. This metadata is then propagated down to lower layers through protocol interfaces (which may need to be modified to carry X-Trace metadata), and also along all recursive requests that result from the original task. This is what makes X-Trace comprehensive; it tags all network operations resulting from a particular task with the same task identifier.
The problem with systems like X-Trace is that they require extensive modifications to database servers, your web server, your cache server, etc. Such a task is not easy, especially given the wide array of choices you have in each of those. However, X-Trace is promising and I know that they've carried out some large scale tests with good results.
Does anyone have pointers to tools and solutions which can help with the above?
Finding out what users like and don't like without them telling you
Traditional mechanisms for collecting feedback are well known - user studies, focus groups, usability labs, reading emails from users, visiting customers in person, etc. All of them are important and have their place. Equally important but not as well known are automated feedback gathering mechanisms. This is important since what users *do* is a much better data source than what users *say they do*.
Let's get something clear at the beginning. User privacy concerns trumps over everything else - you need to know what is ok and what isn't and more importantly, that the user knows what is going on and has opted in. If you want to know what is ok and what isn't, I suggest looking at the large companies (Microsoft, Google) and see how they do it. They typically have large user experience and legal teams so anything that comes out is typically kosher.
It is impossible to improve a product if you don't know what the top errors your users hit are. In an ideal world, you would get notified every time the user hits a problem with a debugger automatically attached. Though impossible in the real world, you can come close.
Errors in web applications/services
Errors in web applications are tricky to deal with since there are a wide variety of things that could go wrong. Here's a list of techniques I personally use, mostly due to my work on Popfly.
- Handle the window.onerror event and send the exception back to your servers
- For every custom client side error, send some data back to the server to log it.
- Log every server side error with an unique identifier and display that to the user instead of a generic error number (usually along with a helpful message). This lets you see the exact stack trace/log for the problem. This single technique has helped us tremendously at Popfly.
- Build the right tools on the backend to analyze all the data coming at you. At the very least, any member of the team should be able to get the stack trace and bug id for an error that a user sees.
Watson/Online Crash Analysis for desktop apps
The dialog on the right is a typical crash report dialog. The idea behind it is ridiculously simple - when a crash occurs, send a minidump to a server. From its humble origins in the Office world, Watson is now *the* way to find out what the top crashes are across apps on Windows. Apart from being able to load the dump in a debugger, you can also mine interesting data out of it - e.g, see what the top offending modules or what the worst extensions are. This data isn't available to Microsoft alone. If you're a software maker on Windows, you can get at crash data for your own applications.
If you don't like the Watson UI or if you're not creating Windows apps, you may want to roll your own implementation. There are third party crash reporting software you could look at as well - Google Breakpad is one that I know of. 1
Usage information
Apart from errors and crashes, you also need to know what users are doing with your applications. What buttons are they pressing, how many files do they open, what do they click on and in what order. Microsoft uses the 'Customer Experience Improvement Program' to collect this sort of data. This is opt-in with strict rules on what kind of data can be collected. This leads to great data-driven decisions across several products (for example, with the Office Ribbon).
The popular recent usage of such information is in search toolbars. All the major search toolbars ask users to opt-in to having their anonymous browsing history uploaded and then use that data to improve their algorithms and in some cases, show you personalized results.
Analytics
On the web, its a lot easier to find out where your users are and what they're doing with your application since all the data is coming at you over HTTP anyway. Every site should have some form of analytics turned on. I can think of almost no cases where a site owner would not want to know how much traffic the site is getting and the usage patterns that the traffic generates. Analytics are great for
- Telling you traffic patterns - user characteristics, load across time, referers, popular pages, etc
- Showing you what content works and what doesn't
- Showing you how users interact with your site. For e.g, setting up a funnel is a great way to find out where users are running into problems and dropping out.
- Telling you demographic data (with AdCenter Analytics)
I personally have both Google Analytics and AdCenter Analytics turned on in this blog.
A/B Testing
Last, but not least, is A/B Testing. The idea is simple - you run a controlled experiment where you show different sets of users different variants of a feature. You collect data and then determine2 which version performed better . The amazing thing about A/B testing is that you can now resolve design decisions by letting real users show you through their actions. This is vastly different from a traditional design decision where one has to try and guess what design will work and what won't.
Amazon, Google and Microsoft all make extensive use of A/B testing (Amazon is probably the heaviest). Ronny Kohavi runs the experimentation team at Microsoft and they have some great content up at http://exp-platform.com/default.aspx. AdWords is the only popular A/B testing platform I could find for general usage - I would love pointers to others.
A/B testing is not for everything and for everyone.
- I'm still not convinced it is possible to do large scale design using A/B testing. Testing variants is one thing - designing complete user flows through A/B testing is something else
- A/B testing and QA don't go well together. Since you're testing multiple versions at the same time, someone needs to make sure that all the multiple versions work well (since they'll be seen by real users). That can put a lot of stress on whoever's testing your apps.
- It is easy to get bad data - either through external factors , by picking bad sample users or by biasing your experiment in some other way.
I strongly recommend reading Kohavi's paper on the topic to get an idea of some of the pitfalls.
Notes
1. My best experience with crash analysis came last week from an unexpected source - the e text editor. It crashed on me and popped up a custom 'do you want to send crash info' dialog. I was pleasantly surprised when I got an email from them the *very next day* that the bug that caused my crash had been fixed with a link to the new setup.
2. The 'determining' part is actually quite tricky. One would think that it is as simple as just seeing the numbers but in reality, one has to account for statistical significance, external factors, whether the experiment was run across enough users, whether the user sampling was random enough, etc.
At the Web 2.0 Expo from the 22nd to the 25th
Archives
November 2004 January 2006 June 2006 July 2006 August 2006 September 2006 October 2006 November 2006 December 2006 January 2007 February 2007 March 2007 April 2007 May 2007 June 2007 July 2007 August 2007 September 2007 October 2007 December 2007 January 2008 February 2008 March 2008 April 2008 May 2008 June 2008
