In the beginning, the universe was created. This has made a lot of people very angry, and is generally considered to have been a bad move.

Friday, August 17, 2007

Skype. Take a Deep Breath.

Slogan gets a whole new meaning.

There are outages and then there are outages. As any frequent Skype user has noticed by now, the Skype service hasn't been very healthy for the past 13 hours (and still counting). Time to start taking those deep breaths if you were counting on making any communication with Skype this Thursday (and possibly Friday).

The latest update from the Skype team points the finger at engineering -- a deficient algorithm is to blame. That rules out the evil Microsoft conspiracy theory, which, while possible, was unlikely to begin with, and also the denial-of-service attack theory, which was my initial guess for the login issues. I am sure someone, somewhere, also blamed Skype's woes on today's stock crash.


Where's the QA?

The Skype issue does bring up several points of interest. First is the whole engineering debacle (if that turns out to be the final verdict) -- the complexity of software engineering, of managing software updates, the software lifecycle and software quality. These have of course been discussed to death by academics over the past decades, yet precious little progress is evident in the software industry today. Critical errors still end up in software no matter how hard we try to avoid them with processes, reviews, unit tests or audits. Skype is not the first, and unfortunately will not be the last, to ship a bad software update.

Frankly, I don't see software quality improving with the current set of tools the majority of software developers are working with. It looks like we just have to accept critical software bugs as a way of life until a visionary computer scientist comes up with a software methodology that is simple enough for the majority of engineers to adopt. This most likely requires automated verification of software correctness, given the lack of rigor and discipline evident in most software projects and engineers (yours truly included, and unlike some I don't even believe writing software is an art). Sadly, however, no amount of discipline will prevent critical software failures; just ask NASA.

The second problem of software quality is that the current mainstream programming languages lack ways to express intent. It is hard to see automated software verification emerging unless the code states what it is supposed to achieve in the first place. This is where the current mainstream imperative programming languages clearly fall short, and an eventual overhaul is necessary. But that's a topic for a whole different essay.
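To make "expressed intent" a little more concrete, here is a minimal sketch in Python of stating what a function is supposed to achieve separately from how it does it. Everything here (the `ensures` decorator, `IntentViolation`, the two sort functions) is made up for illustration, and it only checks intent at run time, which is a long way from real verification.

```python
from collections import Counter
from functools import wraps


class IntentViolation(Exception):
    """Raised when a function's result does not match its stated intent."""


def ensures(postcondition):
    """Attach a stated intent (a postcondition) to a function.

    The postcondition receives the result followed by the original
    arguments and must return True when the intent is satisfied.
    """
    def decorate(func):
        @wraps(func)
        def checked(*args):
            result = func(*args)
            if not postcondition(result, *args):
                raise IntentViolation(f"{func.__name__} broke its stated intent")
            return result
        return checked
    return decorate


def is_sorted_permutation(result, items):
    """Intent of sorting: same elements as the input, in non-decreasing order."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    same_elements = Counter(result) == Counter(items)
    return ordered and same_elements


@ensures(is_sorted_permutation)
def insertion_sort(items):
    out = []
    for x in items:
        i = len(out)
        # Walk left until the element before position i is not larger than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


@ensures(is_sorted_permutation)
def buggy_sort(items):
    # Deliberate bug: going through a set silently drops duplicates.
    return sorted(set(items))
```

With this in place, `insertion_sort([3, 1, 2, 1])` returns normally, while `buggy_sort([3, 1, 2, 1])` raises `IntentViolation` even though its output looks sorted at a glance. The bug only surfaces because the intent was written down somewhere a machine could check it.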

In the meantime there is not much we can do except pray that the software glitches hit something as harmless as communication services, as opposed to an airplane you're flying on or any other safety-critical equipment you might be forced to come into contact with. There are plenty of software horror stories to be found for those who care to look.


Here Lies Lock In

The second interesting point the Skype issue has brought up has to do with the potential weaknesses of peer-to-peer applications in general, and the wisdom of relying too much on a single service provider for such a critical component of communication, whether personal or for business.

The single-vendor issue should be relatively easy to tackle -- just make sure you have a backup service available if your primary fails. There are plenty of messaging solutions around and absolutely no reason to put all your eggs in a single basket. And if you're relying mostly on free services, getting a backup isn't going to cost you much either.

Oh wait, but what about my contacts?

You've got all your contacts locked into a single vendor's product? Enjoy the slow screw of vendor lock-in. Having a cheap or free backup won't matter unless all of your buddies agree to use the same backup service. And fat chance of that ever happening.
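The catch can be shown with a toy model. This is only a sketch: the `Service` class, the service names and the account names are all hypothetical, and real messaging networks are obviously more involved. The point is that falling back to a second service works only for contacts who also have an account there.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A toy messaging service: it is either up or down and has a set of accounts."""
    name: str
    up: bool = True
    members: set = field(default_factory=set)

    def can_deliver(self, sender, recipient):
        # Both parties need an account on the same service, and it must be running.
        return self.up and sender in self.members and recipient in self.members


def pick_service(services, sender, recipient):
    """Return the name of the first usable service in preference order, or None."""
    for service in services:
        if service.can_deliver(sender, recipient):
            return service.name
    return None


# Hypothetical setup: the primary is down, and only one contact is on the backup.
primary = Service("primary", up=False, members={"me", "alice", "bob"})
backup = Service("backup", members={"me", "alice"})

print(pick_service([primary, backup], "me", "alice"))  # reachable via the backup
print(pick_service([primary, backup], "me", "bob"))    # not reachable at all
```

Alice is reachable through the backup; Bob is not, no matter how many backup accounts I sign up for myself.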

So when a vendor or service provider has a massive failure, it should be a wake-up call for all customers to demand open standards for their own benefit. Sure, we can hope the vendor is capable, always provides excellent service, and that problems are rare enough to spare us all this trouble. But given the current state of software quality, as mentioned before, that's not very likely to happen.

Almost every vendor of a popular messaging product is guilty of this, or has been at one time or another, be it Microsoft, Yahoo or Skype: they lock you into their product via your network of contacts. There are alternatives but, sadly, they get much too little attention (from users and developers alike).

No vendor is going to release their lock-in voluntarily and without significant customer pressure, and what better time to apply that pressure than when the vendor has a major cock-up like Skype did today. You can vote with your feet or with your wallet. Demand open access to the information that is yours to begin with -- your own contacts -- and demand open protocols so you can reach your contacts at any time, regardless of which software product or device they are using.


Future of Free Service

A final thought arising from the current Skype problems concerns the future of free service. While some might argue that Internet voice services are a poor substitute for the traditional phone network, especially given today's example of rather lengthy downtime (and they may be correct given the current state of affairs), the traditional providers are not exempt from similar problems. Untested software patches have gone into phone systems and brought them down for hours. What differs is the service level agreements that can be had with traditional network operators, and the associated penalties that apply in case of a failure. While this may not prevent failures, it does create an incentive for vendors to put sufficient resources behind preventing problems and handling emergencies.
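To put rough numbers on what a service level agreement constrains, here is a back-of-the-envelope sketch. The 13 hours is the outage length mentioned at the top of this post; the 30-day month and the 99.9% target are illustrative assumptions of mine, not the terms of any actual agreement.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day billing month


def availability(downtime_hours, period_hours=HOURS_PER_MONTH):
    """Fraction of the period during which the service was up."""
    return 1.0 - downtime_hours / period_hours


def downtime_budget_hours(target, period_hours=HOURS_PER_MONTH):
    """Downtime allowed in the period while still meeting the availability target."""
    return (1.0 - target) * period_hours


outage_hours = 13.0  # outage length at the time of writing
sla_target = 0.999   # hypothetical "three nines" target

print(f"Availability for the month: {availability(outage_hours):.2%}")
print(f"Downtime allowed at 99.9%: {downtime_budget_hours(sla_target) * 60:.0f} minutes")
```

Under these assumptions a single 13-hour outage leaves the month at roughly 98.2% availability, while a 99.9% target allows about 43 minutes of downtime in total.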

Assuming companies like Skype do eventually want to get into the enterprise, having service disappear for a whole day at a time becomes utterly unacceptable. Having your business on hold because of your service provider is not an option. This is where service level agreements enter the picture: select customers pay for better service. At that point the company's revenue source shifts from the stingy consumer wanting to save every penny on a call, or preferring free computer-to-computer calls as much as possible, to a business willing to pay for a guaranteed service level. This will inevitably also shift the company's focus (follow the money) away from its current customers (mostly consumers), and puts a question mark over the future and quality of the free service. In the end, when you get it for free, you get exactly what you paid for, right?

Update: While I was writing this blog entry, Skype seems to have been getting back on its feet. It has gone from a low of 95,000 concurrent users to 2 million and rising. It seems whatever the problem was has been fixed.


2 comments:

Zaduj said...

The problem reappeared on the 1st of Feb. IMO it could be a deliberate action by Skype, in order to get more people calling instead of using the chat function. That makes sense to me because they have enough users to do as they please. "Free" is a relative term in this case! Money will make this world go (y)ünder...

Juha Lindfors said...

Interesting. I was on Skype on Feb 1st and didn't notice anything problematic. Nothing in the news either in the quick scan I made.

Perhaps a local issue?

Anyway, it will be interesting to see where Skype goes next. After the almost 1B USD writedown eBay made earlier, the new CEO has apparently given Skype one year to prove itself. We shall see.