swombat.com

daily articles for founders

A verbal command line for the world

I've been playing around with Siri since I got my iPhone 4S, and I've seen various articles about it: some paint Siri as the next evolution of UI, while others express disappointment that Siri is not as amazing as Apple seems to think.

Allow me to take a quick step outside of the "startup advice" theme of this site to describe why I think Siri is worth paying attention to.

The mighty command line

If you ask geeks anywhere, they will extol the virtues of the command line, affectionately known as the CLI (Command Line Interface). CLIs are brilliant if you know how to use them. In an environment designed for CLIs, like Unix (or OS X, or even some aspects of old and modern Windows/DOS), where every piece of software can be, and is, driven from the command line, the CLI is incredibly powerful.

With it, you can type a simple line of apparently mysterious characters, and get "the computer" to perform tasks with nearly magical efficiency. For example, a well-crafted command can easily, in a few seconds, rename 15,000 files across hundreds of folders to fit a new naming pattern.
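To make that concrete, here is a minimal sketch of such a bulk rename, written in Python rather than as a shell one-liner. The function name `rename_matching` and the prefixes in the usage note are made up for illustration, and it assumes the "naming pattern" is nothing more than a filename prefix.

```python
from pathlib import Path


def rename_matching(root: Path, old_prefix: str, new_prefix: str) -> int:
    """Rename every file under root whose name starts with old_prefix.

    Returns the number of files renamed. Hypothetical helper, for illustration only.
    """
    renamed = 0
    # Collect the matches first so the renames don't disturb the directory walk.
    for path in list(root.rglob(f"{old_prefix}*")):
        if path.is_file():
            new_name = new_prefix + path.name[len(old_prefix):]
            path.rename(path.with_name(new_name))
            renamed += 1
    return renamed
```

Calling `rename_matching(Path("photos"), "IMG_", "holiday_")` would turn `IMG_001.jpg` into `holiday_001.jpg` in every folder under a (hypothetical) `photos` directory, however many thousands of files that happens to be.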

Learning to "speak" the CLI language may take some time, but once you know it, you can basically "speak" to the computer, via your keyboard, and get it to do your bidding quickly, concisely, efficiently.

The easy GUI (pronounced Gooey)

But command lines have a big problem: they're complex, they forgive no mistakes, and by giving you great power they also require you to know what you're doing or risk causing yourself great damage. And they're text-based. Most people don't like reading lots of text. They prefer looking at pictures or talking to people. They like simplicity.

GUIs (an invention commonly attributed to Xerox, and first delivered to the mass market by the early Apple in 1984) delivered that simplicity, and most people have been satisfied enough with it that they have never bothered learning to use a command line.

But those of us who know the right CLI incantations know that GUIs are very limiting. They are slow and inflexible. They lack fundamentally useful capabilities, like passing the output of one program to the next, to make magic happen.
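For those who have never seen it, this is roughly what "passing the output of one program to the next" looks like. The sketch below is in Python and is only an approximation of a shell pipeline: the `pipe` helper and the two tiny stage programs are invented for this example, and the stages run one after another rather than side by side as they would in a real shell.

```python
import subprocess
import sys

# Two tiny stand-in "programs", each reading stdin and writing stdout.
SORT = "import sys; sys.stdout.writelines(sorted(sys.stdin))"
UPPER = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def pipe(text: str, *stages: str) -> str:
    """Feed text through each stage in turn, output of one becoming input of the next."""
    data = text
    for code in stages:
        result = subprocess.run(
            [sys.executable, "-c", code],  # run the stage as a separate process
            input=data,
            capture_output=True,
            text=True,
            check=True,
        )
        data = result.stdout
    return data


if __name__ == "__main__":
    print(pipe("pear\napple\n", SORT, UPPER), end="")
```

Neither stage knows the other exists, yet chained together they sort and then capitalise the text. That kind of composition is what a GUI has no natural way to express.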

By reducing everything to pointing and clicking (or, in the more advanced iterations, touching with your fingers), GUIs make computers easy to use, but you have to give up a lot of power in exchange.

Enter Siri

Siri is not a GUI. It's far more verbose and subtle and versatile than simple pointing and clicking. In practice, Siri is still very limited, but in theory, it's as flexible as language is. In other words, Siri is a command line interface.

It has a few extra pieces - for example, the voice recognition piece, and the natural language processing piece. Those pieces are essential to making the Siri command line accessible to people who have no time, patience, or interest in learning magical incantations. But fundamentally, Siri is a command line that receives your language-based commands and plugs them into whatever services are being offered by the Siri infrastructure. This allows you to do... whatever the language syntax supports. Since the language is the one you speak and think with, eventually you should be able to get Siri to do pretty much anything you can express in your native language.
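As a toy illustration of that idea, here is a sketch in Python of a command line that takes a sentence, matches it against a small "syntax", and hands the pieces to a service. Everything in it is hypothetical (the `COMMANDS` table, the `run` function, the two stand-in services); it says nothing about how Apple actually built Siri, and it skips the voice recognition and real language processing entirely.

```python
import re
from typing import Optional


# Hypothetical "services": plain functions standing in for reminders, timers, etc.
def add_to_list(item: str, list_name: str) -> str:
    return f"Added '{item}' to your {list_name} list"


def set_timer(minutes: str) -> str:
    return f"Timer set for {minutes} minutes"


# Each entry pairs a phrase pattern (the "syntax") with the service it drives.
COMMANDS = [
    (re.compile(r"add (?P<item>.+) to my (?P<list_name>\w+) list", re.I), add_to_list),
    (re.compile(r"set a timer for (?P<minutes>\d+) minutes?", re.I), set_timer),
]


def run(utterance: str) -> Optional[str]:
    """Dispatch a sentence to the first service whose pattern matches it, or return None."""
    for pattern, service in COMMANDS:
        match = pattern.fullmatch(utterance.strip())
        if match:
            # The named groups in the pattern become the service's arguments.
            return service(**match.groupdict())
    return None
```

With this table, `run("Add ketchup to my shopping list")` reaches the list service, while a phrasing the syntax does not cover simply returns `None`. Widening what the interface understands is then a matter of adding patterns and services, not of redesigning screens.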

Currently, the Siri syntax is quite limited. The voice recognition is OK. The natural language processing is passable. But all those are things that can be improved, so long as Siri is useful enough to be worth investing in. And that, it is. Siri is already a great interface for adding reminders, meetings, timers, and so on. It's a decent interface for searching the web ("Search the web for 'how do smoke detectors work?'"), and, in the US, is probably a good interface for interacting with maps and local businesses. That's enough of a starting point to warrant further investment from Apple.

Siri vision

Those who look at what Siri is right now and think "meh" are simply lacking in vision.

The more services are integrated into it, the better Siri becomes. This is not a distant dream, it is simply the next step. It is blindingly obvious that within a few short years, every service that can be integrated with Siri will be. A few months ago, being able to pull out your phone and ask it "When is the next bus 19 coming?" was a pipe dream, science fiction. Now, it's one small step away from reality. Think about that.

Not only that, but as the number of Siri users increases (it's predicted that there will be over 100 million iPhone 4S users within the next year), everyone will want their services to be integrated with Siri. Apple will have to beat service providers away with a stick. If you thought the App Store approval process was draconian, wait till you see the Siri approval process.

The more Apple invests in the voice recognition and NLP technology, the better Siri becomes, too. Right now, some bits of syntax are cumbersome. It took me bloody ages to figure out how to add stuff to my shopping list reminders in a natural way ("Add buy ketchup to my shopping reminders" failed, but "Add ketchup to my shopping list" worked). And Siri's voice recognition utterly fails in noisy environments, particularly when using the Apple headphones. All of these can be fixed with incremental improvements, though (which we know Apple excels at). The real magic, the smooth and intuitive integration of a ubiquitous voice-driven personal assistant with a variety of external services, that's done already.

The vision for Siri is simple and brilliant: it's a command line for the world, for everyone and everything. It's here today, in its infancy, and over the next decade we're going to get to watch it evolve into the main way that most normal people interact with computers.

If that makes you feel "meh", then I don't know what to say to you.

