Thursday
31 Jan 2008
Enso 2.0 Design Thoughts
As part of our move to Mozilla and thinking about a free-as-in-speech Enso, I want to be more transparent with our design directions and goals. Our designs can only benefit by incorporating the criticism and suggestions of the community we have here. Open-source design is a balancing act between making final decisions and finding consensus. We hope to take the lessons that Jono spelled out in his excellent article on successfully humane open-source projects and use them in our own projects.
This post is about the new directions we are taking Enso. If you haven’t done so yet, start by reading about some of the motivations for doing some Enso redesign. In short:
- Enso shouldn’t make you type all of “open” every time
- Enso should be able to open paths and urls
- Enso should support international character input
- Enso should gracefully handle the case where there’s no convenient place to enter text
- Enso shouldn’t require you to type out text, select it, and then run a command when you’d rather run the command and then enter the text (think calculate)
- Enso shouldn’t make you hold down a key while typing lots of characters
We think we’ve solved these problems with our Enso 2.0 redesign. In this post, and possibly in follow-on posts, I’ll walk through the new stuff. I should note that our upcoming prototype will not yet have all of the features mentioned here.
Autocomplete
One of the major design challenges in Enso is balancing between descriptive names and short names. This is not a new problem: Unix command lines used short names instead of descriptive names. That’s why Unix ended up with such memorable commands as “df” and “tar -xvzf”. For Enso, the problem is particularly noticeable for commands that have similar beginnings. For example, the “translate to” family of commands makes my fingers unhappy. I have to type out all of “translate to” before choosing a language. Why? Because “translate from” is alphabetically before “translate to”, so that’s what gets autotyped when I hit tab.
I wrote about the autocomplete problem all the way back in March of 2007. We were hoping to be able to release a much improved autocomplete a month later, but ran into both usability and performance problems that kept pushing the release date back and back.
Our new autocomplete allowed you to select a command by typing bits of the command name. You could even type the parts that were most memorable to you. For instance, “offox” would match “open firefox”, and “trantojap” would match “translate to japanese”. We found that this worked very well most, but not all, of the time. When the command you wanted came up (90% of the time), it was like magic. When something you didn’t want came up, it was like a curse; you had no idea what went wrong or how to fix it. This stems from our “best match” highlighting algorithms; they were clever, which meant that it was sometimes hard to know what to expect. Adding just one character could drastically change where the typed characters appeared, often without changing the matched command. For example, if you’ve entered “of” and then type an “o”, the best match is still “open firefox”, but the highlighted characters within it can jump to different positions. You can see how it would be easy to lose track of what you’ve typed.
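To make the matching idea concrete, here is a minimal sketch in Python. It is not Enso's actual algorithm: it assumes the simplest possible rule (the typed characters must appear in the command name in order) and a greedy leftmost highlight. The names `match_positions` and `candidates` are made up for illustration.

```python
def match_positions(typed, command):
    """Return the indices in `command` matched by `typed`, or None.

    The typed characters must appear in order, but not necessarily
    next to each other. Each character greedily takes the leftmost
    position still available.
    """
    positions = []
    start = 0
    for ch in typed:
        idx = command.find(ch, start)
        if idx == -1:
            return None
        positions.append(idx)
        start = idx + 1
    return positions


def candidates(typed, commands):
    """Return the commands that contain `typed` as an in-order subsequence."""
    return [c for c in commands if match_positions(typed, c) is not None]


commands = ["open firefox", "translate to french", "translate to japanese"]

print(candidates("offox", commands))          # ['open firefox']
print(candidates("trantojap", commands))      # ['translate to japanese']
print(match_positions("of", "open firefox"))  # [0, 5]
```

A greedy highlight like this one is predictable, because adding a character never moves the positions already matched. A scoring-based "best match" is free to re-pick every position on each keystroke, which is the kind of jumpiness described above.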
Two Part Commands
The problem with using our autocomplete is that there were too many commands. And there were too many commands because we were conflating the idea of a command and its argument. While it is convenient to be able to type “google pants” and “open firefox” as a direct command, it is cumbersome to type “translate to french outrageous accent”, and impossible to type “calculate 4+(2+.5)^3/5” (shifted characters aren’t type-able while holding down the command key). Similarly, it is impossible to enter accented and foreign-language characters.
One of the more common complaints about Enso is that it is sometimes frustrating to type while holding down a key. If we had our druthers (and could eat them too), we would have special keys below the space-bar that your thumb could hold while typing. Entry into the hardware market is hard, so we went for the Caps Lock compromise.
In the new version of Enso we are separating the selection of the command from the entry of the argument. By disentangling the command from its argument, we not only type fewer characters in the quasimode and have a smaller number of commands to match against, but also gain more freedom in handling suggestions and matches. For example, we can now do things like offering Google suggestions in the “google” command.
Here’s how the command selection will eventually look after the user has held down Caps Lock and typed “o”:
It works like this: You hold down Caps Lock and start typing the name of the command you want e.g., open, go, google, calculate, etc. Because Enso is autocompleting to the relatively small number of commands, it can use the powerful autocomplete. You’ll now be able to type “ttojap” for “translate to japanese”. As soon as the command you want is selected, you let go of Caps Lock and a transparent entry area instantly comes up that lets you enter the rest of the command. When you are done, you hit enter.
Here are two examples:
To open Firefox, you hold down Caps Lock, type “o”, release Caps Lock, type “Fire” and hit enter.
To calculate, you hold down Caps Lock, type “cal”, release, type “31^(5/2.2)” and hit enter.
What happens if you want to calculate some text that you’ve already typed?
Select it, hold down Caps Lock, type “cal”, and release. The entry area will be pre-filled with your text. Hit enter and the result is inserted at your selection.
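The gesture described above can be sketched as a small state model. This is only an illustration under stated assumptions: `TwoPartSession` and its methods are hypothetical names, the command matcher is a stand-in (first alphabetical prefix match, not the autocomplete discussed earlier), and `calculate` is a toy evaluator that treats “^” as exponentiation.

```python
import ast
import operator

# Arithmetic operators the toy calculator understands.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def calculate(text):
    """Evaluate basic arithmetic, treating '^' as exponentiation."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        raise ValueError("unsupported expression")

    return ev(ast.parse(text.replace("^", "**"), mode="eval").body)


class TwoPartSession:
    """Hypothetical model of one command gesture."""

    def __init__(self, commands, selection=""):
        self.commands = commands    # command name -> function(argument text)
        self.selection = selection  # text selected before Caps Lock went down
        self.typed = ""             # characters typed while holding Caps Lock
        self.command = None
        self.entry = None           # entry-area text; None while it is closed

    def type_in_quasimode(self, chars):
        self.typed += chars

    def release_caps_lock(self):
        # Stand-in matcher: first command, alphabetically, with this prefix.
        matches = sorted(n for n in self.commands if n.startswith(self.typed))
        if not matches:
            raise LookupError(f"no command matches {self.typed!r}")
        self.command = matches[0]
        self.entry = self.selection  # the entry area opens pre-filled

    def type_in_entry(self, chars):
        self.entry += chars

    def press_enter(self):
        return self.commands[self.command](self.entry)


commands = {"calculate": calculate, "open": lambda target: f"opening {target}"}

session = TwoPartSession(commands, selection="4+(2+.5)^3/5")
session.type_in_quasimode("cal")
session.release_caps_lock()
print(session.entry)          # 4+(2+.5)^3/5
print(session.press_enter())  # 7.125
```

Note that the expression with shifted characters never passes through the quasimode at all; only “cal” is typed while Caps Lock is held.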
This is a pretty big change. A fundamental change, even. One worthy of debate. When we were discussing this design internally, we were worried that having to release Caps Lock might break the user’s chain of thought into two pieces: choosing the command and then remembering what you were going to do in that command. We found that not only did we Humanoids not have that problem, but we actually re-habituated to the new design very quickly. Not needing to hold down the Caps Lock key while typing long pieces of text is a boon, and getting to any particular command is actually faster. We also worried about the modality of the entry area, but I’ll come back to that in a minute. First, the benefits.
By paring the quasimodal portion of Enso down to just the command name, Enso has become the best keyboard shortcut system in existence. Novice users can type out the easy-to-learn-and-remember full command name (“open”). Long-time users can just use one-keystroke commands (“o”). Thanks to our new learning algorithm (I’ll come back to this in another section), those shortcuts will train themselves to your use patterns, yet never shift around on you. This means that the time cost for choosing a command is greatly reduced. Using Caps Lock “o” for open is key-wise equivalent to a hot key for calling up a dedicated launcher! Good design erases the line between beginners and experts.
One of the oddities in this design is that commands that used to operate immediately on a selection (like “uppercase”) now require a tap of the enter key after releasing Caps Lock. We’ve had a number of internal debates about this. There is something unsettling about forcing that extra key press, especially in the case of commands like “lower case” where it is almost inconceivable that you would use the command and then type into the entry area. But with commands like “calculate”, you often want to use the command without the need to first find a place with editable text to write an equation.
So why not have the command execute immediately if there’s a selection, and only require the entry area step with its enter-key tap when there isn’t a selection? That solution is certainly clever, and I wish it worked. What we discovered in testing was that the method wasn’t reliable enough. When you execute a command where you want to type the contents in the entry area, you aren’t thinking about whether or not there’s a selection. Trying to be clever made Enso just unpredictable enough to be really frustrating.
Possible future commands like “reveal”, which opens the folder containing a selected target, may not ever use a selection. In the end, we felt the need for the entry area (for commands like “calculate”) outweighed the oddness of tapping an additional key (for selection commands like “upper case”). And as always, we strove to keep Enso consistent; at least this way you can habituate to tapping the return key as part of the command gesture.
Isn’t The Entry Area Modal?
Stop the presses! Don’t you Humanized folks hate all things modal? Isn’t the entry area modal? The answer is that we aren’t certain, but we believe the answer is mostly “no”. We need user testing to know for sure. How can the entry area not be modal? Because the definition of a mode is:
A human-machine interface is modal with respect to a given gesture when (1) the current state of the interface is not the user’s locus of attention and (2) the interface will execute one among several different responses to the gesture, depending on the system’s current state.
Our argument is that the state of the system (e.g., that your keystrokes are going into the entry area and not into the application) will almost always be the user’s locus of attention. The entry area will never appear unless the user actively asks for it, so they are never surprised, never enter the information in the wrong spot, and never make mode errors.
In our informal testing we have found this to be true, except for one case. That is the case where you choose a command, and before you finish using the entry area, you become distracted by something on the screen and try to interact with it. Your locus of attention has been stolen, you no longer are thinking about the state of the system, and you make mode errors.
So we added a “resume” command to the design. If you are trying to interact with something else on your system, then Enso fades away with a transparent message telling you about the “resume” command. The resume command does exactly what it sounds like: it resumes the state of Enso exactly where you left it. This way, you don’t get tripped up where Enso is open and you are trying to do something else, and it’s trivial to get back to where you were.
Resume works the same way when you cancel a command by tapping the escape key. If you’ve gone through the trouble of entering data, the computer shouldn’t just throw that away. So even if you escape, you can resume.
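As a rough sketch of the resume behavior, assuming the only state worth keeping is the chosen command plus whatever was typed into the entry area, the bookkeeping could look like this. `EntryArea` and its method names are invented for illustration and are not Enso's API.

```python
class EntryArea:
    """Hypothetical model of the entry area with a one-slot resume buffer."""

    def __init__(self):
        self.active = None  # (command, text) while the entry area is open
        self.saved = None   # last dismissed state, kept for "resume"

    def open(self, command, text=""):
        self.active = (command, text)

    def type(self, chars):
        command, text = self.active
        self.active = (command, text + chars)

    def dismiss(self):
        """Called on Escape, or when the user interacts with something else."""
        if self.active is not None:
            self.saved = self.active  # keep the work instead of discarding it
            self.active = None

    def resume(self):
        """Reopen the entry area exactly where it was left, if anything was saved."""
        if self.saved is not None:
            self.active = self.saved
        return self.active


area = EntryArea()
area.open("calculate")
area.type("31^(5/")
area.dismiss()        # distracted by something else, or Escape was tapped
print(area.active)    # None
print(area.resume())  # ('calculate', '31^(5/')
```

Escape and losing focus go through the same `dismiss` path here, which mirrors the point that cancelling should never throw your typing away.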
For the quick among you, there’s another question to be asked. Isn’t the state of what will be resumed itself modal? Good question. We are once again in somewhat choppy waters that are best navigated by user testing. Our best guess, however, is that you’ll only use the resume command when you know exactly what’s in there, i.e., that it is your locus of attention (even if it is invisible). How is this different than the invisible buffer that is copy and paste? Only in one important way: resume is not being used for storing important content that might not exist elsewhere.
The Learning Algorithm
We’ve been fairly vocal about being wary of adaptive algorithms. Most implementations break one of our mantras of good interface design. They aren’t habituatable because bits of your interface keep moving around. A lot of our readers have been arguing persuasively that we were too heavy-handed in dismissing all adaptive interfaces. If the algorithm respects habituation, then we have no right to complain.
Having Enso adapt to your behavior is a much-requested feature, so we revisited this subject. As you use Enso 2.0, the commands you use most often start bubbling up, which is similar to most other adaptive systems. However, once Enso thinks you’ve habituated to something (using a particular set of keystrokes to mean a particular command), it locks that command in. What does this mean? It means that if you’ve been using “ca” to issue the “calendar” command, and then add the “calculate” command to Enso, then Enso will choose “calendar” over “calculate” for the keystrokes “ca“, even though it is alphabetically second, because you’ve habituated to “ca” being “calendar”. However, “cal” is still fair game for matching “calculate” because Enso knows you haven’t formed any habitual associations for that gesture.
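Here is one way the lock-in idea could be sketched. Everything specific in it is an assumption made for illustration: the class name `AdaptiveCompleter`, the prefix matching, ranking unlocked matches by total use, and the `lock_after` threshold of three uses standing in for “Enso thinks you’ve habituated”.

```python
from collections import Counter, defaultdict


class AdaptiveCompleter:
    """Hypothetical completer that adapts, then freezes habituated shortcuts."""

    def __init__(self, commands, lock_after=3):
        self.commands = set(commands)
        self.lock_after = lock_after      # assumed habituation threshold
        self.total = Counter()            # command -> overall use count
        self.uses = defaultdict(Counter)  # keystrokes -> command -> use count
        self.locked = {}                  # keystrokes -> habituated command

    def add_command(self, name):
        self.commands.add(name)

    def complete(self, typed):
        # A habituated gesture always wins, whatever has been added since.
        if typed in self.locked:
            return self.locked[typed]
        matches = [c for c in self.commands if c.startswith(typed)]
        if not matches:
            return None
        # Otherwise the most-used command bubbles up; ties go alphabetically.
        return min(matches, key=lambda c: (-self.total[c], c))

    def record(self, typed, command):
        """Note that the user ran `command` using the keystrokes `typed`."""
        self.total[command] += 1
        self.uses[typed][command] += 1
        if typed not in self.locked and self.uses[typed][command] >= self.lock_after:
            self.locked[typed] = command


completer = AdaptiveCompleter(["calendar", "open"])
for _ in range(3):
    completer.record("ca", "calendar")  # "ca" becomes a habit

completer.add_command("calculate")
print(completer.complete("ca"))         # calendar

for _ in range(3):
    completer.record("cal", "calculate")  # a separate habit forms for "cal"

print(completer.complete("cal"))        # calculate
print(completer.complete("ca"))         # calendar
```

The key design choice is that locks are stored per keystroke sequence rather than per command, so habituating to “ca” says nothing about “cal”.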
While this means that two Enso installations might autocomplete differently, that’s no worse than two Enso installations with different sets of commands. Eventually, you’ll want to be able to go to any computer and have Enso know who you are, so that all of your habits can be preserved (Weave, anyone?). But, that’s for much later.
Multiple Argument Commands
In Enso 1.0, the selection was an implicit argument. That’s how “open with” works: The selection is assumed to be the thing being opened, and the explicit (typed-in) argument is the application that does the opening. Something new in the Enso 2.0 design is that commands can have multiple implicit arguments. One of the commands I’ve wanted is the “email” command, which takes the current selection (be it a file or some text) and allows you to email it to a friend. Another is the “send to phone” command, which sends an SMS to a friend’s phone. Both commands benefit from not forcing you to write the message first.
There’s a design question left unanswered in how we name commands. Take the example of the “translate to X” family of commands. We have a choice:
(A) The language is included as part of the command name. That is “translate to japanese” is one command, and “translate to italian” is another.
(B) “translate to” is the command name and the language is selected as an argument in the entry area.
This doesn’t just affect the translate command family, but any family of commands that takes two arguments, the first of which determines the behavior of the command. For instance, there is the “convert” command family, which does conversions from one unit to another. Does the unit being converted to go in the command name? Or as an argument? We’re still working on this one. The issue is worth a blog post in its own right. I’d love people’s thoughts.
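To see what the two options mean for the command list, here is a small sketch. The `translate` function is a placeholder rather than a real translation call, the language list is arbitrary, and the way option (B) takes its text from the selection is an assumption about how that design might work.

```python
LANGUAGES = ["french", "italian", "japanese"]


def translate(text, language):
    # Placeholder: a real command would call a translation service.
    return f"<{text!r} in {language}>"


# Option (A): one command per language; the entry area holds the text.
commands_a = {
    f"translate to {language}": (lambda text, language=language: translate(text, language))
    for language in LANGUAGES
}


# Option (B): a single command; the entry area holds the language and
# the text to translate comes from the selection.
def translate_to(selection, argument):
    if argument not in LANGUAGES:
        raise ValueError(f"unknown language: {argument}")
    return translate(selection, argument)


commands_b = {"translate to": translate_to}

result_a = commands_a["translate to japanese"]("hello")
result_b = commands_b["translate to"]("hello", "japanese")

print(len(commands_a), len(commands_b))  # 3 1
print(result_a == result_b)              # True
```

Option (A) grows the list the autocomplete has to match against with every language added, while option (B) keeps the list short but uses up the entry area on the language, so the text has to come from somewhere else.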
Conclusion
When it comes down to it, Enso 2.0 isn’t really much more complicated than Enso 1.0. We’ve split commands into two parts, and we’ve made the autocomplete and suggestions better. That’s pretty much it. Yet, I think that all of the problems we set out to solve have been solved, with a minimum of fuss:
- Enso shouldn’t make you type all of “open” every time
- Enso should be able to open paths and urls
- Enso should support international character input
- Enso should gracefully handle the case where there’s no convenient place to enter text
- Enso shouldn’t require you to type out text, select it, and then run a command when you’d rather run the command and then enter the text (think calculate)
- Enso shouldn’t make you hold down a key while typing lots of characters
There’s a complex process that goes into making products that are simple. Our actual design documents are many times the length of this blog post. They deal with many of the gritty and boring details that I’ve left out here, like colors, fonts, edge cases, state charts, ramifications, etc. Of course, parts of those documents are dedicated to the fun and exotic. Think command chaining/piping. I’ll write about that in a future blog post.
For now, I and the entire Humanized team would love to hear your feedback on the design. Oh, and remember that not all of this is implemented in the soon-to-be-released Enso 2.0 prototype.
