As many of you will know, I am quite the usability pervert. Understanding how people use computers and creating better and more intuitive interfaces fires me up, and the mere idea of GNOME 3.0 is interesting to me. The reason why I find GNOME 3.0 exciting is that it presents a dream for us; we are not entirely sure where we are going, but we know it needs to be different, intuitive and better for our users.
While at GUADEC I sat down for a while with MacSlow, and he gave me a demo of Lowfat. For quite some time I have had some vague ideas for the approximate direction of GNOME 3.0, and some of Mirco’s work triggered some of these latent concepts and mental scribblings. I am still not 100% sure where I would like to see GNOME 3.0 go, but some of the fundamental concepts are solidifying, and with my recent addition to the mighty Planet GNOME, I figured I should share some ideas and hopefully cause some useful discussion. I am going to wander through a few different concepts and discuss how we should make use of them in GNOME 3.0, culminating in some ideas and food for thought.
A more organic desktop
One of the problems with the current GNOME is that it largely ignores true spatial interaction. Sure, we have spatial Nautilus (which is switched off in many distros), but spatial interaction goes much further. If you look at many desktop users across all platforms, the actual desktop itself serves as a ground for immediate thinking and ad-hoc planning in many different ways:
- Immediate Document Handling – the desktop acts as an initial ground for dealing with documents. Items are downloaded to the desktop and poked at before entering more important parts of the computer such as the all-important organised filesystem.
- Grouping – users use the desktop to group related things together. This is pure and simple pile theory. The idea is that people naturally group related things together into piles. Look at your desk right now – I bet you have things grouped together in related piles and collections. We don’t maximise this natural human construct on our desktop. More on this later.
- Deferred Choices – the desktop serves as a means to defer choices. This is when the user does not want to immediately tend to a task or needs to attend to the task later. An example of this is if you need to remember to take a DVD to work the next day – you typically leave it next to the front door or with your car keys. The analogy with the current desktop is that you would set an alarm in a special reminder tool. More people set ad-hoc reminders than alarms.
- Midstream Changes – it is common for users to begin a task, and then get distracted or start doing something else. An example of this is if you start making a website. You may make the initial design, and then need to create graphics and get distracted playing with different things in Inkscape. The desktop often acts as a midstream dumping ground for these things. Work-in-progress documents are often placed on the desktop, and this acts as a reminder to pick them up later (see deferred choices earlier).
It is evident that the desktop is an area that provides a lot of utility, and this utility maps to organic human interaction – collecting things together, making piles, creating collections, setting informal reminders, grouping related things. These are all operations on things, and are the same kind of operations we do in our normal lives.
Part of the problem with our current desktop is that there is a dividing line between things on the desktop and things elsewhere. It is a mixed maze of meta-data and inter-connected entities that should be part of the desktop itself. As an example, when I was writing my book, I created word processor documents, kept notes about the book in Tomboy, saved bookmarks in Firefox and kept communications in Gmail and Gaim. The single effort of writing a book involved each of these disparate, unconnected resources storing different elements of my project. I would instead like to see these things (a) much more integrated into contextual projects, and (b) manageable at a desktop level. More on contextual projects later.
Blurring the line between files, functions and applications
The problem we have right now is that the desktop is just not as integrated as it could be. If you want to manage files, you do it in a file manager; if you want to do something with those files, you do it in an application; if you want to collect files together into a unit, you use an archive manager. Much of this can be done on the desktop itself, but we need to identify use cases and approach the problem from a document-type level.
Let me give you an example. One common type of media is images, such as pictures and photographs. The different things you may want to do with those images include:
- Open them
- Edit them
- Compare them
- Collect them together by some form of relevance (such as photos from a trip, or pictures of mum and dad)
- Search for them
These tasks involve a combination of file management, photo editing applications, photo viewing applications and desktop search. Imagine this use case instead:
I want to look through my photos. To do this I jump to the ‘photo collection’ part of my desktop (no directories) and my collection has different piles of photos. I can then double-click on a pile and it opens up in front of me. Each photo can be picked up and moved around in a physically similar way to a normal desk (this is inspired by MacSlow’s LowFat). I can also spin photos over and write notes and details on the back of them. Using my photos I can put two side by side and increase the size to compare them, or select a number of photos and make them into a pile. This pile can then be transported around as a unit, and maybe flicked through like a photo album. All of this functionality is occurring at the desktop level – I never double-click a photo to load it into a photo viewer, I just make the photo bigger to look at it. All of the manipulation and addition of meta-data (by writing on the back of the photo) is within the realm of real world object manipulation, and obeys pile theory and spatial orientation.
The point here is that the objects on the desktop (which today’s desktop treats as icons) behave like actual real world objects that can be interacted with in a way that is relevant to their type. In the above use case you can make the items bigger to view them, compare them side by side, and scribble notes on them. These operations are unique to certain types of document and not others. You would not zoom, compare and scribble notes on audio for example, but you would certainly use pile theory on audio to collect related audio together (such as albums).
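To make the idea of type-specific interaction a little more concrete, here is a minimal Python sketch. Everything in it (the `DesktopObject` and `Pile` classes, the `OPERATIONS` table and the file names) is made up for illustration and is not taken from MacSlow’s work or any GNOME code. It only shows how each kind of object could carry its own set of sensible operations, while piles work for everything.

```python
from dataclasses import dataclass, field

# Hypothetical table: which operations make sense for each kind of object.
# Photos can be resized, compared and annotated; audio can only be piled.
OPERATIONS = {
    "photo": {"resize", "compare", "annotate", "pile"},
    "audio": {"pile"},
}


@dataclass
class DesktopObject:
    name: str
    kind: str
    notes: list = field(default_factory=list)  # notes written "on the back"

    def can(self, operation):
        """Return True if this operation is relevant to this kind of object."""
        return operation in OPERATIONS.get(self.kind, set())

    def annotate(self, text):
        if not self.can("annotate"):
            raise ValueError(f"cannot write notes on {self.kind} objects")
        self.notes.append(text)


@dataclass
class Pile:
    label: str
    items: list = field(default_factory=list)

    def add(self, obj):
        if not obj.can("pile"):
            raise ValueError(f"{obj.kind} objects cannot be piled")
        self.items.append(obj)


photo = DesktopObject("beach.jpg", "photo")
song = DesktopObject("track01.ogg", "audio")

photo.annotate("Holiday, day two")  # allowed: photos have a back to write on
album = Pile("Favourite album")
album.add(song)                     # piles work for audio too
```

Asking the object what it can do, rather than asking an application, is the whole point: `song.can("compare")` is simply false, so a compare tool would never be offered for audio.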
So, if we are trying to keep interaction with objects at the desktop level, how exactly do we edit them and create new content? How do today’s applications fit into this picture? Well, let me explain…
The problem with many applications is that they provide an unorganised collection of modal tools that are not related to context in any way. I have been thinking about this a lot recently with regards to Jokosher, and this was discussed in my talk at GUADEC. Take, for example, Steinberg’s Cubase:
In Cubase, if you want to perform an operation, you need to enable a tool, perform the operation, disable the tool and then do something else. There is a lot of tool switching going on, and toolbar icons are always displayed, often in situations where the tool either cannot be used or would make no sense to use. The problem is that it obeys the philosophy of ‘always show lots of tools onscreen, as it makes the app look more professional’. Sure, it may look professional, but it has a detrimental impact on usability.
I believe that tools should only ever be displayed when pertinent to the context. As an example, in Jokosher we have a bunch of waveforms:
The first point here is that we don’t display the typical waveforms you see in other applications. Waveforms are usually used to indicate the level in a piece of audio, and as such, we figured that musicians just want to see essentially a line graph, instead of the spiky waveform in most applications. This immediately cuts down the amount of irrelevant information on screen. Now, if you select a portion of the wave in Jokosher, a tray drops down:
(well, at the moment, it drops down, but does not visually look like a tray, so run with me on this for a bit!)
Inside this tray are buttons for tools that are relevant to that specific selection. Here we are only ever displaying the tools that are pertinent to the context, and this has a few different benefits:
- We don’t bombard our users with endless toolbars
- Tools are always contextual, which makes the interface more discoverable and intuitive
- We restrict the potential for error by restricting the number of tools available for a given context
- There are fewer buttons to accidentally click on, and this lowers the hamfistability of our desktop 😛
Now, take this theory of contextual tools, and apply it at the desktop level. Using our example from earlier with photos, I would like to see contextual tools appear when you view a photo at a particular size. So, as an example, if I have my collection of photos and I increase the size of a photo to look at it, I would like to see some context toolbars float up when I hover my mouse over the photo to allow me to make selections. When I have made a selection I should then see more tools appear. There are two important points here:
- Firstly, you don’t load the photo into an application. As you view the photo at the desktop level, the functionality associated with an editing application seamlessly appears as contextual tools. This banishes the concept of applications. Instead, you deal with things, and interact with those things immediately.
- Secondly, tools are always contextual, and relevant to the media type. For a photo or a document, selections make sense; for an audio file, you should be able to apply effects and adjust the volume; for text, you should be able to change font properties. Everything is relevant to the context.
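The two points above boil down to a lookup from context to tools. Here is a rough Python sketch of that idea. The function name `contextual_tools`, the tool names and the `crop` and `copy` follow-up tools are all assumptions of mine for illustration; this is not how Jokosher or any GNOME component is actually implemented.

```python
def contextual_tools(kind, hovered=False, has_selection=False):
    """Return only the tools that are relevant to the current context."""
    if not hovered:
        # Nothing floats up until the pointer is over the object.
        return []
    if kind in ("photo", "document"):
        tools = ["select"]
        if has_selection:
            # Assumed follow-up tools; the post does not name them.
            tools += ["crop", "copy"]
        return tools
    if kind == "audio":
        return ["effects", "volume"]
    if kind == "text":
        return ["font"]
    return []


print(contextual_tools("photo", hovered=True))
print(contextual_tools("photo", hovered=True, has_selection=True))
print(contextual_tools("audio", hovered=True))
```

Notice there is no toolbar to hide or grey out: a tool that is not pertinent is simply never returned, so it can never be clicked by accident.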
The contextual desktop at a project level
To really make the desktop feel contextual, we need to be able to make it sensitive to projects and tasks. At the moment, tasks and sub-tasks are dealt with on a per-task basis as opposed to being part of a bigger, grander picture. Let me give you a use case with today’s desktop:
I am working on a project with a client to build a website for him. I decide I want to send some emails to him, so I fire up my email client and dig through my inbox to find the mail he sent me yesterday. I then reply to him. As I work, I realise that I need to speak to him urgently, so I log on to IM. Within my buddy list I see that he is there, so I have a chat with him. While chatting, my friend pops up to discuss the most recent Overkill album. As I am working, I don’t really want to talk about the album right now, so I either ignore him or make my excuses. After finishing the discussion with the client, I load up Firefox and hunt through my bookmarks to find a relevant page and start merging the content into the customer’s site. To do this I load up Bluefish and look through the filesystem to find my work files and begin the job.
The problem here is that the relevant work is buried deep in other irrelevant items. To make matters worse, some resources such as IM can just prove too distracting, and may never get used (remember midstream changes earlier). As such, the really valuable medium of IM is never used for fear of distraction. Now imagine the use case as this:
I am working on a project with a client to build a website for him. I find the project in my collection of projects and enable it, and my entire desktop switches context to that specific project. Irrelevant applications such as games are hidden, and relevant resources are prioritised. When I communicate with the client, only emails and buddies relevant to that project are displayed. When I want to find resources (such as documents) to work on, only those documents that are part of the project are displayed. The entire desktop switches to become aligned to my current working project. This makes me less distracted and more focused, and there is less clutter to trip over.
I actually had this idea a while back, and wrote an article describing it in more detail. Feel free to have a read of it.
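For the more code-minded among you, the project-switching use case can be sketched in a few lines of Python. This is only a sketch under my own assumptions: the `Resource` class, the `visible_resources` function and the `client-website` project name are all hypothetical, and a real desktop would need somewhere to store which resources belong to which project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str            # e.g. "email", "buddy", "document", "application"
    projects: frozenset  # names of the projects this resource belongs to


def visible_resources(resources, active_project, kind=None):
    """Return the resources shown while active_project is the desktop context."""
    return [
        r for r in resources
        if active_project in r.projects and (kind is None or r.kind == kind)
    ]


# Hypothetical desktop contents for the website use case.
desktop = [
    Resource("Client", "buddy", frozenset({"client-website"})),
    Resource("Metal-loving friend", "buddy", frozenset()),
    Resource("index.html", "document", frozenset({"client-website"})),
    Resource("Solitaire", "application", frozenset()),
]

buddies = visible_resources(desktop, "client-website", kind="buddy")
print([r.name for r in buddies])
```

With the project enabled, the buddy list contains only the client, so the Overkill conversation never gets the chance to interrupt.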
The point of this blog post is not to sell you these concepts, but to identify some better ways of working which are more intuitive and more discoverable. Importantly, we need to make our desktop feel familiar. This was a point Jef Raskin made as part of his work, and I agree. Some people have been proposing some pretty wacky ideas for GNOME 3.0, but grandiose UI statements mean nothing unless they feel familiar and intuitive. What I am proposing is an implementation of real world context, relevance and physics into our desktop. This will make it more intuitive, less cluttered, less distracting and a better user experience.
I really want to encourage some genuine discussion around this, so feel free to add comments to this blog post, or reply via Planet GNOME. Have fun!