Journal tags: interaction

22

sparkline

The datalist element on iOS

The datalist element is good. It was a bit bumpy there for a while, but browser implementations have improved over time. Now it’s by far the simplest and most robust way to create an autocompleting combobox widget.

Hook up an input element with a datalist element using the list and id attributes and you’re done. You can even use a bit of Ajax to dynamically update the option elements inside the datalist in response to the user’s input. The browser takes care of all the interaction. If you try to roll your own combobox implementation, it’s almost certainly going to involve a lot of JavaScript and still probably won’t account for all use cases.

Safari on iOS—and therefore all browsers on iOS—didn’t support datalist for quite a while. But once it finally shipped, it worked really nicely. The options showed up just like automplete suggestions above the keyboard.

But that broke a while back.

The suggestions still appeared, but if you tapped on one of them, nothing happened. The input element didn’t get updated. You had to tap on a little downward arrow inside the input in order to see the list of options.

That was really frustrating for anybody on iOS using The Session. By far the most common task on the site is searching for a tune, something that’s greatly (progressively) enhanced with a dynamically-updating datalist.

I just updated to iOS 18 specifically to see if this bug has been fixed, and it has:

Fixed updating the input value when selecting an option from a datalist element.

Hallelujah!

But now there’s some additional behaviour that’s a little weird.

As well as showing the options in the autocomplete list above the keyboard, Safari on iOS—and therefore all browsers on iOS—also pops up the options as a list (as if you had tapped on that downward arrow). If the list is more than a few options long, it completely obscures the input element you’re typing into!

I’m not sure if this is a bug or if it’s the intended behaviour. It feels like a bug, but I don’t know if I should file something.

For now, I’ve updated the datalist elements on The Session to only ever hold three option elements in order to minimise the problem. Seeing as the autosuggest list above the keyboard only ever shows a maximum of three suggestions anyway, this feels like a reasonable compromise.

Disclosure

You know how when you’re on hold to any customer service line you hear a message that thanks you for calling and claims your call is important to them. The message always includes a disclaimer about calls possibly being recorded “for training purposes.”

Nobody expects that any training is ever actually going to happen—surely we would see some improvement if that kind of iterative feedback loop were actually in place. But we most certainly want to know that a call might be recorded. Recording a call without disclosure would be unethical and illegal.

Consider chatbots.

If you’re having a text-based (or maybe even voice-based) interaction with a customer service representative that doesn’t disclose its output is the result of large language models, that too would be unethical. But, at the present moment in time, it would be perfectly legal.

That needs to change.

I suspect the necessary legislation will pass in Europe first. We’ll see if the USA follows.

In a way, this goes back to my obsession with seamful design. With something as inherently varied as the output of large language models, it’s vital that people have some way of evaluating what they’re told. I believe we should be able to see as much of the plumbing as possible.

The bare minimum amount of transparency is revealing that a machine is in the loop.

This shouldn’t be a controversial take. But I guarantee we’ll see resistance from tech companies trying to sell their “AI” tools as seamless, indistinguishable drop-in replacements for human workers.

Control

In two of my recent talks—In And Out Of Style and Design Principles For The Web—I finish by looking at three different components:

  1. a button,
  2. a dropdown, and
  3. a datepicker.

In each case you could use native HTML elements:

  1. button,
  2. select, and
  3. input type="date".

Or you could use divs with a whole bunch of JavaScript and ARIA.

In the case of a datepicker, I totally understand why you’d go for writing your own JavaScript and ARIA. The native HTML element is quite restricted, especially when it comes to styling.

In the case of a dropdown, it’s less clear-cut. Personally, I’d use a select element. While it’s currently impossible to style the open state of a select element, you can style the closed state with relative ease. That’s good enough for me.

Still, I can understand why that wouldn’t be good enough for some cases. If pixel-perfect consistency across platforms is a priority, then you’re going to have to break out the JavaScript and ARIA.

Personally, I think chasing pixel-perfect consistency across platforms isn’t even desirable, but I get it. I too would like to have more control over styling select elements. That’s one of the reasons why the work being done by the Open UI group is so important.

But there’s one more component: a button.

Again, you could use the native button element, or you could use a div or a span and add your own JavaScript and ARIA.

Now, in this case, I must admit that I just don’t get it. Why wouldn’t you just use the native button element? It has no styling issues and the browser gives you all the interactivity and accessibility out of the box.

I’ve been trying to understand the mindset of a developer who wouldn’t use a native button element. The easy answer would be that they’re just bad people, and dismiss them. But that would probably be lazy and inaccurate. Nobody sets out to make a website with poor performance or poor accessibility. And yet, by choosing not to use the native HTML element, that’s what’s likely to happen.

I think I might have finally figured out what might be going on in the mind of such a developer. I think the issue is one of control.

When I hear that there’s a native HTML element—like button or select—that comes with built-in behaviours around interaction and accessibility, I think “Great! That’s less work for me. I can just let the browser deal with it.” In other words, I relinquish control to the browser (though not entirely—I still want the styling to be under my control as much as possible).

But I now understand that someone else might hear that there’s a native HTML element—like button or select—that comes with built-in behaviours around interaction and accessibility, and think “Uh-oh! What if there unexpected side-effects of these built-in behaviours that might bite me on the ass?” In other words, they don’t trust the browsers enough to relinquish control.

I get it. I don’t agree. But I get it.

If your background is in computer science, then the ability to precisely predict how a programme will behave is a virtue. Any potential side-effects that aren’t within your control are undesirable. The only way to ensure that an interface will behave exactly as you want is to write it entirely from scratch, even if that means using more JavaScript and ARIA than is necessary.

But I don’t think it’s a great mindset for the web. The web is filled with uncertainties—browsers, devices, networks. You can’t possibly account for all of the possible variations. On the web, you have to relinquish some control.

Still, I’m glad that I now have a bit more insight into why someone would choose to attempt to retain control by using div, JavaScript and ARIA. It’s not what I would do, but I think I understand the motivation a bit better now.

aria-live

I wrote a little something recently about using ARIA attributes as selectors in CSS. For me, one of the advantages is that because ARIA attributes are generally added via JavaScript, the corresponding CSS rules won’t kick in if something goes wrong with the JavaScript:

Generally, ARIA attributes—like aria-hidden—are added by JavaScript at runtime (rather than being hard-coded in the HTML).

But there’s one instance where I actually put the ARIA attribute directly in the HTML that gets sent from the server: aria-live.

If you’re not familiar with it, aria-live is extremely useful if you’ve got any dynamic updates on your page—via Ajax, for example. Let’s say you’ve got a bit of your site where filtered results will show up. Slap an aria-live attribute on there with a value of “polite”:

<div aria-live="polite">
...dynamic content gets inserted here
</div>

You could instead provide a value of “assertive”, but you almost certainly don’t want to do that—it can be quite rude.

Anyway, on the face it, this looks like exactly the kind of ARIA attribute that should be added with JavaScript. After all, if there’s no JavaScript, there’ll be no dynamic updates.

But I picked up a handy lesson from Ire’s excellent post on using aria-live:

Assistive technology will initially scan the document for instances of the aria-live attribute and keep track of elements that include it. This means that, if we want to notify users of a change within an element, we need to include the attribute in the original markup.

Good to know!

ARIA in CSS

Sara tweeted something recently that resonated with me:

Also, Pro Tip: Using ARIA attributes as CSS hooks ensures your component will only look (and/or function) properly if said attributes are used in the HTML, which, in turn, ensures that they will always be added (otherwise, the component will obv. be broken)

Yes! I didn’t mention it when I wrote about accessible interactions but this is my preferred way of hooking up CSS and JavaScript interactions. Here’s old Codepen where you can see it in action:

[aria-hidden='true'] {
  display: none;
}

In order for the functionality to work for everyone—screen reader users or not—I have to make sure that I’m toggling the value of aria-hidden in my JavaScript.

There’s another advantage to this technique. Generally, ARIA attributes—like aria-hidden—are added by JavaScript at runtime (rather than being hard-coded in the HTML). If something goes wrong with the JavaScript, the aria-hidden value isn’t set to “true”, which means that the CSS never kicks in. So the default state is for content to be displayed. There’s no assumption that the JavaScript has to work in order for the CSS to make sense.

It’s almost as though accessibility and progressive enhancement are connected somehow…

Accessible interactions

Accessibility on the web is easy. Accessibility on the web is also hard.

I think it’s one of those 80/20 situations. The most common accessibility problems turn out to be very low-hanging fruit. Take, for example, Holly Tuke’s list of the 5 most annoying website features she faces as a blind person every single day:

  • Unlabelled links and buttons
  • No image descriptions
  • Poor use of headings
  • Inaccessible web forms
  • Auto-playing audio and video

None of those problems are hard to fix. That’s what I mean when I say that accessibility on the web is easy. As long as you’re providing a logical page structure with sensible headings, associating form fields with labels, and providing alt text for images, you’re at least 80% of the way there (you’re also doing way better than the majority of websites, sadly).

Ah, but that last 20% or so—that’s where things get tricky. Instead of easy-to-follow rules (“Always provide alt text”, “Always label form fields”, “Use sensible heading levels”), you enter an area of uncertainty and doubt where there are no clear answers. Different combinations of screen readers, browsers, and operating systems might yield very different results.

This is the domain of interaction design. Here be dragons. ARIA can help you …but if you overuse its power, it may cause more harm than good.

When I start to feel overwhelmed by this, I find it’s helpful to take a step back. Instead of trying to imagine all the possible permutations of screen readers and browsers, I start with a more straightforward use case: keyboard users. Keyboard users are (usually) a subset of screen reader users.

The pattern that comes up the most is to do with toggling content. I suppose you could categorise this as progressive disclosure, but I’m talking about quite a wide range of patterns:

  • accordions,
  • menus (including mega menu monstrosities),
  • modal dialogs,
  • tabs.

In each case, there’s some kind of “trigger” that toggles the appearance of a “target”—some chunk of content.

The first question I ask myself is whether the trigger should be a button or a link (at the very least you can narrow it down to that shortlist—you can discount divs, spans, and most other elements immediately; use a trigger that’s focusable and interactive by default).

As is so often the case, the answer is “it depends”, but generally you can’t go wrong with a button. It’s an element designed for general-purpose interactivity. It carries the expectation that when it’s activated, something somewhere happens. That’s certainly true in all the examples I’ve listed above.

That said, I think that links can also make sense in certain situations. It’s related to the second question I ask myself: should the target automatically receive focus?

Again, the answer is “it depends”, but here’s the litmus test I give myself: how far away from each other are the trigger and the target?

If the target content is right after the trigger in the DOM, then a button is almost certainly the right element to use for the trigger. And you probably don’t need to automatically focus the target when the trigger is activated: the content already flows nicely.

<button>Trigger Text</button>
<div id="target">
<p>Target content.</p>
</div>

But if the target is far away from the trigger in the DOM, I often find myself using a good old-fashioned hyperlink with a fragment identifier.

<a href="#target">Trigger Text</a>
…
<div id="target">
<p>Target content.</p>
</div>

Let’s say I’ve got a “log in” link in the main navigation. But it doesn’t go to a separate page. The design shows it popping open a modal window. In this case, the markup for the log-in form might be right at the bottom of the page. This is when I think there’s a reasonable argument for using a link. If, for any reason, the JavaScript fails, the link still works. But if the JavaScript executes, then I can hijack that link and show the form in a modal window. I’ll almost certainly want to automatically focus the form when it appears.

The expectation with links (as opposed to buttons) is that you will be taken somewhere. Let’s face it, modal dialogs are like fake web pages so following through on that expectation makes sense in this context.

So I can answer my first two questions:

  • “Should the trigger be a link or button?” and
  • “Should the target be automatically focused?”

…by answering a different question:

  • “How far away from each other are the trigger and the target?”

It’s not a hard and fast rule, but it helps me out when I’m unsure.

At this point I can write some JavaScript to make sure that both keyboard and mouse users can interact with the interactive component. There’ll certainly be an addEventListener(), some tabindex action, and maybe a focus() method.

Now I can start to think about making sure screen reader users aren’t getting left out. At the very least, I can toggle an aria-expanded attribute on the trigger that corresponds to whether the target is being shown or not. I can also toggle an aria-hidden attribute on the target.

When the target isn’t being shown:

  • the trigger has aria-expanded="false",
  • the target has aria-hidden="true".

When the target is shown:

  • the trigger has aria-expanded="true",
  • the target has aria-hidden="false".

There’s also an aria-controls attribute that allows me to explicitly associate the trigger and the target:

<button aria-controls="target">Trigger Text</button>
<div id="target">
<p>Target content.</p>
</div>

But don’t assume that’s going to help you. As Heydon put it, aria-controls is poop. Still, Léonie points out that you can still go ahead and use it. Personally, I find it a useful “hook” to use in my JavaScript so I know which target is controlled by which trigger.

Here’s some example code I wrote a while back. And here are some old Codepens I made that use this pattern: one with a button and one with a link. See the difference? In the example with a link, the target automatically receives focus. But in this situation, I’d choose the example with a button because the trigger and target are close to each other in the DOM.

At this point, I’ve probably reached the limits of what can be abstracted into a single trigger/target pattern. Depending on the specific component, there might be much more work to do. If it’s a modal dialog, for example, you’ve got to figure out where to put the focus, how to trap the focus, and figure out where the focus should return to when the modal dialog is closed.

I’ve mostly been talking about websites that have some interactive components. If you’re building a single page app, then pretty much every single interaction needs to be made accessible. Good luck with that. (Pro tip: consider not building a single page app—let the browser do what it has been designed to do.)

Anyway, I hope this little stroll through my thought process is useful. If nothing else, it shows how I attempt to cope with an accessibility landscape that looks daunting and ever-changing. Remember though, the fact that you’re even considering this stuff means you care more than most web developers. And you are not alone. There are smart people out there sharing what they learn. The A11y Project is a great hub for finding resources.

And when it comes to interactive patterns like the trigger/target examples I’ve been talking about, there’s one more question I ask myself: what would Heydon do?

Unobtrusive feedback

Ten years ago I gave a talk at An Event Apart all about interaction design. It was called Paranormal Interactivity. You can watch the video, listen to the audio or read the transcript if you like.

I think it holds up pretty well. There’s one interaction pattern in particular that I think has stood the test of time. In the talk, I introduce this pattern as something you can see in action on Huffduffer:

I was thinking about how to tell the user that something’s happened without distracting them from their task, and I thought beyond the web. I thought about places that provide feedback mechanisms on screens, and I thought of video games.

So we all know Super Mario, right? And if you think about when you’re collecting coins in Super Mario, it doesn’t stop the game and pop up an alert dialogue and say, “You have just collected ten points, OK, Cancel”, right? It just does it. It does it in the background, but it does provide you with a feedback mechanism.

The feedback you get in Super Mario is about the number of points you’ve just gained. When you collect an item that gives you more points, the number of points you’ve gained appears where the item was …and then drifts upwards as it disappears. It’s unobtrusive enough that it won’t distract you from the gameplay you’re concentrating on but it gives you the reassurance that, yes, you have just gained points.

I think this a neat little feedback mechanism that we can borrow for subtle Ajax interactions on the web. These are actions that don’t change much of the content. The user needs to be able to potentially do lots of these actions on a single page without waiting for feedback every time.

On Huffduffer, for example, you might be looking at a listing of people that you can choose to follow or unfollow. The mechanism for doing that is a button per person. You might potentially be clicking lots of those buttons in quick succession. You want to know that each action has taken effect but you don’t want to be interrupted from your following/unfollowing spree.

You get some feedback in any case: the button changes. Maybe the text updates from “follow” to “unfollow” accompanied by a change in colour (this is what you’ll see on Twitter). The Super Mario style feedback is in addition to that, rather than instead of.

I’ve made a Codepen so you can see a reduced test case of the Super Mario feedback in action.

See the Pen Unobtrusive feedback by Jeremy Keith (@adactio) on CodePen.

Here’s the code available as a gist.

It’s a function that takes two arguments: the element that the feedback originates from (pass in a DOM node reference for this), and the contents of the feedback (this can be a string of text or it can be HTML …or SVG). When you call the function with those two arguments, this is what happens:

  1. The JavaScript generates a span element and puts the feedback contents inside it.
  2. Then it positions that element right over the element that the feedback originates from.
  3. Then there’s a CSS transform. The feedback gets a translateY applied so it drifts upward. At the same time it gets its opacity reduced from 1 to 0 so it’s fading away.
  4. Finally there’s a transitionend event that fires when the animation is over. Once that event fires, the generated span is destroyed.

When I first used this pattern on Huffduffer, I’m pretty sure I was using jQuery. A few years later I rewrote it in vanilla JavaScript. That was four years ago so I wonder if the code could be improved. Have a go if you fancy it.

Still, even if the code could benefit from an update, I’m pleased that the underlying pattern still holds true. I used it recently on The Session and it’s working a treat for a new Ajax interaction there (bookmarking or unbookbarking an item).

If you end up using this unobtrusive feedback pattern anyway, please let me know—I’d love to see more examples of it in the wild.

Voice User Interface Design by Cheryl Platz

Cheryl Platz is speaking at An Event Apart Chicago. Her inaugural An Event Apart presentation is all about voice interfaces, and I’m going to attempt to liveblog it…

Why make a voice interface?

Successful voice interfaces aren’t necessarily solving new problems. They’re used to solve problems that other devices have already solved. Think about kitchen timers. There are lots of ways to set a timer. Your oven might have one. Your phone has one. Why use a $200 device to solve this mundane problem? Same goes for listening to music, news, and weather.

People are using voice interfaces for solving ordinary problems. Why? Context matters. If you’re carrying a toddler, then setting a kitchen timer can be tricky so a voice-activated timer is quite appealing. But why is voice is happening now?

Humans have been developing the art of conversation for thousands of years. It’s one of the first skills we learn. It’s deeply instinctual. Most humans use speach instinctively every day. You can’t necessarily say that about using a keyboard or a mouse.

Voice-based user interfaces are not new. Not just the idea—which we’ve seen in Star Trek—but the actual implementation. Bell Labs had Audrey back in 1952. It recognised ten words—the digits zero through nine. Why did it take so long to get to Alexa?

In the late 70s, DARPA issued a challenge to create a voice-activated system. Carnagie Mellon came up with Harpy (with a thousand word grammar). But none of the solutions could respond in real time. In conversation, we expect a break of no more than 200 or 300 milliseconds.

In the 1980s, computing power couldn’t keep up with voice technology, so progress kind of stopped. Time passed. Things finally started to catch up in the 90s with things like Dragon Naturally Speaking. But that was still about vocabulary, not grammar. By the 2000s, small grammars were starting to show up—starting an X-Box or pausing Netflix. In 2008, Google Voice Search arrived on the iPhone and natural language interaction began to arrive.

What makes natural language interactions so special? It requires minimal training because it uses the conversational muscles we’ve been working for a lifetime. It unlocks the ability to have more forgiving, less robotic conversations with devices. There might be ten different ways to set a timer.

Natural language interactions can also free us from “screen magnetism”—that tendency to stay on a device even when our original task is complete. Voice also enables fast and forgiving searches of huge catalogues without time spent typing or browsing. You can pick a needle straight out of a haystack.

Natural language interactions are excellent for older customers. These interfaces don’t intimidate people without dexterity, vision, or digital experience. Voice input often leads to more inclusive experiences. Many customers with visual or physical disabilities can’t use traditional graphical interfaces. Voice experiences throw open the door of opportunity for some people. However, voice experience can exclude people with speech difficulties.

Making the case for voice interfaces

There’s a misconception that you need to work at Amazon, Google, or Apple to work on a voice interface, or at least that you need to have a big product team. But Cheryl was able to make her first Alexa “skill” in a week. If you’re a web developer, you’re good to go. Your voice “interaction model” is just JSON.

How do you get your product team on board? Find the customers (and situations) you might have excluded with traditional input. Tell the stories of people whose hands are full, or who are vision impaired. You can also point to the adoption rate numbers for smart speakers.

You’ll need to show your scenario in context. Otherwise people will ask, “why can’t we just build an app for this?” Conduct research to demonstrate the appeal of a voice interface. Storyboarding is very useful for visualising the context of use and highlighting existing pain points.

Getting started with voice interfaces

You’ve got to understand how the technology works in order to adapt to how it fails. Here are a few basic concepts.

Utterance. A word, phrase, or sentence spoken by a customer. This is the true form of what the customer provides.

Intent. This is the meaning behind a customer’s request. This is an important distinction because one intent could have thousands of different utterances.

Prompt. The text of a system response that will be provided to a customer. The audio version of a prompt, if needed, is generated separately using text to speech.

Grammar. A finite set of expected utterances. It’s a list. Usually, each entry in a grammar is paired with an intent. Many interfaces start out as being simple grammars before moving on to a machine-learning model later once the concept has been proven.

Here’s the general idea with “artificial intelligence”…

There’s a human with a core intent to do something in the real world, like knowing when the cookies in the oven are done. This is translated into an intent like, “set a 15 minute timer.” That’s the utterance that’s translated into a string. But it hasn’t yet been parsed as language. That string is passed into a natural language understanding system. What comes is a data structure that represents the customers goal e.g. intent=timer; duration=15 minutes. That’s sent to the business logic where a timer is actually step. For a good voice interface, you also want to send back a response e.g. “setting timer for 15 minutes starting now.”

That seems simple enough, right? What’s so hard about designing for voice?

Natural language interfaces are a form of artifical intelligence so it’s not deterministic. There’s a lot of ruling out false positives. Unlike graphical interfaces, voice interfaces are driven by probability.

How do you turn a sound wave into an understandable instruction? It’s a lot like teaching a child. You feed a lot of data into a statistical model. That’s how machine learning works. It’s a probability game. That’s where it gets interesting for design—given a bunch of possible options, we need to use context to zero in on the most correct choice. This is where confidence ratings come in: the system will return the probability that a response is correct. Effectively, the system is telling you how sure or not it is about possible results. If the customer makes a request in an unusual or unexpected way, our system is likely to guess incorrectly. That’s because the system is being given something new.

Designing a conversation is relatively straightforward. But 80% of your voice design time will be spent designing for what happens when things go wrong. In voice recognition, edge cases are front and centre.

Here’s another challenge. Interaction with most voice interfaces is part conversation, part performance. Most interactions are not private.

Humans don’t distinguish digital speech fom human speech. That means these devices are intrinsically social. Our brains our wired to try to extract social information, even form digital speech. See, for example, why it’s such a big question as to what gender a voice interface has.

Delivering a voice interface

Storyboards help depict the context of use. Sample dialogues are your new wireframes. These are little scripts that not only cover the happy path, but also your edge case. Then you reverse engineer from there.

Flow diagrams communicate customer states, but don’t use the actual text in them.

Prompt lists are your final deliverable.

Functional prototypes are really important for voice interfaces. You’ll learn the real way that customers will ask for things.

If you build a working prototype, you’ll be building two things: a natural language interaction model (often a JSON file) and custom business logic (in a programming language).

Eventually voice design will become a core competency, much like mobile, which was once separate.

Ask yourself what tasks your customers complete on your site that feel clunkly. Remember that voice desing is almost never about new scenarious. Start your journey into voice interfaces by tackling old problems in new, more inclusive ways.

May the voice be with you!

Marty’s mashup

While the Interaction 19 event was a bit of a mixed bag overall, there were some standout speakers.

Marty Neumeier was unsurprisingly excellent. I’d seen him speak before, at UX London a few years back, so I knew he’d be good. He has a very reassuring, avuncular manner when he’s speaking. You know the way that there are some people you could just listen to all day? He’s one of those.

Marty’s talk at Interaction 19 was particularly interesting because it was about his new book. Now, why would that be of particular interest? Well, this new book—Scramble—is a business book, but it’s written in the style of a thriller. He wanted it to be like one of those airport books that people read as a guilty pleasure.

One rainy night in December, young CEO David Stone is inexplicably called back to the office. The company’s chairman tells him that the board members have reached the end of their patience. If David can’t produce a viable turnaround plan in five weeks, he’s out of a job. His only hope is to try something new. But what?

I love this idea!

I’ve talked before about borrowing narrative structures from literature and film and applying them to blog posts and conference talks—techniques like flashback, in media res, etc.—so I really like the idea of taking an entire genre and applying it to a technical topic.

The closest I’ve seen is the comic that Scott McCloud wrote for the release of Google Chrome back in 2008. But how about a romantic comedy about service workers? Or a detective novel about CSS grid?

I have a feeling I’ll be thinking about Marty Neumeier’s book next time I’m struggling to put a conference talk together.

In the meantime, if you want to learn from the master storyteller himself, Clearleft are running a two-day Brand Master Workshop with Marty on March 14th and 15th at The Barbican in London. Early bird tickets are on sale until this Thursday, so don’t dilly-dally if you were thinking about nabbing your spot.

Interaction 19

Right before heading to Geneva to spend the week hacking at CERN, I was in Seattle with a sizable Clearleft contingent to attend Interaction 19, the annual conference put on by the Interaction Design Association.

Ben has rounded up the highlights from my fellow Clearlefties. There are some good talks listed there: John Maeda, Nelly Ben Hayoun, and Jon Bell were thoroughly enjoyable. Some other talks were just okay, and there was one talk, by IXDA president Alok Nandi, that was almost impressive in how rambling and incoherent it was. It was like being in a scene from Silicon Valley. I remember clapping at the end; not out of appreciation, but out of relief.

If truth be told, Interaction 19 had about a day’s worth of really great content …spread out over three days. To be fair, that’s par for the course. When we went to Interaction 17 in New York, the hit/miss ratio was about the same:

There were some really good talks at the event, but alas, the muti-track format made it difficult to see all of them. Continuous partial FOMO was the order of the day.

And as I said at the time:

To be honest, the conference was only part of the motivation for the trip. Spending a week in New York with a gaggle of Clearlefties was its own reward.

So I’m willing to cut Interaction 19 a lot of slack. Even if quite a few of the talks were just so-so, getting to hang with Clearlefties in Seattle during snowmageddon was a lot of fun (and you’ll be pleased to hear that we didn’t even resort to cannibalism to survive).

But while the content of the conference was fair to middling, the organisation of it was a shambles:

Imagine the Fyre Festival but in downtown Seattle in winter. Welcome to @ixdconf. #ixd19

They sold more tickets than there were seats. I ended up watching the first morning’s keynotes being streamed to a screen in a conference room in a different building.

Now, I’ve been at events with keynotes that have overflow rooms—South by Southwest does this. But that’s at a different scale. This is a conference with a known number of attendees, each one of them spending over a thousand dollars to attend. I’m pretty sure that a first-come, first-served policy isn’t the best way of treating those attendees.

Anyway, here’s what I submitted for that round-up of the best talks, but which, for reasons of prudence, was omitted from the final post:

I really enjoyed the keynote by Liz Jackson on inclusive design. I would’ve enjoyed it even more if I could’ve seen it in person. Instead I watched it live-streamed to a meeting room two buildings over because the conference sold more tickets than they had seats for. This was after queueing in the cold for registration. So I feel like I learned a lot from Interaction 19 …about how not to organise a conference.

Still, as Ben notes:

We all enjoyed ourselves thoroughly, despite best efforts by the West Coast snow to disrupt the entire city.

I’m going to be back in Seattle in just under two weeks for An Event Apart. Now that’s a conference! It runs like a well-oiled machine, and every talk in its single track has been curated for excellence …with one exception.

Ubiquity and consistency

I keep thinking about this post from Baldur Bjarnason, Over-engineering is under-engineering. It took me a while to get my head around what he was saying, but now that (I think) I understand it, I find it to be very astute.

Let’s take a single interface element, say, a dropdown menu. This is the example Laura uses in her article for 24 Ways called Accessibility Through Semantic HTML. You’ve got two choices, broadly speaking:

  1. Use the HTML select element.
  2. Create your own dropdown widget using JavaScript (working with divs and spans).

The advantage of the first choice is that it’s lightweight, it works everywhere, and the browser does all the hard work for you.

But…

You don’t get complete control. Because the browser is doing the heavy lifting, you can’t craft the details of the dropdown to look identical on different browser/OS combinations.

That’s where the second option comes in. By scripting your own dropdown, you get complete control over the appearance and behaviour of the widget. The disadvantage is that, because you’re now doing all the work instead of the browser, it’s up to you to do all the work—that means lots of JavaScript, thinking about edge cases, and making the whole thing accessible.

This is the point that Baldur makes: no matter how much you over-engineer your own custom solution, there’ll always be something that falls between the cracks. So, ironically, the over-engineered solution—when compared to the simple under-engineered native browser solution—ends up being under-engineered.

Is it worth it? Rian Rietveld asks:

It is impossible to style select option. But is that really necessary? Is it worth abandoning the native browser behavior for a complete rewrite in JavaScript of the functionality?

The answer, as ever, is it depends. It depends on your priorities. If your priority is having consistent control over the details, then foregoing native browser functionality in favour of scripting everything yourself aligns with your goals.

But I’m reminded of something that Eric often says:

The web does not value consistency. The web values ubiquity.

Ubiquity; universality; accessibility—however you want to label it, it’s what lies at the heart of the World Wide Web. It’s the idea that anyone should be able to access a resource, regardless of technical or personal constraints. It’s an admirable goal, and what’s even more admirable is that the web succeeds in this goal! But sometimes something’s gotta give, and that something is control. Rian again:

The days that a website must be pixel perfect and must look the same in every browser are over. There are so many devices these days, that an identical design for all is not doable. Or we must take a huge effort for custom form elements design.

So far I’ve only been looking at the micro scale of a single interface element, but this tension between ubiquity and consistency plays out at larger scales too. Take page navigations. That’s literally what browsers do. Click on a link, and the browser fetches that URL, displaying progress at it goes. The alternative, as exemplified by single page apps, is to do all of that for yourself using JavaScript: figure out the routing, show some kind of progress, load some JSON, parse it, convert it into HTML, and update the DOM.

Personally, I tend to go for the first option. Partly that’s because I like to apply the rule of least power, but mostly it’s because I’m very lazy (I also have qualms about sending a whole lotta JavaScript down the wire just so the end user gets to do something that their browser would do for them anyway). But I get it. I understand why others might wish for greater control, even if it comes with a price tag of fragility.

I think Jake’s navigation transitions proposal is fascinating. What if there were a browser-native way to get more control over how page navigations happen? I reckon that would cover the justification of 90% of single page apps.

That’s a great way of examining these kinds of decisions and questioning how this tension could be resolved. If people are frustrated by the lack of control in browser-native navigations, let’s figure out a way to give them more control. If people are frustrated by the lack of styling for select elements, maybe we should figure out a way of giving them more control over styling.

Hang on though. I feel like I’ve painted a divisive picture, like you have to make a choice between ubiquity or consistency. But the rather wonderful truth is that, on the web, you can have your cake and eat it. That’s what I was getting at with the three-step approach I describe in Resilient Web Design:

  1. Identify core functionality.
  2. Make that functionality available using the simplest possible technology.
  3. Enhance!

Like, say…

  1. The user needs to select an item from a list of options.
  2. Use a select element.
  3. Use JavaScript to replace that native element with a widget of your own devising.

Or…

  1. The user needs to navigate to another page.
  2. Use an a element with an href attribute.
  3. Use JavaScript to intercept that click, add a nice transition, and pull in the content using Ajax.

The pushback I get from people in the control/consistency camp is that this sounds like more work. It kinda is. But honestly, in my experience, it’s not that much more work. Also, and I realise I’m contradicting the part where I said I’m lazy, but that’s why it’s called work. This is our job. It’s not about what we prefer; it’s about serving the needs of the people who use what we build.

Anyway, if I were to rephrase my three-step process in terms of under-engineering and over-engineering, it might look something like this:

  1. Start with user needs.
  2. Build an under-engineered solution—one that might not offer you much control, but that works for everyone.
  3. Layer on a more over-engineered solution—one that might not work for everyone, but that offers you more control.

Ubiquity, then consistency.

Pseudo and pseudon’t

I like CSS pseudo-classes. They come in handy for adding little enhancements to interfaces based on interaction.

Take the form-related pseudo-classes, for example: :valid, :invalid, :required, :in-range, and many more.

Let’s say I want to adjust the appearance of an element based on whether it has been filled in correctly. I might have an input element like this:

<input type="email" required>

Then I can write some CSS to put green border on it once it meets the minimum requirements for validity:

input:valid {
  border: 1px solid green;
}

That works, but somewhat annoyingly, the appearance will change while the user is still typing in the field (as soon as the user types an @ symbol, the border goes green). That can be distracting, or downright annoying.

I only want to display the green border when the input is valid and the field is not focused. Luckily for me, those last two words (“not focused”) map nicely to some more pseudo-classes: not and focus:

input:not(:focus):valid {
  border: 1px solid green;
}

If I want to get really fancy, I could display an icon next to form fields that have been filled in. But to do that, I’d need more than a pseudo-class; I’d need a pseudo-element, like :after

input:not(:focus):valid::after {
  content: '✓';
}

…except that won’t work. It turns out that you can’t add generated content to replaced elements like form fields. I’d have to add a regular element into my markup, like this:

<input type="email" required>
<span></span>

So I could style it with:

input:not(:focus):valid + span::after {
  content: '✓';
}

But that feels icky.

Update: See this clever flexbox technique by Kitty Giraudel for a potential solution.

A question of timing

I’ve been updating my collection of design principles lately, adding in some more examples from Android and Windows. Coincidentally, Vasilis unveiled a neat little page that grabs one list of principles at random —just keep refreshing to see more.

I also added this list of seven principles of rich web applications to the collection, although they feel a bit more like engineering principles than design principles per se. That said, they’re really, really good. Every single one is rooted in performance and the user’s experience, not developer convenience.

Don’t get me wrong: developer convenience is very, very important. Nobody wants to feel like they’re doing unnecessary work. But I feel very strongly that the needs of the end user should trump the needs of the developer in almost all instances (you may feel differently and that’s absolutely fine; we’ll agree to differ).

That push and pull between developer convenience and user experience is, I think, most evident in the first principle: server-rendered pages are not optional. Now before you jump to conclusions, the author is not saying that you should never do client-side rendering, but instead points out the very important performance benefits of having the server render the initial page. After that—if the user’s browser cuts the mustard—you can use client-side rendering exclusively.

The issue with that hybrid approach—as I’ve discussed before—is that it’s hard. Isomorphic JavaScript (terrible name) can theoretically help here, but I haven’t seen too many examples of it in action. I suspect that’s because this approach doesn’t yet offer enough developer convenience.

Anyway, I found myself nodding along enthusiastically with that first of seven design principles. Then I got to the second one: act immediately on user input. That sounds eminently sensible, and it’s backed up with sound reasoning. But it finishes with:

Techniques like PJAX or TurboLinks unfortunately largely miss out on the opportunities described in this section.

Ah. See, I’m a big fan of PJAX. It’s essentially the same thing as the Hijax technique I talked about many years ago in Bulletproof Ajax, but with the new addition of HTML5’s History API. It’s a quick’n’dirty way of giving the illusion of a fat client: all the work is actually being done in the server, which sends back chunks of HTML that update the interface. But it’s true that, because of that round-trip to the server, there’s a bit of a delay and so you often end up briefly displaying a loading indicator.

I contend that spinners or “loading indicators” should become a rarity

I agree …but I also like using PJAX/Hijax. Now how do I reconcile what’s best for the user experience with what’s best for my own developer convenience?

I’ve come up with a compromise, and you can see it in action on The Session. There are multiple examples of PJAX in action on that site, like pretty much any page that returns paginated results: new tune settings, the latest events, and so on. The steps for initiating an Ajax request used to be:

  1. Listen for any clicks on the page,
  2. If a “previous” or “next” button is clicked, then:
  3. Display a loading indicator,
  4. Request the new data from the server, and
  5. Update the page with the new data.

In one sense, I am acting immediately to user input, because I always display the loading indicator straight away. But because the loading indicator always appears, no matter how fast or slow the server responds, it sometimes only appears very briefly—just for a flash. In that situation, I wonder if it’s serving any purpose. It might even be doing the opposite to its intended purpose—it draws attention to the fact that there’s a round-trip to the server.

“What if”, I asked myself, “I only showed the loading indicator if the server is taking too long to send a response back?”

The updated flow now looks like this:

  1. Listen for any clicks on the page,
  2. If a “previous” or “next” button is clicked, then:
  3. Start a timer, and
  4. Request the new data from the server.
  5. If the timer reaches an upper limit, show a loading indicator.
  6. When the server sends a response, cancel the timer and
  7. Update the page with the new data.

Even though there are more steps, there’s actually less happening from the user’s perspective. Where previously you would experience this:

  1. I click on a button,
  2. I briefly see a loading indicator,
  3. I see the new data.

Now your experience is:

  1. I click on a button,
  2. I see the new data.

…unless the server or the network is taking too long, in which case the loading indicator appears as an interim step.

The question is: how long is too long? How long do I wait before showing the loading indicator?

The Nielsen Norman group offers this bit of research:

0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.

So I should set my timer to 100 milliseconds. In practice, I found that I can set it to as high as 200 to 250 milliseconds and keep it feeling very close to instantaneous. Anything over that, though, and it’s probably best to display a loading indicator: otherwise the interface starts to feel a little sluggish, and slightly uncanny. (“Did that click do any—? Oh, it did.”)

You can test the response time by looking at some of the simpler pagination examples on The Session: new recordings or new discussions, for example. To see examples of when the server takes a bit longer to send a response, you can try paginating through search results. These take longer because, frankly, I’m not very good at optimising some of those search queries.

There you have it: an interface that—under optimal conditions—reacts to user input instantaneously, but falls back to displaying a loading indicator when conditions are less than ideal. The result is something that feels like a client-side web thang, even though the actual complexity is on the server.

Now to see what else I can learn from the rest of those design principles.

Double tap delay

Even though my encounter with Tess yesterday was brief, we still managed to turn the conversation to browsers, standards, and all things web in our brief chat.

Specifically, we talked about this proposal in Blink related to the 300 millisecond delay that mobile browsers introduce after a tap event.

Why do browsers have this 300 millisecond delay? Well, you know when you’re looking at fixed-width desktop-based website on a mobile phone, and everything is zoomed out, and one of the ways that you can zoom in to a specific portion of the page is to double tap on that content? A double tap is defined as two taps less than 300 milliseconds apart. So whenever you tap on something in a touch-based browser, it needs to wait for that length of time to see if you’re going to turn that single tap into a double tap.

The overall effect is that tap actions feel a little bit laggy on the web compared to native apps. You can fix this by using the fastclick code from FT Labs, but I always feel weird solving a problem on mobile by throwing more front-end code at it.

Hence the Blink proposal: if the author has used a meta viewport declaration to set width=device-width (effectively saying “hey, I know what I’m doing: this content doesn’t need to be zoomed”), then the 300 millisecond delay could be removed from tap events. Note: this only affects double taps—pinch zoom is unaffected.

This sounds like a sensible idea to me, but Tess says that she sometimes still likes to double tap to zoom even in responsive designs. She’d prefer a per-element solution rather than a per-document meta element. An attribute? Or maybe a CSS declaration similar to pointer events?

I thought for a minute, and then I spitballed this idea: what if the 300 millisecond delay only applied to non-focusable elements?

After all, the tap delay is only noticeable when you’re trying to tap on a focusable element: links, buttons, form fields. Double tapping tends to happen on text content: divs, paragraphs, sections. That’s assuming you are actually using buttons and links for buttons and links—not spans or divs a-la Google.

And if the author decides they want to remove the tap delay on a non-focusable element, they can always make it focusable by adding tabindex=-1 (if that still works …does that still work? I don’t even know any more).

Anyway, that was my not-very-considered idea, but on first pass, it doesn’t strike me as being obviously stupid or crazy.

So, how about it, browser makers? Does removing the 300 millisecond delay on focusable elements—possibly in combination with the meta viewport declaration—make sense?

Progresponsive

Brad has done a great job in documenting navigation patterns for responsive designs. More recently I came across Erick Arbé’s similar collection of patterns for responsive navigation. And, of course, at the Responsive Day Out, David gave a presentation on the subject.

David Bushell: Responsive Navigation on Huffduffer

As I mentioned in the chat after David’s talk, choosing a pattern doesn’t need to be an either/or decision. You can start with a simple solution and progressively enhance to a more complex navigation pattern.

Take the footer-anchor pattern, for example. I really, really like this pattern. It doesn’t require any JavaScript whatsoever; just a simple hyperlink from the top of the page that links to the fragment identifier of the navigation at the bottom of the page. It works on just about every device.

But you don’t have to stop there. Now that you’ve got a simple solution that works everywhere, you can enhance it for more capable browsers.

Take a look at this example that applies the off-canvas pattern for browsers capable of handling the JavaScript and CSS required.

You can see the two patterns in action by looking at the source in JS Bin. If you toggle the “Auto-run JS” checkbox, you can see both behaviours. Without JavaScript you get the footer-anchor pattern. With JavaScript (and a capable browser) you get the off-canvas pattern.

I haven’t applied any media queries in this instance, but it would be pretty straightforward to apply absolute positioning or the display: table hack to display the navigation by default at wider screen sizes. I’ll leave that as an exercise for the reader (bonus points: apply the off-canvas from the right of the viewport rather than the left).

Feel free to peruse the somewhat simplistic code. I’m doing a bit of feature detection—or cutting the mustard—to test for querySelector and addEventListener. If a browser passes the test, a class is applied to the document root and some JavaScript is executed on page load to toggle the off-canvas behaviour.

On a recent project, I found myself implementing a number of different navigation patterns: off-canvas, overlay, and progressive disclosure. But each one began as an instance of the simple footer-anchor pattern.

Progressive enhancement, baby. Still not dead, still important.

Off-canvas horizontal lists

There was a repeated rallying cry at the Responsive Day Out. It was the call for more sharing—more sharing of data, more sharing of case studies, more sharing of success stories, but also more sharing of failures.

In that spirit, I thought I’d share a pattern I’ve been working on. It didn’t work, but I’m not going to let that stop me putting it out there.

Here’s what I wanted to do…

Let’s say you’ve got a list of items; modular chunks of markup like an image and a caption, for example. By default these will display linearly on a small screen: a vertical list. I quite like the way that the Flickr iPhone app takes those lists and makes them horizontal—they go off-canvas (to the right), with a little bit of the next item peaking out to give some affordance. It’s like an off-canvas carousel.

I’d quite like to use that interaction in responsive designs. But I don’t want to do it by throwing a lot of JavaScript at the problem. So I thought I’d attempt to achieve it with a little bit of CSS.

So, let’s say I’ve got a list of six items like this:

<div class="items">
    <ul class="item-list">
        <li class="item"></li>
        <li class="item"></li>
        <li class="item"></li>
        <li class="item"></li>
        <li class="item"></li>
        <li class="item"></li>
    </ul><!-- /.item-list -->
</div><!-- /.items -->

Please pay no mind to the qualities of the class names: this is just a quick proof of concept.

Here’s how that looks. At larger screen sizes, I display the list items in groups of two or three, side by side. At smaller sizes, the items simply linearise vertically.

Okay, now within a small-screen media query I’m going to constrain the width of the container:

.items {
    width: 100%;
}

I’m going to make the list within that element stretch off-canvas for six screens wide (this depends on me knowing that there will be exactly six items in the list):

.items .item-list {
    width: 600%;
}

Now I’ll make each item one sixth of that size, which should be one screen’s worth. Actually, I’m going to make it a bit less than exactly one sixth (which would be 16.6666%) so that a bit of the next item peaks out:

.item-list .item {
    width: 15%;
}

My hope was that to make this crawlable/swipable, all I had to do was apply overflow: scroll to the containing element:

.items {
    width: 100%;
    overflow: scroll;
}

All of that is wrapped up in a small-viewport media query:

@media all and (max-width: 30em) {
    .items {
        width: 100%;
        overflow: scroll;
    }
    .items .item-list {
        width: 600%;
    }
    .items .item {
        width: 15%;
    }
}

It actually works …in some browsers. Alas, support for overflow: scroll doesn’t extend back as far as Android 2, still a very popular flavour of that operating system. That’s quite a showstopper.

There is a polyfill called Overthrow from those mad geniuses at Filament Group. But, as I said, I’d rather not throw more code at the problem. While I can imagine shovelling a polyfill at a desktop browser, I have a lot of qualms about trying to “support” an older mobile browser by giving it a chunk of JavaScript to chew on.

What I really need is a way to detect support for overflow: scroll. Alas, looking at the code for Overthrow, that isn’t so easy. Modernizr cannot help me here. We are in the realm of the undetectables.

My pattern is, alas, a failure.

Or, at least, it’s a failure for now. The @supports rule in CSS is tailor-made for this kind of situation. Basically, I don’t want any those small-screen rules to apply unless the browser supports overflow: scroll. Here’s how I will be able to do that:

@media all and (max-width: 30em) {
  @supports (overflow: scroll) {
    .items {
        width: 100%;
        overflow: scroll;
    }
    .items .item-list {
        width: 600%;
    }
    .items .item {
        width: 15%;
    }
  }
}

This is really, really useful. It means that I can start implementing this pattern now even though very few browsers currently understand @supports. That’s okay. Browsers that don’t understand it will simply ignore the whole block of CSS, leaving the list items to display vertically. But as @support gets more …um, support …then the pattern will kick in for those more capable browsers.

I can see myself adding this pre-emptive pattern for a few different use cases:

Feel free to poke at the example code. Perhaps you can find a way to succeed where I have failed.

Publishing Paranormal Interactivity

I’ve published the transcript of a talk I gave at An Event Apart in 2010. It’s mostly about interaction design, with a couple of diversions into progressive enhancement and personality in products. It’s called Paranormal Interactivity.

I had a lot of fun with this talk. It’s interspersed with videos from The Hitchhiker’s Guide To The Galaxy, Alan Partridge, and Super Mario, with special guest appearances from the existentialist chalkboard and Poshy’s upper back torso.

If you don’t feel like reading it, you can always watch the video or listen to the audio.

Adactio: Articles—Paranormal Interactivity on Huffduffer

You could even look at the slides but, as I always say, they won’t make much sense without the context of the presentation.

Continuous partial annoyance

Twitter have been rolling out a new redesign. Thanks to Dustin, I got to try it out when the switch was flipped.

As with any redesign, the initial reaction tends to be It’s different! I fear change! Therefore I dislike this. See also: redesigns of The Guardian, Last.fm, Flickr, BBC…

With Twitter, that initial knee-jerk fades pretty quickly because the new site is undeniably beautiful. The visual design is top-notch.

There’s a nice little addition in the markup, too. The body element has a class name that you can hook into for user stylesheets. This is a very, very, very good thing. For example, my class name is .user-style-adactio so I can add some declarations to my user stylesheet.

The first rule simply hides the egregious Trending Topics and Who To Follow features (and I love that Who To Follow abbreviates to WTF):

.user-style-adactio .trends-inner,
.user-style-adactio .wtf-inner {
 display: none !important;
}

By the way, a user stylesheet is the only time it’s acceptable use important! in your CSS.

My other rules adjust the layout a bit when the viewport gets smaller. It’s just a quick little hack and it’s not great but it’s handy if, like me and Norm!, you don’t like a site dictating how wide your browser window should be. Thanks to user stylesheets, you can fix this:

@media screen and (max-width: 995px) {
 .user-style-adactio #page-container,
 .user-style-adactio #page-outer {
  min-width: 590px !important;
 }
 .user-style-adactio .dashboard {
  float: none !important;
  clear: both !important;
  max-width: 0 !important;
 }
}

Handy tip: if you use Dropbox, store your user stylesheet there. That way, you can point multiple machines to the same stylesheet. I’ve got my laptop at home and my iMac at work pointing to the same CSS file.

There’s one aspect of the new Twitter redesign that I really don’t like, and I can’t fix it with a user stylesheet: infinite scrolling. As I said (on Twitter, of course):

I’m allergic to infinite scrolling

Notice that I didn’t say that infinite scrolling is wrong, it’s just wrong for me. There’s nothing wrong with peanuts unless you have a nut allergy.

The reason that I don’t like infinite scrolling is that I actually use the scrollbar to scroll. That is, I move my cursor over the scrollbar, click and drag. Infinite scrolling makes this unworkable: the scrollbar under my cursor jumps around as new content is loaded.

I figured that in this day of mouse wheels and trackpads, I must be in the minority with my old-fashioned scrollbar usage. I asked for data on Twitter, and sure enough, most people who responded said they used the mouse wheel, the trackpad, the space bar or arrow keys. Though some people still found the scrollbar useful as a visual indicator of how long the page is …which is also negated by infinite scrolling.

Interestingly, while most of the people who responded to my query on Twitter said they hardly ever use the scrollbar, the Firefox heatmap shows that it’s one of the most used interface features. That was a much larger sampling: 117,000 users.

Still, I can understand why Twitter have decided to go with infinite scrolling. If I’m in the minority in thinking it’s horrible, that’s my problem. I can’t even claim that it’s an accessibility problem: it requires more manual dexterity to use the scrollbar than to use other methods of scrolling.

Twitter could add a user setting to switch off infinite scrolling—perhaps replacing it with the old style “more” button, which I liked—but that’s a cop-out. Whenever something gets shunted off into a preference, it’s generally a sign of indecision in the design. The Twitter redesign isn’t indecisive: it has a very clear and consistent visual and interactive design vocabulary. It just happens that one aspect of the UI vocabulary doesn’t mesh well with my own usage pattern.

So, in this case, the solution may well be for me to change the way I use the site. It still irks me, though. I’m generally against any interactions that happen without an explicit request from the user, such as revealing data and functionality on hover, for example. Twitter avoids that particular anti-pattern but with infinite scrolling, the act of moving down the page is interpreted as a request to load more data. I would much prefer to request that data explicitly with a button or link. Of course, that requires that the user do more, so it could be argued that infinite scrolling actually reduces the number of interactions that the user is required to do …assuming that the inferred interaction is in fact the desired interaction. That’s a big assumption.

On the face of it, it would seem that Twitter are being somewhat dismissive of the scrollbar as a UI element. But that’s not true. While they are reducing the usefulness of browser-native scrollbars by using infinite scrolling, they are, at the same time, replicating the functionality of scrollbars but non-natively. If you reveal a side panel—by clicking on someone’s Twitter username, for example—and if the content doesn’t fit within the viewport, then a non-native scrollbar is generated.

scrollbars

As I said, the new redesign is wonderful. I’m just nit-picking ..but it’s a big nit.

The Framework Age

Liz Danzico is talking at An Event Apart San Francisco about frameworks. Not CSS frameworks, not JavaScript frameworks, not Rails, not Django, but websites as frameworks. These days we’re designing frameworks for user interaction rather than static artefacts.

Liz tells a story about Miles Davis who showed up at the studio with six slips of paper listing the six musicians he wanted to play with on his record. Over the course of one day, these people who had never played this music together recorded a whole album. Davis wanted to capture something called creative instability. Kind of Blue came out of this framework that he created.

Liz wants to talk about frameworks that are uninscribed and detectable cues that loosely govern a set of actions. These are interaction frameworks, frameworks that shape how people behave.

Back to music. Classical music uses classical notation. If you can’t read notation, you can’t make sense of it so it’s kind of elitist. It also provides rules like tempo and key. If you step outside these boundaries, you are deviating from the notation. Also, every note is accounted for in the notation. You can’t improvise it. Jazz notation is different. It provides chord progressions. It’s up to the musician to improvise around this framework. Modal jazz is even more abstract. That’s what Miles Davis invented that day in the studio. Kind of Blue was created out of just a scale.

On the web, we’re making the same transition from classical to jazz. We’re improvising. We’ve moved from a hard-coded system of building pages to an open system of creating participatory environments.

But this kind of tension is nothing new. It’s being going on for years. There’s been a long-running tension between orality and literacy. The printing press destroyed a lot of oral tradition but we still use word of mouth to pass on urban legends and recipes. Liz mentions Alex Wright’s observation in Glut that we are seeing a resurgence in this kind of oral tradition online. Even though we’re writing in blogs and mailing lists, we’re not so much publishing as talking.

There’s evidence of improv online. Exquisite simplicity was how pianist Bill Evans described Miles Davis’s framework of six slips of paper.

Quoting from The Paradox of Choice, Liz shows how the default settings can make a big difference (in the number of organ donations, for example, which could be opt-in or opt-out). Geni has some smart default settings. Same with Tripit. All you need to do is forward an email and it will take care of the rest. Focus on creating smart defaults.

In improv, you need to involve the audience. It’s important to adapt to what your audience is doing. Here’s an example from architecture: there was a fountain that was built in Washington Square Park in New York but before they got ‘round to turning it on, people started using it as a seating area. When the city tried to turn on the fountain, people revolted. The fountain is dry to this day and is used for public theatre.

Referring to the redesign of the Wordpress admin, Liz points out that it’s really important to involve users in the design process. There’s a difference between asking your audience what they think of a system compared to looking at how they are actually using that system.

Listen and watch. That’s another lesson we can take from music and apply to the web. When you’re playing with other people, not only do you have to listen to what the other people are doing, you have to watch them too. It’s the same with architecture. Desire paths are created by people actually using a space. They show clearly where paths should be built. Eyetracking can reveal the desire paths of users interacting with an application. There are other tools like User Voice which can involve the audience. Observe. Listen. Pay attention.

A common technique in Jazz is call and response when musicians play off one another. You see this online in reviews where the reviews start reacting to each other rather than the original item being reviewed. Allow users to build on one another.

User-centred design and participatory design are great ways of involving the users in the design process but that’s still different to actual use. It’s time for a new way of working: designing for improvisation (but remember that no one single process will ever be successful). Our design process should reflect the trend towards user participation that we’re seeing on the web. People’s tolerance for improvisation is increasing and our role as framework providers should reflect that.

Spaces

It seems that small interface changes are rolled out to Twitter on a fairly regular basis. This morning, for example, I was greeted with a new “Everyone” tab. I was also disappointed to discover that an interface improvement that was introduced a few weeks ago has now been removed. As interaction tweaks go, this was a very small thing but it’s something I appreciated very much. Let me explain…

When I’m reading a long-ish page on the Web, rather than move my cursor over to the scrollbar, position it just so and click to scroll down see the next screenful of content, I’ll just tap the spacebar. In just about every browser I know, this will scroll the content by one screenful. This flow is interrupted if a website “helpfully” puts the focus into a form element when the page loads. This isn’t an issue on, say, Google because Google doesn’t have more than one screenful of content. But it is an issue on Twitter. When Twitter loads, the What are you doing? input box is automatically given focus. If I want to scroll down below the fold, I must either use the scrollbar or click out of the form element and then use the spacebar. I can understand the rationale behind this. Chances are most people want to get into that form element and start typing …at least on the front page.

The interface improvement that Twitter introduced a while back was to take that automatic focus away if I was on any page other than the first. In other words, if I was clicking back through older pages to catch up what my friends have been doing, the focus was no longer automatically given to the form element. Brilliant! This awareness of context reminds me of what Eric wrote when they were adding print stylesheets to A List Apart:

These print styles are only used on articles, which are the pages that are most likely to be printed.

Perhaps through oversight or maybe through deliberate choice, Twitter now places the focus in that form element on every page. What a pain! And what a shame that a great example of context-sensitive interaction has been removed.

I hereby invoke my bitching ‘n’ moaning mojo: c’mon Twitter, do the right thing.

While I’m at it…

Oi! Flickr! What’s up with the automatic focus in the search form on search results pages? Explain that to me. ‘Cause from where I’m sitting, it’s just downright annoying.