The Clearleft Podcast


Season Three

Measuring Design


This is the Clearleft podcast.

Today I’d like to start with the Vietnam War, or as it’s known in Vietnam, the American War. It’s also known as McNamara’s war.


Secretary of defense, Robert McNamara, points to bombed out supply routes from North Vietnam, routes that have supplied reds in the south with weapons like this Chinese machine gun, as well as troops trained by the communists. Discounting the possibility of using nuclear weapons, Mr. McNamara says that our air raids have knocked out 24 key bridges.

Robert McNamara

I believe I’m correct in saying that in the past four and a half years, the Vietcong, the communists, have lost 89,000 men, killed, in South Vietnam. Now not all of these men have been infiltrated from the north, but an important number has been. And with that, plus the expansion of the Vietcong forces in the south, you can see the heavy drain upon the filler resources of the north, and the reason why they are having to turn to their regular military units to continue the supply of men over these infiltration routes, a supply that’s absolutely essential to them if they are to offset the continuing casualties.


As you can probably tell, Robert McNamara was fixated on numbers. He thought the success of the war could be measured in numbers. Today, Robert McNamara’s name is attached to a logical fallacy. The McNamara fallacy, also known as the quantitative fallacy, was described by Daniel Yankelovich like this:

The first step is to measure whatever can be easily measured. This is okay as far as it goes. The second step is to disregard that which can’t be easily measured, or give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can’t be measured easily really isn’t that important. This is blindness. The fourth step is to say that what can’t be easily measured doesn’t really exist. This is suicide.

So over-indexing on what can be easily measured and dismissing what can’t be easily measured. Does that sound familiar at all?

Here’s Fonz Morris from Netflix.


Every single user on the Netflix platform right now is in some form of a test. That’s how much we value data at the company. And that’s how much we constantly want to improve the experience for our users. So what you’re seeing is not the same as what your friend or your brother or anybody is seeing. We actually have to contact engineering to get ourselves removed and put into a special use case where we don’t see any testing.
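
Putting every user into some experiment, as Fonz describes, is typically done with deterministic bucketing. This is only a sketch of the general technique, not Netflix’s actual system; the function and experiment names here are illustrative:

```python
import hashlib

def assign_bucket(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically assign a user to a variant.

    Hashing the user id together with the experiment name gives each
    user a stable bucket per experiment, with no stored state, and
    different experiments split the population independently.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# The same user always sees the same variant of a given experiment:
assert assign_bucket("user-42", "new-homepage") == assign_bucket("user-42", "new-homepage")
```

Removing someone from testing entirely, as Fonz says engineering has to do, then amounts to an explicit opt-out list checked before the hash.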


So Netflix do a lot of AB testing. So did Stuart Frisby’s former employer. Back when he worked there, Stuart gave a talk at one of Clearleft’s events about a design culture based on AB testing.


A core component of our culture is AB testing.

Everything that we put in front of customers, in front of our partners, is the result of statistically measured AB tests. So I know what a lot of you in the room will be thinking. That’s a real constraint to doing great design work. And I see the point, but what it also is is it’s a tool that you can use to prove the value of design.

And that was how I looked upon it: finally, for the first time in my career, and probably the first time in anyone’s career as a designer, we can put a number on what design contributes to the business. We can show that design in and of itself as a discipline has a significant impact on the profitability and the quality of products that we put in front of customers.

And we all know that that’s true. Like intuitively we as designers know that that’s why we do this. But that doesn’t matter to anybody else. They don’t have that intuition about design. They have intuitions about their own shit.

So here’s how we’re successful: we utilize AB testing, which drives higher conversion rates across our platform, in our case conversion rates of two to three times the industry average.

So when you consider the scale of our business, you see what kind of competitive moat that becomes for us. So that’s the framework within which we have to prove the value of design. This is what the organization understands as value. So let’s figure out how we speak that language. And we did. So we did lots of AB tests where, sure, design is always a component of the things that we put in front of customers but you can also just test design. And I’m not suggesting you just test design by changing the colors of buttons, but there are ways that you can prove that design and the act of design has a direct and measurable impact. And we did it.


I’m a fan of AB testing because I think it allows us not to fixate on thinking that this one solution is the best.

And we can keep iterating to try to really figure out what’s the best for our users at the end of the day.


But here’s researcher Maite Otondo.


The thing about just conducting quantitative research, like AB testing, is that it tells you how many, but it doesn’t tell you why. So you might run an AB test and see that some elements perform better than others, but you don’t really know why. And if you don’t know why something performs better, how do you make that decision again and have it be successful?


Here’s Chris How, design strategist at Clearleft.


AB testing and more complicated kind of multi-variate testing, is one of those techniques that I think a user experience designer should have in their toolkit, but it is merely one of the things they should have in their toolkit.

One of the big challenges with AB testing is, from an organization’s perspective, it feels like a perfect piece of testing.

You throw up a piece of design. You put some people looking at design A, some people looking at design B and you get back very quickly some compelling numbers that can take the pain out of doing a design critique or having an opinion yourself. And you can just rely on those numbers as a way forward.

I think the real challenge with AB testing is that customers live in a very complex and interrelated world where I think it’s almost impossible, I’m going to say it is impossible, to have a direct cause and effect. You know, we’re going to change the color of this button, or change the headline on this article, or change the color of the link text, and come back with any confidence to say that single change has made this huge impact.

But that’s often what I see with AB testing is people are trying to do the smallest amount of work and come back with some numbers that give them the certainty that what they’ve done has had that impact.

I rarely see that as the case. I think we live in a much more nuanced, complicated and interesting world where the sum of the parts of a website or an app all contribute to the results that you will get from the AB testing.


Here’s Andy Thornton, strategy director at Clearleft.


AB testing definitely has a purpose. I think it’s good for well-defined problems and being able to quickly choose between some options. It does have a place but it’s the complete opposite end of the spectrum of what generative research offers, which is open-ended problem exploration with ill-defined solutions. Even when you have insight, there’s many, many ways to solve those problems.

AB testing is, is a blunt instrument to say this or that.


I also have real challenges with AB testing, in that I’ve more often seen it done really badly than done well.

If you were a scientist running a test you would repeat that test and you would repeat that experiment. And the confidence in the results would come through the repeatability and knowing that the conditions that you’ve got can be repeated and the result stays the same.

But with AB testing, more often than not, I see clients put up a test. Rarely, if ever, can they tell me the level of statistical significance they need to have confidence in the result. They’ll look at you and go, well, this test is 52/48, let’s pick some numbers, and take that as proof that one thing is better than another.

And they’ll rarely rerun that test to see whether any of the other influences at the time of running it were having an impact. And I think AB testing often results in very small optimizations, and not necessarily optimization for the best. As a result, it stops people taking a step back and looking at some bigger changes that they could make.
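
The point about statistical significance can be made concrete. A 52/48 split means very different things at different sample sizes, and a standard two-proportion z-test shows why. This is a sketch using only the standard library, with made-up numbers for illustration:

```python
import math

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Both tails of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# A 52/48 split over 100 visitors per variant gives p of roughly 0.57,
# i.e. indistinguishable from noise:
print(two_proportion_p_value(52, 100, 48, 100))
# The same split over 10,000 visitors per variant is genuinely significant:
print(two_proportion_p_value(5200, 10000, 4800, 10000))
```

In other words, the split alone proves nothing; you also need the sample size and a significance threshold chosen before the test runs.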


Even if AB testing is well-suited for measuring the trees, it won’t help you see the forest. Here’s Radhika Dutt speaking at this year’s UX Fest online.


We’re so focused on metrics.

We’re either saying, you know, let’s measure everything, AB test everything, or we’re so focused on optimizing for metrics, moving things up and to the right, but those aren’t necessarily helping us build better products.

You’ve realized that fundamentally you haven’t really moved the needle despite having optimized for metrics.


There is an overemphasis sometimes on numbers at the top level of organizations, i.e. give me some facts that can be converted into mathematics and made easy to digest and easy to track.

Unfortunately within the wider world generally, you could say, there is an obsession with trying to convert everything into numbers. And actually that’s quite an interesting thing to probe. I don’t quite know why that is. We don’t seem comfortable with a degree of ambiguity or a degree of using language or emotion as a way to express certainty or confidence.


Another speaker at UX Fest this year was Melissa Smith. She gave a talk all about deceptive dark patterns. By the way that phrase, dark pattern, was coined by former Clearlefty, Harry Brignull. I should get him on this podcast to do a whole episode on that topic.

Melissa explained what a deceptive dark pattern is.


I define dark patterns as a design that intentionally takes advantage of human behaviors to achieve unintended actions from a user.


Melissa also gave an example.


Time manipulation. So time manipulation can take a few forms. It can be applying time pressure to make a decision in a way that stresses the user out.

So you can imagine that on the websites that have a counter at the top, that has a timer that goes down that says you know, this deal is going to expire in X amount of time. That is a site using time manipulation.


That scenario with time manipulation will feel very familiar to you if you’ve used the same company that swears by AB testing. Just like Netflix.


We’re doing the testing to try to come up with the best solution. I do think there does come a point where the designer would be responsible to interject if we’re crossing the line into something that is now just an obvious dark pattern. We don’t support dark patterns, honestly, because the long term really does matter to us.

So the AB testing to me can last as long as it needs, because it should be about trying to figure out what’s the best experience for the user, not the trickiest or the most deceiving way.


I asked Melissa, isn’t the biggest problem with deceptive dark patterns that, at least according to the test results, these deceptive dark patterns work?


They do work. But at what cost is the question, right?

So they’ll work in getting someone to sign up for a subscription to your service unwittingly, potentially without their knowledge. So yeah, maybe it resulted in an immediate conversion or business metric.

But the cost of that is, when the person who was engaging with your product realizes what happened, it typically doesn’t leave a good sentiment towards your product; they realize that they were essentially tricked into doing something that they didn’t want to do.


Here’s Vicki Tan.


If we just keep optimizing for the near term, for making money, we’re going to end up in a really bad situation.

And if you look at some of the tech company stuff that’s happened in the last decade, I think that is evidence of, of how we need to have a longer view of all these decisions that we make.


While we’re optimizing for these financial metrics, moving these metrics up and to the right, and focusing on user engagement and time spent on our product, very often we’re not considering the damage that it creates to society, or how our product is really affecting society.


As designers we’re walking that tightrope between making sure that the users understand everything that’s going on and, you know, maybe trying to achieve a business goal. So I don’t think there’s going to always be a cut and dried line. It’s always going to really depend on the specific context and situation.


There’s a tension here. On the one hand, designers are comfortable with ambiguity. And they understand that what can’t easily be measured can still be very important. But designers are also being told that they need to speak the language of business. And the language of business is numbers.

Here’s Chris How again.


I’m going to start by saying that my background is definitely qualitative rather than quantitative. I’m comfortable with spreadsheets. I’m comfortable with numbers, but my preference is the storytelling around qualitative research.

Having said that, some numbers are important, and I think anybody who’s working as a designer (I use that term quite broadly) in a commercial environment should understand how their work gets measured, if nothing else to make sure that what we do adds value and isn’t just vanity work.


Andy Thornton.


I think we sometimes really, really, and especially at the moment, overemphasize this idea of speaking the language of business, as if there’s a huge chasm between designer speak, let’s say, and everyone else.

And I actually don’t think that chasm’s as big as we suggest.

We’re all speaking the same language with different dialects. Basically, people have different areas of emphasis and areas of interest within the business that they want to hear echoed back from you as a designer, or as a researcher, or as someone who has a certain focus. You’ve got to make sure that the conversation is inclusive and respectful, and that it’s targeted and focused on the concerns of other people within the business.


I think the challenge is often in what gets measured and how we know that’s the right thing to get measured.

Over the years, I’ve seen many organizations I work with championing the numbers. There’s something that sort of just feels empirical and definite and factual when somebody puts some numbers onto something.

And I’m often the person sitting at those meetings thinking I’m not sure I have confidence in the numbers that are being presented. I don’t think the numbers have more credibility than the qualitative research that I might be doing on a project.

So I think there is a balance between being able to go into an organization and have a way of communicating the value that you add by the work that you’re doing.

But I don’t think numbers, in the form that they’re often presented, are the only way that that can be done.


I asked Andy: can user experience be measured?


Yes. Of course it can be measured.

Again, I think there’s this risk that measurement is just about mathematics. Like something that you can convert into numbers. I don’t think that’s the case.

Qualitative insights around how people feel, sentiments expressed through words that can’t be put into numbers, are still ways of measuring. It’s just a lot more complicated and harder to feed back what you’ve learned in that process of measurement, because actually it’s about insight.


Chris, same question. Can user experience be measured?


I think some parts of the user experience can be measured.

The challenge that I often have is what gets measured and why that gets measured.

So I think if you’re going to look at measuring the user experience, you shouldn’t be starting with Google analytics, for example, as a tool to do that.

I think generally if you’re looking to measure the experience of something, you should be looking beyond just the numbers that the tools can give you and being wider and more holistic in your view. And by doing that, you’re probably getting less precise with the numbers, but I’m okay with that.

I think if you’re looking to measure experiences, you should be looking to blend qualitative and quantitative data and a little bit of gut instinct into looking at an overall picture. And I think the challenge in many organizations is that, well, I think there are two challenges actually in most organizations.

And those challenges are around tools and culture. I think in many organizations, people start looking at the numbers that the tool they’re using can give them. The law of the instrument: if you’ve got a hammer in your hand, then very quickly the solution to everything is to get a nail and start bashing it.

And I think if your tools are just giving you things like bounce rates and read rates and visitor numbers, they’re the things that you start thinking are important, when really they’re just things that are quite easy for those tools to measure.


Right. If you torture the data long enough, you can get it to reveal anything.


And then, tied with that, I think there’s often in an organization, a cultural issue around measurement. And that is that organizations have invested in a project or invested in an initiative and they’re looking for good news. And then you get analysts searching around the numbers to find something that looks positive that they can report on.

And that’s very different from using numbers to inform your decisions and to look at the opportunities in the future that you might want to be investigating. And if you just start having that culture of "find me the good news in this" then the numbers just become a fashion parade.

And as I say, kind of with businesses, numbers have a lot of credibility. People are comfortable at board level and senior level with numbers. But I just occasionally think you should scratch at those numbers to see where they come from, how much credibility they have, how much confidence you have in those numbers.


You know, I always hear about designers needing to speak the language of business to get that coveted cliched seat at the table. But couldn’t business learn to speak the language of design?


There are definitely organizations I’ve worked with over the years where, at a senior level, people have been able to have really good conversations about experience. You know, I don’t think user experience is just held within the design world. There are plenty of organizations where, at a board level, they understand that their role is to create a customer and a positive experience is one of the ways to gain a customer and keep a customer. They are looking beyond just the spreadsheet and the numbers that are coming out of that.

Having said that I’ve also worked with organizations that are almost so shortsighted that the spreadsheet and the numbers on that spreadsheet is how they’re making their decisions.

And I think for organizations like that, the encouragement that I would give, and certainly the advice that I give to their design teams is to invite those senior stakeholders to come to qualitative research, to actually go and meet their customers, to walk the floor, to see what the experience of their product and service is.

Over many years and many times I’ve seen the managing director of a company literally knocking his head on a desk in a research lab, going: how did we create something so confusing? How did we create a product where none of our potential customers can understand the value in it?

Once you get senior board people out of their office, on the floor, seeing research carried out, meeting their customers, they understand that experience is not just something that’s reflected on a spreadsheet. You know, a spreadsheet is merely just another report. It’s just another input into having to ascertain how well your product or service is performing and where the opportunities are to increase the value that you’re adding from that.


One of the biggest nonfiction best sellers of recent years is all about numbers. The book Factfulness by the late great Hans Rosling is dripping with data. But Hans Rosling repeatedly says:

The world cannot be understood without numbers. And it cannot be understood with numbers alone.

Thanks to Chris How, Andy Thornton and Maite Otondo for taking the time to talk to me. You also heard from Fonz Morris, Stuart Frisby, Melissa Smith, Radhika Dutt, Teresa Torres and Vicki Tan who have all spoken at Clearleft events.

Thank you for listening.