“You didn’t talk to enough people.” It’s a common objection we hear when conducting a design research program: that the number of people we spoke with can’t possibly be large enough to accurately inform an emergent design and innovation strategy.
Innovation has risk, and it’s natural to want to mitigate that risk. Most companies try to understand and prevent innovation risk through research, and a fundamental principle of most good scientific or marketing research is to select a sample that’s statistically representative. This means recruiting a small sample of research respondents who map appropriately to the larger population and are indicative of the opinions and behavior of that larger set of people. We’ve learned that “good research” requires a big sample size, and even if we don’t know the math behind the statistics, we recognize that a tiny sample is unlikely to be valuable in making large statements. Good research means that the people you talk to represent the people you didn’t talk to.
But that’s a bad model for design research. The rules for what constitutes a good research methodology in the social sciences, medicine, and other fields are simply inadequate for research that intends to inform design strategy. This is a challenge, because these other fields – the social and hard sciences – have a longer and more established history of best practice for experimental research. Small-sample behavioral research runs counter to these traditions and norms. And small-sample research runs counter to our common sense, too. The world is big, and it seems crazy to assume that we can make decisions about new products and services for millions based on input from a dozen.
For us to embrace the process of innovation, we need to embrace a different (yet still rigorous) process of research. Design research is not sloppy. It’s exhaustive and exhausting, has a method that takes time and care to learn and execute, and can be done well or poorly. But it’s fundamentally a different form of research, and expectations around it should be different. The most noticeable change is in sample size and selection criteria.
A well-designed survey relies on a few ingredients. First, the experimenters have made a hypothesis about causal behavior. Pretend that you are an education researcher, and you are interested in understanding why students quit college before they’ve earned a degree. You’ve observed one of your neighbor’s kids drop out of school after she had a child, and you hypothesize that “students who have a child while in school are more likely to drop out than students who don’t have a child while in school.”
A survey can act as a way of gathering data to explore the validity of your hypothesis. You could send a survey to every single college student (there are about 21 million of them) and ask them if they had a child, and if they dropped out. Then, you would see real data that could indicate the relationship between having children and completing degrees.
If you could do this, you would be able to say things like “College students who have children in college are 13% more likely to drop out than those who don’t.” This doesn’t indicate causality – that having kids leads to dropping out – but it starts to give good evidence that there is a connection between those ideas.
But sending a survey to every college student is, in practice, impossible. You would have to have a list of every student and their address; send them the survey, at your expense; retrieve it – again, at your expense; and aggregate the data. By the time you had your list, the students would have graduated and new ones would have started again.
At the heart of a statistical survey is the idea that you can distribute a survey to a smaller but representative sample of a population, rather than to all of the members of that population; you can send a survey to a small group of students, rather than all 21 million. You could then make extrapolations from what you found from this smaller set, and make statements like “We are 95% sure that college students who have children in college are 13%, plus or minus two percentage points, more likely to drop out than those who don’t.”
The statement has a hedge, based on the fact that you talked to a smaller sample. You aren’t 100% sure, because you didn’t canvass every student; you are only 95% sure. And you can’t say 13% definitively; instead, you are giving a range of 11-15%.
That hedge buys you enormous cost savings in participant recruitment.
To say “We are 100% sure it’s 13%”, you would need to survey 21 million students.
To say “We are 95% sure it’s 11-15%”, you only need to talk to 600.
So the basics of research depend on a hypothesis, a sample of participants who are randomly selected, and some number crunching to identify how sure you are that the people you talked to represent the opinions of everyone else. These are the statistical underpinnings of most academic survey-based research, and these are the ideas that most people have in their heads, in some rudimentary form, when you say “I did some research.” Even if they don’t know the math behind causality and correlation, most critical thinkers understand that you can talk to a small group to understand the behavior of a larger group – and that the small group can be small, but can’t be tiny. 600 people are fine; 20 are probably not.
Fundamental to this idea is this basic premise: We talked to a small group of people, to predict what a large group of people do, think, or feel.
But design research is about provocation. In design research, we’re not talking to small groups of participants to understand large swaths of people. Our research is not intended to mitigate risk, or to temper anxiety and add assurances of validity.
Our research has a single goal: to provoke new design ideas.
We spend time with participants to understand them specifically, rather than generally. In design research, a single person is a single person, not a stand-in for a whole population of other people. They have opinions, behaviors, and experiences that are unique to them, and our intent is to learn and feel those experiences. When we conduct research with participants, we hear their stories. We learn their eccentricities, and begin to get a glimpse into the peculiarities and extremes of their lives. We see how they use products and services, what inspires them, what they are trying to accomplish, and how they hope to grow; we build empathy with them.
And this gives us “raw data” that we can then use to inspire new design questions and opportunities, new frames of reference for brainstorming, new strawmen for creative exploration, and new ways of thinking outside of what has been considered the norm of best practice.
The data makes us think and reconsider our entrenched beliefs. It makes us contemplative, and helps us see the world in new ways.
We conducted research with college students, and heard stories from people like Rita, who has a 14-year-old, a 12-year-old, a 10-year-old, an 8-year-old, and a 1-year-old. Her day is spent caring for her kids and working; her course work begins at 10pm, once the kids go to bed, and she works online until about 2 or 3 each morning.
We learned about Maria, whose parents are actively discouraging her from going to school because they want her to get married and have children.
We heard from Haley, who didn’t actually have a child, but instead dropped out to take care of her sister’s newborn – to support her sister, while sacrificing her own immediate needs and dreams.
These stories are rich with emotional value, and help us build a shape and structure for the problem space. I have no idea if Maria, Rita, and Haley are common or complete outliers. It’s unlikely, but perhaps there are literally no other college students in the US who, like Rita, are raising kids and studying through the night, or who, like Haley, dropped out to help their family, suspending their own education.
The thing is, I don’t care.
My research, my research findings, and my synthesized research insights never make any mention of how Rita and Haley exemplify a larger population of students. We didn’t conduct research to make extrapolations; we conducted it to understand and feel. And so sample size and statistical significance don’t matter at all.
This is a challenge. Design research uses the same terminology – participants, research plan, hypothesis – as traditional behavioral science research, and so it’s nearly impossible to shed the baggage of that research protocol. Through a lens of behavioral science research, a small, non-randomly selected sample calls into question any findings, because it (correctly) identifies that the research is flawed in predicting how a larger group will behave, or what a larger group thinks and feels.
But if the research is rigorous – if design research was well planned, well conducted, and well captured – then the findings are valid, because the findings are simply observations around the population of participants.
In this model, where sample size doesn’t matter because we aren’t making any extrapolations or predictions, anomalies and extremities are valuable. The weirder the participant, the better, because it provides so much new fodder and raw material for the researcher. When a researcher encounters a completely eccentric and obviously bizarre participant, it provides a chance for the researcher to stretch their understanding of the problem space.
When we spoke to a young banking customer, they didn’t understand and couldn’t articulate the difference between a savings and checking account.
When we talked to a college student about her debt, she explained that she was counting on her college loans being forgiven by the government – and, without that, she had no real strategy to repay the thousands of dollars she owed.
When we asked a millennial about life insurance, she explained that she had been sold both a term and whole life policy by a friend of her family, and was paying thousands of dollars a year in premiums – but she had no dependents, no investments, and was having trouble paying rent each month.
In these three examples, the participants or their experiences are anomalous, and that’s the point. Their strange stories and experiences have helped the design team see the problem space of banking, or debt, or insurance differently.
When we consider design research, we need to stop judging the validity of the program on the size of the sample or the bias of the participant selection. Instead, let’s look at the methodology and the rigor and style of the data collection, and use a lens of provocation. We can judge design research based on these questions:
- Were the participants given the opportunity to be experts? Being an expert doesn’t mean that they have degrees or credentials; it means that the researchers gave them the runway to voice their expertise. A mother of five is an expert at being a mother. A television aficionado is an expert at watching television. A good research program should place the participants in a context where they can experience and exhibit their expertise.
- Were the participants given methods and tools to communicate their expertise? Expertise is often tacit – people can’t always articulate the things they are good at, because those activities have become second nature. A good research program should give participants specific ways to vocalize, or more importantly, show, their expertise, even when they can’t consciously communicate it.
- Were the facilitators encouraging, yet neutral? A good design research moderator encourages participants to communicate their ideas and feelings while remaining completely neutral, avoiding judgment, criticism, or opinions. Participant communication is drawn out through open-ended, non-agenda-driven questions.
These are the questions that are relevant in assessing and identifying good design research. When you are challenged on the validity of your findings, make it explicit that the participants were supported and encouraged to show how their work is done and how their life is lived, and that their work will inspire you to make new products and services.