If you joined us last week, you’ll remember that we here at Cor.us are testing different ways of improving survey construction, with a particular focus on completion time. (If you missed it, here is the link to the previous post: Shorter Surveys.) In that post we looked at respondents’ age as a driver.
Today we’ll be looking at another critical factor in maximizing the attention of respondents: question design.
Using the same test dataset as before, we found four things worth keeping in mind:
- The most impactful driver of per-question completion time is density, as defined by the number of ‘cells’ a respondent must evaluate or score.
For instance, a single- or multiple-select list of 7 options has a density of 7, while a Likert matrix of 5 ratings across 12 attributes has a density of 60. Increasing density by 10%, 50%, and 100% leads to increases in completion time of +2%, +11%, and +22%, respectively. In the previous example, then, we would expect the 60-cell question to take an average of 184% more time to complete than the 7-cell question!
In some contexts, however, this can work to your advantage. Staying with the same example, it’s possible to capture roughly 8.5 times more data at a cost of just 3 times the incremental tax on respondent attention.
Takeaway: All else equal, condense the choices and options as much as possible without sacrificing the intent and purpose of the survey. Do you really need 7 choices in a Likert scale, or would 5 suffice? Is it vital that your respondents assess, say, 12 different attributes in a question, or would 9 suffice?
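For readers who want to play with the arithmetic, here is a minimal sketch (ours, not from the Cor.us analysis) of the density calculation, plus a naive linear extrapolation through the reported +10%/+50%/+100% data points. The post doesn’t specify its exact extrapolation model, so treat the second half as an assumption; it lands in the same ballpark as, but slightly below, the figure cited above.

```python
def density(options: int, attributes: int = 1) -> int:
    """Number of 'cells' a respondent must evaluate or score."""
    return options * attributes

# The two examples from the post:
single_select = density(7)      # 7-option select list -> density 7
likert_matrix = density(5, 12)  # 5 ratings x 12 attributes -> density 60

# Reported lifts: +10% density -> +2% time, +50% -> +11%, +100% -> +22%.
# A simple linear read of those points (our assumption, not the post's
# stated model) is ~0.22 extra time per 1.0 of relative density increase.
LIFT_PER_RELATIVE_INCREASE = 0.22
extra_time = LIFT_PER_RELATIVE_INCREASE * (likert_matrix / single_select - 1)

print(single_select, likert_matrix, round(extra_time, 2))
```

Swapping in your own option and attribute counts lets you sanity-check a draft question’s density before fielding it.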
- Efficiency in wording of both questions and choices leads to shorter survey completion times.
An increase of just 15 words in the question text leads to an increase in completion time of ~12%. Our analysis indicates that trimming words from the question text, as opposed to the text of responses, is the more efficient way to shorten average respondent time, though both have a material impact on response times.
Takeaway: Avoid verbosity. Take care to exercise parsimony with the number of words you use for the text of both questions and answers.
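If you want a rough rule of thumb, the reported figure works out to just under one percent of completion time per extra question word. The per-word, linear breakdown below is our simplification, not something the dataset directly establishes.

```python
# Implied by the post's figure: +15 question words -> ~+12% completion
# time. Treating that as linear per word (our assumption) gives ~0.8%
# of completion time per extra word.
PENALTY_PER_EXTRA_WORD = 0.12 / 15

def wording_multiplier(extra_words: int) -> float:
    """Estimated completion-time multiplier from extra question words."""
    return 1 + PENALTY_PER_EXTRA_WORD * extra_words

print(round(wording_multiplier(15), 2))  # recovers the reported ~+12%
```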
- Rating-type questions, such as scales from “strongly disagree” to “strongly agree”, save about 15% on completion time, even after controlling for dense matrices.
Presumably this is because the logic that differentiates each response option is both quicker to grasp at the outset and easier to retain in memory as a respondent evaluates their answer to the associated question(s). In matrices this effect can be much more pronounced. Compare, for instance, the simplicity of a Likert scale against a response set that includes the following options: “Not interested”, “I used to own this”, “I want to buy this”, “I already own this”, “I highly recommend this”. As the respondent works their way through the matrix rows they may need to refer back to these options to refresh their memory.
Takeaway: Commonly used scales and ratings are much less taxing on respondent focus than rare or unique alternatives. Whenever possible, design your questions with ordinal option sets.
- Surveys that switch back and forth between different question types (e.g. matrices, single- and multiple-choice, sliders) are less efficient from a completion-time perspective than staying with the same question type.
In fact, when a survey switches to a different question type, we can expect completion time for that next question to rise by an incredible 36%. The effect reverses when question types stay consistent: completion time for the 3rd question in a run of the same type drops by 10%, and for the 4th it drops by 35%. “Getting into a groove” as a respondent is very much predicated on question-type consistency.
Takeaway: Unless you are deliberately interspersing attention “speed bumps” to test your respondents’ focus, try to minimize switching between question types within your surveys. It can be tempting to make a survey more “creative” with question-type variety, but this is counterproductive from the standpoint of maximizing attention spans.
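To see how these effects compound across a questionnaire, here is a toy estimator (our construction, not the post’s model) that applies the reported +36% lift to any question following a type switch, and the reported -10% / -35% discounts to the 3rd and 4th-or-later questions in a run of the same type. We assume the discounts are relative to a baseline per-question time of 1.0.

```python
def estimated_time(question_types: list[str]) -> float:
    """Relative completion time for a sequence of question types,
    using the switch penalty and repetition discounts cited above."""
    total, run, prev = 0.0, 0, None
    for qtype in question_types:
        run = run + 1 if qtype == prev else 1
        if run == 1 and prev is not None:
            total += 1.36   # follows a type switch: +36%
        elif run == 3:
            total += 0.90   # 3rd in a row: -10%
        elif run >= 4:
            total += 0.65   # 4th or later in a row: -35%
        else:
            total += 1.0    # baseline
        prev = qtype
    return total

consistent = ["matrix"] * 6
alternating = ["matrix", "slider"] * 3
print(round(estimated_time(consistent), 2),
      round(estimated_time(alternating), 2))
```

Even on this toy model, six questions of one type cost noticeably less respondent time than six questions that alternate between two types.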
As we will do with all posts, we’re sharing the raw data (see the image below). As a reminder, these results are based on 10,000 completes from a random sample of US adults on a survey that was 45 questions long!