Including measures to ensure that subjects are paying attention, such as instructional manipulation checks, or at least verification questions that filter out subjects who don’t understand the task, is standard practice when running crowdsourcing experiments. Such measures act as stop-gaps against the tendency toward satisficing in environments like AMT. Although working for rewards encourages reasoned, accurate answers, a worker who finds a heuristic that yields “accurate enough” responses can complete more HITs faster, optimizing their rewards.
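A minimal sketch of the filtering step described above, in Python: responses from workers who fail an instructional manipulation check are dropped before analysis. The field names (`imc_answer`, `worker_id`, `rating`) and the expected check answer are illustrative assumptions, not details from any particular study.

```python
# Sketch: filtering HIT responses by an instructional manipulation check (IMC).
# Field names and the expected answer are hypothetical, for illustration only.

IMC_CORRECT = "continue"  # the answer a careful reader of the instructions would give

def passes_imc(response):
    """A worker passes if their IMC answer matches the instructed one."""
    return response.get("imc_answer", "").strip().lower() == IMC_CORRECT

def filter_responses(responses):
    """Split responses into those that passed the check and those filtered out."""
    kept = [r for r in responses if passes_imc(r)]
    dropped = [r for r in responses if not passes_imc(r)]
    return kept, dropped

responses = [
    {"worker_id": "A1", "imc_answer": "continue", "rating": 4},
    {"worker_id": "A2", "imc_answer": "blue", "rating": 5},  # satisficed: answered the decoy question
]
kept, dropped = filter_responses(responses)
```

In practice a researcher might also log the dropped responses, since systematic failure patterns can reveal whether the check itself is confusing.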
Such checks may be effective in many cases, and in others the need for them can be avoided altogether by methods that strategically combine answers into an accurate aggregate signal. But there are cases where including them is not enough to bring about the desired quality of responses. Worse, these checks can introduce new problems of their own, as when verification questions interrupt the flow of a worker’s thinking or anchor a worker’s subsequent responses.
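One common instance of combining answers into an aggregate signal is simple majority voting over redundant labels. The sketch below, with illustrative data, shows the idea; real aggregation schemes often weight workers by estimated reliability instead.

```python
# Sketch: aggregating redundant worker labels by majority vote, one way to
# recover an accurate signal without per-worker checks. Data are illustrative.
from collections import Counter

def majority_vote(labels_by_item):
    """Return the most common label for each item (ties broken arbitrarily)."""
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in labels_by_item.items()}

labels = {
    "img_01": ["cat", "cat", "dog"],   # two of three workers agree
    "img_02": ["dog", "dog", "dog"],   # unanimous
}
consensus = majority_vote(labels)
```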
Dual-process accounts of cognition suggest that people can process incoming information intuitively, automatically, and relatively effortlessly, or they can apply more deliberative, systematic, and analytical thinking to reach a decision. In a recent paper (Hullman, 2011), I discuss how evidence from psychological experiments that manipulate task stimuli to encourage one reasoning style over the other can be integrated into crowdsourced research to improve the quality and validity of experimental results. A central challenge of experimentation on AMT is how to activate the kind of systematic thinking that leads to better responses, and possibly more skilled workers in the long run, while still creating HITs that workers want to do!
One technique for inducing active, systematic cognitive processing of presented task information (such as images or text) is to integrate harder-to-parse stimuli at key points in the task. Harder-to-read fonts, for example, have been shown to increase recall and comprehension of textual information (Alter & Oppenheimer, 2009), as has using cognitively “costlier” legends rather than direct labels on graphs (Shah et al., 2011). While subjects may perceive such stimuli as more difficult, they disrupt cognitive processing of the target task less than verification questions or instructional manipulation checks do.
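In a HIT template, a disfluent-font manipulation can amount to serving the key task text in a harder-to-read style for one condition. The sketch below builds such a stimulus; the specific font, colors, and template are assumptions for illustration, not the manipulations used in the cited studies.

```python
# Sketch: rendering instruction text in a fluent vs. disfluent (harder-to-read)
# style for a between-subjects HIT condition. Styles are illustrative assumptions.
FLUENT_STYLE = "font-family: Arial, sans-serif; color: #000;"
DISFLUENT_STYLE = "font-family: 'Comic Sans MS', cursive; color: #666; font-style: italic;"

def render_instructions(text, disfluent=False):
    """Wrap instruction text in a styled span; disfluent=True gives the hard-to-read condition."""
    style = DISFLUENT_STYLE if disfluent else FLUENT_STYLE
    return f'<span style="{style}">{text}</span>'

html = render_instructions("Compare the two charts and report which trend is steeper.",
                           disfluent=True)
```

Assigning workers at random to the fluent or disfluent variant would then let the researcher test whether the added difficulty improves response quality on the target task.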
Motivation, the self-directed component of psychological activity, can both strengthen and balance the effects of these “desirable difficulties.” Active, engaged processing of information can be induced by increasing a reasoner’s desire to engage with the content. What if researchers devoted more attention to creating aesthetically pleasing or personalized HITs? Doing so could strike a balance: increasing a worker’s motivation and enjoyment of the task while simultaneously introducing cognitive difficulties that reduce the likelihood of error-prone automatic reasoning.
Hullman, J. (2011). Not All HITs Are Created Equal: Controlling for Reasoning and Learning Processes in MTurk. Position paper, ACM CHI 2011, Vancouver, BC.
Alter, A. L., & Oppenheimer, D. M. (2009). Uniting the Tribes of Fluency to Form a Metacognitive Nation. Personality and Social Psychology Review, 13(3), 219-235.
Shah, P., Miyake, A., & Freedman, E. (2011). Are Labels Really Better Than Legends? The Effects of Display Characteristics and Topic Familiarity on the Comprehension of Multivariate Line Graphs. Working paper in preparation.