How to deal with survey fraud

by Marie Boucher (Director, Consumer Insights at Honest Data)

When you put time and energy into designing an online survey, the last thing you want is a survey taker to fake their way through it. Even worse, an automated click farm or bot army can ruin your dataset by swamping the survey.


Some sources put the rate of fraudulent survey responses between 5% and 26%, and in a study where a few percentage points could mean the death of an idea, even 5% is critical. Additional research into the problem has claimed that 1 in 5 surveys contain fraudulent responses. These reports create distrust and cynicism among the market research community, our stakeholders, and our clients.


Tackling survey fraud can be daunting. On a small scale, individuals will cheat or lie to qualify for a survey they shouldn’t be taking in order to receive a reward. Or qualified respondents will retake a survey many times to increase their reward. On a much larger scale, organized groups or malicious software will automate multiple responses to pool incentives.


Identifying fraud in your survey


Here are some proactive steps for designing fraud-resistant surveys and spotting suspect responses:


  1. Repeat similar questions to check consistency. Asking for a respondent’s age or location at both the beginning and end of a survey and then matching the answers can indicate a valid response.
  2. Calculate the duration of each response from its start and end times to find responses completed in “impossible” timeframes. The average reader can process about 300 words per minute, so a 1200-word survey should take at least 4 minutes in most cases. Some survey tools also let you embed question timing so you can check for speeding on single questions or sections within the survey.
  3. Vary the order or scale of repeated battery, matrix, or grid-style questions. Questions with a consistent scale across a number of attributes or statements are often targets of “straight-lining” or speeding respondents who pick the same option all the way down the list. Switch the scale (e.g., 1 to 5 for several, then 5 to 1 for a few more) or vary the labels (agree-disagree for some, then important-not important for others) so you can flag straight-lined data as likely fraudulent.
  4. Open-ended comment responses can also provide indications of cheating. Signs of likely fraud include nonsensical typing (“qwe$@jnd”), plagiarized text (“Four score and seven years ago…”), repeated answers, and even text copied from your own survey questions.
  5. Block one computer or a single IP address from submitting multiple copies of a survey (often called “ballot stuffing”) using options in your survey platform or tool.
  6. Remove the “back” button so that respondents cannot return to earlier pages and resubmit once they have begun your survey.
  7. Also, some survey tools will allow you to validate personal information such as zip codes and phone numbers, which are often fabricated in fraudulent responses. 
  8. Specific to automated fraud such as bots, take advantage of reCAPTCHA or similar “I am not a bot” verification available in most survey tools.
  9. Inserting attention-checking questions with obvious answers can help with both negligent and fraudulent survey takers. People who are speeding through a survey to get the incentive, or automated bots, will often miss simple, obvious questions like “Select the word ‘blue’ from the following list of colors,” allowing you to flag invalid responses. 
  10. Try not to restrict the survey too much using skip, hide, or show logic. Fraudulent respondents find it easier to submit valid-looking answers if you’ve limited the answer options for them in advance. For example, suppose you ask (1) whether someone listens to music on their smartphone and (2) which app they use to do so. You may be tempted to show question 2 only to those who answered “yes” to question 1. If you instead show question 2 to everyone and include the answer option “I don’t listen to music on my smartphone,” you can compare the two responses for consistency.
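
Several of the checks above (steps 1, 2, 3, 5, and 9) can be run on exported response data after fielding. The following is a minimal Python sketch under stated assumptions, not a feature of any particular survey tool. It assumes your export includes start and end timestamps, an IP address, the age given at the start and end of the survey, the raw positions chosen in a grid question, and the answer to a “select blue” attention check. The Response fields, the flag names, and the helpers minimum_minutes, flag_response, and flag_all are hypothetical names chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List

# Reading speed quoted in step 2; adjust for your own audience.
WORDS_PER_MINUTE = 300


@dataclass
class Response:
    """One exported survey response (field names are assumptions)."""
    respondent_id: str
    ip_address: str
    started_at: datetime
    finished_at: datetime
    age_at_start: int          # age asked at the beginning (step 1)
    age_at_end: int            # age asked again at the end (step 1)
    grid_answers: List[int]    # option position picked on each grid row (step 3)
    attention_answer: str      # answer to the "select blue" question (step 9)


def minimum_minutes(survey_word_count: int) -> float:
    """Shortest plausible completion time, in minutes, from word count alone."""
    return survey_word_count / WORDS_PER_MINUTE


def flag_response(r: Response, survey_word_count: int, ip_counts: Counter) -> List[str]:
    """Return the list of warning flags raised by a single response."""
    flags = []
    elapsed_minutes = (r.finished_at - r.started_at).total_seconds() / 60
    if elapsed_minutes < minimum_minutes(survey_word_count):
        flags.append("speeding")
    if r.age_at_start != r.age_at_end:
        flags.append("inconsistent_age")
    # Same position on every row of the grid suggests straight-lining.
    if len(r.grid_answers) > 1 and len(set(r.grid_answers)) == 1:
        flags.append("straight_lining")
    if r.attention_answer.strip().lower() != "blue":
        flags.append("failed_attention_check")
    if ip_counts[r.ip_address] > 1:
        flags.append("duplicate_ip")
    return flags


def flag_all(responses: List[Response], survey_word_count: int) -> Dict[str, List[str]]:
    """Map each respondent_id to the flags raised by that response."""
    ip_counts = Counter(r.ip_address for r in responses)
    return {
        r.respondent_id: flag_response(r, survey_word_count, ip_counts)
        for r in responses
    }
```

A flag is a prompt for manual review rather than proof of fraud: a shared office network can trigger duplicate_ip, and a fast but honest reader can trigger speeding. Treating responses with several flags as the strongest candidates for removal is one reasonable design choice.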


The problem of survey takers fabricating responses is real, but high-quality design and implementation can go a long way towards providing answers you can trust. At Honest Data, we put a huge amount of work into supporting our clients’ data integrity. If you have questions about trusting your research, please drop us a line! We just love talking about our work. research@honestdata.com


Sources:

https://www.sciencemag.org/news/2016/02/many-surveys-about-one-five-may-contain-fraudulent-data

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2580502

https://www.pewresearch.org/methods/2016/02/23/evaluating-a-new-proposal-for-detecting-data-falsification-in-surveys/

Detecting, Preventing, and Responding to “Fraudsters” in Internet Research: Ethics and Tradeoffs: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4669957/

A Literature Review of Methods to Detect Fabricated Survey Data: https://www.researchgate.net/publication/265059305_A_Literature_Review_of_Methods_to_Detect_Fabricated_Survey_Data

