Sisi Peng and Neeku Salehi

Tips For Scholars: Detecting Bots During Social Media Recruitment

As part of the follow-up study to the Media & Teen Mental Health project, our team used social media to recruit participants, posting digital flyers on platforms such as Twitter, Facebook, Instagram, TikTok, and Reddit. After sharing the recruitment survey on Facebook, we received a substantial surge of interest in our research study. Although this was initially exciting, further inspection revealed that over 90% of these new responses were from bots (computer programs that automatically complete surveys). We discarded these responses, but they provided valuable insights into bot behavior and activity.

We noticed questionable patterns and developed guidelines to determine the likelihood of a bot response. We decided to discard a participant's response if it raised red flags in at least 3 of the following 5 categories: location, name/email address, timing, demographics, and repeat answers.

Location:

  • Since the study was intended only for U.S. participants, IP addresses located outside of the U.S. were red flags

  • Multiple submissions from the same IP address were another sign of fraudulent activity

  • The exact same longitude and latitude on multiple unrelated responses was suspicious

  • Location data that did not match the self-reported location provided by the respondent (IP address, area code, and/or state) raised further concern

Name/email address:

  • Mismatches between respondent name and email indicated potential fraud (for example: John Smith had the email address “chadroberts123@gmail.com”)

  • Too many numbers or random letters in the handle seemed dubious (for example: “jd14780791@gmail.com”) 

  • Mismatches between respondent name, gender, and email suggested possible deceit (for example: Christina Le who reported as female had the email address “zackarymaradv49@gmail.com”)

  • Inconsistencies with parent/child names and corresponding email addresses generated skepticism (for example: Jared Murray signed up as a parent with the email address “abinbayaravichandrann@gmail.com” and his child Alen Henderson’s email address was “glenlishn@gmail.com”)

Timing:

  • Genuine survey responses took 5 minutes on average, so surveys that were completed much more quickly or slowly were flagged for additional investigation (for example: some surveys were completed in 12 seconds and some took over 50 minutes)

  • Multiple completed surveys in a row with the same timing were marked for further scrutiny

  • Note: bots can learn, so be on the lookout for long response times that suggest a single bot is learning

Demographics:

  • Improbable demographics revealed potential imposters (for example: the percentage of Native American respondents was higher than the national average)

  • Note: keep in mind the characteristics of the local population that you are recruiting from

Repeat answers:

  • Same phrasing on different survey responses signaled possible bot activity
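The "at least 3 of 5 categories" rule above can be sketched in code. This is only an illustration, not the screening procedure our team actually ran: the `Response` fields, the cutoffs (`MIN_SECONDS`, `MAX_SECONDS`, `MAX_DIGITS_IN_HANDLE`), and the crude name-in-email check are all assumptions, and the demographics flag is assumed to be set by a human reviewer.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical field names; adapt them to your own survey export.
    name: str
    email: str
    ip_address: str
    ip_country: str        # country resolved from the IP address
    ip_state: str          # state resolved from the IP address
    reported_state: str    # state the respondent typed in
    duration_seconds: float
    free_text: str
    improbable_demographics: bool = False  # assumed to be set by a reviewer


# Assumed cutoffs, loosely based on the 12-second and 50-minute examples.
MIN_SECONDS = 60
MAX_SECONDS = 50 * 60
MAX_DIGITS_IN_HANDLE = 5
FLAG_THRESHOLD = 3  # discard when at least 3 of the 5 categories are flagged


def _normalize(text):
    """Lowercase and collapse whitespace so near-identical answers match."""
    return " ".join(text.lower().split())


def criteria_met(resp, all_responses):
    """Return the set of categories in which this response raises a flag."""
    ip_counts = Counter(r.ip_address for r in all_responses)
    text_counts = Counter(
        _normalize(r.free_text) for r in all_responses if r.free_text.strip()
    )

    handle = resp.email.split("@")[0].lower()
    # Crude heuristic: does any name part (3+ letters) appear in the handle?
    name_parts = [p for p in resp.name.lower().split() if len(p) >= 3]
    name_in_handle = any(p in handle for p in name_parts)
    digit_count = sum(ch.isdigit() for ch in handle)

    flags = set()
    if (resp.ip_country != "US"
            or ip_counts[resp.ip_address] > 1
            or resp.ip_state != resp.reported_state):
        flags.add("location")
    if not name_in_handle or digit_count > MAX_DIGITS_IN_HANDLE:
        flags.add("name/email")
    if not MIN_SECONDS <= resp.duration_seconds <= MAX_SECONDS:
        flags.add("timing")
    if resp.improbable_demographics:
        flags.add("demographics")
    if resp.free_text.strip() and text_counts[_normalize(resp.free_text)] > 1:
        flags.add("repeat answers")
    return flags


def should_discard(resp, all_responses):
    """Apply the 3-of-5 rule described above."""
    return len(criteria_met(resp, all_responses)) >= FLAG_THRESHOLD
```

Many genuine respondents have email addresses that do not contain their names, so a check like this should only ever count as one flag among several and should be paired with a manual review.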

While online study recruitment has become increasingly popular, there is a large risk of bot interference and researchers need to be more aware of the problem. Jennifer Doty, PhD, CFLE from the Department of Family, Youth, and Community Sciences at the University of Florida explains her experience with bots:

“Last spring, we collected prescreener data online to interview youth from a variety of racial and ethnic groups. We launched our internet search via Facebook and online listservs. At first, we had a trickle of interest, but on April 20th we had an explosion of interest. Upon examination, we could see that the data was generated by a bot—the emails were strange and some were repeated, and we had about 100 times the number of American Indians we would expect. This was also the day that Derek Chauvin’s jury reached a verdict. We suspect that bots were especially active on a day where racial tensions were high. After this, we researched strategies to identify mischievous responders and included them in our next grant proposal. To ensure validity of participants and avoid mischievous responders, in our next project, we will include ReCAPTCHA technology screening and track IP addresses. In addition, we will include open-ended questions, which bots often leave empty. We will also include up to four screening questions that will help us flag mischievous responders. For example, we will require youth to match the year to the age they report and validate their age at the time of a recent event in history. Another strategy is including questions like, “Does the earth move around the sun?” These strategies have been used in previous studies to validate online samples.”
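The age-based screening questions Dr. Doty describes can be checked automatically. The sketch below is one possible way to do it, not her team's implementation; the function names, the parameters, and the one-year tolerance are assumptions made for illustration.

```python
from datetime import date


def age_matches_birth_year(reported_age, birth_year, survey_date):
    """Check that a reported age is possible for the reported birth year.

    Depending on whether the birthday has already passed this year,
    a respondent's age is one of two consecutive values.
    """
    age_after_birthday = survey_date.year - birth_year
    return reported_age in (age_after_birthday - 1, age_after_birthday)


def age_at_event_is_consistent(reported_age, answered_age_at_event,
                               event_year, survey_date):
    """Check the respondent's stated age at a past event against their age now.

    A one-year tolerance (an assumption) allows for birthdays that fall
    between the event and the survey date.
    """
    years_since_event = survey_date.year - event_year
    expected_age = reported_age - years_since_event
    return abs(answered_age_at_event - expected_age) <= 1


# Usage with made-up values: a 15-year-old surveyed in April 2021.
survey_day = date(2021, 4, 20)
print(age_matches_birth_year(15, 2006, survey_day))
print(age_at_event_is_consistent(15, 10, 2016, survey_day))
```

A response that fails either check would be flagged for review rather than discarded outright, since honest respondents also make arithmetic mistakes.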

Celeste Campos-Castillo, PhD, an Associate Professor from the Department of Sociology at University of Wisconsin-Milwaukee also shares strategies for identifying bots:

“Other useful tips come from the world of researchers using crowdsourcing platforms, such as Amazon Mechanical Turk, to recruit respondents. These platforms provide researchers access to thousands of workers who complete tasks, including surveys, in exchange for payment. Unfortunately, the average payment is notoriously below minimum wage standards, leading workers to seek ways to complete as many tasks with little effort. This includes using bots to complete tasks automatically and virtual private servers (VPS) to sidestep IP address requirements (e.g., the survey prevents multiple submissions from the same address or requires that the address comes from a specific geographic region). Numerous papers document the problem with these platforms and provide solutions, so here are a few. One set of solutions plants questions that only a human responder who is paying close attention could complete correctly. Examples include questions directing the respondent to select a specific response option (e.g., “Select the neither agree or disagree option in order to proceed”) and asking confirmation of statements that could not possibly be true of anyone (e.g., “I have conducted business with the country of Latveria”). Another set embeds technology within the survey to aid in detection, such as protocols to detect a VPS.”
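The planted questions Dr. Campos-Castillo describes are straightforward to score once the survey is exported. The following is a minimal sketch under assumed names: the item keys (`instructed_item`, `bogus_item`) and the answer codes are hypothetical and would need to match your own survey.

```python
# Hypothetical answer key: the item names and answer codes are assumptions.
EXPECTED_ANSWERS = {
    "instructed_item": "neither",  # "Select the neither ... option in order to proceed"
    "bogus_item": "no",            # a statement that cannot be true of anyone
}


def failed_attention_checks(answers):
    """Return the planted items that were answered incorrectly or skipped."""
    return [
        item
        for item, expected in EXPECTED_ANSWERS.items()
        if answers.get(item) != expected
    ]


# Usage with made-up answers: this respondent missed the instructed item.
print(failed_attention_checks({"instructed_item": "agree", "bogus_item": "no"}))
```

Treating a skipped item as a failure is a design choice here; some researchers may prefer to review skipped items separately.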

Ultimately, bots threaten the study sample and data quality. If you notice a sudden spike in sign-ups while recruiting research participants online, thoroughly check for inconsistent responses, improbable responses, and unusual comments. Including a reCAPTCHA at the beginning of the recruitment survey and adding common-sense questions are also recommended techniques to deter and identify bots. Before posting recruitment materials on an online platform, conduct a quick search for recent bot activity.

It is critical to detect and prevent bots from infiltrating participant responses. Bots harm research design and methodology by creating inaccuracies and skewing data. Utilizing the methods listed above could help prevent unreliable and invalid research findings and further your efforts to generate meaningful data.

Sisi Peng

CSS Fellow

Neeku Salehi

CSS Intern

representation Maryam Kia-Keating, Ph.D.

The Unbearable Invisibility of Being MENA in the Media

Growing up in Hawaii, despite its beautiful, multicultural communities, there was rarely a person around me that was Middle Eastern North African (MENA). My Iranian immigrant family practically took off sprinting after anyone if we heard even an inkling of Farsi spoken, just so that we could say hello. It was that rare and that coveted.

Decades later, those same combined, complicated feelings of yearning, heartache, and gratitude still wash over me when I find any media representation whatsoever that positively represents Persian culture. That’s why I was immediately diverted from my piled-up to-do list when I came across an Instagram video post of Britney Spears saying “Asheghetam” (“I love you” in Farsi) to Sam Asghari, her long-time boyfriend and now fiancé, who happens to be Iranian-American.

Representation of Iranians, or anyone with MENA heritage, has historically fallen short in Hollywood. Portrayals are often limited to painfully stereotyped characters, which Meighan Stone, former President of the Malala Fund, described as “negative, violent, and voiceless” in her report for the Harvard Kennedy School. In fact, her study of a 2-year period, from 2015 to 2017, found that there was not a single news story that highlighted positive coverage over negative coverage of Muslim protagonists.

Similarly, Jack Shaheen’s book Reel Bad Arabs: How Hollywood Vilifies a People analyzed 1,000 films across more than 100 years of filmmaking (from 1896 to 2000) and found that a whopping 93.5% offered negative portrayals, while 5% were neutral and a sad minority of only 1% were positive. A recent study by the MENA Arts Advocacy Coalition found that 242 primetime, first-run scripted TV and streaming shows from 2015 to 2016 underrepresented MENA actors. When primetime TV shows did include MENA characters, a majority (78%) were depicted as terrorists, tyrants, agents, or soldiers, most of whom spoke with an accent.

MENA actors who break through MENA stereotypes are often still hidden and invisible in terms of their MENA identity. Among those with Iranian-American heritage are Yara Shahidi, Sarah Shahi (birth name Aahoo Jahansouzshahi), Adrian Pasdar, and others who are often cast as characters with another, non-MENA ethnic background, such as Black, Latinx, or Italian American (which sometimes coincides accurately with their own mixed heritage, but does not reflect their MENA side).

Not enough has improved, but there are inklings of potential progress. Although the intriguing plan to launch a comedy about a Middle Eastern family of superheroes has yet to bear out, the TBS sitcom Chad made it on air after five years in development. Chad is about a teenage boy named Ferydoon “Chad” Amani, a 14-year-old Iranian-American played by Nasim Pedrad of Saturday Night Live.

In an interview with the Hollywood Reporter, Nasim Pedrad sums up how her personal experience motivated her vision for the show:

“When I was growing up, I did not see a half-hour comedy centered around, you know, a Middle Eastern family let alone specifically, a Persian one. In fact, so much of the representation of Middle Easterners on TV that I did see was predominantly negative, which was very alienating. I didn’t see Persian people on TV that seemed anything like the Persian people that I was surrounded by, not just in my family, but in my community. I didn’t understand. I was like, ‘Why are Middle Eastern people on American television only bad guys?’ Like what about those of us living here that are just like the rest of you, except for the specific cultural elements that we still celebrate and hold onto. So my hope is that people watch the show and actually can recognize that yes, this family is Persian American, but hopefully they can tap into just how many similarities we all have and how much we all have in common.” 

Psychologists and other scholars substantiate the importance of representation. The failure to move past stereotyped, negative roles for a majority of MENA characters is deeply harmful. It contributes to what my colleagues and I described as a cumulative racial-ethnic trauma for MENA Americans, in an article published in the American Psychologist. MENA Americans live with chronic and pervasive experiences of hypervisibility related to negative portrayals, and utter invisibility when it comes to featuring the positive, or even just the normal. These chronic, subtle, and sometimes overt messages of hate build up, contributing to insecurity, alienation, and hopelessness, and ultimately affecting physical health and mortality.

In contrast, media portrayals that affirm the ways in which MENA and other diverse communities are interconnected, loving, and share common values, hopes, and dreams can benefit children’s mental health and well-being. They matter to creating a society that has compassion and empathy and embraces the many strengths that diversity brings.

Actionable Insights

  1. Do your homework. Watch and read authentic stories. Examples in the media include Anthony Bourdain’s visit to Iran on Parts Unknown and Brandon Stanton’s trips with his camera to Iran and other countries, which allowed his loyal HONY following to connect with the universality of human struggles and triumphs across borders.

  2. Represent rich complexity, identities, and varieties. Feature MENA characters in television and film with non-stereotyped characteristics and roles. Pay attention to details such as accents, religious beliefs, immigrant generation, sexuality, and gender roles that perpetuate negative stereotypes, are often inaccurate, and do not represent the diversity within the MENA community.

  3. Involve insiders. Involve MENA Americans in content creation to ensure authenticity of stories and characters. CSS Collaborator, Sascha Paladino, and his team offer a lovely model of inclusion and authenticity in Mira, Royal Detective, a Disney Junior show featuring a South Asian protagonist.

  4. Amplify capable, compelling, desirable representations. Amplify MENA stories that represent the many societal contributions MENA Americans make. Oftentimes, when someone with a MENA heritage does something well, their race/ethnicity is suddenly invisible from the story, and may not even be reported.

  5. Increase the sheer number of characters. Increase the MENA American characters in children’s programming. At only about 1%, there’s no place to go but up.

  6. Be accurate about identities. Accurately and authentically depict MENA actors as MENA characters (or, when relevant to their actual background and not in conflict with the storyline, as characters who uphold their mixed heritage). Likewise, when characters are supposed to have MENA heritage, as in Prince of Persia or Dune, hire MENA actors.

Maryam Kia-Keating, Ph.D.

Professor at the University of California, Santa Barbara

Collaborator of CSS

2020 Sophie Graham

Sam, 13

Favorite Media/Technology: Xbox

How do you and your family interact with media/technology?

We watched sports together before COVID and now we sometimes watch movies together or play games against one another. My family uses Houseparty to play online games together.

How do you and your peers interact with media/technology?

I play Fortnite and talk to my friends on Xbox. On TikTok I send my friends funny videos. I also use Snapchat for streaks.

What do you use media for?

I use social media for anything sports-related. I also use TikTok and YouTube to search up funny videos or sports highlight reels. I play Fortnite and sports games on Xbox.

What is your favorite/least favorite thing about media/technology?

Media is fun, but I don’t like the news.

What media are you using more now because of the coronavirus (COVID-19)?

I used more technology once the pandemic began, because I began doing all my school work online and sports were cancelled. I hung out with my family and played games and kept up with friends through video games. Now that sports are allowed, I don’t use technology as much as I did.

Interviewed by: Sophie Graham, University of Cincinnati student
