One of the creators of the Botometer, a web tool Elon Musk used to estimate Twitter’s spam percentage for a court filing, has reportedly said that Musk’s calculation “means nothing.” Kai-Cheng Yang, a Ph.D. candidate at Indiana University, “questioned the methodology used by Mr Musk’s team and told the BBC they did not contact him before using the tool,” a BBC article said today.
An Aug. 4 Musk court filing claimed that a Botometer analysis of Twitter Firehose data in the first week of July “showed that fake or spam accounts accounted for 33 percent of visible accounts during that period.” But as Yang pointed out, the Botometer returns scores from 0 to 5, with 5 being the most bot-like, and Musk’s court filing didn’t say where he drew the line between human and bot.
“To estimate the prevalence [of bots] you have to choose a threshold to cut the score,” Yang told the BBC. “If you change the threshold from three to two, you’re going to get more bots and fewer people.” Because Musk’s court filing “doesn’t make details clear, [Musk] is free to do as he pleases. So the number doesn’t mean anything to me,” Yang said.
“Technically, you can choose any threshold you want and get any result you want,” Yang said in a previous interview with Yahoo. The Botometer is a project of the Observatory on Social Media and the Network Science Institute at Indiana University.
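Yang's point about thresholds can be illustrated with a minimal sketch. The scores below are invented for illustration and are not Botometer output or data from Musk's filing, the helper name `bot_share` is hypothetical, and the rule that a score at or above the cutoff counts as a bot is an assumption, since the filing does not say how the line was drawn.

```python
# Hypothetical scores on a 0-to-5 scale; invented for illustration only,
# not real Botometer output and not data from the court filing.
scores = [0.4, 0.9, 1.3, 1.8, 2.2, 2.6, 3.1, 3.6, 4.2, 4.8]


def bot_share(scores, threshold):
    """Return the fraction of accounts scoring at or above the threshold.

    Treating "at or above" as the bot label is an assumption made here.
    """
    flagged = sum(1 for s in scores if s >= threshold)
    return flagged / len(scores)


# The same sample yields a different "bot percentage" for each cutoff.
for threshold in (2, 3, 4):
    share = bot_share(scores, threshold)
    print(f"threshold {threshold}: {share:.0%} labeled as bots")
```

On this made-up sample, lowering the cutoff from three to two raises the share labeled as bots from 40 percent to 60 percent, even though no score changed.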
Botometer rated Musk as a likely bot
The Botometer itself “indicated that Elon Musk’s own Twitter account was likely a bot and gave it a 4/5 rating,” Twitter noted in a court filing. Musk’s Botometer score has reportedly varied between 0.5 and 4, meaning the tool rates Musk as human-like on some days and bot-like on others.
Twitter also noted that Musk and his team “have not stated what score they apply to conclude that an account constitutes spam; hence their claim is unverifiable.” Twitter further noted that an account can be a bot without being what the company considers a fake or spam account. Twitter gave examples such as bots “that report earthquakes as they happen, or weather updates.”
Other types of legitimate accounts can also be rated as likely bots by the Botometer. The Botometer gave my own verified Twitter account a 3 out of 5 bot rating today, and it gave the Ars Technica verified account a 3.6 out of 5.
The Botometer website FAQ warns against treating every account above a certain score as a bot. “It’s tempting to set an arbitrary threshold and consider anything above that number a bot and anything below a human, but we don’t recommend that approach… We think it’s more informative to look at the distribution of results across a sample of accounts,” the FAQ says.
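As a rough sketch of what looking at a distribution could mean in practice, the snippet below groups scores into one-point bins instead of applying a single cutoff. The scores are again invented for illustration, and the function name `score_histogram` and the bin width are assumptions, not anything the FAQ prescribes.

```python
from collections import Counter

# Hypothetical scores on a 0-to-5 scale; invented for illustration only.
scores = [0.4, 0.9, 1.3, 1.8, 2.2, 2.6, 3.1, 3.6, 4.2, 4.8, 0.7, 3.0]


def score_histogram(scores, bin_width=1.0, max_score=5.0):
    """Count how many scores fall in each bin of the given width.

    A score exactly equal to max_score is placed in the last bin.
    """
    n_bins = int(max_score / bin_width)
    counts = Counter(min(int(s / bin_width), n_bins - 1) for s in scores)
    return [counts.get(i, 0) for i in range(n_bins)]


# Print a simple text histogram, one row per one-point bin.
for i, count in enumerate(score_histogram(scores)):
    print(f"{i}-{i + 1}: {'#' * count} ({count})")
```

A reader can then see how many accounts sit near the middle of the scale, where a human-or-bot label is least certain, rather than getting a single percentage.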
Yang was surprised that Musk hadn’t developed a better tool
Yang also spoke to CNN recently and expressed surprise that Musk used the Botometer instead of creating something more precise. “To be honest, you know Elon Musk is really rich, right? I assumed he would spend money hiring people to develop some sophisticated tool or method himself,” Yang told CNN.
The Botometer is best used “to supplement your own judgment, not to replace it,” says the tool’s FAQ, noting that “humans and machines have different strengths when it comes to pattern recognition. Some accounts that are obvious to a human observer will fool a machine learning algorithm. For example, Botometer sometimes categorizes ‘organization accounts’ as bot accounts. Likewise, an algorithm can confidently classify some accounts that people have trouble with.”
Twitter sued Musk in the Delaware Court of Chancery after he tried to walk away from his obligation to buy the company for $44 billion. Musk has defended his attempt to break the merger agreement by questioning Twitter’s public disclosure that less than 5 percent of its monetizable daily active users (mDAU) are spam or fake.
Twitter defends the accuracy of its estimates, saying they are based on “multiple human reviews (in replicate) of thousands of randomly selected accounts each quarter, using both public and private data.” Twitter also says Musk has no right to exit the merger deal over the number of spam accounts.
Musk plans a more thorough spam analysis, his court filing said. “Defendant’s experts are continuing their analysis even now and, pending the production of additional data by Twitter (including the ‘private’ data that Twitter provides to its human reviewers and claims is necessary to verify its reported less than 5 percent spam and false user rate), intend to conduct a more comprehensive analysis and expect to provide updated estimates and findings in expert reports and in court,” Musk’s attorneys wrote.