Social media isn’t “one man, one vote.” Deception is commonplace, so analyses of social data are often built on biased evidence. Unfortunately, the sheer volume of messaging on Twitter and Facebook allows adversaries to hide in plain sight. We eschew standard classification techniques and instead use probabilistic algorithms that uncover hidden groups of accounts engaged in misuse of the network. This novel approach has been used to discover spammers and participants in information campaigns, some of whom have been highly active on Twitter for several years.
Our Pinocchio software analyzes millions of authors to automate detection of deceptive identities. On Twitter, accounts we flag are 2x more likely to eventually be suspended for violating the Terms of Service. However, Twitter’s Trust & Safety team detects only 1/20th of our flagged accounts, which can make up more than 15% of traffic on a given topic.