Hashtag #doyouknowwhoswatchingyou? A new study from USC researchers sampled more than 15 million tweets, showing that some Twitter users may be inadvertently revealing their location through updates on the social media channel.
The study, which appears in the current issue of the International Journal of Geoinformatics, provides important factual data for a growing national conversation about online privacy and third-party commercial or government use of geo-tagged information.
"I'm a pretty private person, and I wish others would be more cautious with the types of information they share," said lead author Chris Weidemann, a graduate student in the Geographic Information Science and Technology (GIST) online master's program at the USC Dornsife College of Letters, Arts and Sciences. "There are all sorts of information that can be gleaned from things outside of the tweet itself."
Twitter has approximately 500 million active users, who are expected to tweet 72 billion times in 2013. Reports have shown that about 6 percent of users opt in to allow the platform to broadcast their location with every tweet.
But that's only part of the footprint Twitter users leave, and even users who have not opted in to location tagging may be inadvertently revealing where they are, the study shows.
To get a fuller sense of what publicly accessible data might reveal about Twitter users, Weidemann developed an application called Twitter2GIS to analyze the metadata collected by Twitter, including details about the user's hometown, time zone and language.
The data, generated by Twitter users and available through Twitter's application programming interface (API) and Google's Geocoding API, was then processed by a software program, which mapped and analyzed it in search of trends.
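To make the idea of location-revealing metadata concrete, here is a minimal Python sketch of how a single tweet record might be sorted by the kind of location signal it carries. It is not the Twitter2GIS code, which the article does not show. The record layout (a "coordinates" entry plus a "user" entry with "location", "time_zone" and "lang") is only loosely modeled on tweet metadata and should be read as an assumption, as should the function names and the three labels.

```python
from typing import Any, Dict, List


def location_signals(tweet: Dict[str, Any]) -> List[str]:
    """List the metadata fields in a tweet record that hint at location.

    The field names are assumptions made for this illustration.
    """
    signals = []
    # Explicit geotag attached to the tweet itself.
    if tweet.get("coordinates"):
        signals.append("coordinates")
    # Profile-level hints: hometown, time zone and language.
    user = tweet.get("user") or {}
    for field in ("location", "time_zone", "lang"):
        if user.get(field):
            signals.append(f"user.{field}")
    return signals


def classify(tweet: Dict[str, Any]) -> str:
    """Label a record 'explicit', 'ambient' or 'none' (hypothetical labels)."""
    signals = location_signals(tweet)
    if "coordinates" in signals:
        return "explicit"
    if signals:
        return "ambient"
    return "none"


# A made-up record with no geotag but a hometown and time zone in the profile.
sample = {
    "coordinates": None,
    "user": {"location": "Los Angeles, CA", "time_zone": "Pacific Time (US & Canada)"},
}
print(classify(sample), location_signals(sample))
```

In a fuller pipeline, a free-text hometown such as "Los Angeles, CA" would presumably be passed to a geocoding service to obtain coordinates; that step is left out here because it requires network access and an API key.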
During the one-week sampling period of the study, roughly 20 percent of the tweets collected showed the user's location to an accuracy of street level or better.
Many Twitter users divulged their physical location directly through active location monitoring or GPS coordinates. But another 2.2 percent of all tweets – equating to about 4.4 million tweets a day – provided so-called "ambient" location data, where the user might not be aware that they are divulging their location.
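As a rough check on scale, the daily figure can be related to the projected yearly total of 72 billion tweets. The short Python sketch below assumes tweets are spread evenly across a 365-day year, which the study does not state; the function name is hypothetical.

```python
TWEETS_PER_YEAR = 72e9  # projected 2013 total cited in the article
AMBIENT_SHARE = 0.022   # 2.2 percent of all tweets


def ambient_tweets_per_day(tweets_per_year: float = TWEETS_PER_YEAR,
                           share: float = AMBIENT_SHARE,
                           days: int = 365) -> float:
    """Estimate daily tweets carrying ambient location data.

    Assumes an even spread of tweets over the year (an assumption,
    not something the article states).
    """
    return tweets_per_year / days * share


print(f"{ambient_tweets_per_day() / 1e6:.1f} million per day")  # prints "4.3 million per day"
```

The result lands slightly below the figure quoted above, which is consistent with a rounder baseline of about 200 million tweets a day (200 million × 0.022 = 4.4 million).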
"The downside is that mining this kind of information can also provide opportunities for criminal misuse of data," Weidemann said. "My intent is to educate social media users and inform the public about their privacy."
In addition to being a graduate student at USC, Weidemann works for a company that builds geographic information systems for the federal government. He initially developed Twitter2GIS as part of a capstone project for a course taught by Jennifer Swift, associate teaching professor of spatial sciences at USC.
Swift, Weidemann's thesis adviser and a co-author on the study, said the project stood out for its thoughtful look at geospatial information. "It will help create an awareness among the general population about the information they divulge," Swift said.
Weidemann is a self-described "conservative" Twitter user who uses the social media channel infrequently. His privacy settings are configured not to share any location information with his tweets. Still, in the course of doing this study, he ran Twitter2GIS on his own account and was surprised at how specifically the application was able to pinpoint his location, based on a hashtag he had used about an academic conference.
"This research has been fun," Weidemann said. "And a little scary."