Researchers from the Ramón Llull University (Spain) have created a system capable of geolocating videos by comparing their audiovisual content with a worldwide multimedia database. In the future this could help to find people who have gone missing after posting images on social networks, or even to recognise locations of terrorist executions.
Many of the videos available online are accompanied by text which provides information on the place where they were filmed, but others lack this information. This complicates the use of the increasingly common tools for geolocating multimedia content.
To solve this, scientists from the La Salle campus at Ramón Llull University (Barcelona) have developed a system that places on the map videos carrying no indication of where they were produced. This is a real challenge, considering that the majority are scenes of daily life in which no clearly recognisable places appear. As these videos do not come with text, the method relies on recognising their images, or frames, and all of their audio.
"The acoustic information can be as valid as the visual and, on occasions, even more so when it comes to geolocating a video," comments Xavier Sevillano, one of the authors. "In this field we use some physics and mathematical vectors taken from the field of recognition of acoustic sources, because they have already demonstrated positive results".
All of the data obtained are merged and grouped into clusters so that, using computer algorithms developed by the researchers, they can be compared with the data from a large collection of videos recorded around the world and already geolocated.
In their study, published in the journal 'Information Sciences', the team used almost 10,000 sequences as a reference, taken from the audiovisual database of the MediaEval Placing task, a benchmarking initiative for assessing algorithms that process multimedia content. "The videos which are most similar in audiovisual terms to what we want to find are searched for in the database, to detect the most probable geographical coordinates," says Sevillano.
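The search Sevillano describes can be pictured as a nearest-neighbour lookup. The following is a minimal sketch of that idea, not the authors' algorithm: it assumes each video has already been reduced to a numeric feature vector, and it uses cosine similarity as a stand-in similarity measure. The function names, the dictionary layout and the feature values are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_references(query_features, reference_videos, k=1):
    """Return the k reference videos most similar to the query, best first."""
    ranked = sorted(
        reference_videos,
        key=lambda video: cosine_similarity(query_features, video["features"]),
        reverse=True,
    )
    return ranked[:k]


def estimate_location(query_features, reference_videos):
    """Assign the query the coordinates of its single most similar reference."""
    best = rank_references(query_features, reference_videos, k=1)[0]
    return best["coords"]


# Hypothetical reference collection: made-up features, approximate city coordinates.
references = [
    {"id": "ref_barcelona", "features": [0.9, 0.1, 0.3], "coords": (41.39, 2.17)},
    {"id": "ref_new_york", "features": [0.1, 0.8, 0.5], "coords": (40.71, -74.01)},
    {"id": "ref_sydney", "features": [0.2, 0.2, 0.9], "coords": (-33.87, 151.21)},
]

query = [0.85, 0.15, 0.25]  # hypothetical features of the video to be located
print(estimate_location(query, references))
```

In this toy collection the query lands on the Barcelona reference, because its feature vector points in almost the same direction. A real system would compare against thousands of references and would probably combine several close matches instead of trusting only the best one.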
The scientist points out that the proposed system, "despite having a limited database in terms of size and geographical coverage, is capable of geolocating videos with more accuracy than its competitors". More specifically, it locates 3% of videos within a ten-kilometre radius of their actual geographical location, and in 1% of cases it is accurate to within one kilometre. The percentages are still modest, although they represent a fourfold improvement on the accuracy achieved until now with the same database.
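Figures such as "3% within ten kilometres" come from comparing each predicted position with the true one. The sketch below shows one common way to compute such a score, using the haversine great-circle distance. It is an assumption that the study measured distance this way, and the function names, the Earth radius constant and the sample coordinates are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a standard approximation


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    # min() guards against rounding pushing the value slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def fraction_within(predictions, ground_truth, radius_km):
    """Share of predictions that fall within radius_km of the true location."""
    if not predictions or len(predictions) != len(ground_truth):
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1
        for predicted, actual in zip(predictions, ground_truth)
        if haversine_km(predicted, actual) <= radius_km
    )
    return hits / len(predictions)


# Hypothetical predictions and true positions for three videos.
predicted = [(41.39, 2.17), (40.71, -74.01), (48.86, 2.35)]
actual = [(41.40, 2.18), (34.05, -118.24), (48.85, 2.35)]

print(fraction_within(predicted, actual, radius_km=10))
print(fraction_within(predicted, actual, radius_km=1))
```

Running the same function with radii of ten kilometres and one kilometre over a full test set would yield the two kinds of percentage quoted above.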
The researchers recognise that their method will require a much larger audiovisual database before it can be applied to the millions of videos that circulate on the internet, but they highlight its usefulness in locating those without textual metadata and the possibilities that it offers.
"This method could help rescue teams to track down where a person or group disappeared in a remote place, detecting the locations shown in the videos which could have been uploaded to a social network before losing contact", says Sevillano.
In the future, security forces could also use it, even to recognise locations of hostage executions and operations of terrorist groups such as Al Qaeda or Islamic State. "Our system does not make any assumptions regarding the location of the videos, but in these cases we are given very valuable additional information to limit the searches, as we already know that we are dealing with the area of Iraq or Syria, and therefore, we would only use reference videos from there," explains the researcher.
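Limiting the search to reference videos from a known area can be sketched as a simple filter applied before any comparison takes place. The example below assumes the area is described by a latitude and longitude rectangle. The rectangle's values are rough and purely illustrative, not actual borders, and the function names and sample data are hypothetical.

```python
def in_bounding_box(coords, box):
    """True if a (lat, lon) pair lies inside a rectangular region (no antimeridian handling)."""
    lat, lon = coords
    return box["south"] <= lat <= box["north"] and box["west"] <= lon <= box["east"]


def restrict_references(reference_videos, box):
    """Keep only the reference videos recorded inside the given region."""
    return [video for video in reference_videos if in_bounding_box(video["coords"], box)]


# Rough illustrative rectangle around Iraq and Syria (an assumption, not real borders).
region = {"south": 29.0, "north": 38.0, "west": 35.0, "east": 49.0}

# Hypothetical reference collection with approximate city coordinates.
references = [
    {"id": "ref_baghdad", "coords": (33.31, 44.37)},
    {"id": "ref_aleppo", "coords": (36.20, 37.13)},
    {"id": "ref_barcelona", "coords": (41.39, 2.17)},
]

candidates = restrict_references(references, region)
print([video["id"] for video in candidates])
```

Only the references inside the rectangle remain as candidates, so the similarity search runs over a smaller and more relevant set.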
Another much more immediate application is to facilitate geographical browsing in video libraries, such as YouTube, which celebrated its 10th anniversary this week. "For example, if I want to go on holiday to New York and I feel like watching videos of Manhattan, when I type in a search on YouTube I get videos coming up recorded on the island, but also the performance of the seventies group The Manhattans and the trailer of the Woody Allen film Manhattan, which are not relevant to my search," comments Sevillano, "and in these cases, the new technology can also help".
Citation: Xavier Sevillano, Xavier Valero, Francesc Alías. "Look, listen and find: A purely audiovisual approach to online videos geotagging". Information Sciences 295: 558-572, 2015.