January 4, 2017, By Cheryl Pellerin – On Dec. 28, 2016, President Barack Obama issued the annual proclamation designating January as National Slavery and Human Trafficking Prevention Month. Meanwhile, the Defense Advanced Research Projects Agency is developing next-generation search technologies to help investigators find the online perpetrators of those crimes.
Wade Shen, a program manager in DARPA’s Information Innovation Office, said in a recent DoD News interview that the program, called Memex, is designed to help law enforcement officers and others perform online investigations to hunt down human traffickers.
“Our goal is to understand the footprint of human trafficking in online spaces, whether that be the dark web or the open web,” he explained, characterizing the dark web as the anonymous internet, accessed through a system, among others, called Tor.
“The term dark web is used to refer to the fact that crimes can be committed in those spaces because they’re anonymous,” Shen said, “and therefore, people can make use of [them] for nefarious activities.”
Point of Sale
The approach Shen and his team have taken is to collect data from the internet and make it accessible through search engines.
“Typically, this is data that’s hard for commercial search engines to get at, and it’s typically the point of sale where sex trafficking is happening,” Shen explained. “Victims of sex trafficking are often sold as prostitutes online, and a number of websites are the advertising point where people who want to buy and people who are selling can exchange information, or make deals.
“What we’re looking for,” he continued, “is online behavioral signals in the ads that occur in these spaces that help us detect whether or not a person is being trafficked.”
When a prostitute is advertised online as being “new in town” or by specific characteristics, those are hints that the person might be trafficked. “New in town” means a person might be moving around, and the term “fresh” often means a person is underage, Shen explained. “Those kinds of things are indicators we can use to figure out whether or not a person is being pimped and trafficked,” he added.
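The idea of text indicators can be illustrated with a minimal sketch. This is not how Memex works internally, which the interview does not describe; it only shows simple phrase matching on the two examples Shen mentions. The names `INDICATOR_PATTERNS` and `find_indicators` and the sample ad text are hypothetical.

```python
import re

# Hypothetical indicator phrases, limited to the two examples quoted in the article.
# A real system would presumably use far richer signals than keyword matching.
INDICATOR_PATTERNS = {
    "new in town": re.compile(r"\bnew\s+in\s+town\b", re.IGNORECASE),
    "fresh": re.compile(r"\bfresh\b", re.IGNORECASE),
}


def find_indicators(ad_text):
    """Return the sorted labels of all indicator phrases found in ad_text."""
    return sorted(
        label
        for label, pattern in INDICATOR_PATTERNS.items()
        if pattern.search(ad_text)
    )


if __name__ == "__main__":
    # Made-up ad text, for illustration only.
    sample = "Just arrived, NEW in town this week"
    print(find_indicators(sample))
```

A match here is only a hint that an ad deserves a closer look from an investigator, in the same sense that Shen describes these phrases as indicators rather than proof.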
Before the Memex program formally began in late 2014, Shen’s team was working with the district attorney of New York to determine if they could find signals associated with trafficking in prostitution ads on popular websites.
“We found that lots of signals existed in the data, whether they be phone numbers used repeatedly by organizations that are selling multiple women online, or branding tattoos that exist in photos online, or signals in the text of the ads,” Shen said.
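One of the signals Shen describes, a phone number reused across many ads, can be sketched in a few lines. This is an illustrative assumption about how such a check might look, not the team's method: the regular expression handles only common 10-digit U.S. formats, and the names `PHONE_PATTERN`, `extract_phones` and `shared_phone_numbers`, along with the sample ads, are hypothetical.

```python
import re
from collections import defaultdict

# Assumed pattern: 10-digit U.S. numbers such as (212) 555-0101, 212.555.0101
# or 2125550101. Real ads often obfuscate numbers, which this does not handle.
PHONE_PATTERN = re.compile(
    r"(?<!\d)\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def extract_phones(text):
    """Return the set of phone numbers in text, normalized to 10 digits."""
    return {"".join(match.groups()) for match in PHONE_PATTERN.finditer(text)}


def shared_phone_numbers(ads, min_ads=2):
    """Map each phone number found in at least min_ads ads to those ad IDs.

    ads is a dict of ad ID -> ad text.
    """
    ads_by_phone = defaultdict(set)
    for ad_id, text in ads.items():
        for phone in extract_phones(text):
            ads_by_phone[phone].add(ad_id)
    return {
        phone: sorted(ad_ids)
        for phone, ad_ids in ads_by_phone.items()
        if len(ad_ids) >= min_ads
    }


if __name__ == "__main__":
    # Made-up ads using fictional 555-01xx numbers.
    sample_ads = {
        "ad1": "Call (212) 555-0101 anytime",
        "ad2": "Text 212.555.0101 now",
        "ad3": "Call 646-555-0199",
    }
    print(shared_phone_numbers(sample_ads))
```

Normalizing each number to bare digits before grouping is what lets differently formatted copies of the same number link ads together.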
Shen’s team had been working on text-based exploitation programs for big data — extremely large data sets that may be analyzed computationally to reveal patterns, trends and associations, especially relating to human behavior and interactions. But they thought that if they extended the technology to understand images and networks of people, then they could apply it to detecting rings of traffickers and behaviors associated with trafficking online.
“If we could do that,” he said, “we could … generate leads for investigators so they wouldn’t have to sift through millions of ads in order to find the small number of ads that are associated with trafficking. So that’s what we did.”
Early on, the team realized that search wasn’t quite the right modality for doing such investigations and that there was a lot more work to do before the technology could be adapted to trafficking. That’s when the Memex program began, Shen said.
“Since the beginning of the program, we’ve had a strong relationship with the district attorney of New York, but they’re not the only user of the technology. Over time, we have engaged with many different law enforcement agencies, including 26 in the United Kingdom, the district attorney of San Francisco, and a number of others,” he said.
Investigators for the district attorney of New York were able to use Memex tools to find and prosecute perpetrators, and that resulted in an arrest and conviction in the program’s first year, he added.
“Since then,” Shen said, “there have been hundreds of arrests and other convictions by a variety of law enforcement agencies in the United States and abroad.”
Today, more than 33 agencies are using the tools, he added, and an increasing number of local law enforcement agencies are adopting them.
“As word of mouth spreads about the tools and the fact that we give free access to the tools to law enforcement, more and more people are signing up to use it,” he said.
Shen said it’s easy for his team to work with state, local and federal partners in the United States, but it’s harder to work with agencies abroad.
“But we’re committed to do that,” he added, “so we are in the process of working out deals with a number of those agencies so they have access to the tools we currently deploy and to allow them, after we exit [when the program ends in a year] … to continue to run their own versions of these tools.”
DARPA funds the Memex project, which, according to the agency’s budget office, has cost $67 million to date. But as with its other projects, DARPA does not do the work itself. Instead, it catalyzes commercial agents, universities and others to develop the technology, Shen said.
“They are experts in their fields — image analysis, text analysis or web crawling and so on — and we engage the best of that community to work on this problem. What they’ve essentially done is form coalitions to … build the tools [needed] to solve the problem, because no one of the entities that we call performers is able to do that on their own,” he added.
The Memex program has 17 different performers, and many of them also work with partners. “So all in all,” Shen said, “we have hundreds of people who are working on this effort. All of them are very dedicated to this problem, because the problem of human trafficking is real.”
When Shen’s team started the program, one of the things they realized was that the cost of people in these spaces, the cost of slaves, is essentially zero, he added.
“That means our lives are essentially worthless in some sense, and that just seems wrong,” he said. “That motivated us and a lot of our performers to do something, especially when we build technology for all sorts of commercial applications for profit and for other motives. That’s what a lot of our folks do on a day-to-day basis, and they felt the need to make use of their technology for a noble cause. We think Memex is one of these noble causes.”