Via Noah Shachtman at Defensetech: Jeff Jonas, one of the nation’s leading data mining experts, has serious doubts about whether our government’s massive data mining projects have a remote chance of returning useful information.
One of the fundamental underpinnings of predictive data mining in the commercial sector is the use of training patterns. Corporations that study consumer behavior have millions of patterns that they can draw upon to profile their typical or ideal consumer. Even when data mining is used to seek out instances of identity and credit card fraud, this relies on models constructed using many thousands of known examples of fraud per year.
Terrorism has no similar indicia. With a relatively small number of attempts every year and only one or two major terrorist incidents every few years—each one distinct in terms of planning and execution—there are no meaningful patterns that show what behavior indicates planning or preparation for terrorism.
[…] Without patterns to use, one fallback for terrorism data mining is the idea that any anomaly may provide the basis for investigation of terrorism planning. Given a “typical” American pattern of Internet use, phone calling, doctor visits, purchases, travel, reading, and so on, perhaps all outliers merit some level of investigation. This theory is offensive to traditional American freedom, because in the United States everyone can and should be an “outlier” in some sense. More concretely, though, using data mining in this way could be worse than searching at random; terrorists could defeat it by acting as normally as possible.
Treating “anomalous” behavior as suspicious may appear scientific, but, without patterns to look for, the design of a search algorithm based on anomaly is no more likely to turn up terrorists than twisting the end of a kaleidoscope is likely to draw an image of the Mona Lisa.
As the civil liberty debate rages, even our extreme authoritarians couch their arguments in terms of benefit relative to cost. If the benefit doesn’t exist then wannabe autocrats like Newt Gingrich plainly have no leg to stand on. The only remaining support would have to come from these programs’ side benefits, primarily the existence of a detailed dossier on the personal life of every American citizen. That should come in handy in case any priest becomes, as one departed ruler might put it, a bit turbulent.*
(*) Once we’ve dumped the Magna Carta and rejected the Enlightenment, there’s no reason why our classical references shouldn’t go all the way back.
Steve
Here’s an actual 9/11 Commission recommendation that doesn’t get enough attention:
When they instituted random subway searches here in NYC, very few of us “liberal” New Yorkers were opposed because of our precious privacy rights, although surely it’s an inconvenience. No, people were generally opposed because it’s an inconvenience and WON’T WORK. If a bomb-carrying terrorist enters a subway station and sees the cops doing searches, he’ll just leave and walk two blocks to the next station where there aren’t any cops.
People are willing to make a lot of personal sacrifices in exchange for a security program that actually works. What this administration, and its authoritarian supporters, fail to realize is that they bear the burden of showing that a program works. Instead, it’s completely faith-based, backed up solely by rumors on some right-wing blog that maybe this program led to the capture of such-and-such terrorist, or whatever, oh and it’s treason to leak the fact that the program exists because now we’re in the position of having to justify it.
The simple fact seems to be that none of the intrusive programs instituted since 9/11 seem to be accomplishing much, and we’re mostly stopping terrorist threats through good old-fashioned law enforcement. If I’m wrong, too bad, because the burden is on the government to show it needs these new powers, and they haven’t done a damn thing in that regard.
Elvis Elvisberg
If you try to make the unpatriotic argument that costs should be in some proportion to benefits, you’ll be told that the downside is the destruction of America.
Therefore, to save America, we must abandon the inconvenient aspects of the Constitution, torture suspected criminals, invade other countries without scrutinizing evidence, and generally lay flat every law in the land to get at the terrorists.
Otherwise, the terrorists will have won.
The Other Steve
Frankly, about the only thing data collection is good for is after the fact data mining. That is, if you’ve got a tip you can query…
Show me everybody who in the past week has bought a turkey baster, a handgun, and a copy of Catcher in the Rye.
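A minimal Python sketch of that kind of tip-driven, after-the-fact query. Everything here is hypothetical: the `buyers_of_all` helper, the `(person, item, day)` record layout, the made-up purchase rows, and the arbitrary date are assumptions for illustration, not anything a real system is known to use.

```python
from collections import defaultdict
from datetime import date, timedelta


def buyers_of_all(purchases, wanted, since):
    """Return people who bought every item in `wanted` on or after `since`.

    `purchases` is an iterable of (person, item, day) tuples (hypothetical layout).
    """
    wanted = set(wanted)
    bought = defaultdict(set)
    for person, item, day in purchases:
        # Only count purchases inside the time window and on the tip's item list.
        if day >= since and item in wanted:
            bought[person].add(item)
    # Keep only the people whose in-window purchases cover the whole list.
    return sorted(p for p, items in bought.items() if items == wanted)


today = date(2006, 12, 1)  # arbitrary illustrative date
purchases = [  # made-up rows
    ("A", "turkey baster", today - timedelta(days=2)),
    ("A", "handgun", today - timedelta(days=3)),
    ("A", "Catcher in the Rye", today - timedelta(days=1)),
    ("B", "turkey baster", today - timedelta(days=1)),
    ("B", "handgun", today - timedelta(days=4)),
    ("C", "turkey baster", today - timedelta(days=2)),
    ("C", "handgun", today - timedelta(days=30)),  # outside the one-week window
    ("C", "Catcher in the Rye", today - timedelta(days=5)),
]

wanted = {"turkey baster", "handgun", "Catcher in the Rye"}
matches = buyers_of_all(purchases, wanted, since=today - timedelta(days=7))
print(matches)
```

The point of the sketch is that the search starts from a specific tip (an item list and a time window) rather than from a guess about what "suspicious" looks like.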
RSA
Civil rights issues aside, I think that Jonas’s article may be oversimplifying the technical issues involved. For example:
Most if not all of Jonas’s examples could be used as arguments that human attempts to identify and prevent terrorist attacks are doomed to failure (which may indeed be the case–but the argument is not specific to automated identification of suspicious patterns of behavior). I could just as easily write, “Searching for terrorists is no more likely to succeed than applying brush to canvas is likely to come up with an image of the Mona Lisa.”
If the argument becomes that we should pursue good old-fashioned detective work, that’s fine, because human intelligence (in the NSA/defense sense of the phrase) can’t be gained directly by automated systems, and for some kinds of decision-making, human involvement or oversight is going to be necessary for the foreseeable future. When it comes down to pattern recognition, however, there are no good technical reasons why some of the burden can’t be shared with computers (leaving aside the separate question of whether we can build sufficiently sensitive algorithms).
Zifnab
That’s kinda the crux of the issue. Can we build sufficiently sensitive algorithms? Spending a billion dollars on Echelon doesn’t do us any good if we just use it to Google-search “people who want to blow up America” and turn up ten million queries. The whole point is that an American will volunteer up his freedom for security, but he does demand that he actually gets security.
But the data mining and the No-Fly List and the wiretapping don’t seem to be yielding up any terrorists, so why are they still in place? If Bush proclaimed “We’re going to ransack every fifth house in America looking for terrorists”, a reasonable person would call that insane, stupid, and a waste of resources. The current policy – spying on Quakers, barring Cat Stevens from using an airplane, extraditing people to Syria for torture after other agencies have already acquitted them – would be laughable if it were being practiced by the Soviets or the Chinese. This is the fruit of our vaunted “Republican Anti-Terror Initiative” and it mostly just pisses people off.
I don’t care whether a computer database can catch terrorists. I care if our database will catch them. So far, I haven’t seen any reason to believe that what our President is currently doing actually works.
RSA
I don’t think any projects that have gotten attention so far are on the right track, including all the ones you mention. On the other hand, there are approaches that have been sorta-kinda successful in related areas that could turn out to be part of a better solution. Link analysis has been used to detect fraud in the banking and insurance industries, and red/blue team strategies have had some success in network intrusion detection; these pose comparable problems to identifying terrorists in that the bad guys are trying to fly under the radar by blending into larger groups of people and their behaviors. Of course, the risks and remedies are entirely different. Still, they might be good alternatives to rooting through everyone’s data all the time. The first approach above focuses more on social networks and such (okay, that could be pretty intrusive, but it could also be seeded with known terrorists to limit the spread), while the second can focus on vulnerabilities rather than specific attackers.
(This is all off the top of my head, based on occasional conversations with people working in these areas. It’s been ten years since I’ve been really familiar with the data mining literature, and it’s moved quite far in that time. Just thought I’d throw in more grist.)
grumpy realist
When the number of false positives vastly outweighs the possible true positives, your data mining is a bloody waste of time. You’d have done better putting the same amount of money into more humint and all the tedious tracking down of leads through law enforcement.
Jake
Speaking in a purely hypothetical sense, it seems the best way to test pdm as a crime prevention tool (v. criminal apprehension) would be to run it on a common crime. If it can’t catch a rapist it won’t catch a terrorist. And this Admin’s already told us we need to surrender our rights to be safe…
This reminds me of the announcements they used to run in DC’s metro system. Riders were encouraged to be on the lookout for people behaving in a “suspicious manner.” Uh…yeah. Where do I start?
TenguPhule
War is Peace.
Slavery is Freedom.
We have always been at war with Oceania.
Tsulagi
Read the linked New York Sun article. Waste of time. Just more Gingrich blowing gas out his ass. Now he wants to take his cowardice international.
I am so tired of bedwetting assholes like Gingrich preaching you should be constantly pissing in your socks out of fear like them. Their solution to regain bladder control? Whack at the foundations of this country that no terrorist could and call that patriotism. Until the Gay Old Perverts party calls their Gingrichs out for what they are…spineless, gutless cartoon figures…they’re not seeing another vote from me.
A month or two after 9/11 I saw a program about what actions cities could take regarding threat of terrorism. There were mayors in a discussion from three cities: D.C.; Tel Aviv; and one from another major US city. Not NYC, but I forget which other city.
Anyway, the mayor from D.C. was saying they needed to put up huge numbers of concrete blast barriers around potential targets. The other US mayor was agreeing and saying in addition widespread random searches were needed and legislation to enable them. The Tel Aviv mayor was smiling and keeping quiet until the two US mayors asked him given Israel’s long experience with terrorism what his city did.
Tel Aviv mayor said they did nothing along the lines they were proposing. He said it was important to live as normally and openly as possible keeping all their constitutional freedoms because to do otherwise showed terrorists they’d achieved their goal. To terrorize; to live in fear. They weren’t going to give them the satisfaction. Imagine that.
ThymeZone
A firm grip on the outflow device?
Zifnab
Foley wasn’t molesting patients, he was fighting terror!
And I seriously doubt that Gingrich has ever actually pissed his socks over terrorism. For starters, he’s not in the Capitol building, so I’m sure he feels safer. Add to that fact the number of terror attacks we’ve had in the past year (1) and the number of dead politicians it resulted in (0). That’s when they caught us with our pants down (thanks Bush!) “Turn over your civil rights, live in this cage, and let us milk you for your money” has absolutely nothing to do with keeping people safe from terrorists. Just like “Let’s invade Iraq” has absolutely nothing to do with hunting down Osama.
ThymeZone
Revised.
Jake
Pages? Anyhoo, insert Foley Catheter joke here…
This, I’m not so certain about. I regularly speak to a nice, level-headed bunch of folks who live in soy and pork country. I’d rate their chances of being fragged by a terrorist attack as being slightly lower than Bush dumping Laura for Osama. And yet, they are terrified that they’ll get blown up by Islamo-baddies. I’ve tried pointing out that the bad guys would first need to find their little town and that would require a detailed map, but it hasn’t seemed to help.
And you just know he was thinking “What a pair of panty-waists.”
cleek
Iraq Is A War on Terror
ThymeZone
I thought I was kidding before, but this settles it:
These guys really are just fucking with us.
eric
There is a huge problem with these sorts of programs that, as far as I know, cannot be overcome. It is this: there are in fact very few terrorists in the world compared to the number of non-terrorists. Given that fact, any positive result from a test to determine whether some random person is a terrorist has a high probability of being a false positive.
Have a look at wikipedia and specifically Bayes’ Theorem.
http://en.wikipedia.org/wiki/Bayes%27_theorem#Example_.232:__Drug_testing
In the example change the word “drug” to “terrorist” and you will get the drift.
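A minimal sketch of that substitution in Python. The numbers are illustrative assumptions, not figures from this thread: a test that is 99% sensitive and 99% specific, first with 0.5% of the population being positives (the ballpark of the drug-testing example), then with a purely hypothetical one terrorist per million people. The function name is made up for the example.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Bayes' theorem: P(actually positive | test says positive).

    prior       -- fraction of the population that is truly positive
    sensitivity -- P(test positive | truly positive)
    specificity -- P(test negative | truly negative)
    """
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Drug-testing style numbers: 0.5% of people are users.
p_drug = posterior_given_positive(prior=0.005, sensitivity=0.99, specificity=0.99)

# Hypothetical terrorist numbers: one in a million, same test accuracy.
p_terrorist = posterior_given_positive(prior=1e-6, sensitivity=0.99, specificity=0.99)

print(f"P(user | flagged)      = {p_drug:.4f}")
print(f"P(terrorist | flagged) = {p_terrorist:.6f}")
```

Under these assumed numbers, about a third of flagged people in the first case are real positives, while in the second case roughly one flagged person in ten thousand is. The test's accuracy is identical in both runs; only the base rate changed.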
RSA
I think (or rather hope) it’s obvious that any general argument that an automated system that identifies terrorists is going to produce a lot of false positives also applies to human judgment in identifying terrorists. Bayes’ Law governs not only computer decision-making but human decision-making as well. As Zifnab points out above, the crux of the issue is how we decide whether someone might be a terrorist, not whether a human or a machine is involved in the decision process.
Not to mention what we do in response to such an identification. Also leaving aside privacy and civil liberties issues.
Zifnab
He would have preferred Global Struggle Against Violent Extremism, but it didn’t test well in focus groups. Also, the name was copyrighted for the next season of 24, and even the White House isn’t ballsy enough to go toe-to-toe with Jack Bauer.
Jake
I myself favour The War Against Terror.
Mike
Well, the way Bush says it makes more sense and is more truthful (coming from him that is a stretch I know)
“The War on Terra”
Hyperion
the first requirement is high signal-to-noise ratio data.
the success of financial fraud detection based on pattern recognition is due in part to the high SNR of their data; the amounts/times/places of purchases are known very precisely. however, measuring human behavior is much more error prone. plus, which behaviors are key?
probably too geeky but…a competent model of normal human behavior would allow detection of non-normals (“suspicious”, as Jake said) but would not allow detection of terrorists.
to detect terrorist behavior, you have to model terrorist behavior, which means having a lot of data on terrorists. and right now we don’t have a sufficient number of “observations”. but i have a feeling that in the future we will.