Covid

MASKING SAVES LIVES

Wednesday, October 08, 2008

Government Report: Data Mining Doesn't Work Well

Portion below; whole thing here: http://news.cnet.com/8301-13578_3-10059987-38.html

Note the local Seattle connection (highlighted below). Let's hope this report isn't buried (I know -- faint hope). Via Cursor.org.

The most extensive government report to date on whether terrorists can be identified through data mining has yielded an important conclusion: It doesn't really work.

A National Research Council report, years in the making and scheduled to be released Tuesday, concludes that automated identification of terrorists through data mining or any other mechanism "is neither feasible as an objective nor desirable as a goal of technology development efforts." Inevitable false positives will result in "ordinary, law-abiding citizens and businesses" being incorrectly flagged as suspects.

The whopping 352-page report, called "Protecting Individual Privacy in the Struggle Against Terrorists," amounts to at least a partial repudiation of the Defense Department's controversial data-mining program called Total Information Awareness, which was limited by Congress in 2003.

But the ambition of the report's authors is far broader than just revisiting the problems of the TIA program and its successors. Instead, they aim to produce a scholarly evaluation of the current technologies that exist for data mining, their effectiveness, and how government agencies should use them to limit false positives--of the sort that can result in situations like heavily-armed SWAT teams raiding someone's home and shooting their dogs based on the false belief that they were part of a drug ring.

The report was written by a committee whose members include William Perry, a professor at Stanford University; Charles Vest, the former president of MIT; W. Earl Boebert, a retired senior scientist at Sandia National Laboratories; Cynthia Dwork of Microsoft Research; R. Gil Kerlikowske, Seattle's police chief; and Daryl Pregibon, a research scientist at Google.

They admit that far more Americans live their lives online, using everything from VoIP phones to Facebook to RFID tags in automobiles, than a decade ago, and the databases created by those activities are tempting targets for federal agencies. And they draw a distinction between subject-based data mining (starting with one individual and looking for connections) compared with pattern-based data mining (looking for anomalous activities that could show illegal activities).

But the authors conclude the type of data mining that government bureaucrats would like to do--perhaps inspired by watching too many episodes of the Fox series 24--can't work. "If it were possible to automatically find the digital tracks of terrorists and automatically monitor only the communications of terrorists, public policy choices in this domain would be much simpler. But it is not possible to do so."

No comments: