IT Chiefs Worry Aging Data Will Grow ‘Toxic’
Global data storage needs are growing at 20 to 25 percent annually, and Cisco Systems forecasts that by 2019, the world will add another 10.4 trillion gigabytes of new data every year.
For government information technology and security chiefs, that represents a challenge: Rising data storage needs drive up costs and increase risk.
“We have to measure the toxicity of data over time,” said David Tillman, the Navy’s cyber security director, at the FedScoop Security Through Innovation Summit, April 14. “The longer we retain it, the more the potential threat. As storage is so cheap, we save and save and it becomes toxic from a threat perspective.”
Declining storage costs appeal to our worst instincts, Tillman said. “We’re natural hoarders.”
Beating back such instincts comes down to organizational discipline, said Ken Bible, deputy chief information officer for the Marine Corps.
“I don’t know that the data itself becomes toxic,” Bible said at the AFCEA Defensive Cyber Operations Symposium late last month. “The data is data. It’s how it’s used. It’s how we handle it. If we put the right rules in place for how we’re going to use analytics or handle the records going back however many years, then it can be managed. This requires some systems engineering.”
Not all data is worth saving, he argues. The Freedom of Information Act (FOIA) dictates many documents must be saved, including email. But other data is open to interpretation, either because it is exempt from FOIA, or because it is simply not addressed.
Though the law requires the Marine Corps to archive email traffic from general officers and members of the Senior Executive Service, for example, it doesn’t address email records for every private or lance corporal. Do those also need to be saved?
Just as important as which data to save – and for how long – is determining the best way to store it, both for cost and security.
“There are technological solutions for which data to save and for how long per governing policy,” said Dan Gahagan, vice president, enterprise capabilities in General Dynamics Information Technology’s Intelligence Solutions Division. “Data tagging, for example, can allow you to target data for either long-term retention or short term destruction. The challenge is in getting the data tagged appropriately, and then implementing the right technologies to handle the tagged data. And that doesn’t even begin to address how we tag all the existing data that’s already out there.”
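As a rough illustration of the tagging approach Gahagan describes, the sketch below maps a retention tag to a disposition. The tag names, the 90-day short-term window, and the `Record` and `disposition` names are all hypothetical, chosen only for illustration; they do not come from any agency policy or product. Untagged records are flagged rather than guessed at, which mirrors the backlog problem he raises.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical tag-to-retention table: None means "retain indefinitely".
# The 90-day figure is an assumption for illustration only.
RETENTION_DAYS = {"long_term": None, "short_term": 90}


@dataclass
class Record:
    name: str
    created: date
    tag: Optional[str] = None  # existing data may carry no tag at all


def disposition(record: Record, today: date) -> str:
    """Return 'retain', 'destroy', or 'needs_tagging' for one record."""
    if record.tag not in RETENTION_DAYS:
        # Untagged or unrecognized tag: no rule applies until someone tags it.
        return "needs_tagging"
    limit = RETENTION_DAYS[record.tag]
    if limit is None:
        return "retain"
    if today - record.created > timedelta(days=limit):
        return "destroy"
    return "retain"
```

The lookup itself is trivial. As Gahagan notes, the hard part is getting the tags applied correctly in the first place.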
The rules of the road for what data should be saved are far from final. The Air Force collects so much surveillance video, for example, that it can’t view it all, let alone save it. After three months, most drone surveillance video is destroyed.
Lt. Gen. Bill Bender, the Air Force chief information officer, said the service is 10 years behind on these issues. “We don’t have a detailed data strategy” – yet, he said. But he acknowledged that an effort to develop one is underway. The service’s Information Dominance Flight Path, its IT policy-guiding document, notes under data management that “the Air Force will develop a roadmap … for implementing DoDI 8320.02 data standards by 4Q FY15.” That work is still in progress, however.
From a security perspective, the more data that’s saved, the greater the risk: not only that the data will be compromised, but that an enemy who obtains it can piece together disparate pieces of data to develop a more comprehensive understanding of U.S. personnel, organizations, or even tactics and techniques.
Essye Miller, the Army’s director of cyber security, said users’ natural bent toward keeping information easily accessible fails to take into account how powerful analytics and simple aggregation can be. “We have a tendency to make as much info as we can available with no thought to the aggregation of that information by the enemy and without regard to analytics,” she said. “How long do we keep data? Historically, we have kept information too long.”
Human behavior and lack of discipline are one part of the problem. Constant personnel turnover and copied files misplaced in the wrong folders or drives can take a toll. The Marine Corps’ Cyber Security Division Chief Ray Letteer said he routinely sends cyber white teams out to comb through networks, seeing what’s been left and forgotten. “They find personal data, fitness reports, resumes, et cetera, that goes back in some cases to the 1980s. There’s got to be an expiration date on some of this stuff.”
New Data Sources
Data come in many forms. Surveillance photographs and video, signals intelligence, personnel records, correspondence: Rules must be established for each. Cyber analytic data represent another new trove of mineable information, Bible said. Cyber screening tools capture all the traffic coming in and out of government networks. Do all those data need to be saved permanently? How long are they of value?
Left to decide for themselves, Bible said, intelligence analysts would throw nothing away. “They’re very interested in anything they can get their hands on and would probably say, ‘Save everything,’” Bible said. “But I think we do have to figure out the right answer with respect to how long we keep certain data – and from a practical perspective, where we keep it.
“We’ve got to find less expensive ways to keep things we need to access once every other year, versus the things we need to have at our fingertips all the time,” Bible said. “That requires some systems engineering and some effort to figure that out and get that properly architected.”
Tiered data storage can be automated based on policy, an approach that helps take human behavior out of the equation. For example, files can either be automatically archived after 30 days without being touched, or deleted after a year.
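A minimal sketch of such a policy rule, using the 30-day and one-year thresholds from the example above, might look like this. It assumes both clocks run from the last time a file was touched and that a year means 365 days; the function and constant names are hypothetical, and a real tiering system would apply a rule like this through its own policy engine.

```python
from datetime import datetime, timedelta

# Thresholds taken from the article's example; treating "a year" as
# 365 days since last access is an assumption.
ARCHIVE_AFTER = timedelta(days=30)
DELETE_AFTER = timedelta(days=365)


def tier_action(last_accessed: datetime, now: datetime) -> str:
    """Decide what an automated policy would do with one file."""
    idle = now - last_accessed
    if idle >= DELETE_AFTER:
        return "delete"
    if idle >= ARCHIVE_AFTER:
        return "archive"
    return "keep"
```

Because the decision depends only on timestamps, it can run on a schedule with no one choosing file by file what to keep.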
Rob Foster, the Navy’s chief information officer, said the challenge is determining the rules of the road for what gets saved and what doesn’t. Rules change from agency to agency and even individual to individual. Setting an institutional approach is difficult, but attractive.
“Everybody’s got different rules and regulations,” Foster said. “I think people’s default is, if you let the individual determine what is an official record, you have a significant challenge when it comes to training. And if you retain everything to mitigate that, you have a significant challenge [with risk].”
Centralizing control can help. Enterprise systems management can reduce – if not erase – policy differences across an organization. Each of the services, as well as the Defense Department overall, is moving in that direction now, reducing the number of applications and disparate user policies that have spread across the department over time.
“In the IT space particularly we’re coming to a consensus view of the critical requirement to manage the IT at an enterprise level,” Bender said. “In doing so, similar to what industry learned some time ago, we’re finding that centralized management ends up closing all of the intermediate gaps in terms of setting policies that represent a consensus view throughout the organization.”
Establishing those policies and standards – and communicating them broadly – is critical, Bible said. Service members and civilian employees trust the department to protect their data and to make good decisions about where the data goes and how it’s used. “If we handle data responsibly,” he said, “if we put the right rules in place, that’s the key to keeping data from becoming toxic.”