International Journal of Scientific & Engineering Research, Volume 3, Issue 6, June-2012
Data Leakage and Detection of Guilty Agent
Rupesh Mishra, D.K. Chitre
Abstract— Organizations typically apply data or information security only to protect their networks from intruders (e.g., hackers). However, with
the growing amount of sensitive data, the rapid growth in the size of organizations (e.g., due to globalization), the rise in the number of data points
(machines and servers), and easier modes of communication, accidental or even deliberate leakage of data from within the [...]
enterprise may outsource its data processing, so data must be
given to various other companies. The owner of the data is termed
the distributor, and the supposedly trusted third parties are
called the agents. Our goal is to identify the guilty agent
when the distributor's sensitive data has been leaked by one or more
agents. Perturbation and watermarking are techniques which
can be helpful in such situations. Perturbation is a very useful
technique where the data is modified and made less sensitive
before being handed to agents. For example, one can add random noise to certain attributes, or one can replace exact values
by ranges [2]. However, in some cases, it is important not to
alter the original distributor’s data. For example, if an outsourcer is doing our payroll, he must have the exact salary and
customer bank account numbers. If medical researchers will be treating patients (as opposed to simply computing statistics),
they may need accurate data for the patients.
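The two perturbation styles mentioned above (adding random noise and replacing exact values by ranges) can be illustrated with a minimal sketch. The function names `add_noise` and `to_range`, the noise bound, and the bucket width are assumptions chosen for illustration; they are not taken from the paper or from reference [2].

```python
import random

def add_noise(values, max_noise, seed=None):
    """Return a copy of values with uniform noise in [-max_noise, max_noise] added.

    Hypothetical helper: the paper does not specify a noise distribution.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(-max_noise, max_noise) for v in values]

def to_range(value, width):
    """Replace an exact value by the half-open range [low, low + width) containing it."""
    low = (value // width) * width
    return (low, low + width)

# Illustrative salary figures (made up for this example).
salaries = [52300, 61850, 47990]

noisy = add_noise(salaries, max_noise=500, seed=0)
ranges = [to_range(s, 10000) for s in salaries]

print(ranges)  # [(50000, 60000), (60000, 70000), (40000, 50000)]
```

Either transformation hides the exact figure, which is precisely why it is unsuitable for the payroll and patient-treatment cases described above: the recipient can no longer recover the true value.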
In this paper such applications are considered where the
original sensitive data cannot be perturbed. Traditionally, leakage detection is handled by watermarking, e.g., a unique
code is embedded in each distributed copy [3]. If that copy is
later discovered in the hands of an unauthorized party, the
leaker can be identified. Watermarks can be very useful in
some cases, but again, they involve some modification of the
original data. Furthermore, watermarks can sometimes be destroyed if the data recipient is malicious. An unobtrusive
technique for leakage detection is therefore required.
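As a rough illustration of the watermarking idea and its weaknesses, the sketch below tags each distributed copy with an agent-specific code. The key `SECRET_KEY`, the field name `wm`, and the helper functions are hypothetical and are not the scheme of reference [3]. Real watermarking schemes usually hide the mark inside the data itself rather than in a separate field.

```python
import hashlib
import hmac

SECRET_KEY = b"distributor-secret"  # hypothetical key known only to the distributor

def watermark_code(agent_id):
    """Derive a short agent-specific code from the secret key (illustrative only)."""
    digest = hmac.new(SECRET_KEY, agent_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]

def watermark_copy(records, agent_id):
    """Return a copy of the records with the agent's code embedded in each one."""
    code = watermark_code(agent_id)
    return [dict(record, wm=code) for record in records]

def identify_leaker(leaked_record, agent_ids):
    """Return the agent whose code matches the leaked record, or None."""
    code = leaked_record.get("wm")
    if code is None:
        return None  # the watermark was stripped
    for agent_id in agent_ids:
        if hmac.compare_digest(code, watermark_code(agent_id)):
            return agent_id
    return None

records = [{"name": "record-1"}, {"name": "record-2"}]
copy_for_a2 = watermark_copy(records, "A2")

leaker = identify_leaker(copy_for_a2[0], ["A1", "A2", "A3"])

# A malicious recipient can simply drop the extra field before leaking.
stripped = {k: v for k, v in copy_for_a2[0].items() if k != "wm"}
leaker_after_strip = identify_leaker(stripped, ["A1", "A2", "A3"])
```

The sketch makes both drawbacks concrete: the distributed copy differs from the original data, and identification fails as soon as the recipient removes the mark.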
In the course of business, after giving a set of objects to agents,
the distributor discovers some of those same objects in an
unauthorized place. (For example, the data may be found on a
website, or obtained through a legal discovery process.) At
this point, the distributor can assess the likelihood that the
leaked data has come from one or more agents, as opposed to
having been independently gathered by other means. Using
an analogy with cookies stolen from a cookie jar, if we catch
Freddie with a single cookie, he can argue that a friend gave
him the cookie. But if we catch Freddie with five cookies, it
will be much harder for him to argue that his hands were not
in the cookie jar. If the distributor sees "enough evidence" that
an agent leaked data, he may stop doing business with him, or
may initiate legal proceedings [4].
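The cookie-jar intuition can be sketched as a simple overlap count: the more leaked objects an agent was given, the harder it is for that agent to claim innocence. This is a deliberately naive illustration, not the guilt model of the paper, which is not given in this excerpt. The function `overlap_evidence`, the agent names, and the object identifiers are all assumptions.

```python
def overlap_evidence(leaked, agent_sets):
    """For each agent, count the leaked objects that agent received.

    Returns {agent: (shared_count, fraction_of_leak_covered)}.
    Naive heuristic only: it ignores the chance that objects were
    gathered independently by other means.
    """
    leaked = set(leaked)
    evidence = {}
    for agent, given in agent_sets.items():
        shared = leaked & set(given)
        fraction = len(shared) / len(leaked) if leaked else 0.0
        evidence[agent] = (len(shared), fraction)
    return evidence

# Hypothetical distribution: object ids given to each agent.
agent_sets = {
    "Freddie": {"t1", "t2", "t3", "t4", "t5", "t9"},
    "Alice": {"t3", "t7"},
}
leaked = {"t1", "t2", "t3", "t4", "t5"}

evidence = overlap_evidence(leaked, agent_sets)
print(evidence["Freddie"])  # (5, 1.0)
print(evidence["Alice"])    # (1, 0.2)
```

Here Freddie holds all five leaked objects while Alice holds only one, mirroring the one-cookie versus five-cookie argument. A fuller model would also have to account for objects that could have been obtained independently of any agent.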
In this paper, Section 2 defines the problem in
detail, Section 3 covers related work in this area, Section 4
presents the problem setup and notation for the data
leakage detection system, Section 5 explains the agent...

