Improved angelization technique against background knowledge attack for 1:M microdata

PeerJ Computer Science
DOI: 10.5281/zenodo.7214275. The original dataset is available at https://archive.ics.uci.edu/ml/datasets
Modified versions of the dataset have been used as a reference in other research: https://archive.ics.uci.edu/ml/datasets, https://datahub.io/machine-learning/adult#readme, https://www.researchgate.net/figure/Examples-of-generated-counterfactuals-on-the-modified-Adult-dataset-Example-Based-CF-and_tbl2_337830079, https://openreview.net/forum?id=bYi_2708mKK


Introduction

Motivation

  • Scenario I: Sensitive Vertical Attack (sVer)

  • Scenario II: Inapplicability for 1:M microdata

  • Scenario III: High utility loss

Contribution

  • The proposed (θ, k)-utility privacy algorithm categorizes the SA values of 1:M microdata as Low, Mild, High, Severe, or A-symptomatic, based on the category table (Table 3), and reshapes the original microdata (Table 1) into Table 4 to obtain 1:1 microdata. The SA values of each 1:M record are replaced with the corresponding category values. If a record's SA values fall into more than one category, the highest category is kept and the lower ones are stored in the history table.

  • Using the angelization approach, the proposed algorithm anonymizes the microdata T in Table 1 into a quasi table (QT) and a sensitive table (ST) (see Section 5). The two tables are linked through the Bucket ID (BID) by a one-to-many correspondence (i.e., QS-Loose Linkability) rather than a one-to-one correspondence, which improves both utility and privacy.

  • Based on the above points, the experimental results demonstrate that the proposed (θ, k)-utility privacy algorithm outperforms its counterparts in terms of utility and privacy.
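The first contribution (reshaping 1:M microdata into 1:1 microdata) can be illustrated with a minimal Python sketch. The category table contents, the record identifiers, and the ranking that places A-symptomatic below Low are assumptions made for this example only; the article's Table 3 defines the actual categories.

```python
from typing import Dict, List, Tuple

# Assumed severity order, lowest to highest. Placing "A-symptomatic"
# at the bottom is an assumption of this sketch, not taken from the article.
CATEGORY_RANK = ["A-symptomatic", "Low", "Mild", "High", "Severe"]

# Hypothetical category table (SA value -> category), standing in for Table 3.
CATEGORY_TABLE: Dict[str, str] = {
    "Healthy": "A-symptomatic",
    "Flu": "Low",
    "Asthma": "Mild",
    "Diabetes": "High",
    "Cancer": "Severe",
}


def to_one_to_one(
    records: Dict[str, List[str]],
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Collapse each individual's SA values to their highest category.

    Returns (transformed, history): `transformed` holds one category per
    individual, `history` holds the lower categories that were set aside.
    """
    rank = {cat: i for i, cat in enumerate(CATEGORY_RANK)}
    transformed: Dict[str, str] = {}
    history: Dict[str, List[str]] = {}
    for person, sa_values in records.items():
        cats = [CATEGORY_TABLE[v] for v in sa_values]
        top = max(cats, key=lambda c: rank[c])
        transformed[person] = top
        # Keep every distinct lower category, ordered by severity.
        history[person] = sorted(
            {c for c in cats if c != top}, key=lambda c: rank[c]
        )
    return transformed, history


records = {
    "p1": ["Flu", "Cancer"],
    "p2": ["Asthma", "Flu", "Asthma"],
    "p3": ["Healthy"],
}
transformed, history = to_one_to_one(records)
print(transformed)
print(history)
```

Each individual ends up with exactly one sensitive value, so the result can be fed to an anonymization step designed for 1:1 microdata, while the history table preserves what was set aside.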

Preliminaries

HLPN analysis of previous models

(k, l)-diversity

θ-Sensitive k-Anonymity

Proposed (θ, k)-utility

The (θ, k)-utility Algorithm

 
Algorithm 1: (θ*, k)-utility

Require:
    T: 1:M microdata table
    k: k-anonymity
    Γ: T populated with CtgT
    τ: individual tuple from the transformed table T′
    ℏ: individual history tuple from the history table H
Ensure:
    QT: quasi table (genData)
    ST: sensitive table (sbt)

sbt := {}
genData := {}
for all t^sa_i, …, t^sa_n in T do
    D^sa_n := Compute(Distinct(SA value))
end for
for all d^sa_i, …, d^sa_n do
    CtgT := Categorize D^sa_n into five categories
end for
for all t^sa_i, …, t^sa_n do
    Γ := T ↔ CtgT_cat
end for
for all r_i, …, r_n do
    for all t_i, …, t_n do
        τ_i := max(t^sa_i) ∀ t^sa_i ∈ CtgT_cat
        ℏ_i := ¬max(t^sa_i) ∀ t^sa_i ∈ CtgT_cat
    end for
end for
T′ := Σ_{i=1}^{n} τ_i
H_i := Σ_{i=1}^{n} ℏ_i
while T′ ≠ {} do
    if T′ ≤ 2k then
        sbt_k := T′
        sbt := sbt ∪ sbt_k
    else
        Apply θ-Sensitive k-Anonymity (Khan et al., 2020)
    end if
end while
N := |T′_qi|    // BID is obtained from bk^sa
while N ≠ {} do
    if N ≤ 2k then
        genData := N
    else
        genData := genData ∪ gen(N)    // Linked via BID
    end if
end while
return sbt, genData
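
The final stage of Algorithm 1, which emits a quasi table and a sensitive table linked by BID, can be sketched in Python under simplifying assumptions. This is not the θ-Sensitive k-Anonymity step cited in the algorithm: plain fixed-size chunking is swapped in for it, the same grouping is reused for both tables, and the `(age, category)` schema and function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Record = Tuple[int, str]  # (age, SA category); hypothetical schema


def chunk_into_buckets(rows: List[Record], k: int) -> List[List[Record]]:
    """Split rows into groups of k; the remainder joins the last group,
    so every group holds between k and 2k - 1 rows."""
    if len(rows) < k:
        raise ValueError("need at least k rows")
    rows = sorted(rows)  # sort by age so generalized ranges stay narrow
    buckets = []
    while len(rows) >= 2 * k:
        buckets.append(rows[:k])
        rows = rows[k:]
    buckets.append(rows)
    return buckets


def publish(rows: List[Record], k: int):
    """Return (QT, ST): generalized quasi-identifiers and per-bucket
    sensitive-value counts, linked only through the bucket ID (BID)."""
    qt = []  # rows of (age_range, BID)
    st = []  # rows of (BID, category, count)
    for bid, bucket in enumerate(chunk_into_buckets(rows, k), start=1):
        ages = [age for age, _ in bucket]
        age_range = (min(ages), max(ages))
        qt.extend((age_range, bid) for _ in bucket)
        counts = Counter(cat for _, cat in bucket)
        for cat, count in sorted(counts.items()):
            st.append((bid, cat, count))
    return qt, st


rows = [(23, "Low"), (25, "Severe"), (31, "Mild"), (38, "Low"), (44, "High")]
qt, st = publish(rows, k=2)
for row in qt:
    print("QT", row)
for row in st:
    print("ST", row)
```

Because a QT row carries only a BID, it maps to every sensitive value in that bucket rather than to a single one, which is the one-to-many correspondence the contribution list describes.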

HLPN analysis of (θ, k)-utility algorithm

Experimental Analysis

Experimental setup

Utility loss

Normalized certainty penalty

Query accuracy

Privacy loss

Record intersection

Record linkability

Execution time

Discussion

Conclusion and Future work

Supplemental Information

Additional Information and Declarations

Competing Interests

The authors declare there are no competing interests.

Author Contributions

Rabeeha Fazal conceived and designed the experiments, performed the experiments, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Razaullah Khan conceived and designed the experiments, performed the experiments, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Adeel Anjum conceived and designed the experiments, analyzed the data, performed the computation work, authored or reviewed drafts of the article, and approved the final draft.

Madiha Haider Syed conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Abid Khan conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Semeen Rehman conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The data set used in this work is derived from the publicly available Adult dataset at the UC Irvine Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/adult.

The derived data set is available at Zenodo: Ronny Kohavi, & Barry Becker. (1996). UCI Machine Learning- Adult Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7214275.

Funding

This work was supported by TU Wien Bibliothek through its Open Access Funding Programme. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
