Contact Project Developer Ashish D. Tiwari [astiwz@gmail.com]
Download Synopsis Abstract
Desktop Applications Java JSP Data Mining BE-Engineering(CO/IT) ME-Engineering(CO/IT) BCS MCS BCA MCA MCM BSC Computer/IT MSC Computer/IT Diploma (CO/IT) IEEE-2015

Differentially Private Frequent Itemset Mining via Transaction Splitting

Frequent itemset mining (FIM) is one of the most fundamental problems in data mining.
Abstract-Synopsis-Documentation

Differentially Private Frequent Itemset Mining via Transaction Splitting

Abstract: Recently, there has been a growing interest in designing differentially private data mining algorithms. Frequent itemset mining (FIM) is one of the most fundamental problems in data mining. In this paper, we explore the possibility of designing a differentially private FIM algorithm which can not only achieve high data utility and a high degree of privacy, but also offer high time efficiency. To this end, we propose a differentially private FIM algorithm based on the FP-growth algorithm, which is referred to as PFP-growth. The PFP-growth algorithm consists of a preprocessing phase and a mining phase. In the preprocessing phase, to improve the utility and privacy tradeoff, a novel smart splitting method is proposed to transform the database. For a given database, the preprocessing phase needs to be performed only once. In the mining phase, to offset the information loss caused by transaction splitting, we devise a run-time estimation method to estimate the actual support of itemsets in the original database. In addition, by leveraging the downward closure property, we put forward a dynamic reduction method to dynamically reduce the amount of noise added to guarantee privacy during the mining process. Through formal privacy analysis, we show that our PFP-growth algorithm is ϵ-differentially private. Extensive experiments on real datasets illustrate that our PFP-growth algorithm substantially outperforms the state-of-the-art techniques.

Existing System:

Existing work presents an Apriori-based differentially private FIM algorithm. It enforces the limit by truncating transactions (i.e., if a transaction has more items than the limit, deleting items until its length is under the limit). In particular, in each database scan, to preserve more frequency information, it leverages discovered frequent itemsets to re-truncate transactions. However, FP-growth only performs two database scans. There is no opportunity to re-truncate transactions during the mining process. Thus, the transaction truncating approach proposed in is not suitable for FP-growth. In addition, to avoid privacy breach, we add noise to the support of itemsets.

Proposed System:

In the preprocessing phase, to improve the utility and privacy tradeoff, a novel smart splitting method is proposed to transform the database pose considerable threats to individual privacy. Differential privacy has been proposed as a way to address such problem. Unlike the anonymization-based privacy models ,differential privacy offers strong theoretical guarantees on the privacy of released data without making assumptions about an attacker’s background knowledge. In particular, by adding a carefully chosen amount of noise, differential privacy assures that the output of a computation is insensitive to changes in any individual’s record, and thus restricting privacy leaks through the results. A variety of algorithms have been proposed for mining frequent item sets. The Apriori and FP-growth are the two most prominent ones. In particular, Apriori is a breadth first search, candidate set generation-and-test algorithm. The appealing features of FP-growth motivate us to design a differentially private FIM algorithm based on the FP-growth algorithm. In this paper, we argue that a practical differentially private FIM algorithm should not only achieve high data utility and a high degree of privacy, but also offer high time efficiency. Although several differentially private FIM algorithms have been proposed, we are not aware of any existing studies that can satisfy all these requirements simultaneously.

Algorithums:

1.Spliting one transaction. 2.Mining Phase

Comment is Only Available for registered users! Create Account or Login Now!