Abstract
1. Introduction
2. Related work
3. An architecture of privacy preservation with three-way attribute division
4. Proposed scheme
5. Experiments and results
6. Conclusion
Declaration of competing interest
Acknowledgment
References
Abstract
Recent advances in the Internet of Things (IoT) have brought enormous advantages for businesses. These benefits are delivered by services that collect large volumes of data for analysis. The data may also contain sensitive information, and preserving its privacy is an important research challenge. Differential privacy is a recent technique for data privacy. It works by anonymizing the attributes that may contain sensitive information. An essential step before applying differential privacy is the division of the attribute set into three groups: sensitive, non-sensitive and ambiguous. A key issue in existing studies is that this division is done manually by a domain expert and is therefore costly. We introduce a three-way approach for differential privacy and a supporting algorithm for this demarcation of attribute sets. Results indicate that the information content and stability of the dataset improve considerably with our approach.
Introduction
The Internet of Things (IoT) has enormous potential in smart homes, industrial IoT, and medical or healthcare IoT. These applications require the collection of large amounts of data that may be stored and shared for analysis. The collected data may contain medical, financial or personal information, and its leakage may lead to privacy issues. Many privacy-related incidents have been reported in the recent past, which calls for efficient and effective solutions for data privacy [1]. In IoT smart homes in particular, there is a greater risk of data and identity theft. The data can be used to analyze human activities, with serious consequences such as robberies.
Different anonymization techniques have been used in IoT to protect sensitive information. The most common approach is to hide all the sensitive attributes. This, however, leads to a significant loss in the utility of the data for analysis [2]. Another common approach is to release only aggregate values [3]. A privacy breach can still occur with this approach if someone obtains enough aggregate values to infer the sensitive data of an individual [3]. Query auditing is another approach for data privacy [4]. It compares the results of past queries with the current query to determine whether responding to the current query will lead to a privacy breach, and it denies the queries that would. It is argued, however, that query denials can themselves leak information. The k-anonymity approach was introduced to address the shortcomings of the earlier techniques, but it fails in situations where the sensitive values in a class lack diversity [5]. l-diversity is a refinement of k-anonymity [6]. It ensures that each class contains enough sensitive values and that the values are distributed evenly. It cannot, however, prevent attribute disclosure, which makes it susceptible to inference attacks [7]. t-closeness is a further refinement of k-anonymity that ensures that the distribution of a sensitive attribute in a class is close to its distribution in the overall table. It is, however, susceptible to re-identification attacks [8].
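The weakness of k-anonymity noted above, a class whose sensitive values lack diversity, can be illustrated with a small check. The following is a minimal sketch and not code from the paper: the function names, the column names and the toy records are hypothetical, and the diversity test uses the simplest "distinct values" variant of l-diversity rather than a test of how evenly the values are distributed.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group rows that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].append(row)
    return classes


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class contains at least k rows."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(len(members) >= k for members in classes.values())


def is_distinct_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every class holds at least l distinct sensitive values.

    Assumption: this is the 'distinct' simplification of l-diversity.
    """
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(
        len({row[sensitive] for row in members}) >= l
        for members in classes.values()
    )


# Hypothetical, already generalized records (illustrative only).
rows = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "148**", "diagnosis": "diabetes"},
]

qi = ["age", "zip"]
k_ok = is_k_anonymous(rows, qi, k=2)
l_ok = is_distinct_l_diverse(rows, qi, "diagnosis", l=2)
print(k_ok, l_ok)
```

In this toy table every class has two rows, so the 2-anonymity check passes, yet the first class contains only one diagnosis. Anyone who can place an individual in that class learns the diagnosis, which is the gap that l-diversity is meant to close.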
Conclusion
The data collected and stored by sensors in the Internet of Things (IoT) needs to be protected against privacy breaches. Differential privacy is an approach for privacy preservation. Before applying differential privacy, it is necessary to divide the attribute set into three groups: sensitive, non-sensitive and ambiguous. Existing practices rely on manual division of attributes by a domain expert and are therefore costly. We introduce a three-way approach for automatic attribute division for differential privacy. The approach divides the attribute set based on a pair of thresholds and an evaluation function. The thresholds control how the attributes are grouped and must therefore be chosen carefully. To this end, we introduce an algorithm called 3WADD that automatically determines the thresholds for an effective division of attributes. An architecture that incorporates 3WADD with differential privacy is also presented. The proposed scheme improves the information content and stability of the dataset.
These results open up new research avenues for exploring more sophisticated methods of three-way decisions for obtaining effective and useful division of attributes for IoT. In particular, different kinds of evaluation functions may be explored depending on the underlying nature of the data and the specific needs of the application at hand. The automated nature of this approach can significantly reduce the cost of privacy preservation in an IoT dataset and thus motivate more organizations and individuals to use IoT to improve their processes.
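The threshold-based division summarized in the conclusion can be sketched as follows. This is an illustrative sketch under stated assumptions, not the 3WADD algorithm itself: the threshold names `alpha` and `beta`, the function `three_way_divide`, the attribute names and the scores are all hypothetical, and the scores stand in for the output of an unspecified evaluation function. The sketch also takes the thresholds as given, whereas 3WADD determines them automatically.

```python
def three_way_divide(scores, alpha, beta):
    """Split attributes into three groups using a pair of thresholds.

    scores: dict mapping attribute name to an evaluation score
            (assumed here to lie in [0, 1]; higher means more sensitive).
    alpha:  scores at or above this value are treated as sensitive.
    beta:   scores at or below this value are treated as non-sensitive.
    Anything strictly between beta and alpha is left as ambiguous.
    """
    if not beta < alpha:
        raise ValueError("expected beta < alpha")
    groups = {"sensitive": [], "ambiguous": [], "non_sensitive": []}
    for attribute, score in scores.items():
        if score >= alpha:
            groups["sensitive"].append(attribute)
        elif score <= beta:
            groups["non_sensitive"].append(attribute)
        else:
            groups["ambiguous"].append(attribute)
    return groups


# Hypothetical evaluation scores for three IoT attributes.
scores = {"heart_rate": 0.9, "room_temperature": 0.1, "location": 0.55}
groups = three_way_divide(scores, alpha=0.7, beta=0.3)
print(groups)
```

Moving the two thresholds closer together shrinks the ambiguous group, while moving them apart enlarges it, which is why the choice of thresholds determines the resulting division.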