Abstract
With advances in data mining and the pervasiveness of cloud computing, online medical diagnosis services have been widely applied in the e-healthcare field and have brought great convenience to people’s lives. However, because security and privacy concerns about medical information limit data sharing among healthcare centers, the growth of online medical diagnosis services still faces severe challenges, including limited diagnostic accuracy. In this paper, to address these security issues and improve the accuracy of online medical diagnosis services, we propose a new privacy-preserving collaborative model learning scheme with skyline computation, called PCML. With PCML, healthcare centers can securely learn a global diagnosis model from their local diagnosis models with the assistance of the cloud, while the sensitive medical data of each healthcare center is well protected. Specifically, with a secure multi-party vector comparison algorithm (SMVC), all local diagnosis models are encrypted by their owners before being sent to the cloud and can be operated on directly without decryption. Detailed security analysis shows that PCML can resist security threats in the semi-honest model. Moreover, PCML is implemented with medical datasets from the UCI machine learning repository, and extensive simulation results demonstrate that PCML is efficient and can be implemented effectively.
Introduction
In recent years, online medical diagnosis systems [1], which can provide medical diagnosis services anywhere and anytime, have attracted considerable interest. Compared with traditional treatment methods, online medical diagnosis is more flexible and convenient, since it removes geographical restrictions and reduces the time spent waiting to see a doctor [2]–[6]. To predict hidden diseases from collected medical data, many data mining techniques have been developed for e-healthcare systems. For example, skyline computation [7], which returns a set of interesting points from a potentially huge data space, is well suited to medical data analysis and disease classification [6]. Specifically, healthcare centers can generate diagnosis models by mining their collected medical data with skyline queries; these models help them offer online medical diagnosis services and allow users to check their health conditions conveniently. Unfortunately, in traditional online medical systems, medical data are commonly distributed across different healthcare centers, and a single healthcare center that collects only a small set of medical data cannot generate a sufficiently accurate skyline diagnosis model [8], [9]. For example, consider the scenario shown in Fig. 1: when a user accesses online medical diagnosis services from multiple healthcare centers, the limited accuracy of each center's diagnosis model may prevent the centers from diagnosing diseases accurately, which confuses the user.
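To make the notion of skyline computation concrete, the following is a minimal plaintext sketch of a skyline query over a small set of points. It is only an illustration of the general technique, not the PCML scheme or the SMVC algorithm: no encryption is involved, the names `dominates` and `skyline` are hypothetical, and the "smaller is better in every dimension" convention is an assumption made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumed convention: smaller values are better. a dominates b when a is
    no worse than b in every dimension and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that are not dominated by any other point.

    Naive O(n^2) comparison, kept simple for readability. A point never
    dominates itself, so comparing it against the full list is safe.
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical two-dimensional records (values are illustrative only).
    records = [(1, 9), (3, 3), (2, 8), (4, 4), (9, 1), (5, 5)]
    print(skyline(records))
```

In this toy input, (4, 4) and (5, 5) are both dominated by (3, 3), so they are excluded from the skyline; the remaining points each win on at least one dimension against every other point. A scheme such as PCML would have to perform the equivalent dominance comparisons on encrypted vectors, which this sketch does not attempt.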