Abstract
I- Introduction
II- Related Work
III- Methodology for Data Warehouse Design
IV- Telecommunication DW Case Study
V- Implementation
VI- Discussion
References
Abstract
In a cloud-based data warehouse (DW), business users can access and query data from multiple sources and from geographically distributed locations. Business analysts and decision makers rely on DWs, especially for data analysis and reporting. Temporal and spatial data strongly affect decision-making and marketing strategies, and many applications require dedicated modelling and treatment of these kinds of data because they cannot be handled efficiently within a conventional multidimensional database. One main application domain of spatiotemporal data warehousing is the telecommunication industry, which is increasingly dominated by massive volumes of data. In this paper, a DW schema modelling approach is proposed that integrates temporal and spatial data in a unified manner within a general data warehousing framework. Temporal and spatial data integration becomes more important as the volume and sharing of data grow. The aim of this research work is to facilitate the understanding, querying and management of spatiotemporal data for on-line analytical processing (OLAP). The proposed spatiotemporal DW schema extends OLAP queries to support spatial and temporal queries. A case study is developed and implemented for the telecommunication industry.
INTRODUCTION
An increasing number of Cloud Computing (CC) platforms provide facilities for big Data Warehouse (DW) storage and manipulation. Making all DW functionality available over the Internet simplifies access to it, and storage is no longer an issue since clouds offer almost limitless storage capacity. The Apache Hive Data Warehouse [1] manages large distributed data sets using SQL, while Microsoft's Azure SQL Data Warehouse [2] offers a fully managed cloud DW as a single holistic DW solution. Amazon also offers cloud DW capabilities through Amazon Redshift clusters using standard SQL [3]. Google competes with the other big vendors through Google BigQuery [4]. Nowadays, almost all cloud providers, big and small, such as IBM [5], Oracle [6], Teradata [7] and CoolaData [8], have already included DW services in their cloud environments.
Business intelligence (BI) is a technology-driven process for the collection, integration, analysis, and presentation of business information. It includes a wide variety of tools, applications and methodologies that permit organizations to collect data from internal and external sources for analysis and decision-making. One component of BI is online analytical processing (OLAP), which creates a multidimensional view of the data for the user to analyse. OLAP approaches are classified into three categories: MOLAP, ROLAP and HOLAP. In MOLAP (multidimensional OLAP), the data used for analysis is stored in specialized multidimensional databases. ROLAP (relational OLAP) works directly with relational databases. HOLAP (hybrid OLAP) combines MOLAP with ROLAP by allowing the designer to decide which portion of the data will be stored in MOLAP and which portion in ROLAP. BI data is typically stored in a DW. The purpose of data warehousing is to construct a huge repository of integrated data that is optimized for analysis purposes.
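To make the idea of a multidimensional view concrete, the following minimal sketch shows a ROLAP-style query, in which the dimensions and the fact table live in an ordinary relational database and the analysis is expressed as SQL aggregation. It is an illustration only and not the schema proposed in this paper: the tables dim_region, dim_time and fact_calls, their columns, and the sample values are all hypothetical, and an in-memory SQLite database stands in for a real DW.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny in-memory star schema.

    All table names, column names and rows are hypothetical and exist
    only to illustrate the ROLAP idea.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_calls (
            region_id INTEGER REFERENCES dim_region(region_id),
            time_id INTEGER REFERENCES dim_time(time_id),
            call_minutes REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_region VALUES (?, ?)", [(1, "North"), (2, "South")]
    )
    conn.executemany(
        "INSERT INTO dim_time VALUES (?, ?, ?)", [(1, 2020, 1), (2, 2020, 2)]
    )
    conn.executemany(
        "INSERT INTO fact_calls VALUES (?, ?, ?)",
        [(1, 1, 10.0), (1, 2, 20.0), (2, 1, 5.0), (2, 2, 15.0), (1, 1, 2.5)],
    )
    return conn


def minutes_by_region_and_month(conn):
    """Aggregate the measure over two dimensions (region x month)."""
    return conn.execute(
        """
        SELECT r.region, t.year, t.month, SUM(f.call_minutes)
        FROM fact_calls AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        JOIN dim_time AS t ON t.time_id = f.time_id
        GROUP BY r.region, t.year, t.month
        ORDER BY r.region, t.year, t.month
        """
    ).fetchall()


def minutes_by_region(conn):
    """Roll up along the time dimension, keeping only the region dimension."""
    return conn.execute(
        """
        SELECT r.region, SUM(f.call_minutes)
        FROM fact_calls AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        GROUP BY r.region
        ORDER BY r.region
        """
    ).fetchall()


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    for row in minutes_by_region_and_month(warehouse):
        print(row)
    for row in minutes_by_region(warehouse):
        print(row)
```

The second query is a roll-up of the first: dropping the time dimension from the GROUP BY clause moves the same measure to a coarser level of aggregation. A MOLAP engine would typically precompute and store such aggregates in a multidimensional structure rather than derive them from relational tables at query time.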
Nowadays, the big data challenge is moving traditional DWs to cloud-based DWs, which offer virtually limitless storage resources and secure Internet-based access from anywhere.