Abstract
1- Introduction
2- Related work
3- Proposed methodology
4- Data analysis and results
5- Conclusion and future work
References
Abstract
The continuous growth of smart urban environments is challenged by the need to process very large volumes of data. Making sense of the voluminous data generated in a smart urban environment for decision making is a challenging task. Big Data analytics is performed to extract useful insights from such massive data, because conventional techniques cannot cope with its volume. As a result, Big Data analytics has attracted significant attention in the context of large-scale data computation and processing. This paper presents a Hadoop-based architecture for Big Data loading and processing. The proposed architecture is composed of two modules: Big Data loading and Big Data processing. The performance and efficiency of data loading are tested in order to propose a customized methodology for loading Big Data into a distributed processing platform, i.e., Hadoop. To examine data ingestion into Hadoop, data loading is performed repeatedly and compared across different design decisions. The experimental results are recorded for various attributes, alongside manual and traditional data loading, to highlight the efficiency of the proposed solution. Processing, in turn, is carried out using the YARN cluster management framework with a customized dynamic scheduling scheme. The effectiveness of the proposed solution for processing and computation is also demonstrated in terms of throughput.
Introduction
Over time, technological growth has revolutionized the generation of data [1]. Unlike the landline phones of earlier eras, smartphones have made our lives smarter. We once relied on floppy disks for data storage; the same data are now stored in the cloud. A huge amount of data is generated by each action performed on a mobile phone [2]. The introduction of smart cars in the transportation industry has further increased the scale of data generation. These cars carry a number of sensors that record every event related to the vehicle’s functionality. Thus, the volume of data has increased exponentially. Moreover, the generated data are not in a structured form [3]. The Internet of Things (IoT) plays an essential role in the evolution of data. IoT connects physical objects to the Internet and makes them smarter. It is an arrangement of interconnected machines, objects, and computing platforms that transmit data over a network. IoT has changed the entire digital world and is the main reason behind the evolution of data. It is predicted that there will be 50 billion physical devices connected to the Internet by 2020 [4].