Multiple stream job performance optimization with source operator graph transformations
Dayarathna, Miyuru; Suzumura, Toyotaro
doi: 10.1002/cpe.5658
pmid: N/A
Multiple distributed stream queries, which are executed on stream processing systems, need to be fine‐tuned to the compute cluster in order to harness the full potential of the hardware they run on. In this paper, we describe an automatic technique for conducting such stream query optimization in the presence of multiple stream jobs. During this autotuning process, we identify the structure of each program and conduct automatic program transformation to generate optimized unified streaming jobs. The operators of the unified secondary sample application are grouped into processing elements, considering their performance characteristics and the stream graph topology, to produce a high‐performance stream query network. We implemented this multiple stream query optimization technique in a mechanism called Tahitica. We demonstrate our approach's ability to produce optimized stream query performance by comparing it to naive deployments using two real‐world stream processing applications in the domains of health care and search advertising. Our stream query optimization approach achieved a 7.1% throughput improvement compared to a naive deployment.
An intelligent trust model for hybrid DDoS detection in software defined networks
Gong, Changqing; Yu, Delong; Zhao, Liang; Li, Xiguang; Li, Xianwei
doi: 10.1002/cpe.5264
pmid: N/A
Software Defined Networks (SDNs) have been extensively studied in recent years. However, the centralised and programmable controller also brings many security challenges. As a conventional attack with the purpose of destruction, Distributed Denial of Service (DDoS) remains a threat to SDNs. There is a lack of trust evaluation and management mechanisms between the OpenFlow switches in SDNs. Therefore, in this paper, we propose a trust evaluation and management model, namely, the Intelligent Trust Model (ITM). In this framework, the Extreme Learning Machine (ELM) is applied to detect hybrid DDoS attacks in SDNs. ITM can update the trust values of the OpenFlow switches in real time and respond quickly to different types of DDoS attacks. The experimental results show that, compared with other approaches, our model provides more efficient detection with high detection accuracy and a low false positive rate for hybrid DDoS attacks. Finally, in our proposal, OpenFlow switches with higher trust have a relatively higher priority, which resolves the flow conflict issue in the infrastructure layer.
A comprehensive survey of security threats and their mitigation techniques for next‐generation SDN controllers
Han, Tao; Jan, Syed Rooh Ullah; Tan, Zhiyuan; Usman, Muhammad; Jan, Mian Ahmad; Khan, Rahim; Xu, Yongzhao
doi: 10.1002/cpe.5300
pmid: N/A
Software Defined Networking (SDN) and Network Virtualization (NV) are emerging paradigms that simplify the control and management of next‐generation networks, most importantly the Internet of Things (IoT), Cloud Computing, and Cyber‐Physical Systems. The IoT comprises a vast and diverse collection of heterogeneous devices that require interoperable communication, scalable platforms, and security provisioning. Security provisioning for an SDN‐based IoT network is a real challenge: connecting heterogeneous devices that use a wide range of access protocols gives rise to various serious security threats. Furthermore, the logically centralized control intelligence of the SDN architecture presents a plethora of security challenges because it is a single point of failure. A compromised controller may throw the entire network into chaos and expose it to various known and unknown security threats and attacks. Security of SDN‐controlled IoT environments is still in its infancy and thus remains a prime research agenda for both industry and academia. This paper comprehensively reviews the current state‐of‐the‐art security threats, vulnerabilities, and issues at the control plane. Moreover, this paper contributes a detailed classification of various security attacks on the control layer. A comprehensive state‐of‐the‐art review of the latest mitigation techniques for various security breaches is also presented. Finally, this paper presents future research directions and challenges for further investigation.
Machine learning algorithms to detect DDoS attacks in SDN
Santos, Reneilson; Souza, Danilo; Santo, Walter; Ribeiro, Admilson; Moreno, Edward
doi: 10.1002/cpe.5402
pmid: N/A
Software‐Defined Networking (SDN) is an emerging network paradigm that has gained significant traction among researchers as a way to address the requirements of current data centers. Although central control is the major advantage of SDN, it is also a single point of failure if it is made unreachable by a Distributed Denial of Service (DDoS) attack. Despite the large number of traditional detection solutions that currently exist, DDoS attacks continue to grow in frequency, volume, and severity. This paper analyzes the problem and proposes the implementation of four machine learning algorithms (SVM, MLP, Decision Tree, and Random Forest) to classify DDoS attacks in a simulated SDN environment (Mininet 2.2.2). To this end, the DDoS attacks were simulated using the Scapy tool with a list of valid IPs. The best accuracy was obtained with the Random Forest algorithm and the best processing time with the Decision Tree algorithm. Moreover, the paper identifies the most important features for classifying DDoS attacks and some drawbacks in implementing a classifier to detect the three kinds of DDoS attacks discussed (controller attack, flow‐table attack, and bandwidth attack).
The optimization of virtual resource allocation in cloud computing based on RBPSO
Wang, Xiaohui; Gu, Haoran; Yue, YuXian
doi: 10.1002/cpe.5113
pmid: N/A
Virtual resource allocation in cloud computing is becoming a critical issue. In order to meet the task requirements of different users, virtual machines need to be placed on physical machines in the data center through virtualization technology. In this process, however, the overall load balance, energy consumption, and resource utilization of the physical machines should be considered. Therefore, two models are established for two different optimization targets. The first model is built to minimize the degree of load imbalance. The second model is built to maximize resource utilization and minimize energy consumption. To obtain better virtual machine placement results, we propose a new algorithm called resampled binary particle swarm optimization (RBPSO). To enhance the global search ability of BPSO, RBPSO adds re‐sampling, mutation, and small vibration processes to it, with the aim of maintaining the diversity of the population, reducing redundant calculation, and thereby improving the ability and efficiency of the algorithm. RBPSO is then used to solve the deployment problem of virtual machines in cloud computing. The experiments show that the proposed model is reasonable and that RBPSO performs better than BPSO and the genetic algorithm (GA).
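As a rough illustration of the baseline that this abstract builds on, the sketch below applies plain binary PSO (sigmoid‐based bit sampling) to a toy placement problem that minimizes load imbalance. It is not the authors' RBPSO: the re‐sampling, mutation, and small vibration steps are omitted, and everything here is an assumption made for illustration, including the imbalance metric (standard deviation of per‐machine load), the one‐hot repair step, the parameter values, and the names `load_imbalance`, `decode`, and `bpso_place`.

```python
import math
import random


def load_imbalance(assignment, vm_loads, num_pms):
    """Assumed metric: population standard deviation of total load per PM."""
    pm_loads = [0.0] * num_pms
    for vm, pm in enumerate(assignment):
        pm_loads[pm] += vm_loads[vm]
    mean = sum(pm_loads) / num_pms
    return math.sqrt(sum((load - mean) ** 2 for load in pm_loads) / num_pms)


def one_hot(assignment, num_pms):
    """Encode a VM -> PM assignment as one binary row per VM."""
    return [[1 if j == pm else 0 for j in range(num_pms)] for pm in assignment]


def decode(bits, rng):
    """Repair sampled bits so that every VM sits on exactly one PM."""
    assignment = []
    for row in bits:
        # Pick among the set bits; if none is set, pick among all PMs.
        candidates = [j for j, bit in enumerate(row) if bit] or list(range(len(row)))
        assignment.append(rng.choice(candidates))
    return assignment


def bpso_place(vm_loads, num_pms, particles=20, iters=100,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain binary PSO for VM placement (hypothetical helper, not RBPSO)."""
    rng = random.Random(seed)
    n = len(vm_loads)

    def cost(assignment):
        return load_imbalance(assignment, vm_loads, num_pms)

    positions = [[rng.randrange(num_pms) for _ in range(n)] for _ in range(particles)]
    velocities = [[[0.0] * num_pms for _ in range(n)] for _ in range(particles)]
    pbest = [list(p) for p in positions]
    gbest = list(min(pbest, key=cost))

    for _ in range(iters):
        for p in range(particles):
            x = one_hot(positions[p], num_pms)
            pb = one_hot(pbest[p], num_pms)
            gb = one_hot(gbest, num_pms)
            bits = []
            for i in range(n):
                row = []
                for j in range(num_pms):
                    v = (w * velocities[p][i][j]
                         + c1 * rng.random() * (pb[i][j] - x[i][j])
                         + c2 * rng.random() * (gb[i][j] - x[i][j]))
                    v = max(-4.0, min(4.0, v))  # clamp to keep the sigmoid responsive
                    velocities[p][i][j] = v
                    # Binary PSO: the velocity sets the probability that the bit is 1.
                    row.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0)
                bits.append(row)
            positions[p] = decode(bits, rng)
            if cost(positions[p]) < cost(pbest[p]):
                pbest[p] = list(positions[p])
                if cost(pbest[p]) < cost(gbest):
                    gbest = list(pbest[p])
    return gbest, cost(gbest)
```

Calling `bpso_place([4, 3, 3, 2, 2, 2], num_pms=2)` returns a list giving the physical machine index for each virtual machine, together with the imbalance of that placement. The second model in the abstract (resource utilization and energy consumption) would need a different cost function, which is not sketched here.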
Submarine: A subscription‐based data streaming framework for integrating large facilities and advanced cyberinfrastructure
Zamani, Ali Reza; AbdelBaky, Moustafa; Balouek‐Thomert, Daniel; Villalobos, J. J.; Rodero, Ivan; Parashar, Manish
doi: 10.1002/cpe.5256
pmid: N/A
Large scientific facilities provide researchers with instrumentation, data, and data products that can accelerate scientific discovery. However, increasing data volumes coupled with limited local computational power prevent researchers from taking full advantage of what these facilities can offer. Many researchers have looked into using commercial and academic cyberinfrastructure (CI) to process these data. Nevertheless, there remains a disconnect between large facilities and CI that requires researchers to be actively involved in the data processing cycle. The increasing complexity of CI and the growing scale of data necessitate new data delivery models that can autonomously integrate large‐scale scientific facilities and CI to deliver real‐time data and insights. In this paper, we present our initial efforts using the Ocean Observatories Initiative project as a use case. In particular, we present a subscription‐based data streaming service for data delivery that leverages the Apache Kafka data streaming platform. We also show how our solution can automatically integrate large‐scale facilities with CI services for automated data processing.
Time‐critical data management in clouds: Challenges and a Dynamic Real‐Time Infrastructure Planner (DRIP) solution
Koulouzis, Spiros; Martin, Paul; Zhou, Huan; Hu, Yang; Wang, Junchao; Carval, Thierry; Grenier, Baptiste; Heikkinen, Jani; Laat, Cees; Zhao, Zhiming
doi: 10.1002/cpe.5269
pmid: N/A
The increasing volume of data being produced, curated, and made available by research infrastructures in the environmental science domain requires services that can optimize the delivery and staging of data for researchers and other users of scientific data. Specialized data services for managing the data life cycle, for creating and delivering data products, and for customized data processing and analysis all play a crucial role in how these research infrastructures serve their communities. Many of these activities are time‐critical, needing to be carried out frequently within specific time windows. We describe our experiences identifying the time‐critical requirements of environmental scientists making use of computational research support environments. We present a microservice‐based infrastructure optimization suite, the Dynamic Real‐Time Infrastructure Planner, used for constructing virtual infrastructures for research applications on demand. We provide a case study in which our suite is used to optimize runtime service quality for a data subscription service provided by the Euro‐Argo using EGI Federated Cloud and EUDAT's B2SAFE services, and we consider how this case study relates to other application scenarios.
Reliable access to massive restricted texts: Experience‐based evaluation
Peng, Zong; Plale, Beth
doi: 10.1002/cpe.5255
pmid: N/A
Libraries are seeing growing numbers of digitized textual corpora that frequently come with restrictions on their content. Computational analysis of large corpora, while of interest to scholars, can be cumbersome because of the combination of size, granularity of access, and access restrictions. Efficient management of such a collection for general access, especially under failures, depends on the primary storage system. In this paper, we identify the requirements of managing a massive text corpus for computational analysis and use them as a basis to evaluate candidate storage solutions. The study is based on the 5.9 billion page collection of the HathiTrust digital library. Our findings led to the choice of Cassandra 3.x for the primary back‐end store, which is currently in deployment in the HathiTrust Research Center.