EosDNN: An Efficient Offloading Scheme for DNN Inference Acceleration in Local-Edge-Cloud Collaborative Environments

Abstract

With the growing popularity of mobile devices, intelligent applications, e.g., face recognition, voice assistants, and gesture recognition, have become part of our daily lives. However, their limited computing capacity makes it difficult for mobile devices to support complex Deep Neural Network (DNN) inference. To relieve the pressure on these devices, traditional methods typically upload part or all of the DNN model to a cloud server and can only serve DNN queries after the entire model has been uploaded. To achieve real-time DNN queries, we instead exploit collaboration among local, edge, and cloud, and serve DNN queries while DNN partitions are still being uploaded. In this paper, we propose an Efficient offloading scheme for DNN Inference Acceleration (EosDNN) in a local-edge-cloud collaborative environment, where inference acceleration is achieved mainly by minimizing migration delay and enabling real-time DNN queries. EosDNN jointly considers the migration plan and the uploading plan: for the former, a Particle Swarm Optimization with Genetic Algorithm (PSO-GA) is applied to find the distribution of DNN layers across servers with the lowest migration delay; for the latter, a Layer Merge Uploading Algorithm (LMU) is proposed to determine the DNN partitions and their upload order so that queries can be served efficiently. Experimental results demonstrate that EosDNN scales to large DNN model migration, achieves low migration delay, and yields a more fine-grained DNN partition uploading plan, thereby improving DNN query performance.
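The migration-plan step can be pictured with a small sketch. The following is an illustrative toy, not the paper's implementation: the layer sizes, server bandwidths, and parallel-upload delay model are assumptions chosen for the example, and the PSO-GA update is reduced to crossover-with-global-best plus mutation.

```python
# Illustrative sketch of a PSO-GA-style search that assigns DNN layers to servers
# so as to minimize a toy migration-delay objective. All constants are assumptions.
import random

LAYER_SIZES_MB = [4.0, 8.0, 2.0, 16.0, 1.0, 0.5]          # hypothetical per-layer sizes
SERVERS = {"edge1": 80.0, "edge2": 60.0, "cloud": 25.0}    # assumed uplink bandwidth (Mbps)

def migration_delay(assignment):
    """Toy objective: uploads to different servers proceed in parallel,
    so the migration delay is the slowest server's total transfer time."""
    per_server = {s: 0.0 for s in SERVERS}
    for i, size in enumerate(LAYER_SIZES_MB):
        per_server[assignment[i]] += size * 8 / SERVERS[assignment[i]]  # MB -> Mb / Mbps
    return max(per_server.values())

def crossover(a, b):
    """GA-style one-point crossover between two candidate layer placements."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(a, rate=0.1):
    """Randomly reassign a few layers, standing in for PSO's stochastic exploration."""
    return [random.choice(list(SERVERS)) if random.random() < rate else s for s in a]

def pso_ga_search(pop_size=30, iters=200):
    pop = [[random.choice(list(SERVERS)) for _ in LAYER_SIZES_MB] for _ in range(pop_size)]
    best = min(pop, key=migration_delay)
    for _ in range(iters):
        # Each particle moves toward the global best via crossover, then mutates.
        pop = [mutate(crossover(p, best)) for p in pop]
        candidate = min(pop, key=migration_delay)
        if migration_delay(candidate) < migration_delay(best):
            best = candidate
    return best, migration_delay(best)

if __name__ == "__main__":
    plan, delay = pso_ga_search()
    print("layer placement:", plan)
    print("estimated migration delay (s):", round(delay, 3))
```

In this toy model the search effectively load-balances layer uploads across servers; the paper's full formulation additionally accounts for the uploading plan (LMU), which merges layers into partitions and orders their upload so that queries can begin before migration completes.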

Publication
IEEE Transactions on Green Communications and Networking
Pengfei Jiao
Professor