
High Efficiency Inference Accelerating Algorithm for NOMA-Based Edge Intelligence

  • Additional Information
    • Publication Information:
      Institute of Electrical and Electronics Engineers (IEEE), 2024.
    • Publication Date:
      2024
    • Abstract:
      Although artificial intelligence (AI) is widely used and has significantly changed our lives, deploying large AI models directly on resource-limited edge devices is impractical. Model split inference has therefore been proposed to improve the performance of edge intelligence (EI): the AI model is divided into sub-models, and the resource-intensive sub-model is offloaded wirelessly to an edge server to reduce resource requirements and inference latency. Unfortunately, with the sharp increase in the number of edge devices, the shortage of spectrum resources in edge networks has become severe in recent years, which limits further performance gains in EI. Following the example of NOMA-based edge computing (EC), integrating non-orthogonal multiple access (NOMA) technology with split inference in EI is attractive. However, previous works on model split inference in EI have not properly considered the NOMA-based communication aspect or the influence of intermediate data transmission, and the resource-allocation complexity introduced by the NOMA scheme makes the problem even harder. This paper therefore proposes the Effective Communication and Computing resource allocation algorithm, abbreviated ECC, to accelerate split inference in NOMA-based EI. Specifically, ECC accounts for both energy consumption and inference latency to find the optimal model split strategy and resource allocation strategy (subchannel, transmission power, and computing resource). Since minimum inference delay and minimum energy consumption cannot be achieved simultaneously, a gradient descent (GD) based algorithm is adopted to find the optimal tradeoff between them. Moreover, a loop-iteration GD approach (Li-GD) is developed to reduce the complexity that parameter discretization adds to the GD algorithm. The key idea of Li-GD is that the initial value of the i-th layer's GD procedure is selected from the optimal results of the previous (i-1) layers' GD procedures, choosing the one whose intermediate data size is closest to that of the i-th layer (a minimal sketch of this warm-start idea follows this record). Additionally, the properties of the proposed algorithms, including convergence, complexity, and approximation error, are investigated. Experimental results demonstrate that ECC performs considerably better than previous approaches.
    • File Description:
      application/pdf
    • ISSN:
      1558-2248
      1536-1276
    • DOI:
      10.1109/twc.2024.3454086
    • Rights:
      CC BY
    • Accession Number:
      edsair.doi.dedup.....5419daa90162cfba8b9af535b008808f
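  • Li-GD warm start, illustrated
    A minimal sketch of the warm-start idea described in the abstract, assuming a toy weighted latency/energy cost and placeholder intermediate data sizes; the function names (objective, grad_descent) and all constants here are illustrative assumptions, not the paper's model or code. For each candidate split layer, a plain gradient descent over transmit power and computing share is initialized from the stored optimum of the already-solved layer whose intermediate data size is closest.

```python
# Toy illustration of the Li-GD warm-start idea: each layer's gradient-descent
# search is initialized from the optimum of the already-solved layer whose
# intermediate data size is closest. Objective, sizes, and constants are
# placeholders, not the paper's system model.
import numpy as np

# Illustrative intermediate data sizes (e.g., MB) at each candidate split layer.
layer_data_sizes = [4.0, 2.5, 1.2, 1.0, 0.3]

def objective(x, data_size, alpha=0.5):
    """Toy weighted latency/energy cost for transmit power x[0] and CPU share x[1]."""
    power, cpu = np.clip(x, 1e-3, None)
    latency = data_size / np.log2(1.0 + power) + 1.0 / cpu          # upload + compute delay (toy)
    energy = power * (data_size / np.log2(1.0 + power)) + cpu**2    # transmit + compute energy (toy)
    return alpha * latency + (1.0 - alpha) * energy

def grad_descent(x0, data_size, lr=0.01, steps=500):
    """Plain finite-difference gradient descent from a given starting point."""
    x = np.array(x0, dtype=float)
    eps = 1e-5
    for _ in range(steps):
        g = np.zeros_like(x)
        for k in range(len(x)):
            d = np.zeros_like(x)
            d[k] = eps
            g[k] = (objective(x + d, data_size) - objective(x - d, data_size)) / (2 * eps)
        x = np.clip(x - lr * g, 1e-3, None)
    return x, objective(x, data_size)

solved = []   # (data_size, optimal resource vector) for layers already processed
best = None
for i, size in enumerate(layer_data_sizes):
    if solved:
        # Li-GD warm start: reuse the optimum of the solved layer whose
        # intermediate data size is closest to this layer's.
        x0 = min(solved, key=lambda s: abs(s[0] - size))[1]
    else:
        x0 = [1.0, 1.0]  # arbitrary starting point for the first layer
    x_opt, cost = grad_descent(x0, size)
    solved.append((size, x_opt))
    if best is None or cost < best[1]:
        best = (i, cost, x_opt)

print(f"best split layer {best[0]}, cost {best[1]:.3f}, resources {best[2]}")
```

    The point of the sketch is that reusing a nearby layer's optimum gives each GD run a good starting point, which is how Li-GD is said to reduce the complexity introduced by discretizing the split-point decision.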