DSA-2LM: A CPU-Free Tiered Memory Architecture with Intel DSA

Authors: 

Ruili Liu, Tsinghua University and University of Electronic Science and Technology of China; Teng Ma, Alibaba Group; Mingxing Zhang, Jialiang Huang, and Yingdi Shan, Tsinghua University; Zheng Liu, Alibaba Group; Lingfeng Xiang, Zhen Lin, Hui Lu, and Jia Rao, The University of Texas at Arlington; Kang Chen and Yongwei Wu, Tsinghua University

Abstract: 

Tiered memory is critical for managing heterogeneous memory devices such as Persistent Memory and CXL Memory. Existing systems make difficult trade-offs between optimal data placement and costly data movement. The Intel Data Streaming Accelerator (DSA), a hardware engine that moves data between memory regions without CPU involvement, can make data movement up to 4× faster than a single CPU core. However, the fine granularity of memory movement in the Linux kernel undermines this potential performance improvement. To address this, we have developed DSA-2LM, a new tiered memory system that adaptively integrates DSA into page migration. The framework combines a fast memory migration workflow and adaptable concurrent data paths with well-tuned DSA configurations. Experimental results show that, compared to three representative tiered memory systems, MEMTIS, TPP, and NOMAD, DSA-2LM achieves performance improvements of 20%, 30%, and 16%, respectively, on real-world applications.

USENIX ATC '25. Open Access sponsored by King Abdullah University of Science and Technology (KAUST).
