TL;DR: Attention sinks in Diffusion Language Models are transient, not stable anchors — so the AR heuristic of "always keep sinks" breaks. We identify and prune them instead, beating strong pruning ...
Workstation with Ubuntu 22.04 for compatibility with ROS2 Humble. A workstation with a GPU (e.g., NVIDIA RTX 3090) is required. 1 robot arms with a 6-axis force/torque sensor at the end effector. We ...