CATNIP: LLM Unlearning via Calibrated and Tokenized Negative Preference Alignment
Published as an arXiv preprint, 2026
First-author work on LLM unlearning, which aims to remove undesirable knowledge from a model while preserving its utility.
Recommended citation: Yang, Z., Zhong, Y., Hong, J., & Zhu, Z. (2026). CATNIP: LLM Unlearning via Calibrated and Tokenized Negative Preference Alignment. arXiv preprint.
