HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusalcs.AI updates on arXiv.org