Obfuscated Reasoning

LLMs can learn and generalize steganographic encoding under the right conditions.

A Venn diagram illustrating the relationship between different performance and CoT estimation methods, including misaligned CoT and translucently misaligned CoT, and their impact on performance and performance oversight.