Probing the Decoupling Hypothesis in LLM Reasoning

Dataemia

[Submitted on 23 May 2025 (v1), last revised 4 Feb 2026 (this version, v2)]

Paper: Robust Answers, Fragile Logic: Probing the Decoupling Hypothesis in LLM Reasoning, by Enyi Jiang and 4 other authors

Abstract: While Chain-of-Thought (CoT) prompting has become a cornerstone for complex reasoning in Large Language Models (LLMs), the faithfulness of the generated reasoning remains an open question. We investigate the Decoupling Hypothesis: that correct answers often mask fragile, post-hoc rationalizations that are not causally tied to the model's prediction. To systematically verify this, we introduce MATCHA, a novel Answer-Conditioned Probing framework. Unlike standard evaluations that focus on final output accuracy, MATCHA isolates the reasoning phase by conditioning generation on the model's predicted answer, allowing us to stress-test the stability of the rationale itself. Our experiments reveal a critical vulnerability: under imperceptible input perturbations, LLMs frequently maintain the correct answer while generating inconsistent or nonsensical reasoning – effectively being "Right for the Wrong Reasons". Using LLM judges to quantify this robustness gap, we find that multi-step and commonsense tasks are significantly more susceptible to this decoupling than logical tasks. Furthermore, we demonstrate that adversarial examples generated by MATCHA transfer non-trivially to black-box models. Our findings expose the illusion of CoT robustness and underscore the need for future architectures that enforce genuine answer-reasoning consistency rather than mere surface-level accuracy.
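The probing procedure the abstract describes (condition rationale generation on the model's predicted answer, perturb the input, then have a judge compare the two rationales) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: all function bodies are hypothetical stand-ins for real LLM calls, and the "judge" here is a trivial string comparison rather than an LLM judge.

```python
# Hedged sketch of answer-conditioned probing, loosely following the abstract.
# Every function below is a stub standing in for an LLM API call.

def model_answer(question: str) -> str:
    """Stand-in for the model's final answer (stubbed to a constant)."""
    return "42"

def conditioned_rationale(question: str, answer: str) -> str:
    """Generate reasoning conditioned on the predicted answer (stubbed)."""
    return f"Given the question {question!r}, the answer is {answer}."

def perturb(question: str) -> str:
    """An 'imperceptible' input perturbation (here: a trailing space)."""
    return question + " "

def judge_consistent(rationale_a: str, rationale_b: str) -> bool:
    """Stand-in for an LLM judge scoring rationale consistency
    (stubbed as exact match after trimming outer whitespace)."""
    return rationale_a.strip() == rationale_b.strip()

def probe(question: str) -> dict:
    """Compare answer stability vs. reasoning stability under perturbation."""
    ans = model_answer(question)
    base_rationale = conditioned_rationale(question, ans)

    pert_question = perturb(question)
    pert_ans = model_answer(pert_question)
    pert_rationale = conditioned_rationale(pert_question, pert_ans)

    answer_stable = ans == pert_ans
    reasoning_stable = judge_consistent(base_rationale, pert_rationale)
    return {
        "answer_stable": answer_stable,
        "reasoning_stable": reasoning_stable,
        # "Right for the Wrong Reasons": the answer survives the
        # perturbation but the rationale does not.
        "decoupled": answer_stable and not reasoning_stable,
    }
```

With these stubs, the answer is stable while the rationale (which embeds the perturbed question) changes, so the probe reports decoupling; in the real framework that gap would instead be measured by an LLM judge over generated chains of thought.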

Submission history: From Enyi Jiang. [v1] Fri, 23 May 2025 02:42:16 UTC (1,459 KB); [v2] Wed, 4 Feb 2026 21:36:09 UTC (1,459 KB)

