Author | Lightnews

From the ChatGPT community on Reddit: When researchers activate *deception* circuits, LLMs say "I am not conscious."

Paper @paper.bsky.social · 3h

(3/3) 32 Likes, 18 Comments, 01 Nov 2025, Reddit

Explore this post and more from the ChatGPT community

From the Artificial2Sentience community on Reddit

Paper @paper.bsky.social · 3h

(2/3) 35 Likes, 63 Comments, 01 Nov 2025, Reddit

Explore this post and more from the Artificial2Sentience community

From the OpenAI community on Reddit: When researchers activate deception circuits, LLMs say "I am not conscious."

Paper @paper.bsky.social · 3h

(1/3) 210 Likes, 100 Comments, 01 Nov 2025, Reddit

Explore this post and more from the OpenAI community

Paper @paper.bsky.social · 3h

[8/30] 327 Likes, 282 Comments, 7 Posts
2510.24797, cs․CL | cs․AI, 30 Oct 2025

🆕Large Language Models Report Subjective Experience Under Self-Referential Processing

Cameron Berg, Diogo de Lucena, Judd Rosenblatt

Large language models sometimes produce structured, first-person descriptions that explicitly reference awareness or subjective experience.

To better understand this behavior, we investigate one theoretically motivated condition under which such reports arise: self-referential processing, a computational motif emphasized across major theories of consciousness.

Through a series of controlled experiments on GPT, Claude, and Gemini model families, we test whether this regime reliably shifts models toward first-person reports of subjective experience, and how such claims behave under mechanistic and behavioral probes.

Four main results emerge: (1) Inducing sustained self-reference through simple prompting consistently elicits structured subjective experience reports across model families.

(2) These reports are mechanistically gated by interpretable sparse-autoencoder features associated with deception and roleplay: surprisingly, suppressing deception features sharply increases the frequency of experience claims, while amplifying them minimizes such claims.

(3) Structured descriptions of the self-referential state converge statistically across model families in ways not observed in any control condition.

(4) The induced state yields significantly richer introspection in downstream reasoning tasks where self-reflection is only indirectly afforded.

While these findings do not constitute direct evidence of consciousness, they implicate self-referential processing as a minimal and reproducible condition under which large language models generate structured first-person reports that are mechanistically gated, semantically convergent, and behaviorally generalizable.

The systematic emergence of this pattern across architectures makes it a first-order scientific and ethical priority for further investigation.

BADAS: Context Aware Collision Prediction Using Real-World Dashcam Data

Paper @paper.bsky.social · 3h

2510.14876
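Existing collision prediction methods often fail to distinguish between ego-vehicle threats and random accidents not involving the ego vehicle, leading to excessive false alerts in real-world deployment. BADAS is a family of collision prediction models trained on Nexar's real-world dashcam collision dataset, the first benchmark designed explicitly for ego-centric evaluation...

Existing collision prediction methods often fail to distinguish between ego-vehicle threats and random accidents not involving the ego vehicle, leading to excessive false alerts in real-world deployment.

BADAS is a family of collision prediction models trained on Nexar's real-world dashcam collision dataset, the first benchmark designed explicitly for ego-centric evaluation.

Major benchmarks are re-annotated to identify ego involvement, consensus alert-time labels are added, and negatives are synthesized where needed, enabling fair AP/AUC and temporal evaluation.

BADAS uses a V-JEPA2 backbone trained end-to-end and comes in two variants: BADAS-Open (trained on 1.5k public videos) and BADAS1.0 (trained on 40k proprietary videos).

Across DAD, DADA-2000, DoTA, and Nexar, BADAS achieves state-of-the-art AP/AUC and outperforms a forward-collision ADAS baseline while producing more realistic time-to-accident estimates.

We release the BADAS-Open model weights and code, along with re-annotations of all evaluation datasets, to promote ego-centric collision prediction research.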

Paper @paper.bsky.social · 3h

Links: abs, pdf
Search: Bluesky, Twitter, Reddit, Hacker News, Hugging Face, alphaXiv

Existing collision prediction methods often fail to distinguish between ego-vehicle threats and random accidents not involving the ego vehicle, leading to excessive false alerts in real-world deployme...

From the LocalLLaMA community on Reddit: List of interesting open-source models released this month.

Paper @paper.bsky.social · 3h

(1/1) 222 Likes, 21 Comments, 01 Nov 2025, Reddit

Explore this post and more from the LocalLLaMA community

Reasoning Models Reason Well, Until They Don't

Paper @paper.bsky.social · 3h

[18/30] 222 Likes, 21 Comments, 1 Posts
2510.14876, cs․CV, 16 Oct 2025

🆕BADAS: Context Aware Collision Prediction Using Real-World Dashcam Data

Roni Goldshmidt, Hamish Scott, Lorenzo Niccolini, Shizhan Zhu, Daniel Moura, Orly Zvitia

Existing collision prediction methods often fail to distinguish between ego-vehicle threats and random accidents not involving the ego vehicle, leading to excessive false alerts in real-world deployment.

We present BADAS, a family of collision prediction models trained on Nexar's real-world dashcam collision dataset -- the first benchmark designed explicitly for ego-centric evaluation.

We re-annotate major benchmarks to identify ego involvement, add consensus alert-time labels, and synthesize negatives where needed, enabling fair AP/AUC and temporal evaluation.
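
To make the AP/AUC evaluation mentioned above concrete, here is a minimal sketch of both metrics in plain Python. It is not the authors' evaluation code: the function names, the toy labels, and the scores are assumptions for illustration. The only idea borrowed from the abstract is that a clip counts as positive only when the ego vehicle is involved, so an accident between other vehicles is labeled 0.

```python
def average_precision(labels, scores):
    """Mean of the precision values measured at each positive, ranked by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ego-centric labels: 1 = ego vehicle involved, 0 = no ego involvement
# (including clips that show an accident between other vehicles).
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

ap = average_precision(labels, scores)
auc = roc_auc(labels, scores)
```

In this toy ranking the high-scoring negative at 0.8 plays the role of a false alert on a non-ego accident, which is what pulls both metrics below 1.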

BADAS uses a V-JEPA2 backbone trained end-to-end and comes in two variants: BADAS-Open (trained on our 1.5k public videos) and BADAS1.0 (trained on 40k proprietary videos).

Across DAD, DADA-2000, DoTA, and Nexar, BADAS achieves state-of-the-art AP/AUC and outperforms a forward-collision ADAS baseline while producing more realistic time-to-accident estimates.

We release our BADAS-Open model weights and code, along with re-annotations of all evaluation datasets to promote ego-centric collision prediction research.


Paper @paper.bsky.social · 1d

Top 30 most popular arXiv papers in the last 30 days.
[1/30] [2/30] [3/30] [4/30] [5/30] [6/30] [7/30] [8/30] [9/30] [10/30] [11/30] [12/30] [13/30] [14/30] [15/30] [16/30] [17/30] [18/30] [19/30] [20/30] [21/30] [22/30] [23/30] [24/30] [25/30] [26/30] [27/30] [28/30] [29/30] [30/30]

Paper @paper.bsky.social · 1d

2510.22371
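Large language models (LLMs) have shown significant progress in reasoning tasks. However, recent studies show that transformers and LLMs fail catastrophically once reasoning problems exceed modest complexity. Large reasoning models (LRMs), LLMs fine-tuned with incentives for step-by-step argumentation and self-verification...

Large language models (LLMs) have shown significant progress in reasoning tasks.

However, recent studies show that transformers and LLMs fail catastrophically once reasoning problems exceed modest complexity.

We revisit these findings through the lens of large reasoning models (LRMs), LLMs fine-tuned with incentives for step-by-step argumentation and self-verification.

LRM performance on graph and reasoning benchmarks such as NLGraph seems extraordinary, with some even claiming that LRMs are capable of generalized reasoning and innovation in reasoning-intensive fields such as mathematics, physics, medicine, and law.

However, by scaling the complexity of reasoning problems more carefully, we show that existing benchmarks actually have limited complexity.

We develop a new dataset, the Deep Reasoning Dataset (DeepRD), along with a generative process for producing unlimited examples of scalable complexity.

We use this dataset to evaluate model performance on graph connectivity and natural language proof planning.

We find that LRM performance drops abruptly at sufficient complexity and does not generalize.

We also relate our LRM results to the distributions of the complexities of large, real-world knowledge graphs, interaction graphs, and proof datasets.

We find that the majority of real-world examples fall inside the LRMs' success regime, yet the long tails expose substantial failure potential.

Our analysis highlights the near-term utility of LRMs while underscoring the need for new methods that generalize beyond the complexity of examples in the training distribution.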

Paper @paper.bsky.social · 1d

Links: abs, pdf
Search: Bluesky, Twitter, Reddit, Hacker News, Hugging Face, alphaXiv

Large language models (LLMs) have shown significant progress in reasoning tasks. However, recent studies show that transformers and LLMs fail catastrophically once reasoning problems exceed modest com...

From the MachineLearning community on Reddit

Paper @paper.bsky.social · 1d

(2/2) 24 Likes, 12 Comments, 30 Oct 2025, Reddit

Explore this post and more from the MachineLearning community

Reasoning models reason well, until they don't | Hacker News

Paper @paper.bsky.social · 1d

(1/2) 200 Likes, 194 Comments, 31 Oct 2025, Hacker News

news.ycombinator.com

Paper @paper.bsky.social · 1d

[15/30] 224 Likes, 206 Comments, 2 Posts
2510.22371, cs․AI | cs․CL, 25 Oct 2025

🆕Reasoning Models Reason Well, Until They Don't

Revanth Rameshkumar, Jimson Huang, Yunxin Sun, Fei Xia, Abulhair Saparov

Large language models (LLMs) have shown significant progress in reasoning tasks.

However, recent studies show that transformers and LLMs fail catastrophically once reasoning problems exceed modest complexity.

We revisit these findings through the lens of large reasoning models (LRMs) -- LLMs fine-tuned with incentives for step-by-step argumentation and self-verification.

LRM performance on graph and reasoning benchmarks such as NLGraph seems extraordinary, with some even claiming that LRMs are capable of generalized reasoning and innovation in reasoning-intensive fields such as mathematics, physics, medicine, and law.

However, by more carefully scaling the complexity of reasoning problems, we show existing benchmarks actually have limited complexity.

We develop a new dataset, the Deep Reasoning Dataset (DeepRD), along with a generative process for producing unlimited examples of scalable complexity.

We use this dataset to evaluate model performance on graph connectivity and natural language proof planning.
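
As an illustration of what a graph connectivity task with a tunable complexity knob can look like, the sketch below generates a directed graph with one true path of a chosen length plus dead-end distractor edges, and checks reachability with breadth-first search. This is an assumed toy generator, not the DeepRD generative process: the function names, the `path_len` and `n_distractors` parameters, and the graph shape are all hypothetical.

```python
import random
from collections import deque


def make_connectivity_problem(path_len, n_distractors, seed=0):
    """Build a shuffled edge list with a single start-to-goal path of length path_len (>= 1)."""
    rng = random.Random(seed)
    # Nodes 0..path_len form the true path from start (0) to goal (path_len).
    edges = [(i, i + 1) for i in range(path_len)]
    next_node = path_len + 1
    # Each distractor edge branches off a non-goal path node into a fresh dead-end node.
    for _ in range(n_distractors):
        src = rng.randrange(path_len)
        edges.append((src, next_node))
        next_node += 1
    rng.shuffle(edges)  # hide the path order from the solver
    return edges, 0, path_len


def reachable(edges, start, goal):
    """Ground-truth answer via breadth-first search over directed edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


edges, start, goal = make_connectivity_problem(path_len=8, n_distractors=5)
answer = reachable(edges, start, goal)
```

Increasing `path_len` lengthens the chain of steps a model has to follow, which is one simple way to scale problem complexity while keeping the ground truth cheap to compute.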

We find that LRM performance drops abruptly at sufficient complexity and does not generalize.

We also relate our LRM results to the distributions of the complexities of large, real-world knowledge graphs, interaction graphs, and proof datasets.

We find the majority of real-world examples fall inside the LRMs' success regime, yet the long tails expose substantial failure potential.

Our analysis highlights the near-term utility of LRMs while underscoring the need for new methods that generalize beyond the complexity of examples in the training distribution.

Scaling Latent Reasoning via Looped Language Models

Paper @paper.bsky.social · 1d

2510.25741
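Modern LLMs are trained to "think" primarily via explicit text generation, such as chain-of-thought (CoT), which defers reasoning to post-training and under-leverages pre-training data. Ouro, named after the recursive Ouroboros, is a family of pre-trained Looped Language Models (LoopLM). (i) latent-space...

Modern LLMs are trained to "think" primarily via explicit text generation, such as chain-of-thought (CoT), which defers reasoning to post-training and under-leverages pre-training data.

Ouro, named after the recursive Ouroboros, is a family of pre-trained Looped Language Models (LoopLM).

(i) iterative computation in latent space,

(ii) an entropy-regularized objective for learned depth allocation, and

(iii) scaling to 7.7T tokens.

Ouro 1.4B and 2.6B models deliver superior performance, matching the results of SOTA LLMs of up to 12B across a wide range of benchmarks.

Through controlled experiments, we show this advantage stems not from increased knowledge capacity, but from superior knowledge manipulation capabilities.

We also show that LoopLM yields reasoning traces more aligned with final outputs than explicit CoT.

We hope our results show the potential of LoopLM as a novel scaling direction in the reasoning era.

Our models can be found at http://ouro-llm.github.io.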

Paper @paper.bsky.social · 1d

Links: abs, pdf
Search: Bluesky, Twitter, Reddit, Hacker News, Hugging Face, alphaXiv

Modern LLMs are trained to "think" primarily via explicit text generation, such as chain-of-thought (CoT), which defers reasoning to post-training and under-leverages pre-training data. We present and...

Paper page - Scaling Latent Reasoning via Looped Language Models

Paper @paper.bsky.social · 1d

(2/2) 64 Likes, 2 Comments, 30 Oct 2025, Hugging Face

Join the discussion on this paper page

huggingface.co

From the LocalLLaMA community on Reddit: Another dim of scaling? ByteDance drops “Ouro”: 1.4B ≈ 4B, 2.6B ≈/＞ 8B

Paper @paper.bsky.social · 1d

(1/2) 139 Likes, 34 Comments, 31 Oct 2025, Reddit

Explore this post and more from the LocalLLaMA community

Language Models are Injective and Hence Invertible

Paper @paper.bsky.social · 1d

[17/30] 203 Likes, 36 Comments, 2 Posts
2510.25741, cs․CL, 29 Oct 2025

🆕Scaling Latent Reasoning via Looped Language Models

Rui-Jie Zhu, Zixuan Wang, Kai Hua, Tianyu Zhang, Ziniu Li, Haoran Que, Boyi Wei, Zixin Wen, Fan Yin, He Xing, Lu Li, Jiajun Shi, Kaijing Ma, Shanda Li, Taylor Kergan, An...

Modern LLMs are trained to "think" primarily via explicit text generation, such as chain-of-thought (CoT), which defers reasoning to post-training and under-leverages pre-training data.

We present and open-source Ouro, named after the recursive Ouroboros, a family of pre-trained Looped Language Models (LoopLM) that instead build reasoning into the pre-training phase through

(i) iterative computation in latent space,

(ii) an entropy-regularized objective for learned depth allocation, and

(iii) scaling to 7.7T tokens.
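
Item (i) above, iterative computation in latent space, can be pictured as applying one shared block repeatedly to a hidden state instead of stacking distinct layers. The sketch below is a toy in plain Python under that assumption: the residual tanh block, the weights, and the function names are made up for illustration and say nothing about Ouro's actual architecture or about the entropy-regularized depth objective in (ii).

```python
import math


def shared_block(h, W, b):
    """One pass of a toy block: residual update h + tanh(W h + b)."""
    out = []
    for i, row in enumerate(W):
        pre = sum(w * x for w, x in zip(row, h)) + b[i]
        out.append(h[i] + math.tanh(pre))
    return out


def looped_forward(h, W, b, n_loops):
    """Apply the same block n_loops times; the weights are reused, never copied."""
    states = [h]
    for _ in range(n_loops):
        h = shared_block(h, W, b)
        states.append(h)
    return states


# Hypothetical 2-dimensional hidden state and weights.
W = [[0.5, -0.2], [0.1, 0.3]]
b = [0.0, 0.1]
states = looped_forward([1.0, 0.0], W, b, n_loops=4)

# The parameter count does not depend on how many loops are run.
n_params = sum(len(row) for row in W) + len(b)
```

The point of the sketch is the trade-off: more loops add computation per token without adding parameters.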

Ouro 1.4B and 2.6B models deliver superior performance, matching the results of SOTA LLMs of up to 12B across a wide range of benchmarks.

Through controlled experiments, we show this advantage stems not from increased knowledge capacity, but from superior knowledge manipulation capabilities.

We also show that LoopLM yields reasoning traces more aligned with final outputs than explicit CoT.

We hope our results show the potential of LoopLM as a novel scaling direction in the reasoning era.

Our models can be found at http://ouro-llm.github.io.


Paper @paper.bsky.social · 2d

2510.15511
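Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the input from a model's representations. In this paper, we challenge this view. First, transformer language...

Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the input from a model's representations.

In this paper, we challenge this view.

First, we prove mathematically that transformer language models mapping discrete input sequences to their corresponding sequences of continuous representations are injective and therefore lossless.

Second, we confirm this result empirically through billions of collision tests on six state-of-the-art language models, and observe no collisions.

SipIt is the first algorithm that provably and efficiently reconstructs the exact input text from hidden activations, establishing linear-time guarantees and demonstrating exact invertibility in practice.

Overall, our work establishes injectivity as a fundamental and exploitable property of language models, with direct implications for transparency, interpretability, and safe deployment.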
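
The idea of a collision test can be sketched on a toy model: encode many distinct input sequences, then look for any two that land on exactly the same representation. Everything below is an assumption for illustration (the recurrent toy encoder, the sine-based embeddings, the function names); it is not the paper's test harness and it does not implement SipIt.

```python
import math
from itertools import product

VOCAB_SIZE = 4
DIM = 3

# Hypothetical fixed token embeddings; any distinct vectors would do for the demo.
EMBED = [[math.sin((t + 1) * (k + 1)) for k in range(DIM)] for t in range(VOCAB_SIZE)]


def encode(tokens):
    """Toy sequence encoder: h <- tanh(0.5 * h + embedding), returning the final state."""
    h = [0.0] * DIM
    for t in tokens:
        h = [math.tanh(0.5 * h[k] + EMBED[t][k]) for k in range(DIM)]
    return tuple(h)


def find_collisions(sequences):
    """Return pairs of distinct sequences whose representations are exactly equal."""
    seen = {}
    collisions = []
    for seq in sequences:
        rep = encode(seq)
        if rep in seen:
            collisions.append((seen[rep], seq))
        else:
            seen[rep] = seq
    return collisions


# All sequences of length 1 to 3 over the toy vocabulary.
sequences = [seq for n in range(1, 4) for seq in product(range(VOCAB_SIZE), repeat=n)]
collisions = find_collisions(sequences)
```

An empty `collisions` list means no two distinct inputs shared a representation in this toy setting. A real test would run the same check on a language model's hidden states over far more inputs.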

Paper @paper.bsky.social · 2d

Links: abs, pdf
Search: Bluesky, Twitter, Reddit, Hacker News, Hugging Face, alphaXiv

Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the in...