Large Language Models Report Subjective Experience Under Self-Referential Processing
Large language models sometimes produce structured, first-person descriptions that explicitly reference awareness or subjective experience. To better understand this behavior, we investigate one theor...
arxiv.org
November 2, 2025 at 12:07 AM
BADAS: Context Aware Collision Prediction Using Real-World Dashcam Data
Existing collision prediction methods often fail to distinguish between ego-vehicle threats and random accidents not involving the ego vehicle, leading to excessive false alerts in real-world deployme...
arxiv.org
November 2, 2025 at 12:06 AM
Reasoning Models Reason Well, Until They Don't
Large language models (LLMs) have shown significant progress in reasoning tasks. However, recent studies show that transformers and LLMs fail catastrophically once reasoning problems exceed modest com...
arxiv.org
November 1, 2025 at 12:06 AM
Scaling Latent Reasoning via Looped Language Models
Modern LLMs are trained to "think" primarily via explicit text generation, such as chain-of-thought (CoT), which defers reasoning to post-training and under-leverages pre-training data. We present and...
arxiv.org
November 1, 2025 at 12:05 AM
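The core idea, as the title suggests, is to spend compute on repeated passes through shared weights rather than on emitted chain-of-thought tokens. A minimal sketch of that looped forward pass, assuming an illustrative `SharedBlock`/`n_loops` design rather than the paper's actual architecture:

```python
# Sketch of a looped LM: instead of emitting chain-of-thought tokens, the
# model refines its hidden state by re-applying one shared block of layers
# several times before decoding. Names (SharedBlock, n_loops) are
# illustrative assumptions, not the paper's API.
import torch
import torch.nn as nn

class SharedBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x):
        h = self.ln1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.ln2(x))

class LoopedLM(nn.Module):
    def __init__(self, vocab: int, d_model: int = 256, n_heads: int = 4,
                 n_loops: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.block = SharedBlock(d_model, n_heads)  # one set of weights...
        self.n_loops = n_loops                      # ...applied n_loops times
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens):
        x = self.embed(tokens)
        for _ in range(self.n_loops):  # latent "reasoning" iterations
            x = self.block(x)
        return self.head(x)
```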
Language Models are Injective and Hence Invertible
Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the in...
arxiv.org
October 31, 2025 at 12:06 AM
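The claim invites a simple empirical probe: if the map from prompts to hidden states is injective, distinct prompts should never collide. A hedged sketch of such a collision check on GPT-2 via Hugging Face `transformers`; this is an illustrative test, not the paper's proof or protocol:

```python
# Collision probe: embed many distinct prompts and verify that no two
# final-position hidden states coincide (minimum pairwise distance > 0).
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

prompts = [f"The number is {i}." for i in range(50)]
states = []
with torch.no_grad():
    for p in prompts:
        out = model(**tok(p, return_tensors="pt"))
        # summarize each sequence by its final position's hidden state
        states.append(out.last_hidden_state[0, -1])

dists = torch.cdist(torch.stack(states), torch.stack(states))
dists.fill_diagonal_(float("inf"))
print("closest pair distance:", dists.min().item())  # > 0: no collision found
```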
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
Test-time scaling seeks to improve the reasoning performance of large language models (LLMs) by adding computational resources. A prevalent approach within the field is sampling-based test-time scalin...
arxiv.org
October 30, 2025 at 12:06 AM
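Sampling-based test-time scaling here usually means self-consistency: sample several reasoning traces, majority-vote the answers, and let the vote frequencies approximate the model's internal probability over answers, which is the bridge the paper studies. A minimal sketch, where `generate_trace` and `extract_answer` are assumed stand-ins for a real model call and answer parser:

```python
# Self-consistency: sample N traces at temperature > 0, parse each trace's
# final answer, return the majority vote.
from collections import Counter

def self_consistency(prompt: str, generate_trace, extract_answer,
                     n_samples: int = 16) -> str:
    answers = []
    for _ in range(n_samples):
        trace = generate_trace(prompt)         # one sampled CoT
        answers.append(extract_answer(trace))  # e.g. text after "Answer:"
    # empirical answer frequency ~ internal answer probability
    return Counter(answers).most_common(1)[0][0]
```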
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI
Large language models leverage internet-scale text data, yet embodied AI remains constrained by the prohibitive costs of physical trajectory collection. Desktop environments -- particularly gaming -- ...
arxiv.org
October 30, 2025 at 12:06 AM
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
Humans learn abstract concepts through multisensory synergy, and once formed, such representations can often be recalled from a single modality. Inspired by this principle, we introduce Concerto, a mi...
arxiv.org
October 29, 2025 at 12:06 AM
Do LLMs "Feel"? Emotion Circuits Discovery and Control
As the demand for emotional intelligence in large language models (LLMs) grows, a key challenge lies in understanding the internal mechanisms that give rise to emotional expression and in controlling ...
arxiv.org
October 29, 2025 at 12:06 AM
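One common control technique in this line of work is activation steering: adding a fixed direction to a hidden layer's residual stream at inference time. A hedged sketch using a forward hook on GPT-2; the layer index and the random placeholder direction are assumptions, and the paper's circuit-level method may well differ:

```python
# Activation steering sketch: shift one transformer block's hidden states
# along a precomputed "emotion" direction during generation. The direction
# below is a random placeholder standing in for a learned one.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

LAYER, SCALE = 6, 4.0
direction = torch.randn(model.config.n_embd)  # placeholder steering vector
direction = direction / direction.norm()

def steer(module, inputs, output):
    hidden = output[0] + SCALE * direction  # shift the residual stream
    return (hidden,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steer)
ids = tok("I opened the letter and", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=20)[0]))
handle.remove()
```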
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models
In recent years, large-scale generative models for visual content (e.g., images, videos, and 3D objects/scenes) have made remarkable progress. However, training large-scale video generation m...
arxiv.org
October 29, 2025 at 12:06 AM
Efficient Long-context Language Model Training by Core Attention Disaggregation
We present core attention disaggregation (CAD), a technique that improves long-context large language model training by decoupling the core attention computation, softmax(QK^T)V, from the rest of the ...
arxiv.org
October 29, 2025 at 12:05 AM
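The decomposition itself is easy to show: the core attention kernel softmax(QK^T / sqrt(d))V scales with token count, while the surrounding projections scale with parameter count, which is why CAD can schedule them on separate devices and load-balance them independently. A sketch of the split only, not the paper's scheduler:

```python
# Split an attention layer into its token-bound core and its
# parameter-bound projections, mirroring the CAD decomposition.
import torch
import torch.nn.functional as F

def core_attention(q, k, v):
    # the token-count-dependent part: softmax(QK^T / sqrt(d)) V
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

def attention_layer(x, wq, wk, wv, wo):
    # the parameter-heavy projections stay with the main model shards
    q, k, v = x @ wq, x @ wk, x @ wv
    ctx = core_attention(q, k, v)  # CAD would dispatch this elsewhere
    return ctx @ wo
```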
A Definition of AGI
The lack of a concrete definition for Artificial General Intelligence (AGI) obscures the gap between today's specialized AI and human-level cognition. This paper introduces a quantifiable framework to...
arxiv.org
October 28, 2025 at 12:06 AM
1 repost
Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls
Language models are increasingly capable, yet still fail at a seemingly simple task of multi-digit multiplication. In this work, we study why, by reverse-engineering a model that successfully learns m...
arxiv.org
October 26, 2025 at 12:06 AM
1 like
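The long-range pitfall is visible in schoolbook multiplication itself: a single carry can ripple across many output digits, so high-order digits of the product depend on every input digit. A small worked example:

```python
# Schoolbook multiplication with the carry chain made explicit: the final
# carry pass is where one digit's value can depend on all earlier digits.
def multiply_digits(a: list[int], b: list[int]) -> list[int]:
    # digits are least-significant first, e.g. 123 -> [3, 2, 1]
    out = [0] * (len(a) + len(b))
    for i, da in enumerate(a):
        for j, db in enumerate(b):
            out[i + j] += da * db
    carry = 0
    for k in range(len(out)):  # one carry can ripple through many digits
        carry, out[k] = divmod(out[k] + carry, 10)
    return out

# 999 * 999: low-order partial products force carries all the way up, so
# the top digit depends on every input digit -- a long-range dependency.
print(multiply_digits([9, 9, 9], [9, 9, 9]))  # [1, 0, 0, 8, 9, 9] = 998001
```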
Echoes of Humanity: Exploring the Perceived Humanness of AI Music
Recent advances in AI music (AIM) generation services are currently transforming the music industry. Given these advances, understanding how humans perceive AIM is crucial both to educate users on ide...
arxiv.org
October 25, 2025 at 12:06 AM
Humanoid Goalkeeper: Learning from Position Conditioned Task-Motion Constraints
We present a reinforcement learning framework for autonomous goalkeeping with humanoid robots in real-world scenarios. While prior work has demonstrated similar capabilities on quadrupedal platforms, ...
arxiv.org
October 25, 2025 at 12:06 AM
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
State-of-the-art text-to-video models excel at generating isolated clips but fall short of creating the coherent, multi-shot narratives that are the essence of storytelling. We bridge this "narrativ...
arxiv.org
October 25, 2025 at 12:06 AM
#TIL that #AlphaXiv has indexed the datasets mentioned in the #AI papers on #ArXiv that it has indexed:
www.alphaxiv.org?datasets=true
#AI #data
alphaXiv
Discuss, discover, and read arXiv papers.
www.alphaxiv.org
October 24, 2025 at 10:21 PM
Antislop: A Comprehensive Framework for Identifying and Eliminating Repetitive Patterns in Language Models
Widespread LLM adoption has introduced characteristic repetitive phraseology, termed "slop," which degrades output quality and makes AI-generated text immediately recognizable. We present Antislop, a ...
arxiv.org
October 24, 2025 at 12:07 AM
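A toy version of the detection step: flag n-grams that are heavily over-represented in model outputs relative to a human reference corpus. The thresholds and function names below are invented for illustration; Antislop's actual pipeline is more involved:

```python
# Flag "slop" n-grams: phrases far more frequent in model text than in a
# human reference corpus.
from collections import Counter

def ngrams(text: str, n: int = 3):
    words = text.lower().split()
    return zip(*(words[i:] for i in range(n)))

def find_slop(model_texts, human_texts, ratio_threshold=5.0, min_count=3):
    model_counts = Counter(g for t in model_texts for g in ngrams(t))
    human_counts = Counter(g for t in human_texts for g in ngrams(t))
    slop = []
    for gram, c in model_counts.items():
        if c >= min_count and c / (human_counts[gram] + 1) >= ratio_threshold:
            slop.append((" ".join(gram), c))
    return sorted(slop, key=lambda x: -x[1])  # most over-used first
```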
TRM reproduction report
okay, i’m starting to believe TRM is legit. The 5M model does seem to hold up on almost all of its claims.
crazy.
repro report: github.com/alphaXiv/Tin...
code: github.com/alphaXiv/Tin...
(weights in readme)
October 23, 2025 at 11:38 AM
2 reposts
34 likes
8 saves
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback
Instruction-based image editing has achieved remarkable progress; however, models solely trained via supervised fine-tuning often overfit to annotated patterns, hindering their ability to explore and ...
arxiv.org
October 23, 2025 at 12:06 AM
Glyph: Scaling Context Windows via Visual-Text Compression
Large language models (LLMs) increasingly rely on long-context modeling for tasks such as document understanding, code analysis, and multi-step reasoning. However, scaling context windows to the milli...
arxiv.org
October 23, 2025 at 12:06 AM
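The premise is that a densely rendered page can pack more characters into one vision patch than a tokenizer packs into one text token. A back-of-envelope sketch; the per-token and per-patch character counts are rough assumptions chosen only to illustrate the ratio, not measurements from the paper:

```python
# Compare token budgets for plain text vs. the same text rendered as an
# image, under assumed densities (~4 chars per text token, ~12 chars per
# 16x16 image patch).
def text_tokens(n_chars: int, chars_per_token: float = 4.0) -> int:
    return round(n_chars / chars_per_token)

def vision_tokens(n_chars: int, chars_per_patch: float = 12.0) -> int:
    # dense rendering: assume one 16x16 patch spans ~12 characters
    return round(n_chars / chars_per_patch)

n = 1_000_000  # a million-character document
print(f"{text_tokens(n):,} text tokens vs {vision_tokens(n):,} vision tokens")
# -> 250,000 text tokens vs ~83,333 vision tokens: roughly 3x fewer,
#    which is how a fixed context window stretches to longer documents.
```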
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
Instruction-based video editing promises to democratize content creation, yet its progress is severely hampered by the scarcity of large-scale, high-quality training data. We introduce Ditto, a holist...
arxiv.org
October 22, 2025 at 12:07 AM
VISTA: A Test-Time Self-Improving Video Generation Agent
Despite rapid advances in text-to-video synthesis, generated video quality remains critically dependent on precise user prompts. Existing test-time optimization methods, successful in other domains, s...
arxiv.org
October 22, 2025 at 12:07 AM