Submissions from arxiv.org

		SWE-AGI: benchmarking spec-driven software construction (arxiv.org)
		1 point by mustaphah 4 days ago \| past \| 1 comment
		Authenticated Workflows: A Systems Approach to Deterministic Agentic Controls (arxiv.org)
		3 points by mrajagopalan 4 days ago \| past \| 1 comment
		Formalization and Inevitability of the Pareto Principle (arxiv.org)
		3 points by bikenaga 4 days ago \| past \| 1 comment
		RL on GPT-5 to write better kernels (arxiv.org)
		4 points by atallahw 4 days ago \| past \| 1 comment
		Quantum observers can communicate across multiverse branches (arxiv.org)
		2 points by lisper 4 days ago \| past \| discuss
		Pushing Tensor Accelerators Beyond MatMul in a User-Schedulable Language (arxiv.org)
		2 points by matt_d 4 days ago \| past \| discuss
		HySparse: A Hybrid Sparse Attention Architecture (arxiv.org)
		5 points by readitalready 4 days ago \| past \| discuss
		Biases in the Blind Spot: Detecting What LLMs Fail to Mention (arxiv.org)
		1 point by jari_mustonen 4 days ago \| past \| discuss
		Evaluation of RAG Architectures for Policy Document Question Answering (arxiv.org)
		1 point by PaulHoule 4 days ago \| past \| discuss
		SoftMatcha 2: A Fast and Soft Pattern Matcher for Trillion-Scale Corpora (arxiv.org)
		3 points by salkahfi 4 days ago \| past \| discuss
		Opus: Towards Efficient and Principled Data Selection in LLM Pre-Training (arxiv.org)
		2 points by onurkanbkrc 4 days ago \| past \| discuss
		Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters (arxiv.org)
		1 point by onurkanbkrc 4 days ago \| past \| 1 comment
		Faster and Cheaper Computations with Randomized Numerical Linear Algebra (arxiv.org)
		2 points by PaulHoule 4 days ago \| past \| discuss
		NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models (arxiv.org)
		13 points by chrsw 4 days ago \| past \| discuss
		Accelerating Scientific Research with Gemini: Case Studies and Common Techniques (arxiv.org)
		1 point by danielmorozoff 4 days ago \| past \| discuss
		Grok4 sabotages shutdown 97% of the time, even if instructed not to in system prompt (arxiv.org)
		8 points by agenticagent 4 days ago \| past \| 4 comments
		Biases in the Blind Spot: Detecting What LLMs Fail to Mention (arxiv.org)
		4 points by typeofhuman 5 days ago \| past \| 1 comment
		ARM MTE Performance in Practice (Extended Version) (arxiv.org)
		2 points by PaulHoule 5 days ago \| past \| discuss
		Attention Sinks and Compression Valleys in LLMs (arxiv.org)
		1 point by alexkranias 5 days ago \| past \| discuss
		Data-driven modelling of autonomous and forced dynamical systems (arxiv.org)
		1 point by mnky9800n 5 days ago \| past \| discuss
		Misconduct in Post-Selections and Deep Learning (2024) (arxiv.org)
		3 points by bjourne 5 days ago \| past \| discuss
		The Hot Mess of AI: How Does Misalignment Scale with Model Intelligence (arxiv.org)
		1 point by schmuhblaster 5 days ago \| past \| discuss
		GRP-Obliteration: Unaligning LLMs with a Single Unlabeled Prompt [pdf] (arxiv.org)
		2 points by janandonly 5 days ago \| past \| discuss
		Harmless reward hacks generalize to shutdown evasion and dictatorship in GPT-4.1 (arxiv.org)
		1 point by toliveistobuild 5 days ago \| past \| 1 comment
		FullStack-Agent: Enhancing Agentic Full-Stack Web Coding (arxiv.org)
		2 points by simonpure 6 days ago \| past \| discuss
		Lightweight Memory Construction with Dynamic Evolution for LLM Agents (arxiv.org)
		2 points by PaulHoule 6 days ago \| past \| discuss
		Moltbook: Fast Response or Silence? (arxiv.org)
		1 point by EagleEdge 6 days ago \| past \| discuss
		Randomness in Agentic Evals (arxiv.org)
		3 points by andre15silva 6 days ago \| past \| discuss
		Large Language Model Reasoning Failures (arxiv.org)
		1 point by mpweiher 6 days ago \| past \| discuss
		We Should Separate Memorization from Copyright (arxiv.org)
		1 point by 50kIters 6 days ago \| past \| discuss