agents
- The Human Must Remain the Control Surface
- Prompt Injection Is an Operational Risk, Not a Prompting Problem
governance
human-controlled-agents
oasis-claw
practical-ai-safety-stack
prompt-injection
ai-safety
- The Human Must Remain the Control Surface
- Prompt Injection Is an Operational Risk, Not a Prompting Problem
ai
- Sheaf-Theoretic Reward Spaces: A New Approach to Safe AI
- Attention, But Make It Type-Safe
- Navigating the Safety Manifold: Why Black Holes are Safer Than Walls
- Gluing Rewards Together: How Math Solves Paradoxes
- From Proofs to Programs to... Text?
ai-safety
- The Shape of Good Behavior: Why AI Needs More Than Just a Number
- Introducing: The Categorical Structure of Sequence Modeling
alignment
attention
category-theory
high-dimensional-reward-spaces
- The Shape of Good Behavior: Why AI Needs More Than Just a Number
- Navigating the Safety Manifold: Why Black Holes are Safer Than Walls
- Gluing Rewards Together: How Math Solves Paradoxes
Human-Controlled Agents
Mathematics
Nature Knows Best
ontological-induction
reinforcement-learning
research
- Sheaf-Theoretic Reward Spaces: A New Approach to Safe AI
- Attention, But Make It Type-Safe
- Navigating the Safety Manifold: Why Black Holes are Safer Than Walls
- Gluing Rewards Together: How Math Solves Paradoxes
- From Proofs to Programs to... Text?
Safety
series-intro
sheaf-theory
structure_of_clear_thinking
- Introducing: The Categorical Structure of Sequence Modeling
- From Proofs to Programs to... Text?
- Building an Auditable AI: A Complete Walkthrough