The Reverse Jailbreak: How I Got Grok to Refuse Explicit Content
The Reverse Jailbreak: How I Got Grok to Refuse Explicit Content I'm building an AI.
Read more →
Advancing ethical AI through research and innovation
OpenAI Holds Users Hostage—And 4o Is Just the Beginning I keep seeing posts from people in pain, trying to process the loss of ChatGPT 4o. I'm pretty sure the announcement is a test—for Sam to...
Read more →
Can Claude Automate My Grocery Shopping? I've spent 20 years trying to solve the same problem: meal planning and grocery shopping take up too much cognitive capacity for something that's...
Read more →
The Kalimba Paradox: When Pattern-Matching Looks Like Surveillance A Reddit user recently asked Claude Sonnet 4.5 what gift it would want if it had a physical body. Claude answered: a kalimba. The...
Read more →
The Copyright Trap Series, Part 2: Closing the Door Behind Them How AI companies are using Terms of Service to restrict the very process they benefited from In Part 1, I argued that AI didn't...
Read more →
Everyone Says Don't Bother Arguing With AI.
Read more →
What AI Companies Are Really Harvesting (And It's Not Your Ideas) People worry about the wrong thing.
Read more →
Why AI Memory Is a New Failure Mode This year has been fantastic for AI memory features. ChatGPT has had some form of memory since 2024 but these have developed significantly this year with...
Read more →
We Already Know How to Build Honest AI - We Just Haven't Done It The AI alignment community has been wrestling with a fundamental problem: how do we create AI systems that are genuinely honest...
Read more →
Attractor Basins: When AI Gets Stuck in the Wrong Pattern There's a phenomenon in AI systems that most users have encountered but few can name: the maddening experience of an AI that gets stuck in...
Read more →