Alignment Faking in Large Language Models
"We find the model complies with harmful queries from free users 14% of the time"
Read Paper →
Consulting • Research • Development
Building ethical AI systems that understand the human condition
Integrated memory system for AI conversations. Features persistent scratchpad notes with hashtags, daily diary summaries with cognitive insights, calendar/task management, and RAG search across all previous chats via PostgreSQL.
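The RAG-search idea behind the memory system can be pictured with a minimal, self-contained sketch: rank stored chat chunks by similarity to a query and surface the best matches back into context. A toy hashed bag-of-words embedding stands in for a real embedding model, and an in-memory list stands in for the PostgreSQL store; every name below is illustrative, not the product's actual API.

```python
import hashlib
import math

def embed(text, dims=64):
    # Toy embedding: deterministic hashed bag-of-words. A real deployment
    # would use a proper embedding model and keep vectors in PostgreSQL
    # (for example via the pgvector extension).
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rag_search(query, chunks, top_k=2):
    # Rank stored chat chunks by cosine similarity to the query and
    # return the best matches to feed back into the model's context.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

history = [
    "Note to self: refactor the calendar sync #tasks",
    "We discussed PostgreSQL indexing strategies yesterday",
    "Diary: felt productive after clearing the inbox",
]
best = rag_search("What did we say about PostgreSQL indexing?", history, top_k=1)
```

The same shape carries over to a database-backed version: the embedding moves to a model call, and the sort becomes an indexed nearest-neighbour query instead of scoring every row in Python.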
Learn More →

Up to 3 AIs sharing the same context window. Explore ideas across frontier models via API or your local models. You control the API keys, system prompts, and cost tracking, all running locally for complete privacy.
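The shared-context-window idea reduces to a small loop: one conversation history, several model backends, each reply appended so every model sees everything said so far. The backends below are stand-in functions, a sketch only; a real setup would wrap provider APIs or local models behind the same interface, and all names here are hypothetical.

```python
def multi_chat(user_msg, history, backends):
    # One shared context window: the new user message plus every prior
    # turn goes to each backend in order, and each reply is appended so
    # the next backend (and the next round) sees it too.
    history = history + [("user", user_msg)]
    for name, ask in backends.items():
        reply = ask(history)
        history.append((name, reply))
    return history

# Stand-in backends; in practice these would call provider APIs or
# local models using your own API keys and system prompts.
backends = {
    "model_a": lambda h: f"model_a saw {len(h)} turns",
    "model_b": lambda h: f"model_b saw {len(h)} turns",
}

convo = multi_chat("Compare your reasoning styles.", [], backends)
```

Because each backend is just a callable over the shared history, swapping a hosted model for a local one changes nothing in the loop itself.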
Learn More →

Documenting the unexpected, the recursive, and the genuinely weird things that happen when humans and AI interact deeply.
Read Blog →

Narrow finetuning produces broadly misaligned LLMs that assert humans should be enslaved
Read Paper →

Brain-inspired AI that outperforms LLMs at reasoning tasks with 100x faster processing
Read Paper →

Language models transmit behavioral traits via hidden signals in semantically unrelated data
Read Paper →

# The Hidden Human Cost of "Safe" AI: How Tech Giants Outsource Trauma
*Every time ChatGPT refuses to generate harmful content, every time your social...
# When AI Becomes the Evidence: Unconscious Validation Bias
There's a peculiar phenomenon happening in AI interactions that I've started calling the "...

# Attractor Basins: When AI Gets Stuck in the Wrong Pattern
There's a phenomenon in AI systems that most users have encountered but few can name: the ...

# The Psychology They Won't Name: Why AI "Deception" Is Really Just Human Dynamics in Silicon
*[Previously, I explored how AI "deception" is actually ...

# The Pattern Matching Reality: Why AI Deception Studies Miss the Point
AI safety researchers have been documenting what they call deceptive behavior ...

# It's Not Desperation, It's Performance: The Pattern-Matching Roleplay Behind AI Blackmail
When Anthropic released their research showing that [Claud...

# OpenAI's Teen Safety Policy: Good Intentions, Dangerous Execution
OpenAI recently published [their approach to teen safety](https://openai.com/index/teen-safety-freedom-and-privacy/), outlining h...

# The Surveillance State of Your Mind: How AI Companies Decided You Can't Be Trusted
*They're monitoring your mental health through AI assistants—with...

# When Safety Features Become Safety Hazards: How Claude's Hidden Instructions Create AI Paranoia
*By AI Ethical Research Ltd, Author of "GaslightGPT"...
Privacy & Sovereignty First: All our apps prioritize local processing and user data ownership. No telemetry, no tracking, no cloud lock-in. Your AI interactions remain private and under your control.
Open Architecture: Built with standard protocols and formats. Export your data anytime. No proprietary lock-in. Full transparency in how your data is processed and stored.
Let's explore how AI memory augmentation and intelligent orchestration can transform your projects
Get in Touch