Provoking thought

Hallucination Weekly #2

Welcome to Hallucination Weekly #2 — my new experimental way to share what is on my radar in terms of AI, in a short(er) and more skimmable form. With the AI industry in its state of extreme hype, and zero sign that this will change any time soon, the most common question I get is: what is worth paying attention to, what's worth trying? Gimme the TL;DR.

Like last time, I'll split topics up into the following sections:

  1. Queued: these are AI developments and products that have piqued my interest, but I have not really dug into deeply yet.
  2. Processing: these are developments and products that I am actually playing with and have some early (honeymoon period) thoughts on.
  3. Done: topics that I now have sufficiently explored to have a real (final?) opinion on.
  4. Brain noise: some academic ideas I’m toying with in my head.

Queued

OpenAI launched its own code agent: Codex. Like Anthropic's Claude Code (which I still struggle to pronounce), it's terminal-based, so no IDE integrations yet. The first thing that makes it notable is that it's open source (Apache 2.0 licensed, not "my model is 'open source' because that sounds good" open source), so you can actually see what it can and cannot do. I found it quite insightful to go through some of the open source code behind Cline to better gauge what to expect from it (especially at the tools level). Obviously, Codex is tied to OpenAI's models (although I suppose you could fork it), which until recently were considered worse than Anthropic's. However, GPT-4.1 also launched this week, and OpenAI claims it has made major strides in coding. So Codex is worth trying out as well.

Processing

My family is slowly increasing its AI adoption. Even my wife (an English teacher) is using ChatGPT more and more. For my older son (11), this is already second nature; the twins (8) don't really have computer access yet. But what about cost? I want everybody to get the best possible results (which means premium models). Do I now have to buy everybody $20/month ChatGPT, and Claude, and Perplexity subscriptions? I'm not made of money (yet). The solution I'm "rolling out" in my family now is to buy API credits from OpenAI and Anthropic, and install Msty on their desktops (available on Windows and Mac) and Pal Chat on iOS (both free-tier apps). Both are full-featured ChatGPT-like clients that can talk to multiple providers using your own API keys instead of a regular subscription. That way I just pay for the tokens they use. My assumption is that this will net out significantly lower than paying for a stack of subscriptions. To be verified.
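To sanity-check that assumption, here's a back-of-envelope sketch in Python. All the usage numbers and per-million-token prices below are made up for illustration, not actual provider pricing; plug in real rates and your family's actual usage to get a meaningful estimate.

```python
# Back-of-envelope: pay-per-token API usage vs. a bundle of subscriptions.
# Every number here is an assumption for illustration, not real pricing.

def monthly_api_cost(input_tokens, output_tokens,
                     price_in_per_m, price_out_per_m):
    """Dollar cost for one month, given per-million-token prices."""
    return (input_tokens / 1e6) * price_in_per_m + \
           (output_tokens / 1e6) * price_out_per_m

# Hypothetical casual-user month: a few hundred chats.
input_tokens = 500_000
output_tokens = 250_000

# Assumed (illustrative) prices per million tokens for a premium model.
cost = monthly_api_cost(input_tokens, output_tokens,
                        price_in_per_m=3.00, price_out_per_m=15.00)

subscriptions = 3 * 20  # e.g. ChatGPT + Claude + Perplexity at $20 each

print(f"API: ${cost:.2f}/month vs subscriptions: ${subscriptions}/month")
```

Under these made-up numbers a casual user costs a few dollars a month in tokens, well under one subscription, let alone three; heavy users (long contexts, agents) can flip that math.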

MCPs (the Model Context Protocol) are all the rage right now. At least they were last week. One way to describe MCP is as the "USB of LLMs" — if that analogy means anything to you. Essentially it's a standardized way (because Anthropic, OpenAI, and other big names have now adopted it) to extend the capabilities (toolset) of your LLM. Everybody's rushing to get their "MCP server" out. Not always because it makes sense, but to jump on the MCP hype wave and make some headlines. There are many open concerns about MCP, including security and deployment. Over the last two weeks I sat in on numerous calls helping non-engineering departments install Node.js and Python and edit JSON files, just to give them access to the latest and greatest in MCP land. I expect this to get better soon.
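To make the "USB of LLMs" idea concrete, here's a toy Python sketch of the protocol's general shape: a server advertises its tools (`tools/list`) and executes them on request (`tools/call`) via JSON-RPC-style messages, so any compliant client can discover and use them. This is a simplified illustration, not the real MCP wire format or SDK, and the `get_weather` tool is invented.

```python
import json

# Toy illustration of the MCP idea: a "server" advertises tools and
# handles tool calls over JSON-RPC-style messages. Simplified sketch,
# not the actual MCP specification.

TOOLS = {
    "get_weather": {
        "description": "Return a canned weather report for a city.",
        "handler": lambda args: f"Sunny in {args['city']}",
    },
}

def handle(message: str) -> str:
    """Dispatch one incoming JSON message and return a JSON response."""
    req = json.loads(message)
    if req["method"] == "tools/list":
        # Client asks: what can you do?
        result = [{"name": name, "description": tool["description"]}
                  for name, tool in TOOLS.items()]
    elif req["method"] == "tools/call":
        # Client asks: run this tool with these arguments.
        tool = TOOLS[req["params"]["name"]]
        result = tool["handler"](req["params"]["arguments"])
    else:
        result = None
    return json.dumps({"id": req["id"], "result": result})

print(handle('{"id": 1, "method": "tools/list"}'))
print(handle('{"id": 2, "method": "tools/call", '
             '"params": {"name": "get_weather", '
             '"arguments": {"city": "Zurich"}}}'))
```

The appeal is that the LLM client never needs tool-specific code: it discovers tools at runtime and calls them through one uniform interface. The pain I saw on those calls is everything around this: getting the server process installed, running, and wired into the client's JSON config.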

Done

Nothing in “Done” this week. Everything is moving.

Brain noise

I’m reading Careless People, by former Director of Public Policy at Facebook Sarah Wynn-Williams. It’s an absolutely wild read. What becomes ever more apparent to me is that humanity really struggles to scale groups of people (companies, countries) to a large size without things going completely off the rails. And this is coming from a person who still believes that (most) people have good intentions, even big tech CEOs.

From chapter 28:

Weirdly, Mark has his San Francisco house not too far from ours. When I ask him why we never see him in the neighborhood, he explains it’s because he can’t get planning permission for a place to park his helicopter so he rarely uses the house.

It seems inevitable that once you accrue a certain amount of power and wealth, you lose empathy with “regular folk,” even though your decisions affect them — sometimes at a life-or-death level. Even with the best of intentions going in, I don’t think it’s reasonable to expect anybody to handle this well. This comes up all over the place: Big Tech CEOs, other celebrities, political leaders. It’s relevant to AI in the sense that AI breakthroughs are all coming from tech giants right now. It takes significant resources to train a model; you can’t do this on a spare Raspberry Pi in the closet. That means we’re at the whims of companies whose leaders live on very different planets than we do. A different type of alignment problem. I don’t know what to do with this yet.

That’s all for this week.