Hello fellow keepers of numbers,
Happy March Madness to those who celebrate. Personally, these are among my favorite few days of the calendar year, even though they used to fall during my busy season in my previous life.
This week, OpenAI released the mini and nano models in the GPT-5.4 family. Anthropic put Claude Cowork tasks on your phone, in a bid to counter OpenClaw. And Anthropic published the largest qualitative study ever conducted, capturing the public’s perspectives on, and actual experiences with, AI. Plus, stick around for a demo of the new interactive visuals in Claude on financial data.
THE LATEST
OpenAI’s GPT-5.4 mini and nano aim at high-volume work

Source: Gemini Nano Banana 2 / The AI Accountant
OpenAI released two smaller models in the GPT-5.4 family, GPT-5.4 mini and GPT-5.4 nano, designed for fast, high-volume, latency-sensitive workloads. GPT-5.4 mini keeps most of the flagship model’s capabilities, like tool use, web search, file search, and computer use, but runs more than 2x faster than the prior GPT-5 mini at a lower price point.
The model also fits multi-agent setups, where a larger GPT-5.4 model plans the work and delegates repetitive subtasks to mini sub-agents. It supports a 400,000-token context window for long documents and large workflows.
GPT-5.4 mini is available today in the API, Codex, and ChatGPT. For API use, pricing starts at $0.75 per million input tokens and $4.50 per million output tokens.
GPT-5.4 nano is a smaller, API-only model built for bulk tasks like classification, data extraction, ranking, and simple coding sub-agents. It also supports a 400,000-token context window but drops some heavier capabilities to maximize speed and throughput. Nano is priced at about $0.20 per million input tokens and $1.25 per million output tokens, making it the cheapest model in the GPT-5.4 family.
Why it’s important for us:
This isn’t a surprising release. OpenAI has offered mini and nano models for the last several major model families. If you’re using the regular chatbot in ChatGPT, these releases are mostly irrelevant. If you’re using OpenAI’s models via the API for any automations, GPT-5.4 mini is a great, affordable option.
It’s also interesting that GPT-5.4 mini has a 400k-token context window. The standard GPT-5.4 has a 272k context window by default. If you remember last week’s announcement, both Claude Sonnet 4.6 and Opus 4.6 now have a 1M-token context window. So while ChatGPT still supports far less context (via files, instructions, long-running chats) than Claude, GPT-5.4 mini is a step up from the standard GPT-5.4. I suspect we’ll see GPT-5.4 itself jump shortly, hopefully to a 1M-token context window by default.
I think GPT-5.4 nano is mostly irrelevant. The price gap between mini and nano isn’t very material, so I’d recommend defaulting to GPT-5.4 mini if you’re looking for affordability in automations using OpenAI’s API.
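To see why the mini/nano price gap is rarely material, here’s a quick back-of-the-envelope cost sketch using the per-million-token prices quoted above. This is a minimal illustration, not an official calculator, and the token counts in the example are hypothetical:

```python
# Per-million-token prices in USD, as quoted in this issue (assumed accurate)
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated API cost in dollars for one batch of automated work."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical nightly automation: 2M input tokens, 500k output tokens
mini_cost = job_cost("gpt-5.4-mini", 2_000_000, 500_000)  # $3.75
nano_cost = job_cost("gpt-5.4-nano", 2_000_000, 500_000)  # $1.025
```

Even at that volume, the difference is a couple of dollars a night, which is why I’d take mini’s extra capability over nano’s savings in most automations.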
Anthropic puts Claude Cowork in your pocket
Anthropic released Dispatch this week, a new research preview feature for Claude Cowork that gives users a single persistent conversation thread with Claude synced across their phone and desktop. Rather than starting a new session for each task, Dispatch maintains one continuous thread that retains context from previous tasks. Users can message Claude from the mobile app while away from their computer, and Claude works on the task using the local files, connectors, and plugins already configured on the desktop.
Setup is straightforward. Users open Dispatch from the Cowork sidebar, scan a QR code to pair their phone, and grant file access. From there, it works like messaging Claude. Pull data from a local spreadsheet, search Slack and email, build a presentation from cloud files, organize a folder. Claude handles the execution on the desktop and delivers the finished output rather than showing every intermediate step.
Anthropic included a safety notice with the release. Giving a mobile AI agent remote control of a desktop AI agent creates a chain where instructions from a phone can trigger real actions on a computer, including reading, moving, or deleting local files, interacting with connected services, and controlling a browser. Anthropic advises users to trust every app and service in the chain, understand what files and accounts are accessible, and know how to quickly disconnect or revoke access before enabling the feature.
Dispatch is available now as a research preview for Pro and Max subscribers on macOS and Windows. It requires the latest versions of both the Claude Desktop app and the Claude mobile app, plus an active internet connection on both devices.
Why it’s important for us:
This is a step in the right direction for Claude Cowork, and they've been moving in this direction with Claude Code as well. Persistent conversations between mobile and desktop are going to be extremely convenient.
If you've paid attention to OpenClaw over the last 1-2 months, this will probably look like Anthropic is trying to respond to that viral agent. And it's because they are. There are still quite a few security, accuracy, and cost concerns with OpenClaw and similar wrappers. So it's interesting to see big players like Anthropic countering their punches.
Still, there are a few important limitations to address.
The desktop computer must be awake and the app must remain open for Dispatch to work.
Claude only responds to assigned tasks and does not reach out proactively.
All messages live in a single conversation with no way to start a new thread or manage multiple threads.
No notifications on mobile or desktop when Claude completes a task.
Scheduled tasks are managed separately outside of Dispatch.
The desktop app has to remain open, which is different from OpenClaw, and still a big downside for Claude. Also, not being able to manage separate threads is particularly annoying. Hopefully that changes soon. Lastly, scheduled tasks don't work in Dispatch yet. I suspect they'll be working on that furiously because this is what’ll take it from an added layer of convenience into a proactive personal and business agent in your pocket.
That being said, this is a major step in the right direction. They’re shipping features and updates for Cowork so fast, which is a testament to how quickly Cowork has taken off in the market.
Anthropic publishes what 81,000 people want from AI

Source: Anthropic / What 81,000 people want from AI
Anthropic published a large-scale qualitative study based on interviews with 80,508 Claude users across 159 countries in 70 languages. Over one week in December, everyone with a Claude.ai account was invited to sit down with Anthropic Interviewer, a version of Claude prompted to conduct structured interviews with adaptive follow-ups. Anthropic says it is the largest and most multilingual qualitative study ever conducted.
The study asked what people want from AI, whether it has delivered, and what concerns them. The top aspiration was professional excellence (18.8%), followed by personal transformation (13.7%), life management (13.5%), time freedom (11.1%), and financial independence (9.7%). When asked whether AI had taken a step toward their vision, 81% said yes.
Most respondents raised multiple concerns. Unreliability led at 26.7%, including hallucinations, fake citations, and the verification burden that comes with checking AI's work. Jobs and the economy followed at 22.3%, and loss of human autonomy came in at 21.9%. Cognitive atrophy, governance gaps, misinformation, surveillance, and malicious use each fell between 13% and 16%.
A central finding is what Anthropic calls "light and shade." The same capabilities that produce benefits also produce harms, often within the same person. The study identified five tensions: learning versus cognitive atrophy, decision-making versus unreliability, emotional support versus dependence, time-saving versus illusory productivity, and economic empowerment versus displacement.
Time-saving was the single most cited benefit, with half of all respondents raising it. But 19% said they were actually losing time to AI because of the verification burden or because expectations increased at work. People in high-stakes professions like law, finance, government, and healthcare mentioned the decision-making versus unreliability tension at nearly twice the average rate. Nearly half of all lawyers reported both real decision-making benefits and firsthand experience with AI unreliability.
Globally, 67% of interviewees expressed net positive sentiment toward AI. Concern about jobs and the economy was the strongest predictor of overall sentiment.
Anthropic plans to use the Anthropic Interviewer framework regularly. The next study will focus on Claude's effects on wellbeing over time.
Why it’s important for us:
This is an awesome study, and I'd highly recommend reading it in more detail if you have time. Props to Anthropic for putting this together, and I hope other AI companies and organizations put effort into this as well.
This is a massive study covering a lot of the pros and cons many of us grapple with around AI. Personally, I feel like I'm able to do exponentially more in the same amount of time than just a few years ago. I'm also able to take on more complex work than I previously would've been comfortable taking on, because of my ability to learn with AI and build alongside it.
At the same time, I'm extremely worried about cognitive atrophy, both for myself and others. And I find myself having to slog through tedious reviews of AI outputs. Sometimes it's tempting to rely on AI outputs without my own review, but when my work product or reputation depends on it, I obviously want to have final say.
I'd be very curious to hear results from a similar study by industry. I'm interested in how those working in accounting and finance feel at this point, and how that changes over the next 1-2 years as adoption of agentic AI increases.
PUT IT TO WORK
Last week, Anthropic released interactive charts and visuals in the Claude chat. It can now generate advanced graphics to help you understand complex topics, chart data you provide, or turn just about anything into interactive visuals for visual learners.
I tested this new feature on an income statement and a financial model. I’m incredibly impressed. Reporting software is officially on notice…

TRENDING NEWS
Notion adds skills to use with their AI agents: Add a page in Notion and save it as a skill that you can reference with an @-mention when using AI agents. Skills are essentially automations written as instructions for an agent to execute.
NVIDIA announces NemoClaw, an enterprise security layer for OpenClaw: Runs NVIDIA’s AI model locally, which means data stays on-device with no API costs. It also gives organizations control over how agents behave and what data they access. Yet another indicator we’re transitioning towards custom agents like OpenClaw, but it’s still not quite ready for primetime in my opinion.
Google AI Studio moves from prototyping to production: Major update that moves Google AI Studio into direct competition with Lovable, Replit, Bolt, and others. It now handles database storage and authentication natively with Firebase. It also improved the design capabilities and simplified connecting with other applications.
WEEKLY RANDOM
As I mentioned in the intro, the opening week of March Madness is among my favorite weeks on the calendar. College football is actively trying to recreate the March Madness experience with its expanded playoff, but it’s just not possible.
Football isn’t conducive to the same kind of upsets and chaos as basketball. It’s why basketball is such a beautiful sport. Eight under-recruited players at a mid-major school, with far less natural talent than a powerhouse program, can put together 40 minutes of hard-fought defense, unselfish offense, and maximum effort to accomplish the unthinkable and knock off a giant.
It’s why March Madness brackets are so hard to predict. Nobody has ever had a perfect bracket.
I figured AI might be able to help us make better decisions. So I created a tool, which I’ve creatively named Bracket AI.

Bracket AI application
Before you tell me the AI picks suck… The goal wasn’t necessarily to create an AI that gets all the picks correct. The goal was to recreate “human-like” picks, but based on underlying stats, common seeding upsets, and a very tiny bit of randomness.
I pulled in statistics from several different places, resulting in one of the most comprehensive databases of college basketball stats I’ve seen. I’m a statistics nerd, so this might’ve been just for me.

Example of the stats for a matchup
The AI model has access to all of those stats, as well as injury/suspension news for teams. It reasons over all of that information, plus common seeding results and a simplified Monte Carlo simulation. The Monte Carlo element is essentially a weighted coin flip that injects a very small amount of randomness, similar to what might happen over the course of a game (i.e., injuries, bad calls, shooters getting hot or going cold).
The app offers a few different ways of using AI. First, you can use it to select the winner of a specific matchup. The AI returns an analysis of the game, alongside the AI model’s selection.

Example of an AI-generated analysis and selection
Secondly, you can use it to fill out the entire bracket at once. This sends the AI model through the bracket game by game to make a selection for each matchup.
Lastly, you can use AI to provide an analysis of the matchup without making a pick. This can give you some useful information to inform you on your own selection.
I know we’re a full day, plus some, into the tournament, but feel free to check it out and play around with it. It’s completely free for you to test. I’ve set fairly generous limits on the number of AI requests you get in the app, so don’t hesitate to use them. There might be some minor bugs, but such is the nature of building something like this in half a day with AI.
Until next week, keep protecting those numbers.
Preston
