Hello fellow keepers of numbers,

We got a lot of updates from the Google I/O event this week, some of which are covered here. And OpenAI is taking a page out of the Anthropic playbook. They’re shipping 15+ new features every week.

Gemini 3.5 Flash launched as the new default model for Google. It’s blazing fast. But maybe not that good? TBD. Caseware launched a new AI platform for audits. And KPMG partnered with Anthropic to expand its use of Claude Cowork and Claude Code.

Plus, stick around for an explanation of the Claude for Small Business plugin and a few of the most interesting accounting-related skills.

THE LATEST

Gemini 3.5 Flash becomes Google's default model

Source: Gemini 3.5: frontier intelligence with action

Google released Gemini 3.5 Flash at Google I/O and made it the default model in the Gemini app and AI Mode in Search worldwide. The company says it outperforms Gemini 3.1 Pro on coding, multi-step tasks, and multimodal work while running roughly four times faster than other frontier models at less than half the cost on complex tasks.

On capability, Google highlights stronger OCR on complex invoices, the ability to read and reason over 100-plus page documents in a single pass, and richer generation of interactive web UIs and graphics.

Flash is available now in the Gemini app, AI Mode in Search, Google Antigravity, the Gemini API, and across Google Enterprise. Google says Gemini 3.5 Pro is rolling out next month.

Why it’s important for us:

Gemini 3.5 Flash is fast. Like, really fast. Do you ever get a response so fast that you think there’s no way it actually read your files or did the research? That might be a me-problem. It might take some recalibration in my brain as these frontier AI models continue to get faster.

This launch was probably Google’s biggest announcement at I/O this year. The speed is what stands out. But how good is 3.5 Flash, actually?

Benchmarks put it right up there with Claude Opus 4.7 and GPT-5.5, which is extremely impressive for Google’s cheaper, faster model. But from my own limited testing, and what I’ve seen a few others say, it doesn’t quite feel as good in practice.

Even if it’s a step below the top models, it probably has a place for some people because it’s fast and cheap. I’m also very curious to see what Gemini 3.5 Pro looks like next month if Flash is this close to the top models.

The biggest issue for me with Gemini is that working with it doesn’t feel as natural or fun as Claude or ChatGPT. A lot of that is where I’m using the model. Claude Cowork and Codex feel great because they’re agentic. Google hasn’t caught up to them yet (more on this in the Trending News). Their Antigravity app hasn’t quite hit for day-to-day work.

Caseware launches Verity AI platform for audit engagements

Caseware launched Verity, a new AI layer built directly into assurance and financial reporting engagements. The platform uses engagement context, firm methodology, supporting documents, and professional standards to help audit teams surface risks, analyze documents, and move through engagement work.

Caseware says it has invested more than $100 million in AI development over multiple years. In beta deployments, Verity reached 94% accuracy on golden dataset items and reduced certain manual workflows from 15 to 20 minutes to under two minutes.

The first agents include a Disclosure Checklist Agent, Document Intelligence Agent, and Risk Suggestion Agent. Caseware says outputs are citation-backed, reviewable, and traceable before entering the engagement file.

Verity is built into Caseware Cloud and is currently available only in the U.S.

Why it’s important for us:

We’re getting a lot of signals around the future of audit. Just last week, it was Grant Thornton rolling out their new platform for agentic audits. All the major firms have now announced AI-enabled audit software. PwC and EY have recently gone as far as to say they expect audits to be mostly automated within 1-3 years.

Now Caseware has announced Verity, which will let the many other firms that already use Caseware Cloud benefit from similar agentic software.

That being said, I’ve already made it known I’m skeptical of accounting software vendors who build their own AI features. Accounting software isn’t exactly famous for beautiful, flexible, forward-looking products.

The fastest path to success for most software vendors (and accounting firms) is exposing MCPs and APIs so firms can connect their systems of record to tools like Claude Cowork, Claude Code, and Codex. Let’s let the AI companies handle the AI…

That feels like a more realistic path to success than waiting for every accounting vendor to suddenly become the next Anthropic.

Still, this is an interesting announcement. The barrier to adopting agents is instantly much lower for thousands of accounting firms if they’re willing to pay for Verity. We’ll see how useful the agents are in practice.

KPMG embeds Claude across client work and global workforce

Anthropic announced a global alliance with KPMG that will bring Claude into KPMG’s core business and give all 276,000+ KPMG employees access to Claude. The rollout embeds Claude inside KPMG Digital Gateway, the firm's main client-work platform, starting with new tools for tax and legal clients.

Digital Gateway is built on Microsoft Azure and combines KPMG's tax expertise, proprietary tools, and client data in one platform. With Claude Cowork and Managed Agents built in, KPMG professionals and clients can create AI workflows directly inside the platform instead of moving between separate tools and chat windows. KPMG said an AI agent for helping clients adjust to changing tax regulations that previously took weeks to build can now be created in minutes.

Anthropic is also naming KPMG a preferred partner for private equity. The two companies plan to build Claude-powered products for PE portfolio companies, including KPMG Blaze, which can embed Claude Code to help modernize aging IT systems and build new AI-enabled technology.

Why it’s important for us:

I feel like the top 10 accounting firms just continue making new deals with Anthropic and OpenAI. First, it was to roll out AI models. Then it was partnering to roll out AI models to clients. Then it was partnering to build internal AI tools. And now it’s utilizing agentic AI, agents on virtual servers, and selling consulting services to clients and PE.

KPMG built Digital Gateway, which is where they combine their own tools and data with Claude Cowork, Claude Code, and Claude Managed Agents. It’s not shocking that we keep seeing these partnerships, but the speed is a little surprising. It seems like Claude Cowork has really highlighted the gap, and now this is catching on quickly.

The PE piece is an interesting signal too. Very similar to the PwC announcement last week. Apparently, PE is willing to throw a lot of money at AI. Shocker.

TRENDING NEWS

OpenAI released ChatGPT for PowerPoint in beta: They're slowly catching up to Claude on the M365 add-ins. This is a much-needed release. Claude has historically been better at working with Excel and PowerPoint files, so I'll have to test this against the Claude add-in to see where it stands.

Google introduced Gemini Spark, a 24/7 AI agent designed to take action across Gmail, Docs, and other Google products: This is Google trying to catch up to Claude Cowork and Codex. Good announcement, but it's not rolled out yet. So TBD on how well this works at launch. But this could be a very interesting new tool for those who use Google Workspace.

Anchor moved its MCP out of beta and made it available to all Anchor users: Another accounting vendor MCP. Every vendor should have an MCP. It's becoming clear that your data needs to be accessible by AI in tools like Claude Cowork or Codex. MCPs make that happen.

OpenAI made Goal mode generally available in the Codex app: FYI, Claude has /goal as a skill now as well. This is a great new feature. Agents are getting better at working toward an outcome instead of just answering a prompt. I've been using this recently, and I'm finding that Claude or Codex will work for 45 mins to a few hours at a time on something, and it's significantly improved my outputs. Obviously, only useful if it's a difficult task with several moving parts.

Anthropic hired Andrej Karpathy to lead Claude-on-Claude pre-training research: It's weird to cover a hiring, but this is a big deal. Karpathy is one of the most respected minds in AI and was a founding member of OpenAI. This is a strong signal of what he thinks about Anthropic and the potential of the Claude models. He’s going to work on using Claude to train future Claude models.

OpenAI expanded Codex plugin sharing so teams can distribute reusable workflows across a workspace: Very clean interface for sharing plugins on Business and Enterprise plans.

OpenAI expanded Codex analytics for business and enterprise teams: I haven't yet seen what this looks like, but I like the idea a lot. Claude Cowork lets you track analytics for your team, but it requires another third-party subscription to compile the data. If this is all handled by OpenAI and it shows a nice dashboard, that'd be a great feature for business leaders. You'll theoretically be able to see your power users, spend by user and group, and the type of work your teams are doing.

OpenAI launched a personal finance preview in ChatGPT with Plaid account linking for U.S. Pro users: Individuals increasingly have access to AI-generated views of their spending, investments, and potentially even tax questions. Firms should get used to the fact that they'll be talking to more people who think they're educated on the topic. It'll be important to understand what advice clients think they've already received. I'm sure this is both good and also very bad…

Google unveiled Gemini Omni, a new multimodal model family for video generation, photo editing, audio, text, and digital avatars: Omni itself probably isn't useful for accountants right now, but the "world model" piece is interesting. If AI gets better at reasoning about the real world and how humans actually behave, that knowledge could eventually trickle down into the day-to-day models we use for documents, workflows, and client work.

PUT IT TO WORK

Anthropic released a new Claude for Small Business plugin, so I did the responsible thing and immediately looked for things to steal.

The most interesting pieces are the accounting skills, especially the month-end close skills. They show how Claude can use connected tools like QBO and payment processors to reconcile data, flag issues, create a P&L narrative, and export a close packet.

This isn’t plug-and-play for most businesses. But the skills are worth reviewing to find the ones you want to steal and make your own.

WEEKLY RANDOM

AI agents are getting wallets. Stripe announced last month that its Link wallet now supports agent-initiated payments. The agent submits a spend request with context, Stripe pings you to approve, and then payment credentials are released. The agent never sees your card number, and you'll be able to pre-authorize limits so it can act without per-transaction prompts.
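
For the technically curious, here’s a rough sketch of that approval logic in Python. To be clear, this is not Stripe’s API. The SpendRequest class, the route_spend_request function, and the $100 limit are all hypothetical names and numbers I made up to illustrate the idea: small requests inside a pre-authorized limit go through on their own, and anything bigger waits for a human.

```python
from dataclasses import dataclass


@dataclass
class SpendRequest:
    """Hypothetical spend request an agent submits (not a real Stripe object)."""
    agent_id: str
    amount: float   # requested amount in dollars
    context: str    # why the agent wants to spend


def route_spend_request(request: SpendRequest, preauthorized_limit: float) -> str:
    """Auto-approve requests within the limit; send the rest to a human."""
    if request.amount <= preauthorized_limit:
        return "auto_approved"
    return "needs_human_approval"


# Illustrative requests with made-up amounts
small = SpendRequest("agent-1", 42.50, "printer paper")
large = SpendRequest("agent-1", 1800.00, "annual software license")

small_decision = route_spend_request(small, preauthorized_limit=100.00)
large_decision = route_spend_request(large, preauthorized_limit=100.00)

print(small_decision)
print(large_decision)
```

The interesting part for accountants is the limit itself. It’s a control, and someone has to decide who sets it and who reviews it.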

This got me thinking about a question I hadn’t really considered: what does accounting look like with agents integrated into spending?

Ultimately, this doesn’t really change the expenses. An office supplies charge is still being coded to supplies expense. The chart of accounts probably doesn’t change. But controls around spending will.

Spend management tools have made authorization much easier for businesses over the last several years. I imagine the human-to-human limits and authorizations can carry over to the human-to-agent relationship. But the audit trail will change. Maybe it requires a different dimension to tag it as “agent spend.”

I suspect we’ll also see more chargebacks in the future. Likely won’t change how we account for chargebacks, but cash flow management might become more important. It could also surface some additional revenue recognition considerations.

Another interesting wrinkle is materiality. Agents will likely make a lot of small transactions. Individually, most would probably fall below materiality thresholds. But in the aggregate, errors could materially impact financials. This could have some significant impacts on how firms test controls.
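
To make the aggregate point concrete, here’s a back-of-the-envelope sketch in Python. The numbers are made up purely for illustration: a hypothetical $5,000 materiality threshold and 600 agent transactions that are each off by $12.50. Amounts are in cents to keep the arithmetic exact.

```python
# All figures are hypothetical and only illustrate the aggregation effect.
MATERIALITY_THRESHOLD_CENTS = 500_000   # assumed $5,000 threshold
ERROR_PER_TRANSACTION_CENTS = 1_250     # assumed $12.50 error per transaction
TRANSACTION_COUNT = 600                 # assumed number of agent transactions

errors_cents = [ERROR_PER_TRANSACTION_CENTS] * TRANSACTION_COUNT

# Is any single error material on its own?
any_single_error_material = max(errors_cents) >= MATERIALITY_THRESHOLD_CENTS

# Are the errors material when added together?
total_error_cents = sum(errors_cents)
aggregate_is_material = total_error_cents >= MATERIALITY_THRESHOLD_CENTS

print(f"Largest single error: ${max(errors_cents) / 100:,.2f}")
print(f"Total error: ${total_error_cents / 100:,.2f}")
print(f"Any single error material? {any_single_error_material}")
print(f"Material in aggregate? {aggregate_is_material}")
```

In this made-up case, no single $12.50 error comes close to the threshold, but together they add up to $7,500, which clears it.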

I’m not sure how far out we are from a world where agents have significant spend within the market, but it’s already starting now.

Until next week, keep protecting those numbers.

Preston
