Hello fellow keepers of numbers,

True story, I’ve had to set up a second Twitter account to track AI news. There’s so much news coming so fast that it was consuming the entirety of my feed. I was missing important sports updates because they got lost in the sea of people posting novels and dramatically panicking about AI.

Okay, now let’s panic about AI.

Claude Mythos is here, and it’s finding exploits in every operating system and web browser it gets its cold, robotic little hands on. Perplexity markets Computer for Taxes to the public to see if we can set a single-year record for amended returns. And Claude Cowork finally leaves research preview.

THE LATEST

Anthropic unveils Claude Mythos and Project Glasswing

Anthropic announced Project Glasswing, a new cybersecurity initiative built around Claude Mythos Preview, a frontier model the company describes as its most capable to date. Anthropic has stated that it does not plan to make Mythos publicly available. Instead, it assembled a coalition that includes AWS, Apple, Google, Microsoft, JPMorganChase, CrowdStrike, NVIDIA, and Palo Alto Networks to deploy the model specifically against software vulnerabilities in critical infrastructure. Roughly 40 additional organizations have been granted access for scanning first-party and open-source systems.

On benchmarks, Mythos scores 93.9% on SWE-bench Verified compared to 80.8% for Opus 4.6. It leads in 17 of 18 benchmarks Anthropic measured.

The model autonomously found a 27-year-old vulnerability in OpenBSD, widely considered one of the most security-hardened operating systems in existence. It caught a 16-year-old bug in FFmpeg that automated testing tools had run past five million times without flagging. It also found a chain of Linux kernel vulnerabilities that could give an attacker full control of a machine. All of it was done without human direction.

Pricing for Mythos Preview is $25 per million input tokens and $125 per million output tokens, which makes it 5x the cost of Opus 4.6. Anthropic has committed $100 million in Mythos Preview usage credits to Project Glasswing partners.
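For the spreadsheet-minded, here’s a rough sketch of what those rates mean per job. The token counts are made-up illustration numbers, and the Opus 4.6 figure is simply backed out of the 5x multiple above rather than taken from a price sheet.

```python
# Mythos Preview list prices quoted above, in USD per million tokens.
MYTHOS_INPUT_PER_M = 25.00
MYTHOS_OUTPUT_PER_M = 125.00

# The "5x the cost of Opus 4.6" multiple quoted above.
MYTHOS_VS_OPUS_MULTIPLE = 5


def job_cost(input_tokens: int, output_tokens: int,
             input_per_m: float, output_per_m: float) -> float:
    """Return the USD cost of one job at per-million-token rates."""
    total = input_tokens * input_per_m + output_tokens * output_per_m
    return total / 1_000_000


# Hypothetical job size, chosen only for illustration.
EXAMPLE_INPUT_TOKENS = 200_000
EXAMPLE_OUTPUT_TOKENS = 20_000

mythos_cost = job_cost(EXAMPLE_INPUT_TOKENS, EXAMPLE_OUTPUT_TOKENS,
                       MYTHOS_INPUT_PER_M, MYTHOS_OUTPUT_PER_M)

# Assumption: the 5x multiple applies evenly to input and output tokens.
implied_opus_cost = mythos_cost / MYTHOS_VS_OPUS_MULTIPLE

print(f"Mythos Preview: ${mythos_cost:.2f}")
print(f"Opus 4.6 (implied by the 5x multiple): ${implied_opus_cost:.2f}")
```

On those assumed token counts, the job comes to $7.50 on Mythos Preview versus an implied $1.50 on Opus 4.6.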

Why it’s important for us:

Sick name, huh? Sounds like a 007 or Mission: Impossible movie title… Mission: Impossible 18: Project Glasswing.

Depending on your algorithm and where you get your news, you’ve likely heard both ends of the spectrum. On one hand, people say this is a marketing ploy by Anthropic: hype up a model that’s really not that big of a deal, IPO in the next few months, and make a bunch of money. On the other hand, people say this is the biggest deal of all time, and software will never be the same.

So I’m going to try to be a realist:

A lot of the top companies in the world have signed up for Project Glasswing. If this is all a marketing ploy, then most of the major companies are in on the conspiracy. I’m not much of a conspiracy theorist, so I think there’s good reason for these companies to join the program.

Thus, I tend to believe this is a pretty big deal. Claude Mythos was capable of exploiting every operating system and web browser it tested, which means it’s likely going to be able to exploit nearly every piece of software out there. The implications are enormous. If Claude Mythos isn’t at least as good at patching those security gaps, then any one bad actor who gets their hands on it could wreak major havoc.

If AI’s ability to write secure code doesn’t outpace its ability to exploit code, then every piece of software on the planet becomes insecure once a model like this lands in the wrong hands. We’re entirely beyond human capability here. And that’s a bit unsettling.

But at the same time, maybe it’ll expose a lot of the major vulnerabilities in our software, and very smart developers can patch them. Then our software becomes safer than ever. I think that’s also a genuine possibility.

I suspect we’ll be seeing a lot of communications from major software companies and our accounting software vendors referencing AI-assisted bug patches. It’s more important now than ever to make sure your software and devices are updated to the most recent stable version (and remain so as new updates are deployed) to ensure you’re not susceptible to attacks.

Perplexity Computer takes on tax prep

Source: Perplexity / Introducing Computer for Taxes

Perplexity launched Computer for Taxes, a new feature within Perplexity Computer that drafts U.S. federal tax returns on official IRS forms. Users upload financial documents like W-2s and 1099s, the system reviews them and asks follow-up questions about the filer's situation, then maps the inputs to IRS forms and generates a draft return.

Beyond drafting, Computer for Taxes can review a professionally prepared return for accuracy and compliance, flag missed deductions or errors, and build custom dashboards and tools for more complex areas of the tax code like depreciation tracking, startup equity modeling, and rental property deductions under passive loss rules.

The tax knowledge is packaged as loadable modules built on Perplexity's Agent Skills protocol. The modules are continuously updated and grounded in current IRS materials and regulations, so when tax laws change, the modules update independently without retraining the underlying AI model.

Computer for Taxes is accessed via the "Navigate my taxes" option inside Perplexity Computer. It currently supports federal returns only, with no state tax support or e-filing capability at launch. Perplexity notes that outputs are intended for reference only and are not a substitute for professional tax advice.

Why it’s important for us:

AI doing taxes has likely been all over your social media feeds the last few weeks, thanks to the algos. It’s that time of year, I guess.

The unpleasant truth for a lot of accountants and firms is that there’s nothing special about filing tax returns. AI can learn the rules. It can apply the rules. It can ingest data in minutes that would take humans days or weeks.

I never understand when accountants get offended by the idea that AI could do a tax return. Think about it. Who within the firm does the majority of the work on a tax return? Interns and associates. Those idiots don’t know anything. I know because I used to be one of them idiots. The review is still left to managers and partners. So why would AI be any different?

But AI is still not there yet, especially with what I’ve seen from Perplexity. I’ve tested Perplexity Computer, and I wasn’t very impressed. Beyond the bad output, it cost me about $30 for a very simple task that would’ve taken Claude Code or Claude Cowork about 15 minutes.

The capabilities will be there soon though. It won’t percolate down to most firms for a while. But those at the leading edge will be capable of using AI to complete returns nearly end-to-end, most likely by the 2027 filing season (tax year 2026 returns).

I think it’s time to start coming to terms with that so we can move on to better things than keying numbers into tax software created two decades ago.

TRENDING NEWS

Anthropic makes Claude Cowork generally available to all paid subscribers: Finally. The “research preview” tag is removed. Cowork is covered by the same security as the rest of your Claude plan. If you’re on a Claude Team or Enterprise plan, and you’ve decided you’re comfortable putting client data into the regular chat, you can now do the same with Cowork.

Anthropic brings Claude into Word: Not shocking since it’s already in Excel and PowerPoint. But still very cool. Looks extremely impressive, including tracking its edits for you to review and approve.

Meta launches Muse Spark, which is its first closed-source AI model: Almost forgot about Zuck for a minute. The benchmarks for agentic AI (think compared to using Claude Cowork) aren’t very strong. I’ll be interested to see the API pricing when it’s released. If it’s similar to the frontier labs, this model has no chance of competing. If it’s significantly cheaper or free, this might be an interesting option.

Anthropic launches Claude Managed Agents: They’re offering the ability to run the power of Claude Cowork in a cloud environment. This one is most relevant to the automation nerds (like myself) who build things in n8n, Make, or Zapier. It was previously very difficult to connect automations to the same infrastructure that Claude Cowork runs behind the scenes. Anthropic is making that much easier now. The dashboard to view agents is also a nice touch. I’ll likely have more on this in the coming weeks as I test it.

Zapier launches its SDK in open beta: This gives you the power to use every single Zapier API endpoint with your agent of choice (Claude Code or Codex). Much more powerful than using an MCP because it packages all Zapier integrations into one place. Right now, this is free. It’ll be interesting to see how they choose to charge for this moving forward. I’m guessing they see this as the future of automation (I’d agree), and they’re among the first to market with something like this. It’ll probably result in a huge market share. Really great move by them.

Microsoft Copilot’s Researcher now uses ChatGPT and Claude together: The Researcher agent now has two multi-model modes. Critique allows one AI model to draft the report and another to review for accuracy. Council allows the agent to run both AI models and provide results side-by-side. Critique performed about 13.8% better than a single-provider agent.

Anthropic introduces the advisor strategy for cheaper AI agents: This pairs Opus with Sonnet or Haiku to save tokens and run faster. Opus is the orchestrator, while Sonnet or Haiku does the dirty work. If this sounds familiar, it’s because this is similar to the announcement directly above from Microsoft. We’ll start seeing a lot more of this moving forward. It’s what the cool kids are calling ‘tokenomics’.

Google rolls out AI Inbox for Gmail: An AI-powered inbox view that replaces your email list with a briefing. Inbox management is one of the most common requests I hear. Great feature by Google. Problem is, most accounting firms use Outlook…

OpenAI raises $122B at an $852B valuation: Not much to say here other than ‘holy s***’.

PUT IT TO WORK

Claude Cowork is out of research preview, so we can use it as much as our hearts desire. Hopefully your heart desires less than 38 seconds of usage though because those rate limits hit fast.

Or just shell out more cash for the higher usage Team plan. That’s probably the right answer. One task that saves more than an hour a month pretty much justifies the extra $100/mo.
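If you want to sanity-check that $100/mo math against your own numbers, here’s a minimal break-even sketch. The hourly values are hypothetical placeholders, not anyone’s actual billing rates, so swap in whatever an hour of your time is really worth.

```python
# Extra monthly cost of the higher usage plan, as quoted above.
EXTRA_PLAN_COST_PER_MONTH = 100.00


def break_even_hours(extra_cost: float, hourly_value: float) -> float:
    """Hours saved per month needed for the upgrade to pay for itself."""
    if hourly_value <= 0:
        raise ValueError("hourly_value must be positive")
    return extra_cost / hourly_value


# Hypothetical values of one hour of staff time, in USD (assumptions).
HYPOTHETICAL_HOURLY_VALUES = [75.0, 100.0, 200.0]

for rate in HYPOTHETICAL_HOURLY_VALUES:
    hours = break_even_hours(EXTRA_PLAN_COST_PER_MONTH, rate)
    print(f"At ${rate:.0f}/hr, the upgrade pays for itself "
          f"after {hours:.2f} hours saved per month")
```

The one-hour rule of thumb holds exactly when an hour of your time is worth $100. Above that, the upgrade pays for itself even faster.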

Here’s a demo of Cowork reviewing emails every hour to find documents sent by clients, which it then uploads to your DMS.

WEEKLY RANDOM

OpenAI released a policy blueprint this week called “Industrial Policy for the Intelligence Age.” It covers several key areas, such as a public wealth fund, the right to AI, a modern tax base, and wage-linked incentives.

To summarize, this is OpenAI’s attempt to open the conversation on how we shift policy in a world where AI has taken a lot of jobs and is providing significant economic value.

Is it too little, too late? It feels like it is, right? It’s pretty frustrating that this is the time they’ve decided to think deeply about this topic and open a conversation. We’re already in the weeds. We’re seeing companies lay off meaningful percentages of their workforce every week now and blame AI (which is a load of B.S., but that’s beside the point right now).

I’m sympathetic to the fact that OpenAI didn’t know how quickly AI would improve when they initially launched ChatGPT. And I’m also sympathetic to the fact that it’s nice for the public to be aware of the technology. But it feels pretty negligent on OpenAI’s part that they’ve chosen to be the first company to release this technology for enterprise and consumer use and didn’t at least have a solid plan in place to communicate with leaders on the proper guardrails and policy.

I’m not suggesting they needed to have the answer, but we’re far past the point of no return.

Rant over. One of the most interesting pieces of the document is their opinion on modernizing the tax base and wage-linked incentives. OpenAI suggests rebalancing taxes toward capital gains, corporate income, and automated labor (aka AI agents). They also suggest incentives, similar to an R&D tax credit, for retraining and investing in human workers.

Accountants should have a pretty significant seat at the table for these discussions. This’ll be an interesting one to watch.

Until next week, keep protecting those numbers.

Preston
