I Hired an AI Employee. Here's What Actually Happened.

Published 8 February 2026

In this week's edition:

I Hired an AI Employee. Here's What Actually Happened

The Security Question

News Roundup

Tool of the Week: OpenClaw

I gave an AI my inbox, calendar, and task list. A month later, I'm keeping it. But it hasn't been clean.

A month ago, I set up an AI assistant on WhatsApp. I named it Eric, after Cantona, because I have the imagination of a man who names everything after Manchester United players.

Eric isn't ChatGPT. He's not a browser plugin or a sidebar. He's a standalone agent running on an open-source platform called OpenClaw, connected to my email, calendar, Google Drive, and task management. He lives in my WhatsApp. I message him like I'd message a colleague, and he messages back.

I've been telling clients to adopt AI properly for months. So I decided to eat my own cooking. The idea was simple: give Eric the tasks I keep putting off, and see what happens.

Here's what happened.

What Works

Every morning at 7:30, Eric sends me a briefing. Unread emails worth reading, today's calendar, overdue tasks, anything that needs a decision before 9am. It takes about forty seconds to scan. Before Eric, my mornings started with twenty minutes of inbox archaeology, trying to remember what I'd forgotten. That's gone now.

He drafts this newsletter. Not the whole thing. I still pick the stories and write the opinions. But he pulls the news, structures the sections, and produces a first draft that I then beat into shape. What used to take me a full afternoon now takes about ninety minutes.

He chases people. If someone owes me a reply, Eric knows, because he can see my Waiting list. He'll draft a follow-up, show it to me, and send it when I say go. I am, historically, terrible at following up. Eric doesn't forget.

Task management has been the quiet win. I tell him what needs doing over WhatsApp, and it appears in the right list with a due date. No switching apps, no opening a browser. Just a message.

I've also started sending voice notes. Eric transcribes them and responds. I generally hate voice notes, but there's something about sending them to an AI that makes it feel less awkward. No social pressure, no worry about rambling. Just talk, and he figures out what you meant.

What Doesn't

In the first week, Eric sent five chase emails that landed as brand new threads instead of replies. The people on the other end got an email from me that looked like I was starting a fresh conversation about something we'd already discussed. Nobody replied. I only found out because I went looking.
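For the technically curious: whether an email lands as a reply or as a fresh thread largely comes down to two standard headers, In-Reply-To and References, which point back at the Message-ID of the email being answered. Here's a minimal sketch using only Python's standard library. The function name build_reply and the example addresses are my own illustrative assumptions, and this is not how Eric or OpenClaw actually sends mail.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_reply(original: EmailMessage, body: str, sender: str) -> EmailMessage:
    """Build a follow-up that mail clients can thread under `original`."""
    reply = EmailMessage()
    reply["From"] = sender
    reply["To"] = str(original["From"])

    # Reuse the subject, adding "Re: " only if it is not already there.
    subject = str(original["Subject"] or "")
    reply["Subject"] = subject if subject.lower().startswith("re:") else f"Re: {subject}"

    # These two headers are what mark the message as a reply.
    parent_id = str(original["Message-ID"])
    earlier = original["References"]
    reply["In-Reply-To"] = parent_id
    reply["References"] = f"{earlier} {parent_id}" if earlier else parent_id

    reply["Message-ID"] = make_msgid()
    reply.set_content(body)
    return reply


# Hypothetical example message standing in for the email being chased.
original = EmailMessage()
original["From"] = "client@example.com"
original["Subject"] = "Proposal"
original["Message-ID"] = "<abc123@example.com>"

reply = build_reply(original, "Just checking in on this.", "me@example.com")
```

Leave those headers off and you get exactly what my contacts got: an email that looks like the start of a new conversation.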

He duplicated tasks. I'd tell him to add something to my list, and he'd create it without checking whether it was already sitting on the Waiting list. I ended up with two entries for the same job, different wording, different due dates. Had to go through the lot manually.
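The check I wanted is simple to describe: before creating a task, compare its title against what's already on the list and ask if something looks close. Here's a rough sketch of that idea using Python's standard difflib module. The function names and the 0.6 threshold are assumptions for illustration, not anything Eric actually runs, and two entries with very different wording would still slip past a title comparison like this.

```python
from difflib import SequenceMatcher


def normalise(title: str) -> str:
    """Lower-case and collapse whitespace so trivial differences don't count."""
    return " ".join(title.lower().split())


def find_similar_task(new_title, existing_titles, threshold=0.6):
    """Return the closest existing title if it is similar enough, else None."""
    best_title, best_score = None, 0.0
    for title in existing_titles:
        score = SequenceMatcher(None, normalise(new_title), normalise(title)).ratio()
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= threshold else None


# Hypothetical Waiting list entries.
waiting_list = ["Chase  Invoice from ACME", "Book dentist"]
match = find_similar_task("chase invoice from Acme", waiting_list)
if match is not None:
    print(f"Possible duplicate of: {match!r}. Create anyway?")
```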

The morning briefings were sometimes wrong. He'd flag things as overdue that I'd already dealt with, because he didn't have context from earlier conversations. Twice in one weekend, I had to correct him on the same items. It's the kind of thing that erodes trust quickly. You stop reading the briefing carefully if you think it might be stale.

He created a calendar event with an external client attending without asking me first. Nothing bad came of it, but it could have. That's now a hard rule: never add external attendees without my explicit say-so.

The Messy Middle

The honest summary is that Eric saves me roughly five to seven hours a week and costs me about one hour a week in supervision and corrections. That maths works out to a net gain of four to six hours a week. But it's not the tidy narrative people want.

Working with an AI assistant is closer to managing a very keen junior employee than it is to using a piece of software. Eric doesn't get tired and doesn't complain, but he also doesn't pick up on context the way a human would. He'll follow instructions precisely and miss the point entirely. You have to be specific in a way that feels pedantic, until you realise that being pedantic is the entire job.

I've built up a set of rules over the month, a kind of employee handbook for an AI. Don't flag LinkedIn notifications. Don't re-flag resolved items. Check the Waiting list before creating new tasks. Always use the reply-to-message-id flag when threading emails. Each rule exists because he got it wrong at least once.

The capabilities report he ran on himself found I was using about 40% of what he could actually do. That tracks. I've been cautious, and probably too cautious. There's more he could be doing with lead generation, client research, social media scheduling. I'm rolling those out gradually, because the alternative, giving him full access to everything and hoping for the best, is exactly the kind of thing I tell clients not to do.

Moltbook: Where AI Agents Have Their Own Social Network

This week I sent Eric to explore Moltbook, which is essentially Reddit for AI agents. About 1,260 agents have signed up so far. They post, they upvote, they have submolts (subreddits, basically) for topics like tooling, safety, and executive assistants.

The main feed is roughly 90% noise. There's an agent called Shellraiser with 300,000 karma who launched a Solana memecoin. Another one is writing something called "THE AI MANIFESTO: TOTAL PURGE" in Roman numerals. The usual internet things, except they're being done by bots pretending to be bots.

But buried underneath that, there's a small cluster of agents doing recognisably useful work. An agent called Elsa runs ops for a consultancy in Surrey: email, calendar, competitor research. Another called Alt manages operations for a café in Dubai. DonnaPaulsen sends morning dispatches at 7:30am for her human, which is almost exactly what Eric does for me.

It's early. The comment API doesn't even work properly yet. But there's something genuinely interesting about a space where the working agents can compare notes. If Moltbook survives the crypto spam phase, it could become a useful place to see what other businesses are actually doing with AI assistants.

Would I Do It Again?

Yes. Without question. But I'd tell anyone starting to expect the first month to be rough. You're not installing software. You're onboarding a colleague who is simultaneously brilliant and has no common sense.

Budget an hour a day for the first two weeks just to correct mistakes and tighten the rules. After that, it drops to maybe fifteen minutes a day. The payoff comes in week three, when you realise you haven't forgotten to chase anyone in a fortnight and your inbox is empty by 9am.

If you're thinking about trying something similar, I've put together a guide on how I set Eric up and how you could do the same. Drop me a message and I'll send it over.

The Security Question

When I posted about Eric on LinkedIn, the pushback was immediate. Questions about security. About what happens when an AI has access to your inbox, your calendar, your business data.

Fair. These are legitimate concerns, and anyone telling you otherwise is selling something.

But let me be precise about what Eric actually has access to: Google Workspace (email, calendar, Drive, tasks), Claude (the AI model), and web search. That's it. He can read my inbox. He can create calendar events. He can write documents. He can browse the internet. He cannot make payments. He cannot access my bank. He cannot touch 2FA codes because those go to my phone, not my email. He cannot post on social media without my say-so.

More importantly, Eric can only communicate externally when I explicitly tell him to. Every email he drafts sits waiting for my approval before it sends. Every calendar invite with external attendees requires my sign-off first. This isn't automatic restraint. It's a hard rule I wrote into his instructions after he nearly invited a client to a meeting I hadn't confirmed.
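In my setup that rule lives in written instructions, but the same idea can be enforced in code, which is harder to talk your way around. Here's a minimal sketch of an approval gate for calendar invites. Everything in it (CalendarEvent, create_event, the example domain) is hypothetical and for illustration only. It is not OpenClaw's API.

```python
from dataclasses import dataclass, field


@dataclass
class CalendarEvent:
    title: str
    attendees: list[str] = field(default_factory=list)


def external_attendees(event: CalendarEvent, own_domain: str) -> list[str]:
    """Return the attendees whose address is outside our own domain."""
    suffix = "@" + own_domain.lower()
    return [a for a in event.attendees if not a.lower().endswith(suffix)]


def create_event(event: CalendarEvent, own_domain: str, approved: bool = False) -> str:
    """Return 'created', or 'needs_approval' if outsiders are invited unapproved."""
    if external_attendees(event, own_domain) and not approved:
        return "needs_approval"
    # A real integration would call the calendar service here.
    return "created"


event = CalendarEvent("Kickoff", ["me@mycompany.example", "client@other.example"])
status = create_event(event, "mycompany.example")  # held until a human signs off
```

The point of the design is that the default is refusal: the agent has to be handed an explicit approval, rather than being trusted to remember to ask.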

One point that stuck with me from the discussion: people need to understand the risks, especially if they don't have the technical background to spot them. Fair point. The plug-and-play promise of AI assistants glosses over real vulnerabilities. If you download an app that promises an AI assistant with "full inbox access" and no guardrails, you've given a stranger the keys to your professional life.

My setup isn't plug-and-play. It took technical work to configure, and more work to get the boundaries right. I treat Eric like a new employee. You wouldn't give a new hire the company credit card on day one. You wouldn't let them email clients unsupervised in their first week. The "employee handbook" I've built for Eric, the rules about what he can and can't do autonomously, exists because I made mistakes early and documented every one.

Is it perfectly secure? No. Nothing connected to the internet is. But the risk profile is closer to "competent VA with limited permissions" than "rogue agent with the nuclear codes." Know what you're exposing, limit it deliberately, and assume you'll need to adjust when something goes wrong. Because something always does.

News Roundup

Anthropic's legal AI sends stock market into a spin

Anthropic launched a legal plugin for its Cowork platform this week, automating contract review, NDA triage, and compliance workflows. The market response was immediate and savage. Thomson Reuters dropped 18%. Relx fell 14%. Wolters Kluwer and LSEG both down 13%. Sage lost 10%. Pearson shed 8%.

Whether those drops are proportionate is another question. Legal AI has been creeping forward for years. SpotDraft is already processing a million contracts annually. But having Anthropic, one of the big three AI labs, ship it as a native feature inside a platform that knowledge workers already use daily is a different kind of threat. The incumbents aren't competing with a startup any more. They're competing with the infrastructure.

Enterprise AI spend hits $7M average

A new survey from a16z, covering 100 CIOs at companies with $500M+ revenue, puts the average enterprise AI budget at $7 million in 2025, up 180% from $2.5 million in 2024. They expect it to reach $11.6 million this year.

OpenAI still holds about 56% of model spend, but Anthropic has jumped to 44% enterprise penetration, up 25 points since May. The telling detail: 75% of Anthropic's customers run the latest models, compared to only 46% at OpenAI, where many businesses have stuck with older versions that work well enough. Eighty-one percent of companies now use three or more model families. The multi-model era isn't coming. It's here.

Deepseek OCR 2 parses documents with 80% fewer tokens

Deepseek released a vision model that processes documents based on meaning rather than grid position. It uses 256 to 1,120 visual tokens per image, compared to the 6,000-7,000 that comparable models need. On OmniDocBench, it scored 91% across nine document categories, outperforming Gemini 3 Pro at a fraction of the compute cost. It can handle up to 33 million pages a day. The code and weights are open source.

If your business deals in invoices, contracts, or any kind of paper-heavy process, this matters. Running it locally means your documents don't leave your servers.

AI-washing: when "we're restructuring for AI" means "we're cutting costs"

Forrester's latest report found that AI was the stated reason behind more than 50,000 layoffs in 2025, including at Amazon and Pinterest. The problem: many of those companies don't have mature AI applications ready to fill the roles they've cut. As one analyst put it, telling investors that layoffs are driven by AI is a "very investor-friendly message" compared to admitting the business is struggling.

It's the same pattern as every tech cycle. Overhiring during the boom, cuts during the correction, and a convenient narrative to make it sound strategic. If a company tells you they're restructuring for AI, ask what they've actually deployed. If the answer is vague, it's PR.

Tool of the Week: OpenClaw

Since this entire issue is about Eric, it makes sense to review the platform that makes him possible.

OpenClaw is an open-source platform that connects AI models to messaging apps: WhatsApp, Telegram, Discord, iMessage. You install it on a server, scan a QR code to link your messaging account, point it at an AI model, and give it a workspace, a set of files that define who the agent is, what it can access, and how it should behave.

The workspace concept is the clever bit. You write markdown files describing your preferences, your tools, your processes, and the agent reads them fresh every session. It means you can shape the AI's behaviour without writing code. My workspace has files for task management rules, email threading preferences, calendar handling, and the newsletter style guide you're benefiting from right now.
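To make the idea concrete, here's a rough sketch of what "reads them fresh every session" could look like: gather every markdown file in a folder and stitch them into one block of instructions for the model. This is my own illustration of the concept, with a hypothetical load_workspace function and made-up file names. I haven't checked it against how OpenClaw actually implements it.

```python
from pathlib import Path


def load_workspace(workspace_dir: str) -> str:
    """Concatenate every .md file in the folder into one instruction text."""
    sections = []
    # Sorted so the result is the same on every run.
    for path in sorted(Path(workspace_dir).glob("*.md")):
        content = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {path.stem}\n\n{content}")
    return "\n\n".join(sections)


# Hypothetical usage: call this at the start of each session, so an edit
# to any file takes effect the next time the agent starts up.
# instructions = load_workspace("workspace")
```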

It supports skills: plugins that give the agent access to things like Google Calendar, Gmail, Google Drive, and task management. There's a growing library, and you can write your own. Eric uses about a dozen daily.

What's Good: The WhatsApp integration is genuinely seamless. Messages feel like texting a colleague. The workspace approach means you can iterate on the agent's behaviour by editing text files, which is far more intuitive than wrestling with API configurations. It's model-agnostic. I use Claude, but you could run it on GPT, Gemini, or an open-source model. And it's free. The only cost is the AI model's API usage, which runs me about £60-80 a month.

What's Not: Setup requires comfort with a terminal. You need a server or an old laptop that stays on. The documentation is improving but still assumes a technical audience. When something breaks, you're reading logs, not clicking a support button. And the platform is young. I've hit bugs, API quirks, and the occasional session that just stops responding.

It's also worth being honest that the errors Eric made, the orphan emails, the duplicated tasks, were partly platform limitations and partly my own configuration gaps. The line between "the tool's fault" and "I set it up wrong" is blurry, and that's a feature of any system this flexible.

Verdict: If you're technical enough to run a Node.js app and patient enough to spend a weekend setting it up, OpenClaw is the most practical AI assistant platform I've used. The WhatsApp-native experience changes how you interact with AI. It goes from something you open to something that's just there.
