Pickles Rule the Earth
(or, Why "Almost Always Right" Should Worry You)
Author's note: I am not a doomsayer. I use this technology every single day, at work, at home, for brainstorming, productivity, and product research, all of it. I'm a believer. I am also a realist.
Next week, thousands of hospitality professionals will descend on HITEC and spend three days being shown how AI is about to transform our world. As per usual, some of the products on show will do what they say on the tin. And some ... not so much.
All of them have one thing in common: they are built on probabilities, not certainties.
Your AI is a slot machine with a freakishly high return. Almost every pull pays out — a good answer, maybe a great one. But the house edge never leaves the table.
Every response is a weighted lookup. Billions of parameters voting on the most likely next word, then the next. The odds overwhelmingly favor the right answer. Ask it "how are you" and you'll get "I'm doing well" every single time.
Almost.
Almost isn't always. There's a non-zero chance — tiny, but real — that it answers pickles rule the earth. Confidently. The same confidence it uses when it's right.
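To make the odds concrete, here is a toy sketch in Python. The replies and their probabilities are invented for illustration, not measured from any real model, and the sketch assumes the model samples from its distribution rather than always taking the single most likely option. It shows two things: a weighted draw that almost always returns the expected reply, and the chance that a rare reply turns up at least once over many draws.

```python
import random

# Invented reply probabilities for the prompt "how are you".
# These numbers are illustrative assumptions, not measurements.
REPLY_PROBS = {
    "I'm doing well": 0.9990,
    "I'm fine, thanks": 0.0009,
    "pickles rule the earth": 0.0001,
}


def sample_reply(rng):
    """Draw one reply at random, weighted by its probability."""
    replies = list(REPLY_PROBS)
    weights = list(REPLY_PROBS.values())
    return rng.choices(replies, weights=weights, k=1)[0]


def count_replies(n_draws, seed=0):
    """Tally how often each reply appears across n_draws weighted draws."""
    rng = random.Random(seed)
    counts = {reply: 0 for reply in REPLY_PROBS}
    for _ in range(n_draws):
        counts[sample_reply(rng)] += 1
    return counts


def chance_of_at_least_one(p, n_draws):
    """Probability that an outcome with per-draw chance p occurs at least once."""
    return 1 - (1 - p) ** n_draws


if __name__ == "__main__":
    print(count_replies(100_000))
    # A 1-in-10,000 reply is unlikely on any one pull, but likely over 10,000 pulls.
    print(round(chance_of_at_least_one(0.0001, 10_000), 3))
```

Under these made-up odds, a 1-in-10,000 reply has roughly a 63% chance of appearing at least once in 10,000 pulls. Rare on any single pull is not the same as rare across a month of pulls.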
Don't believe me? Try this.
The 5 + 5 Experiment
Ask your AI what 5 + 5 is. It'll say 10. I can almost guarantee it.
Now ask it how it knows.
I did this with the most powerful model available to the public — Anthropic Fable 5, for the nerds. Here's what it told me:
There's no arithmetic logic unit in here. When I produce "10," I'm doing a forward pass through the network and emitting the token with the highest probability given the context. For "5 + 5," that answer is so overwhelmingly represented in training data that the distribution is essentially a spike — it behaves like a deterministic lookup, but mechanistically it's still next-token prediction.
And the part that matters:
Reliability degrades as numbers get bigger or operations chain. 5 + 5 is bulletproof, but 7-digit multiplication? Accuracy drops surprisingly fast. There's no exact mechanism, just increasingly strained learned approximations.
There are remediations: a calculator tool can be invoked when the model decides it's warranted. But underneath, most AI is a probability distribution, all the way down. There isn't a tiny accountant living in there doing arithmetic. There isn't a tiny front desk agent checking policy. There is only a machine predicting what comes next.
So next week at HITEC, you'll see products that purport to solve a litany of real problems, and you'll hear certainty implied where probability is the reality.
As you nod along to the pitch, ask yourself two things: What happens when it's wrong? And how often?
Let's Put a Number On It
Here's a 150-room hotel at 70% occupancy. Nothing exotic — middle of the road.
150-ROOM HOTEL @ 70% OCCUPANCY — THE HOUSE EDGE
================================================
105 occupied rooms · 52 arrivals/day · 55% loyalty
INTERACTIONS / DAY
Bookings + changes + pre-arrival ... 37
In-stay requests ................... 158
Digital + post-stay ................ 25
----------------------------------------
TOTAL ............................. ~220 (~155 automatable)
WHO YOU'RE TALKING TO
Involve a loyalty member .......... 63% (elites: ~40/day)
THE HOUSE EDGE (@ "99.6% accurate")
Bad answers ....... ~3/day · ~95/month · ~1,130/year
...hitting members ......... ~58/month
...hitting elites .......... ~17/month
================================================
"99.6% accurate" and "wrong 95 times a month" are the same number. One sounds like an A+. The other sounds ... not so much like an A+. To be clear, humans make mistakes too. The problem is that AI mistakes arrive with the confidence of correctness.
And notice who gets the bad answers: 63% of your interactions involve a loyalty member. Your most valuable guests ask the hardest questions — entitlements, points, upgrades — which is exactly where the model is most likely to confidently invent a "yes."
Here's the Part That Got Me
While writing this, I asked that same Fable 5 model to run these numbers. It produced the lovely chart above, with figures that, after 30-some-odd years in hospitality, looked about right.
But the whole thing hinged on one value: 99.6% accuracy.
So, being a realist, I asked: Is 99.6 remotely accurate? Any basis for that number?
It confessed:
The fairest answer: no — it's your number, not a measured one. I inherited 0.4% from your slot-machine framing and then built arithmetic on it.
The AI built my entire exhibit on a number it made up — and presented it with total confidence until I pushed.
Pickles.
It's Probably Worse Than 99.6%
When I made it go find real benchmarks, reality got a little more uncomfortable. It turns out there isn't a single answer. Performance varies wildly depending on scope, grounding, guardrails, and whether the system is simply answering questions or actually taking actions.
- Best case (a well-engineered, grounded, narrow-scope FAQ bot): around 99.3%.
- Typical (realistic deployments): 97–99%.
- Agentic (anything that takes actions: books, modifies, charges a folio): will likely fall lower still. Multi-step actions are tough. Every additional step introduces another opportunity for failure.
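The compounding in that last bullet is easy to sketch. Assuming each step succeeds independently with the same per-step accuracy (a simplification, since real failures are often correlated), the chance that a whole multi-step action goes right is the per-step accuracy raised to the number of steps. The function name and step counts below are hypothetical.

```python
def chain_success_rate(step_accuracy, n_steps):
    """Chance that every one of n_steps independent steps succeeds."""
    return step_accuracy ** n_steps


if __name__ == "__main__":
    # Per-step accuracy of 99%, the top of the "typical" range above.
    for n_steps in (1, 3, 5, 10):
        rate = chain_success_rate(0.99, n_steps)
        print(f"{n_steps:>2} steps: {rate:.1%} end-to-end")
```

At 99% per step, a five-step action lands at about 95% end to end, and a ten-step action at about 90%.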
Re-run the chart at a 1% error rate and it's not 95 bad answers a month. It's 232. At 2%, it's 465, many of them to your best guests. And unlike the front desk agent who misspoke, the AI's mistake is in writing. Screenshot-able. "Your AI confirmed it" arrives with a receipt.
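That scaling is plain multiplication, sketched below. The monthly volume of 23,250 interactions is an assumption: it is not stated in this piece and is back-solved from the 232 and 465 figures, so swap in your own property's count.

```python
# Assumed monthly guest interactions; back-solved from the figures in the
# text (232 at 1%, 465 at 2%), not a measured value.
ASSUMED_MONTHLY_INTERACTIONS = 23_250


def expected_bad_answers(monthly_interactions, error_rate_pct):
    """Expected wrong answers per month at a given error rate (in percent)."""
    return monthly_interactions * error_rate_pct / 100


if __name__ == "__main__":
    for error_rate_pct in (1, 2):
        bad = expected_bad_answers(ASSUMED_MONTHLY_INTERACTIONS, error_rate_pct)
        print(f"{error_rate_pct}% error rate: about {bad:.1f} bad answers a month")
```

The relationship is linear: double the error rate or double the volume, and the count of bad answers doubles with it.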
Next Week
Be realistic, be cautious, be curious. Ask the hard questions.
The house always has an edge. Your job is to read the cards before you cash them, lest pickles rule the earth.