After over a decade in product management, I assumed moving into the world of AI would be a familiar path.
I brought my usual toolkit: feature prioritization, cross-functional coordination, and adoption metrics.
However, as I explored AI-powered and generative tools, I quickly realized this wasn’t business as usual.
These systems were fast, flexible, and capable of producing surprisingly useful outputs in seconds; they were also unpredictable.
They didn’t just deliver answers; they often invented them, and sometimes responded with confidence even when they were far off the mark.
As a product manager, I found myself asking: Can I trust the responses these systems generate and base my work on them?
That’s when it became clear: You can’t measure AI the same way you measure traditional software.
Author's summary: Measuring AI requires a different approach.