Stop Measuring AI Like Software

After more than a decade in product management, I assumed moving into the world of AI would be a familiar path.

I brought my usual toolkit: feature prioritization, cross-functional coordination, and adoption metrics.

However, as I explored AI-powered and generative tools, I quickly realized this wasn’t business as usual.

These systems were fast, flexible, and capable of producing surprisingly useful outputs in seconds; they were also unpredictable.

They didn’t just deliver answers; they often invented them, and sometimes they responded with confidence even when they were far off the mark.

As a product manager, I found myself asking: Can I trust the responses these systems generate and base my work on them?

That’s when it became clear: You can’t measure AI the same way you measure traditional software.

Author's summary: Measuring AI requires a different approach.

Communications of the ACM — 2025-10-15