Evaluating the Chatbots

Is ChatGPT getting dumber? This paper raises the question; this blog post questions the conclusions; and this article has more to say. That’s not a very useful question, because “dumber” is not exactly a property of ChatGPT (or anything else). It’s a set of relationships between ChatGPT’s behaviour; people’s notion(s) of dumb and smart; and the context. Evaluating that requires a complex set of perspectives, values, and social judgements. For … Read more

“Should Sound Like” vs. “Should Be”

Yet another post plucked and adapted from the walled garden of LinkedIn. “What the large language models are good at is saying what an answer should sound like, which is different from what an answer should be.” —Rodney Brooks, https://spectrum.ieee.org/gpt-4-calm-down Note for testers and their clients: the problem that Rodney Brooks identifies with large language models applies to lots of test procedures and test results as well. People often have … Read more

Winding Up

After 20 years of working together to develop the Rapid Software Testing approach, James Bach and I have decided that — improbable as it may seem — it’s time to wrap it all up. Perhaps this will be a surprise to our followers in the community, but we now must confront what we previously thought was unimaginable: recent developments in technology have, for all intents and purposes, made testing obsolete. … Read more