
Yes, We Still Need To Look. Carefully.

I very occasionally visit Xitter (pronunciation tip: it goes like the name of the President of the People’s Republic of China). The other day, Jason Huggins said

Just in case you’re using a screen reader, that’s “I occasionally use the Tesseract OCR library for text recognition. I think that means I’m a senior machine learning engineer now, I guess.”

I felt a little impish, but I also felt quite lazy. This seemed like a perfect application for ChatGPT.

I prompted: “Create the most absurdly inflated imaginable title for someone who uses an optical characterer recognition tool. End the title with ‘engineer’.”

ChatGPT replied: “Supreme Grandmaster of Digital Script Transcription, Luminary of Lexical Decipherment, Oracle of Opto-Linguistic Analysis, and Conductor of Cybernetic Cognition, Sovereign of Scanning Technologies, Monarch of Manuscript Manipulation, and Architect of Alphanumeric Algorithms: Optical Character Recognition Engineer Extraordinaire”

A few observations: ChatGPT produced a very impressive result. The response was useful for exactly the purpose I had intended: generating lots of words with no particular meaning, with little effort on my part, and with no real regard for truth. That is, bullshit. That sounds like a diss, but in this case it isn’t; often bullshit is pernicious, but sometimes it’s funny, and fun.

But notice something else: ChatGPT got a key part of the assignment wrong. Despite the explicit instruction in the prompt, ChatGPT ended the response with “Extraordinaire”, instead of “Engineer”.
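A constraint this simple is one of the few things about an LLM's output that a machine can check cheaply. Here is a minimal sketch in Python, assuming we only care about the final word of the reply; the function name `ends_with_word` is hypothetical, invented for illustration, and the sample text is the tail end of the reply quoted above. It checks one narrow thing and is no substitute for actually reading the output.

```python
import string


def ends_with_word(response: str, required: str) -> bool:
    """Return True if the last word of `response` is `required`.

    Hypothetical helper for illustration. The comparison ignores case
    and any punctuation or quotation marks wrapped around the last word.
    """
    words = response.split()
    if not words:
        return False
    # Strip ASCII punctuation plus curly quotes from both ends of the word.
    strip_chars = string.punctuation + "“”‘’"
    last_word = words[-1].strip(strip_chars)
    return last_word.lower() == required.lower()


# The tail of the reply quoted above.
reply = ("Architect of Alphanumeric Algorithms: "
         "Optical Character Recognition Engineer Extraordinaire")

print(ends_with_word(reply, "engineer"))  # prints False
```

A check like this can flag that something is wrong. It can't tell you whether the rest of the response is any good.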

In this case, of course, it doesn’t matter. But we should seriously consider what this means for circumstances in which someone might be using an LLM for something that does matter, especially in cases where we would need expert attention to notice and address problems in the output.

I’ve done lots of basic experiments with LLMs that should not pose much of a challenge. Every time, there have been bugs and problems in the responses. The wider literature is full of deeper analysis that points to serious risk. So if you’re using these tools for something important, you’ll need a tester’s mindset to be alert to trouble, a skilled supervisor’s expertise to fix it, and the patience and determination to apply them.
