Mind the Prompt

A Chat with ChatGPT

When I began writing this post, I was quite resolved to document just how badly ChatGPT can misunderstand simple things when given a slightly vague or context-free prompt. Plenty of people have posted online about ChatGPT making errors in certain instances -- arithmetic questions, riddles, taking things very literally, and even miscounting the letters within words -- so I set out to test all of these.

To my surprise, the results were quite different from what I expected!

About This Series

"Mind the Prompt" covers language-related AI insights, written by a Ph.D. in applied linguistics whose research focuses on the interdisciplinary nature of language, with a particular emphasis on computational linguistics within the Fourth Industrial Revolution and AI. Contact her at [email protected].

Giving ChatGPT some incredibly simple prompts was the first step. I used the free version of GPT-5.3, the same one most people without the premium option would use. I typed, "A penguin is not a whay." I meant to type "what," but the tiny typo of "y" instead of "t" produced quite an interesting answer: instead of "thinking" I meant "what," the model latched onto "whale." When I then prompted it with the word I actually meant, it interpreted that as me asking for clarification.

By now, many users of LLMs have had the experience of arguing with ChatGPT as if arguing with a human who does not understand something. For me, this was a defining moment in seeing firsthand how amusing the associations this LLM can produce really are. What stood out was that it did not treat a typo as the most likely explanation; instead, its first instinct was to relate the misspelled word to something in the same category as the rest of the sentence.

An average human would likely have picked up that "whay" meant "what" instead of jumping to another sea animal. Sometimes, it seems, how highly evolved LLMs have become can work to their own detriment.

Figure 1. Riddle Prompt

Even more ironically, prompting it with the correct word (in this case, "what") did not necessarily clear up the error. Instead, it interpreted my "what" as me asking it, "What?"

Notice how, even though I did not use a question mark, it still offered a clarification and began to explain what it had done instead of what I had meant for it to do. Granted, my follow-up prompt could have been clearer -- something like "Not whale, I meant what" -- but I kept it intentionally vague to see what would be spun out. It turns out ChatGPT can figure out a vague riddle pretty easily. Not that this riddle was complex, but it explained quite clearly why a penguin is not a mammal but is, in fact, a flightless bird, which was the answer I was actually looking for.

Since riddles seemed to pose no problem, I decided to proceed to a mathematical prompt. I started off slowly: "What is two divided by 2?" The mix of a digit and a spelled-out number was intentional, to see whether it would make a difference to the output. This sum was answered correctly, so I took it a step further with "eight by 3." The use of "by" was intentional, to see whether division would be assumed. Again, a correct reply.

Next, I asked it to "times it." Here it started getting slightly confused: it interpreted my prompt as asking for 8/3 to be multiplied by something else. I then said I would like the two to be multiplied (meaning 8 x 3). This slightly vague prompt was instead interpreted as multiplying the answer to 8/3 by the answer to 2/2. From there, I went simple and clear -- "Multiply 8 and 3" -- and the correct answer was given.
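For reference, the two readings of my vague "multiply" request genuinely diverge. A quick Python sketch (my own illustration, not taken from the chat transcript) makes the difference concrete:

```python
# Reading ChatGPT chose: multiply the answers to the two earlier sums.
chatgpt_reading = (8 / 3) * (2 / 2)

# Reading I intended: multiply the two original operands, 8 and 3.
intended_reading = 8 * 3

print(chatgpt_reading)   # roughly 2.67
print(intended_reading)  # 24
```

Both are defensible interpretations of "I would like the two to be multiplied," which is exactly why the follow-up prompt "Multiply 8 and 3" resolved the ambiguity.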

Once again, the online consensus that ChatGPT gets confused by simple math problems seemed to be proven wrong: across a few different sums, the correct answer was provided every time. When the prompt was vague, however, it did struggle to identify exactly which numbers I was referring to -- which, in retrospect, could happen in a conversation between humans as well, especially when pragmatic interpretation differs across classes and cultures.

Figure 2. Maths Sum Prompt

The last test I decided to delve into was tasking ChatGPT with counting the number of characters in a given sentence. A number of posts on the OpenAI community forum describe this as a significant issue for ChatGPT specifically, because the model processes text as tokens rather than individual words or letters, which throws off character counting. In 2023, a user reported that an incorrect character count was consistently generated, while in 2024, a user noted that the character count was applied incorrectly when summarizing and rewording a text. Both reports concerned GPT-4.

When I tested this same task, ChatGPT-5.3 generated incredibly accurate results. It even accounted for the difference in character counts when the trailing space at the end of the sentence was excluded.
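As a point of comparison, counting characters deterministically is trivial in ordinary code, which is what makes the older LLM failures notable. A minimal Python sketch (the sample sentence is my own, not from the chat transcript) shows the kinds of counts being checked, including the trailing-space distinction:

```python
def character_counts(sentence: str) -> dict:
    """Count characters in a sentence, with and without a trailing space."""
    return {
        "with_trailing_space": len(sentence),
        "without_trailing_space": len(sentence.rstrip(" ")),
        "letters_only": sum(ch.isalpha() for ch in sentence),
    }

# Example: a sentence with one trailing space at the end.
counts = character_counts("A penguin is not a whale. ")
print(counts)
```

An LLM, by contrast, never "sees" these individual characters at all -- only tokens -- so accurate counting has to be learned rather than computed.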

Figure 3. Counting Letters Prompt

Given that the previous complaints date from a few years back and a much older version, GPT-5.3's ability to count characters, letters, spaces and anything else within a sentence has clearly improved on the issues many people were facing. Of course, general-purpose programs like Word can count characters easily, so using an LLM for the job does not make much sense -- which may be why earlier models performed poorly here: there was little training toward, or demand for, doing this task well. As OpenAI continues to develop its ChatGPT models, many of the more commonly reported mistakes are becoming far less common, with continuous improvements to the user experience.


GPT-5.3, just recently launched, already aims to generate smoother conversations, deliver more accurate answers and return richer, better-contextualized results when searching the web. It also boasts stronger writing skills -- better range, texture and tone, and more expressive styles that can potentially rival a human creative writer (a mixed blessing for those of us in the writing space).

So, a blog that began with the intention of cataloging where ChatGPT easily makes mistakes has, unequivocally, become a blog about how those original "mistakes" and problem areas have significantly improved with the release of newer models. This, of course, does not mean the model is infallible; it simply means we have to keep using it attentively enough to notice what it does well and what still cannot replace the human touch (just yet).

Posted by Ammaarah Mohamed on 03/11/2026

