Yes, this is a perfect example of what LLMs are currently able to do: it's less an "answer" and more "here's a reply that seems and sounds a lot like an answer." We ought not to be surprised at that deficiency; it's probably hardwired into the very means by which the "reply" is produced in the first place.
At a recent hype fest (for, I think, their AI chips?) the NVIDIA CTO started making big claims about how "AI" will soon "check" its answers far more robustly until "the truth" (emphasis on the "the") is arrived at, and only then give a reply to a user. Clearly the folks pushing these models are aware they need to provide at least the semblance of an answer.