There are at least two distinct workloads for ChatGPT: 1) no-context, general-knowledge workloads, and 2) contextual workloads using either RAG or direct input of context.
In general, GPT will not hallucinate or give bad responses in the latter case, since it is working from a specific corpus of information (whether retrieved via RAG or provided directly by the user). This is not foolproof, of course: quality still depends on a good prompt and on the nature of the question (GPT is notoriously poor at math, for example). But for summarization it is exceedingly good.
The former is where it tends to hallucinate and generate bad responses.
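To make the contextual case concrete, here is a minimal sketch of the "direct context" workload: hand the model a document and ask it to summarize only from that document. It assumes the official openai Python package; the model name and the report.txt file are just illustrative placeholders, not anything from the original discussion.

    # Direct-context summarization: the model works from supplied text,
    # not from its general knowledge, which is what keeps hallucination low.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    with open("report.txt") as f:  # hypothetical source document
        document = f.read()

    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Summarize only from the provided document. "
                        "If something is not in the document, say so."},
            {"role": "user", "content": document},
        ],
    )

    print(response.choices[0].message.content)

A RAG setup works the same way at the final step; the only difference is that the context is retrieved from a corpus instead of pasted in by the user.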
For my part, I know this firsthand because I tried it myself and then gave both the source content and the generated report to my father, an orthopaedic trauma surgeon; my cousin, a radiologist and surgeon; and my other cousin, a cardiovascular surgeon, each for a different case.
They showed varying degrees of enthusiasm (my father was the most thrilled), but all remarked that it was accurate. Of course, I did this for a laugh, since I can access the appropriate professional on demand.