What are the false negative rate and the total rates? Without those we are missing too much. If the false negative rate (saying it's fine when it isn't) is high, then the whole thing is useless. If the total cases are a few hundred (either CSAM isn't a problem, or those doing it use other platforms because they know they will be caught on these), I don't care much that some are false positives; odds are it didn't get me.
Sure you can; random sampling should work. Don't just go making things up.
Of course, actually carrying out that experiment would be absurd, since I don't think anyone expects an appreciable percentage of clearnet material to be CSAM. The working assumption is that the goal is to find a needle in a haystack, so GP's objection about needing to know the false negative rate is misguided.
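To put rough numbers on why random sampling gets impractical when the target is rare, here is a minimal sketch. The prevalence of 1 in 1,000,000 is a made-up figure purely for illustration, not a measurement, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def expected_positives(sample_size: int, prevalence: Fraction) -> Fraction:
    """Expected number of true positives in a uniform random sample."""
    return sample_size * prevalence


def sample_size_for(positives_wanted: int, prevalence: Fraction) -> int:
    """Smallest sample whose expected true-positive count reaches the target."""
    return math.ceil(positives_wanted / prevalence)


# ASSUMPTION: illustrative prevalence only (1 item in 1,000,000).
prevalence = Fraction(1, 1_000_000)

# A 10,000-item random sample is expected to contain 1/100 of a positive,
# i.e. almost certainly none, so it says nothing about false negatives.
print(expected_positives(10_000, prevalence))

# Expecting ~100 positives (enough to estimate a miss rate at all)
# would take a sample of 100,000,000 items under this assumption.
print(sample_size_for(100, prevalence))
```

Under a different assumed prevalence the numbers shift proportionally, but the shape of the problem stays the same: the rarer the target, the larger the random sample has to be before a false negative rate can be estimated from it.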
I expect the equivalent of the FBI is investigating this using other sources, and so has plenty of data without needing to randomly sample any non-suspect conversations. CSAM has been a problem since before computers.