Without proof it really is useless except to them if they've got one, but why would they keep the proof secret?

The point is, even with proof of correctness the test is useless in practice.
In theory I can do a rigorous 'one-iteration' test of any M(p) by simply feeding it to e.g. the Pari 'factor' command. In practice, once p gets larger than a few hundred (i.e. once M(p) is more than a few hundred bits long), the needed runtime becomes impractically large.
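
To make that concrete, here is a minimal Python sketch of a "test by factoring". It is only an illustration: plain trial division stands in for Pari's far more sophisticated 'factor', and the function names are made up for this post. The result is rigorous, but the cost grows with the square root of M(p), so the runtime roughly doubles every time p increases by 2.

```python
def trial_factor(n):
    """Return the prime factors of n (n >= 2) in ascending order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        # After 2, only odd candidates need to be tried.
        d += 1 if d == 2 else 2
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors.append(n)
    return factors


def mersenne_is_prime_by_factoring(p):
    """Hypothetical helper: M(p) = 2^p - 1 is prime iff its only prime factor is itself."""
    m = (1 << p) - 1
    return trial_factor(m) == [m]


if __name__ == "__main__":
    for p in (11, 13, 31):
        print(p, mersenne_is_prime_by_factoring(p))
```

With this naive method p = 31 is still quick, but p = 61 already needs on the order of 10^9 trial divisions. Real factoring algorithms push the wall out a good deal further, but they hit the same kind of wall.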