Another source of measurement error is the sine wave generator, which needs some form of speaker. Speaker distortion is an order of magnitude greater than the mic capsule's THD, so we end up measuring both, with each contributing to the error.
Once we put all our confidence in listening tests alone, ridiculing formal test procedures, we are tempted to pretend the speakers or headphones for listening aren't there. They seem a bit like the elephant in the room...
Even with a speaker as the source, with its many distortions, two microphones can be used at once in the same coupler. If the speaker's distortions appear identically in both microphone outputs, and a different speaker produces a different set of distortions that again appear identically in both mics, then it is quite valid to assume the distortions originate from the speakers, not the mics. Since those distortions are now known, they can be subtracted from the result, revealing just the distortions of the mics.
"...the test arrangement as per your recipe.""The model you propose..."
The true "Elephant in the Room" as you call it, is your assumption that it makes sense to completely disregard the listener's (subjective, unreliable) judgement...
(...)if a potential customer comes to you, a microphone expert, and asks you to test his U87 mic whether it is still operating within factory tolerances, what do you do?
How do you test it? If you do not have the testing facilities, do you send it off to a testing facility?
Your belief in scientific diagnostics* of defective microphones may stem from your personal unfamiliarity and inexperience with microphones...
With mic calibration work, a special speaker called an "electrostatic actuator" is often used. It's really a high quality condenser microphone used as a tiny speaker in a special closed coupler arrangement. Apparently it has the least distortion possible in a speaker or driver.
The "electrostatic actuator" isn't an acoustic sound source of any type, but a grid that can be mounted close to the diaphragm of a measurement microphone. It excites the mic by electrostatic force, then a (theoretical) correction curve for the sound pressure (treble) boost caused by the physical dimensions of the mic is applied. The result is the calibrated frequency response published by the manufacturer.
I don't think that harmonic distortion measurement makes sense this way (I might be wrong). For distortion measurement on mics the difference-frequency (differential) method is used. Two sine waves of different frequencies from two different sound sources are played at the same time and "mixed" acoustically. The microphone produces difference-frequency (not harmonic) distortion products that are not present in the individual sound sources, which produce harmonic distortion only. With exciting frequencies of, e.g., 10 kHz from speaker one and 11 kHz from speaker two, the mic's nonlinearity produces, e.g., 11 kHz - 10 kHz = 1 kHz. Harmonic distortion of the sound sources doesn't matter this way. This method can be used to measure extremely low electrical distortion too, as the (then two) generators don't need to be very low distortion types.
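To make the 10 kHz / 11 kHz example concrete, here is a minimal sketch in plain Python. Everything beyond the two frequencies is an assumption for illustration: the sample rate, the 5 % second harmonic given to each source, and the hypothetical `mic` model with a small quadratic term `k2`. A real capsule's nonlinearity will not be a clean square law, so read this as a demonstration of the principle, not a model of any particular microphone.

```python
import math

FS = 96_000               # sample rate in Hz (assumed)
N = 9_600                 # 0.1 s, so every frequency of interest fits whole cycles
F1, F2 = 10_000, 11_000   # the two excitation tones from the post

def tone_level(signal, freq):
    """Amplitude of one frequency component, via a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * n / FS) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / FS) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

def speaker(freq, h2=0.05):
    """Deliberately poor source: fundamental plus 5 % second harmonic (assumed)."""
    return [math.sin(2 * math.pi * freq * n / FS)
            + h2 * math.sin(2 * math.pi * 2 * freq * n / FS)
            for n in range(N)]

def mic(pressure, k2=0.01):
    """Hypothetical microphone with a small quadratic nonlinearity."""
    return [p + k2 * p * p for p in pressure]

# The two sources add linearly in the air; only the mic mixes them.
acoustic_sum = [a + b for a, b in zip(speaker(F1), speaker(F2))]
mic_out = mic(acoustic_sum)

diff_in_air = tone_level(acoustic_sum, F2 - F1)   # 1 kHz before the mic
diff_in_mic = tone_level(mic_out, F2 - F1)        # 1 kHz after the mic

print(f"1 kHz in the acoustic sum: {diff_in_air:.2e}")
print(f"1 kHz at the mic output:   {diff_in_mic:.2e}")
```

In this sketch the sources' own harmonics land at 20 kHz and 22 kHz, so nothing appears at 1 kHz until the signal passes through the nonlinear `mic`, where the 1 kHz product comes out at about `k2` times the tone amplitude. That is why the harmonic distortion of the two sources drops out of the measurement.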
On rereading 6.6 I now see Nedzelnitzky makes it plain. The relevant section is 6.2 where Nedzelnitzky does mention using a reference microphone as a sound source.
The heterodyne principle, isn't it?
There are papers from the AES Journal on this subject of THD of condenser capsules. I recall one from 20-odd years ago exploring the effects of loading on the THD results. Lower value loading resistors, like the 200 megohms used in older tube mics, will generate capsule THD in the lower frequencies. The article determined that a load of 10 gigohms avoided the low frequency distortion many find subjectively pleasing on vocals. The 10 gigohm load lowered the capsule THD to 0.001% at 50 Hz.
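One thing that can be worked out without the article is where the capsule capacitance and the load resistor put their high-pass corner, f = 1 / (2πRC). The sketch below compares the two load values mentioned above. The 50 pF capsule capacitance and the name `corner_hz` are assumptions for illustration, and the post does not say whether this corner is what drove the THD figures in the article.

```python
import math

def corner_hz(load_ohms, capsule_farads):
    """High-pass corner of the capsule capacitance into its load resistor."""
    return 1.0 / (2 * math.pi * load_ohms * capsule_farads)

CAPSULE_F = 50e-12   # assumed capsule capacitance (50 pF), not from the post

for load in (200e6, 10e9):   # 200 megohms and 10 gigohms
    print(f"{load:.0e} ohm load: corner at {corner_hz(load, CAPSULE_F):.2f} Hz")
```

With that assumed capacitance the 200 megohm load puts the corner near 16 Hz, fairly close to the bottom of the audio band, while the 10 gigohm load pushes it down to roughly 0.3 Hz.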
... interconnects all sound different, yet that difference cannot be fully explained ...... Here, again, and until further notice, ears are trump.
If I can hear it but cannot measure it, I am measuring the wrong thing.