This is why I hate paying for user testing. People offer feedback that is unnatural or nitpicky.
Discussion
100%. If it's done at all, it should be on small sample sizes and as hands-off as possible (not observing the testers, for example).
en.wikipedia.org/wiki/Hawthorne_effect