An important aspect of the perceived quality of vocal music is the degree to which the vocalist sings in tune. Although most listeners seem sensitive to vocal mistuning, little is known about the development of this perceptual ability or how it differs between listeners. Motivated by a lack of suitable preexisting measures, we introduce in this article an adaptive and ecologically valid test of mistuning perception ability. The stimulus material consisted of short excerpts (6 to 12 s in length) from pop music performances (obtained from MedleyDB; Bittner et al., 2014) for which the vocal track was pitch-shifted relative to the instrumental tracks. In a first experiment, 333 listeners completed a two-alternative forced-choice task that required them to discriminate between a pitch-shifted and an unaltered version of the same audio clip. Explanatory item response modeling was then used to calibrate an adaptive version of the test. A subsequent validation experiment applied this adaptive test to 66 participants with a broad range of musical expertise, producing evidence of the test’s reliability, convergent validity, and divergent validity. The test is ready to be deployed as an experimental tool and should make an important contribution to our understanding of the human ability to judge mistuning.