The Pitch Imagery Arrow Task (PIAT) was designed to induce and evaluate pitch imagery in participants with a range of musical backgrounds (Gelding, Thompson, & Johnson, 2015). However, the original version of the task is long and inefficient. The present three-part study therefore aimed to enhance its validity and reliability using modern psychometric techniques, including Item Response Theory (IRT) and computerized adaptive testing (Harrison, Collins, & Müllensiefen, 2017). First, in an exploratory study, 115 participants completed the original PIAT. The data were modelled using generalized linear mixed effects models to determine the main predictors of item difficulty. A new item bank (N = 3000 items) was then created that systematically varied these aspects of musical structure to manipulate the perceptual difficulty of items. Second, a calibration study tested a second participant sample (N = 243), in which each participant received 30 items randomly selected from the item bank. Generalized mixed effects modelling showed that performance on the PIAT was best predicted by the ability to maintain and manipulate tones in mental imagery, as well as to resist perceptual biases that can lead to incorrect responses. Third, scores on an adaptive version of the PIAT (aPIAT), completed by a new sample (N = 147), showed moderate to strong positive correlations with measures of non-musical and musical working memory, self-reported musical training and general musical sophistication. These results demonstrate that the new aPIAT is a short and efficient test of musical imagery ability that is related to, yet not identical to, established measures of working memory.