Whether visual aids improve music learning remains a matter of debate. In a multisession study, we investigated the neural signatures of novel music sequence learning with or without visual aids (auditory-only: AO; audiovisual: AV). During three training sessions on three separate days, nonmusician participants reproduced, note by note on a keyboard, melodic sequences generated by an artificial musical grammar. The AV group (n = 20) saw each note color-coded on screen, whereas the AO group (n = 20) received no color cues. We evaluated learning of the statistical regularities of the novel musical grammar before and after training by presenting melodies ending on correct or incorrect notes and asking participants to judge the correctness and surprisal of the final note while EEG was recorded. Participants successfully learned the new grammar. Although the AV group reproduced longer sequences than the AO group during training, learning did not differ significantly between groups. At the neural level, after training, the AO group showed a larger N100 response to low-probability than to high-probability notes, suggesting increased neural sensitivity to the statistical properties of the grammar; this effect was not observed in the AV group. Our findings indicate that visual aids might improve sequence reproduction without necessarily promoting better learning, pointing to a potential dissociation between sequence reproduction and learning. We suggest that the difficulty induced by auditory-only input during music training might enhance cognitive engagement, thereby improving neural sensitivity to the underlying statistical properties of the learned material.