As multimodality has expanded as a field, becoming more diverse and complex, it is important to pause and identify exactly which concepts, theories and processes of multimodal analysis are more or less suited to the needs of critical discourse analysis (CDA) and the wider field of critical discourse studies (CDS). I argue that the field of multimodality remains fragmented both internally, with a range of divergent core interests, and externally, in its separation from academic fields that have long dealt with the topics to which it is now turning. Drawing on some key ideas from visual studies, I reflect on what kind of multimodal approach best aligns with the needs of CDS. I make the case for an approach that is affordance-based and driven by the social, not by a need to model on the basis of language.