Quotations are used to report a broad range of events in addition to speech, and they often involve both vocal and bodily demonstration. The present study examined the use of quotation to report a variety of multisensory events (i.e., events containing salient visible and audible elements) as participants watched and then described a set of video clips that included human speech and animal vocalizations. We examined the relationship between demonstrations conveyed through the vocal and bodily modalities, comparing them across four common quotation devices (be like, go, say, and zero quotatives), as well as across direct and non-direct quotations and retellings. We found that direct quotations involved high levels of both vocal and bodily demonstration, while non-direct quotations involved lower levels in both channels. In addition, there was a strong positive correlation between vocal and bodily demonstration in direct quotations. This result supports a Multimodal Hypothesis, in which information from the two channels arises from one central concept.