Multimodal Digital Oral History
A Digital Oral History Lab Project · Active R&D

MDOH.


Developing methodology and technical workflows for active engagement with the oral, aural and sonic affordances of digital oral history collections.

Methodology: Multimodal analysis (post-transcript)
Speech-to-text: ASR pipeline (WhisperX)
Modalities: Transcript · sound · waveform · metadata
Current focus: AI laughter & paralinguistic detection
01

About the Project

The Multimodal Digital Oral History project is developing a methodology and technical workflow for active engagement with the oral, aural and sonic affordances of both retro-digitised and born-digital oral history collections — across modalities including transcript, sound, waveform, spectrogram and metadata.

We are currently working on detecting laughter in oral history interviews, using AI tools in combination with non-digital approaches. Laughter is a compelling example of the paralinguistic elements, alongside hesitations, vocal quirks and pacing, that manual transcripts typically omit.

Modal flow: Transcript · Sound · Waveform · Metadata
Beyond the transcript

The full communicative richness of oral testimony — laughter, hesitation, pacing, timbre — lives in the modalities that manual transcripts cannot capture.

02

Methodology

Our approach encompasses multiple modalities of engagement with oral history materials, moving beyond traditional text-based analysis to incorporate the full spectrum of sonic and multimodal information contained within digital oral history collections.

We work across transcript, sound, waveform and metadata to develop methods that preserve and analyse oral historical records in both their content and their form.

03

Technical Innovation

AI routines enable the study of vocal features at the level of individual interviews and across larger collections. By combining computational approaches with traditional oral history methodologies, we identify and analyse patterns that would be impossible to detect through manual transcription alone.

This work represents a significant advancement in digital humanities methodology, opening new possibilities for understanding the full communicative richness of oral historical materials.
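To make the move from individual interviews to larger collections concrete, the sketch below computes a laughter rate (events per minute) for each interview and for a collection as a whole. It is a minimal illustration under stated assumptions, not the project's implementation: the record layout, the function name `laughter_rate` and the sample values are all hypothetical.

```python
def laughter_rate(events, duration_s):
    """Return laughter events per minute of audio."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return len(events) / (duration_s / 60.0)

# Hypothetical collection: durations and event times are in seconds
# and are invented for illustration only.
collection = {
    "interview_a": {
        "duration_s": 1800.0,
        "laughter": [
            {"start": 12.0, "end": 13.1},
            {"start": 640.5, "end": 642.0},
            {"start": 1502.3, "end": 1503.0},
        ],
    },
    "interview_b": {
        "duration_s": 600.0,
        "laughter": [
            {"start": 45.0, "end": 46.2},
            {"start": 310.0, "end": 311.5},
        ],
    },
}

# Interview-level view: one rate per recording.
per_interview = {
    name: laughter_rate(rec["laughter"], rec["duration_s"])
    for name, rec in collection.items()
}

# Collection-level view: pool events and durations before dividing,
# so that long interviews are weighted by their length.
total_events = sum(len(rec["laughter"]) for rec in collection.values())
total_duration_s = sum(rec["duration_s"] for rec in collection.values())
collection_rate = total_events / (total_duration_s / 60.0)

for name, rate in per_interview.items():
    print(f"{name}: {rate:.3f} laughs/min")
print(f"collection: {collection_rate:.3f} laughs/min")
```

Pooling events and durations, rather than averaging the per-interview rates, is one design choice among several; which is appropriate depends on the research question.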

Fig. 01 Workflow pipeline: WAV audio through WhisperX, waveform and laughter detection to TransVisEd, with outputs in .json and .docx. The transcript visualiser and editor is in active development.
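As an illustration of how the later stages of Fig. 01 might fit together, the sketch below attaches detected laughter events to the transcript segments they overlap and serialises the result as JSON. It is a minimal sketch under stated assumptions, not the project's implementation: the segment fields (`start`, `end`, `text`) follow the shape that WhisperX-style ASR output commonly takes, while the laughter event format, the function name `attach_laughter` and the sample data are hypothetical.

```python
import json

def attach_laughter(segments, laughter_events):
    """Return copies of the transcript segments, each annotated with the
    laughter events whose time span overlaps that segment."""
    annotated = []
    for seg in segments:
        overlapping = [
            ev for ev in laughter_events
            # Two intervals overlap when each starts before the other ends.
            if ev["start"] < seg["end"] and ev["end"] > seg["start"]
        ]
        annotated.append({**seg, "laughter": overlapping})
    return annotated

# Hypothetical sample data (times in seconds), invented for illustration.
segments = [
    {"start": 0.0, "end": 4.2, "text": "We moved there in the spring."},
    {"start": 4.2, "end": 9.0, "text": "And the roof leaked the very first night."},
]
laughter_events = [{"start": 8.1, "end": 9.6}]

annotated = attach_laughter(segments, laughter_events)
print(json.dumps(annotated, indent=2))
```

A laughter event that runs past the end of a segment is still attached to it, since only overlap is tested. Writing the .docx output would need a third-party library and is left out of this sketch.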
04

Project Resources

Three entry points — the active research thread on AI-based laughter detection, the foundational publication, and the broader publication archive.

R / 01

AI Laughter Detection

Active research on detecting laughter in oral history interviews using AI tools — surfacing the vocal elements typically omitted from manual transcripts.

 
Type: Research thread
Status: In development
R / 02

Foundational Article

Smyth, Nyhan & Flinn (2023), published in Digital Scholarship in the Humanities: the intellectual basis of our approach.

 
Type: Peer-reviewed article
Year: 2023
R / 03

Publications & Outputs

Browse the project's broader collection of articles, conference papers and reports through the lab's publication archive.

 
Type: Research output
Source: Lab repository