Skip to content

Oscar Kjell

The talk-package uses Hugging Face transformers language models, natural language processing and machine learning methods to analyse audio recordings of talk and numerical variables.

This Getting Started tutorial is going through some central talk functions.

In short the workflow includes to first transform audio recordings into text (speech-to-text) as well as embeddings (which can be used in downstream tasks, for example using textTrain functions in the text-package.

talkEmbed(): mapping audio recordings to numbers using a HuggingFace language model

The talkEmbed() function transforms audio recording files to embeddings. The example data that will be used in this tutorial comes from participants that have described their harmony in life and satisfaction with life with a text response, 10 descriptive words or rating scales.


library(talk)

# work in progress

WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning Rajath Rao, Adithya V Ganesan, Oscar Kjell, Jonah Luby, Akshay Raghavan, Scott M. Feltman, Whitney Ringwald, Ryan L. Boyd, Benjamin J. Luft, Camilo J. Ruggero, Neville Ryant, ROMAN KOTOV, H. Schwartz

GitHub