The talk package uses Hugging Face transformers language models, natural language processing, and machine learning methods to analyse audio recordings of talk alongside numerical variables.
This Getting Started tutorial walks through some central talk functions.
In short, the workflow first transforms audio recordings into text (speech-to-text) and into embeddings, which can be used in downstream tasks, for example with the textTrain functions in the text package.
talkEmbed(): mapping audio recordings to numbers using a Hugging Face language model
The talkEmbed() function transforms audio recording files into embeddings. The example data used in this tutorial come from participants who described their harmony in life and satisfaction with life using a text response, 10 descriptive words, or rating scales.
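As a rough illustration of the embedding step described above, the sketch below calls talkEmbed() on a single recording. It assumes the talk package and its Python backend are already installed, that the file path can be passed as the first argument, and that "path/to/recording.wav" (a hypothetical placeholder) is replaced with a real audio file. The structure of the returned object is inspected rather than assumed.

```r
# Minimal sketch: embed one audio recording with talkEmbed().
# Assumptions: the talk package and its Python backend are installed,
# and the path below is a hypothetical placeholder for your own file.
library(talk)

audio_path <- "path/to/recording.wav"  # hypothetical; replace with a real file

if (file.exists(audio_path)) {
  # Passing the file path as the first argument is an assumption
  # about the function signature; check ?talkEmbed to confirm.
  embeddings <- talkEmbed(audio_path)

  # Inspect the returned object instead of assuming its shape.
  str(embeddings)
} else {
  message("No audio file found at: ", audio_path)
}
```

The resulting embeddings are the numeric representation that downstream functions, such as the textTrain functions in the text package, would take as input.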
Articles related to the talk-package
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning. Rajath Rao, Adithya V Ganesan, Oscar Kjell, Jonah Luby, Akshay Raghavan, Scott M. Feltman, Whitney Ringwald, Ryan L. Boyd, Benjamin J. Luft, Camilo J. Ruggero, Neville Ryant, Roman Kotov, H. Schwartz.