Talk is Cheap, Energy is Not: Towards a Green, Context-Aware Metrics Framework for Automatic Speech Recognition

Abstract

Automatic Speech Recognition (ASR) systems are increasingly deployed across diverse computing environments, from cloud servers to edge devices. While accuracy has traditionally been the primary evaluation metric, the inference efficiency of these systems, including energy consumption, memory usage, and hardware utilisation, significantly impacts their practical usability. This paper introduces a novel benchmarking framework that assesses ASR models during inference from both performance and sustainability perspectives. We introduce a multi-metric evaluation approach quantifying Word Error Rate (WER), Real-Time Factor (RTF), Energy Per Audio Second (EPAS), inference latency, GPU Memory Efficiency (GME), and Hardware Utilisation Rate (HUR). Our framework includes configurable weighting schemes tailored for various deployment scenarios: balanced general-purpose evaluation, resource-constrained environments, high-throughput batch inference, and real-time processing. To demonstrate the utility of the framework, we benchmark several state-of-the-art ASR architectures (Whisper, Wav2Vec2, HuBERT, WavLM, UniSpeech, and SpeechT5) in both FP16 and FP32 precision on NVIDIA Jetson AGX Orin hardware. The proposed methodology supports researchers and practitioners in making informed model selection decisions based on context-specific inference requirements. By illuminating performance–consumption trade-offs, the metrics framework can help to reduce computational costs and the carbon footprint of ASR systems, while maintaining acceptable accuracy.
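
As a rough illustration of how two of the metrics named above and a scenario weighting might be computed, the following Python sketch makes several assumptions that the abstract does not confirm. RTF is taken in its usual sense (processing time divided by audio duration), EPAS is read literally as joules per second of audio, and the weighted score is a plain weighted mean over metrics already normalised to [0, 1] with higher being better. The names `Run`, `real_time_factor`, `energy_per_audio_second`, and `weighted_score` are hypothetical, and the paper's own definitions and normalisation may differ.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One hypothetical inference measurement (field names are illustrative)."""
    audio_seconds: float      # duration of the audio that was transcribed
    inference_seconds: float  # wall-clock time spent on inference
    energy_joules: float      # energy drawn by the device during inference


def real_time_factor(run: Run) -> float:
    # RTF: processing time over audio duration; below 1.0 means faster than real time.
    return run.inference_seconds / run.audio_seconds


def energy_per_audio_second(run: Run) -> float:
    # EPAS, assumed here to mean joules consumed per second of audio processed.
    return run.energy_joules / run.audio_seconds


def weighted_score(normalised: dict[str, float], weights: dict[str, float]) -> float:
    # Assumed aggregation: weighted mean of metrics already scaled to [0, 1],
    # where higher is better. The weights stand in for a deployment scenario.
    total = sum(weights.values())
    return sum(weights[name] * normalised[name] for name in weights) / total
```

For example, a run that transcribes 10 s of audio in 2 s while drawing 50 J has an RTF of 0.2 and an EPAS of 5 J per audio second under these assumptions. Changing the weights passed to `weighted_score` is one way a scenario such as resource-constrained deployment could emphasise energy over latency.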

Category

Academic chapter

Language

English

Author(s)

Affiliation

  • SINTEF Digital / Sustainable Communication Technologies
  • RISE Research Institutes of Sweden
  • Telenor

Year

2026

Publisher

Springer

Book

Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track: European Conference, ECML PKDD 2025, Porto, Portugal, September 15–19, 2025, Proceedings, Part IX

ISBN

9783032061188

Page(s)

36–54
