Measuring the quality of generative AI systems: Mapping metrics to quality characteristics - Snowballing literature review
Ericsson AB, Blekinge, Sweden; Blekinge Institute of Technology, Karlskrona, Sweden.
Blekinge Institute of Technology, Karlskrona, Sweden.
Örebro University, Örebro University School of Business. ORCID iD: 0000-0002-0311-1502
Fortiss, Munich, Bavaria, Germany; Blekinge Institute of Technology, Karlskrona, Sweden.
2025 (English). In: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025, Vol. 186, article id 107802. Article, review/survey (Refereed). Published
Abstract [en]

Context: Generative Artificial Intelligence (GenAI) and the use of Large Language Models (LLMs) have revolutionized tasks that previously required significant human effort, which has attracted considerable interest from industry stakeholders. This growing interest has accelerated the integration of AI models into various industrial applications. However, model integration introduces challenges to product quality, as conventional quality measuring methods may fail to assess GenAI systems. Consequently, evaluation techniques for GenAI systems need to be adapted and refined. Examining the current state and applicability of evaluation techniques for GenAI system outputs is essential.

Objective: This study aims to explore the current metrics, methods, and processes for assessing the outputs of GenAI systems and the potential for risky outputs.

Method: We performed a snowballing literature review to identify metrics, evaluation methods, and evaluation processes from 43 selected papers.

Results: We identified 28 metrics and mapped these metrics to four quality characteristics defined by the ISO/IEC 25023 standard for software systems. Additionally, we discovered three types of evaluation methods to measure the quality of system outputs and a three-step process to assess faulty system outputs. Based on these insights, we suggested a five-step framework for measuring system quality while utilizing GenAI models.

Conclusion: Our findings present a mapping that visualizes candidate metrics to be selected for measuring quality characteristics of GenAI systems, accompanied by step-by-step processes to assist practitioners in conducting quality assessments.

Place, publisher, year, edition, pages
Elsevier, 2025. Vol. 186, article id 107802
Keywords [en]
Generative AI, GenAI, Large language model, LLM, Quality characteristics, Metric, Evaluation
National subject category
Computer Science
Identifiers
URN: urn:nbn:se:oru:diva-122507
DOI: 10.1016/j.infsof.2025.107802
ISI: 001519902000001
Scopus ID: 2-s2.0-105008505516
OAI: oai:DiVA.org:oru-122507
DiVA, id: diva2:1985311
Note

We acknowledge support from the KKS Foundation through S.E.R.T. Research Profile Project (research profile grant 20180010) at Blekinge Institute of Technology.

Available from: 2025-07-23. Created: 2025-07-23. Last updated: 2026-01-23. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Person

Chatzipetrou, Panagiota
