Örebro University Publications (oru.se)
Language Models to Support Multi-Label Classification of Industrial Data
Blekinge Institute of Technology, Karlskrona, Sweden.
Blekinge Institute of Technology, Karlskrona, Sweden.
Blekinge Institute of Technology, Karlskrona, Sweden.
University College Dublin, Dublin, Ireland and CNR - ISTI, Pisa, Italy.
2025 (English). In: 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER): Proceedings, IEEE Computer Society, 2025, p. 45-55. Conference paper, Published paper (Refereed)
Abstract [en]

Background: Multi-label requirements classification is an inherently challenging task, especially when dealing with numerous classes at varying levels of abstraction. The task becomes even more difficult when only a limited number of requirements are available to train a supervised classifier. Zero-shot learning does not require training data and can potentially address this problem.

Objective: This paper investigates the performance of zero-shot classifiers on a multi-label industrial dataset. The study focuses on classifying requirements according to a hierarchical taxonomy designed to support requirements tracing.

Method: We compare multiple variants of zero-shot classifiers using different embeddings, including 9 language models (LMs) with a reduced number of parameters (up to 3B), e.g., BERT, and 5 large LMs (LLMs) with a large number of parameters (up to 70B), e.g., Llama. Our ground truth includes 377 requirements and 1968 labels from 6 output spaces. For the evaluation, we adopt traditional metrics, i.e., precision, recall, F1, and F-beta, as well as a novel label distance metric D-n, which aims to better capture the hierarchical nature of the classification and to provide a more nuanced evaluation of how far the results are from the ground truth.
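The embedding-based zero-shot setup described above can be illustrated with a minimal sketch: embed the requirement and each candidate label, assign every label whose similarity clears a threshold, and score the predicted label set with F-beta. Note that `embed()` below is a toy deterministic stand-in for a real LM encoder (e.g., BERT or T5), and the threshold and beta values are illustrative assumptions, not the paper's actual configuration; the D-n metric is not reproduced here since its definition is specific to the paper's taxonomy.

```python
import math

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy deterministic bag-of-words 'embedding'.

    A real setup would call an LM encoder (e.g., BERT or T5); this
    stand-in exists only to make the sketch runnable end to end.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        # Position-weighted character sum: a crude, deterministic hash.
        bucket = sum((i + 1) * ord(c) for i, c in enumerate(token)) % dim
        vec[bucket] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def zero_shot_labels(requirement: str, label_space: list[str],
                     threshold: float = 0.3) -> set[str]:
    """Assign every label whose embedding is close enough to the requirement."""
    req_vec = embed(requirement)
    return {lab for lab in label_space
            if cosine(req_vec, embed(lab)) >= threshold}

def f_beta(predicted: set[str], actual: set[str], beta: float = 2.0) -> float:
    """Standard F-beta over label sets; beta > 1 weights recall over precision."""
    tp = len(predicted & actual)
    if tp == 0:
        return 0.0
    p, r = tp / len(predicted), tp / len(actual)
    return (1 + beta**2) * p * r / (beta**2 * p + r)

# Hypothetical label space and requirement, for illustration only.
labels = ["user authentication", "data logging", "network security"]
pred = zero_shot_labels("The system shall log all user authentication events", labels)
print(sorted(pred), round(f_beta(pred, {"user authentication", "data logging"}), 2))
```

Because no labeled training data enters `zero_shot_labels`, new labels can be added to the output space at any time, which is what makes the approach attractive when few annotated requirements exist.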

Results: 1) The top-performing model on 5 out of 6 output spaces is T5-xl, with maximum F-beta = 0.78 and D-n = 0.04, while BERT base outperformed the other models in one case, with maximum F-beta = 0.83 and D-n = 0.04. 2) LMs with fewer parameters produce better classification results than LLMs; addressing the problem in practice is therefore feasible, as limited computing power is needed. 3) The model architecture (autoencoding, autoregressive, and sequence-to-sequence) significantly affects the classifier's performance.

Contribution: We conclude that using zero-shot learning for multi-label requirements classification offers promising results. We also present a novel metric that can be used to select the top-performing model for this problem.

Place, publisher, year, edition, pages
IEEE Computer Society, 2025, p. 45-55
Series
IEEE International Conference on Software Analysis Evolution and Reengineering, ISSN 1534-5351, E-ISSN 2640-7574
Keywords [en]
multi-label, requirements classification, taxonomy, language models
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-122592
DOI: 10.1109/SANER64311.2025.00013
ISI: 001506888600005
ISBN: 9798331535100 (electronic)
ISBN: 9798331535117 (print)
OAI: oai:DiVA.org:oru-122592
DiVA, id: diva2:1986449
Conference
2025 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Montreal, Canada, March 4-7, 2025
Available from: 2025-07-31. Created: 2025-07-31. Last updated: 2025-07-31. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records

Chatzipetrou, Panagiota
