HyperEmbed: Tradeoffs between resources and performance in NLP Tasks with hyperdimensional computing enabled embedding of n-gram statistics
EISLAB, Luleå University of Technology, Luleå, Sweden. ORCID iD: 0000-0002-6785-4356
Department of Computer Science, ETH Zürich, Zürich, Switzerland.
UC Berkeley, Berkeley, USA; Research Institutes of Sweden, Kista, Sweden. ORCID iD: 0000-0002-6032-6155
DCC, Luleå University of Technology, Luleå, Sweden. ORCID iD: 0000-0003-0069-640X
2021 (English). In: 2021 International Joint Conference on Neural Networks (IJCNN): Proceedings, IEEE, 2021. Conference paper, published paper (refereed).
Abstract [en]

Recent advances in deep learning have led to significant performance increases on several NLP tasks; however, the models are becoming ever more computationally demanding. This paper therefore tackles the domain of computationally efficient algorithms for NLP tasks. In particular, it investigates distributed representations of the n-gram statistics of texts. The representations are formed using a hyperdimensional computing enabled embedding. These representations then serve as features that are used as input to standard classifiers. We investigate the applicability of the embedding on one large and three small standard datasets for classification tasks using nine classifiers. The embedding achieved on-par F1 scores while decreasing the time and memory requirements by several times compared to conventional n-gram statistics; e.g., for one of the classifiers on a small dataset, the memory reduction was 6.18 times, while the train and test speed-ups were 4.62 and 3.84 times, respectively. For many classifiers on the large dataset, the memory reduction was ca. 100 times and the train and test speed-ups were over 100 times. Importantly, the use of distributed representations formed via hyperdimensional computing breaks the strict dependency between the dimensionality of the representation and the n-gram size, thus opening room for tradeoffs.
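
To make the abstract's idea concrete, below is a minimal, informal Python sketch of one common way such an embedding can be formed; it is an illustration of the general hyperdimensional computing technique, not the authors' implementation. The dimensionality DIM, the alphabet, and the permutation-based binding scheme are all illustrative assumptions: each symbol is assigned a random bipolar hypervector, an n-gram is encoded by binding position-permuted symbol vectors, and all n-gram vectors are bundled by addition into a single fixed-size feature vector.

    # Hypothetical sketch of HD-computing embedding of n-gram statistics.
    # DIM, N, the alphabet, and the binding scheme are assumptions for
    # illustration, not taken from the paper.
    import numpy as np

    DIM = 1000  # hypervector dimensionality (the tunable tradeoff knob)
    N = 3       # n-gram size

    rng = np.random.default_rng(0)
    # One random bipolar (+1/-1) hypervector per symbol of the alphabet.
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    item_memory = {ch: rng.choice([-1, 1], size=DIM) for ch in alphabet}

    def embed(text: str) -> np.ndarray:
        """Bundle all n-grams of `text` into one DIM-dimensional vector."""
        acc = np.zeros(DIM)
        for i in range(len(text) - N + 1):
            # Bind the n-gram's symbols: position j is encoded by
            # cyclically shifting (permuting) the symbol's hypervector
            # j steps, then combining by elementwise multiplication.
            gram = np.ones(DIM)
            for j, ch in enumerate(text[i:i + N]):
                gram *= np.roll(item_memory[ch], j)
            acc += gram  # bundling: superposition of all n-gram vectors
        return acc

    features = embed("hyperdimensional computing")
    print(features.shape)  # (1000,) regardless of alphabet or n-gram size

Because the embedding size is fixed at DIM whatever the n-gram order, the resource/performance tradeoff discussed in the abstract can be tuned directly through DIM rather than growing exponentially with n.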

Place, publisher, year, edition, pages
IEEE, 2021.
Series
Proceedings of the International Joint Conference on Neural Networks, ISSN 2161-4393, E-ISSN 2161-4407
Keywords [en]
hyperdimensional computing, n-gram statistics, intent classification, embedding
National Category
Computer Sciences
Research subject
Machine Learning; Dependable Communication and Computation Systems
Identifiers
URN: urn:nbn:se:oru:diva-116020
DOI: 10.1109/IJCNN52387.2021.9534359
ISI: 000722581708054
Scopus ID: 2-s2.0-85108654382
ISBN: 9780738133669 (print)
OAI: oai:DiVA.org:oru-116020
DiVA id: diva2:1900202
Conference
The International Joint Conference on Neural Networks (IJCNN 2021), virtual, July 18-22, 2021
Funder
EU, Horizon 2020, 839179
Note

The work of DK was supported by the European Union's Horizon 2020 Research and Innovation Programme under the Marie Sklodowska-Curie Individual Fellowship Grant Agreement 839179, and in part by DARPA's VIP (Super-HD Project) and AIE (HyDDENN Project) programs.

Available from: 2024-09-23. Created: 2024-09-23. Last updated: 2024-09-23. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Alonso, Pedro; Kleyko, Denis; Osipov, Evgeny; Liwicki, Marcus
