Perceptron Theory Can Predict the Accuracy of Neural Networks
Redwood Center for Theoretical Neuroscience, University of California at Berkeley, Berkeley CA, USA; Intelligent Systems Laboratory, Research Institutes of Sweden, Kista, Sweden. ORCID iD: 0000-0002-6032-6155
Department of Information Engineering, Electronics and Telecommunications, University of Rome “La Sapienza”, Rome, Italy.
Neuromorphic Computing Laboratory, Intel Labs, Santa Clara CA, USA.
Department of Information Engineering, Electronics and Telecommunications, University of Rome “La Sapienza”, Rome, Italy.
2024 (English)
In: IEEE Transactions on Neural Networks and Learning Systems, ISSN 2162-237X, E-ISSN 2162-2388, Vol. 35, no 7, p. 9885-9899
Article in journal (Refereed) Published
Abstract [en]

Multilayer neural networks set the current state of the art for many technical classification problems. However, these networks remain essentially black boxes when it comes to analyzing them and predicting their performance. Here, we develop a statistical theory for the one-layer perceptron and show that it can predict the performance of a surprisingly large variety of neural networks with different architectures. A general theory of classification with perceptrons is developed by generalizing an existing theory for analyzing reservoir computing models and connectionist models for symbolic reasoning known as vector symbolic architectures. Our statistical theory offers three formulas that leverage the signal statistics with increasing detail. The formulas are analytically intractable but can be evaluated numerically. The description level that captures the most detail requires stochastic sampling methods. Depending on the network model, the simpler formulas already yield high prediction accuracy. The quality of the theory's predictions is assessed in three experimental settings: a memorization task for echo state networks (ESNs) from the reservoir computing literature, a collection of classification datasets for shallow randomly connected networks, and the ImageNet dataset for deep convolutional neural networks. We find that the second description level of the perceptron theory can predict the performance of types of ESNs that could not be described previously. Furthermore, the theory can predict the performance of deep multilayer neural networks by being applied to their output layer. While other methods for predicting neural network performance commonly require training an estimator model, the proposed theory requires only the first two moments of the distribution of the postsynaptic sums in the output neurons. Moreover, the perceptron theory compares favorably to other methods that do not rely on training an estimator model.
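The idea of predicting accuracy from the first two moments of the postsynaptic sums can be illustrated with a minimal sketch. It is not one of the three formulas from the article. It assumes, purely for illustration, that the postsynaptic sums of the output neurons are independent and Gaussian, that the correct-class neuron has mean mu_hit and standard deviation sigma_hit, and that all other neurons share mean mu_rej and standard deviation sigma_rej. The function name predicted_accuracy and all parameter names are hypothetical. Under these assumptions, the accuracy is the probability that the correct neuron's sum exceeds every other sum, which is evaluated numerically with a trapezoidal rule.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def predicted_accuracy(mu_hit, sigma_hit, mu_rej, sigma_rej,
                       num_classes, steps=4001, span=8.0):
    """Probability that the correct output neuron has the largest sum.

    Illustrative assumption (not taken from the article): all postsynaptic
    sums are independent Gaussians; the correct neuron has moments
    (mu_hit, sigma_hit) and the other num_classes - 1 neurons share
    (mu_rej, sigma_rej).
    """
    dz = 2.0 * span / (steps - 1)
    total = 0.0
    for i in range(steps):
        z = -span + i * dz                      # standardized value of the correct sum
        h = mu_hit + sigma_hit * z              # actual value of the correct sum
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        # Probability that every competing neuron stays below h.
        all_below = normal_cdf((h - mu_rej) / sigma_rej) ** (num_classes - 1)
        weight = 0.5 if i in (0, steps - 1) else 1.0  # trapezoidal end points
        total += weight * density * all_below
    return total * dz


if __name__ == "__main__":
    # Example with made-up moments for a 10-class output layer.
    print(predicted_accuracy(mu_hit=2.0, sigma_hit=1.0,
                             mu_rej=0.0, sigma_rej=1.0, num_classes=10))
```

With identical moments for all neurons the sketch returns chance level, 1 / num_classes, and for two classes it agrees with the closed form for the probability that one Gaussian exceeds another. In practice the moments would be estimated from the output-layer activations of the network under study.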

Place, publisher, year, edition, pages
IEEE, 2024. Vol. 35, no 7, p. 9885-9899
Keywords [en]
accuracy prediction, deep neural networks, hyperdimensional computing, perceptron theory, reservoir computing, vector symbolic architectures
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-116229
DOI: 10.1109/TNNLS.2023.3237381
ISI: 000935660500001
Scopus ID: 2-s2.0-85198676170
OAI: oai:DiVA.org:oru-116229
DiVA, id: diva2:1900287
Funder
EU, Horizon 2020, 839179
Note

The work of Friedrich T. Sommer was supported by the National Institutes of Health (NIH) under Grant R01-EB026955. The work of Denis Kleyko was supported in part by the European Union's Horizon 2020 Research and Innovation Program within the Marie Sklodowska-Curie under Grant 839179, in part by the Defense Advanced Research Projects Agency's (DARPA) VIP (Super-HD Project) and AIE (HyDDENN Project) Programs, and in part by the Air Force Office of Scientific Research (AFOSR) under Grant FA9550-19-1-0241. The work of Friedrich T. Sommer and Denis Kleyko was supported in part by Intel's THWAI Program.

Available from: 2024-09-23 Created: 2024-09-23 Last updated: 2024-09-23 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Kleyko, Denis
