Automatically Wrangling Spreadsheets into Machine Learning Data Formats
KU Leuven, Leuven, Belgium.
KU Leuven, Leuven, Belgium. ORCID iD: 0000-0002-6860-6303
2018 (English)
In: Advances in Intelligent Data Analysis XVII / [ed] Wouter Duivesteijn, Arno Siebes, Antti Ukkonen, Springer, 2018, p. 367-379
Conference paper, Published paper (Refereed)
Abstract [en]

To help automate the important pre-processing step in machine learning and data mining, we introduce synth-a-sizer, a tool for semi-automatically wrangling spreadsheets into attribute-value format so that they can be used by popular machine learning tools. The user only needs to mark the cells belonging to a single example. synth-a-sizer is based on inductive programming principles. We introduce synth-a-sizer’s transformations and search algorithm, as well as a heuristic and a distance measure for identifying types. We also report on a first experimental evaluation.

Place, publisher, year, edition, pages
Springer, 2018. p. 367-379
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11191
Keywords [en]
Data wrangling, Program synthesis, Spreadsheets, Preprocessing, Inductive programming
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-90770
DOI: 10.1007/978-3-030-01768-2_30
ISI: 000719688600030
Scopus ID: 2-s2.0-85055700293
ISBN: 978-3-030-01767-5 (print)
ISBN: 978-3-030-01768-2 (electronic)
OAI: oai:DiVA.org:oru-90770
DiVA, id: diva2:1540316
Conference
17th International Symposium on Intelligent Data Analysis (IDA 2018), ’s-Hertogenbosch, The Netherlands, October 24–26, 2018
Funder
EU, Horizon 2020, 694980
Available from: 2021-03-29 Created: 2021-03-29 Last updated: 2021-12-30 Bibliographically approved

Authority records

De Raedt, Luc