Can we automate data science?
Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium. ORCID iD: 0000-0002-6860-6303
2016 (English). In: European Data Science Conference: November 07-08, 2016 in Luxembourg, 2016, p. 38-38. Conference paper, Oral presentation with published abstract (Other academic)
Abstract [en]

AI has been successful in automating scientific reasoning processes, for example in the life sciences (with the Robot Scientists). The question that I want to ask is whether it is possible to automate the processes involved in data science. I also want to answer that question in the course of our ERC AdG project SYNTH on “Synthesising Inductive Data Models”.

To start the discussion on this topic, it is useful to look at the famous knowledge discovery cycle, where one typically starts from raw data, selects and pre-processes the data, identifies the data mining task, applies the right data mining algorithms, and then interprets the results and possibly iterates. It turns out that most existing approaches to automating this process, such as the automated statistician, meta-learning, and algorithm portfolio and configuration approaches, assume that the learning task is known and that we only need to identify the right algorithm and parameters to solve it optimally. It is well known in the data mining community that this step typically takes only about 20% of the time, while pre-processing and task identification take the remaining 80%.

The question that I am interested in is what we can do to automate the pre-processing and task identification aspects, particularly for non-experts in data science.

Place, publisher, year, edition, pages
2016. p. 38-38
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:oru:diva-97316
OAI: oai:DiVA.org:oru-97316
DiVA, id: diva2:1635972
Conference
The European Data Science Conference (EDSC 2016), Luxembourg, Luxembourg, November 7-8, 2016
Available from: 2022-02-08 Created: 2022-02-08 Last updated: 2022-08-09 Bibliographically approved

Authority records

De Raedt, Luc