Learning Generative Image Manipulations from Language Instructions
Örebro University, School of Science and Technology (Center for Applied Autonomous Sensor Systems (AASS)). ORCID iD: 0000-0002-0579-7181
Örebro University, School of Science and Technology (Center for Applied Autonomous Sensor Systems (AASS)).
Örebro University, School of Science and Technology (Center for Applied Autonomous Sensor Systems (AASS)). ORCID iD: 0000-0002-3122-693X
2020 (English). Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

This paper studies whether a perceptual visual system can simulate human-like cognitive capabilities by training a computational model to predict the outcome of an action from a language instruction. The aim is to ground action words so that an AI can generate an output image showing the effect of a given action on a given object. The model's output is a synthetically generated image that demonstrates the effect the action has on the scene. This work combines an image encoder, a language encoder, a relational network, and an image generator to ground action words and then visualize the effect an action would have on a simulated scene. The focus of this work is on learning meaningful shared image and text representations for relational learning and object manipulation.
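
The four components named in the abstract can be wired together as in the following minimal sketch. It is not the authors' implementation: the weights are random and untrained, and every function name, the patch-based "object" encoding, the bag-of-words instruction encoding, and the layer sizes are assumptions made purely for illustration. It only shows how data might flow from an image and an instruction, through pairwise relational reasoning, to a generated image.

```python
import math
import random


def linear(rng, n_in, n_out):
    """Random, untrained weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def apply(weights, vec, relu=False):
    """Matrix-vector product, optionally followed by a ReLU."""
    out = [sum(w * x for w, x in zip(row, vec)) for row in weights]
    return [max(0.0, v) for v in out] if relu else out


def encode_image(image, patch=2):
    """Hypothetical image encoder: each flattened patch acts as one 'object'."""
    size = len(image)
    return [
        [image[r + dr][c + dc] for dr in range(patch) for dc in range(patch)]
        for r in range(0, size, patch)
        for c in range(0, size, patch)
    ]


def encode_instruction(text, vocab):
    """Hypothetical language encoder: bag-of-words counts over a fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]


def relate(objects, query, g_weights, f_weights):
    """Relational step: sum g(o_i, o_j, q) over all object pairs, then apply f."""
    total = [0.0] * len(g_weights)
    for a in objects:
        for b in objects:
            pair = apply(g_weights, a + b + query, relu=True)
            total = [t + p for t, p in zip(total, pair)]
    return apply(f_weights, total, relu=True)


def generate(code, gen_weights, size):
    """Hypothetical generator: linear map to pixels, squashed into [0, 1]."""
    # 0.5 * (1 + tanh(v / 2)) equals the logistic sigmoid and cannot overflow.
    flat = [0.5 * (1.0 + math.tanh(v / 2.0)) for v in apply(gen_weights, code)]
    return [flat[r * size:(r + 1) * size] for r in range(size)]


def predict_effect(image, instruction, vocab, hidden=8, seed=0):
    """Run the full untrained pipeline on a square grayscale image."""
    rng = random.Random(seed)
    size = len(image)
    objects = encode_image(image)
    query = encode_instruction(instruction, vocab)
    g_weights = linear(rng, 2 * len(objects[0]) + len(query), hidden)
    f_weights = linear(rng, hidden, hidden)
    gen_weights = linear(rng, hidden, size * size)
    code = relate(objects, query, g_weights, f_weights)
    return generate(code, gen_weights, size)


if __name__ == "__main__":
    scene = [[0.0] * 4 for _ in range(4)]
    scene[1][1] = 1.0  # a single bright "object" in the scene
    result = predict_effect(scene, "push the cube left", ["push", "cube", "left"])
    print(len(result), len(result[0]))
```

Because the weights are untrained, the output image is meaningless; in a real system each stage would be a learned network trained end to end on pairs of before and after images with their instructions.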

Place, publisher, year, edition, pages
2020.
Keywords [en]
image manipulation, predictive learning, relational network, cognitive learning, image generation
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-88913
OAI: oai:DiVA.org:oru-88913
DiVA, id: diva2:1521888
Conference
Concepts in Action: Representation, Learning, and Application (CARLA 2020), Virtual workshop, September 22-23, 2020
Available from: 2021-01-25. Created: 2021-01-25. Last updated: 2021-01-26. Bibliographically approved.

Open Access in DiVA

Learning Generative Image Manipulations from Language Instructions (1274 kB), 184 downloads
File information
File name: FULLTEXT01.pdf
File size: 1274 kB
Checksum (SHA-512): f4db043cabce4313ef357f32406274b1f60036c3602d108af95c68928db88b4abd75081984b8883590263ea8b757950989d17c4e9163a27fd975adce6d72fcba
Type: fulltext
Mimetype: application/pdf

Authority records

Längkvist, Martin; Persson, Andreas; Loutfi, Amy
