publications

publications by categories in reverse chronological order. generated by jekyll-scholar.

2025

  1. Extract Free Dense Misalignment from CLIP (cited: 2)
    JeongYeon Nam, Jinbae Im, Wonjae Kim, and Taeho Kil
    2025
  2. CVPR-W
    Emergence of Text Readability in Vision Language Models (cited: 1)
    Jaeyoo Park, Sanghyuk Chun, Wonjae Kim, Sangdoo Yun, and Bohyung Han
    In 2nd Workshop on Emergent Visual Abilities and Limits of Foundation Models at CVPR 2025, 2025

2024

  1. CHIL
    Vision-Language Generative Model for View-Specific Chest X-ray Generation (cited: 20)
    Hyungyung Lee, Da Young Lee, Wonjae Kim, Jin-Hwa Kim, Tackeun Kim, Jihang Kim, Leonard Sunwoo, and Edward Choi
    In The 5th Annual Conference on Health, Inference, and Learning (CHIL 2024), 2024
  2. CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion (cited: 114)
    Geonmo Gu*, Sanghyuk Chun*, Wonjae Kim, HeeJae Jun, Yoohoon Kang, and Sangdoo Yun
    In 41st Conference on Computer Vision and Pattern Recognition (CVPR 2024, Workshop on Synthetic Data for Computer Vision), 2024
  3. STELLA: Continual Audio-Video Pre-training with Spatio-Temporal Localized Alignment (cited: 8)
    Jaewoo Lee*, Jaehong Yoon*, Wonjae Kim, Yunji Kim, and Sung Ju Hwang
    In 41st International Conference on Machine Learning (ICML 2024), 2024
  4. Language-only Efficient Training of Zero-shot Composed Image Retrieval (cited: 109)
    Geonmo Gu*, Sanghyuk Chun*, Wonjae Kim, Yoohoon Kang, and Sangdoo Yun
    In 41st Conference on Computer Vision and Pattern Recognition (CVPR 2024), 2024
  5. HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts (cited: 21)
    Wonjae Kim, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, and Sangdoo Yun
    In 18th European Conference on Computer Vision (ECCV 2024), 2024
  6. Reducing Task Discrepancy of Text Encoders for Zero-Shot Composed Image Retrieval (cited: 13)
    Jaeseok Byun, Seokhyeon Jeong, Wonjae Kim, Sanghyuk Chun, and Taesup Moon
    2024
  7. Probabilistic Language-Image Pre-Training (cited: 16)
    Sanghyuk Chun, Wonjae Kim, Song Park, and Sangdoo Yun
    2024

2023

  1. ACL
    Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning (cited: 8)
    Kyuyong Shin*, Hanock Kwak*, Wonjae Kim, Jisu Jeong, Seungjae Jung, Kyung-Min Kim, Jung-Woo Ha, and Sang-Woo Lee
    In 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), 2023
  2. SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage (cited: 14)
    Song Park*, Sanghyuk Chun*, Byeongho Heo, Wonjae Kim, and Sangdoo Yun
    In 19th International Conference on Computer Vision (ICCV 2023), 2023
  3. What Do Self-Supervised Vision Transformers Learn? (cited: 145)
    Namuk Park, Wonjae Kim, Byeongho Heo, Taekyung Kim, and Sangdoo Yun
    In 11th International Conference on Learning Representations (ICLR 2023), 2023
  4. ICML-W
    Computational Approaches for App-to-App Retrieval and Design Consistency Check (cited: 6)
    Seokhyeon Park*, Wonjae Kim*, Young-Ho Kim, and Jinwook Seo
    In 40th International Conference on Machine Learning (ICML 2023, Workshop on Artificial Intelligence & Human Computer Interaction), 2023

2022

  1. Entropy
    Discrete Infomax Codes for Supervised Representation Learning (cited: 6)
    Yoonho Lee, Wonjae Kim, Wonpyo Park, and Seungjin Choi
    Entropy special issue “Theory and Applications of Information Processing Algorithms”, 2022
  2. ViDT: An Efficient and Effective Fully Transformer-based Object Detector (cited: 152)
    Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, and Ming-Hsuan Yang
    In 10th International Conference on Learning Representations (ICLR 2022), 2022
  3. An Extendable, Efficient and Effective Transformer-based Object Detector (cited: 27)
    Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, and Ming-Hsuan Yang
    2022
  4. CHI
    Speeding up Inference with User Simulators through Policy Modulation (cited: 25)
    Hee-Seung Moon, Seungwon Do, Wonjae Kim, Jiwon Seo, Minsuk Chang, and Byungjoo Lee
    In 40th Conference on Human Factors in Computing Systems (CHI 2022), New Orleans, LA, USA, 2022
  5. ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO (cited: 64)
    Sanghyuk Chun, Wonjae Kim, Song Park, Minsuk Chang, and Seong Joon Oh
    In 17th European Conference on Computer Vision (ECCV 2022), 2022
  6. BMVC
    Correlation between Alignment-Uniformity and Performance of Dense Contrastive Representations (cited: 6)
    Jong Hak Moon, Wonjae Kim, and Edward Choi
    In 33rd British Machine Vision Conference (BMVC 2022), 2022
  7. Group Generalized Mean Pooling for Vision Transformer (cited: 6)
    Byungsoo Ko, Han-Gyu Kim, Byeongho Heo, Sangdoo Yun, Sanghyuk Chun, Geonmo Gu, and Wonjae Kim
    2022

2021

  1. ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision (cited: 2632)
    Wonjae Kim*, Bokyung Son*, and Ildoo Kim
    In 38th International Conference on Machine Learning (ICML 2021), 18–24 Jul 2021
  2. NeurIPS-W
    Conditional Generation of Periodic Signals with Fourier-Based Decoder (cited: 7)
    Jiyoung Lee, Wonjae Kim, Daehoon Gwak, and Edward Choi
    In 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021

2020

  1. ECCV-W
    Diversified Mutual Learning for Deep Metric Learning (cited: 11)
    Wonpyo Park*, Wonjae Kim*, Kihyun You, and Minsu Cho
    In 16th European Conference on Computer Vision (ECCV 2020, TASK-CV workshop), 2020

2019

  1. Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning (cited: 11)
    Wonjae Kim and Yoonho Lee
    In 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), 2019

2018

  1. Thesis
    Understanding Visualization Idioms Through Deep Visualization
    Wonjae Kim
    Seoul National University, 2018

2017

  1. CHI
    ChartSense: Interactive data extraction from chart images (cited: 225)
    Daekyoung Jung, Wonjae Kim, Hyunjoo Song, Jeong-in Hwang, Bongshin Lee, Bohyoung Kim, and Jinwook Seo
    In 35th Conference on Human Factors in Computing Systems (CHI 2017), 2017
  2. PacificVis
    SwiftTuna: Responsive and incremental visual exploration of large-scale multidimensional data (cited: 12)
    Jaemin Jo, Wonjae Kim, Seunghoon Yoo, Bohyoung Kim, and Jinwook Seo
    In 10th IEEE Pacific Visualization Symposium (PacificVis 2017), 2017