January 2025

Automatic Speech Balloon Processing System

Industry Project

Project Summary: Developed an automated system that detects speech balloons in comic images with a YOLO object detector and removes them with diffusion-based inpainting, producing clean backgrounds for comic processing workflows.


This project addresses the labor-intensive task of removing speech balloons to produce clean background images for comic editing, translation, or adaptation. The pipeline has two stages: a custom-trained YOLO model first detects the speech balloons, and a diffusion-based inpainting model then fills the detected regions with content that matches the surrounding background and preserves the artistic style.
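The hand-off between the two stages can be illustrated with a minimal sketch. It assumes the detector returns axis-aligned pixel boxes and that the inpainting model expects a binary mask (255 where content should be regenerated). The function name `boxes_to_mask`, the box layout, and the `padding` parameter are hypothetical choices for illustration, not the project's actual code. Pure Python lists are used only to keep the sketch dependency-free; a real pipeline would more likely use a NumPy array or an image object.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels, max exclusive.
Box = Tuple[int, int, int, int]


def boxes_to_mask(width: int, height: int, boxes: List[Box],
                  padding: int = 0) -> List[List[int]]:
    """Build a height x width mask: 255 inside (padded) boxes, 0 elsewhere."""
    mask = [[0] * width for _ in range(height)]
    for x_min, y_min, x_max, y_max in boxes:
        # Grow the box by `padding` so balloon outlines are covered too,
        # then clamp the result to the image bounds.
        x0 = max(0, x_min - padding)
        y0 = max(0, y_min - padding)
        x1 = min(width, x_max + padding)
        y1 = min(height, y_max + padding)
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = 255
    return mask


if __name__ == "__main__":
    # One detected balloon in a tiny 6x4 "image", padded by one pixel.
    demo = boxes_to_mask(6, 4, [(2, 1, 4, 3)], padding=1)
    for row in demo:
        print(" ".join(f"{v:3d}" for v in row))
```

Padding the boxes slightly is a common precaution, since a tight box can leave fragments of the balloon outline or tail outside the masked area for the inpainting stage to work around.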

The system streamlines professional comic production tasks such as translation preparation, digital restoration, and format adaptation for digital comic platforms, saving time and cost for production studios and independent creators.