Learning From Documents in the Wild to Improve Document Unwarping
Description: Document image unwarping is important for document digitization and analysis. In this work, we present PaperEdge, a state-of-the-art, data-driven document unwarping approach. Unlike prior methods, which rely on synthetic images, PaperEdge can incorporate real-world images in training. PaperEdge significantly improves generalizability and OCR performance on the unwarped images across different document types.