Generation of PDF with vector symbols from scanned document

Ilya V. Kurilin, Ilia V. Safonov, Michael N. Rychagov, Hokeun Lee, Sang Ho Kim, Donchul Choi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

This paper presents an algorithm for generating PDF files with vector symbols from scanned documents. The multi-stage technique comprises segmentation of the document into text/drawing areas and background, conversion of symbols to lines and Bezier curves, and storage of the compressed background and foreground. We concentrate on symbol conversion, which comprises segmentation of symbol bodies with resolution enhancement, contour tracing, and approximation. The presented method outperforms competing solutions and achieves the best ratio of compression rate to quality. Scaling the initial document to other sizes and running several print/scan-to-PDF iterations expose the advantages of the proposed approach to handling document images. A numerical vectorization quality metric was developed. OCR software results and a user opinion survey confirm the high quality of the proposed method.
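
The abstract names contour approximation by lines and Bezier curves but does not say how the fit is computed. As an illustration only, and not the authors' method, the sketch below fits one cubic Bezier segment to an ordered run of traced contour points by least squares, with the end points fixed and a chord-length parameterization. All function names here are hypothetical.

```python
import math

def bernstein(t):
    """Cubic Bernstein basis values at parameter t."""
    u = 1.0 - t
    return (u * u * u, 3 * t * u * u, 3 * t * t * u, t * t * t)

def chord_length_params(points):
    """Give each point a parameter in [0, 1] proportional to path length."""
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    if total == 0.0:
        raise ValueError("need at least two distinct points")
    return [d / total for d in dists]

def fit_cubic_bezier(points):
    """Least-squares cubic Bezier through the first and last points.

    Assumption: `points` is an ordered list of (x, y) tuples from one
    smooth stretch of a contour. Returns control points (P0, P1, P2, P3).
    """
    p0, p3 = points[0], points[-1]
    ts = chord_length_params(points)
    a11 = a12 = a22 = 0.0
    r1 = [0.0, 0.0]
    r2 = [0.0, 0.0]
    for t, q in zip(ts, points):
        b0, b1, b2, b3 = bernstein(t)
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        for k in range(2):
            # Part of the point not explained by the fixed end points.
            resid = q[k] - b0 * p0[k] - b3 * p3[k]
            r1[k] += b1 * resid
            r2[k] += b2 * resid
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # Too few interior points: fall back to a straight segment.
        p1 = tuple(p0[k] + (p3[k] - p0[k]) / 3.0 for k in range(2))
        p2 = tuple(p0[k] + 2.0 * (p3[k] - p0[k]) / 3.0 for k in range(2))
    else:
        # Solve the 2x2 normal equations for each coordinate.
        p1 = tuple((a22 * r1[k] - a12 * r2[k]) / det for k in range(2))
        p2 = tuple((a11 * r2[k] - a12 * r1[k]) / det for k in range(2))
    return p0, p1, p2, p3

def bezier_point(ctrl, t):
    """Evaluate the cubic Bezier with control points `ctrl` at t."""
    b = bernstein(t)
    return tuple(sum(b[i] * ctrl[i][k] for i in range(4)) for k in range(2))

def max_fit_error(ctrl, points):
    """Largest distance between each point and the curve at its parameter."""
    ts = chord_length_params(points)
    return max(
        math.hypot(q[0] - c[0], q[1] - c[1])
        for q, c in zip(points, (bezier_point(ctrl, t) for t in ts))
    )
```

A vectorizer built on this idea would typically split the contour at corners, fit each piece, and subdivide any piece whose `max_fit_error` exceeds a tolerance; those steps are omitted here.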

Original language: English
Title of host publication: Proceedings of SPIE-IS and T Electronic Imaging - Image Quality and System Performance X
State: Published - 2013
Externally published: Yes
Event: Image Quality and System Performance X - Burlingame, CA, United States
Duration: Feb 5 2013 → Feb 7 2013

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 8653
ISSN (Print): 0277-786X

Conference

Conference: Image Quality and System Performance X
Country/Territory: United States
City: Burlingame, CA
Period: 02/5/13 → 02/7/13

Keywords

  • Text vectorization
  • contour approximation
  • vectorization quality metrics
