TY - THES
T1 - Document understanding
A1 - Aguirre, Christopher D.G
A2 - Bautista, Ma Concepcion G.
A2 - Consulta, Rostum C.
A2 - Orozco, Anthony A.
A2 - Torres, Marlon A.
LA - English
YR - 2000
UL - https://ds.mainlib.upd.edu.ph/Record/UP-99796217603116345
AB - Given the volume of printed documents today, an automated process for document understanding is highly desirable. The paper discusses the processing of color printed documents. It proposes an approach for segmentation and for optical character recognition in the segmented blocks, using a neural network trained with the back-propagation technique, to transform the text blocks into an HTML document. To reduce computational complexity and thus speed up processing, the original color image is first transformed into a binary edge-representation image; a new method is then used to identify the segmented blocks. Finally, all identified text blocks are transformed into binary images with black text on a white background for an OCR system. In the OCR system, the identified text blocks first undergo a training process followed by a recognition process. After the recognition process, the recognized blocks are transformed into their corresponding symbols.
CN - LG 993.5 2000 C65 A38
KW - Optical character recognition.
KW - Neural networks (Computer science).
ER -