Word Spotting in Cursive Handwritten Documents using Modified Character Shape Codes
Large collections of handwritten English paper documents of historical and scientific importance exist, but a computer cannot read paper documents directly. The closest practical way to index them is to store a digital image of each document, so that a large database of document images replaces the paper originals. Even then, the text contained in each image is not directly recognizable by the computer. This paper applies word spotting based on a Modified Character Shape Code to handwritten English document images, enabling quick and efficient query search for words in a database of document images. It differs from other word spotting techniques in that it applies two levels of selection to the word segments matched against a search query: first by word size, then by the character shape code of the query. This makes the process faster and more efficient and reduces the need for multiple pre-processing steps.
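The two-level selection described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the modified coding scheme, so the ascender/descender/x-height classes, the `WordSegment` record, the `char_width` estimate, and the `tolerance` value are all hypothetical choices made for illustration. The sketch also assumes that each word segment's shape code has already been extracted from the image.

```python
from dataclasses import dataclass

# Hypothetical shape classes (an assumption, not the paper's modified scheme).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def shape_code(word: str) -> str:
    """Map each letter to a coarse class: A (ascender), D (descender), x (x-height)."""
    code = []
    for ch in word:
        if ch.isupper() or ch in ASCENDERS:
            code.append("A")
        elif ch in DESCENDERS:
            code.append("D")
        else:
            code.append("x")
    return "".join(code)


@dataclass
class WordSegment:
    page: int    # page (document image) the segment came from
    width: int   # segment width in pixels
    code: str    # shape code extracted from the image (assumed to be given)


def spot(query, segments, char_width=20.0, tolerance=0.25):
    """Return segments that pass the size filter and then the shape-code filter."""
    # Level 1: keep segments whose width is close to the expected query width.
    expected = len(query) * char_width
    lo, hi = expected * (1 - tolerance), expected * (1 + tolerance)
    by_size = [s for s in segments if lo <= s.width <= hi]

    # Level 2: among the survivors, keep those whose shape code matches the query's.
    target = shape_code(query)
    return [s for s in by_size if s.code == target]


segments = [
    WordSegment(page=1, width=100, code="DxxxD"),
    WordSegment(page=1, width=300, code="DxxxD"),  # rejected at level 1 (too wide)
    WordSegment(page=2, width=95, code="xxxxx"),   # rejected at level 2 (wrong code)
    WordSegment(page=3, width=110, code="DxxxD"),
]
matches = spot("query", segments)
print([s.page for s in matches])
```

Running the cheap size comparison first means the shape-code comparison only touches segments that are already plausible, which is one way to read the abstract's claim that two-level selection speeds up the search.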