Interpreting Handwritten Text

Introduction

Humans usually interpret handwritten text easily and reliably. Automating the process is difficult, however, because it involves both recognizing the symbols and comprehending the message they convey. Although optical character recognition (OCR) accuracy has improved considerably, it remains inferior to that of a first-grade child (Baird, 2002).

People can recognize the characters of written language in all shapes and sizes: small or large, rotated, handwritten, or machine printed. A review of the handwriting recognition literature shows that several algorithmic approaches have been explored, including lexicon-driven and lexicon-free methods, parallel classifiers and their combinations, pre- and post-processing routines, and analytical and holistic methods (Attneave, 2004). Although some of these algorithms demonstrate humanlike fluency, they fail when the images are degraded, poorly written, or presented without context.

There is currently a gap between human and machine abilities in reading handwriting under noisy conditions. This gap can be explored through controllable parameters that capture aspects of handwriting such as legibility, overlapping of words, broken strokes, and the extent of overrun characters. Whereas the ultimate objective of artificial intelligence is to build machines that demonstrate human-level abilities, in this dissertation we explore the current limitations of machines in handwriting recognition tasks and describe new applications in which these same limitations are an advantage. The main goal of this dissertation is to propose a new application of handwriting recognition in the design of CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart). This application, called Handwritten CAPTCHA, exploits the difference in reading proficiency between humans and computers on handwritten text images, so that it can be used as a human cryptosystem for online services (Baird, 2002).

Problem Statement

Our objective is to design a Human Interactive Proof (HIP) system that exploits the gap between humans and computers in reading handwritten text images to defend cyber services against bot attacks. We will present psychological aspects of the problem to ensure that our HIP system is a viable solution for online services from a user's viewpoint. Our focus is on the automatic generation of CAPTCHA challenges (Figure 1.1). We have explored a handwriting distorter for generating an unlimited number of distinct synthetic “human-like” samples from handwritten characters, and we have also proposed an off-line method for generating handwriting samples. Experiments investigating human recognition of distorted, hand-printed image samples have been conducted to gain insight into human reading abilities. Holistic features were investigated first, since they are widely believed to be inspired by psychological studies of human reading.

Figure 1.1: Handwritten CAPTCHA Challenges

To validate our approach, we have administered tests with both machines and humans. The tests consist of handwritten images of city names or even nonsense words formed by concatenating handwritten characters (human-generated or synthetic). Results of these experiments are positive and reaffirm our hypothesis that handwritten CAPTCHAs are a suitable option for cyber security applications.
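The idea of forming a challenge by concatenating character images and then distorting the result can be illustrated with a minimal sketch. It is not the distorter described in this dissertation. The tiny text bitmaps in `GLYPHS`, the function names, and the parameters `overlap`, `max_shift`, and `break_prob` are all hypothetical stand-ins, chosen only to mirror the controllable aspects mentioned earlier (overlapping characters, irregular placement, broken strokes).

```python
import random

# Hypothetical 5x5 bitmaps standing in for scanned handwritten characters.
# '#' is ink, '.' is background. A real system would use image samples.
GLYPHS = {
    "a": [".###.", "#...#", "#####", "#...#", "#...#"],
    "b": ["####.", "#...#", "####.", "#...#", "####."],
    "c": [".####", "#....", "#....", "#....", ".####"],
}


def concatenate(glyphs, overlap=0):
    """Join glyph bitmaps left to right.

    With overlap > 0, neighbouring glyphs share that many columns,
    and ink from either glyph is kept in the shared columns.
    """
    rows = [list(r) for r in glyphs[0]]
    for glyph in glyphs[1:]:
        for y, row in enumerate(rows):
            right = list(glyph[y])
            start = len(row) - overlap
            for k in range(overlap):
                if right[k] == "#":
                    row[start + k] = "#"
            row.extend(right[overlap:])
    return ["".join(r) for r in rows]


def distort(bitmap, rng, max_shift=2, break_prob=0.1):
    """Shift each row by a random offset and erase some ink pixels.

    The row shifts imitate irregular writing; the erased pixels
    imitate broken strokes. Every output row has the same width.
    """
    out = []
    for row in bitmap:
        shift = rng.randint(0, max_shift)
        padded = "." * shift + row + "." * (max_shift - shift)
        out.append("".join(
            "." if ch == "#" and rng.random() < break_prob else ch
            for ch in padded
        ))
    return out


def make_challenge(word, seed, overlap=1, max_shift=2, break_prob=0.1):
    """Build one distorted challenge bitmap for `word` (assumed API)."""
    rng = random.Random(seed)  # seeded so a challenge can be reproduced
    glyphs = [GLYPHS[ch] for ch in word]
    return distort(concatenate(glyphs, overlap), rng, max_shift, break_prob)


if __name__ == "__main__":
    # A nonsense word built from the available characters.
    print("\n".join(make_challenge("cabba", seed=7)))
```

Raising `overlap` or `break_prob` in this sketch makes the word harder to segment and read, which is the kind of controllable difficulty the experiments with human and machine readers would vary.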

Outline

We propose an efficient method to secure online services using the ...