Recognizing characters or digits in documents such as photographs captured at street level is an important part of modern map making. One example is automatically and accurately identifying a building's address from street view images of that building. With this information, more precise maps can be built and navigation services can be improved. Although ordinary character classification is widely regarded as a solved problem in computer vision, recognizing digits or characters in natural scenes such as photographs remains a harder problem. The difficulty may stem from low-contrast backgrounds, low resolution, blurred images, font variation, lighting, and similar factors. Traditional approaches to classifying characters and digits in natural images are split into two stages: first the image is segmented to extract isolated characters, and then recognition is performed on the extracted images.
This can be done using multiple hand-crafted features and template matching. The main purpose of this project is to recognize street view house numbers using a deep convolutional neural network. For this work, I used the digit classification dataset of house numbers extracted from street-level images: http://ufldl.stanford.edu/housenumbers/.
This dataset is similar in flavor to the MNIST dataset but has more labeled data. It contains more than 600,000 digit images, collected from Google Street View, with color information and varied natural backgrounds. To achieve this goal, I built an application that detects the number from image pixels alone. A convolutional neural network with multiple layers is trained on the dataset to recognize house number digits with high accuracy. I used a traditional convolutional architecture with different pooling methods and multistage features, and ultimately reached 91.1% accuracy.
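To make the idea of multistage features concrete, the following NumPy sketch runs a single forward pass through a small two-stage convolutional network in which the pooled output of the first stage is concatenated with the output of the second stage before classification. It is only an illustration under stated assumptions: the layer sizes, the use of max and average pooling, the 32x32 RGB input size, and all names (`conv2d`, `pool2d`, `init_params`, `forward`) are hypothetical and are not the configuration that produced the 91.1% result. The weights are random and untrained, so the output probabilities carry no meaning beyond showing the data flow.

```python
import numpy as np


def conv2d(x, w, b):
    """Valid cross-correlation. x: (C_in, H, W); w: (C_out, C_in, k, k); b: (C_out,)."""
    c_out, _, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.empty((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + k, j:j + k]  # (C_in, k, k)
            # Contract the (C_in, k, k) axes of every filter with the patch.
            out[:, i, j] = np.tensordot(w, patch, axes=3) + b
    return out


def pool2d(x, size, mode="max"):
    """Non-overlapping pooling over size x size windows; mode is 'max' or 'avg'."""
    c, h, w = x.shape
    h2, w2 = h // size, w // size
    windows = x[:, :h2 * size, :w2 * size].reshape(c, h2, size, w2, size)
    if mode == "max":
        return windows.max(axis=(2, 4))
    return windows.mean(axis=(2, 4))


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def init_params(rng, n_classes=10):
    """Random, untrained weights. All layer sizes are assumptions for illustration."""
    n_features = 8 * 7 * 7 + 16 * 5 * 5  # stage-1 branch + stage-2 branch
    return {
        "w1": rng.normal(0.0, 0.1, (8, 3, 5, 5)), "b1": np.zeros(8),
        "w2": rng.normal(0.0, 0.1, (16, 8, 5, 5)), "b2": np.zeros(16),
        "w_fc": rng.normal(0.0, 0.01, (n_classes, n_features)),
        "b_fc": np.zeros(n_classes),
    }


def forward(image, p):
    """image: (3, 32, 32) array. Returns a vector of class probabilities."""
    # Stage 1: convolution, ReLU, max pooling -> (8, 14, 14)
    s1 = pool2d(relu(conv2d(image, p["w1"], p["b1"])), 2, "max")
    # Stage 2: convolution, ReLU, max pooling -> (16, 5, 5)
    s2 = pool2d(relu(conv2d(s1, p["w2"], p["b2"])), 2, "max")
    # Multistage features: also feed a pooled copy of stage 1 to the classifier.
    skip = pool2d(s1, 2, "avg")  # (8, 7, 7)
    features = np.concatenate([skip.ravel(), s2.ravel()])
    return softmax(p["w_fc"] @ features + p["b_fc"])


rng = np.random.default_rng(0)
params = init_params(rng)
image = rng.random((3, 32, 32))  # stand-in for one cropped digit image
probs = forward(image, params)
print(probs.shape, int(probs.argmax()))
```

Feeding the classifier both coarse, high-level features from the second stage and finer, lower-level features from the first stage is what distinguishes this layout from a strictly sequential network, where only the final stage reaches the classifier.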