Homepage | Research | Publications | Technical Reports | Demos-Downloads | People | Internship | Student Projects | Events | Seminar | Links |
| CRBLP Contact Information Center for Research on Bangla Language Processing
::--Optical Character Recognition --::
Name:: BanglaOCR [Current updates]
Summary:: This project aims to develop an Optical Character Recognizer that can recognize Bangla script. The OCR research and development work is divided into three major parts: preprocessing, classification, and post-processing. We experimented with several techniques for each part and chose the most appropriate methods for our implementation. Currently we use the Tesseract OCR engine to perform the recognition task.
Details:: BanglaOCR is the Optical Character Recognizer for Bangla script. It takes scanned images of a printed page or document as input and converts them into editable Unicode text. The current version of BanglaOCR consists of several independent parts, described below.
Preprocessing involves image acquisition, binarization, noise elimination, skew detection and correction, and segmentation at the line, word, and character levels. Character segmentation is one of the most significant challenges for Bangla. For classification we use the Tesseract OCR engine (one of the most accurate free software OCR engines currently available). Post-processing is done at two levels. At the first level, we correct recognition mistakes using a set of rules. At the second level, we use a suggestion-based spell checker that can identify erroneous words and produce suggestions.
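The two-level post-processing described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the three correction rules, the four-word lexicon, the function names, and the use of `difflib` similarity for suggestions are hypothetical and are not taken from the BanglaOCR code base, whose actual rules and spell checker are not described here.

```python
import difflib

# Level 1: hypothetical correction rules as (misrecognized, corrected) pairs.
# Illustrative only; the project's real rule set is not given in this page.
CORRECTION_RULES = [
    ("\u0985\u09be", "\u0986"),  # letter A + vowel sign AA -> letter AA
    ("\u09c7\u09be", "\u09cb"),  # vowel sign E + vowel sign AA -> vowel sign O
    ("\u09c7\u09d7", "\u09cc"),  # vowel sign E + AU length mark -> vowel sign AU
]

# Level 2: a tiny hypothetical lexicon; a real spell checker would load a
# full Bangla word list.
LEXICON = [
    "\u09ac\u09be\u0982\u09b2\u09be",  # "bangla"
    "\u09ad\u09be\u09b7\u09be",        # "bhasha" (language)
    "\u0986\u09ae\u09be\u09b0",        # "amar" (my)
    "\u09b8\u09cb\u09a8\u09be\u09b0",  # "sonar" (golden)
]


def apply_rules(text, rules=CORRECTION_RULES):
    """Level 1: replace each known misrecognized sequence with its correction."""
    for wrong, right in rules:
        text = text.replace(wrong, right)
    return text


def check_words(text, lexicon=LEXICON, max_suggestions=3):
    """Level 2: map each word missing from the lexicon to similar lexicon words."""
    known = set(lexicon)
    report = {}
    for word in text.split():
        if word not in known:
            report[word] = difflib.get_close_matches(
                word, lexicon, n=max_suggestions, cutoff=0.6
            )
    return report


def postprocess(raw_text):
    """Run both levels; return the corrected text and the suggestion report."""
    corrected = apply_rules(raw_text)
    return corrected, check_words(corrected)


if __name__ == "__main__":
    # First word needs a rule fix; second word is missing its final vowel sign.
    raw = "\u0985\u09be\u09ae\u09be\u09b0 \u09ac\u09be\u0982\u09b2"
    corrected, report = postprocess(raw)
    print(corrected)
    print(report)
```

Running the rules before the dictionary lookup means systematic recognition errors are fixed automatically, so the spell checker only flags words that still fail to match. In this sketch suggestions are only reported, not applied, which matches the description of a suggestion-based checker.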
Status::
Research Scope::
Development Scope::
Download:: http://code.google.com/p/banglaocr/downloads/list
Timeline:: 2007 – 2009
Center for Research on Bangla Language Processing, BRAC University, Dhaka, Bangladesh © All Rights Reserved 2008