

Center for Research on Bangla Language Processing
BRAC University
66, Mohakhali, Dhaka-1212
Phone: +88 (02) 8824051-4 Ext:4023
Fax: +88 (02) 8810383
crblp@bracu.ac.bd

::--TTF to Unicode Font Converter--::

Name:: CRBLPConverter

Download::

CRBLPConverter version 1.1: CRBLP has released CRBLPConverter V1.1, a new version of the converter, under the GNU General Public License (GPL) version 2.
Change Log:
* Bug fixes: Fixed some issues with the conversion from ASCII fonts to Unicode.
* Optimizations: The HTML file converter now converts all HTML nodes (tags) instead of only the SPAN tag.
* New feature: For an HTML input file, the converter now generates an HTML file that retains all formatting (the previous version generated a plain text file). It also supports multiple fonts (including English) in a single file.
[Download CRBLPConverter V1.1] [Release notes]
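
The change log describes converting every HTML node while keeping the input formatting. The following is a minimal sketch of that idea in Python, not the CRBLPConverter implementation. It uses the standard library `html.parser` module. The names `TextNodeConverter` and `convert_html` are hypothetical, and `convert` stands for any function that maps legacy-encoded text to Unicode. Tags, attributes, entities, and comments are copied through unchanged, and only text nodes outside `script` and `style` elements are converted. Selecting a different table per font, as the release supports, is left out of this sketch.

```python
from html.parser import HTMLParser


class TextNodeConverter(HTMLParser):
    """Rebuild HTML unchanged except for text nodes, which pass through `convert`."""

    def __init__(self, convert):
        # Keep character references as-is instead of decoding them into text.
        super().__init__(convert_charrefs=False)
        self.convert = convert
        self.out = []
        self.raw_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.raw_depth += 1
        # Emit the tag exactly as written, including its attributes.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied verbatim.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.raw_depth:
            self.raw_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text nodes are the only place where conversion happens.
        self.out.append(data if self.raw_depth else self.convert(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def convert_html(html, convert):
    """Return `html` with `convert` applied to every text node."""
    parser = TextNodeConverter(convert)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    # str.upper is a stand-in for a real legacy-font-to-Unicode function.
    print(convert_html('<p class="a">ab <b>cd</b><br/>x &amp; y</p>', str.upper))
```

Passing the conversion function in as a parameter keeps the HTML handling separate from any particular font table.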

CRBLPConverter version 1.0: CRBLPConverter is a software package that converts Bangla documents encoded with various TTF fonts to Unicode. There are thousands of Bangla documents encoded with various fonts (typically called “ASCII fonts”) that are wholly incompatible with one another and, in some cases, proprietary. To make matters worse, some of the so-called Bangla ASCII fonts change the underlying encoding from version to version without changing the version number. This software is designed to automate the conversion of this large body of existing Bangla documents, in formats such as HTML and Microsoft Word, to Unicode. CRBLPConverter includes converters for SutonnyMJ, Bangsee Alpona, Prothoma, and Alo. In the case of SutonnyMJ, different versions of Bijoy use different encodings, so the user must specify which version of Bijoy the original document was written with. This software is free and open source, released under the GNU General Public License (GPL) version 2. [Download CRBLPConverter V1.0] [Release notes]

Summary::

The CRBLP team is working on a TTF to Unicode font converter that converts ASCII-encoded text to Unicode text.

Details::

CRBLP converter is a system that converts Bangla ASCII-encoded text to Unicode. Because computers store data as numbers, using ASCII/ANSI codes for Bangla text causes conflicts: the same code pages are used by other languages. Unicode provides a unique number for every character, regardless of platform, program, or language [Ref].
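
The code page conflict can be illustrated with a short Python sketch. It is an illustration only and is not part of CRBLPConverter. The function name `decode_with` is hypothetical. The code pages cp1252 and cp1251 are used as examples because Python ships with both. The same stored byte is read as a different character under each code page, whereas a Unicode code point such as U+0995 always identifies the Bangla letter ka.

```python
import unicodedata


def decode_with(byte_value, code_page):
    """Interpret one stored byte under the given code page."""
    return bytes([byte_value]).decode(code_page)


if __name__ == "__main__":
    # One byte, two meanings: the reader must know which code page was used.
    for code_page in ("cp1252", "cp1251"):
        ch = decode_with(0xE9, code_page)
        print(code_page, hex(ord(ch)), unicodedata.name(ch))

    # A Unicode code point is unambiguous on every platform.
    print(unicodedata.name("\u0995"))
```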

Since several million Bangla documents have been written in ASCII-encoded fonts, an automatic solution is the only way to transform this text into Unicode. Different fonts use different encodings, and the same font may use different encodings in different versions, which is a significant difficulty in converting ASCII to Unicode. Our work has focused on the SutonnyMJ font from Bijoy version 2000, and on Bangsee Alpona, Prothoma, and Alo developed by Prothom-Alo. Different versions of the Bijoy font use different encodings, so the code assigned to a given character differs from version to version. The project team is working to support other fonts and all types of documents.
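
Because the encoding depends on both the font and its version, one way to organize such a converter is a lookup table keyed by the (font, version) pair. The sketch below assumes that design and does not describe how CRBLPConverter is actually built. The font name `DemoFont`, the version labels, and every table entry are hypothetical placeholders, not the real mapping of SutonnyMJ or any other font. The two tables deliberately assign the same legacy code to different Bangla letters to mirror the version problem described above.

```python
# Hypothetical tables: legacy character code -> Unicode text.
# All entries are illustrative placeholders, not a real font's mapping.
TABLES = {
    ("DemoFont", "v1"): {"K": "\u0995", "L": "\u0996", "v": "\u09BE"},
    # In "v2" the codes for K and L are swapped, as can happen between versions.
    ("DemoFont", "v2"): {"K": "\u0996", "L": "\u0995", "v": "\u09BE"},
}


def convert(text, font, version):
    """Convert legacy-encoded text to Unicode using the table for (font, version)."""
    try:
        table = TABLES[(font, version)]
    except KeyError:
        raise ValueError(f"no conversion table for {font} {version}") from None
    # Characters without a table entry (digits, spaces, English) pass through.
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # The same input yields different Unicode text depending on the version.
    print(convert("Kv", "DemoFont", "v1"))
    print(convert("Kv", "DemoFont", "v2"))
```

Raising an error for an unknown font or version, rather than guessing, matches the requirement that the user state which version the document was written with. A fuller converter would also need to handle sequences of several legacy codes and the reordering of vowel signs, which this sketch omits.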

Team::

Contributor:

  • Dewan Shahriar Hossain Pavel

Status:: CRBLPConverter released (versions 1.0 and 1.1)

Timeline:: 2007 - 2008

 

Center for Research on Bangla Language Processing
BRAC University, Dhaka, Bangladesh
© All Rights Reserved 2008