
Problem with Hebrew encoding

I have Hebrew text that looks like "×گض¸×¨ض´×™×،ض°×کוض¹×ں", and I want to convert it to readable Unicode Hebrew characters.

I tried this code:

const string Str = "×گض¸×¨ض´×™×،ض°×کוض¹×ں";

Encoding enc1 = Encoding.Default;
Encoding enc2 = Encoding.Unicode;

byte[] bytes = enc1.GetBytes(Str);

string hebrewString = enc2.GetString(bytes);

label1.Text = hebrewString;

But it did not work. Please help.

Update: the text comes from the following HTML source code:

Version:1.0
StartHTML:000000210
EndHTML:000006218
StartFragment:000001595
EndFragment:000006126
StartSelection:000001595
EndSelection:000006126
SourceURL:file:///C:/ProgramData/Babylon/LocalUI/wnd.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3c.org/TR/1999/REC-    html401-19991224/loose.dtd">

<HTML 
xmlns="http://www.w3.org/1999/xhtml"><HEAD><TITLE>CLient build #1.2</TITLE><LINK 
rel=stylesheet type=text/css href="img/frame.css?ver=41"><LINK rel=stylesheet 
type=text/css href="img/baby.css?ver=41"><LINK rel=stylesheet type=text/css 
href="img/word.css?ver=41"><LINK rel=stylesheet type=text/css 
href="img/text.css?ver=41">
<SCRIPT type=text/javascript src="js/moudles.js?ver=100"></SCRIPT>

<SCRIPT type=text/javascript src="js/extrnl.js?ver=100"></SCRIPT>

<SCRIPT type=text/javascript src="js/frame.js?ver=100"></SCRIPT>

<SCRIPT type=text/javascript src="js/word.js?ver=100"></SCRIPT>

<SCRIPT type=text/javascript src="js/fTxt.js?ver=100"></SCRIPT>

<SCRIPT type=text/javascript src="js/baby.js?ver=100"></SCRIPT>

<SCRIPT type=text/javascript src="js/plcy.js?ver=100"></SCRIPT>
</HEAD>

<BODY style="FONT-FAMILY: Verdana" onload=bodyLoad() 
class="scrollBar ie7 fontSize5" bgColor=#000100 name="Rslt">

<DIV class=m2>

<DIV class=mrg>

<DIV style="BOTTOM: -67px" id=baseBody class=client>

<DIV id=wordContainer>

<DIV style="OVERFLOW-Y: scroll; DISPLAY: block; FONT-FAMILY: Tahoma" 
id=resultContainer class=wordBody>

<DIV id=rsltCntnr>

<DIV style="CURSOR: auto" id=BABID_Results><!--StartFragment--><DIV id=BABIDPtr_!!Z8UVKYSMBJ class=result entryType="3" entryPrio="1100099960">
<TABLE style="TABLE-LAYOUT: fixed" class=res-head cellSpacing=0 cellPadding=3 
width="100%">
<TBODY>
<TR>
<TD vAlign=top width=20><IMG id=BABID_CPIconImg class=BAB_ImgInTitle 
src="C:\Users\Mahmoud\AppData\Roaming\Babylon\Content\icons/Z8UVKYSMBJ_glossary_icon.ico"> 
</TD>
<TD id=BABID_CPTitle vAlign=top>
<DIV style="DISPLAY: inline" id=BABID_CPName class=BAB_NormalTitle>×‍ض´×œض¼×•ض¹×ں 
×گض¶×‘ض¶×ںض¾×©×پוض¹×©×پض¸×ں ×”ض·×‍ض¼ض¸×œضµ×گ</DIV><SPAN style="PADDING-LEFT: 2px"     id=BABID_CPBandBtns 
valign="top"><IMG class=BAB_ImageBtn title="Dictionary menu" tabIndex=0 
src="c:\programdata\babylon\localui\img\res\btn_titlemenu.png" 
behavetype="3ImageState" bab_name="BTN_TitleMenu"> 
</SPAN></TD></TR></TBODY></TABLE>
<DIV id=BABID_CPResult class=BAB_CPBodyStyleLocal>
<DIV xmlns:babex="urn:schemas-babylon-com:babex" 
xmlns:bab="urn:schemas-babylon-com:bab" 
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<DIV dir=rtl class=term align=right>
<DIV style="FLOAT: right" dir=ltr class=btnArr><IMG class=BAB_ImageBtn 
title="Previous term" tabIndex=0 
src="c:\programdata\babylon\localui\img\res\btn_browseprevious.png" 
bab_name="BTN_BrowsePrevious" behaveType="3ImageState" baburi=""><IMG 
class=BAB_ImageBtn title="Next term" tabIndex=0 
src="c:\programdata\babylon\localui\img\res\btn_browsenext.png" 
bab_name="BTN_BrowseNext" behaveType="3ImageState" baburi=""></DIV>×گض¸×¨ض´×™×،ض°×کוض¹×ں
<DIV class=rsltSpkrCntnr><IMG class=BAB_ImageBtn 
title="To listen to a text, select it, and click the speaker button" tabIndex=0 
src="c:\programdata\babylon\localui\img\res\btn_sayit_rtl.png" valign="bottom" 
bab_name="BTN_SayIt_rtl" behaveType="3ImageState" baburi="" 
term="×گض¸×¨ض´×™×،ض°×کוض¹×ں"></DIV></DIV>
<DIV class=definition align=right><SPAN dir=rtl>
<STYLE>a{cursor:pointer;text-decoration:none;color:blue</STYLE>

<DIV 
style="LINE-HEIGHT: 160%; FONT-FAMILY: David,Times New Roman; FONT-SIZE: 130%" 
dir=rtl><FONT style="COLOR: black; FONT-WEIGHT: normal"><SUP>×ھ</SUP></FONT> 
<FONT color=blue>(×–')</FONT> [יווני×ھ: ariston] ×،ض°×¢×•ض¼×“ض¸×”, ×گض²×¨×•ض¼×—    ض¸×”: "×گض¸×¨ض´×™×،ض°×کוض¹×ں 
×¢ض¸×ھض´×™×“ ×”ض·×§ض¼ض¸×“וض¹×©×پ-בض¼ض¸×¨×•ض¼×ڑض°-הוض¼×گ לض·×¢ض²×©×‚וض¹×ھ לض·×¢ض²    בض¸×“ض¸×™×• ×”ض·×¦ض¼ض·×“ض¼ض´×™×§ض´×™×‌ לض¶×¢ض¸×ھض´×™×“ 
לض¸×‘וض¹×گ" (ויקר×گ רבה ×™×’). "×گض²× ض´×™ עוض¹×¨ضµ×ڑض° ×”ض¸×گض¸×¨ض´×™×،ض°×کוض¹×ں     לض·×—ض²×‘ض´×™×‘ض·×™, ×›ض¼ض·× ض°×¤ضµ×™ ×”ض¸×¨ض¹×ں" 
(ש×‍עוני, שירי×‌ ×’ פה).
<P>[×گض²×¨ض´×™×،ض°×کض´×™×ں] </P></DIV></SPAN></DIV><BR 
style="CLEAR: both; FONT-SIZE: 1px"></DIV>
<DIV class=BAB_CPCopyrightStyle xmlns:babex="urn:schemas-babylon-com:babex" 
xmlns:bab="urn:schemas-babylon-com:bab" 
xmlns:msxsl="urn:schemas-microsoft-com:xslt"><BR><BR><BR>
<DIV dir=rtl>
<P><BR><BR>آ© כל הזכויו×ھ ש×‍ורו×ھ ליורשי ×”×‍חבר<BR>Copyright 2003, The     author's 
heirs آ©</P><BR><BR><BR><BR>
<LI><B>להקד×‍×”, לה×،ברי×‌, לרשי×‍×ھ ×”×‍קורו×ھ ועוד - ר×گו <A 
style="TEXT-DECORATION: none" 
href="bword://×‍ض´×œض¼×•ض¹×ں ×گض¶×‘ض¶×ںض¾×©×پוض¹×©×پض¸×ں ×”ض·×‍ض¼ض¸×œضµ×گ/">×‍ض´×œض¼×•ض¹×ں     ×گض¶×‘ض¶×ںض¾×©×پוض¹×©×پض¸×ں ×”ض·×‍ض¼ض¸×œضµ×گ 
- ×¢ض·×‍ض¼×•ض¼×“ضµ×™ ×”ض·×¤ض¼ض°×ھض´×™×—ض¸×”</A>.</B></LI></DIV></DIV>
<DIV style="DISPLAY: none" dir=rtl id=BABID_BottomLinks 
xmlns:babex="urn:schemas-babylon-com:babex" 
xmlns:bab="urn:schemas-babylon-com:bab" 
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<DIV style="FLOAT: left" id=BABID_BottomActions></DIV> <BR 
style="CLEAR: both; FONT-SIZE: 1px"></DIV>
<DIV class=prcTrial xmlns:babex="urn:schemas-babylon-com:babex" 
xmlns:bab="urn:schemas-babylon-com:bab" 
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<DIV class=left-corner><IMG class=BAB_ImageStat 
src="c:\programdata\babylon\localui\img\res\trialcornerleft.png" width=4 
height=55 bab_name="TrialCornerLeft"></DIV>
<DIV style="BACKGROUND: none transparent scroll repeat 0% 0%" 
class=right-corner><IMG class=BAB_ImageStat 
src="c:\programdata\babylon\localui\img\res\trialcornerright.png" width=4 
height=55 bab_name="TrialCornerRight"></DIV>
<DIV class=prcTrial-body>
<DIV class=days-left>Dictionary trial version (4 days)</DIV><IMG 
class=BAB_ImageStat src="c:\programdata\babylon\localui\img\res\prctrial.png" 
bab_name="PRCTrial"><SPAN class=buy-link><A id=CP_LINK 
href="buyprc://!!Z8UVKYSMBJ,745,0/">Buy This 
Dictionary</A></SPAN></DIV></DIV></DIV><BR 
style="CLEAR: both; FONT-SIZE: 1px"></DIV><!--EndFragment--></DIV>
</DIV>
</DIV>
</DIV>
</DIV>
</DIV>
</DIV>
</BODY>
</HTML>

This HTML renders fine, but I can't get the Hebrew text into a string. Thanks.

JustMe asked Apr 29 '13 11:04


2 Answers

It seems your Str is not a valid Hebrew string.

After checking some possible encodings, it seems to be Hebrew (Windows):

 const string Str = "×گض¸×¨ض´×™×،ض°×کוض¹×ں";

 Encoding origionEncoding = Encoding.GetEncoding(1256); //assume the string was encoded as arabic

 byte[] bytes = origionEncoding.GetBytes(Str);

 Encoding desEncoding = Encoding.GetEncoding(1255);       //Hebrew (Windows) 

 string hebrewString = desEncoding.GetString(bytes);

EDIT: It seems a Hebrew string has been misread using an Arabic encoding, so to reverse this (if possible) we should try possible origin/destination pairs of encodings.
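Trying origin/destination pairs can be automated. The following is only a rough sketch, not a definitive implementation: the candidate code page lists, the `EncodingPairSearch` class and the `LooksHebrew` check are all hypothetical choices made for illustration. It assumes .NET Core 3.0 or later (on .NET Framework the `RegisterProvider` line can be dropped), and it builds its garbled sample by decoding the UTF-8 bytes of a Hebrew word as code page 1256.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class EncodingPairSearch
{
    // Hypothetical candidate lists, chosen for illustration only.
    static readonly int[] MisreadAs = { 1252, 1255, 1256 };
    static readonly int[] ActualEncodings = { 65001, 1255 }; // 65001 = UTF-8

    // Returns an encoding that throws instead of silently substituting '?'.
    static Encoding Strict(int codePage) =>
        Encoding.GetEncoding(codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);

    // True when every character lies in the Unicode Hebrew block (U+0590..U+05FF).
    static bool LooksHebrew(string s) =>
        s.Length > 0 && s.All(c => c >= '\u0590' && c <= '\u05FF');

    // Tries every (misread-as, actual) pair and reports those that yield Hebrew text.
    public static List<string> FindPairs(string garbled)
    {
        var hits = new List<string>();
        foreach (int wrongCp in MisreadAs)
        {
            foreach (int realCp in ActualEncodings)
            {
                try
                {
                    byte[] raw = Strict(wrongCp).GetBytes(garbled);
                    string candidate = Strict(realCp).GetString(raw);
                    if (LooksHebrew(candidate))
                    {
                        hits.Add(wrongCp + " -> " + realCp);
                    }
                }
                catch (EncoderFallbackException) { /* pair cannot represent the text */ }
                catch (DecoderFallbackException) { /* recovered bytes are invalid for realCp */ }
            }
        }
        return hits;
    }

    static void Main()
    {
        // Code pages such as 1255 and 1256 must be registered on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // A Hebrew word with vowel points, written with escapes to keep the source ASCII.
        string hebrew = "\u05D0\u05B8\u05E8\u05B4\u05D9\u05E1\u05B0\u05D8\u05D5\u05B9\u05DF";
        // Simulate the damage: UTF-8 bytes misread as Windows-1256.
        string garbled = Encoding.GetEncoding(1256).GetString(Encoding.UTF8.GetBytes(hebrew));

        foreach (string pair in FindPairs(garbled))
        {
            Console.WriteLine(pair);
        }
    }
}
```

The strict fallbacks matter here: with the default replacement fallback, an unsuitable pair would quietly produce `?` characters instead of being rejected.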

Hossein Narimani Rad answered Oct 16 '22 20:10


I solved it. The text is UTF-8:

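    const string Str = "×گض¸×¨ض´×™×،ض°×کוض¹×ں";
    // Encoding.Default is the system ANSI code page on .NET Framework, so this
    // step recovers the original bytes only if that code page is the one that
    // garbled the text.
    Encoding defaultEncoding = Encoding.Default;
    byte[] bytes = defaultEncoding.GetBytes(Str);
    // The recovered bytes are UTF-8.
    Encoding encoding2 = Encoding.UTF8;
    string hebrewString2 = encoding2.GetString(bytes);
    label1.Text = hebrewString2;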

Thanks, everybody.
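For anyone who wants a version that does not depend on the machine's default code page, here is a minimal sketch. It assumes the garbled text was produced by reading UTF-8 bytes as Windows-1256, which matches the characters in the question but is still an assumption. The `MojibakeRepair` class and its `Repair` method are hypothetical names, and the `RegisterProvider` call is only needed on .NET Core 3.0 or later.

```csharp
using System;
using System.Text;

static class MojibakeRepair
{
    // Re-encodes the garbled text with the code page that misread it,
    // then decodes the recovered bytes as UTF-8.
    public static string Repair(string garbled, int wrongCodePage)
    {
        // Throw rather than write '?' for characters the code page cannot represent.
        Encoding wrong = Encoding.GetEncoding(
            wrongCodePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);

        byte[] rawBytes = wrong.GetBytes(garbled);
        return Encoding.UTF8.GetString(rawBytes);
    }

    static void Main()
    {
        // Code page 1256 must be registered on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // A Hebrew word with vowel points, written with escapes to keep the source ASCII.
        string hebrew = "\u05D0\u05B8\u05E8\u05B4\u05D9\u05E1\u05B0\u05D8\u05D5\u05B9\u05DF";

        // Simulate the damage: UTF-8 bytes misread as Windows-1256.
        string garbled = Encoding.GetEncoding(1256).GetString(Encoding.UTF8.GetBytes(hebrew));

        string repaired = Repair(garbled, 1256);
        Console.WriteLine(repaired == hebrew); // reports whether the round trip restored the text
    }
}
```

Passing the code page explicitly keeps the result the same on machines whose default encoding differs, and the strict fallback makes a wrong guess fail loudly instead of returning text with `?` in it.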

JustMe answered Oct 16 '22 19:10