
How can I parse an Arabic Umm Al-Qura date string into a .NET DateTime object?

I have the following Arabic date in the Umm Al-Qura calendar that I want to parse into a .NET DateTime object:

الأربعاء‏، 17‏ ذو الحجة‏، 1436

This date is equivalent to September 30th 2015 in the Gregorian calendar.

I've been trying the following "standard" C# code to parse this date, but without success:

var cultureInfo = new CultureInfo("ar-SA");
cultureInfo.DateTimeFormat.Calendar = new UmAlQuraCalendar(); // the default one anyway

var dateFormat = "dddd، dd MMMM، yyyy"; //note the ، instead of ,

var dateString = "‏الأربعاء‏، 17‏ ذو الحجة‏، 1436";
DateTime date;
DateTime.TryParseExact(dateString, dateFormat, cultureInfo.DateTimeFormat, DateTimeStyles.AllowWhiteSpaces, out date);

No matter what I do, the result of TryParseExact is always false. How do I parse this string properly in .NET?

By the way, if I start from a DateTime object, I can create the exact date string above using ToString()'s overloads on DateTime without problems. I just can't do it the other way around apparently.

Gabriel S. asked Sep 30 '15 08:09



1 Answer

Your date string is 30 characters long and contains four U+200F RIGHT-TO-LEFT MARK (RLM, decimal 8207) characters, but your format string contains none, so the two cannot match exactly.

// This gives a string 26 characters long
var str = new DateTime(2015, 9, 30).ToString(dateFormat, cultureInfo.DateTimeFormat);

RIGHT-TO-LEFT MARK is a format character (Unicode category Cf), not whitespace, so DateTimeStyles.AllowWhiteSpaces does not cover it.
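To see which invisible characters a string actually carries, a small diagnostic along these lines may help. This is only a sketch: `FormatCharInspector` and `FindFormatChars` are hypothetical names made up for illustration, and the check relies on the standard `CharUnicodeInfo.GetUnicodeCategory` call.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration; not part of the .NET API.
public static class FormatCharInspector
{
    // Returns the code point (as "U+XXXX") of every character in the input
    // whose Unicode general category is Format (Cf), such as U+200F.
    public static List<string> FindFormatChars(string input)
    {
        var found = new List<string>();
        foreach (char c in input)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.Format)
            {
                found.Add("U+" + ((int)c).ToString("X4"));
            }
        }
        return found;
    }
}
```

Calling `FormatCharInspector.FindFormatChars(dateString)` on the string from the question would be expected to list U+200F once per mark, while `char.IsWhiteSpace('\u200F')` returns false.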

If the string only contains the marks RLM (U+200F), LRM (U+200E) or ALM (U+061C), you should probably just strip them out. The same goes for the isolates (LRI/RLI/FSI and the closing PDI) and the embeddings (LRE/RLE). You may not want to do that with LRO, though: it is often used with legacy data where the RTL characters are stored in the opposite order, i.e. in left-to-right order. In those cases you may want to actually reverse the characters.
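A minimal sketch of the stripping approach, assuming the input holds no directional overrides. `BidiCleaner` and its members are hypothetical names, not an existing API. Whether the final parse succeeds still depends on the remaining text matching the day and month names your runtime holds for ar-SA.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration; not part of the .NET API.
public static class BidiCleaner
{
    // True for the bidi marks, embeddings, isolates and their terminators.
    // The overrides LRO (U+202D) and RLO (U+202E) are deliberately excluded.
    public static bool IsStrippable(char c)
    {
        return c == '\u061C'                     // ALM
            || c == '\u200E' || c == '\u200F'    // LRM, RLM
            || (c >= '\u202A' && c <= '\u202C')  // LRE, RLE, PDF
            || (c >= '\u2066' && c <= '\u2069'); // LRI, RLI, FSI, PDI
    }

    // True if the input contains LRO or RLO, which may signal reversed legacy data.
    public static bool ContainsOverride(string input)
    {
        return input.IndexOf('\u202D') >= 0 || input.IndexOf('\u202E') >= 0;
    }

    // Copies the input, dropping every strippable character.
    public static string Strip(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (!IsStrippable(c)) sb.Append(c);
        }
        return sb.ToString();
    }

    // Strips the input, then parses it with the ar-SA culture and the Umm Al-Qura calendar.
    public static bool TryParseUmAlQura(string input, string format, out DateTime result)
    {
        var culture = new CultureInfo("ar-SA");
        culture.DateTimeFormat.Calendar = new UmAlQuraCalendar();
        return DateTime.TryParseExact(Strip(input), format, culture.DateTimeFormat,
                                      DateTimeStyles.AllowWhiteSpaces, out result);
    }
}
```

It is probably worth checking `ContainsOverride` first: `Strip` also removes PDF (U+202C), which would leave any override in the input unterminated.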

Parsing dates from arbitrary sources is a hard problem. You need a layered solution: try one method first, then another, in priority order, until one succeeds. There is no 100% solution, though, because people can type whatever they like.
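The layered idea could be sketched roughly as follows. `LayeredDateParser`, `Attempt` and `DefaultAttempts` are hypothetical names, and the three layers shown are just one plausible priority order (not a recommendation from the Unicode report).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the .NET API.
public static class LayeredDateParser
{
    // Matches bidi marks, embeddings, PDF and isolates (overrides are left alone).
    private static readonly Regex BidiControls =
        new Regex("[\u061C\u200E\u200F\u202A-\u202C\u2066-\u2069]");

    // One parsing strategy: returns true and sets result on success.
    public delegate bool Attempt(string input, out DateTime result);

    // Runs the attempts in order and stops at the first one that succeeds.
    public static bool TryParse(string input, IEnumerable<Attempt> attempts, out DateTime result)
    {
        foreach (var attempt in attempts)
        {
            if (attempt(input, out result)) return true;
        }
        result = default(DateTime);
        return false;
    }

    // An assumed default priority order for a given culture and exact format.
    public static Attempt[] DefaultAttempts(CultureInfo culture, string format)
    {
        return new Attempt[]
        {
            // 1. Exact format on the raw input.
            (string s, out DateTime d) => DateTime.TryParseExact(
                s, format, culture, DateTimeStyles.AllowWhiteSpaces, out d),
            // 2. Exact format after removing bidi control characters.
            (string s, out DateTime d) => DateTime.TryParseExact(
                BidiControls.Replace(s, ""), format, culture, DateTimeStyles.AllowWhiteSpaces, out d),
            // 3. Lenient culture-aware parse of the cleaned input.
            (string s, out DateTime d) => DateTime.TryParse(
                BidiControls.Replace(s, ""), culture, DateTimeStyles.AllowWhiteSpaces, out d),
        };
    }
}
```

With the question's variables this would be called as `LayeredDateParser.TryParse(dateString, LayeredDateParser.DefaultAttempts(cultureInfo, dateFormat), out date)`. Further layers, such as other formats or cultures, can be appended to the array.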

See here for more information: http://www.unicode.org/reports/tr9/

Ben answered Oct 05 '22 12:10
