Accidentally splitting unicode chars when truncating strings

I'm saving strings from a third party into my database (PostgreSQL). Sometimes these strings are too long and need to be truncated to fit the column in my table.

Occasionally the truncation point falls in the middle of a Unicode character such as an emoji, which leaves me with a "broken" string that I cannot save to the database. I get the following error: "Unable to translate Unicode character \uD83D at index XXX to specified code page."

I've created a minimal example to show you what I mean. Here I have a string that contains a Unicode character ("Small blue diamond" 🔹 U+1F539). Depending on where I truncate, it gives me a valid string or not.

var myString = @"This is a string before an emoji:🔹 This is after the emoji.";

var brokenString = myString.Substring(0, 34);
// Gives: "This is a string before an emoji:☐" (cut in the middle of the emoji)

var validString = myString.Substring(0, 35);
// Gives: "This is a string before an emoji:🔹"

Is there a way for me to truncate the string without accidentally breaking any Unicode chars?

asked Sep 29 '17 08:09

Joel

1 Answer

A single Unicode character may be represented by several char values (UTF-16 code units). string.Substring counts chars, not characters, so it can cut a character in half. That is the problem you are having.
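To make that concrete: 🔹 (U+1F539) lies outside the Basic Multilingual Plane, so a .NET string stores it as two chars, a high surrogate (\uD83D, the value named in the error message) followed by a low surrogate (\uDD39). The sketch below inspects that pair and shows a minimal guard that backs off by one char when a cut would land between the two halves. TruncateChars is a made-up name for illustration, the code is written as top-level statements (C# 9 or later), and the guard only protects surrogate pairs, not longer sequences such as a base letter plus combining marks.

```csharp
using System;

// Hypothetical helper (not part of .NET): keep at most maxChars UTF-16
// code units, backing off by one if the cut would split a surrogate pair.
static string TruncateChars(string s, int maxChars)
{
    if (maxChars < 0) throw new ArgumentOutOfRangeException(nameof(maxChars));
    if (s.Length <= maxChars) return s;

    int end = maxChars;
    // s[end] is safe to read here because s.Length > maxChars.
    if (end > 0 && char.IsHighSurrogate(s[end - 1]) && char.IsLowSurrogate(s[end]))
        end--;
    return s.Substring(0, end);
}

string emoji = "🔹";
Console.WriteLine(emoji.Length);                                // 2
Console.WriteLine(char.IsHighSurrogate(emoji[0]));              // True
Console.WriteLine(char.IsLowSurrogate(emoji[1]));               // True
Console.WriteLine(char.ConvertToUtf32(emoji, 0).ToString("X")); // 1F539

// A cut at 7 chars would land between the two surrogates, so the
// helper stops one char earlier and drops the emoji entirely.
Console.WriteLine(TruncateChars("emoji:🔹 after", 7).Length);   // 6
```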

You can wrap your string in a StringInfo object (from System.Globalization) and then use its SubstringByTextElements() method, which takes the substring by counting text elements (user-perceived characters) rather than chars.

See a C# demo:

using System;
using System.Globalization; // StringInfo lives here

Console.WriteLine("🔹".Length); // => 2
Console.WriteLine(new StringInfo("🔹").LengthInTextElements); // => 1

var myString = @"This is a string before an emoji:🔹This is after the emoji.";
var teMyString = new StringInfo(myString);
Console.WriteLine(teMyString.SubstringByTextElements(0, 33));
// => This is a string before an emoji:
Console.WriteLine(teMyString.SubstringByTextElements(0, 34));
// => This is a string before an emoji:🔹
Console.WriteLine(teMyString.SubstringByTextElements(0, 35));
// => This is a string before an emoji:🔹T
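One thing to watch for when using this for truncation: SubstringByTextElements throws an ArgumentOutOfRangeException if you ask for more text elements than the string contains, so the call needs a length check in front of it. The sketch below wraps that check in a helper and adds a second variant for the case where the limit is counted in chars rather than text elements. Both helper names are made up for illustration, and which unit your column limit is actually counted in is an assumption you should verify against your database. The code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Globalization;

// Hypothetical helper: keep at most maxElements text elements.
static string TruncateTextElements(string s, int maxElements)
{
    if (maxElements < 0) throw new ArgumentOutOfRangeException(nameof(maxElements));
    if (string.IsNullOrEmpty(s) || maxElements == 0) return string.Empty;

    var info = new StringInfo(s);
    // Guard: SubstringByTextElements throws when the requested length
    // exceeds the number of text elements available.
    return info.LengthInTextElements <= maxElements
        ? s
        : info.SubstringByTextElements(0, maxElements);
}

// Hypothetical helper: keep at most maxChars UTF-16 code units, but only
// cut at the start of a text element so nothing is split.
static string TruncateToCharBudget(string s, int maxChars)
{
    if (maxChars < 0) throw new ArgumentOutOfRangeException(nameof(maxChars));
    if (s.Length <= maxChars) return s;

    int cut = 0;
    // ParseCombiningCharacters returns the char index at which each
    // text element starts, in ascending order.
    foreach (int start in StringInfo.ParseCombiningCharacters(s))
    {
        if (start > maxChars) break;
        cut = start;
    }
    return s.Substring(0, cut);
}

var text = "This is a string before an emoji:🔹This is after the emoji.";
Console.WriteLine(TruncateTextElements(text, 34)); // ends with the emoji
Console.WriteLine(TruncateToCharBudget(text, 34)); // ends before the emoji
Console.WriteLine(TruncateTextElements("short", 100)); // returned unchanged
```

With a budget of 34 chars the second helper drops the emoji altogether, because keeping it would need 35 chars. The result may therefore be shorter than the limit, but it is never a broken string.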

answered Sep 23 '22 00:09

Wiktor Stribiżew

Tags: string, c#, postgresql, unicode
