
How can you efficiently remove duplicate characters from a string?

Is it possible to remove duplicate characters from a string without saving each character you've seen in an array and checking to see if new characters are already in that array? That seems highly inefficient. Surely there must be a quicker method?

Chris asked Dec 18 '22 03:12


2 Answers

You can use a boolean array indexed by character:

bool seen[256] = {false};  // one flag per byte value, all initially unset

For byte-sized ASCII-like characters, the above would be appropriate. For 16-bit Unicode:

bool seen[65536] = {false};  // one flag per 16-bit code unit

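and so on. Then, for each character in the string, it's a single array lookup to see whether its flag has already been set. If it hasn't, set the flag and keep the character; otherwise skip it. That is one pass over the string, with no searching through previously seen characters.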
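
As a rough illustration of the lookup-table idea above, here is a minimal C++ sketch for byte-sized characters. The function name remove_duplicates is made up for this example, and it assumes you want to keep the first occurrence of each character in its original order; treat it as one possible way to write it, not the only one.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `input` with every repeated byte
// removed, keeping the first occurrence of each one in its original order.
std::string remove_duplicates(const std::string& input) {
    bool seen[256] = {false};  // one flag per possible byte value
    std::string result;
    result.reserve(input.size());
    for (unsigned char c : input) {  // unsigned so the array index is never negative
        if (!seen[c]) {
            seen[c] = true;  // first time this byte appears: remember and keep it
            result.push_back(static_cast<char>(c));
        }
    }
    return result;
}
```

Called as remove_duplicates("banana"), this should give "ban". Note that it works on individual bytes, so multi-byte encodings such as UTF-8 would need a different table or a set keyed on code points.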

Greg Hewgill answered Dec 28 '22 08:12


Using LINQ:

using System.Linq;

string someString = "Something I wrote quickly";
// Distinct() returns IEnumerable<char>, so ToArray() is needed to get a char[]
char[] distinctChars = someString.Distinct().ToArray();
string newString = new string(distinctChars);
Daniel A. White answered Dec 28 '22 10:12