I have some strings like this
string phoneNumber = "(914) 395-1430";
I would like to strip out the parentheses, the space, and the dash; in other words, keep only the numeric characters.
So the output could look like this
9143951430
How do I get the desired output?
To find digits in a string in Python, you can use the isdigit() method. str.isdigit() returns True if the string is non-empty and every character in it is a digit; otherwise it returns False. It does not extract anything by itself, so to pull the digits out of a string you test each character and keep those for which isdigit() returns True.
A string represents alphanumeric data. This means that a string can contain many different characters, but they are all considered as if they were text, even if the characters are numbers. A string can also contain spaces.
You can do any of the following:
Use regular expressions. You can use a regular expression with either
A negated character class that matches the characters you don't want (anything other than a decimal digit):
private static readonly Regex rxNonDigits = new Regex( @"[^\d]+");
In which case, you can take either of these approaches:
// Simply replace the offending substrings with an empty string.
private string CleanStringOfNonDigits_V1( string s )
{
    if ( string.IsNullOrEmpty(s) ) return s ;
    string cleaned = rxNonDigits.Replace(s, "") ;
    return cleaned ;
}

// Split the string into an array of good substrings,
// using the bad substrings as the delimiter. Then use
// String.Join() with an empty separator to splice things back together.
private string CleanStringOfNonDigits_V2( string s )
{
    if (string.IsNullOrEmpty(s)) return s;
    string cleaned = String.Join( "", rxNonDigits.Split(s) );
    return cleaned ;
}
A positive character class that matches what you do want (decimal digits):
private static Regex rxDigits = new Regex( @"[\d]+") ;
In which case you can do something like this:
// Walk the matches (runs of digits) and append each one to the result.
private string CleanStringOfNonDigits_V3( string s )
{
    if ( string.IsNullOrEmpty(s) ) return s ;
    StringBuilder sb = new StringBuilder() ;
    for ( Match m = rxDigits.Match(s) ; m.Success ; m = m.NextMatch() )
    {
        sb.Append(m.Value) ;
    }
    string cleaned = sb.ToString() ;
    return cleaned ;
}
You're not required to use a regular expression, either.
You could use LINQ directly, since a string is an IEnumerable<char>:
// Filter the characters with char.IsDigit and build a new string from the survivors.
private string CleanStringOfNonDigits_V4( string s )
{
    if ( string.IsNullOrEmpty(s) ) return s;
    string cleaned = new string( s.Where( char.IsDigit ).ToArray() ) ;
    return cleaned;
}
If you're only dealing with Western alphabets, where the only decimal digits you'll see are ASCII, skipping char.IsDigit and comparing against '0' through '9' directly will likely buy you a little performance:
// Range-check both ends: characters below '0', such as '(' or '-', must be rejected too.
private string CleanStringOfNonDigits_V5( string s )
{
    if (string.IsNullOrEmpty(s)) return s;
    string cleaned = new string( s.Where( c => c >= '0' && c <= '9' ).ToArray() ) ;
    return cleaned;
}
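To make the ASCII-versus-Unicode distinction concrete, here is a minimal sketch. The class DigitFilters and its two method names are hypothetical, made up for this illustration. char.IsDigit accepts any Unicode decimal digit, so an Arabic-Indic digit three (U+0663) survives the first filter but not the second.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class DigitFilters
{
    // Keeps every character that char.IsDigit classifies as a Unicode decimal digit.
    public static string UnicodeDigits(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;
        return new string(s.Where(char.IsDigit).ToArray());
    }

    // Keeps only the ASCII digits '0' through '9'.
    public static string AsciiDigits(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;
        return new string(s.Where(c => c >= '0' && c <= '9').ToArray());
    }
}

// DigitFilters.UnicodeDigits("12\u0663") returns "12\u0663"
// DigitFilters.AsciiDigits("12\u0663")   returns "12"
```

Which one you want depends on the data: if the result is going to be parsed or dialed as a plain number, the ASCII-only filter is probably the safer choice.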
Finally, you can simply iterate over the string, chucking the characters you don't want, like this:
// Copy only the ASCII digits into a pre-sized StringBuilder.
private string CleanStringOfNonDigits_V6( string s )
{
    if (string.IsNullOrEmpty(s)) return s;
    StringBuilder sb = new StringBuilder(s.Length) ;
    for (int i = 0; i < s.Length; ++i)
    {
        char c = s[i];
        if ( c < '0' ) continue ;
        if ( c > '9' ) continue ;
        sb.Append(s[i]);
    }
    string cleaned = sb.ToString();
    return cleaned;
}
Or this:
// Compact the digits toward the front of the buffer, then truncate.
private string CleanStringOfNonDigits_V7(string s)
{
    if (string.IsNullOrEmpty(s)) return s;
    StringBuilder sb = new StringBuilder(s);
    int j = 0 ; // next write position
    int i = 0 ; // next read position
    while ( i < sb.Length )
    {
        bool isDigit = char.IsDigit( sb[i] ) ;
        if ( isDigit )
        {
            sb[j++] = sb[i++];
        }
        else
        {
            ++i ;
        }
    }
    sb.Length = j;
    string cleaned = sb.ToString();
    return cleaned;
}
From the standpoint of clarity and cleanness of code, version 1 is what you want. It's hard to beat a one-liner.
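Applied to the phone number from the question, the regex-replace approach might be packaged like this. This is only a sketch: the class name PhoneNumberCleaner and the method name DigitsOnly are hypothetical, and \D+ is used as shorthand for the negated class [^\d]+.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper class, for illustration only.
public static class PhoneNumberCleaner
{
    // Matches one or more consecutive characters that are not decimal digits.
    private static readonly Regex NonDigits = new Regex(@"\D+");

    // Returns the input with every run of non-digits removed.
    // null and "" are returned unchanged.
    public static string DigitsOnly(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;
        return NonDigits.Replace(s, "");
    }
}

// PhoneNumberCleaner.DigitsOnly("(914) 395-1430") returns "9143951430"
```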
If performance matters, my suspicion is that version 7, the last version, is the winner. It creates one temporary (a StringBuilder) and does the transformation in place within the StringBuilder's buffer.
The other options all do more work.
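Since that is a suspicion rather than a measurement, it is worth timing the candidates on your own data. Below is a rough sketch of a timing helper built on System.Diagnostics.Stopwatch; the class name CleanerTiming and the method TimeMs are hypothetical, and a dedicated benchmarking library would give more trustworthy numbers than a hand-rolled loop like this.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper, for illustration only.
public static class CleanerTiming
{
    // Calls the cleaner 'iterations' times on the same input and
    // returns the elapsed wall-clock time in milliseconds.
    public static long TimeMs(Func<string, string> cleaner, string input, int iterations)
    {
        cleaner(input); // warm-up call so first-call JIT cost is not timed

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
        {
            cleaner(input);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

You would pass each version in turn as the cleaner argument, with the same input and iteration count, and compare the returned times over several runs.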