I have a text file for processing, which has some numbers. I want JUST text in it, and nothing else. I managed to remove the punctuation marks, but how do I remove the numbers? I want this using C# code.
Also, I want to remove words longer than 10 characters. How do I do that using regular expressions?
There's no way to remove a number from a file in place; you have to write out the whole file again, without the number you want to delete.
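A minimal C# sketch of that read, strip, and rewrite approach. The class and method names (`FileDigitStripper`, `RemoveDigits`, `RewriteWithoutDigits`) are made up for illustration, and it assumes the file is small enough to load into memory in one go.

```csharp
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class FileDigitStripper
{
    // Returns the text with every ASCII digit 0-9 removed.
    public static string RemoveDigits(string text)
    {
        return Regex.Replace(text, "[0-9]", "");
    }

    // Reads the whole input file, strips the digits,
    // and writes the result to outputPath.
    public static void RewriteWithoutDigits(string inputPath, string outputPath)
    {
        string original = File.ReadAllText(inputPath);
        File.WriteAllText(outputPath, RemoveDigits(original));
    }
}
```

Calling `FileDigitStripper.RewriteWithoutDigits("input.txt", "output.txt")` would produce the cleaned copy; both file names are placeholders. Writing to a separate output file keeps the original intact if something goes wrong.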
In R, to remove a dot and number at the end of a string, we can use the gsub function. It searches each string in the vector for the pattern of a dot followed by a number at the end of the string, and the pattern is removed by giving empty double quotes (with no space between them) as the replacement. The vector itself is passed as the last argument.
To delete the nth digit from the start of a number: count the number of digits, then loop that many times with a counter i, taking one digit per iteration. If i is equal to (number of digits - n), skip that digit; otherwise add the ith digit as [ new_number = (new_number * 10) + ith_digit ].
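Here is one way those steps might look in C#, as a sketch rather than a reference implementation. It assumes the digits are read from the right with i starting at 0, which is what makes the skip condition i == (number of digits - n) line up with the nth digit from the left. Instead of the new_number * 10 formula it uses a place-value multiplier, a swapped-in variation that avoids having to reverse the result afterwards. The names `DigitTools` and `DeleteNthDigitFromStart` are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class DigitTools
{
    // Removes the n-th digit (1-based, counted from the left)
    // of a non-negative integer and returns the remaining number.
    public static long DeleteNthDigitFromStart(long number, int n)
    {
        if (number < 0) throw new ArgumentOutOfRangeException(nameof(number));

        int digitCount = number.ToString().Length;
        if (n < 1 || n > digitCount) return number; // nothing to delete

        long result = 0;
        long place = 1; // place value of the next digit we keep

        // Walk the digits from right to left; i counts from 0.
        for (int i = 0; i < digitCount; i++)
        {
            long digit = number % 10;
            number /= 10;

            if (i == digitCount - n) continue; // this is the n-th digit from the left

            result += digit * place;
            place *= 10;
        }
        return result;
    }
}
```

For example, `DigitTools.DeleteNthDigitFromStart(12345, 2)` returns 1345.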
You can do this with a regex:
using System.Text.RegularExpressions;

string withNumbers = "abc123def"; // string with numbers
string withoutNumbers = Regex.Replace(withNumbers, "[0-9]", ""); // "abcdef"
Use this regex to remove words with more than 10 characters:
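\b\w{11,}\b
The quantifier {min,max} sets the minimum and maximum length to match. Leaving out the maximum, as in {11,}, means "11 or more", which is what "more than 10 characters" needs. Don't put a space inside the braces, or the pattern stops working as a quantifier. The \b on each side makes sure whole words are matched rather than pieces of them.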
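A short C# sketch of removing long words with `Regex.Replace`, assuming a "word" is a run of `\w` characters (letters, digits, and underscore). The names `LongWordFilter` and `RemoveLongWords` are invented for this example, and collapsing the leftover spaces is an extra step the question did not ask for.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class LongWordFilter
{
    // Removes every word longer than maxLength characters, then collapses
    // the runs of spaces left behind and trims the ends of the string.
    public static string RemoveLongWords(string text, int maxLength = 10)
    {
        // For maxLength = 10 this builds the pattern \b\w{11,}\b
        string pattern = @"\b\w{" + (maxLength + 1) + @",}\b";
        string withoutLongWords = Regex.Replace(text, pattern, "");
        return Regex.Replace(withoutLongWords, " {2,}", " ").Trim();
    }
}
```

For example, `LongWordFilter.RemoveLongWords("an extraordinarily big word")` returns "an big word", while a word of exactly 10 characters is kept.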
To check that a string contains only letters and nothing else (because I see you also want to remove the punctuation marks):
Regex.IsMatch(input, @"^[a-zA-Z]+$");
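Note that `Regex.IsMatch` only tests the input; it does not remove anything. If the goal is to strip everything that is not a letter, something like the following C# sketch could work. It assumes you want to keep whitespace so the words stay separated, and that only ASCII letters count as "text". The names `LetterOnlyText`, `IsLettersOnly`, and `KeepLettersAndWhitespace` are made up for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class LetterOnlyText
{
    // True when the input consists of ASCII letters only.
    public static bool IsLettersOnly(string input)
    {
        return Regex.IsMatch(input, @"^[a-zA-Z]+$");
    }

    // Removes every character that is neither an ASCII letter nor whitespace,
    // which drops digits and punctuation in one pass.
    public static string KeepLettersAndWhitespace(string input)
    {
        return Regex.Replace(input, @"[^a-zA-Z\s]", "");
    }
}
```

With this, `LetterOnlyText.IsLettersOnly("Hello")` is true, `LetterOnlyText.IsLettersOnly("Hello, world 42")` is false, and `LetterOnlyText.KeepLettersAndWhitespace("Hello, world 42!")` gives "Hello world " (with a trailing space where the number was).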