I am brand new to Python and have been working with it for a few weeks. I have a list of strings and want to remove the first four and last four characters of each string or, alternatively, remove specific character patterns (not just specific characters).
I have been looking through the archives here but don't seem to find a question that matches this one. Most of the solutions I have found are better suited to removing specific characters.
Here's the list of strings I'm working with:
sites = ['www.hattrick.com', 'www.google.com', 'www.wampum.net', 'www.newcom.com']
What I am trying to do is to isolate the domain names and get
[hattrick, google, wampum, newcom]
This question is NOT about isolating domain names from URLs (I have seen the questions about that), but rather about editing specific characters in strings in lists based upon location or pattern.
So far, I've tried .split, .translate, .strip but these don't seem appropriate for what I am trying to do because they either remove too many characters that match the search, aren't good for recognizing a specific pattern/grouping of characters, or cannot work with the location of characters within a string.
Any questions and suggestions are greatly appreciated, and I apologize if I'm asking this question the wrong way etc.
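You can use Python's regular expressions to remove the first n characters from a string with the re.sub() method. Pass the wildcard character . as the pattern and an empty string as the replacement, then use the count argument to limit how many substitutions are made: count=1 removes only the first character, and count=n removes the first n.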
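As a minimal sketch of the re.sub() approach just described (the helper name drop_first_n is hypothetical, introduced only for illustration), something like this should work. Note that count=0 means "replace everything" in re.sub(), so the sketch guards against n being zero or negative:

```python
import re

def drop_first_n(text, n):
    """Hypothetical helper: remove the first n characters of text with re.sub()."""
    if n <= 0:
        # count=0 would tell re.sub() to replace every match, so return early.
        return text
    # '.' matches any single character except a newline;
    # count=n stops after the first n substitutions.
    return re.sub('.', '', text, count=n)

print(drop_first_n('www.hattrick.com', 4))  # hattrick.com
```

For fixed positions, plain slicing (text[n:]) does the same job more directly; the regex version is mainly useful when the thing to remove is a pattern rather than a position.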
Reading your subject, this is an answer, but maybe not what you are looking for.
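for site in sites:
    print(site[:4])    # the first four characters: 'www.'
    print(site[-4:])   # the last four characters: '.com', '.net', ...
    print(site[4:-4])  # what is left after dropping both: 'hattrick', 'google', ...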
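To apply the slice to the whole list at once, a list comprehension is probably the simplest option. This is only a sketch and assumes every string really does have a four-character prefix and a four-character suffix to drop; the name domains is just illustrative:

```python
sites = ['www.hattrick.com', 'www.google.com', 'www.wampum.net', 'www.newcom.com']

# site[4:-4] keeps everything between the first four and last four characters.
domains = [site[4:-4] for site in sites]

print(domains)  # ['hattrick', 'google', 'wampum', 'newcom']
```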
You could also use regex:
import re
re.sub(r'^www\.', '', sites[0])  # removes a leading 'www.' if present -> 'hattrick.com'
re.sub(r'\.\w+$', '', sites[0])  # removes the last dot and everything after it -> 'www.hattrick'
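If you want both ends removed by pattern rather than by position, the two expressions can be combined with | and applied across the list. This is a rough sketch, not the only way to do it; the names pattern and domains are made up for the example:

```python
import re

sites = ['www.hattrick.com', 'www.google.com', 'www.wampum.net', 'www.newcom.com']

# Match either a leading 'www.' or the final dot plus the word characters after it.
pattern = re.compile(r'^www\.|\.\w+$')

# sub() replaces every match in each string, so both ends are removed in one pass.
domains = [pattern.sub('', site) for site in sites]

print(domains)  # ['hattrick', 'google', 'wampum', 'newcom']
```

Unlike .strip('.com'), which treats its argument as a set of characters and would also eat the 'com' at the end of 'newcom', the regex only removes the exact groupings it matches.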