I have a files that follow a specific format which look something like this:
test_0800_20180102_filepath.csv
anotherone_0800_20180101_hello.csv
The numbers in the middle represent timestamps, so I would like to extract that information. I know that there is a specific pattern which will always be _time_date_
, so essentially I want the part of the string that lies between the first and third underscores. I found some examples and somehow similar problems, but I am new to Python and I am having trouble adapting them.
This is what I have implemented thus far:
datetime = re.search(r"\d+_(\d+)_", "test_0800_20180102_filepath.csv")
But the result I get is only the date part:
20180102
But what I actually need is:
0800_20180101
We can use search() method from re module to find the first occurrence of the word and then we can obtain the word using slicing. re.search() method will take the word to be extracted in regular expression form and the string as input and and returns a re.
The substr() method extracts a part of a string. The substr() method begins at a specified position, and returns a specified number of characters. The substr() method does not change the original string. To extract characters from the end of the string, use a negative start position.
That's quite simple:
match = re.search(r"_((\d+)_(\d+))_", your_string)
print(match.group(1)) # print time_date >> 0800_20180101
print(match.group(2)) # print time >> 0800
print(match.group(3)) # print date >> 20180101
Note that for such tasks the group operator ()
inside the regexp is really helpful, it allows you to access certain substrings of a bigger pattern without having to match each one individually (which can sometimes be much more ambiguous than matching a larger one).
The order in which you then access the groups is from 1-n_specified
, where group 0
is the whole matched pattern. Groups themselves are assigned from left to right, as defined in your pattern.
On a side note, if you have control over it, use unix timestamps so you only have one number defining both date and time universally.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With