Let's say I have two path names: head and tail. They can overlap by any number of segments. If they don't overlap, I'd like to just join them normally. If they do, I'd like to detect the common part and combine them accordingly. To be more specific: if there are repetitions in the names, I'd like to find the longest possible overlapping part. Example:
"/root/d1/d2/d1/d2" + "d2/d1/d2/file.txt" == "/root/d1/d2/d1/d2/file.txt"
and not "/root/d1/d2/d1/d2/d1/d2/file.txt"
Is there a ready-to-use library function for this case, or do I have to implement one?
The os.path.join() method in Python joins one or more path components intelligently. It concatenates the components with exactly one directory separator ('/') following each non-empty part except the last.
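To make the behaviour described above concrete, here is a small sketch. It uses posixpath (the POSIX flavour of os.path) so the separator is '/' on every platform; the variable names and the "/etc/hosts" path are only illustrative. It suggests that join() by itself does not detect overlapping segments:

```python
import posixpath

head = "/root/d1/d2/d1/d2"
tail = "d2/d1/d2/file.txt"

# join() only concatenates with a separator; the shared segments are repeated.
joined = posixpath.join(head, tail)
print(joined)  # /root/d1/d2/d1/d2/d2/d1/d2/file.txt

# An absolute second component discards everything before it.
print(posixpath.join(head, "/etc/hosts"))  # /etc/hosts
```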
I would suggest using difflib.SequenceMatcher followed by get_matching_blocks:
>>> import difflib
>>> p1, p2 = "/root/d1/d2/d1/d2", "d2/d1/d2/file.txt"
>>> sm = difflib.SequenceMatcher(None,p1, p2)
>>> size = sm.get_matching_blocks()[0].size
>>> path = p1 + p2[size:]
>>> path
'/root/d1/d2/d1/d2/file.txt'
And a general solution:
import difflib
import os

def join_overlapping_path(p1, p2):
    sm = difflib.SequenceMatcher(None, p1, p2)
    p1i, p2i, size = sm.get_matching_blocks()[0]
    if size == 0:
        # No common block at all: fall back to a normal join.
        return os.path.join(p1, p2)
    # The string whose match starts at index 0 is the tail; swap if needed.
    head, tail = (p1, p2) if p2i == 0 else (p2, p1)
    return head + tail[size:]
Execution:
>>> join_overlapping_path(p1, p2)
'/root/d1/d2/d1/d2/file.txt'
>>> join_overlapping_path(p2, p1)
'/root/d1/d2/d1/d2/file.txt'
You can use a list comprehension within the join function:
>>> p1="/root/d1/d2/d1/d2"
>>> p2="d2/d1/d2/file.txt"
>>> p1+'/'+'/'.join([i for i in p2.split('/') if i not in p1.split('/')])
'/root/d1/d2/d1/d2/file.txt'
Or, if the only difference is the base name of the second path, you can use os.path.basename to get the base name and append it to p1:
>>> import os
>>> p1+'/'+os.path.basename(p2)
'/root/d1/d2/d1/d2/file.txt'
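One caveat worth checking before relying on the list-comprehension filter: it drops every segment of p2 that appears anywhere in p1, not only the overlapping run at the boundary. The sketch below uses made-up paths (not from the question) where a directory name legitimately repeats after the overlap:

```python
p1 = "/root/d1/d2"
p2 = "d2/d1/file.txt"  # only the leading "d2" overlaps; "d1" is a real subdirectory

# Same filter as above: keep only segments of p2 that never occur in p1.
filtered = [seg for seg in p2.split("/") if seg not in p1.split("/")]
result = p1 + "/" + "/".join(filtered)

print(result)  # /root/d1/d2/file.txt
```

The intended result here would be "/root/d1/d2/d1/file.txt", so this approach seems safe only when no segment name after the overlap also occurs in p1.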
I think this works:
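p1 = "/root/d1/d2/d1/d2"
p2 = "d2/d1/d2/file.txt"

def find_joined_path(p1, p2):
    # Try the longest suffix of p1 first, so the largest overlap wins.
    for i in range(len(p1)):
        if p1[i:] == p2[:len(p1) - i]:
            return p1[:i] + p2
    # No overlap found: join normally.
    return p1.rstrip("/") + "/" + p2

print(find_joined_path(p1, p2))  # /root/d1/d2/d1/d2/file.txt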
Note that it's a general solution that works for any two strings, so it may not be as optimized as a solution that works only with file paths.
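For comparison, here is a minimal sketch of a path-specific variant that compares whole segments instead of characters, so a partial name such as "abc" against "c" is not treated as an overlap. The function name join_overlapping_segments is hypothetical, and the sketch assumes POSIX-style '/' separators and a relative tail:

```python
def join_overlapping_segments(head, tail):
    """Join head and tail, merging the longest run of whole path segments
    that both ends head and starts tail (hypothetical helper)."""
    head_parts = head.rstrip("/").split("/")
    tail_parts = tail.lstrip("/").split("/")
    # Try the largest possible overlap first so the longest match wins.
    for k in range(min(len(head_parts), len(tail_parts)), 0, -1):
        if head_parts[-k:] == tail_parts[:k]:
            return "/".join(head_parts + tail_parts[k:])
    # No shared segments: plain join.
    return "/".join(head_parts + tail_parts)


print(join_overlapping_segments("/root/d1/d2/d1/d2", "d2/d1/d2/file.txt"))
# /root/d1/d2/d1/d2/file.txt
print(join_overlapping_segments("/root/abc", "c/file.txt"))
# /root/abc/c/file.txt
```

Checking candidate overlaps from longest to shortest is what gives the "longest possible overlap" behaviour asked for in the question.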