split out file name from path in postgres

Tags: sql, postgresql

I have a field that contains Windows file paths, like so:

\\fs1\foo\bar\snafu.txt
c:\this\is\why\i\drink\snafu.txt
\\fs2\bippity\baz.zip
\\fs3\boppity\boo\baz.zip
c:\users\chris\donut.c

What I need to do is find the number of duplicated file names (regardless of what directory they are in). So I want to find "snafu.txt" and "baz.zip", but not "donut.c".

Is there a way in PostgreSQL (8.4) to find the last part of a file path? If I can do that, then I can use count/group to find my problem children.

Chris Curvey asked Dec 06 '12 19:12


3 Answers

You can easily strip the path up to the last directory separator with an expression like

regexp_replace(path, '^.+[/\\]', '')

This also matches the occasional forward slash that some software produces. You can then count the remaining file names like this:

WITH files AS (
    SELECT regexp_replace(my_path, '^.+[/\\]', '') AS filename
    FROM my_table
)
SELECT filename, count(*) AS count
FROM files
GROUP BY filename
HAVING count(*) >= 2;
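
To check the query against the sample paths from the question, here is a minimal self-contained sketch. The inline my_table CTE stands in for the real table (the names are placeholders, as in the query above). The SET line is an assumption: it makes backslashes in ordinary string literals plain characters, which is the default from PostgreSQL 9.1 onward but not on 8.4. With that setting off, the pattern would probably need to be written as E'^.+[/\\\\]' and the sample literals escaped the same way.

```sql
-- Assumption: treat backslashes in '...' literals as plain characters
-- (default since PostgreSQL 9.1; off by default on 8.4).
SET standard_conforming_strings = on;

WITH my_table(my_path) AS (
    -- Sample paths copied from the question.
    VALUES ('\\fs1\foo\bar\snafu.txt'),
           ('c:\this\is\why\i\drink\snafu.txt'),
           ('\\fs2\bippity\baz.zip'),
           ('\\fs3\boppity\boo\baz.zip'),
           ('c:\users\chris\donut.c')
),
files AS (
    -- The greedy .+ consumes everything up to the last / or \.
    SELECT regexp_replace(my_path, '^.+[/\\]', '') AS filename
    FROM my_table
)
SELECT filename, count(*) AS count
FROM files
GROUP BY filename
HAVING count(*) >= 2
ORDER BY filename;
```

With these five rows, only baz.zip and snafu.txt should come back, each with a count of 2. donut.c appears once and is filtered out by the HAVING clause.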
dezso answered Nov 10 '22 15:11


-- Match either separator; the sample paths in the question use backslashes.
select regexp_replace(path_field, '^.+[/\\]', '') from files_table;
Alex Howansky answered Nov 10 '22 13:11


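-- Return the last component of a path, accepting either / or \ as the separator.
CREATE OR REPLACE FUNCTION basename(text) RETURNS text
    AS $basename$
declare
    FILE_PATH alias for $1;
    ret         text;
begin
    -- Greedy match: remove everything up to and including the last separator.
    ret := regexp_replace(FILE_PATH,'^.+[/\\]', '');
    return ret;
end;
$basename$ LANGUAGE plpgsql;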
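
Once the function exists, it can replace the inline expression in the duplicate count. A minimal sketch, assuming the function above has been created and reusing the placeholder names my_table and my_path from the first answer:

```sql
-- Assumes basename(text) from the definition above is installed;
-- my_table and my_path are placeholder names for the real table and column.
SELECT basename(my_path) AS filename, count(*) AS count
FROM my_table
GROUP BY basename(my_path)
HAVING count(*) >= 2;
```

The grouping repeats the function call rather than the output alias, so it does not depend on how an alias named filename would be resolved if the table also had a column of that name.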
James Doherty answered Nov 10 '22 15:11