
What is the difference between shlex.split() and re.split()?

Tags: python, shlex

I recently used shlex.split() to split a command string into the argument list for subprocess.Popen(). I recall that, a long time ago, I also used re.split() to split a string on a specific delimiter. Can someone point out the essential difference between them? In which scenario is each function best suited?

swordfish asked Jan 08 '16 14:01



1 Answer

shlex.split() is designed to split a string the way a POSIX shell splits a command line into words.

That means it respects quotes and backslash escapes, so a quoted phrase stays a single token:

>>> import shlex
>>> shlex.split("this is 'my string' that --has=arguments -or=something")
['this', 'is', 'my string', 'that', '--has=arguments', '-or=something']
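Since the question mentions subprocess, here is a minimal sketch of how the split result can be passed on as an argument list. The command string is made up for illustration, and sys.executable is used only so the example does not depend on any particular program being installed; it assumes Python 3.7+ for capture_output.

```python
import shlex
import subprocess
import sys

# Hypothetical command-line tail: two quoted arguments that each contain spaces.
command = "-c 'import sys; print(sys.argv[1])' 'hello world'"

# shlex.split() keeps each quoted phrase together as a single argument.
args = [sys.executable] + shlex.split(command)

# Passing a list (no shell=True) hands the arguments to the child unchanged.
result = subprocess.run(args, capture_output=True, text=True)
print(result.stdout.strip())  # hello world
```

A plain str.split() on the same string would have broken 'hello world' into two arguments and the child process would have printed only part of it.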

re.split() just splits wherever the pattern you give it matches; it knows nothing about quoting.

>>> import re
>>> re.split(r'\s', "this is 'my string' that --has=arguments -or=something")
['this', 'is', "'my", "string'", 'that', '--has=arguments', '-or=something']
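Where re.split() fits well is delimiter-based splitting of ordinary text, which is the use the question describes. A small sketch with an invented record string:

```python
import re

# Hypothetical record with inconsistent delimiters and stray whitespace.
record = "alpha, beta;gamma ,  delta"

# Split on a comma or semicolon, swallowing any whitespace around it.
fields = re.split(r"\s*[,;]\s*", record)
print(fields)  # ['alpha', 'beta', 'gamma', 'delta']

# A capturing group in the pattern keeps the delimiters in the result.
with_delims = re.split(r"([,;])", "a,b;c")
print(with_delims)  # ['a', ',', 'b', ';', 'c']
```

Here the delimiter is the interesting part and quoting rules are irrelevant, so a regular expression is the simpler tool.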

Trying to define your own regex to work like shlex.split is needlessly complicated, if it's even possible.
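To give a feel for why, here is a naive regex attempt next to shlex.split() on the same input. The pattern is only an illustrative guess at what such a regex might look like, not a recommended approach.

```python
import re
import shlex

# Input with a single quote inside double quotes and a backslash-escaped space.
text = r'''say "it's fine" and\ more'''

# Naive attempt: a double-quoted run, a single-quoted run, or a run of non-spaces.
naive = re.findall(r'"[^"]*"|\'[^\']*\'|\S+', text)
print(naive)    # ['say', '"it\'s fine"', 'and\\', 'more']

# shlex strips the quotes and honours the escaped space.
shell_like = shlex.split(text)
print(shell_like)  # ['say', "it's fine", 'and more']
```

The regex leaves the surrounding quotes in the token and treats the escaped space as a separator; fixing each such case by hand quickly ends up reimplementing what shlex already does.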

To really see the differences between the two, you can always Use the Source, Luke:

>>> import re, shlex
>>> re.__file__
'/usr/lib/python3.5/re.py'
>>> shlex.__file__
'/usr/lib/python3.5/shlex.py'

Open these files in your favorite editor and start poking around; you'll find that they operate quite differently.

Wayne Werner answered Sep 23 '22 18:09