Python - How to split a string by non alpha characters

I'm trying to use Python to parse lines of C++ source code. The only thing I'm interested in is include directives.

    #include "header.hpp"

I want it to be flexible and still work with poor coding styles like:

          #   include"header.hpp"

I have gotten to the point where I can read lines and trim whitespace before and after the #. However, I still need to find out which directive it is by reading the string until a non-alpha character is encountered, regardless of whether it is a space, quote, tab, or angle bracket.

So basically my question is: how can I split a string that starts with alphabetic characters at the first non-alphabetic character?

I think I might be able to do this with regex, but I have not found anything in the documentation that looks like what I want.

Also, if anyone has advice on how to get the file name inside the quotes or angle brackets, that would be a plus.

asked Feb 05 '16 18:02

nickeb96

2 Answers

Your instinct on using regex is correct.

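    import re

    # Split wherever a character is not an ASCII letter.
    parts = re.split(r'[^a-zA-Z]', string_to_split)

The character class [^a-zA-Z] matches any single character that is not an ASCII letter. Note that this splits at every non-letter, so consecutive non-letters produce empty strings in the result, and a file name such as header.hpp is also split at the dot. Use [^a-zA-Z]+ to treat a run of non-letters as one separator.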
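
For the include case specifically, matching may be more convenient than splitting. The sketch below uses re.match to capture the leading run of letters after the # and, if present, the file name in quotes or angle brackets. The name parse_directive and the exact pattern are assumptions made for this example, not something the asker's code already contains.

```python
import re

# Hypothetical pattern: optional whitespace, '#', optional whitespace,
# the directive name (letters only), then an optional "file" or <file>.
DIRECTIVE_RE = re.compile(r'\s*#\s*([A-Za-z]+)\s*(?:"([^"]*)"|<([^>]*)>)?')


def parse_directive(line):
    """Return (directive, filename), or None if the line is not a directive."""
    m = DIRECTIVE_RE.match(line)
    if m is None:
        return None
    directive = m.group(1)
    # At most one of groups 2 and 3 is set; both are None without a file name.
    filename = m.group(2) if m.group(2) is not None else m.group(3)
    return directive, filename


print(parse_directive('      #   include"header.hpp"'))  # ('include', 'header.hpp')
print(parse_directive('#include <vector>'))              # ('include', 'vector')
print(parse_directive('int x = 0;'))                     # None
```

Directives without a quoted or bracketed argument, such as #pragma once, come back with None as the file name, so the caller can filter on the directive being 'include'.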

answered Oct 03 '22 19:10

nlloyd

You can do that with a regex. However, you can also use a simple while loop.

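    def splitnonalpha(s):
        # Start at index 1 so the leading '#' stays in the first part.
        pos = 1
        # Advance while the characters are alphabetic.
        while pos < len(s) and s[pos].isalpha():
            pos += 1
        return (s[:pos], s[pos:])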

Test:

    >>> splitnonalpha('#include"blah.hpp"')
    ('#include', '"blah.hpp"')
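
The same loop idea can be extended to the messier input from the question, still without a regex. This is only a sketch: the helper name parse_include and the way it handles the # and the surrounding whitespace are assumptions for illustration.

```python
def parse_include(line):
    """Return the included file name, or None if line is not an #include."""
    s = line.strip()
    if not s.startswith('#'):
        return None
    s = s[1:].lstrip()  # drop '#' and any whitespace after it

    # Scan the leading run of letters, as in the loop above.
    pos = 0
    while pos < len(s) and s[pos].isalpha():
        pos += 1
    directive, rest = s[:pos], s[pos:].lstrip()
    if directive != 'include' or not rest:
        return None

    # The opening delimiter decides which closing delimiter to look for.
    closers = {'"': '"', '<': '>'}
    closer = closers.get(rest[0])
    if closer is None:
        return None
    end = rest.find(closer, 1)
    return rest[1:end] if end != -1 else None


print(parse_include('      #   include"header.hpp"'))  # header.hpp
print(parse_include('#include <vector>'))              # vector
```

Lines that are not include directives, or whose file name is missing its closing delimiter, yield None.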

answered Oct 03 '22 20:10

kfx

Tags: python, string, regex, parsing
