
Splitting a string by capital letters

Tags:

python

I currently have the following code, which finds capital letters in a string 'formula': http://pastebin.com/syRQnqCP

Now, my question is: how can I alter that code (disregard the bit within the "if choice = 1:" block) so that each part of the newly broken-up string is put into its own variable?

For example, putting in NaBr would result in the string being broken into "Na" and "Br". I need to put those in separate variables so I can look them up in my CSV file. Preferably the variables would be generated as needed, so if there are three elements, as in MgSO4, O would get its own variable just as Mg and S would.

If this is unclear, let me know and I'll try and make it a bit more comprehensible... No way of doing so comes to mind currently, though. :(

EDIT: Relevant pieces of code:

Function:

def split_uppercase(string):
    x = ''
    for i in string:
        if i.isupper():
            x += ' %s' % i
        else:
            x += i
    return x.strip()

String entry and lookup:

formula = raw_input("Enter formula: ")
upper = split_uppercase(formula)

#Pull in data from form.csv
weight1 = float(formul_data.get(element1.lower()))
weight2 = float(formul_data.get(element2.lower()))
weight3 = float(formul_data.get(element3.lower()))


weightSum = weight1 + weight2 + weight3
print "Total weight =", weightSum
dantdj asked Nov 27 '22 07:11



2 Answers

I think there is a far easier way to do what you're trying to do. Use regular expressions. For instance:

>>> import re
>>> [a for a in re.split(r'([A-Z][a-z]*)', 'MgSO4') if a]
['Mg', 'S', 'O', '4']

If you want the number attached to the right element, just add a digit specifier in the regex:

>>> [a for a in re.split(r'([A-Z][a-z]*\d*)', 'MgSO4') if a]
['Mg', 'S', 'O4']
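If the count is needed as a number rather than glued to the symbol, one possible variation is re.findall with two groups. This is only a sketch: the name parse_formula is made up for illustration, a missing count is assumed to mean 1, and parentheses such as in Ca(OH)2 are not handled.

```python
import re

def parse_formula(formula):
    """Return (symbol, count) pairs. Hypothetical helper, for illustration only."""
    pairs = []
    # Group 1 is the element symbol, group 2 the optional trailing digits.
    for symbol, digits in re.findall(r'([A-Z][a-z]*)(\d*)', formula):
        # Assumption: an element with no trailing digits occurs once.
        pairs.append((symbol, int(digits) if digits else 1))
    return pairs

print(parse_formula('MgSO4'))  # [('Mg', 1), ('S', 1), ('O', 4)]
```

Keeping the count as an integer means it can later be multiplied by the element's weight.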

You don't really want to "put each part in its own variable". That doesn't make sense in general, because you don't know how many parts there are, so you can't know how many variables to create ahead of time. Instead, you want to make a list, like in the example above. Then you can iterate over this list and do what you need to do with each piece.
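As a rough sketch of what iterating over the list could look like for the lookup in the question: the dictionary below is a stand-in for whatever is loaded from form.csv (assumed here to map lowercase symbols to weight strings, with illustrative values), and total_weight is a made-up name. It only covers formulas without counts, such as NaBr.

```python
import re

# Stand-in for the data read from form.csv.
# Assumption: keys are lowercase symbols, values are weights stored as strings.
formul_data = {'na': '22.99', 'br': '79.904'}

def total_weight(formula, data):
    """Sum the weights of every element in a formula (hypothetical helper)."""
    # Split on element symbols and drop the empty strings re.split leaves behind.
    elements = [a for a in re.split(r'([A-Z][a-z]*)', formula) if a]
    # One loop handles any number of elements; no element1, element2, ... needed.
    return sum(float(data[element.lower()]) for element in elements)

print(total_weight('NaBr', formul_data))
```

The same loop works whether the formula has two elements or ten, which is why a list fits better than separately named variables.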

BrenBarn answered Dec 13 '22 15:12



You can use re.split to perform complex splitting on strings.

import re

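def split_upper(s):
    # A list comprehension drops the empty strings re.split leaves behind and,
    # unlike filter(), returns a list on both Python 2 and Python 3.
    return [part for part in re.split(r"([A-Z][^A-Z]*)", s) if part]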

>>> split_upper("fooBarBaz")
['foo', 'Bar', 'Baz']
>>> split_upper("fooBarBazBB")
['foo', 'Bar', 'Baz', 'B', 'B']
>>> split_upper("fooBarBazBB4")
['foo', 'Bar', 'Baz', 'B', 'B4']
sigi answered Dec 13 '22 16:12
