I have a pandas dataframe with a column 'iso' containing chemical isotope symbols, such as '4He', '16O', '197Au'. I want to label many (but not all) isotopes on a plot using the annotate() function in matplotlib. The label format should have the atomic mass in superscript. I can do this with LaTeX-style formatting:
axis.annotate('$^{4}$He', xy=(x, y), xycoords='data')
I could write dozens of annotate() statements like the one above for each isotope I want to label, but I'd rather automate.
How can I extract the isotope number and name from my iso column?
With those pieces extracted I can make the labels. Let's say we dump them into the variables Num and Sym. Now I can loop over my isotopes and do something like this:
for i in list_of_isotopes:
    (Num, Sym) = df[df.iso==i].iso.str.MISSING_STRING_METHOD(???)
    axis.annotate('$^{%s}$%s' %(Num, Sym), xy=(x[Num], y[Num]), xycoords='data')
Presumably, there is a pandas string method that I can drop into the above, but I'm having trouble coming up with a solution. I've been trying split() and extract() with a few different patterns, but can't get the desired effect.
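This is my answer using split. The regexp can probably be improved; I'm very bad at that sort of thing :-)
(\d+) matches the digits, and ([A-Za-z]+) matches the letters. Because the pattern contains capture groups, split keeps the captured text in its output, so the number and the symbol land in columns 1 and 2, with empty strings in columns 0 and 3.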
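import pandas as pd

df = pd.DataFrame({'iso': ['4He', '16O', '197Au']})
# Raw string avoids invalid-escape warnings for \d
result = df['iso'].str.split(r'(\d+)([A-Za-z]+)', expand=True)
# Keep only the captured number (column 1) and symbol (column 2)
result = result.loc[:, [1, 2]]
result.rename(columns={1: 'x', 2: 'y'}, inplace=True)
print(result)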
Produces:
     x   y
0    4  He
1   16   O
2  197  Au
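As an alternative to split, here is a minimal sketch using str.extract, which the question mentions trying. It assumes every entry in 'iso' is one or more digits followed by one or more letters. The group names num and sym and the new column name label are my own choices for illustration, not anything from the question.

```python
import pandas as pd

df = pd.DataFrame({'iso': ['4He', '16O', '197Au']})

# Named capture groups become the column names of the extracted frame.
# The ^ and $ anchors make entries that do not fit the pattern come back as NaN.
parts = df['iso'].str.extract(r'^(?P<num>\d+)(?P<sym>[A-Za-z]+)$')

# Build one label per row in the '$^{4}$He' form used with annotate()
df['label'] = '$^{' + parts['num'] + '}$' + parts['sym']

print(df['label'].tolist())
```

Each string in df['label'] could then be passed as the first argument to axis.annotate() for the rows you want to label, with xy taken from whatever holds the coordinates for that row.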