I'm trying to transform a Python string from its original form to its vowel/consonant combinations.
Eg - 'Dog' becomes 'cvc' and 'Bike' becomes 'cvcv'
In R I was able to employ the following method:
con_vowel <- gsub("[aeiouAEIOU]","V",df$col_name)
con_vowel <- gsub("[^V]","C",con_vowel)
df[["composition"]] <- con_vowel
This would assess whether the character is vowel and if true assign the character 'V', then assess that string and replace anything that wasn't 'V' with 'C', then place the results into a new column called 'composition' within the dataframe.
In Python I have written some code in an attempt to replicate the functionality, but it does not return the desired result. Please see below.
word = 'yoyo'
for i in word.lower():
    if i in "aeiou":
        word = i.replace(i ,'v')
    else: word = i.replace(i ,'c')
print(word)
The theory here is that each character would be evaluated and, if it isn't a vowel, then by deduction it must be a consonant. However the result I get is:
v
I understand why this is happening, but I am no clearer as to how to achieve my desired result.
Please note that I also need the resultant code to be applied to a dataframe column and create a new column from these results.
If you could explain the workings of your answer it would help me greatly.
Thanks in advance.
There's a method for this: translate. It's both efficient and, by default, passes through values that are not found in your translation table (like ' ').
You can use the string library to get all of the consonants if you want.
import pandas as pd
import string
df = pd.DataFrame(['Cat', 'DOG', 'bike', 'APPLE', 'foo bar'], columns=['words'])
vowels = 'aeiouAEIOU'
cons = ''.join(set(string.ascii_letters).difference(set(vowels)))
trans = str.maketrans(vowels+cons, 'v'*len(vowels)+'c'*len(cons))
df['translated'] = df['words'].str.translate(trans)
     words translated
0      Cat        cvc
1      DOG        cvc
2     bike       cvcv
3    APPLE      vcccv
4  foo bar    cvv cvc
It's made for exactly this, so it's fast.
# Supporting code
import numpy as np
import perfplot
import pandas as pd
import string

def with_translate(s):
    vowels = 'aeiouAEIOU'
    cons = ''.join(set(string.ascii_letters).difference(set(vowels)))
    trans = str.maketrans(vowels+cons, 'v'*len(vowels)+'c'*len(cons))
    return s.str.translate(trans)

def with_replace(s):
    return s.replace({"[^aeiouAEIOU]":'c', '[aeiouAEIOU]':'v'}, regex=True)

perfplot.show(
    # build a Series of n words drawn at random from a small sample
    setup=lambda n: pd.Series(np.random.choice(['foo', 'bAR', 'foobar', 'APPLE', 'ThisIsABigWord'], n)),
    kernels=[
        lambda s: with_translate(s),
        lambda s: with_replace(s),
    ],
    labels=['Translate', 'Replace'],
    n_range=[2 ** k for k in range(19)],
    equality_check=None,
    xlabel='len(s)'
)
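To unpack how the translation table works, here is a minimal sketch on plain strings, without pandas. The names VOWELS, CONSONANTS, TABLE and composition are illustrative only and do not come from the answer above.

```python
import string

VOWELS = "aeiouAEIOU"
# Every ASCII letter that is not a vowel counts as a consonant here.
CONSONANTS = "".join(ch for ch in string.ascii_letters if ch not in VOWELS)

# str.maketrans pairs characters by position: each vowel maps to 'v',
# each consonant maps to 'c'. The result is a dict of code points.
TABLE = str.maketrans(VOWELS + CONSONANTS,
                      "v" * len(VOWELS) + "c" * len(CONSONANTS))

def composition(word):
    """Return the vowel/consonant pattern of word."""
    # Characters missing from TABLE (spaces, digits, ...) are left unchanged.
    return word.translate(TABLE)

print(composition("Dog"))                # cvc
print(composition("foo bar"))            # cvv cvc
print(TABLE[ord("a")], TABLE[ord("b")])  # 118 99, the code points of 'v' and 'c'
```

Because the table is built once and then reused, each word is handled in a single pass, and anything that is not an ASCII letter survives as-is.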
You can use replace with regex=True:
import pandas as pd
words = pd.Series(['This', 'is', 'an', 'Example'])
words.str.lower().replace({"[^aeiou]":'c', '[aeiou]':'v'}, regex=True)
Output:
0 ccvc
1 vc
2 vc
3 vcvcccv
dtype: object
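For a single string, the same two-step idea can be sketched with re.sub, which is the closest match to the two gsub calls in the question. The function name composition_regex is made up for this illustration. Note one difference from the translate approach: the negated class [^aeiou] matches anything that is not a vowel, so spaces and digits also become 'c'.

```python
import re

def composition_regex(word):
    """Return the vowel/consonant pattern of word using two regex passes."""
    lowered = word.lower()
    # Pass 1: everything that is not a vowel becomes 'c'.
    marked = re.sub(r"[^aeiou]", "c", lowered)
    # Pass 2: the remaining vowels become 'v'. Doing this second is safe
    # because 'c' is not a vowel and so is not touched again.
    return re.sub(r"[aeiou]", "v", marked)

print(composition_regex("Example"))  # vcvcccv
print(composition_regex("foo bar"))  # cvvccvc (the space is counted as 'c')
```

If non-letters should be preserved, a narrower class such as [b-df-hj-np-tv-z] could replace the negated one in the first pass.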
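Use str.replace with a regex to avoid the loop:
import pandas as pd
df = pd.DataFrame(['Cat', 'DOG', 'bike'], columns=['words'])
# lowercase, then mark every non-vowel as 'c' and every vowel as 'v';
# regex=True is needed on pandas 2.0+, where str.replace defaults to literal matching
df['new_word'] = df['words'].str.lower().str.replace(r"[^aeiou]", 'c', regex=True).str.replace(r"[aeiou]", 'v', regex=True)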
print(df)
  words new_word
0   Cat      cvc
1   DOG      cvc
2  bike     cvcv
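For comparison with the loop in the question: that version prints only 'v' because word is overwritten with a single character on every iteration, so only the last letter's result survives. A minimal corrected sketch collects one symbol per character and joins them at the end. The name composition_loop is hypothetical, and keeping non-letters unchanged is an assumption, not something the question specifies.

```python
def composition_loop(word):
    """Build the vowel/consonant pattern one character at a time."""
    pattern = []
    for ch in word.lower():
        if ch in "aeiou":
            pattern.append("v")
        elif ch.isalpha():
            pattern.append("c")
        else:
            # Assumption: spaces, digits and punctuation are kept as they are.
            pattern.append(ch)
    return "".join(pattern)

print(composition_loop("yoyo"))  # cvcv
print(composition_loop("Dog"))   # cvc
```

On a dataframe this could be applied with df['words'].apply(composition_loop), though the vectorised approaches above are usually the better fit for a whole column.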