Evaluate consonant/vowel composition of word string in Python

I'm trying to transform a Python string from its original form to its vowel/consonant combinations.

For example, 'Dog' becomes 'cvc' and 'Bike' becomes 'cvcv'.

In R I was able to employ the following method:

   con_vowel <- gsub("[aeiouAEIOU]","V",df$col_name)
   con_vowel <- gsub("[^V]","C",con_vowel)
   df[["composition"]] <- con_vowel

This checks whether each character is a vowel and, if so, replaces it with 'V'. It then replaces everything in that string that isn't 'V' with 'C', and stores the result in a new column called 'composition' in the dataframe.

In Python I have written some code in an attempt to replicate this functionality, but it does not return the desired result. Please see below.

word = 'yoyo'


for i in word.lower():
    if i in "aeiou":
       word = i.replace(i ,'v')
    else: word = i.replace(i ,'c')
print(word)

The theory here is that each character would be evaluated and, if it isn't a vowel, then by deduction it must be a consonant. However the result I get is:

v

I understand why this is happening, but I am no clearer as to how to achieve my desired result.

Please note that I also need the resultant code to be applied to a dataframe column and create a new column from these results.

If you could explain the workings of your answer it would help me greatly.

Thanks in advance.

jimiclapton asked May 04 '20 18:05



3 Answers

There's a built-in method for this: str.translate. It is efficient, and any character that is not in your translation table (like ' ') passes through unchanged.

You can use the string module to build the set of consonants if you want.

import pandas as pd
import string

df = pd.DataFrame(['Cat', 'DOG', 'bike', 'APPLE', 'foo bar'], columns=['words'])

vowels = 'aeiouAEIOU'
cons = ''.join(set(string.ascii_letters).difference(set(vowels)))
trans = str.maketrans(vowels+cons, 'v'*len(vowels)+'c'*len(cons))

df['translated'] = df['words'].str.translate(trans)

     words translated
0      Cat        cvc
1      DOG        cvc
2     bike       cvcv
3    APPLE      vcccv
4  foo bar    cvv cvc

It's made for exactly this, so it's fast.
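
To show how the translation table works, here is a minimal plain-Python sketch without pandas. The names `VOWELS`, `CONSONANTS`, `TABLE` and `to_cv` are illustrative and not part of the code above, and treating every non-vowel ASCII letter (including 'y') as a consonant is an assumption.

```python
import string

VOWELS = "aeiouAEIOU"
# Assumption: every ASCII letter that is not a vowel counts as a consonant.
CONSONANTS = "".join(ch for ch in string.ascii_letters if ch not in VOWELS)

# str.maketrans pairs the i-th character of the first argument with the
# i-th character of the second, so both strings must have equal length.
TABLE = str.maketrans(VOWELS + CONSONANTS,
                      "v" * len(VOWELS) + "c" * len(CONSONANTS))


def to_cv(word):
    """Map vowels to 'v' and consonants to 'c'; other characters pass through."""
    return word.translate(TABLE)


print(to_cv("Dog"))      # cvc
print(to_cv("foo bar"))  # cvv cvc
```

Because the table is built once and each character is looked up directly, no loop or regex is needed in your own code.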

[Figure: perfplot timing comparison of the Translate and Replace approaches]

# Supporting code
import perfplot
import numpy as np  # needed for np.random.choice in the setup below
import pandas as pd
import string

def with_translate(s):
    vowels = 'aeiouAEIOU'
    cons = ''.join(set(string.ascii_letters).difference(set(vowels)))
    trans = str.maketrans(vowels+cons, 'v'*len(vowels)+'c'*len(cons))

    return s.str.translate(trans)


def with_replace(s):
    return s.replace({"[^aeiouAEIOU]":'c', '[aeiouAEIOU]':'v'}, regex=True)


perfplot.show(
    setup=lambda n: pd.Series(np.random.choice(['foo', 'bAR', 'foobar', 'APPLE', 'ThisIsABigWord'], n)), 
    kernels=[
        lambda s: with_translate(s),
        lambda s: with_replace(s),
    ],
    labels=['Translate', 'Replace'],
    n_range=[2 ** k for k in range(19)],
    equality_check=None,  
    xlabel='len(s)'
)
ALollz answered Sep 28 '22 02:09


You can use replace with regex=True:

import pandas as pd

words = pd.Series(['This', 'is', 'an', 'Example'])
words.str.lower().replace({"[^aeiou]":'c', '[aeiou]':'v'}, regex=True)

Output:

0       ccvc
1         vc
2         vc
3    vcvcccv
dtype: object
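
The order of the two substitutions matters here. As a rough plain-Python analogue using the `re` module (the function names `cv_consonants_first` and `cv_vowels_first` are made up for illustration), the sketch below shows what happens when the vowel pattern is applied first. Note that `[^aeiou]` also matches spaces, digits and punctuation, so those become 'c' as well.

```python
import re


def cv_consonants_first(word):
    # Step 1: every non-vowel becomes 'c'. Step 2: remaining vowels become 'v'.
    # 'c' is not a vowel, so step 2 leaves the output of step 1 alone.
    step1 = re.sub(r"[^aeiou]", "c", word.lower())
    return re.sub(r"[aeiou]", "v", step1)


def cv_vowels_first(word):
    # Reversed order: the 'v' markers written in step 1 are themselves
    # non-vowels, so step 2 overwrites them with 'c'.
    step1 = re.sub(r"[aeiou]", "v", word.lower())
    return re.sub(r"[^aeiou]", "c", step1)


print(cv_consonants_first("Example"))  # vcvcccv
print(cv_vowels_first("Example"))      # ccccccc
```

The R code in the question sidesteps the same problem by writing an uppercase 'V' first and then replacing everything that is not 'V'.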
Quang Hoang answered Sep 28 '22 04:09


Use Series.str.replace with a regex to avoid the loop:

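import pandas as pd

df = pd.DataFrame(['Cat', 'DOG', 'bike'], columns=['words'])
# regex=True is needed on pandas 2.0+, where str.replace matches literally by default
df['new_word'] = df['words'].str.lower().str.replace(r"[^aeiou]", 'c', regex=True).str.replace(r"[aeiou]", 'v', regex=True)
print(df)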

  words new_word
0   Cat      cvc
1   DOG      cvc
2  bike     cvcv
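
For comparison, the loop from the question can also be repaired directly. The original version reassigned `word` on every iteration, so only the last character's result survived. A minimal sketch of one possible fix follows; the name `loop_to_cv` is hypothetical, and leaving non-letters untouched is an assumption about the desired behaviour.

```python
def loop_to_cv(word):
    # Collect one output character per input character instead of
    # reassigning `word` inside the loop.
    pieces = []
    for ch in word.lower():
        if ch in "aeiou":
            pieces.append("v")
        elif ch.isalpha():
            pieces.append("c")  # any other letter is treated as a consonant
        else:
            pieces.append(ch)   # keep spaces, digits and punctuation as-is
    return "".join(pieces)


print(loop_to_cv("yoyo"))  # cvcv
```

With a DataFrame like the one above, `df['new_word'] = df['words'].map(loop_to_cv)` would apply it to each row, though this is likely slower than the vectorised string methods on large columns.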
It_is_Chris answered Sep 28 '22 03:09