Quick implementation of character n-grams for word

I wrote the following code for computing character bigrams; its output is shown right below it. How do I get output that excludes the trailing single character (i.e. 't')? And is there a quicker, more efficient method for computing character n-grams?

    >>> b='student'
    >>> y=[]
    >>> for x in range(len(b)):
    ...     n=b[x:x+2]
    ...     y.append(n)
    ...
    >>> y
    ['st', 'tu', 'ud', 'de', 'en', 'nt', 't']

Here is the result I would like to get: ['st', 'tu', 'ud', 'de', 'en', 'nt']

Thanks in advance for your suggestions.

asked Sep 06 '13 12:09

Tiger1

1 Answer

To generate bigrams:

    In [8]: b='student'

    In [9]: [b[i:i+2] for i in range(len(b)-1)]
    Out[9]: ['st', 'tu', 'ud', 'de', 'en', 'nt']
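The stray 't' in the question comes from how Python slices behave: a slice that runs past the end of a string does not raise an error, it just returns whatever is left. At the last index there is only one character left, so `b[x:x+2]` yields 't'. A minimal sketch of the question's own loop with the range shortened by one (variable names follow the question; `last` is introduced here only for illustration):

```python
b = 'student'

# Slicing past the end is silently truncated rather than raising.
last = b[len(b) - 1:len(b) + 1]   # only one character remains: 't'

# Stop one position early so every slice holds exactly two characters.
y = []
for x in range(len(b) - 1):
    y.append(b[x:x + 2])

print(y)
```

This is the same computation as the list comprehension, just written out as a loop.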

To generalize to a different n:

    In [10]: n=4

    In [11]: [b[i:i+n] for i in range(len(b)-n+1)]
    Out[11]: ['stud', 'tude', 'uden', 'dent']
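If you need this in more than one place, the comprehension can be wrapped in a small helper. The sketch below is one way to do it, not the only one: the names `char_ngrams` and `char_ngrams_zip` are made up for this example, and returning an empty list for words shorter than `n` is an assumption about what you want in that case. The second function is an alternative that zips `n` shifted views of the word and gives the same result.

```python
def char_ngrams(word, n=2):
    """Return the character n-grams of `word`, left to right."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # range() is empty when n > len(word), so short words give [].
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def char_ngrams_zip(word, n=2):
    """Same result, built by zipping n shifted views of the word."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # zip stops at the shortest view, which drops the incomplete tail.
    shifted = (word[i:] for i in range(n))
    return [''.join(chars) for chars in zip(*shifted)]


print(char_ngrams('student', 2))
print(char_ngrams_zip('student', 4))
```

Both versions do work proportional to the length of the word times `n`, so I would not expect a big difference between them; if speed matters, time them on your own data (for example with `timeit`).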

answered Sep 21 '22 15:09

NPE
