Using J language, I wish to attain a mapping of the counts of elements of an array.
Specifically, I want to input a lowercased English word with two to many letters and get back each pair of letters in the word along with counts of occurences.
I need a verb that gives something like this, in whatever J structure you think is appropriate:
For 'cocoa':
co 2
oc 1
oa 1
For 'banana':
ba 1
an 2
na 2
For 'milk':
mi 1
il 1
lk 1
For 'to':
to 1
(For single letter words like 'a', the task is undefined and will not be attempted.)
(Order is not important, that's just how I happened to list them.)
I can easily attain successive pairs of letters in a word as a matrix or list of boxes:
2(] ;._3)'cocoa'
co
oc
co
oa
]
2(< ;._3)'cocoa'
┌──┬──┬──┬──┐
│co│oc│co│oa│
└──┴──┴──┴──┘
But I need help getting from there to a mapping of pairs to counts.
I am aware of ~. and ~: but I don't just want to return the unique elements or indexes of duplicates. I want a mapping of counts.
NuVoc's "Loopless" page is indicating that / (or /\. or /\) are where I should be looking for accumulation problems. I am familiar with / for arithmetic operations on numeric arrays, but for u/y I don't know what u would have to be to accumulate the list of pairs of letters that would make up y.
(NB. I can already do this in "normal" languages like Java or Python without help. Similar questions on SO are for languages with very different syntax and semantics to J. I am interested in the idiomatic J approach to this sort of problem.)
To get the list of 2-letter combinations I'd use dyadic infix (\
):
2 ]\ 'banana'
ba
an
na
an
na
To count occurrences the primitive that immediately comes to mind is key (/.
)
#/.~ 2 ]\ 'banana'
1 2 2
If you want to match the counts to the letter combinations you can extend the verb to the following fork:
({. ; #)/.~ 2 ]\ 'banana'
┌──┬─┐
│ba│1│
├──┼─┤
│an│2│
├──┼─┤
│na│2│
└──┴─┘
I think that you are looking to map counts of unique items to the items. You can correct me if I am wrong.
Starting with
[t=. 2(< ;._3)'cocoa'
┌──┬──┬──┬──┐
│co│oc│co│oa│
└──┴──┴──┴──┘
You can use ~.
(Nub) to return the unique items in the list
~.t
┌──┬──┬──┐
│co│oc│oa│
└──┴──┴──┘
Then if you compare the nub to the boxed list you get a matrix where the 1's are the positions that match the nub to the boxed pairs in your string
t =/ ~.t
1 0 0
0 1 0
1 0 0
0 0 1
Sum the columns of this matrix and you get the number of times each item of the nub shows up
+/ t =/ ~.t
2 1 1
Then box them so that you can combine the integers along side the boxed characters
<"0 +/ t =/ ~.t
┌─┬─┬─┐
│2│1│1│
└─┴─┴─┘
Combine them by stitching together the nub and the count using ,.
(Stitch)
(~.t) ,. <"0 +/ t =/ ~.t
┌──┬─┐
│co│2│
├──┼─┤
│oc│1│
├──┼─┤
│oa│1│
└──┴─┘
[t=. 2(< ;._3)'banana'
┌──┬──┬──┬──┬──┐
│ba│an│na│an│na│
└──┴──┴──┴──┴──┘
(~.t) ,. <"0 +/ t =/ ~.t
┌──┬─┐
│ba│1│
├──┼─┤
│an│2│
├──┼─┤
│na│2│
└──┴─┘
[t=. 2(< ;._3)'milk'
┌──┬──┬──┐
│mi│il│lk│
└──┴──┴──┘
(~.t) ,. <"0 +/ t =/ ~.t
┌──┬─┐
│mi│1│
├──┼─┤
│il│1│
├──┼─┤
│lk│1│
└──┴─┘
Hope this helps.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With