
Sum of certain letters occurrences in alphanumeric string using Excel

I have an array of alphanumeric strings used in application testing, and I need to count, for each string, the total number of occurrences of the letters "a" through "f" (the result will be used for further data processing):

02599caa0b600 --> should be 4
489455f183c1fb49b --> should be 5
678661081c1h
66410hd2f0kxd94f5bb
8a0339a4417
f6d9f967ts4af6e
886sf7asc3e85ec
03f1fhh3c3a2am
e491b17638m60
1m8h2m07bhaa4tnhbc4
29ma900a80m96m65
ca6a75f505tsac8
956828db8ts7fd1d
cf1d220a59a7851180e
a8b7852xd9e7a9
b85963fbe30718db9976
39b8kx8f85abb1b6
0xxb3b648ab
a8da75f730d45048
588h69d344

This is what the strings look like. They are about 10-30 characters long, and I expect to process about 3-5k of them daily. Assumptions and limitations:

  1. Letter case does NOT matter (happily).
  2. The list of letters may change one day, but will very likely remain a contiguous range, e.g. a-k, d-g, etc., so the solution should be as flexible as possible.
  3. Temporary calculations / helper ranges are allowed, but the shorter the better.
  4. I'd prefer a pure Excel solution, but if that's too complicated, VBA is still an option. That said, a complicated Excel formula is better than "2-lines-of-code" VBA, as long as the formula works as expected.

Things I've tried so far (I've noticed that showing this is very much welcome here):

  • Searched through already answered questions, but found no Excel-based solutions for anything similar. Other languages / approaches are not an option (except VBA).
  • The best I've come up with on my own so far is nested SUBSTITUTE functions, but that's messy and brute-force. If the range changes to c-x, it'll be a nightmare.
  • I'm not a newbie to Excel, but things like complicated array formulas are still a tough nut for me. Sad but true...

Anyway, I'm not asking for a "ready-to-go", "out-of-the-box" solution. I'm asking for help and the right direction / approach for self-learning and for understanding similar problems in the future.

Ksenia asked Dec 12 '22 18:12


1 Answer

You can use SUBSTITUTE without nesting multiple SUBSTITUTE functions. For example, with the text string in A1, this formula in B1 counts all occurrences of the letters a to f (upper or lower case):

=SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),{"a","b","c","d","e","f"},"")))
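To see roughly how this evaluates, assume A1 holds the first sample string from the question, 02599caa0b600 (13 characters). SUBSTITUTE runs once per letter in the array constant, and each LEN difference is the number of times that letter occurs. The per-letter formulas below are only an illustration of the intermediate steps; SUMPRODUCT adds them up in a single cell.

```excel
=LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),"a",""))
=LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),"b",""))
=LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),"c",""))
```

These return 2, 1 and 1. The letters d, e and f do not occur in that string, so they contribute 0 each, and the total is 4, matching the expected result in the question.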

For a longer list of letters, such as c to x, you can use this version to avoid listing them all:

=SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),CHAR(96+ROW(INDIRECT("3:24"))),"")))

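Here 3:24 represents letter 3 (c) through letter 24 (x): ROW(INDIRECT("3:24")) returns the numbers 3 to 24, and CHAR(96+n) turns each into a lowercase letter, because CHAR(97) is "a". You can easily change the range to 1:26 for all letters, 15:25 for o to y, and so on.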
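If you would rather type the boundary letters than work out their positions, one possible variant reads them from two cells. This is only a sketch: D1 and E1 are hypothetical helper cells holding the first and last letter of the range (for example c and x), and they are not part of the original question.

```excel
=SUMPRODUCT(LEN(A1)-LEN(SUBSTITUTE(LOWER(A1),CHAR(ROW(INDIRECT(CODE(LOWER($D$1))&":"&CODE(LOWER($E$1))))),"")))
```

CODE returns the character code of each boundary letter (97 for "a"), so the 96+ offset is no longer needed. With a in D1 and f in E1 this should give the same counts as the first formula. Note that INDIRECT is a volatile function, so with several thousand rows the sheet may recalculate more often than you would like.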

barry houdini answered Apr 06 '23 15:04