There are some similarly named threads on this, but I still couldn't solve my problem. I need to extract a fixed-length number from an Excel string (8 digits in my scenario). The following Excel formula was provided for this purpose:
=MID(A1,FIND("--------",SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"0","-"),"1","-"),"2","-"),"3","-"),"4","-"),"5","-"),"6","-"),"7","-"),"8","-"),"9","-")),8)
It does the job, but I have two issues with it:
1) Most crucially, I'm looking for an exact match. The formula extracts the first 8-digit sequence it finds, including the first 8 digits of a longer number. I'm after 8-digit numbers only, so numbers of 9 or more digits should be ignored (as 7-digit numbers already are).
2) Less important, but it would be great to only look for numbers starting with 1. So I'm really just trying to extract 1??????? as a numeric value. Something like "a12891212a" or "a 12891212 a" should be extracted, whereas "128912120a" or "23456789" should not.
If reasonably doable, I'd prefer an Excel formula-based approach over VBA. Any help is much appreciated!
This can be done with a formula, but the approach depends on your Excel version:
1) Excel 2016, you could still use a formula:
Formula in B1:
=IFERROR(MID(A1,MAX((MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1)="1")*(ISNUMBER(--MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),8)))*(NOT(ISNUMBER(--MID(A1,ROW(A$1:INDEX(A:A,LEN(A1)))+8,1))))*(NOT(ISNUMBER(--MID(A1,ROW(A$1:INDEX(A:A,LEN(A1)))-1,1))))*(ROW(A$1:INDEX(A:A,LEN(A1))))),8),"Nothing found")
Note: This is an array formula and needs to be confirmed with Ctrl+Shift+Enter.
2) Excel 2019, using CONCAT() and FILTERXML():
Formula in B1:
=IFERROR(FILTERXML("<t><s>"&CONCAT(IF(ISNUMBER(--MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1)),MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1),"</s><s>"))&"</s></t>","//s[starts-with(., '1') and string-length(.) =8]"),"Nothing Found")
Note: This is an array formula and needs to be confirmed with Ctrl+Shift+Enter.
3) Excel 365, using the previously mentioned functions plus LET() and SEQUENCE():
Formula in B1:
=IFERROR(FILTERXML("<t><s>"&LET(X,MID(A1,SEQUENCE(LEN(A1)),1),CONCAT(IF(ISNUMBER(--X),X,"</s><s>")))&"</s></t>","//s[starts-with(., '1') and string-length(.) =8]"),"Nothing Found")
The XPath part of the formulas takes care of the actual query, looking for substrings that start with '1' and have a total length of 8. This works even with strings like 'abc123456789abc12345678abc29876543', returning '12345678'.
If you enjoy FILTERXML and XPath, then you might find this interesting.
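To make the FILTERXML step less opaque, here is a rough VBA sketch of the XML string that the CONCAT(IF(...)) part builds before the XPath query runs. DigitRunsXml is a hypothetical helper meant for inspection only; none of the formulas above need it, and it assumes plain ASCII digits.

```vba
' Hypothetical helper, for illustration only (not part of the original answer).
' Mirrors the CONCAT(IF(ISNUMBER(--X),X,"</s><s>")) step of the formulas:
' digits are kept, every other character becomes a node boundary.
Function DigitRunsXml(s As String) As String
    Dim i As Long, ch As String, result As String
    result = "<t><s>"
    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        If ch Like "[0-9]" Then
            result = result & ch            ' keep the digit inside the current node
        Else
            result = result & "</s><s>"     ' close the current node, open the next
        End If
    Next i
    DigitRunsXml = result & "</s></t>"
End Function
```

For "a12891212a" this returns <t><s></s><s>12891212</s><s></s></t>. Each run of digits ends up in its own <s> node, and the XPath predicate then keeps only nodes that start with '1' and have a string length of 8.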
4) Excel 365, Insider edition (at the time of writing), using TEXTSPLIT():
=LET(X,MID(A1,SEQUENCE(LEN(A1)),1),Y,TEXTSPLIT(A1,IF(ISNUMBER(--X)," ",X),,1),FILTER(Y,(--LEFT(Y)=1)*(LEN(Y)=8),"Nothing Found"))
5) VBA: If you must use VBA, I guess a UDF is a good option. Something like:
Function GetStr(str As String, pat As String) As String
    ' Late-bound VBScript regex object, so no library reference is required
    With CreateObject("vbscript.regexp")
        .Pattern = pat
        .Global = True
        If .Test(str) Then
            ' Return the first capture group of the first match
            GetStr = .Execute(str)(0).Submatches(0)
        Else
            GetStr = "Nothing found"
        End If
    End With
End Function
You can call this in B1 as =GetStr(A1,"(?:^|\D)(1\d{7})(?:\D|$)"). This makes use of a regular expression. If you are interested and want to learn more, then this is an interesting read for you.
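To sanity-check the pattern against the examples from the question, a small test routine along these lines could be run from the VBA editor. It is only a sketch: DemoGetStr is a hypothetical name, and it assumes the GetStr function shown above sits in the same VBA project.

```vba
' Hypothetical test routine (not part of the original answer).
' Assumes GetStr from above is available in the same VBA project.
Sub DemoGetStr()
    Dim samples As Variant, s As Variant
    Dim pattern As String
    pattern = "(?:^|\D)(1\d{7})(?:\D|$)"
    samples = Array("a12891212a", "a 12891212 a", "128912120a", "23456789")
    For Each s In samples
        ' CStr converts the Variant element to the String that GetStr expects
        Debug.Print s & " -> " & GetStr(CStr(s), pattern)
    Next s
End Sub
' Output in the Immediate window (Ctrl+G):
' a12891212a -> 12891212
' a 12891212 a -> 12891212
' 128912120a -> Nothing found
' 23456789 -> Nothing found
```

The 9-digit case is rejected because the character after the first eight digits is another digit, so neither \D nor $ can match there.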
I left the pattern outside the UDF on purpose, in case you ever want to change it. The current pattern can be seen in this online demo, where from left to right the engine will look for:
(?: - Open 1st non-capturing group.
^|\D - Either a start-of-string anchor or anything other than a digit.
) - Close 1st non-capturing group.
( - Open 1st capture group.
1\d{7} - A literal 1 followed by 7 digits.
) - Close 1st capture group.
(?: - Open 2nd non-capturing group.
\D|$ - Either anything other than a digit or an end-of-string anchor.
) - Close 2nd non-capturing group.

Here is a simple User Defined Function that looks for substrings made up of digits. It creates an array of those substrings, then looks for an element of that array that has length 8 and a leading character of 1:
Option Explicit
Public Function NineD(s As String) As String
    Dim L As Long, temp As String, wf As WorksheetFunction
    Dim i As Long, arr, a
    Set wf = Application.WorksheetFunction
    temp = s
    L = Len(s)
    ' Replace every non-digit character with a space
    For i = 1 To L
        If Not (Mid(temp, i, 1) Like "[0-9]") Then
            temp = wf.Replace(temp, i, 1, " ")
        End If
    Next i
    ' TRIM collapses runs of spaces, so Split yields only the digit runs
    arr = Split(wf.Trim(temp), " ")
    ' Return the first 8-digit run that starts with 1
    For Each a In arr
        If Len(a) = 8 And Left(a, 1) = "1" Then
            NineD = a
            Exit Function
        End If
    Next a
End Function
User Defined Functions (UDFs) are very easy to install and use:
1) Alt+F11 brings up the VBE window
2) Insert > Module (Alt+I, then M) opens a fresh module
3) Paste the code in and close the VBE window
If you save the workbook, the UDF will be saved with it. If you are using a version of Excel later than 2003, you must save the file as .xlsm rather than .xlsx.
To remove the UDF:
1) Bring up the VBE window (Alt+F11)
2) Clear the code out
3) Close the VBE window
To use the UDF from Excel:
=NineD(A1)
To learn more about macros in general, see:
http://www.mvps.org/dmcritchie/excel/getstarted.htm
and
http://msdn.microsoft.com/en-us/library/ee814735(v=office.14).aspx
and for specifics on UDFs, see:
http://www.cpearson.com/excel/WritingFunctionsInVBA.aspx
Macros must be enabled for this to work!