Extraction of a Group of Digits and Dashes, from postings by Harlan Grove

Location: http://www.mvps.org/dmcritchie/excel/grove_digitsid.htm      
Code: http://www.mvps.org/dmcritchie/excel/code/grove_digitsid.txt
Home page: http://www.mvps.org/dmcritchie/excel/excel.htm
[View without Frames]

Extract Digits

Harlan Grove posted several User Defined Functions (UDF) involving extraction of digits and dashes from a another cell, such as might be found with a catalog description and number combined in some manner in a cell where one might want to extract the catalog number.

Examples

 ABCDEFGHIJK
 1 Descriptive  DD-All Digits-1st DD-1st  LDDid DD1st DD2nd  DD3rd DD-0 DD-M1  DD-M2
 2 3   3 3 3 3 3 > >  3 3 <
 3 0000 1-3  00001-3 0000 0000 0000 0000  1-3 > 0000 1-3 0000
 4 123 abc 33  12333 123 123 123 123 33  > 123 33 123
 5 33 abc 1-2-3-4  331-2-3-4 33 33 1-2-3-4 33  1-2-3-4 > 33 1-2-3-4 33
 6 a12 b34 c 56 d 78  12345678 12 12 12 12 34  56 12 78 56
 7 ab1-3b111cc  1-3111 1 1-3 1-3 1-3 111  > 1-3 111 1-3
 8 abc            
 9 abc -33 567-90  -33567-90 33 -33 567-90 -33  567-90 > -33 567-90 -33
10  beige desk 33, 00-434-9901  3300-434-9901 33  33 00-434-9901 33 00-434-9901 >  33 00-434-9901 33
11   B2: =personal.xls!DigitsDashesAll($A2)     
12   C2: =personal.xls!DigitsFirstID($A2)     
13   D2: =personal.xls!DigitsDashes1stID($A2)     
14   E2: =personal.xls!LongestDigitsDashesID($A2)     
15   F2: =personal.xls!DigitisDashesNthID($A2,1)     
16   G2: =personal.xls!DigitisDashesNthID($A2,2)     
17   H2: =personal.xls!DigitisDashesNthID($A2,3)     
18   I2: =personal.xls!DigitisDashesNthID($A2,0)     
19   J2: =personal.xls!DigitisDashesNthID($A2,-1)     
20   K2: =personal.xls!DigitisDashesNthID($A2,-2)     

Code and remaining topics (#code)

The subroutines below can be found in the code area

Code: http://www.mvps.org/dmcritchie/excel/code/grove_digitsid.txt

Extract all Digits and Dashes   (#DigitsDashesAll)

Concatenate all digits and dashes found in another cell.
Function DigitsDashesAll(ByVal s As String) As String
 'Harlan Grove, worksheet.functions, 2003-10-20
 'concatenate all digits and dashes found in a string
 Dim i as long, n as long
 n = Len(s)
 For i = 1 To n
  If Mid(s, i, 1) Like "[!-0-9]" Then Mid(s, i, 1) = " "
 Next i
 DigitsDashesAll = Application.WorksheetFunction.Substitute(s, " ", "")
End Function

Extract the First Set of Digits  (#DigitsFirstID)

Extract the first set of digits found in another cell.
Function DigitsFirstID(s As String) As String
  'Harlan Grove, worksheet.functions, 2003-10-20
  'extract first string of  digits,  based on
  '-- http://google.com/groups?threadm=_RKkb.24635%24cJ5.3777@www.newsranger.com
    Dim i As Long, j As Long, n As Long
    n = Len(s)
    i = 1
    Do While i <= n And Mid(s, i, 1) Like "[!0-9]"
        i = i + 1
    Loop
    j = i + 1
    Do While j <= n And Mid(s, j, 1) Like "[0-9]"
        j = j + 1
    Loop
    DigitsFirstID = Mid(s, i, j - i)
End Function

Extract the First Set of Digits and Dashes   (#DigitsDashes1stID)

Extract the first set of digits and dashes found in another cell.
Function DigitsDashes1stID(s As String) As String
  'get the longest continuous string of digits and dashes, based on
  'Harlan Grove, worksheet.functions, 2003-10-20
  'extract first string of  digits and dashes
  '-- http://google.com/groups?threadm=_RKkb.24635%24cJ5.3777@www.newsranger.com
    Dim i As Long, j As Long, n As Long
    n = Len(s)
    i = 1
    Do While i <= n And Mid(s, i, 1) Like "[!-0-9]"
        i = i + 1
    Loop
    j = i + 1
    Do While j <= n And Mid(s, j, 1) Like "[-0-9]"
        j = j + 1
    Loop
    DigitsDashes1stID = Mid(s, i, j - i)
End Function

Extract longest set of Digits and Dashes   (#LongestDigitsDashesID)

Extract the longest set of digits found in another cell.
Function LongestDigitsDashesID(s As String) As String
'Harlan Grove, worksheet.functions, 2003-10-20
'extract longest string of contiguous digits and dashes
'-- http://google.com/groups?threadm=G%Xkb.24725%24cJ5.3903@www.newsranger.com
Dim i As Long, j As Long, k As Long, n As Long
j = 0  'unnecessary but pedantic
k = 0  'unnecessary but pedantic
n = Len(s)
Do While i <= n
  i = j + 1
  Do While i <= n And Mid(s, i, 1) Like "[!-0-9]"
    i = i + 1
  Loop
  j = i + 1
  Do While j <= n And Mid(s, j, 1) Like "[-0-9]"
    j = j + 1
  Loop
  If j - i > k Then
    k = j - i
    LongestDigitsDashesID = Mid(s, i, k)
  End If
Loop
End Function

Extract specific set of Digits and Dashes «   (#DigitsDashesNthID)

Extract the nth set of digits and dashes found in another cell.  If the parameter for the nth set is too high the function will return a greater than sign (>), Negative parameter numbers will extract the nth set of digits and dashes found from the right and if the parameter is invalid will return a less than sign (<).

This single function can be used in place of several of the earlier functions. 
  -2 return the next to the last set (penultimate) of digits/dashes, not covered in the earlier UDF.
  -1 return the last set of digits/dashes, not covered in the earlier UDF.
   0 or defaulted as empty (treated same as 1)
   1 return the first set of digits/dashes
   2 return the second set of digits/dashes

Function DigitisDashesNthID(ByVal s As String, Optional n As Long = 1) As Variant
'Harlan Grove, 2003-10-20, worksheet.functions
' http://google.com/groups?threadm=zkWkb.24705%24cJ5.3911%40www.newsranger.com
Dim i As Long, j As Long, k As Long, m As Long, rv As Variant
If Not s Like "*[-0-9]*" Then
DigitisDashesNthID = IIf(n = 0, Array(""), "")
Exit Function
End If
s = s & " "
m = Len(s)
j = 0
k = 0
ReDim rv(1 To Int(m / 2))
For i = 1 To m
  If Mid(s, i, 1) Like "[!-0-9]" And j > 0 Then
    k = k + 1
    rv(k) = Mid(s, j, i - j)
    j = 0
  ElseIf Mid(s, i, 1) Like "[-0-9]" And j = 0 Then
    j = i
  End If
Next i
ReDim Preserve rv(1 To k)
If n = 0 Then
  DigitisDashesNthID = rv  'return an array of all [-0-9] substrings
ElseIf 1 <= n And n <= k Then
  DigitisDashesNthID = rv(n)  'n_th [-0-9] substring from the left
ElseIf -k <= n And n <= -1 Then
  DigitisDashesNthID = rv(k + 1 + n)  
	 'ABS(n)_th [-0-9] substring from the right
Else
  DigitisDashesNthID = IIf(n > 0, ">", "<")  
	 'no n_th or ABS(n)_th [-0-9] substring
End If
End Function

Subroutine to Extract a Number preceded by an arithmetic operator (#extractnumber_afteroperator)

Extract from the string any numeric value that is preceded by an arithmetic operator. (Ron Rosenfeld 2009-05-08)
Option Explicit
Private Sub ExtrConstants(str As String)
Dim re As Object, mc As Object, m As Object
Set re = CreateObject("vbscript.regexp")
    re.Global = True
    re.Pattern = "[-+/*^](\b\d*\.?\d+\b)"
If re.test(str) = True Then
    Set mc = re.Execute(str)
        For Each m In mc
          Debug.Print m.SubMatches(0)
        Next m
End If
End Sub

Subroutine to leave only Digits and Dashes of longest such word   (#LeaveDigits_andDashes)

Sub LeaveDigits_andDashes()
  Dim cell As Range   '2003-10-18 dmcritchie, misc, modified
  If Intersect(Selection, Selection.SpecialCells(xlConstants, _
         xlTextValues)) Is Nothing Then Exit Sub
  Application.ScreenUpdating = False
  Application.Calculation = xlCalculationManual   'pre XL97 xlManual
  For Each cell In Intersect(Selection, _
            Selection.SpecialCells(xlConstants, xlTextValues))
    cell.Value = "'" & LongestDigitsDashesID
  Next cell
  Application.Calculation = xlCalculationAutomatic  'pre XL97 xlAutomatic
  Application.ScreenUpdating = True
End Sub

Function and associated Subroutine to Remove only Digits (#RemoveDigitsAll)

001abc004 --> abc
9345 --> (empty)
abc-def --> abc-def
Function RemoveDigitsAll(ByVal s As String) As String
 'based on Harlan Grove, worksheet.functions, 2003-10-20
 'concatenate all non digits   found in a string
 Dim i As Long, n As Long
 n = Len(s)
 For i = 1 To n
  If Mid(s, i, 1) Like "[0-8]" Then Mid(s, i, 1) = "9"
 Next i
 RemoveDigitsAll = Application.WorksheetFunction.Substitute(s, "9", "")
End Function

Sub LeaveNonDigits()
  Dim cell As Range   '2003-10-18 dmcritchie, misc, modified
  Dim rng As Range
  On Error Resume Next
  Set rng = Intersect(Selection, Selection.SpecialCells(xlConstants))
  On Error GoTo 0
  If rng Is Nothing Then Exit Sub
  Application.ScreenUpdating = False
  Application.Calculation = xlCalculationManual   'pre XL97 xlManual
  For Each cell In rng
    cell.Value = "'" & RemoveDigitsAll(cell.Value)
  Next cell
  Application.Calculation = xlCalculationAutomatic  'pre XL97 xlAutomatic
  Application.ScreenUpdating = True
End Sub

Worksheet Functions to Extract Numbers (#extract)

To Extract number at beginning of a cell, ignore leading spaces, picks up decimal point
  =LOOKUP(9.99999999999999E+307,--LEFT(A1,ROW(INDIRECT("1:"&LEN(A1)))))

Worksheet Functions (#FIND)

 ABCDE
 1 description  count     description  =LEFT(A1,FIND("-",A1&"-")-1)
 2 IL-George  1   IL  =LEFT(A2,FIND("-",A2&"-")-1)
 3 IL-Harry  2   IL  =LEFT(A3,FIND("-",A3&"-")-1)
 4 IL-Tracy  4   IL  =LEFT(A4,FIND("-",A4&"-")-1)
 5 PA  8   PA  =LEFT(A5,FIND("-",A5&"-")-1)
 6 PA-Harriet  16   PA  =LEFT(A6,FIND("-",A6&"-")-1)
 7  7   =SUMIF(A2:A6,"IL*",B2:B6)    
 8  3   =COUNTIF(A2:A6,"IL*")    

Regular Expressions (#regexpr)

Harlan supplied the following reference Other articles describing “Regular Expressions” in other sources not necessarily related to Visual Basic.
http://devedge.netscape.com/library/manuals/2000/javascript/1.5/guide/
http://devedge.netscape.com/library/manuals/2000/javascript/1.5/reference/
http://devedge.netscape.com/library/manuals/2000/javascript/1.5/reference/regexp.html#1193136

applications:

Regular Expressions -- Examples (#regexpr_ex)

 abcdef value like ABC*  FALSE 
 abcdef12  abc* TRUE 
 A91 BCD  A[0-9]* BCD  TRUE 
 A12-345  [A-Z][0-9][0-9]-[0-9][0-9][0-9]  TRUE 
 A99-789  [A-Z]##-###  TRUE 
 A99-790  [A-Za-z]##-###  TRUE 
 b99-791  [A-Za-z]##-###  TRUE 
 1   1  TRUE 
 1   1  TRUE 
 0-9  0-9 TRUE 
 a1 xyz  [abc]* TRUE 
 a  [abc*] TRUE 
 -  [-0-9] TRUE 
 9   [0-9]  TRUE 
 0   [0]  TRUE 
 1   [!abcdefg]  TRUE 
 bac  [!0-9]a* TRUE 
 a  [!-0-9] TRUE 
 a  [!0-9] TRUE 
 1   [!-]  TRUE 
 bbbA91 BCD  *A[0-9]* BCD  TRUE 
 3!0-9  #!0-9 TRUE 
 !-0-9  !-0-9 TRUE 
 aBBBa  a*a TRUE 
 F  [A-Z] TRUE 
 F  [!A-Z] FALSE 
 a2a  a#a TRUE 
 aM5b  a[L-P]#[!c-e] TRUE 
 BAT123khg  B?T* TRUE 
 CAT123khg  B?T* FALSE 
 r@example.com  ?*@?*.?* TRUE 
  You will probably require the following Reference in your VBE or the equivalent.
Microsoft VBScript Regular Expressions 5.5
Private Sub Worksheet_Change(ByVal Target As Range)
  If Target.Column <> 1 And Target.Column <> 3 _
     Then Exit Sub
  On Error GoTo recover
  Cells(Target.Row, 4) = "--errors--"
  Application.EnableEvents = False
  If Cells(Target.Row, 1) = "" Or _
      Cells(Target.Row, 3) = "" Then
   Cells(Target.Row, 4) = ""
  Else
   Cells(Target.Row, 4) = Cells(Target.Row, 1).Value _
      Like (Cells(Target.Row, 3))
  End If
recover:
  Application.EnableEvents = True 
  '--Note if the function abnormally terminates you
  '    will have reenble events...
  '--http://www.mvps.org/dmcritchie/excel/code/proper.txt
  '--http://www.mvps.org/dmcritchie/excel/event.htm
End Sub
Same thing in a User Defined Function:  =RegExpr_Like(A2,C2)
Function RegExpr_LIKE(Cell As String, _
        myLike As String) As Boolean
  If Cell = "" Or myLike = "" Then
     RegExpr_LIKE = 0
     Exit Function
  Else
     RegExpr_LIKE = Cell Like myLike
  End If
End Function
 
RegExpr supports a wide range of regular expression
types.  Here is a short list.  The help file has a more
complete list with instructions.
?   Any single character. 
*   Zero or more characters. 
#   Any single digit (0 9). 
[charlist]     Any single character in charlist. 
[!charlist]    Any single character not in charlist.
Inside brackets the use of exclamation point requires that the
RegExpr not be found, outside any brackets, the exclamation point
must be matched as a character.  Characters within square brackets other than the not sign (exclamation point at the beginning must be in ASCII order.  (Note the not sign in other languages is more likely to be the more conventional ^ symbol.)
x*? Stingy or minimal matching 
$1 $2 ... Subexpressions or remembered patterns 
\1 \2 ... Backreferences 
(pattern) Matches pattern and captures the match. 
(?:pattern) Matches pattern but does 
               not capture the match 
(?=pattern)  Positive lookaheads 
(?!pattern)  Negative lookaheads 
 
x*  Zero or more x's 
x+   One or more x's 
x?   One or zero x's 
x{m,n} At least m and at most n x's 
[A-Z] Any uppercase character A-Z 
.    Any single character except a newline 
\w   Any alphanumeric character ([a-zA-Z0-9_]) 
\d   Any digit (the same as [0-9]) 



\d Digit (character class [0-9]) 
\D Non digit ([!0-9]) 
\w Word character ([a-zA-Z0-9_]) 
\W Non-word character ([!a-zA-Z0-9_]) 
\s Space character ([\t\n ]) 
\S Non-space character ([!\t\n ]) 
$  Matches "end of line" if placed at _
     the end of a regular expression
m///   match
also see Pattern (Java 2 Platform SE v1.4.2)

Cell has Alphabetic character(s)   (#has_alpha)

To find out if a cell has any alphabetic character in it, I think I would just check each character rather than use Regular Expressions.
Function Has_alpha(cell As String) As Boolean
  Dim x As String, i As Long
  For i = 1 To Len(cell)
     If UCase(Mid(cell, i, 1)) >= "A" And _
        UCase(Mid(cell, i, 1)) <= "Z" Then
       Has_alpha = True
       Exit Function
      End If
  Next i
  Has_alpha = False
End Function
   
 AB
 1 1   FALSE 
 2 123   FALSE 
 3 0100 01029250  FALSE 
 4 0100 01029304  FALSE 
 5 A  TRUE 
 6 111B11111  TRUE 
 7 0100 REHAB01  TRUE 

To use
  =personal.xls!has_alpha(A1)

HTML fragment-id is case sensitive in Firefox (#firefox)

The Firefox and Opera web browsers have case sensitive fragment-id references, unlike IE which is not sensitive.  You can use Agent Ransack to check your HTML source files to identify fragment-ids beginning with a capital using Regular Expressions and optional case-sensitive flag.  You may or may not want to keep them as such.

[ ]+id[=]+["]+[A-Z]+     -- to find use of capitals in id="

[(]+[#][A-Z]+               -- to find use of capitals in (#xxxx)

Regular Expression, Return string -- LIKE (#str)

The following was posted by Bernie Dietrick (programming, 2004-06-18)

Sumproduct (#sumproduct)

The following formula was posted by Bob Phillips (2005-10-29, public.excel) to add up cells with numbers or numbers after the end of wording in a cell -- actually any number on the end separated by a space "800 7" would be a 7, "a 800 b" would result in a #VALUE! error.

=SUMPRODUCT(--(RIGHT(" "&"0"&rng.,LEN(" "&"0"&rng.)- SEARCH("@"&" ",SUBSTITUTE(" "&"0"&rng.," "&"","@"&" ",LEN(" "&"0"&rng.)- LEN(SUBSTITUTE(" "&"0"&rng.," "&"","")))))))

where rng. is the range you are working on


Excel questions not directly concerning my web pages are best directed to newsgroups
such as news://msnews.microsoft.com/microsoft.public.excel.misc where Excel users all around the clock from at least 6 continents ask and answer Excel questions.  Posting suggestions and netiquette.  More information on newsgroups and searching newsgroups.    Google Groups (Usenet) Advanced Search Excel newsgroups (or search any newsgroup).
This page was introduced on October 20, 2003. 
[My Excel Pages -- home]    [INDEX to my site and the off-site pages I reference] 
[Site Search -- Excel]     [Go Back]    [Return to TOP

Please send your comments concerning this web page to: David McRitchie send email comments


Copyright © 1997 - 2009,  F. David McRitchie,  All Rights Reserved