Get Regular Expression matches

Recently I was given a task at work: someone pointed at a haystack and asked me to find a needle. Specifically, the task involved reading the bodies of hundreds of email messages looking for telephone numbers. Regular expressions and PowerShell make this pretty easy:

([regex]"[2-9]\d{2}-\d{3}-\d{4}").Match("Body with phone number like 345-555-6789 in it.") | %{ $_.Value }
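
One thing to keep in mind with the one-liner: the Match method returns only the first hit, so a message body containing two numbers would yield just one of them. Here is a minimal sketch of the difference, using a made-up sample string (not taken from the real emails) and the Matches method of the same .NET regex object:

```powershell
# Sample text standing in for an email body (made up for illustration).
$body = 'Call 345-555-6789 during the day or 456-555-0123 after hours.'

# Same phone pattern as the one-liner above.
$phonePattern = [regex]'[2-9]\d{2}-\d{3}-\d{4}'

# Match() stops at the first hit; Matches() returns every hit.
$first = $phonePattern.Match($body).Value
$all   = $phonePattern.Matches($body) | ForEach-Object { $_.Value }

$first   # 345-555-6789
$all     # 345-555-6789 and 456-555-0123
```

If one number per message is all you need, Match is fine; for scanning whole bodies, Matches is probably the safer default.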

After completing this task, I figured it would be fairly easy to turn this into a function with predefined regular expressions for common cases. Here is that function:

[cc lang="powershell" width="550"]
Function Get-RegexMatches {
    param(
        [parameter(Mandatory=$true, ValueFromPipeline=$true)][alias("input")][String]$inputText,
        [parameter(Mandatory=$false)][alias("email")][switch]$mail,
        [parameter(Mandatory=$false)][alias("telephoneNumber")][switch]$phone,
        [parameter(Mandatory=$false)][alias("zipCode")][switch]$zip,
        [parameter(Mandatory=$false)][alias("SocialSecurityNumber")][switch]$ssn,
        [parameter(Mandatory=$false)][alias("ip")][switch]$ipv4,
        [parameter(Mandatory=$false)][switch]$ipv6,
        # Both aliases go in a single Alias attribute.
        [parameter(Mandatory=$false)][alias("HostName", "dnsHostName")][switch]$dns,
        [parameter(Mandatory=$false)][alias("regex")][string]$pattern
    )

    process {
        # Patterns are single-quoted so PowerShell does not treat the backtick as an escape; '' is a literal apostrophe.
        if ($mail) { $regexTag = [regex]'[a-z0-9!#$%&''*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&''*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?' }
        if ($phone) { $regexTag = [regex]'[2-9]\d{2}-\d{3}-\d{4}([\ ][x]\d{1,5})?' }
        if ($zip) { $regexTag = [regex]'\d{5}([\-]\d{4})?' }
        if ($ssn) { $regexTag = [regex]'((?!000)(?!666)(?:[0-6]\d{2}|7[0-2][0-9]|73[0-3]|7[5-6][0-9]|77[0-2]))-((?!00)\d{2})-((?!0000)\d{4})' }
        if ($ipv4) { $regexTag = [regex]'0*([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.0*([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.0*([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.0*([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])' }
        if ($ipv6) { $regexTag = [regex]'\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*' }
        if ($dns) { $regexTag = [regex]'([\d\w-.]+?\.(a[cdefgilmnoqrstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmnoz]|e[ceghrst]|f[ijkmnor]|g[abdefghilmnpqrstuwy]|h[kmnrtu]|i[delmnoqrst]|j[emop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrstwy]|qa|r[eouw]|s[abcdeghijklmnortuvyz]|t[cdfghjkmnoprtvwz]|u[augkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]|aero|arpa|biz|com|coop|edu|info|int|gov|mil|museum|name|net|org|pro)(\b|\W(?
[/cc]

"my text with emailuser@somedomain.com" | Get-RegexMatches -mail

"my text with emailuser@somedomain.com" | Get-RegexMatches -host
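
The function above picks its pattern with a chain of switches. As one possible variation, the patterns could live in a lookup table keyed by name, so adding a new one is a one-line change. The sketch below is only an illustration of that idea: the name Find-NamedPattern and its -Name parameter are hypothetical, and only the phone and zip patterns are copied over from the function above.

```powershell
# Hypothetical helper, not the function from this post: patterns live in a
# hashtable so that supporting a new one means adding a single entry.
function Find-NamedPattern {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$InputText,

        [Parameter(Mandatory = $true)]
        [string]$Name
    )

    begin {
        # Phone and zip patterns copied from Get-RegexMatches.
        $patterns = @{
            phone = '[2-9]\d{2}-\d{3}-\d{4}([\ ][x]\d{1,5})?'
            zip   = '\d{5}([\-]\d{4})?'
        }
        if (-not $patterns.ContainsKey($Name)) {
            throw "Unknown pattern name: $Name"
        }
        $regex = [regex]$patterns[$Name]
    }

    process {
        # Emit the text of every match found in the current pipeline item.
        foreach ($m in $regex.Matches($InputText)) {
            $m.Value
        }
    }
}

$sample = 'Reach me at 345-555-6789 x123, ZIP 90210-1234.'
$sample | Find-NamedPattern -Name phone   # 345-555-6789 x123
$sample | Find-NamedPattern -Name zip     # 90210-1234
```

The trade-off is that you lose tab completion on the individual switches, so which style is nicer probably depends on how many patterns you end up collecting.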

Please let me know if you have any suggestions or other use cases for such a function!
