This will do it.
(\d\d\d-?\s?\d\d\d-?\s?\d\d\d\d\s?)?(x\d\d\d\d)?
or shorter equivalent:
(\d{3}-?\s?\d{3}-?\s?\d{4}\s?)?(x\d{4})?
You want to match the full phone number, optionally with space/dash, and make that whole thing optional, then include extension, and make that optional too.
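As a quick way to exercise the pattern, here is a minimal sketch that assumes Python's re module as the host language (the answer itself does not name one). The helper name parse and the sample inputs are illustrative only. Note that because both groups are optional, the pattern also matches an empty string.

```python
import re

# Pattern from the answer above: an optional phone number followed by an optional extension.
PHONE = re.compile(r"(\d{3}-?\s?\d{3}-?\s?\d{4}\s?)?(x\d{4})?")

def parse(text):
    """Return (number, extension) if the whole string matches, else None.

    `parse` is a hypothetical helper written for this illustration.
    """
    m = PHONE.fullmatch(text)
    if m is None:
        return None
    number, ext = m.groups()
    # Group 1 may include a trailing space (the final \s?), so strip it.
    return (number.strip() if number else None, ext)

for sample in ["555-555-1212", "555 555 1212 x1234", "x1234", "555-1212", ""]:
    print(repr(sample), "->", parse(sample))
```

If an empty input should be rejected, the caller would need an extra check, since the pattern alone does not require either part to be present.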
Answer from Samuel Neff on Stack Overflow
A pattern like this would match all your inputs:
\d{3}[- ]?\d{3}[- ]?\d{4}( x\d{4})?|x\d{4}
This will match either:
- three digits
- an optional space or hyphen
- three digits
- an optional space or hyphen
- four digits
- an optional group of:
  - a space and a literal x
  - four digits
or:
- a literal x
- four digits
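Depending on your precise needs, you may need to add start (^) and end ($) anchors to prohibit extra characters around your pattern (e.g. "foo x1234 bar"). Because | has the lowest precedence, wrap the alternation in a group so that both anchors apply to both alternatives:
^(?:\d{3}[- ]?\d{3}[- ]?\d{4}( x\d{4})?|x\d{4})$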
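Where the anchors sit relative to the alternation matters, because ^ and $ bind more tightly than |. The following sketch, assuming Python's re module (not something the answer specifies), compares anchors written directly around the alternation with anchors around a non-capturing group. The names CORE, loose, and strict are made up for this illustration.

```python
import re

# The unanchored pattern: a full number with optional extension, or an extension alone.
CORE = r"\d{3}[- ]?\d{3}[- ]?\d{4}( x\d{4})?|x\d{4}"

# Anchors written directly around the alternation:
# ^ applies only to the left branch and $ only to the right branch.
loose = re.compile(r"^" + CORE + r"$")

# Anchors around a non-capturing group: both anchors apply to either branch.
strict = re.compile(r"^(?:" + CORE + r")$")

for text in ["508-555-1212", "x1234", "foo x1234", "508-555-1212 bar"]:
    print(f"{text!r}: loose={bool(loose.search(text))} strict={bool(strict.search(text))}")
```

In this sketch the ungrouped form still accepts "foo x1234" and "508-555-1212 bar", while the grouped form rejects both.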
Update
If you'd like to ensure that the two separators between the three phone number segments are the same (e.g. 508-555 1212 would not be allowed), the easiest way would be something like this:
\d{3}([- ]?)\d{3}\1\d{4}( x\d{4})?|x\d{4}
The (...) creates a capture group, and because it happens to be the first one in the pattern, it's referred to as group 1. The \1 is a backreference, which will only match the exact string which was matched in group 1.
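The effect of the backreference can be checked with a short sketch. It assumes Python's re module, which is only one of many engines that support \1, and the helper name is_valid is hypothetical.

```python
import re

# Group 1 captures the first separator (possibly empty);
# \1 then requires exactly the same text as the second separator.
SAME_SEP = re.compile(r"\d{3}([- ]?)\d{3}\1\d{4}( x\d{4})?|x\d{4}")

def is_valid(text):
    """Return True if the whole string matches the consistent-separator pattern."""
    return SAME_SEP.fullmatch(text) is not None

for sample in ["508-555-1212", "508 555 1212", "5085551212", "508-555 1212", "508-5551212"]:
    print(repr(sample), "->", is_valid(sample))
```

Because group 1 is allowed to match an empty string, a number with no separators at all still passes, while mixed or half-present separators do not.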
You may try:
^\((\d{3})\)\s*(\d{3})-(\d{4})(?: ext(\d{5}))?$
Explanation of the above regex:
- ^, $ - Represent the start and end of the line, respectively.
- \((\d{3})\) - First capturing group, matching exactly 3 digits inside literal parentheses ().
- \s* - Matches a whitespace character zero or more times.
- (\d{3})- - Second capturing group, capturing exactly 3 digits, followed by a literal -.
- (\d{4}) - Third capturing group, matching exactly 4 digits.
- (?: ext(\d{5}))? - An optional non-capturing group, made up of:
  - (?: - Opens the non-capturing group.
  - ext, preceded by a space - Matches a space followed by the literal ext.
  - (\d{5}) - Fourth capturing group, matching exactly 5 digits.
  - ) - Closes the non-capturing group.
  - ? - Quantifier that makes the whole non-capturing group optional.

You can find the sample demo of the above regex in here.
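To show how the four capturing groups described above come out in practice, here is a minimal sketch that assumes Python's re module rather than the PowerShell used elsewhere in this answer. The function name split_number is hypothetical, and the printed inputs are the two sample strings used elsewhere on this page.

```python
import re

# Same pattern as above: (area) prefix-last4, with an optional " ext" plus 5 digits.
PATTERN = re.compile(r"^\((\d{3})\)\s*(\d{3})-(\d{4})(?: ext(\d{5}))?$")

def split_number(line):
    """Return (ten_digit_number, extension_or_None), or None if the line does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    area, prefix, last4, ext = m.groups()
    # Group 4 is None when the optional non-capturing group did not take part in the match.
    return area + prefix + last4, ext

print(split_number("(123) 455-6789"))
print(split_number("(123) 577-2145 ext81245"))
```

A line in any other layout, such as 123-455-6789, returns None because the pattern requires the parentheses around the first three digits.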
PowerShell commands:
PS C:\Path\To\MyDesktop> $input_path='C:\Path\To\MyDesktop\InputFile.txt'
PS C:\Path\To\MyDesktop> $output_path='C:\Path\To\MyDesktop\outFile.txt'
PS C:\Path\To\MyDesktop> $regex='^\((\d{3})\)\s*(\d{3})-(\d{4})(?: ext(\d{5}))?$'
PS C:\Path\To\MyDesktop> select-string -Path $input_path -Pattern $regex -AllMatches | % { "Phone Number: $($_.matches.groups[1])$($_.matches.groups[2])$($_.matches.groups[3]) Extension: $($_.matches.groups[4])" } > $output_path

After you've removed the parentheses, whitespace, and hyphens, you can split the result on ext to get the two numbers.
Applied to your example
@(
    '(123) 455-6789'
    , '(123) 577-2145 ext81245'
) | % {
    # Strip parentheses, whitespace, and hyphens, then split on the literal 'ext'
    $elements = $_ -replace '[()\s-]+' -split 'ext'
    [PSCustomObject]@{
        phone     = $elements[0]
        extension = $elements[1]
    }
}
returns
phone      extension
-----      ---------
1234556789
1235772145 81245