Question

Currently I have a PowerShell process that scans a SQL Server table and reads a column containing text. We have characters in the extended-ASCII range that are breaking our downstream processes. I was originally identifying these differences in SQL Server, but it is terrible at text parsing, so I decided to write a PowerShell script that uses regular expressions instead. I will post the code for that as well, to help other lost souls looking for such a regex.

$x = [regex]::Escape("\``~!@#$%^&*()_|{}=+:;`"'<,>.?/-")    # punctuation characters to allow, escaped for use inside a character class
$y = "([^A-Za-z0-9 \x5B\x5D\t\n" + $x + "])"                # match any character outside the allowed set ([ and ] added as hex escapes)
$a = [regex]::Match($Row[1], $y)                            # first disallowed character in the row's text column
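
For context, here is a minimal usage sketch (not from the original post); it assumes $Row is a data row whose second column holds the text being scanned, and simply reports the first disallowed character it finds:

# Hypothetical reporting step built on the regex above
$m = [regex]::Match($Row[1], $y)
if ($m.Success) {
    $ch = $m.Value[0]
    Write-Output ("Disallowed character '{0}' (code point {1}) at index {2}" -f $ch, [int]$ch, $m.Index)
}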

The problem comes when I want to display some of the character values back in an email saying that I'm scrubbing the data: the numbers don't come out the same as in SQL Server. Caution: I'm not sure your results will be the same when copying from your browser, because these are extended-ASCII characters.

In PowerShell

[int]"–"[-0]; #result 8211 that appears to be wrong
[int]" "[-0]; #result 160 this appears to be right

In SQL Server

select ASCII('–') --result 150
select ASCII(' ') --result 160

Is there anything in PowerShell that will give the same results as SQL Server's ASCII lookup?

TL;DR: Is the above the correct way to look up ASCII values in PowerShell? It works for most values, but not for ASCII value 150 (the long dash that comes from Word).


The solution

In SQL Server,

select UNICODE('–')

will return 8211.

I don't think PowerShell supports ANSI, except for I/O; it works in Unicode internally, so casting a character to [int] gives you its Unicode code point (8211 for the en dash), not its Windows-1252 byte value (150).
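
If you need the number SQL Server's ASCII() returns, you can run the character through the ANSI code page yourself. Below is a minimal sketch, assuming the column's collation uses the Windows-1252 code page (common for Latin1 collations); on PowerShell Core the code-pages encoding provider may need to be available:

$ch = '–'   # en dash pasted from Word

# Unicode code point; matches SQL Server's UNICODE('–') = 8211
[int][char]$ch

# Windows-1252 byte value; matches SQL Server's ASCII('–') = 150 under a Latin1 collation (assumption)
[System.Text.Encoding]::GetEncoding(1252).GetBytes($ch)[0]

This keeps the reporting in PowerShell while showing the same numbers SQL Server does.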

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow