Question

Problem

The main problem we need to use regular expression on MS SQL Server 2019, with the capability of at least the level on the POSIX Regular expression.

Possible solutions

This Q/A from stackoverflow rightly concludes that if you query must rely on regular expressions you shuould use CLR. This Readgate article elaborates this approach more. So one of my colleagues and I proposed this solution, but my other colleague categorically stated that using CLR here would be a huge risk to security and stability, and using external script (Python or R) is more secure.

This seems to be dubious claim, since the user code in the CLR can be managed, so perhaps the opposite is true, but I was not able to persuade my colleague.

In my other question which I wrote in my desperation because I was forced to use external script and still produce a blazing fast query. SQLpro user states in his comment that:

Using Python or R can be worst in terms of security rather than using CLR!

Which I tend to believe.

Questions

So I have two questions:

  1. Which Regexp solution is more secure external script or CLR based (as described here)? And why?

  2. I also proposed to run the python code on the same Windows Server (must be the same server, because of a policy) but with python intrepeter installed on the OS. Because the results are exported into CSV files either way and stored in the SQL Server. So then I would able to use Python's multiprocessing module to achieve the right performance. The answer was the same that running Python inside SQL Server is more secure than in a outside application. Which is also a questionable claim.

Was it helpful?

Solution

FIRST : the security level of SQL CLR is somtehing like airbag + ABS + Lane Sense for a car. When you use the three devices you are at the top level in terms of security. That is the SAFE level of SQL CLR. When one of this disposal is not used your are in the EXTERNAL_ACCESS and without any of these you are in UNSAFE.

SECOND : there is no such security level in any other languages at all, making believe those other languages are safe which is not true.... If you want to drive a car and all the previous devices are missing, so there is no indication of any security level in this car... is this car safe ?

An an example, SQL CLR processes can be dropped out in a sandbox in case of bad misbehavior. This is not true at all for other languages ! I always do the same trick to proove it : using SQL CLR with an infinite recursion. The process wil be shot and SQL Server service will survive. Using an external Java machine to do the same, will drive your SQL Server to be down because Java will acquire all the memory to the detriment of SQL Server...

Now you have the choice !

OTHER TIPS

my other colleague categorically stated that using CLR here would be a huge risk to security and stability

Well, your colleague is categorically wrong (unless they can offer up actual proof of such claims).

Security

Ever since SQLCLR was introduced in SQL Server 2005 people have been saying that it is "unsafe". However, I have yet to see anyone actually prove that it is indeed unsafe. The only supposed "proof" I have seen is someone loading an assembly that writes a file to disk with the claim that since it can write to disk, it is a security vulnerability. But, that's a false claim due to:

  1. the person already had permission to create the assembly
  2. the assembly already had permission to access external resources via TRUSTWORTHY being enabled or a signature / certificate being used.

That's not proof of gaining elevated permissions if you already had elevated permissions in order to a) create and b) execute the code required to make it look like you were gaining elevated permissions.

SQLCLR is not inherently insecure. Sure, it can be use incorrectly to accidentally allow someone to gain elevated permissions, etc, but by itself it is safe and secure. And, it's easy enough to introduce a security vulnerability simply by using EXECUTE AS 'dbo' in a stored procedure or function, and that's pure T-SQL.

Here are several articles that I wrote on the topic of SQLCLR security:

That being said, within the context of using regular expressions, both SQLCLR and External Scripts are "secure" (assuming that you are not allowing ad hoc access to users to submit whatever random Python code they desire).

Usability

Here is a major difference: External Scripts execute via a stored procedure. Meaning, you can perform your RegEx on a single value. If it's possible to integrate the RegEx function into the SELECT and/or WHERE clauses (I know you can pass in a query and return a result set), then it's at best clunky and certainly not easy to maintain. Whereas with SQLCLR, you can create scalar functions and table-valued functions that are quite easy to integrate into queries for proper set-based solutions (and remember: properly coded SQLCLR scalar functions that don't do any data access can participate in parallel plans).

If you want RegEx functions without having to compile anything, there are quite a few available in the Free version of SQL# (SQLsharp), a SQLCLR library that I wrote.


For more info on working with SQLCLR in general, please visit my site: SQLCLR Info

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top