How can I replace odd-numbered occurrences of a pattern inside a string?

https://stackoverflow.com//questions/25048739

21-12-2019

Question

I'm in the process of creating a temporary procedure in SQL because I have a value from a table that is written in Markdown, and it needs to appear as rendered HTML in the web browser (Markdown to HTML conversion).

The string in the column currently looks like this:

Questions about **general computing hardware and software** are off-topic for Stack Overflow unless they directly involve tools used primarily for programming. You may be able to get help on [Super User](http://superuser.com/about)

I'm currently working on bold and italic text. This means that (in the case of bold text) I need to replace the odd-numbered occurrences of the pattern ** with <b> and the even-numbered occurrences with </b>.
I saw REPLACE(), but it performs the replacement on every occurrence of the pattern in the string.

So, how can I replace a substring only at its odd-numbered or only at its even-numbered occurrences?
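
To make the problem concrete, here is a minimal sketch of what REPLACE does with the bold delimiter. The sample literal is made up for illustration. Every occurrence becomes an opening tag, so no closing tag is ever produced:

```sql
-- REPLACE substitutes every occurrence of the pattern, odd and even alike.
SELECT REPLACE(N'**a** and **b**', N'**', N'<b>') AS AllReplaced;
-- AllReplaced: <b>a<b> and <b>b<b>
```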

Update: Some people asked which schema I'm using, so take a look here.

One extra, if you like: converting a Markdown-style hyperlink to an HTML hyperlink doesn't look so simple.

Solution

Using the STUFF function and a simple WHILE loop:

CREATE FUNCTION dbo.fn_OddEvenReplace(@text nvarchar(max),
                                      @textToReplace nvarchar(10),
                                      @oddText nvarchar(10),
                                      @evenText nvarchar(10))
RETURNS nvarchar(max)
AS
BEGIN
    -- int rather than tinyint, so more than 255 occurrences do not overflow
    DECLARE @counter int = 1;

    -- Each pass replaces the first remaining occurrence.
    -- @oddText and @evenText must not contain @textToReplace, or the loop never ends.
    WHILE CHARINDEX(@textToReplace, @text, 1) > 0
    BEGIN
        SELECT @text = STUFF(@text,
                    CHARINDEX(@textToReplace, @text, 1),
                    LEN(@textToReplace),
                    IIF(@counter % 2 = 0, @evenText, @oddText)),  -- IIF needs SQL Server 2012+
                @counter = @counter + 1;
    END
    RETURN @text;
END

And you can use it like this:

-- [column] and [table] are placeholders for your own column and table names
SELECT dbo.fn_OddEvenReplace([column], N'**', N'<b>', N'</b>')
FROM [table];
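
The loop relies on two built-in functions: CHARINDEX locates the first remaining delimiter, and STUFF overwrites only that one occurrence. A minimal sketch of a single iteration, using a made-up sample string and variable names chosen for illustration:

```sql
DECLARE @text nvarchar(100) = N'a **b** c';

-- 1-based position of the first remaining delimiter.
DECLARE @pos int = CHARINDEX(N'**', @text);            -- 3

-- Overwrite the 2 characters at that position with the opening tag.
SET @text = STUFF(@text, @pos, 2, N'<b>');             -- a <b>b** c

-- The next search finds what was originally the second delimiter.
SELECT @pos                     AS FirstPos,           -- 3
       @text                    AS AfterFirstPass,     -- a <b>b** c
       CHARINDEX(N'**', @text)  AS NextPos;            -- 7
```

Because each pass removes the delimiter it has just handled, the search can always restart from position 1, and only the counter has to remember whether the next hit is odd or even.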

UPDATE:

Here it is rewritten as a stored procedure:

CREATE PROC dbo.##sp_OddEvenReplace @text nvarchar(max),
                                  @textToReplace nvarchar(10),
                                  @oddText nvarchar(10),
                                  @evenText nvarchar(10),
                                  @returnText nvarchar(max) output
AS
BEGIN
    -- int rather than tinyint, so more than 255 occurrences do not overflow
    DECLARE @counter int = 1;

    -- Each pass replaces the first remaining occurrence.
    WHILE CHARINDEX(@textToReplace, @text, 1) > 0
    BEGIN
        SELECT @text = STUFF(@text,
                    CHARINDEX(@textToReplace, @text, 1),
                    LEN(@textToReplace),
                    IIF(@counter % 2 = 0, @evenText, @oddText)),
                @counter = @counter + 1;
    END
    SET @returnText = @text;
END
GO

And to execute it:

DECLARE @returnText nvarchar(500);
EXEC dbo.##sp_OddEvenReplace N'**a** **b** **c**', N'**', N'<b>', N'</b>', @returnText output;

SELECT @returnText;  -- <b>a</b> <b>b</b> <b>c</b>

Other tips

As per the OP's request, I have modified my earlier answer to work as a temporary stored procedure. I have left the earlier answer in place, as I believe its use against a table of strings is useful too.

If a tally (or numbers) table with at least 8000 values already exists, the marked section of the CTE can be omitted and the reference to the CTE tally replaced with the name of the existing tally table.

create procedure #HtmlTagExpander(
     @InString   varchar(8000) 
    ,@OutString  varchar(8000)  output
)as 
begin
    declare @Delimiter  char(2) = '**';

    create table #t( 
         StartLocation  int             not null
        ,EndLocation    int             not null

        ,unique clustered (StartLocation desc)  -- left unnamed: a named constraint on a temp table can collide between sessions
    );

    with 
          -- vvv Only needed in absence of Tally table vvv
    E1(N) as ( 
        select 1 from (values
            (1),(1),(1),(1),(1),
            (1),(1),(1),(1),(1)
        ) E1(N)
    ),                                              -- 10^1 = 10 rows
    E2(N) as (select 1 from E1 a cross join E1 b),  -- 10^2 = 100 rows
    E4(N) As (select 1 from E2 a cross join E2 b),  -- 10^4 = 10,000 rows max
    tally(N) as (select row_number() over (order by (select null)) from E4),
          -- ^^^ Only needed in absence of Tally table ^^^

    Delimiter as (
        select len(@Delimiter)     as Length,
               len(@Delimiter)-1   as Offset
    ),
    cteTally(N) AS (
        select top (isnull(datalength(@InString),0)) 
            row_number() over (order by (select null)) 
        from tally
    ),
    cteStart(N1) AS (   -- returns the starting position of each delimiter
        select 
            t.N 
        from cteTally t cross join Delimiter 
        where substring(@InString, t.N, Delimiter.Length) = @Delimiter
    ),
    cteValues as (
        select
             TagNumber = row_number() over(order by N1)
            ,Location   = N1
        from cteStart
    ),
    HtmlTagSpotter as (
        select
             TagNumber
            ,Location
        from cteValues
    ),
    tags as (
        select 
             Location       = f.Location
            ,IsOpen         = cast((TagNumber % 2) as bit)
            ,Occurrence     = TagNumber
        from HtmlTagSpotter f
    )
    insert #t(StartLocation,EndLocation)
    select 
         prev.Location
        ,data.Location
    from tags data
    join tags prev
       on prev.Occurrence = data.Occurrence - 1
      and prev.IsOpen     = 1;

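    set @OutString = @InString;

    -- Relies on rows being processed in clustered index order (descending StartLocation),
    -- so that replacing a later pair never shifts the offsets of an earlier one.
    update this
    set @OutString = stuff(stuff(@OutString,this.EndLocation,  2,'</b>')
                                           ,this.StartLocation,2,'<b>')
    from #t this with (tablockx)
    option (maxdop 1);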
end
go

invoked like this:

declare @InString  varchar(8000)
       ,@OutString varchar(8000);

set @InString = 'Questions about **general computing hardware and software** are off-topic **for Stack Overflow.';
exec #HtmlTagExpander @InString, @OutString out;
select @OutString;

set @InString = 'Questions **about** general computing hardware and software **are off-topic** for Stack Overflow.';
exec #HtmlTagExpander @InString, @OutString out;
select @OutString;
go

drop procedure #HtmlTagExpander;
go

produces as output:

Questions about <b>general computing hardware and software</b> are off-topic **for Stack Overflow.
Questions <b>about</b> general computing hardware and software <b>are off-topic</b> for Stack Overflow.
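
The delimiter search that the procedure builds from a tally CTE can be tried on its own. The following is a minimal sketch with a made-up input string and a smaller, 100-row number source. It only lists the starting position of each delimiter and does no replacing:

```sql
DECLARE @InString varchar(8000) = 'x **y** z';

WITH E1(N) AS (
    SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(N)
),                                                   -- 10 rows
E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b),       -- 100 rows
tally(N) AS (
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E2
)
SELECT t.N AS DelimiterStart
FROM tally t
WHERE t.N <= DATALENGTH(@InString)                   -- one candidate per character
  AND SUBSTRING(@InString, t.N, 2) = '**'            -- keep positions where a delimiter starts
ORDER BY t.N;
-- Returns two rows: 3 and 6
```

Numbering these positions with row_number() then gives the odd (opening) and even (closing) occurrences that the procedure pairs up.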

One option is to use a regular expression, since it makes replacing such patterns much simpler. RegEx functions are not built into SQL Server, so you need to use SQL CLR, either compiled yourself or from an existing library.

For this example I will use the SQL# (SQLsharp) library (of which I am the author), but the RegEx functions are available in the Free version.

SELECT SQL#.RegEx_Replace
(
   N'Questions about **general computing hardware and software** are off-topic\
for Stack Overflow unless **they** directly involve tools used primarily for\
**programming. You may be able to get help on [Super User]\
(https://superuser.com/about)', -- @ExpressionToValidate
   N'\*\*([^\*]*)\*\*', -- @RegularExpression
   N'<b>$1</b>', -- @Replacement
   -1, -- @Count (-1 = all)
   1, -- @StartAt
   'IgnoreCase' -- @RegEx options
);

The pattern above, \*\*([^\*]*)\*\*, simply looks for anything surrounded by double asterisks. In this case you don't need to worry about odd or even occurrences. It also means that you won't get a malformed <b>-only tag if for some reason there is an extra ** in the string. I added two additional test cases to the original string: a complete set of ** around the word they, and an unmatched ** just before the word programming. The output is:

Questions about <b>general computing hardware and software</b> are off-topicfor Stack Overflow unless <b>they</b> directly involve tools used primarily for **programming. You may be able to get help on [Super User](https://superuser.com/about)

which renders as:

Questions about general computing hardware and software are off-topicfor Stack Overflow unless they directly involve tools used primarily for **programming. You may be able to get help on Super User
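
For comparison, the same matched-pairs-only behaviour can be approximated in plain T-SQL, without CLR, by replacing a delimiter only when a second one follows it. This is a sketch and not part of SQL#. The sample string and the variable names @text, @open and @close are chosen for illustration:

```sql
DECLARE @text nvarchar(4000) =
    N'About **hardware** unless **they** involve **programming.';
DECLARE @open  int = CHARINDEX(N'**', @text);
DECLARE @close int;

WHILE @open > 0
BEGIN
    -- Look for a closing delimiter after the opening one.
    SET @close = CHARINDEX(N'**', @text, @open + 2);
    IF @close = 0 BREAK;   -- unmatched delimiter: leave it untouched

    -- Replace the later position first so the earlier offset stays valid.
    SET @text = STUFF(@text, @close, 2, N'</b>');
    SET @text = STUFF(@text, @open,  2, N'<b>');

    -- '<b>' is one character longer than '**', so '</b>' now occupies
    -- positions @close + 1 to @close + 4; resume searching right after it.
    SET @open = CHARINDEX(N'**', @text, @close + 5);
END

SELECT @text AS Result;
-- Result: About <b>hardware</b> unless <b>they</b> involve **programming.
```

Unlike the regular expression, this loop does not reject a single asterisk between the two delimiters, so the two approaches can differ on such input.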

This solution makes use of the techniques described by Jeff Moden in this article on the running total problem in SQL. It is lengthy, but because it uses the quirky update in SQL Server over a clustered index, it holds the promise of being much more efficient on large data sets than cursor-based solutions.

Update - modified below to operate on a table of strings

Assuming the existence of a tally table created like this (with at least 8000 rows):

create table dbo.tally (
     N int not null
    ,unique clustered (N desc)
);
go

with
E1(N) as (
    select 1 from (values
        (1),(1),(1),(1),(1),
        (1),(1),(1),(1),(1)
    ) E1(N)
),                                              -- 10^1 = 10 rows
E2(N) as (select 1 from E1 a cross join E1 b),  -- 10^2 = 100 rows
E4(N) as (select 1 from E2 a cross join E2 b)   -- 10^4 = 10,000 rows max
insert dbo.tally(N)
select row_number() over (order by (select null)) from E4;
go

and a function HtmlTagSPotter defined like this:

create function dbo.HtmlTagSPotter(
     @pString    varchar(8000)
    ,@pDelimiter char(2)
)
returns table with schemabinding as
return
    with
    Delimiter as (
        select len(@pDelimiter)     as Length,
               len(@pDelimiter)-1   as Offset
    ),
    cteTally(N) as (
        select top (isnull(datalength(@pString),0))
            row_number() over (order by (select null))
        from dbo.tally
    ),
    cteStart(N1) as (   --==== Returns starting position of each "delimiter"
        select t.N
        from cteTally t cross join Delimiter
        where substring(@pString, t.N, Delimiter.Length) = @pDelimiter
    ),
    cteValues as (
        select
             ItemNumber = row_number() over(order by N1)
            ,Location   = N1
        from cteStart
    )
    select
         ItemNumber
        ,Location
    from cteValues
go

Then, running the following SQL will perform the required substitution. Note that the inner join at the end prevents any trailing "odd" tags from being converted:

create table #t(
     ItemNo         int             not null
    ,Item           varchar(8000)       null
    ,StartLocation  int             not null
    ,EndLocation    int             not null
    ,unique clustered (ItemNo,StartLocation desc)
);

with data(i,s) as (
    select i,s from (values
         (1,'Questions about **general computing hardware and software** are off-topic **for Stack Overflow.')
        ,(2,'Questions **about **general computing hardware and software** are off-topic **for Stack Overflow.')
    )data(i,s)
),
tags as (
    select
         ItemNo     = data.i
        ,Item       = data.s
        ,Location   = f.Location
        ,IsOpen     = cast((f.ItemNumber % 2) as bit)   -- odd occurrences open a tag
        ,Occurrence = f.ItemNumber
    from data
    cross apply dbo.HtmlTagSPotter(data.s,'**') f
)
insert #t(ItemNo,Item,StartLocation,EndLocation)
select
     data.ItemNo
    ,data.Item
    ,prev.Location
    ,data.Location
from tags data
join tags prev
   on prev.ItemNo     = data.ItemNo
  and prev.Occurrence = data.Occurrence - 1
  and prev.IsOpen     = 1

union all

-- one sentinel row per item; it sorts first and resets @s to the original string
select i,s,8001,8002 from data
;

declare @s varchar(8000);

update this
set @s = this.Item = case when this.StartLocation > 8000
                          then this.Item
                          else stuff(stuff(@s,this.EndLocation,  2,'</b>')
                                             ,this.StartLocation,2,'<b>')
                     end
from #t this with (tablockx)
option (maxdop 1);

-- the row with the smallest StartLocation is processed last and holds the finished string
select Item
from (
    select Item
          ,ROW_NUMBER() over (partition by ItemNo order by StartLocation) as rn
    from #t
) t
where rn = 1
go

producing:

Item
------------------------------------------------------------------------------------------------------------
Questions about <b>general computing hardware and software</b> are off-topic **for Stack Overflow.
Questions <b>about </b>general computing hardware and software<b> are off-topic </b>for Stack Overflow.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow