SQL date query: how long has this condition been true

https://stackoverflow.com/questions/635233

10-07-2019

Question

The question is how long these customers have been jerks as of a given date.

I'm working against Sybase.

For this simplified structure of the history_data table:

table: history_of_jerkiness
processing_date  name  is_jerk
---------------  ----- -------
20090101         Matt  true
20090101         Bob   false        
20090101         Alex  true        
20090101         Carol true        
20090102         Matt  true        
20090102         Bob   true        
20090102         Alex  false        
20090102         Carol true        
20090103         Matt  true        
20090103         Bob   true        
20090103         Alex  true        
20090103         Carol false

The report for the 3rd should show that Matt has been a jerk all along, Alex has only just become a jerk, and Bob has been a jerk for 2 days.

name    days_jerky
-----   ----------
Matt    3
Bob     2
Alex    1

I'd like to find these time spans dynamically, so if I run the report for the 2nd, I should get different results:

name    days_jerky
-----   ----------
Matt    2
Bob     1
Carol   2

The key here is finding only the continuous spans that lead up to a certain date. I've found a few leads, but this looks like a problem that would have some very clever, tricky solutions.

Solution

My solution, from SQL Server. It is the same as Dems' answer, but I add a minimum baseline date myself. It assumes there are no gaps, that is, there is an entry for every day for every person. If that weren't true, I would have to loop.

DECLARE @run_date datetime
DECLARE @min_date datetime

SET @run_date = {d '2009-01-03'}

-- get day before any entries in the table to use as a false baseline date
SELECT @min_date = DATEADD(day, -1, MIN(processing_date)) FROM history_of_jerkiness

-- get last not a jerk date for each name that is before or on the run date
-- the difference in days between the run date and the last not a jerk date is the number of days as a jerk
SELECT [name], DATEDIFF(day, MAX(processing_date), @run_date)
FROM (
     SELECT processing_date, [name], is_jerk
     FROM history_of_jerkiness
     UNION ALL
     SELECT DISTINCT @min_date, [name], 0
     FROM history_of_jerkiness ) as data
WHERE is_jerk = 0
  AND processing_date <= @run_date
GROUP BY [name]
HAVING DATEDIFF(day, MAX(processing_date), @run_date) > 0

I created the test table with the following:

CREATE TABLE history_of_jerkiness (processing_date datetime, [name] varchar(20), is_jerk bit)

INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Bob', 0)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Alex', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Carol', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Bob', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Alex', 0)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Carol', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Bob', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Alex', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Carol', 0)
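
For readers without a Sybase or SQL Server instance at hand, the same idea can be tried in SQLite from Python. This is only a sketch, not the answer's own code: SQLite's julianday() and date() stand in for DATEDIFF and DATEADD, dates are stored as ISO text, and the helper names make_db and days_jerky are made up for illustration.

```python
import sqlite3

# Sample data from the question: date -> {name: is_jerk}
FLAGS = {
    "2009-01-01": {"Matt": 1, "Bob": 0, "Alex": 1, "Carol": 1},
    "2009-01-02": {"Matt": 1, "Bob": 1, "Alex": 0, "Carol": 1},
    "2009-01-03": {"Matt": 1, "Bob": 1, "Alex": 1, "Carol": 0},
}

# Same shape as the T-SQL query above, with SQLite date functions swapped in.
QUERY = """
SELECT name,
       CAST(julianday(:run_date) - julianday(MAX(processing_date)) AS INTEGER)
           AS days_jerky
FROM (
    SELECT processing_date, name, is_jerk
    FROM history_of_jerkiness
    UNION ALL
    -- one artificial "not a jerk" row per name, dated the day before any data
    SELECT DISTINCT
           (SELECT date(MIN(processing_date), '-1 day')
            FROM history_of_jerkiness),
           name, 0
    FROM history_of_jerkiness
) AS data
WHERE is_jerk = 0
  AND processing_date <= :run_date
GROUP BY name
HAVING julianday(:run_date) - julianday(MAX(processing_date)) > 0
ORDER BY days_jerky DESC, name
"""


def make_db():
    """Build an in-memory table holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE history_of_jerkiness "
        "(processing_date TEXT, name TEXT, is_jerk INTEGER)"
    )
    rows = [
        (day, name, flag)
        for day, people in FLAGS.items()
        for name, flag in people.items()
    ]
    conn.executemany("INSERT INTO history_of_jerkiness VALUES (?, ?, ?)", rows)
    return conn


def days_jerky(conn, run_date):
    """Return (name, days_jerky) pairs as of run_date (ISO date string)."""
    return conn.execute(QUERY, {"run_date": run_date}).fetchall()


if __name__ == "__main__":
    conn = make_db()
    print(days_jerky(conn, "2009-01-03"))  # [('Matt', 3), ('Bob', 2), ('Alex', 1)]
    print(days_jerky(conn, "2009-01-02"))  # [('Carol', 2), ('Matt', 2), ('Bob', 1)]
```

Both runs match the reports the question asks for. Like the original query, the sketch relies on there being one row per person per day.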

Other suggestions

This can be made simple if you structure the data so as to meet the following criteria ...

All people must have an initial record in which they are not a jerk

You can then do something like ...

SELECT
   name,
   MAX(date)   last_day_jerk_free
FROM
   jerkiness AS [data]
WHERE
   jerk = 'false'
   AND date <= 'a date'
GROUP BY
   name

You already know what the base date is ('a date'), and now you know the last day each person was not a jerk. I don't know Sybase, but I'm sure there are commands you can use to get the number of days between 'a date' and 'last_day_jerk_free'.

EDIT:

There are several ways to artificially create an initial "not a jerk" record. The one suggested by Will Rickards uses a subquery containing a union. Doing it that way, however, has two downsides ...
1. The subquery masks any indexes that might otherwise have been used
2. It assumes that everyone has data starting from the same point

Alternatively, take Will Rickards' suggestion and move the aggregation from the outer query into the inner query (thereby maximising the use of indexes), then union it with a second, generalised subquery that creates the starting jerky = false record ...

SELECT name, DATEDIFF(day, MAX(processing_date), @run_date) AS days_jerky
FROM (

    SELECT name, MAX(processing_date) as processing_date
    FROM history_of_jerkiness
    WHERE is_jerk = 0 AND processing_date <= @run_date
    GROUP BY name

    UNION

    SELECT name, DATEADD(DAY, -1, MIN(processing_date))
    FROM history_of_jerkiness
    WHERE processing_date <= @run_date
    GROUP BY name

    ) as data
GROUP BY
   name

The outer query still has to do a MAX without indexes, but over a reduced number of records (2 per name, rather than n per name). The number of records is also reduced by not requiring every name to have a value for every date in use. There are many other ways to do this; some can be seen in my edit history.
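
To see what the per-name baseline buys, here is a rough SQLite sketch run from Python. It is an illustration under assumptions rather than the answer's own code: julianday() and date() replace DATEDIFF and DATEADD, the helper names are invented, and "Dave" is a hypothetical person whose data only starts on the 2nd.

```python
import sqlite3

# Hypothetical data: Dave's history starts a day later than everyone else's.
FLAGS = {
    "2009-01-01": {"Matt": 1, "Carol": 1},
    "2009-01-02": {"Matt": 1, "Carol": 1, "Dave": 1},
    "2009-01-03": {"Matt": 1, "Carol": 0, "Dave": 1},
}

QUERY = """
SELECT name,
       CAST(julianday(:run_date) - julianday(MAX(processing_date)) AS INTEGER)
           AS days_jerky
FROM (
    -- last "not a jerk" day per name, on or before the run date
    SELECT name, MAX(processing_date) AS processing_date
    FROM history_of_jerkiness
    WHERE is_jerk = 0 AND processing_date <= :run_date
    GROUP BY name
    UNION
    -- artificial baseline: the day before each name's own first record
    SELECT name, date(MIN(processing_date), '-1 day')
    FROM history_of_jerkiness
    WHERE processing_date <= :run_date
    GROUP BY name
) AS data
GROUP BY name
ORDER BY name
"""


def make_db():
    """Build an in-memory table holding the hypothetical rows above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE history_of_jerkiness "
        "(processing_date TEXT, name TEXT, is_jerk INTEGER)"
    )
    rows = [
        (day, name, flag)
        for day, people in FLAGS.items()
        for name, flag in people.items()
    ]
    conn.executemany("INSERT INTO history_of_jerkiness VALUES (?, ?, ?)", rows)
    return conn


def days_jerky(conn, run_date):
    """Return (name, days_jerky) pairs as of run_date (ISO date string)."""
    return conn.execute(QUERY, {"run_date": run_date}).fetchall()


if __name__ == "__main__":
    conn = make_db()
    print(days_jerky(conn, "2009-01-03"))  # [('Carol', 0), ('Dave', 2), ('Matt', 3)]
    print(days_jerky(conn, "2009-01-02"))  # [('Carol', 2), ('Dave', 1), ('Matt', 2)]
```

Dave is credited with 2 days on the 3rd because his baseline is the day before his own first record, not the day before the table's earliest record. Since this form has no HAVING clause, people who are not jerks on the run date show up with 0 days.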

"This can be made simple if you structure the data so as to meet the following criteria ...

All people must have an initial record in which they are not a jerk"

What criteria the data should and should not meet is up to the user, not the developer.

How about this:

-- count each person's jerk days up to the given date, keeping only the
-- days with no non-jerk record between them and that date
select a.name, count(*) as days_jerky
from history_of_jerkiness a
where a.is_jerk = 'true'
and a.processing_date <= :a_certain_date
and not exists
( select * from history_of_jerkiness c
  where a.name = c.name
  and c.processing_date between a.processing_date and :a_certain_date
  and c.is_jerk = 'false'
)
group by a.name;
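
The counting idea behind this last answer can be checked with a small SQLite sketch in Python. It reflects one reading of the answer's intent, not a verified Sybase query: each jerk row is counted only if no "not a jerk" row lies between it and the run date, is_jerk is stored as 0/1 rather than 'true'/'false', and the helper names are invented for illustration.

```python
import sqlite3

# Sample data from the question: date -> {name: is_jerk}
FLAGS = {
    "2009-01-01": {"Matt": 1, "Bob": 0, "Alex": 1, "Carol": 1},
    "2009-01-02": {"Matt": 1, "Bob": 1, "Alex": 0, "Carol": 1},
    "2009-01-03": {"Matt": 1, "Bob": 1, "Alex": 1, "Carol": 0},
}

QUERY = """
SELECT a.name, COUNT(*) AS days_jerky
FROM history_of_jerkiness AS a
WHERE a.is_jerk = 1
  AND a.processing_date <= :run_date
  -- drop jerk days that are cut off from the run date by a non-jerk day
  AND NOT EXISTS (
      SELECT 1
      FROM history_of_jerkiness AS c
      WHERE c.name = a.name
        AND c.is_jerk = 0
        AND c.processing_date BETWEEN a.processing_date AND :run_date
  )
GROUP BY a.name
ORDER BY days_jerky DESC, a.name
"""


def make_db():
    """Build an in-memory table holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE history_of_jerkiness "
        "(processing_date TEXT, name TEXT, is_jerk INTEGER)"
    )
    rows = [
        (day, name, flag)
        for day, people in FLAGS.items()
        for name, flag in people.items()
    ]
    conn.executemany("INSERT INTO history_of_jerkiness VALUES (?, ?, ?)", rows)
    return conn


def days_jerky(conn, run_date):
    """Return (name, days_jerky) pairs as of run_date (ISO date string)."""
    return conn.execute(QUERY, {"run_date": run_date}).fetchall()


if __name__ == "__main__":
    conn = make_db()
    print(days_jerky(conn, "2009-01-03"))  # [('Matt', 3), ('Bob', 2), ('Alex', 1)]
    print(days_jerky(conn, "2009-01-02"))  # [('Carol', 2), ('Matt', 2), ('Bob', 1)]
```

Because this counts rows instead of subtracting dates, it needs no artificial baseline record. The trade-off is that a missing day is simply not counted, and it does not break the streak either.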

Licensed under: CC-BY-SA with attribution
