Question

Assuming I have two tables, one with the following columns called lease_period:

tenant_trading_name, suite_id, lease_id, building_id

and another, called lease_period_audit with the following:

audit_date, audit_type, tenant_trading_name, suite_id, lease_id, building_id

Each time a record is updated on lease_period and entry is made on lease_period_audit with a status of 'Updated'. I'm trying to find all updates made only to the tenant_trading_name field but haven't had any success. So far I have the following:

select              lpa.*
from                property.lease_period_audit lpa
inner join          property.lease_period lp on lpa.suite_id = lp.suite_id and lpa.lease_id = lp.lease_id and lpa.building_id = lp.building_id
where               audit_type = 'Updated'
                    and lp.tenant_trading_name <> lpa.tenant_trading_name
order by            1 desc  

Where's the flaw in my thought process here? How can this be done / how should I be thinking about this?

Was it helpful?

Solution

Assuming the audit table also logs the lease_period primary key column, referenced here lp_id for simplicity, you could try the following approach:

  1. Find all the rows where audit_type is 'Updated'.

  2. Rank all the rows by audit_date and partitioning them by lp_id.

  3. Rank the rows by audit_date partitioning by lp_id, suite_id, lease_id, building_id.

  4. Get the difference between the two rankings.

  5. Rank the rows again by audit_date, partitioning them now by lp_id, suite_id, lease_id, building_id, (ranking_difference).

  6. Output all the rows where the last ranking value is 2 or greater.

The first four steps result in a row set where each group of consecutive (in ascending order of audit_date) rows with identical values of suite_id, lease_id, building_id for the same lp_id will be uniquely distinguished by a value calculated as the difference between the rankings #2 & #3.

Within the group, every row, starting from the second one, will differ from the previous one only in the value of tenant_trading_name, which is just what we need. So, we rank the rows once again, taking into account the 'group ID' we've just obtained, then return every row with the ranking of 2 or higher.

Here's an approximate implementation:

WITH marked AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY lp_id
                                 ORDER BY audit_date)
        - ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id
                                 ORDER BY audit_date)
  FROM lease_period_audit
  WHERE audit_type = 'Updated'
),
ranked AS (
  SELECT
    *,
    rnk = ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id, grp
                                 ORDER BY audit_date)
  FROM marked
)
SELECT
  audit_date,
  lp_id,
  tenant_trading_name,
  suite_id,
  lease_id,
  building_id
FROM ranked
WHERE rnk = 2

Note. This assumes that the audit table logs only real changes, i.e. there can't be two consecutive rows with the same primary key where all four columns have identical values.

OTHER TIPS

You should think something like this (pseudo-code):

edit: I haven't realized before that the lpa table actually has all the data, there's no need to join with LP

select lpa.*
from   lpa 
join   lpa_before on 
           lpa_before.id = lpa.id and 
           lpa_before.date = 
                  (select max(date) from lpa3 where lpa3.date < lpa.date and lpa.id = lpa3.id)
where  auditytype = 'update' and lpa.name <> lpa_before.name

I hope I could explain.. (it's not a trivial solution, but thats what comes to my mind)

In English:

Select the LPA registers. Join with another LPA that will be the LPA just before the original one. Call this LPA_BEFORE. To join the LPA_BEFORE you have to compare all ids from LPA and LPA_BEFORE, and the LPA_BEFORE date must be the maximum from the LPAs which have a date less than the original LPA. Compare the names of the LPA with LPA_BEFORE

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top