Calculate change in value window function when the identifier has dropped of the current period
-
26-01-2021 - |
Question
I was wondering how can i got about finding the change in value from one period to another but in my scenario the identifier has dropped off the current period.
I'm think this could be done with a cte and a select distinct id but i'm not sure how to implement.
Some demo date below
CREATE TABLE basic_pays (
employee_id int,
fiscal_year INT,
salary DECIMAL(10 , 2 ),
PRIMARY KEY (employee_id, fiscal_year)
);
INSERT INTO basic_pays(employee_id, fiscal_year,salary) VALUES(1,2017,24000);
INSERT INTO basic_pays(employee_id, fiscal_year,salary) VALUES(2,2017,17000);
INSERT INTO basic_pays(employee_id, fiscal_year,salary) VALUES(1,2018,25920);
SELECT
employee_id,
fiscal_year,
salary,
salary - LAG(salary)
OVER (PARTITION BY employee_id ORDER BY fiscal_year) salary_change
FROM
basic_pays
GROUP BY
employee_id, fiscal_year, salary
ORDER BY fiscal_year, employee_id
DROP TABLE basic_pays
The expected output would be
| employee_id | fiscal_year | salary | salary_change |
|-------------|-------------|----------|---------------|
| 1 | 2017 | 24000.00 | NULL |
| 2 | 2017 | 17000.00 | NULL |
| 1 | 2018 | 25920.00 | 1920.00 |
| 2 | 2018 | 0.00 | -17000.00 |
Solution
You could do it like this, performance wise this could hurt on bigger datasets.
;WITH CTE
AS
(
SELECT DISTINCT employee_id,a.fiscal_year
from basic_pays
CROSS APPLY
(
SELECT distinct fiscal_year
from basic_pays
) as a)
SELECT c.employee_id,
c.fiscal_year,
CASE WHEN salary IS NULL then 0.00 else salary END as salary,
CASE WHEN salary IS NULL then 0.00 else salary END - LAG(salary)
OVER (PARTITION BY c.employee_id ORDER BY c.fiscal_year) salary_change
FROM
CTE c
LEFT JOIN basic_pays b
ON C.employee_id = b.employee_id
AND c.fiscal_year= b.fiscal_year;
Result
employee_id fiscal_year salary salary_change
1 2017 24000.00 NULL
1 2018 25920.00 1920.00
2 2017 17000.00 NULL
2 2018 NULL -17000.00
OTHER TIPS
Let me suggest to use a Calendar table for this purpose.
I've simulated it by using a table value. And if you have a single row by employee_id/fiscal year, you can avoid GROUP BY clause.
SELECT
emp_id,
fy,
salary,
COALESCE(salary,0) - LAG(salary) OVER (PARTITION BY emp_id ORDER BY fy) salary_change
FROM
-- all combinations of year / employee_id
(SELECT
emp_id, fy
FROM
(VALUES (2017),(2018)) as calendar(fy)
CROSS JOIN
-- it should be SELECT employee_id from employee's table
(SELECT DISTINCT employee_id as emp_id FROM basic_pays) e) cal LEFT JOIN
basic_pays bp
ON bp.employee_id = cal.emp_id
AND bp.fiscal_year = cal.fy
ORDER BY
fy, emp_id
emp_id | fy | salary | salary_change -----: | ---: | :------- | :------------ 1 | 2017 | 24000.00 | null 2 | 2017 | 17000.00 | null 1 | 2018 | 25920.00 | 1920.00 2 | 2018 | null | -17000.00
db<>fiddle here
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange