VHDL - variable vs. signal behaviour in queue

Question 1

When pixelCurrent is a variable in the process, its value is updated and available immediately, whereas a value assigned to a signal is not available until the next clock cycle.

So when a variable is used, this line implements a RAM with asynchronous read based on addrb:

pixelCurrent := RAM(to_integer(UNSIGNED(addrb)));

An assignment to a signal, in contrast, implements a RAM with synchronous read, where the value read from the RAM is not available until the next cycle.

Typical FPGA technologies have dedicated hardware for RAMs with synchronous read, but RAMs with asynchronous read are built from combinatorial logic (look-up tables, LUTs).

So the huge number of LUTs that appears when pixelCurrent is a variable comes from the synthesis tool trying to map the asynchronous-read RAM into LUTs, which typically takes a great many of them and makes the resulting RAM very slow.

Your pipelined design does not appear to need the asynchronous RAM read. If pixelCurrent is a signal, the read becomes synchronous and the synthesis tool can map the RAM to an internal RAM hardware block, with code like:

pixelMinus2  := pixelMinus1;
pixelMinus1  := pixelCurrent;
pixelCurrent <= RAM(to_integer(UNSIGNED(addrb)));
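For context, a minimal sketch of a complete RAM with synchronous read might look like the following. The entity name line_ram, the write port (wea, addra, dia), and the 8-bit data and 10-bit address widths are assumptions for illustration and are not taken from the question. Whether a given tool actually infers a block RAM from this depends on the tool and the target device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: the name, the write port and all widths are assumed.
entity line_ram is
    port (
        clk25 : in  std_logic;
        wea   : in  std_logic;                     -- assumed write enable
        addra : in  std_logic_vector(9 downto 0);  -- assumed write address
        dia   : in  std_logic_vector(7 downto 0);  -- assumed write data
        addrb : in  std_logic_vector(9 downto 0);  -- read address
        dob   : out std_logic_vector(7 downto 0)   -- read data
    );
end entity line_ram;

architecture rtl of line_ram is
    type ram_type is array (0 to 1023) of std_logic_vector(7 downto 0);
    signal RAM          : ram_type;
    signal pixelCurrent : std_logic_vector(7 downto 0);
begin
    process (clk25)
    begin
        if rising_edge(clk25) then
            if wea = '1' then
                RAM(to_integer(unsigned(addra))) <= dia;
            end if;
            -- Signal assignment inside the clocked process: the read data
            -- is registered, so it appears one clock after addrb.
            pixelCurrent <= RAM(to_integer(unsigned(addrb)));
        end if;
    end process;

    dob <= pixelCurrent;
end architecture rtl;
```

The registered read is the part that matters here: the RAM output only changes on a clock edge, which is the behaviour dedicated RAM blocks provide.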

Question 2

Signals, being the means of inter-process communication, have assignment semantics carefully designed to avoid race conditions and hazards. See this Q&A and this link to "VHDL's crown jewel" for the gory details.

Therefore when you assign pixelCurrent (signal)

pixelCurrent <= RAM(to_integer(UNSIGNED(addrb)));

the assignment doesn't take effect until the process suspends (for RTL code, typically when the process reaches its end and waits on its sensitivity list), and the new value is not visible within this process until it next wakes up at if rising_edge(clk25). So this creates a pipeline register.
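The difference in update timing can be checked in simulation. The following is a small sketch of a testbench, not part of the original design; the entity name and the names s and v are made up for illustration.

```vhdl
-- Hypothetical simulation-only testbench; all names are illustrative.
entity signal_vs_variable_tb is
end entity signal_vs_variable_tb;

architecture sim of signal_vs_variable_tb is
    signal s : integer := 0;
begin
    process
        variable v : integer := 0;
    begin
        s <= 1;  -- scheduled; takes effect only after the process suspends
        v := 1;  -- takes effect immediately

        -- Still inside the same activation of the process:
        assert s = 0 report "signal changed immediately" severity failure;
        assert v = 1 report "variable did not change immediately" severity failure;

        wait for 0 ns;  -- suspend for one delta cycle

        -- After the process has suspended once, the signal holds its new value.
        assert s = 1 report "signal not updated after suspend" severity failure;

        report "signal and variable semantics as expected" severity note;
        wait;  -- stop this process
    end process;
end architecture sim;
```

In a clocked RTL process the suspension happens at the end of the process, so the new signal value is first seen at the following clock edge.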

Variables within a VHDL process act like variables in any other imperative language (C, etc.): once updated, their new value is immediately available.

Therefore the following:

pixelCurrent := RAM(to_integer(UNSIGNED(addrb)));

IF slv_reg0(3) = '0' THEN 
    -- bypass filter for debugging
    dob <= pixelCurrent;

propagates the NEW value of pixelCurrent into the rest of the process, generating a HUGE design which tries to accomplish EVERYTHING within a single clock cycle.

There are two solutions. My preferred one is to use signals for pipeline registers, because you can describe the pipeline in its most natural manner (with the first stage first).
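A rough sketch of the signal-based approach, with the first stage written first, could look like this. The entity name, the write port, all widths, and the use of pixelMinus2 as a stand-in for the filter result are assumptions; the real filter logic from the question is not reproduced here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: name, write port and widths are assumed.
entity pixel_pipe_sig is
    port (
        clk25    : in  std_logic;
        slv_reg0 : in  std_logic_vector(31 downto 0);  -- width assumed
        wea      : in  std_logic;
        addra    : in  std_logic_vector(9 downto 0);
        dia      : in  std_logic_vector(7 downto 0);
        addrb    : in  std_logic_vector(9 downto 0);
        dob      : out std_logic_vector(7 downto 0)
    );
end entity pixel_pipe_sig;

architecture rtl of pixel_pipe_sig is
    type ram_type is array (0 to 1023) of std_logic_vector(7 downto 0);
    signal RAM : ram_type;
    signal pixelCurrent, pixelMinus1, pixelMinus2 : std_logic_vector(7 downto 0);
begin
    process (clk25)
    begin
        if rising_edge(clk25) then
            if wea = '1' then
                RAM(to_integer(unsigned(addra))) <= dia;
            end if;

            -- Pipeline in natural order. Every right-hand side reads the
            -- value the signal held before this clock edge.
            pixelCurrent <= RAM(to_integer(unsigned(addrb)));
            pixelMinus1  <= pixelCurrent;
            pixelMinus2  <= pixelMinus1;

            if slv_reg0(3) = '0' then
                dob <= pixelCurrent;  -- bypass filter for debugging
            else
                dob <= pixelMinus2;   -- placeholder for the real filter
            end if;
        end if;
    end process;
end architecture rtl;
```

Because all three are signals, the order of the three pipeline assignments does not change the hardware; writing them first stage first is purely for readability.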

The second solution, using variables as pipeline registers - ironically you already partially adopt this solution -

pixelMinus2  := pixelMinus1;
pixelMinus1  := pixelCurrent;
pixelCurrent := RAM(to_integer(UNSIGNED(addrb)));

is to describe the pipeline BACKWARDS so that the assignment to a variable comes after the last use of its value.

Simply move these three assignments after the big IF slv_reg0(3) block (that is, after its END IF) and your variable version should work.
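As a sketch of what the reordered variable version might look like, under the same caveats: the entity name, the write port, all widths, and pixelMinus2 standing in for the filter result are assumptions, not the code from the question.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: name, write port and widths are assumed.
entity pixel_pipe_var is
    port (
        clk25    : in  std_logic;
        slv_reg0 : in  std_logic_vector(31 downto 0);  -- width assumed
        wea      : in  std_logic;
        addra    : in  std_logic_vector(9 downto 0);
        dia      : in  std_logic_vector(7 downto 0);
        addrb    : in  std_logic_vector(9 downto 0);
        dob      : out std_logic_vector(7 downto 0)
    );
end entity pixel_pipe_var;

architecture rtl of pixel_pipe_var is
    type ram_type is array (0 to 1023) of std_logic_vector(7 downto 0);
    signal RAM : ram_type;
begin
    process (clk25)
        variable pixelCurrent, pixelMinus1, pixelMinus2
            : std_logic_vector(7 downto 0);
    begin
        if rising_edge(clk25) then
            if wea = '1' then
                RAM(to_integer(unsigned(addra))) <= dia;
            end if;

            -- Use the variables first: at this point they still hold the
            -- values stored on the previous clock edge.
            if slv_reg0(3) = '0' then
                dob <= pixelCurrent;  -- bypass filter for debugging
            else
                dob <= pixelMinus2;   -- placeholder for the real filter
            end if;

            -- Then update them, last stage first, so each assignment comes
            -- after the last use of the old value.
            pixelMinus2  := pixelMinus1;
            pixelMinus1  := pixelCurrent;
            pixelCurrent := RAM(to_integer(unsigned(addrb)));
        end if;
    end process;
end architecture rtl;
```

Each variable is read before it is written within the clock edge, so it has to hold its value between edges and is therefore implemented as a register.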

Once you have verified that both approaches generate the same hardware, pick whichever you think leads to the clearest (easiest to understand) design.