Avoiding code duplication when the only difference is loop control statements (with the same statements in loop bodies)?

Question 1

Although I think what you already have will be fine after sticking the inner loop code blocks in an ordinary function as suggested by Adam Burry and Tony D, if you want you can combine the loops, using tables to encode the possible directions to move in. The trick is to use an array p[2] instead of separate i and j, to enable the question of which index is varied in the outer loop to be driven by a table. Then the only tricky thing is making sure that the other index, which will be varied in the inner loop, needs to start at its maximum value (instead of 0) iff it will decrement at each step:

enum indices { I, J };       // Can just use 0 and 1 if you want
template <size_t size> unsigned process(const unsigned (&grid)[size][size]) {
    static int d[][2] = { {1, 0}, {0, 1}, {1, 1}, {1, -1}, {1, 1}, {1, -1} };
    static int w[]    = {      J,      I,      J,       J,      I,       I };
    unsigned prodMax = 0;    // Note: not 1
    for (int k = 0; k < sizeof d / sizeof d[0]; ++k) {  // For each direction
        for (int init = 0; init < size; ++init) {
            Max_consecutive_prod mcp;
            int p[2];        // p[I] is like i, p[J] is like j
            for (p[w[k]] = init, p[!w[k]] = (d[k][!w[k]] == -1 ? size - 1 : 0);
                 min(p[I], p[J]) >= 0 && max(p[I], p[J]) < size;
                 p[I] += d[k][I], p[J] += d[k][J])
            {
                mcp.input(grid[p[I]][p[J]]);
                prodMax = max(prodMax, mcp.result());
            }
        }
    }

    return prodMax;
}

Question 2

You could create an enum for the different states and then pass it into the function. You would then create an if statement that would set the values based on the passed value.

Question 3

Your process_row() has a bug: from the example in the link, zero entries are allowed in the matrix, so if a row begins with e.g.

x y z 0 ...

and any of x, xy or xyz is larger than all other 4-element products on the rest of that row and on any other row in the matrix, it will incorrectly report that the this is the largest 4-element product. (I'm assuming here that Max_consecutive_prod calculates a rolling product of the last 4 elements provided with input()).

Unless your Max_consecutive_prod is unusually aware of how it is being called, you will also get erroneous results "wrapping" from the end of one row to the next, and from one process_...() call to the next.

Question 4

Suppose you flattened the grid so that it was just 400 numbers in a row, reading left to right and then top to bottom. The topmost row would consist of the first 20 numbers (that is, indices 0, ..., 19); the second rwo of the next 20 numbers, etc. In general, row i (starting from 0) would correspond to indices i*20, i*20 + 1, i*20 + 2, ..., i*20 + 19.

Now, what about columns? The leftmost column starts at position 0, just like the topmost row. It's next element at position 20 (the first element in the second row), and then 40, and... So it's not hard to see that the indices for column j are j, j + 20, j + 40, ..., j + 19*20.

Diagonals are not much different. Try it on paper (grid-ruled paper is good for this sort of thing.)

One more hint: Does it make a difference if you find the product of four elements, multiplying left-to-right, than the same four elements multiplying right-to-left?

Question 5

First, the Context object approach - this just packages the arguments to the support functions mentioned in my comment on your question... it's about as useful as the problem was significant ;-].

struct Context
{
    unsigned& proxMax;
    int i, j;
    Max_consecutive_prod mcp;
    Context(unsigned& prodMax) : prodMax(prodMax) { }
};

template <size_t size> unsigned process_diag_lower(const unsigned (&grid)[size][size])
{
    unsigned prodMax = 0;
    for (int init = 0; init < size; ++init)
    {
        Context context(prodMax);
        for (context.i = init, context.j = 0; context.i < size && context.j < size; ++context.i, ++context.j)
            loop_ij(context);
        loop_outer(context);
    }
    return prodMax;
}

Visitor pattern. Now, I said in my comment "you don't show us enough loop bodies to see the common requirements", and haven't seen anything since, so on the basis of the one body I've seen - namely:

template <size_t size> unsigned process_row(const unsigned (&grid)[size][size])
{
    unsigned prodMax = 0;
    for (int i = 0; i < size; ++i)
    {
        Max_consecutive_prod mcp;
        for (int j = 0; j < size; ++j)
        {
            mcp.input(grid[i][j]);
        }
        if (mcp.result() > prodMax)
        {
            prodMax = mcp.result();
        }
    }
    return prodMax;
}

The above can be split:

template <size_t size, template Visitor>
unsigned visit_row(const unsigned (&grid)[size][size], Visitor& visitor)
{
    for (int i = 0; i < size; ++i)
    {
        for (int j = 0; j < size; ++j)
            visitor.inner{grid[i][j]);
        visitor.outer();
    }
    return visitor.result();
}

struct Visitor
{
    unsigned prodMax;
    Max_consecutive_prod mcp;
    Visitor() : prodMax(0) { }
    void inner(unsigned n) {  mcp.input(n); }
    void outer()
    {
        if (mcp.result() > prodMax) prodMax = mcp.result();
        mcp = Max_consecutive_prod();  // reset for next time...
    }
    unsigned result() const { return prodMax; }
};

This way, the same Visitor class can be combined with your various grid-element iteration routines.