Snake in assembly , what datastructure to use?

https://stackoverflow.com/questions/13094373

14-07-2021
|

Domanda

Im totally new to assembly and believe it or not, our first assignment is to create snake in assembly. How should i store the snake? should i have it in the stack, or should i place it into some register? I have done some research about this "terrible" language for about 3 days, but cant figure out a good way to start. I would probably use a linked list of some sort in c++ , but unfortunately this is not c++.

Any help is very appreciated

Soluzione

How should i store the snake? should i have it in the stack, or should i place it into some register?

Assuming you are talking about a Snake animation / game, the answer is probably neither. You most likely use a 2-D array to represent the cells on the "screen", and represent the snake's body as cells of a given "colour".

I would probably start by working out the simplest way to implement the code in C or C++ ... without using any data structure libraries ... and then recode the algorithm in assembler.

Altri suggerimenti

The head of the snake can be tracked with one pairs of x,y coordinates. And of course you need the current direction of the head. (The tail needs more than just an x,y pair, probably a circular buffer of x,y pairs is the best bet. Then you don't actually need a tail x,y separate from the history buffer.)

As @Daniel's answer explains, looking at the next pixel that you're going to draw as the head moves tells you what the snake eats: nothing, apple, wall, or itself.

(If you have efficient read access to video RAM, you can simply read pixels instead of also keeping a shadow board array. But most modern systems don't have efficient video-RAM read access; on typical x86 VGA memory is uncacheable with write-combining. But if you pretend you're writing for a real 8086, VGA RAM was on the other side of an ISA bus, so that's slow too. But 8086 didn't have any cache. Some even earlier systems maybe had video RAM as part of main memory, IDK.)

Another option would be searching a list of apple and snake squares, but that's a lot more work than checking one array entry. A key factor in writing a game (or other real-time software) is to put a tight upper bound on the worst-case performance. When there are lots of squares with snake on them, and the step time is short (snake moving fast), you don't want the game to get chunky and have some steps take longer than others because the search took too long. That gets unplayable fast.

So probably you want an array of rows * cols bytes (which you do 2D indexing into) where you store codes like 0 for empty, 1 for wall, etc. If you're using a paletted video mode, you might use the same codes as for pixels in video RAM, if you're drawing directly into VRAM, or with modern video APIs, a screen buffer or an OpenGL texture. Otherwise you'll just have codes in your state array and 24 or 32-bit pixels in your screen buffer.

To simplify indexing, you could make the storage format use a power of 2 row stride, even if the board width isn't. So there'd potentially be columns of padding at the end of every row. But that could waste a lot of memory, and usually you only need to index relative to your current position (e.g. +-1 byte for the next column left/right, or +-row_stride to get the the next row down/up.)

Head/tail x,y coordinates could just be a single index or pointer into the flat board array, but we also need the actual x,y coordinate to draw graphics separately.

After every step, you also need to clear a square at the tail (unless the snake is still growing from a recent apple; you'll want a down-counter of how many pending growths there are). We know where to paint the screen back to black because we have the tail x,y.

But how do we update the tail x,y so the next step will follow the actual path the head took? You might hope that by looking at the squares around the tail, you could figure out which one is the next-oldest. We can prove that's not the case with an example where two different trajectories lead to an identical board position with the same Head and Tail. The player can create this board layout by zig-zagging horizontally or vertically.

   H123        H167          H is the head, T is the tail
    654         258            snake segments are 1..8 in order of moves.
    78T         34T

Without the 1..8 history, all squares are just "snake" and no algorithm can unambiguously decide which way the tail should move after erasing it. (Even a slow algorithm that could look around the whole board.)

There are other ambiguous cases for reasonable algorithms that only look at the 8 squares surrounding the tail, e.g.

    54               H12          # defeating a "local" algorithm
    T321H              34
                       T5

So we have to record history somehow. I think the best and simplest option is:

Use a circular buffer as a queue of (x,y) pairs to record the snake's path

Your snake will crawl through memory addresses very much the same way the snake crawls on screen, so it's kind of a appropriate and not a coincidence that this data structure leads to a simple implementation (with very little work to be done per step.)

The size of this buffer will set the max snake length we can support, at which point we're reusing a buffer entry to record the head right after reading the tail. You can make this bigger than rows * columns, so the game imposes no actual limit, if you want.

This might be the technical reason why many real classic Snake games have a max snake length at all. (Besides gameplay reasons.) With your game as the only thing running on the CPU, there's no reason not to just always use the same size circular buffer, i.e. statically allocated with as big as you'll ever need. Then you won't ever have the game stutter from copying it after growing or anything.

You might write the asm similar to this C:

typedef struct xy {uint8_t x, y;} xy_t;
static const unsigned max_snakelen = 512;
struct xy snakepath[max_snakelen];
// uint16_t []  board array offsets is great if we don't also need x,y for graphics

enum board_entry { SQUARE_EMPTY=0, SQUARE_APPLE, SQUARE_SNAKE, SQUARE_WALL };
static const unsigned rows = 40;
static const unsigned cols = 80;   // including walls
board_entry board[rows * cols];  // we'll do 2D indexing manually because asm has to

static inline unsigned dir_to_board_byte_offset(enum cur_direction) { ... }
 // top bit maps to +- rows, or bottom bit to +- 1
static inline xy_t dir_to_xy_offset(enum directions  cur_direction) { ... }
 // map 0..3 to  {+1, 0}, {-1,0}, {0,+1}, {0,-1} in some order.

void step(enum directions cur_direction)
{
    static unsigned tailidx = 0;  // maybe kept in a register
    static unsigned headidx = 5;  // and arrange for the initial snake to be in the buffer somehow

    if (!growth)    // tail stays still while snake grows
        --growth;
        xy_t tail = snakepath[tailidx++];      // movzx edx, byte [snakepath + rbx*2] // movzx ecx, byte [snakepath + rbx*2 + 1]
        tailidx &= max_snakelen - 1;   // wrap the circular buffer idx.  AND ebx, MAX_SNAKELEN - 1

        board[tail.y * cols + tail.x] = SQUARE_EMPTY;   // imul stuff
        // and update graphics
    }

    // Most of the registers used for the above stuff are dead now, and can be reused below

    // Tail segments disappear *before* the head moves: snake can be head-to-tail
    // and go full Ouroboros without crashing.

    xy_t headxy = snakepath[headidx++];  // or track head separately, maybe in a reg so we can more quickly  like our function arg.
    headidx &= max_snakelen - 1;

    headxy += dir_to_xy_offset(cur_direction);  // maybe use a 16-bit add to do both components in parallel, except that `-1` will carry into the upper element.  So that only works with SIMD `paddb` or something. 
    // pretend that's a C++ overload and do the component adds separately

    enum board_entry headtype = board[headxy.y * cols + headxy.x];
    if (headtype != SQUARE_EMPTY) {
       apple or death;
    }

    board[headxy.y * cols + headxy.x] = SQUARE_SNAKE;
 // ADD NEW HEAD to circular buffer
    snakepath[headidx] = headxy;                   // mov [snakepath + 2*rbx], ax
    // and draw graphics for head, using its x,y.
}

This is just intended to give you a general idea, it's pretty clunky and not good C style. (And not intended to be.) I know not everything is declared. It's up to you how much state you keep in registers in an asm version of an event loop that waits for keypresses and timer events. Functions you call will have to save/restore regs, but there's no harm in having the outer-most loop keep its state in regs if you're using the standard calling convention and doing that anyway.

Multiply used to be slow, so you might consider keeping x,y and 1D array indices or actual pointers in your snake struct. But modern CPUs have generally fast multipliers. (Like 3 cycle latency fully pipelined, on Intel since Core 2 or so, vs. over 100 on non-pipelined 8086).

On a 16-bit machine especially, having absolute addresses embedded in several instructions doesn't cost many code bytes. It gets worse in 32 bit code. x86-64 can use 32-bit absolute addresses in position-dependent code on Linux, still allowing addressing modes like [snakepath + rbx*2], otherwise you'd do like a RISC and get at least one base address in a register and reference static data relative to that.

Depending on your target ISA, you might keep more or fewer pointers in registers.

An alternative would be to record every time the player turned. But in the worst case, the player turns once per step, so you still need as many buffer entries as the length of the snake. (Or if you have less, you have highly undesirable gameplay: the player could unexpectedly be unable to turn because they turned too much in the past, and die a frustrating death.)

Each entry requires at least as much space as an x,y pair, and slightly more work to decode. (Recording just an x,y pair at every turn gives enough info to reconstruct the path by looking at the tail x,y vs. the last turn.)

A previous-direction + length could actually be much more compact, though: 1 byte per record. If you pack bits into bitfields, direction takes 2 bits, and we can choose to make new entries early to keep the length at 6 bits even if the board is wider or taller than 63 = 2⁶-1 squares. So each entry can easily be 1 byte. (Decode with dir = x & 3 and len = x >> 2).

You could use this dir+len format by decrementing the length of the tail entry until it reaches 0 (that means the tail got to the turn, and should start looking at the next buffer entry). For efficiency, you'd keep the current one you're working on unpacked, or you'd use sub byte [rsi], 4 and check for carry (to detect if the length had already been zero when you decremented, and that made it wrap).

I hope that after 6 years it's not an issue any longer, but in my case I based on the pixels. For example, there was the pixel that was assigned to be the head of the snake. I always saved it's location (x axis and y axis), and checked within a loop if there is any painted pixel in the direction of the snake:

If it was black- I did nothing and continued moving the snake. Black was the background.
If it was red- I knew the snake was going to eat an apple.
If it was blue, I knew the snake was about to collide a wall, and so on.

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a StackOverflow