How do you handle large data transfers on very memory constrained, embedded systems?

https://stackoverflow.com/questions/359745

21-08-2019

Question

I have a microcontroller that must download a large file from a PC serial port (115200 baud) and write it to serial flash memory over SPI (~2 MHz). The flash writes must be in 256 byte blocks preceded by a write command and page address. The total RAM available on the system is 1 kB with an 80 byte stack size.

This currently works by filling a 256-byte buffer from the UART and then ping-ponging to another 256-byte buffer, which is filled by an interrupt on the RX buffer ready signal while the flash is written with busy-wait writes. The buffer swapping is repeated until the operation is complete.

I would prefer to set up TX/RX interrupt handlers for both the SPI and UART ports that operate on separate circular buffers. So, instead of polling for new bytes and waiting for operations to complete, I can simply fill the TX buffers and enable the interrupt, or check the buffers for incoming data. This would give a lot more clock cycles for real work instead of waiting on peripherals.

After implementing the IRQs with 128-byte circular buffers, I poll the UART RX buffer for data and immediately place it in the SPI TX buffer to do the file transfer. The problem I am having with this approach is that I don't have sufficient RAM for the buffers, and the UART receive buffer (fed by the PC) is filling up faster than I get the data over to the flash transmit buffer. Obviously, transmission speed is not the problem (115.2 kbaud in and 2 MHz out), but there is a write cycle wait after each 256-byte page is transmitted.
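A rough budget makes the bottleneck concrete. The sketch below assumes 8N1 framing (10 bits on the wire per byte), which the question does not state, and takes the flash busy time as a parameter because it depends on the flash part. The function names are made up for illustration.

```c
#include <stdint.h>

/* Assumption: 8N1 framing, i.e. 10 bits on the wire per data byte. */
#define BITS_PER_UART_BYTE 10u

/* Bytes per second arriving from the PC at the given baud rate. */
static uint32_t uart_bytes_per_second(uint32_t baud)
{
    return baud / BITS_PER_UART_BYTE;
}

/* Bytes that arrive while the flash is busy for busy_us microseconds.
 * Rounded up, so the result is a minimum buffer size for that stall. */
static uint32_t bytes_arriving_during(uint32_t baud, uint32_t busy_us)
{
    uint64_t bytes_x1e6 = (uint64_t)uart_bytes_per_second(baud) * busy_us;
    return (uint32_t)((bytes_x1e6 + 999999u) / 1000000u);
}
```

Under that framing assumption, 115200 baud is 11520 bytes per second, so a 128-byte buffer covers roughly 11 ms of flash stall. A hypothetical 5 ms busy time would let 58 bytes pile up.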

It appears the frequent SPI interrupts were blocking some of the UART interrupts and causing bytes to be missed. The solution I chose was to use a ring buffer for the UART receive interrupt and feed the data into a 256-byte page buffer that is sent to the serial flash by polling for byte transfers and write completion. A 128-byte ring buffer is big enough to prevent overflows during the SPI write.
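A minimal sketch of that kind of receive ring buffer, assuming a single producer (the UART RX interrupt) and a single consumer (the main loop). The type and function names are hypothetical, and the hardware register access is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 128u            /* power of two, at most 128 for 8-bit counters */
#define RING_MASK (RING_SIZE - 1u)

typedef struct {
    volatile uint8_t head;        /* written only by the UART RX ISR */
    volatile uint8_t tail;        /* written only by the main loop */
    uint8_t data[RING_SIZE];
} ring_t;

/* Number of unread bytes; the free-running 8-bit counters wrap cleanly. */
static uint8_t ring_count(const ring_t *r)
{
    return (uint8_t)(r->head - r->tail);
}

/* Called from the (hypothetical) UART RX ISR. Returns false on overflow. */
static bool ring_put(ring_t *r, uint8_t byte)
{
    if (ring_count(r) == RING_SIZE) {
        return false;             /* full: the byte is dropped */
    }
    r->data[r->head & RING_MASK] = byte;
    r->head++;
    return true;
}

/* Called from the main loop. Returns false when the buffer is empty. */
static bool ring_get(ring_t *r, uint8_t *byte)
{
    if (ring_count(r) == 0u) {
        return false;
    }
    *byte = r->data[r->tail & RING_MASK];
    r->tail++;
    return true;
}
```

The main loop would call `ring_get` to move bytes into the 256-byte page buffer and start a flash write whenever the page is full. Together the ring and the page buffer take under 400 bytes of the 1 kB of RAM.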

Solution

I'd do something like a scatter gather on a PC. Create a linked list of a struct like this:

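typedef struct data_buffer {
    struct data_buffer *next;  /* link to the next buffer in the list */
    char flags;                /* status bits, e.g. "ReadyToFlash", "Flashing" */
    char data[128];            /* payload received from the UART */
} data_buffer;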

Have one of the bits in the flags field mean "ReadyToFlash" and one mean "Flashing". You should be able to tune the number of buffers in your linked list to keep the flash from catching the UART as it writes, or vice versa.

If the flash gets to a buffer block that isn't "ReadyToFlash", it would stall and you'd need to have your UART IRQ start it back up. If the UART gets to a block that is "ReadyToFlash" or "Flashing", it is filling too fast and you probably need another buffer. If you have dynamic memory, you could do this tuning at runtime and add a buffer to the list on the fly; otherwise you'll just need to do some empirical testing.
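The hand-off rules above could be sketched roughly as follows. This assumes a statically allocated circular chain rather than dynamic memory, and the buffer count, the `fill` field, and all function names are hypothetical additions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_DATA_SIZE   128u
#define BUF_COUNT       3u        /* hypothetical; tune empirically */
#define FLAG_READY      0x01u     /* "ReadyToFlash" */
#define FLAG_FLASHING   0x02u     /* "Flashing" */

typedef struct data_buffer {
    struct data_buffer *next;
    volatile uint8_t flags;
    uint8_t fill;                 /* bytes stored so far (added for this sketch) */
    uint8_t data[BUF_DATA_SIZE];
} data_buffer;

static data_buffer pool[BUF_COUNT];
static data_buffer *uart_buf;     /* buffer the UART is filling */
static data_buffer *flash_buf;    /* next buffer the flash will take */

static void chain_init(void)
{
    for (size_t i = 0; i < BUF_COUNT; i++) {
        pool[i].next = &pool[(i + 1u) % BUF_COUNT];
        pool[i].flags = 0u;
        pool[i].fill = 0u;
    }
    uart_buf = flash_buf = &pool[0];
}

/* UART side: store one byte. Returns false if the UART has caught up
 * with the flash, meaning more buffers (or flow control) are needed. */
static bool uart_store(uint8_t byte)
{
    if (uart_buf->flags & (FLAG_READY | FLAG_FLASHING)) {
        return false;
    }
    uart_buf->data[uart_buf->fill++] = byte;
    if (uart_buf->fill == BUF_DATA_SIZE) {
        uart_buf->fill = 0u;
        uart_buf->flags = FLAG_READY;
        uart_buf = uart_buf->next;
    }
    return true;
}

/* Flash side: claim the next full buffer, or NULL if it must stall. */
static data_buffer *flash_claim(void)
{
    if (!(flash_buf->flags & FLAG_READY)) {
        return NULL;
    }
    flash_buf->flags = FLAG_FLASHING;
    return flash_buf;
}

/* Flash side: release a buffer once its contents are written. */
static void flash_release(data_buffer *b)
{
    b->flags = 0u;
    flash_buf = b->next;
}
```

A `false` return from `uart_store` is the "filling too fast" condition, and a `NULL` from `flash_claim` is the stall. Since the flash pages in the question are 256 bytes, two of these 128-byte buffers would make up one page; that pairing is left out here.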

OTHER TIPS

Do the UART and the PC side of the application support RS-232 handshaking (flow control)? If so, when your receive buffer gets close to being full, have the ISR drop the CTS line (the signal the PC sees as CTS). If the PC side is configured to respect hardware flow control, it should stop sending when it sees this condition. Once you have drained (or nearly drained) the receive buffer, assert CTS again and the PC should start sending again.

Note that this makes the software on the embedded device considerably more complex. Whether that's a trade-off you're willing to make is something you, your manager, and your team would have to analyze.

This is exactly what flow control was created for. I know it's a huge pain to set up, but if you enable flow control on the serial line your problems would be history.
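A minimal sketch of the watermark logic this implies, assuming a 128-byte receive buffer. The thresholds and the `set_flow_line` hook are hypothetical; on real hardware the hook would drive the pin wired to the PC's CTS input, while here it only records the state.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_BUF_SIZE     128u
#define HIGH_WATERMARK  96u   /* hypothetical: pause the PC at 3/4 full */
#define LOW_WATERMARK   32u   /* hypothetical: resume at 1/4 full */

static bool pc_may_send = true;

/* Hypothetical hook: stands in for driving the handshake pin. */
static void set_flow_line(bool allow)
{
    pc_may_send = allow;
}

/* Call after every change in buffer fill level: from the RX ISR after a
 * byte is stored, and from the main loop after bytes are drained. */
static void update_flow_control(uint8_t bytes_buffered)
{
    if (pc_may_send && bytes_buffered >= HIGH_WATERMARK) {
        set_flow_line(false);
    } else if (!pc_may_send && bytes_buffered <= LOW_WATERMARK) {
        set_flow_line(true);
    }
}
```

The gap between the two thresholds keeps the line from toggling on every byte, and the headroom above the high watermark is there because the sender may not stop on the very next byte.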

I'm assuming you're transferring a binary file so XON-XOFF isn't the best solution, which leaves hardware flow control.

Another option is to use a protocol with built-in flow control such as XModem. I have a similar embedded project where the flash is written in 128-byte pages. As it happens, XModem sends data in 128-byte chunks and then waits for an ACK before it sends the next.
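For illustration, here is a sketch of validating one packet in the original checksum variant of XModem: SOH, block number, its complement, 128 data bytes, and a one-byte sum. The function name is hypothetical, and the sketch leaves out retransmitted duplicate blocks, EOT handling, and the CRC variant.

```c
#include <stdint.h>

#define XM_SOH       0x01u
#define XM_ACK       0x06u
#define XM_NAK       0x15u
#define XM_DATA_LEN  128u
#define XM_PKT_LEN   (3u + XM_DATA_LEN + 1u)

/* Validate one checksum-mode packet and return the byte to send back. */
static uint8_t xmodem_check_packet(const uint8_t pkt[XM_PKT_LEN],
                                   uint8_t expected_block)
{
    uint8_t sum = 0u;

    if (pkt[0] != XM_SOH) {
        return XM_NAK;
    }
    if (pkt[1] != expected_block) {
        return XM_NAK;
    }
    if (pkt[2] != (uint8_t)(255u - pkt[1])) {
        return XM_NAK;            /* block number and complement disagree */
    }
    for (uint32_t i = 0; i < XM_DATA_LEN; i++) {
        sum = (uint8_t)(sum + pkt[3u + i]);
    }
    return (sum == pkt[3u + XM_DATA_LEN]) ? XM_ACK : XM_NAK;
}
```

The flow control comes from when the reply is sent: if the device holds back the ACK until the flash write has finished, the PC cannot send the next block early. With the 256-byte pages from the question, two packets would fill one page.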

Not sure what I'm missing here, but if the fact is that the average rate of data coming from the PC is higher than the average rate you can write it to the flash, then you're either going to need a lot of RAM, or you're going to need flow control.

But are you saying that it worked when you had block buffers, but now that you have byte buffers it doesn't?

Can you stick with the block buffers which are filled by the UART RX interrupt, and when each buffer is full, hand it off to the SPI/Flash code to empty that buffer using the SPI interrupt? That will save you copying each byte, and instead of having to do the circular buffer logic twice for each byte, you'll only have to do it for each block.
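A rough sketch of that block hand-off with two 256-byte buffers. All names are hypothetical, and the UART and SPI register access is omitted; `uart_rx_byte` stands in for the body of the RX interrupt handler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE 256u

static uint8_t page[2][FLASH_PAGE_SIZE];
static volatile uint8_t fill_idx;      /* buffer the UART ISR is filling */
static volatile uint16_t fill_pos;     /* next free byte in that buffer */
static volatile bool page_full[2];     /* set by ISR, cleared by flash code */
static uint8_t drain_idx;              /* buffer the flash code takes next */

/* Body of the (hypothetical) UART RX ISR. Returns false on overrun,
 * i.e. the buffer to fill next has not been released by the flash code. */
static bool uart_rx_byte(uint8_t byte)
{
    if (page_full[fill_idx]) {
        return false;
    }
    page[fill_idx][fill_pos++] = byte;
    if (fill_pos == FLASH_PAGE_SIZE) {
        page_full[fill_idx] = true;    /* hand the whole block over */
        fill_idx ^= 1u;
        fill_pos = 0u;
    }
    return true;
}

/* Flash side: the next full page, or NULL if none is ready yet. */
static const uint8_t *flash_peek_page(void)
{
    return page_full[drain_idx] ? page[drain_idx] : NULL;
}

/* Flash side: call once the page has been written to release it. */
static void flash_page_done(void)
{
    page_full[drain_idx] = false;
    drain_idx ^= 1u;
}
```

Only one flag changes hands per 256 bytes, so the per-byte cost in the ISR is a store and a compare. The two buffers take 512 bytes, half of the 1 kB available.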

Licensed under: CC-BY-SA with attribution
