Question

I've been working through the Cambridge Baking Pi tutorials (basic OS development with small demos for the Raspberry Pi), except that I've been writing the code in C instead. I've got my development environment set up, and my code runs successfully with GCC optimization turned off, but not with it turned on.

The problem is in my code that uses memory-mapped I/O (if I compile everything except the following file with optimizations on, it works). I initially thought the cause was that my pointers were not declared volatile, so the compiler was optimizing the actual writes to memory away and using registers instead. But even when I declare them volatile, the problem persists.

Here's how I do a write to memory:

#define UART0_CR ((volatile uint32_t *) (UART0_BASE + 0x30))
...
*UART0_CR = 0;

The pointers are volatile-qualified, so I can't see why GCC would decide not to do the actual write. Is there anything else I need to look out for here? Am I misunderstanding the use of volatile?

Complete working file (with optimization off anyway):

#include <stdint.h>
#include <uart.h>

#define GPIO_BASE   0x20200000
#define GPPUD       ((volatile uint32_t *) (GPIO_BASE + 0x94))
#define GPPUDCLK0   ((volatile uint32_t *) (GPIO_BASE + 0x98))

#define UART0_BASE      0x20201000
#define UART0_DR        ((volatile uint32_t *) (UART0_BASE + 0x00))
#define UART0_FR        ((volatile uint32_t *) (UART0_BASE + 0x18))
#define UART0_IBRD      ((volatile uint32_t *) (UART0_BASE + 0x24))
#define UART0_FBRD      ((volatile uint32_t *) (UART0_BASE + 0x28))
#define UART0_LCRH      ((volatile uint32_t *) (UART0_BASE + 0x2C))
#define UART0_CR        ((volatile uint32_t *) (UART0_BASE + 0x30))
#define UART0_IMSC      ((volatile uint32_t *) (UART0_BASE + 0x38))
#define UART0_ICR       ((volatile uint32_t *) (UART0_BASE + 0x44))

static void delay(int32_t count) {
    asm volatile("__delay%=: subs %[count], %[count], #1; bne __delay%=\n"
                :
                : [count]"r"(count) 
                : "cc"
    );
}

void uart_init() {    
    *UART0_CR = 0; // Disable UART0.    
    *GPPUD = 0;     // Disable pull up/down for all GPIO pins & delay for 150 cycles.
    delay(150);   
    *GPPUDCLK0 = (1 << 14) | (1 << 15); // Disable pull up/down for pin 14,15 & delay for 150 cycles.
    delay(150);   
    *GPPUDCLK0 = 0; // Write 0 to GPPUDCLK0 to make it take effect.    
    *UART0_ICR = 0x7FF; // Clear pending interrupts.
    *UART0_IBRD = 1; //Set rate
    *UART0_FBRD = 40;     
    *UART0_LCRH = (1 << 4) | (1 << 5) | (1 << 6); // Enable FIFO & 8-bit data transmission (1 stop bit, no parity).
    *UART0_IMSC = (1 << 1) | (1 << 4) | (1 << 5) | (1 << 6) | (1 << 7) | (1 << 8) | (1 << 9) | (1 << 10); // Mask all interrupts.   
    *UART0_CR = (1 << 0) | (1 << 8) | (1 << 9);  // Enable UART0, receive & transfer part of UART.  
}

void uart_putc(uint8_t byte) {    
    while (1) { // wait for UART to become ready to transmit    
        if (!(*UART0_FR & (1 << 5))) break;
    }   
    *UART0_DR = byte; // Transmit
}

void uart_puts(const char *str) {
    while (*str) {
        uart_putc(*str++);
    }
}

EDIT:

I looked up how to view the assembly; very useful, thank you. If I take the first two writes out of uart_init (up to the first delay call), I get:

Unoptimized:

uart_init:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 1, uses_anonymous_args = 0
    stmfd   sp!, {fp, lr}
    add fp, sp, #4
    ldr r3, .L3
    mov r2, #0
    str r2, [r3, #0]
    ldr r3, .L3+4
    mov r2, #0
    str r2, [r3, #0]

Optimized:

uart_init:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr r3, .L2
    ldr r2, .L2+4
    mov r1, #0
    str r1, [r3, #48]
    str r1, [r2, #148]

The only difference seems to be that the unoptimized version stores with a zero offset ([r3, #0]) from the addresses loaded from .L3 and .L3+4, while the optimized version adds an offset (#48 and #148) to the addresses loaded from .L2 and .L2+4. That would be a problem unless the pointers stored at the unoptimized labels already have the offsets folded in.
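The offsets in the optimized listing do match the register offsets in the macros: 0x30 is 48 and 0x94 is 148. A minimal check of that arithmetic follows, assuming .L2 and .L2+4 hold UART0_BASE and GPIO_BASE (the literal pool is not shown in the question, so that part is an assumption). The names UART0_CR_ADDR, GPPUD_ADDR and reg_offset are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define GPIO_BASE   0x20200000u
#define UART0_BASE  0x20201000u

/* Register addresses, using the offsets from the macros in the question. */
#define UART0_CR_ADDR (UART0_BASE + 0x30u)
#define GPPUD_ADDR    (GPIO_BASE + 0x94u)

/* Offset of a register address from its peripheral base (hypothetical helper). */
static inline uint32_t reg_offset(uint32_t addr, uint32_t base) {
    return addr - base;
}

/* The immediates in "str r1, [r3, #48]" and "str r1, [r2, #148]". */
static_assert(UART0_CR_ADDR - UART0_BASE == 48, "0x30 is 48 decimal");
static_assert(GPPUD_ADDR - GPIO_BASE == 148, "0x94 is 148 decimal");
```

If that assumption holds, both listings store 0 to the same two addresses, which would mean the volatile writes themselves are being emitted correctly.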

I have a forked QEMU (qemu-rpi) modified for Raspberry Pi support, so I'm going to check what values are loaded into r3 and r2 to see whether they are the correct pointers before the store happens. Then I'm going to use breakpoints to check whether putc is looping on the transmit. I'm not very adept with my environment yet, so this might take me a while!


Solution

Using an optimizer along with code that talks to hardware can result in bizarre behavior.
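One concrete possibility in the question's code, offered as something to check rather than a confirmed diagnosis: the asm in delay decrements %[count] but declares it as an input-only operand. GCC assumes an input operand still holds its original value after the asm, so once delay is inlined with optimization on, the second delay(150) call may reuse a register that now contains 0 and spin for a very long time. Below is a sketch that declares the operand read-write with "+r". The non-ARM branch is a hypothetical addition so the snippet also builds on a host machine.

```c
#include <stdint.h>

/* Busy-wait for roughly `count` loop iterations (count must be > 0 on ARM). */
static void delay(int32_t count) {
#if defined(__arm__)
    /* "+r" tells GCC that the asm both reads and modifies count. */
    __asm__ volatile("__delay%=: subs %[count], %[count], #1; bne __delay%=\n"
                     : [count] "+r"(count)
                     :
                     : "cc");
#else
    /* Host-only fallback (an assumption, not from the original code). */
    while (count-- > 0) {
        __asm__ volatile("" ::: "memory"); /* keeps the loop from being removed */
    }
#endif
}
```

With the operand marked as modified, the compiler has to reload 150 before each call instead of trusting the stale register.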

Try using __attribute__((noinline)) in front of functions that talk to your hardware. I'd also recommend putting all code that talks with hardware into its own file and turning off optimization. The optimizer can re-order or in-line these items.
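A minimal sketch of the noinline suggestion, assuming GCC or Clang. So that it runs on a host, the device register is replaced by an ordinary volatile variable; fake_uart0_dr and last_written are hypothetical names for illustration, and the wait on UART0_FR from the original uart_putc is left out.

```c
#include <stdint.h>

/* Stand-in for the device register so the sketch runs on a host.
   On the Pi this would be the UART0_DR macro from the question. */
static volatile uint32_t fake_uart0_dr;
#define UART0_DR (&fake_uart0_dr)

/* noinline asks the compiler to keep this as a real call, even at -O2. */
__attribute__((noinline))
void uart_putc(uint8_t byte) {
    *UART0_DR = byte; /* volatile store: the compiler must emit it */
}

/* Hypothetical helper to observe the last value written. */
uint32_t last_written(void) {
    return fake_uart0_dr;
}
```

Turning optimization off for the hardware file only is a build setting: compile that one file with -O0 and the rest with your usual flags.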

Other tips

I'm trying to accomplish the same thing, but instead of using #define I assign the pointer like this:

volatile unsigned int *Register = (volatile unsigned int *)0xFFFEF000;

Then you can write values to the register by dereferencing the named pointer (the pointee must not be const, or the write will not compile):

*Register = 0xFFFFFFFF;
License: CC-BY-SA with attribution
Not affiliated with StackOverflow