Using yyparse() to make a two pass assembler?
-
23-08-2019 - |
Question
I'm writing an assembler for a custom micro controller I'm working on. I've got the assembler to a point where it will assemble instructions down to binary.
However, I'm now having problems with getting labels to work. Currently, when my assembler encounters a new label, it stores the name of the label and the memory location its referring to. When an instruction references a label, the assembler looks up the label and replaces the label with the appropriate value.
This is fine and dandy, but what if the label is defined after the instruction referencing it? Because of this, I need to have my parser run over the code twice.
Here's what I currently have for my main function:
303 int main(int argc, char* argv[])
304 {
305
306 if(argc < 1 || strcmp(argv[1],"-h")==0 || 0==strcmp(argv[1],"--help"))
307 {
308 //printf("%s\n", usage);
309 return 1;
310 }
311 // redirect stdin to the file pointer
312 int stdin = dup(0);
313 close(0);
314
315 // pass 1 on the file
316 int fp = open(argv[1], O_RDONLY, "r");
317 dup2(fp, 0);
318
319 yyparse();
320
321 lseek(fp, SEEK_SET, 0);
322
323 // pass 2 on the file
324 if(secondPassNeeded)
325 {
326 fp = open(argv[1], O_RDONLY, "r");
327 dup2(fp, 0);
328 yyparse();
329 }
330 close(fp);
331
332 // restore stdin
333 dup2(0, stdin);
334
335 for(int i = 0; i < labels.size(); i++)
336 {
337 printf("Label: %s, Loc: %d\n", labels[i].name.c_str(), labels[i].memoryLoc);
338 }
339 return 0;
340 }
I'm using this inside a flex/bison configuration.
Solution
If that is all you need, you don't need a full two-pass assembler. If the label is not defined when you reference it, you simply output a stand-in address (say 0x0000) and have a data structure that lists all of the places with forward references and what symbol they refered to. At the end of the file (or block if you have local symbols), you simply go through that list and patch the addresses.