Why did BASIC use line numbers?

https://softwareengineering.stackexchange.com/questions/309767

12-12-2020

Question

Why did old BASICs (and maybe other languages) use line numbers as part of the source code?

I mean, what problems did it (try to) solve?

Solution

BASIC needs to be taken in context with its contemporaries: early Fortran, COBOL, and assembly.

Back when I was dabbling in 6502 assembly without labels, whenever I found that I needed to add an instruction somewhere in the middle of tightly packed code, I had to go through and redo all of the jump addresses (later I started leaving NOPs as padding). This was time-consuming.

Fortran was a line-number-based system that predated BASIC. In Fortran, columns 1-5 held a line number used as a target for branching. The key difference was that Fortran compilers tended to be a bit more intelligent than the BASIC interpreter, so adding a few instructions was just a matter of punching some cards and putting them in the deck at the right place.

BASIC, on the other hand, had to keep all of its instructions ordered. There wasn't much of a concept of a 'continuation of the previous line'. Instead, in Applesoft BASIC (one of the widely used dialects, which I am familiar with and can find information on), each line in memory was represented as:

NN NN   TT TT   AA BB CC DD .. .. 00

It had two bytes for the address of the next line (NN NN), two bytes for the line number of this line (TT TT), and then a list of tokens (AA BB CC DD .. ..) followed by the end-of-line marker (00). (This is from pages 84-88 of Inside the Apple //e.)

An important point to realize when looking at that memory representation is that the lines can be stored in memory out of order. The structure of memory was that of a linked list with a 'next line' pointer in the structure. This made it easy to add new lines between two lines - but you had to number each line for it to work properly.
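The record layout described above can be sketched in Python. This is only an illustration under stated assumptions, not Applesoft's actual code: the function names are made up, the base address 0x0801 and the "next-line address of zero ends the program" convention are assumptions of the sketch, and plain ASCII text stands in for the real token bytes. The two-byte fields are packed little-endian, as on the 6502.

```python
import struct

BASE = 0x0801  # assumed start of program memory for this sketch


def build_program(lines, base=BASE):
    """Lay out (line_number, text) pairs as NN NN  TT TT  body  00 records."""
    memory = bytearray(0x10000)  # zero-filled 64 KB address space
    addr = base
    for line_no, text in lines:
        body = text.encode("ascii")          # stand-in for token bytes
        next_addr = addr + 4 + len(body) + 1  # header + body + 00 terminator
        record = struct.pack("<HH", next_addr, line_no) + body + b"\x00"
        memory[addr:addr + len(record)] = record
        addr = next_addr
    # The bytes at addr are still zero: a next-line address of 0 ends the walk.
    return memory


def walk_program(memory, start=BASE):
    """Follow the next-line pointers and return (line_number, text) pairs."""
    lines = []
    addr = start
    while True:
        (next_addr,) = struct.unpack_from("<H", memory, addr)
        if next_addr == 0:  # assumed end-of-program marker
            break
        (line_no,) = struct.unpack_from("<H", memory, addr + 2)
        end = memory.index(0, addr + 4)  # find the 00 end-of-line marker
        lines.append((line_no, memory[addr + 4:end].decode("ascii")))
        addr = next_addr
    return lines
```

Walking the structure only ever reads the pointer and the 16-bit line number from each header, which is why a numeric key per line is so cheap to store and compare.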

Many times when working with BASIC, you were actually working in BASIC itself. In particular, a given line of input was either a line number followed by BASIC instructions, or a command to the BASIC interpreter such as RUN or LIST. This made it easy to distinguish the code from the commands - all code starts with a number.

These two pieces of information explain why numbers were used: you can get a lot of information into 16 bits. String-based labels would take much more space and are harder to order. Numbers are easy to work with, understandable, and easier to represent.

Later BASIC dialects, where you weren't in the interpreter all the time, were able to do away with numbering every line and instead only needed to number the lines that were branch targets. In effect, labels.
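As a rough sketch of that rule, the check an interpreter needs is tiny: does the input start with digits? The function name and the returned tuples below are hypothetical, chosen only for illustration.

```python
def classify(entry):
    """Decide whether an input line is program text or an immediate command.

    Returns ("store", line_number, text) when the entry starts with digits,
    otherwise ("execute", command).
    """
    stripped = entry.strip()
    digits = ""
    for ch in stripped:
        if ch in "0123456789":
            digits += ch
        else:
            break
    if digits:
        return ("store", int(digits), stripped[len(digits):].strip())
    return ("execute", stripped)
```

For example, `classify("20 PRINT I")` is treated as a program line to store, while `classify("LIST")` is treated as a command to run immediately.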

OTHER TIPS

On early microcomputers editing was line based. You couldn't just move freely around in the source code and edit. You had a single line at the bottom of the screen where you could type commands and enter code. The rest of the screen was read-only code-listings and command output. If you wanted to edit say line 90 in the program you wrote "EDIT 90", and the contents of line 90 entered the single-line edit buffer. When you had edited the line you hit enter, and the program listing was updated. So you needed line numbers in order to be able to edit the program.

When code editors became more advanced and allowed you to move the cursor around in the code listing you didn't need line numbers anymore.

If you are thinking of the BASIC dialects of the 8-bit home microcomputers of the 1980s, then those computers did not have text editors (unless you bought some word processor application). There was no way to have the entire BASIC program source code "open in an editor", like you would have when programming today. Programmers wouldn't even think about the program as a source code file, or text, really.

Example problem

So, let's say you have a simple program without line numbers in your head:

FOR I=1 TO 42
PRINT I
NEXT I

You boot up your computer. You have a prompt, "ready" or something like that, and a cursor sitting on the next line. This is much like today's REPL environments of different scripting languages, though not as strictly line-based; it is more screen-based. So not quite like the REPLs of today, but close.

Now if you start entering the program, you might get an error after the first line, because the BASIC interpreter tries to immediately execute (and forget) it, and it doesn't make sense without a NEXT to end the loop. This is not a text editor where you edit text, this is where you give commands to the computer!

Partial solution

So you need some way to say, this is program line, store it! You could have a special command or just a symbol telling that hey, this is program line, store it. Let's imagine this:

#FOR I=1 TO 42
#PRINT I
#NEXT I

OK, now our imaginary BASIC interpreter has stored the program and you can run it. But now you want to edit the PRINT line. How do you do it? You are not in a text editor, so you can't just move the cursor to the line and edit it. Or you want to add another line like LET COUNT=COUNT+1 in the loop. How do you indicate where the new line should be inserted?

Working solution

Line numbers solve this in a very easy, if rather clunky, way. If you enter a program line with a number that already exists, the old line gets replaced. Now the screen-based REPL environment becomes useful, because you can just move the cursor to the program listing on screen, edit the line on screen and press ENTER to store it. This feels like editing the line, when in fact you are editing text on screen and then replacing the entire stored line with the new one from the screen. Also, inserting new lines becomes easy if you leave unused numbers in between. To demonstrate:

10 FOR I=1 TO 42
20 PRINT I
30 NEXT I

After re-entering line 20 with changes, and adding new lines, it could be:

5 LET COUNT=0
10 FOR I=1 TO 42
20 PRINT "Index", I
25 LET COUNT=COUNT+1
30 NEXT I
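The replace-or-insert behavior shown above can be modeled with a small keyed store. This is a minimal sketch, not how any particular BASIC implemented it; the class and method names are hypothetical. It also assumes the common convention that entering a bare line number deletes that line.

```python
class Program:
    """Stored program lines, keyed by line number."""

    def __init__(self):
        self.lines = {}

    def enter(self, number, text):
        """Store a line; an existing number is replaced, empty text deletes."""
        if text:
            self.lines[number] = text
        else:
            self.lines.pop(number, None)

    def listing(self):
        """Return the program in line-number order, as LIST would show it."""
        return [f"{n} {t}" for n, t in sorted(self.lines.items())]


program = Program()
program.enter(10, "FOR I=1 TO 42")
program.enter(20, "PRINT I")
program.enter(30, "NEXT I")

# Re-enter line 20 with changes, then add lines 5 and 25 in the gaps.
program.enter(20, 'PRINT "Index", I')
program.enter(5, "LET COUNT=0")
program.enter(25, "LET COUNT=COUNT+1")
```

The order in which lines are typed does not matter: the number alone says both which line is meant and where it belongs in the listing.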

Conclusion

Other answers explain how line numbers came to be. I'm trying to cover here why line numbers survived as long as they did, and how they kept solving a real-world problem: they offered a way to do the actual programming without a real editor, in a very simple way. Once proper, easy-to-use full-screen text editors became the mainstream way to edit code, as hardware limitations disappeared and the inertia of people adopting new things was overcome, line-number-based BASIC dialects quite quickly disappeared from use, because the core usability problem they solved was no longer an issue.

In the place and era when BASIC was developed, the best available I/O device was a teletype. Editing a program was done by printing (on paper) a listing of the whole program, or the interesting part of it, and then typing replacement lines with line numbers.

That's also why the default line numbering was by 10, so there would be unused numbers between existing lines.

"Line numbers" means a few different things.

First of all, keep in mind that the concept of "lines" hasn't been around forever. Many programming languages in this era used punched cards, and having sequence numbers (usually in the last few columns of the card) helped you recover your deck in the proper order if you dropped it, or something awful happened in the card reader. There were machines to do this automatically.

Line numbers for use as targets of GOTO statements are a totally different concept. In FORTRAN IV, they were optional, and preceded the statement (in columns 1-5). In addition to being easier to implement than free-form labels, there was also the concept of the computed and assigned GOTO, which allowed you to jump to an arbitrary line number. This is something most modern programming languages don't have (although switch statements come close), but it was a familiar trick to assembler programmers.
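To make the computed GOTO concrete: a statement such as GO TO (10, 20, 30), I picks the I-th label from a list. A minimal Python sketch of just the selection step follows. The function name is hypothetical, and returning None for an out-of-range selector (meaning "fall through to the next statement") is an assumption that mirrors later Fortran standards rather than every compiler of the era.

```python
def computed_goto(labels, selector):
    """Return the label chosen by a 1-based selector, or None if out of range."""
    if 1 <= selector <= len(labels):
        return labels[selector - 1]
    return None  # assumed behavior: no jump, continue with the next statement
```

With numeric labels, the jump target is ordinary data that can be indexed, which is what makes this close to a jump table in assembly.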

BASIC was derived from FORTRAN, and intended to be simpler to implement and to understand, so forcing every "line" to have a line number (both for sequencing and as the target of GOTO/GOSUB statements) was probably a design decision made for that reason.

I started programming in COBOL, which used line numbers in columns 1-6 of each line. Because there were no IDEs in the 1970s, everything was done via punched cards, and the line number was used to identify which lines in the original source were to be replaced and where new lines were to be added. We used to increment line numbers by 100 to give us room to add in more lines.

BASIC came about later than FORTRAN, in the line-terminal era. It featured a read-eval-print loop environment that was more interactive than a deck of cards.

I learned to program, in BASIC, on a one line display that held 24 characters. Line numbers were a natural way to specify where you wanted a line to go, whether editing one or inserting between others.

I really can't imagine how else you would do it.

One point no one's mentioned yet is that it's easier for beginners to reason about program flow where the branch targets are explicit. So rather than having to match (possibly nested) BEGIN/END statements (or whatever block delimiters were used), it was pretty obvious where the control flow went. This was probably useful given BASIC's target audience (it is the Beginner's All-purpose Symbolic Instruction Code, after all).

The Dartmouth Time Sharing System used a teletype interface, and thus a command-based interface. Originally, line numbers were just used as a means to edit the program. You could insert, replace, or delete a line by using its line number. It does not appear that the early version used line numbers for GOTO statements; that seems to have been a later addition to the language.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange
