What are data structures at the lowest level?

Question 1

Cool question!

It becomes hardware whenever someone hands it off to calls to the processor (CPU). If you have to read a manual from a chip manufacturer to understand how to do what you want to do, you're at the level you describe.

_{(source: micro-examples.com)}

Here's an overview of the path from physics to hardware to code to users: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-002-circuits-and-electronics-spring-2007/video-lectures/lecture-1/ although I don't think it's to do with lambda calculus. (But you're just saying that's what inspired the question—it's not the question per se, right?)

There is a step where the person who writes the language has to interface with processor instructions, eg https://en.wikipedia.org/wiki/X86_instruction_listings || https://en.wikipedia.org/wiki/X86_calling_conventions.

Peek at kernel programming or think about embedded systems (in a microwave, on the wing of an airplane), ARMs, or mobile devices—these are the things people program with that have non-laptop chipsets.

People who write BLAS (linear algebra solver libraries) sometimes get down to this level of detail. For example https://en.wikipedia.org/wiki/Math_Kernel_Library or https://en.wikipedia.org/wiki/Automatically_Tuned_Linear_Algebra_Software. When they say the BLAS is "tuned" what do they mean? They're talking about sniffing out facts about your processor and changing how inner-inner-inner loops make their decisions to waste less time with the way the physical thing is configured.

_{(source: hswstatic.com)}

If I recall correctly, high-level programming languages (like C ;) ) make no assumptions about what system they will be running on so they make agnostic calls that run like ten times† slower than they would if they knew ahead of time which kind of call to make. Every. Time. This is the kind of thing that could drive you insane, but it's that classic engineering tradeoff of the technical people's time versus the end-user's performance. I guess if you become a kernel programmer or embedded-systems programmer you can help put an end to all of the wasted clock cycles on computers across the globe—processors getting hot as they waste a lot of pointless going back-and-forth. (Although there are clearly worse things that go on in the world.)

†: I just quickly searched how much BLAS speedups are and yeah, it can be a factor of like 15 or 20. So I don't think I'm exaggerating / misremembering what I heard about wasted movement. BTW, there is something similar goes on in power generation: the final step (the turbine) in power generation is only like 20% efficient. Doesn't all that waste just drive you crazy?! Time to become an engineer. ;)

A cool project you could check out is MenuetOS; someone wrote an operating system in assembler.

Yet more cool stuff to look at could be this guy who says it's actually pretty easy and fun to learn x86 assembly language. (!)

_{(source: iqsdirectory.com)}

You can also read back on the olden days when there was less distance between software and hardware (eg, programming with a punchcard). Thankfully people have written "high-level" languages that are more like the way people talk and think and less like moving some tape around. Data structures may not be the most obvious thing from everyday conversation but they are more abstract than [lists a sequence of GOTO and STORE instructions...].

HTH

Question 2

The trick is that you don't have to care. cons, cdr and car are abstractions and so their underlying implementation shouldn't matter.

What you have there are what are called "Church pairs", we build everything from functions. In modern machines we build everything from strings of 1s and 0s, but really, it doesn't matter.

Now, if you're wondering how all of these abstractions are implemented at in your particular implementation, that depends. In all likelihood your compiler/interpeter is jumping through hoops behind the scenes allocating cons cells as tightly packed pair of pointers (or similar) and turning your functions into a string of 0's and 1's forming the appropriate machine code and pairing this with a pointer to its environment.

But like I said, the whole beauty of building these abstractions is that you don't have to care as a user :)

Question 3

The substrate at the bottom of the abstraction chain varies depending on the hardware, but it's probably going to be some sort of register machine, which will have some kind of native instruction set.

SICP's last chapter is all about this layer. The book doesn't present any real-life hardware, but an abstract (ahem turtles!) register machine that sort of resembles what might actually be going on ... which happens to be modeled in Scheme, just for extra metacircular fun. I found it a tough read, but worthwhile.

If you're interested in this sort of stuff you might want to check out Knuth, too.

Question 4

There exists several bottoms for abstraction chains. Depending on the underlying task computer scientist can choose one, that suits better. The most common are Turing Machine and Lambda Calculus.