Confusion about data types, compilers, hardware data representation and static vs dynamic typing [closed]

https://softwareengineering.stackexchange.com/questions/303783

09-12-2020

Question

I am trying to understand static vs dynamic typing, but am really struggling to see how everything fits together.

It all starts with data types. As far as I understand, data types are quite abstract notions, which exist 'in' compilers in order to categorise data so that the operations on various types of data can be validated (i.e. in an attempt to stop you adding a string to an integer), and in order to generate the right machine code for the hardware interpretation of the value. I.e. say we have the following:

int myInt = 5;
char myChar = '5';

Console.WriteLine(myInt);
Console.WriteLine(myChar);

Both would ultimately write a five to the console window, but, since the representations in memory of integers and characters are different, the machine code which interprets the value in the memory location bound to the variable myInt, which takes that value and displays it on the console window, would be different to the machine code for myChar. Even though Console.WriteLine() 'does the same job', the different representations of five require different low level code.

So my question is this: if data types 'exist only in the compiler' - i.e. once the program has been compiled into machine code there is no knowledge of what type of data the value in a particular memory cell is (everything is just 1s and 0s) - then how can any type-checking be done at runtime? Surely there is no concept of data types at run time? So surely dynamic typing can't be anything to do with type-checking occurring at run time?

Where is my understanding going wrong, and could somebody please explain static and dynamic typing with respect to the argument given above? What is the big picture of what is going on?

I am trying to understand this for an essay, so references to books or online sources would be useful :) Thank you.

Solution

Your assertion that "datatypes exist only in the compiler" is not true for dynamic languages (nor for quite a few static ones).

Once type information survives into the running program, type checking becomes a simple runtime check against that information.
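As a small illustration, here is a sketch in Python (picked only as an example of a dynamically typed language; the function name try_add is made up for this example). The check on the operand types happens when the addition actually runs, not before:

```python
def try_add(a, b):
    """Attempt a + b and report what the runtime type check decided."""
    try:
        return a + b
    except TypeError as exc:
        # The interpreter looked at the runtime types of a and b and
        # found no valid '+' for this combination.
        return f"TypeError: {exc}"


print(try_add(5, 5))    # both operands are int, so the check passes
print(try_add(5, "5"))  # int + str is rejected while the program runs
```

Nothing stops the second call from being written or started; the error only appears once the values, and the types they carry, meet at the + operation.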

OTHER TIPS

How can any type-checking be done at runtime? Surely there is no concept of data types at run time?

Sometimes there are data types at runtime. Other times, the type checking can be done entirely at compile time. In principle, it's up to the language implementation whether a given type check happens at compile time or at runtime, and implementers have a lot of options.

In languages like C++ and Java, there is a distinction between primitive types (typically integers, characters, etc.) and structured or complex types (typically structs and classes). In these languages it's common for only the structured types to retain some metadata about their type at runtime (though that usually includes the types of their primitive members/fields). For example, in C++, the typeid operator works even on primitive types and is typically resolved at compile time for them and for other non-polymorphic types, while a dynamic_cast is, in general, executed at runtime and only works on polymorphic classes (those with at least one virtual function).

In more dynamic languages like Python and JavaScript, any variable can hold a value of any type, so a straightforward interpreter will most likely have to implement all values as a "variant type", i.e. a structure that literally carries its own type around with it. A more sophisticated implementation may use optimizing AOT or JIT compilers on sections of code where all typing questions can be settled ahead of execution, so it's safe to replace the variant types and their runtime checking with highly optimized code. This is why you see articles about the V8 JavaScript engine's "hidden classes" that tell you to avoid dynamically changing the structure of your objects or the types stored in your arrays, because that sort of thing prevents V8 from optimizing the code down to "just 1s and 0s" that don't know their own types.
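The "variant type" idea can be sketched roughly as follows. This is only an illustration, not how any particular interpreter is written; the names Variant, variant_add and int_add_unchecked are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A value that carries its own type tag (hypothetical, for illustration)."""
    tag: str         # e.g. "int" or "str"
    payload: object  # the raw value itself


def variant_add(a: Variant, b: Variant) -> Variant:
    # Runtime type check: inspect the tags before touching the payloads.
    if a.tag != b.tag:
        raise TypeError(f"cannot add {a.tag} and {b.tag}")
    if a.tag in ("int", "str"):
        return Variant(a.tag, a.payload + b.payload)
    raise TypeError(f"addition is not defined for {a.tag}")


def int_add_unchecked(a: int, b: int) -> int:
    # What an optimizer could use instead once it has established that
    # both operands are always ints: no tags and no checks.
    return a + b


print(variant_add(Variant("int", 2), Variant("int", 3)))
```

The checked version pays for a tag comparison on every operation; the unchecked version is only safe where the types are already known, which is the situation an optimizing compiler tries to establish.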

Hopefully those examples are sufficient to clear up your confusion.

You may think that

Console.WriteLine(myInt);
Console.WriteLine(myChar);

get compiled in the same way. But most likely they get compiled to something like

Console.WriteInteger(myInt);
Console.WriteEndOfLine();
Console.WriteChar(myChar);
Console.WriteEndOfLine();
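To see why two different routines are needed, it can help to look at the bytes involved. The sketch below uses Python's struct module and assumes a 32-bit little-endian integer and an ASCII character, purely for illustration (a C# char is actually a 16-bit UTF-16 code unit):

```python
import struct

my_int = 5
my_char = "5"

# Assumption: 32-bit little-endian integer, ASCII-encoded character.
int_bytes = struct.pack("<i", my_int)
char_bytes = my_char.encode("ascii")

print(int_bytes.hex())   # 05000000
print(char_bytes.hex())  # 35
```

The integer holds the number five itself and has to be converted into digit characters before it can be displayed, whereas the character already holds the code for the glyph "5". Nothing in the bytes says which is which; the compiler's knowledge of the types is what selects the right routine.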

On the other hand, some languages are very flexible. For example, Swift has the data type Any which can literally hold any value. An Any object is quite large, because it contains not only the value, but a complete description of the value. If you write

var myInt: Any = 5
var myChar: Any = Character("5")

then you have two values of type Any, one containing the description of the "Int" type and an integer value, one containing the description of the "Character" type and a character value.

It depends on the language. In C, types exist only at compile time. In Java and C#, types exist both at compile time and at runtime. In dynamic languages (like Python, JavaScript etc.) they exist only at runtime.

Your code example would therefore work quite differently depending on the language. In Java/C# the two calls to Console.WriteLine() actually call two different methods. This is called method overloading - multiple methods can have the same name, as long as they have different types of parameters, so the compiler can select the correct method implementation at compile time depending on the types of the arguments. The two different methods just happen to produce the same output in this specific case.

A dynamic language, on the other hand, would typically not have method overloading, but might have a single method which behaves differently depending on the runtime type of the argument.
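A minimal sketch of that second style in Python (the function name describe is made up for this example): there is one function, and the branch taken is decided by inspecting the argument's type while the program runs.

```python
def describe(value):
    """One function; its behaviour depends on the runtime type of value."""
    if isinstance(value, int):
        return f"integer {value}"
    if isinstance(value, str):
        return f"text {value!r}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(describe(5))    # integer 5
print(describe("5"))  # text '5'
```

With overloading, the equivalent choice between the two branches would already have been made by the compiler, and no isinstance-style test would remain in the running program.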

It depends entirely on the runtime. Java does it one way, .NET does it another, Python does it a third way, C++ does it a limited way, and C doesn't do it at all.

Such runtime type info might come either as metadata (or a pointer to it) attached to an object, or instead be static type info disguised as dynamic type info (e.g. C++'s typeid operator applied to a non-polymorphic type). But, again, it depends on the runtime.
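For the "metadata attached to an object" case, Python is an easy place to look, since every object holds a reference to its type. The helper runtime_type_name below is invented for this sketch, and the sizes printed are specific to the interpreter you run it on (the comment about overhead assumes CPython):

```python
import sys


def runtime_type_name(obj):
    # type(obj) follows the object's own reference to its type object.
    return type(obj).__name__


for value in (5, "5"):
    # In CPython the reported size includes the per-object header
    # (reference count and type pointer), not just the raw value.
    print(runtime_type_name(value), sys.getsizeof(value))
```

The extra bytes per value are the price of being able to ask any object what it is at runtime.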

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange