Dynamic Language Musings

Posted by ludamad on April 29, 2009, 10:20 a.m.

I've been recently reading up on dynamic language implementations, dynamic languages are languages where 'everything can change' in a sense, and are often are easier to program in than statically typed languages. Python takes things to an extreme, allowing one to override built in functions, and making every look up based on a possibly expandable hash map.

The main problems with dynamic languages is that types have to be dealt with during run time, meaning you don't know if something has an inappropriate type until the program actually runs, and performance, which stops dynamic languages from being used for general purposes.

Of course, for the most part the reason dynamic languages aren't used is because of the latter: they have poor performance. A chess engine written in Python, for example, is 20 times slower than one written in C. With a modern chess algorithm, this means a loss of about 4 plys (4 half moves) in search depth. This is inadmissible for professional programs.

So I've decided to create my own fairly dynamic language, mostly for fun/practice. The goal is to gain reasonable speed without compromising syntax.

I came across a neat way to speed up run-time typed languages - a sort of 'type guessing'. Basically, you create code based on an assumption that an object does not change types frequently, and then simply do a verification that the type has not changed on run-time. Think of the following example:

a = 0

a = SomeFunction(a)

Though objects can change types in dynamic languages, they frequently stay the same type for the entire object's life. Here, it'd be quite complicated to track down SomeFunction, go through the syntax, and figure out if what it returns is for sure a number. A simpler way is to find the latest assignment that gives us a definite type for 'a', ie a = 0, and use that as the guessed type. This way, we can have code that treats 'a' as an integer, and possibly inline functions that are associated with integers. The type of optimizations you can do with type assumptions is vast.

Of course, if 'a' was passed to a function, that function cannot really guess its type - unless it is inlined. Inlining is not possible in Python due to its dynamism, but will be possible in the less dynamic LudaScript.

Also, typically the 'type' of an object is usually stored as a pointer to a hash table of its variables. In ludascript I will experiment with java-style virtual table pointers, however this might have bad cache performance with lots of function names.

Typically in dynamic languages, everything is merely a reference, and the type is kept on the heap next to the allocated object data. However, I think there is interesting potential if the type pointer was put outside the heap and stored alongside the reference, this would allow for the 'reference' to be interpreted as an integer, allowing for fast integers (pythonic integers are allocated on the heap and checked against a pool of already allocated integers to make sure no two of the same are being kept).

That is all for now.

Comments

Juju 15 years, 8 months ago

Interesting, even if a little over my head.

Polystyrene Man 15 years, 8 months ago

I'm completely dumb on this subject.

How do you even go about writing your own language?

ludamad 15 years, 8 months ago

Poly: Depends, you either make an interpreter or a compiler. Either way you're just making something that looks at text and turns it into runnable instructions. Strictly speaking you 'design' and then 'implement' a programming language. Most recent languages implement and design at the same time (like Ruby, which has no formal specification, but whatever the C implementation does is considered standard).