Forum Replies Created
- AuthorPosts
I’ve seen mention in these fora that the Mill provides some support for garbage collectors. I’d like to ask a few questions on the topic if I may.
(1) Garbage collection typically involves a “stop the world” event where threads in the mutator are temporarily suspended, at least for the purposes of identifying roots. Stopping the world can be quite costly. Does the Mill offer any special support in this regard?
(2) Each concurrently executing thread may have pointers sitting on its belt. How can the GC get access to these during root marking?
(3) Many collectors compact memory. Since pointers can live on the belt and/or may have been squirreled away by the spiller, how should the GC take this into account?
Apologies for asking more than one question in a single post, but they are closely related.
- in reply to: Simulation #1688
# Flom – a Functional Language on the Mill
Ralph Becket at outlook dot com, 2-Feb-2015
I considered calling this “Mill Functional”, but that had an unfortunate contraction of its name.
## Introduction
The [Mill](http://millcomputing.com) is a new CPU architecture. The designers have invited curious parties to propose a simple language to be compiled or interpreted under simulation. Real Programmers use functional languages and this is my contribution.
Flom is a statically typed, strict functional programming language in the vein of [Core Haskell](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/CoreSynType) as used in GHC (although GHC’s Core is, of course, lazy). Some of the implementation ideas I have taken from my experience developing the [Mercury](http://mercurylang.org/) compiler.
The goal is to produce something useful, which can be implemented reasonably efficiently with modest effort, and which can be later extended largely via syntactic sugar.
## Some examples
If you’re familiar with Haskell or ML or F#, these examples are standard. (The syntax is somewhat conjectural and the reader is invited to use their imagination to fill in any gaps.)
### Fibonacci
The slow way:
let fib n = if n < 2 then 1 else (fib (n - 1)) + (fib (n - 2))
The quick way:
let fib n = let fib' n i j = if n <= 1 then i else fib (n - 1) j (i + j) in fib' n 1 1
### Lists
// All introduced types are either type synonyms or // discriminated unions (aka algebraic data types). // Types can take type parameters, as here. // type list T = nil | cons T (list T) let len xs = case xs of nil = 0 | cons _ xs' = 1 + (len xs') let map f xs = case xs of nil = nil | cons x xs' = cons (f x) (map f xs') let foldr f xs z = case xs of nil = z | cons x xs' = f x (foldr f xs' z) // Note that type constructors are also functions: // let append xs ys = foldlr cons xs ys let filter p xs = let f x ys = if p x then cons x ys else ys in foldlr f xs nil
### Quicksort
let qsort xs = case xs of nil = nil | cons x xs' = let as = qsort (filter (> x) xs') let zs = qsort (filter (< x) xs') in append as (cons x zs)
## Core Flom and desugaring
Expr ::= k // Literal constant; e.g., an integer. | ID // A named quantity (variable, function, etc.) | Expr Expr ... // Function application | LetsExpr | CaseExpr | ( Expr ) LetsExpr ::= LetExpr* in Expr LetExpr ::= let ID ID* = Expr // Parameters indicate a function. CaseExpr ::= case Expr of Pattern | Pattern ... Pattern ::= ID (ID | _) * = Expr // First ID is the constructor tag; | _ = Expr // _ is the "don't care" pattern. Type ::= type ID ID* = ID TypeExpr* | ID TypeExpr* ... TypeExpr ::= ID ID* // Type constructor or type synonym.
### Desugaring
if
expressionsif x then y else z
desugars to
case x of True = y | _ = z
### Desugaring nested expressions
f (g x)
desugars to
let tmp = g x in f tmp
(and likewise for all the other cases until we’re left with nothing else to simplify).
### Desugaring currying
If, say,
f
is a function of arity three occuring in an application of arity one, thenf x
desugars to
let f' y z = f x y z in f'
### Desugaring free variables and local functions
A free variable is defined here as a name whose value cannot be statically determined. The free variables of a function are desugared away by explicitly constructing and passing an environment argument and lifting the local function definition to the top level. For example,
let f x y = let g a b = a * x + b * y in g 11 22
desugars to
let f x y = let env = tuple2 x y in g' env 11 22 let g' env a b = case env of tuple2 x y = a * x + b * y
where
tuple2
stands for some built-in arity-2 tuple data type.### Other desugaring opportunities
The usual opportunities exist for adding nicer syntax for, say, lambdas, lists, etc.
## Typechecking and type inference
The type system is plain old Hindley-Milner and [Algorithm W](http://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system) will do the trick nicely.
## Implementation detail
[This section is essentially a set of suggestions for writing a reasonably efficient Flom compiler without great effort. The first instance would be a cross-compiler, which ought to be sufficient to craft a native compiler in Flom itself.]
All atomic values (built-in types such as ints, chars, floats, etc.) occupy space equivalent to a pointer.
All values constructed via data constructor live on the heap and are referenced by pointer. Each such construction is a vector of pointer-sized words, comprising the data constructor tag followed by any argument values.
The usual discriminated-union optimisation tricks can all be applied at some point.
Equality and comparison on atomic values is by identity. Equality and comparison for algebraic data types is lexicographic.
There is no destructive assignment! Any such facility, including IO, must be provided via functions written in something other than Flom and somehow imported.
Garbage collection ought to be straightforward, but it isn’t clear to me exactly what is involved with the Mill.
### Translation to genAsm
This is just a sketch… and my grasp of genAsm is wobbly at best.
[[e]]v
denotes the translation of Flom expressione
, bound to genAsm symbolv
.k ---> k -: v_k where k is a constant x ---> v_x where x is a function parameter or local let x = e local or top-level variable ---> [[e]]v_x case e of patExpr1 | patExpr2 | ... ---> [[e]]v_e if [[patExpr1]]v_c elif [[patExpr2]]v_c ... else <disaster!> where [[tag x1 x2 ... = e]]v ---> load(v_e, 0) <tag> eql then load(v_e, 1) -: v_x1 load(v_e, 2) -: v_x2 ... [[e]]v f x1 x2 ... function application ---> [[x1]]v_x1 [[x2]]v_x2 ... call f v_x1 v_x2 ... -: v_f N.B.: the simple direct tail recursion case should be compiled into an ordinary loop. ctor x1 x2 ... data constructor function ---> func ctor(x1, x2, ...): <alloc space>v_c store(v_c, 0, <tag>) store(v_c, 1, x1) store(v_c, 2, x2) ... retn v_c let f x1 x2 ... = e top-level function definition ---> func f(x1, x2, ...): [[e]]v retn v
- This reply was modified 9 years, 9 months ago by ralphbecket.
- This reply was modified 9 years, 9 months ago by ralphbecket.
- This reply was modified 9 years, 9 months ago by ralphbecket.
- This reply was modified 9 years, 9 months ago by ralphbecket. Reason: Trying to preserve code formatting
- in reply to: Simulation #1647
I would love to tackle something Lispish, but it pains me to say I couldn’t guarantee the time to do it. Tempt me with a simulator and I might have a go!
Regarding Scheme’s tail-calls and weird call/cc stuff: for a decade I worked on the Mercury compiler which had several back-ends, typically targetting GCC (there was at least one register-machine like “VM” and at least one native C back-end). For tail calls in the “native” C back-end, we’d resort to using GCC’s “goto label-variable” in various places. Tail recursion was converted into while loops, but non-inlined mutual recursion just left stuff on the stack. I cannot recall an instance where this got us into trouble, and we were running very large Mecury programs (a few commercial enterprises use Mercury as well).
- in reply to: Simulation #1644
Just to stick my oar in, but if the tricks you need to pull to make a decent interpreter exceed those needed to write a naive compiler, why not just write a naive compiler?
As a languages/compilers enthusiast, Forth has always struck me as an amusing exercise in implementation, but a total pain in the posterior to read or write. Maybe that’s just me.
I would *really* love it if the first language on the Mill was inspired by the lambda calculus…
- in reply to: Simulation #1699
@tomberek – I’m not sure that Flom is intended as a demonstration of the Mill’s cleverness. Rather, my idea was to come up with something that
- is easy to implement on the Mill (Forth’s dual stacks seems problematic),
- is comfortable enough for something like real programming,
- and is amenable to optimisation.
The simple, regular, statically checked type system of Flom (it’s a stripped-down ML) isn’t just expressive, it makes code generation much simpler and the generated code can be much more efficient.
- in reply to: Simulation #1694
Apologies if this was the wrong spot — I just wanted to keep the discussion going.
- AuthorPosts