Instruction `0037` is where it goes off the rails.
### Things left undone
Many. But things that were actively under development and in a state of unfinishedness:
1. Tuple patterns
2. Loops
3. Function calls
#### Tuple patterns
This is the blocking issue for loops, function calls, etc.
You need tuple pattern matching to get proper looping and function calls.
Here are some of the issues I'm having:
* Is it possible to represent tuples on the stack? Right now they're allocated on the heap, which isn't great for function calls.
* How to represent complex patterns? There are a few possibilities:
- Hard-coded into the bytecode (this is probably the thing to do?)
- Represented as a data structure, which itself would have to be allocated on the heap
- Some hybrid of the two:
* Easy scalar values are hard-coded: `nil`, `true`, `:foo`, `10` are all built into the bytecode + constants table
* Perhaps dict patterns will have to be data structures
#### Patterns, generally
On reflection, I think the easiest (perhaps not simplest!) way to go is to model the patterns as separate datatypes stored per-chunk in a vec.
The idea is that we push a value onto the stack, and then have a `match` instruction that takes an index into the pattern vec.
We don't even need to store all pattern types in that vec: constants (which already get stored in the constant vec), interpolations, and compound patterns.
`nil`, `false`, etc. are singletons and can be handled like (but not exactly as) the `nil` and `false`_values_.
This also means we can outsource the pattern matching mechanics to Rust, which means we don't have to fuss with "efficient compiling of pattern matching" titchiness.
This also has the benefit, while probably being fast _enough_, of reflecting the conceptual domain of Ludus, in which patterns and values are different DSLs within the language.
So: model it that way.
### Now that we've got tuple patterns
#### May 23, 2025
A few thoughts:
* Patterns aren't _things_, they're complex conditional forms. I had not really learned that; note that the "compiling pattern matching efficiently" paper is about how to optimize those conditionals. The tuple pattern compilation more closely resembles an `if` or `when` form.
* Tuple patterns break the nice stack-based semantics of binding. So do other patterns. That means I had to separate out bindings and the stack. I did this by introducing a representation of the stack into the compiler (it's just a stack-depth counter).
- This ended up being rather titchy. I think there's a lot of room to simplify things by avoiding manipulating this counter directly. My sense is that I should probably move a lot of the `emit_op` calls into methods that ensure all the bookkeeping happens automatically.
*`when` is much closer to `if` than `match`; remember that!
* Function calls should be different from tuple pattern matching. Tuples are currently (and maybe forever?) allocated on the heap. Function calls should *not* have to pass through the heap. The good news: `Arguments` is already a different AST node type than `Tuple`; we'll want an `ArgumentsPattern` pattern node type that's different from (and thus compiled differently than) `TuplePattern`. They'll be similar--the matching logic is the same, after all--but the arguments will be on the stack already, and won't need to be unpacked in the same way.
- One difficulty will be matching against different arities? But actually, we should compile these arities as different functions.
- Given splats, can we actually compile functions into different arities? Consider the following:
```
fn foo {
(x) -> & arity 1
(y, z) -> & arity 2
(x, y, 2) -> & arity 3
(...z) -> & arity 0+
}
```
`fn(1, 2, 3)` and `fn(1, 2, 4)` would invoke different "arities" of the function, 3 and 0+, respectively. I suspect the simpler thing really is to just compile each function as a singleton, and just keep track of the number of arguments you're matching.
* Before we get to function calls, `loop`/`recur` make sense as the starting ground. I had started that before, and there's some code in those branches of the compiler. But I ran into tuple pattern matching. That's now done, although actually, the `loop`/`recur` situation probably needs a rewrite from the ground up.
* Just remember: we're not aiming for *fast*, we're aiming for *fast enough*. And I don't have a ton of time. So the thing to do is to be as little clever as possible.
* I suspect the dominoes will fall reasonably quickly from here through the following:
- [x]`list` and `dict` patterns
- [x] updating `repeat`
- [x] standing up `loop`/`recur`
- [x] standing up functions
- [x] more complex synthetic expressions
- [x]`do` expressions
* That will get me a lot of the way there. What's left after that which might be challenging?
- [x] string interpolation
- [x] splats
- [ ] splatterns
- [x] string patterns
- [x] partial application
- [ ] tail calls
- [ ] stack traces in panics
- [ ] actually good lexing, parsing, and validation errors. I got some of the way there in the fall, but everything needs to be "good enough."
* After that, we're in integration hell: taking this thing and putting it together for Computer Class 1. Other things that I want (e.g., `test` forms) are for later on.
* There's then a whole host of things I'll need to get done for CC2:
- some kind of actual parsing strategy (that's good enough for "Dissociated Press"/Markov chains)
- actors
- animation in the frontend
* In addition to a lot of this, I think I need some kind of testing solution. The Janet interpreter is pretty well-behaved.
### Now that we've got some additional opcodes and loop/recur working
#### 2025-05-27
The `loop` compilation is _almost_ the same as a function body. That said, the thing that's different is that we don't know the arity of the function that's called.
A few possibilities:
* Probably the best option: enforce a new requirement that splat patterns in function clauses *must* be longer than any explicit arity of the function. So, taking the above:
```
fn foo {
(x) -> & arity 1
(y, z) -> & arity 2
(x, y, 2) -> & arity 3
(...z) -> & arity 0+
}
```
This would give you a validation error that splats must be longer than any other arity.
Similarly, we could enforce this:
```
fn foo {
(x) -> & arity 1
(x, y) -> & arity 2
(x, ...) & arity n > 1; error! too short
(x, y, ...) & arity n > 2; ok!
}
```
The algorithm for compiling functions ends up being a little bit titchy, because we'll have to store multiple functions (i.e. chunks) with different arities.
Each arity gets a different chunk.
And the call function opcode comes with a second argument that specifies the number of arguments, 0 to 7.
(That means we only get a max of 7 arguments, unless I decide to give the call opcode two bytes, and make the call/return register much bigger.)
Todo, then:
* [x] reduce the call/return register to 7
* [x] implement single-arity compilation
* [x] implement single-arity calls, but with two bytes
* [x] compile multiple-arity functions
* [x] add closures
### Some thoughts while chattin w/ MNL
On `or` and `and`: these should be reserved words with special casing in the parser.
You can't pass them as functions, because that would change their semantics.
So they *look* like functions, but they don't *behave* like functions.
In Clojure, if you try `(def foo or)`, you get an error that you "can't take the value of a macro."
I'll need to change Ludus so that `or` and `and` expressions actually generate different AST nodes, and then compile them from there.
AND: done.
### Implementing functions & calls, finally
#### 2025-06-01
Just to explain where I am to myself:
* I have a rough draft (probably not yet fully functional) of function compilation in the compiler.
* Now I have to implement two op codes: `Call` and `Return`.
* I now need to implement call frames.
* A frame in _Crafting Interpreters_ has: a pointer to a Lox function object, an ip, and an index into the value stack that indicates the stack bottom for this function call. Taking each of these in turn:
* The lifetime for the pointer to the function:
- The pointer to the function object cannot, I think, be an explicit lifetime reference, since I don't think I know how to prove to the Rust borrow checker that the function will live long enough, especially since it's inside an `Rc`.
- That suggests that I actually need the whole `Value::Fn` struct and not just the inner `LFn` struct, so I can borrow it.
* The ip and stack bottom are just placeholders and don't change.
### Partially applied functions
#### 2025-06-05
Partially applied functions are a little complicated, because they involve both the compiler and the VM.
Maybe.
My sense is that, perhaps, the way to do them is actually to create a different value branch.
They ....
### Jumping! Numbers! And giving up the grind
#### 2025-06-05
Ok! So.
This won't be ready for next week.
That's clear enough now--even though I've made swell progress!
One thing I just discovered, which, well, it feels silly I haven't found this before.
Jump instructions, all of them, need 16 bits, not 8.
That means some fancy bit shifting, and likely some refactoring of the compiler & vm to make them easier to work with.
For reference, here's the algorithm for munging u8s and u16s:
```rust
let a: u16 = 14261;
let b_high: u8 = (a >> 8) as u8;
let b_low: u8 = a as u8;
let c: u16 = ((b_high as u16) <<8)+b_lowasu16;
println!("{a} // {b_high}/{b_low} // {c}");
```
To reiterate the punch list that *I would have needed for Computer Class 1*:
* In the middle of doing TCO, looks like it works for `do` forms based on the bytecode, but I ran into these weird binding problems while trying to test that in the vm
* Once that's put to bed, I believe TCO fully works
- I suspect this can be done not using anything other than an index into a chunk's `constants` vec--no fancy memory swapping or anything; and also--done in the compiler rather than the VM.
- [ ] Splats in functions must be the same arity, and greater than any explicit arity
* [ ] actually good error messages
- [ ] parsing
- [ ] my memory is that validator messages are already good?
- [ ] panics, esp. no match panics
* [ ] panics should be able to refernce the line number where they fail
* [ ] that suggests that we need a mapping from bytecodes to AST nodes
* The way I had been planning on doing this is having a vec that moves in lockstep with bytecode that's just references to ast nodes, which are `'static`, so that shouldn't be too bad. But this is per-chunk, which means we need a reference to that vec in the VM. My sense is that what we want is actually a separate data structure that holds the AST nodes--we'll only need them in the sad path, which can be slow.
### Bugs discovered while trying to compile prelude
#### 2025-06-20
Consider the following code:
```
fn one {
(x as :number) -> {
fn two () -> :number
two
}
(x as :bool) -> {
fn two () -> :bool
two
}
}
```
The second clause causes a panic in the compiler.
I'm not entirely sure what's going on.
That said, I'm pretty sure the root of it is that the fact that `two` was already bound in the first clause means that it's in the constants vector in that chunk under the name "two."
Just patched up the `if` alternative branch unconditional jump, which was jumping too far.
Now it really is just some pretty systematic testing of prelude functions, including the problems with upvalues, which I haven't yet been able to recreate.
Rather than doing everything by hand right now, I think the way to go about things is to figure out how to do as much automated bugfixing as possible.
That means reprioritizing some things.
So here's a short punch list of things to do in that register:
* [x] Hook validator back in to both source AND prelude code
- [x] Validator should know about the environment for global/prelude function
- [x] Run validator on current prelude to fix current known errors
* [ ] Do what it takes to compile this interpreter into Ludus's JS environment
- [ ] JSONify Ludus values
- [ ] Write a function that's source code to JSON result
- [ ] Expose this to a WASM compiler
- [ ] Patch this into a JS file
- [ ] Automate this build process
* [ ] Start testing against the cases in `ludus-test`
* [ ] Systematically debug prelude
- [ ] Bring it in function by function, testing each in turn
***
I've started working on systematically going through the Prelude.
I've found a closure error.
This is in `map`/`mapping`.
What's happening is that the inner function, `mapping`, is closing over values directly from a stack that no longer exists.
What we need to have happen is that if a function is closing over values _inside_ a function, it needs to capture not the upvalues directly from the stack, but _from the enclosing closure_.
I think I need to consult Uncle Bob Nystrom to get a sense of what to do here.
***
So I found the minimal test case:
```
let foo = {
let thing = :thing
let bar = :bar
let baz = :baz
fn quux () -> {
fn frobulate () -> (bar, baz)
}
}
foo ()
```
`frobulate` is closed over when `quux` is called, when the stack looks completely different than it does when `quux` is defined.
If you remove line 2, binding `thing`, then you don't get a panic, but when you call `foo () ()`, the result is `(fn quux, fn frobulate)`.
The problem is that `frobulate`'s upvalues are indexes into the stack, rather than having some relation to `quux`'s upvalues.
What needs to happen is that an enclosing function needs to capture, define, and pass down the upvalues for its enclosed functions.
I'm having an exact problem that Uncle Bob is describing at
* [ ] I need to return to the question of whether/how strings are ordered; do we use `at`, or do we want `char_at`? etc.
* [ ] Original implementation of `butlast` is breaking stack discipline; I don't know why. It ends up returning from evaluating one of the arguments straight into a `load` instruction. Something about tail calls and ternary synthetic expressions and base functions. (For now, I can call `slice` instead of `base :slice` and it works.)
- Original version of `update` also had this same problem with `assoc`; fixed it by calling the Ludus, rather than Rust, function.
NEW PROBLEM: a lot of instructions in the VM don't properly offset from the call frame's stack base, which leads to weirdness when doing things inside function calls.
NEW SOLUTION: create a function that does the offset properly, and replace everywhere we directly access the stack.
At the moment, I'm thinking about how to get good error messages.
Panics are difficult.
And I'm worried about ariadne as the error reporting crate.
Since it writes to stdout, it has all kinds of escape codes.
I need a plain ass string, at least for the web frontend.
So.
Current task, however, is how to get reasonable panic error messages.
Let's simplify the problem.
First, let's get tracebacks and line numbers.
We use Chumsky spanned ASTs.
The span is just a range into an str.
What I can do is pretty stupidly just generate line numbers from the spans in the compiler, and from there, get a reasonable traceback.
So instead of using Ariadne's fancy report builder, let's just do something more what we already have here:
```
Ludus panicked! no match
on line 1 in input,
calling: add
with arguments: ("foo")
expected match with one of:
()
(x as :number)
(x as :number, y as :number)
(x, y, ...zs)
((x1, y1), (x2, y2))
>>> add ("foo")
......^
```
We need:
* the location of the function call in terms of the line number
* the arguments used
* the patterns expected to match (special cases: `let` vs `match` vs `fn`)
That means, for bookkeeping, we need:
* In the compiler, line number
* In the VM, the arguments
* In the VM, the pattern AST node.
In Janet-Ludus, there are only a few types of panic:
*`fn-no-match`: no match against a function call
*`let-no-match`: no match against a `let` binding
*`match-no-match`: no match against a `match` form
*`generic-panic`: everything else
*`runtime-error`: an internal Ludus error
The first three are simply formatting differences.
There are no tracebacks.
Tracebacks should be easy enough, although there's some fiddly bits.
While it's nice to have the carret, the brutalist attempt here should be just to give us the line--since the carret isn't exactly precise in the Janet interpereter.
And the traceback should look something like:
```
calling foo with (:bar, :baz)
at line 12 in input
calling bar with ()
at line 34 in prelude
calling baz with (1, 2, 3)
at line 12 in input
```
Which means, again: function names, ip->line conversion, and arguments.
The runtime needs a representation of the patterns in _any_ matching form.
The speed is so much greater now that I'm not so concerned about little optimizations.
So: a chunk needs a vec of patterns-representations. (I'm thinking simply a `Vec<String>`.)
So does a function, for `doc!`.
Same same re: `Vec<String>`.
A VM needs a register for the scrutinee (which with function calls is just the arguments, already captured).
A VM also needs a register for the pattern.
So when there's a no match, we just yank the pattern and the scrutinee out of these registers.
This all seems very straightforward compared to the compiling & VM stuff.
Here's some stub code I wrote for dealing with ranges, source, line numbers:
```rust
let str = "foo bar baz\nquux frob\nthing thing thing";
let range = 0..4;
println!("{}", str.get(range).unwrap());
let lines: Vec<&str> = str.split_terminator("\n").collect();