# Working notes on bytecode stuff

### 2024-12-15
So far, I've done the easy stuff: constants, and ifs.

There's still some easy stuff left:
* [ ] lists
* [ ] dicts
* [ ] when
* [ ] panic

So I'll do those next.

But then we've got two doozies: patterns and bindings, and tuples.

#### Tuples make things hard
In fact, it's tuples that make things hard.
The idea is that, when possible, tuples should be stored on the stack.
That makes them a different creature than anything else.
But the goal is to be able, in a function call, to just push a tuple onto the stack, and then match against it.
Because a tuple _isn't_ just another `Value`, that makes things challenging.
BUT: matching against all other `Value`s should be straightforward enough?

I think that the way to do this is to reify patterns.
Rather than try to emit bytecodes to embody patterns, the patterns are some kind of data that gets compiled and pushed onto the stack like keywords and interned strings and whatnot.
And then you can push a pattern onto the stack right behind a value, and then have a `match` opcode that pops them off.

Things get a bit gnarly since patterns can be nested. I'll start with the basic cases and run from there.

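A rough sketch of what "patterns as data" could look like, nesting included. Everything here is an assumption for illustration: the `Value` and `Pattern` variants, the `apply_pattern` name, and collecting bindings in a `Vec` are not the actual Ludus types.

```rust
/// Hypothetical runtime value; the real `Value` surely has more variants.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Nil,
    Bool(bool),
    Number(f64),
    List(Vec<Value>),
}

/// A pattern reified as plain data rather than as a sequence of opcodes.
#[derive(Debug, Clone, PartialEq)]
enum Pattern {
    Placeholder,        // `_`: matches anything, binds nothing
    Word(String),       // a name: matches anything, binds it
    Atom(Value),        // a literal: matches only an equal value
    List(Vec<Pattern>), // nested: one sub-pattern per element
}

/// Apply `pattern` to `value`, collecting bindings in order.
/// Returns `false` on the first mismatch; `bindings` may then hold
/// partial results, so callers should discard it on failure.
fn apply_pattern(
    pattern: &Pattern,
    value: &Value,
    bindings: &mut Vec<(String, Value)>,
) -> bool {
    match (pattern, value) {
        (Pattern::Placeholder, _) => true,
        (Pattern::Word(name), v) => {
            bindings.push((name.clone(), v.clone()));
            true
        }
        (Pattern::Atom(expected), v) => expected == v,
        (Pattern::List(pats), Value::List(vals)) => {
            pats.len() == vals.len()
                && pats
                    .iter()
                    .zip(vals)
                    .all(|(p, v)| apply_pattern(p, v, bindings))
        }
        (Pattern::List(_), _) => false,
    }
}
```

The nesting falls out of plain recursion here, which is the appeal of data over opcodes: a `match` instruction only has to hand the two stack entries to one function.
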
But where things get *very* gnarly is when considering tuples on the stack.
How do you pop off a tuple?

Two thoughts:
1. Just put tuples on the heap. And treat function arguments/matching differently.
2. Have a "register" that stages values to be pattern matched.

##### Regarding the first option
I recall seeing somebody somewhere make a comment that trying to represent function arguments as tuples caused tons of pain.
I can see why that would be the case, from an implementation standpoint.
We should have _values_, and not do fancy bookkeeping if we don't have to.

_Conceptually_, it makes a great deal of sense to think of tuples as being deeply the same as function invocation.
But _practically_, they are different things, especially with Rust underneath.

This feels like it cuts along the grain, and so this is what I will try.

I suspect that I'll end up specializing a lot around function arguments and calling, but that feels more tractable than the bookkeeping around stack-based tuples.

### 2024-12-17
Next thoughts: take some things systematically rather than choosing an approach first.

#### Things that always match
* Placeholder.
    - I _think_ this is just a no-op. A `let` expression leaves its rhs pushed on the stack.

* Word: put something on the stack, and bind a name.
    - This should follow the logic of locals as articulated in _Crafting Interpreters_.

In both of these cases, there's no conditional logic, simply a bind.

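For the word case, a minimal sketch of the locals idea: the compiler keeps a list of names, and a name's index in that list is the stack slot its value sits in at runtime. The `Compiler`, `bind`, and `resolve` names are made up here, and this ignores scopes and temporaries entirely.

```rust
/// Compile-time bookkeeping for locals (hypothetical names throughout).
/// The index of a name in `locals` is the stack slot its value occupies
/// at runtime, so binding a word needs no instruction of its own.
struct Compiler {
    locals: Vec<String>,
}

impl Compiler {
    fn new() -> Self {
        Compiler { locals: Vec::new() }
    }

    /// Bind `name` to the value currently on top of the stack.
    /// Returns the slot index the name now refers to.
    fn bind(&mut self, name: &str) -> usize {
        self.locals.push(name.to_string());
        self.locals.len() - 1
    }

    /// Resolve a name to its slot, searching newest-first so that a
    /// later binding shadows an earlier one with the same name.
    fn resolve(&self, name: &str) -> Option<usize> {
        self.locals.iter().rposition(|local| local.as_str() == name)
    }
}
```

This is why a word pattern has no conditional logic: the rhs is already on the stack, and the bind is purely a compile-time note about where it lives.
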
#### Things that never bind
* Atomic values: put the rhs on the stack, then do an equality check, and panic if it fails. Leave the thing on the stack.

#### Analysis
In terms of bytecode, I think one thing to do, in the simple case, is to do the following:
* `push` a `pattern` onto the stack
* `match`: pops the pattern and the value off the stack, applies the pattern to the value, then puts the value back and pushes a special value onto the stack representing a match, or not.
    - We'll probably want `match-1`, `match-2`, `match-3`, etc., opcodes for matching a value that's that far back in the stack. E.g., `match-1` matches against not the top element, but the `top - 1` element.
    - This is _specifically_ for matching function arguments and `loop` forms.
* There are a few different things we might do from here:
    - `panic_if_no_match`: panic if the last thing is a `no_match`, or just keep going if not.
    - `jump_if_no_match`: in a `match` form or a function, we'll want to move to the next clause if there's no match, so jump to the next clause's `pattern` `push` code.
* Compound patterns are going to be more complex.
    - I think what you're going to need is opcodes that work on our data structures: for example, a `match_compound` opcode that starts digging into the pattern.
* Compound patterns are specifically _data structures_. So simple structures should be stack-allocated, and complex structures should be pointers to something on the heap. Maybe?

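The simple case above, sketched as a tiny interpreter loop. All of it is assumed for illustration: the `Op` and `Value` shapes, absolute jump targets, and having `panic_if_no_match` and `jump_if_no_match` consume the match result are choices made here, not settled design. Only placeholder and atomic patterns are handled.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Number(f64),
    Pattern(Pattern), // patterns sit on the stack like any other value
    Matched,          // the "special value" for a successful match
    NoMatch,          // ...and for a failed one
}

#[derive(Debug, Clone, PartialEq)]
enum Pattern {
    Placeholder,
    Atom(Box<Value>),
}

#[derive(Debug, Clone)]
enum Op {
    Push(Value),
    Match,                // expects a pattern on top, the value beneath it
    PanicIfNoMatch,       // consumes the match result
    JumpIfNoMatch(usize), // consumes the match result; absolute target
}

/// Run `code` and return the final stack, or an error on a "panic".
fn run(code: &[Op]) -> Result<Vec<Value>, String> {
    let mut stack: Vec<Value> = Vec::new();
    let mut ip = 0;
    while ip < code.len() {
        let op = &code[ip];
        ip += 1;
        match op {
            Op::Push(v) => stack.push(v.clone()),
            Op::Match => {
                let pattern = match stack.pop() {
                    Some(Value::Pattern(p)) => p,
                    _ => return Err("expected a pattern on top".into()),
                };
                // The value stays where it is; we only peek at it.
                let value = stack.last().ok_or("stack underflow")?;
                let ok = match &pattern {
                    Pattern::Placeholder => true,
                    Pattern::Atom(expected) => **expected == *value,
                };
                stack.push(if ok { Value::Matched } else { Value::NoMatch });
            }
            Op::PanicIfNoMatch => {
                if stack.pop() == Some(Value::NoMatch) {
                    return Err("no match".into());
                }
            }
            Op::JumpIfNoMatch(target) => {
                if stack.pop() == Some(Value::NoMatch) {
                    ip = *target;
                }
            }
        }
    }
    Ok(stack)
}
```

A `match-1` style opcode would differ only in which slot it peeks at (`stack[stack.len() - 2]` instead of `stack.last()`).
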
#### A little note
For instructions that need more than 256 possibilities, we'll need to mush two `u8`s together into a `u16`. The one liner for this is:

```rust
let number = ((first as u16) << 8) | second as u16;
```

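For completeness, a sketch of both directions, since the compiler has to split the operand before the VM can rejoin it. The `split_u16` and `join_u16` names are made up; the byte order (high byte first) is the one the one-liner already implies.

```rust
/// Split a 16-bit operand into (high, low) bytes for emitting into a chunk.
fn split_u16(n: u16) -> (u8, u8) {
    // `as u8` truncates, keeping only the low eight bits.
    ((n >> 8) as u8, n as u8)
}

/// Recombine two bytes read from a chunk, high byte first.
fn join_u16(first: u8, second: u8) -> u16 {
    ((first as u16) << 8) | second as u16
}
```

The standard library's `u16::from_be_bytes([first, second])` and `n.to_be_bytes()` do the same thing, if the shifts feel too easy to get backwards.
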
#### Oy, stacks and expressions
One thing that's giving me grief is when to pop and when not to pop on the value stack.

So, like, we need to make sure that a line of code leaves the stack exactly where it was before it ran, with the exception of binding forms: `let`, `fn`, `box`, etc. Those leave one (or more!) items on the stack.

In the simplest case, we have a line of code that's just a constant:

```
false
```
This should emit the bytecode instructions (more or less):
```
push false
pop
```
The push comes from the `false` value.
The pop comes from the end of a (nonbinding) line.

The problem is that there's no way (at all, in Ludus) to distinguish between an expression that's just a constant and a line that is a complete line of code that's an expression.

So if we have the following:
```
let foo = false
```
We want:
```
push false
```
Or, rather, given that `foo` is a word pattern, what we actually want is:
```
push false # constant
push pattern/word # load pattern
pop
pop # compare
push false # for the binding
```

But it's worth it here to explore Ludus's semantics.
It's the case that there are actually only three binding forms (for now): `let`, `fn`, and `box`.
Figuring out `let` will help a great deal.
Match also binds things, but at the very least, match doesn't bind with expressions on the rhs, but a single value.

Think, too, about expressions: everything comes down to a single value (of course), even tuples (especially now that I'm separating function calls from tuple values (probably)).
So: anything that *isn't* a binding form should, before the `pop` from the end of a line, only leave a single value on the stack.
Which suggests that, as odd as it is, pushing a single `nil` onto the stack, just to pop it, might make sense.
Or, perhaps the thing to do is to peek: check whether the line in question is binding or not, and emit different bytecode accordingly.
That's probably the thing to do. Jesus, Scott.

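The "peek" idea, sketched with a deliberately tiny stand-in AST. `Line`, `Op`, and `compile_line` are all hypothetical, the only expressions are boolean constants, and the pattern push/compare steps for `let` are left out to keep the stack effect visible.

```rust
/// Hypothetical, heavily simplified AST: a line is either a `let`
/// binding a name to a constant, or a bare constant expression.
enum Line {
    Let(String, bool),
    Expr(bool),
}

#[derive(Debug, PartialEq)]
enum Op {
    Push(bool),
    Pop,
}

/// Emit bytecode for one line, deciding up front whether it binds.
fn compile_line(line: &Line, code: &mut Vec<Op>) {
    match line {
        // Binding form: the value stays on the stack as the new local.
        Line::Let(_name, value) => code.push(Op::Push(*value)),
        // Nonbinding form: evaluate, then discard the result.
        Line::Expr(value) => {
            code.push(Op::Push(*value));
            code.push(Op::Pop);
        }
    }
}
```

The expression compiler itself never has to know whether it's at the end of a line; only the line-level code decides about the trailing `pop`.
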
And **another** thing worth internalizing: every single instruction that's not an explicit push or pop should leave the stack length unchanged.
So store and load always need to swap in a `nil`.
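
One way to read the store/load rule, as a sketch only. This assumes a single staging register (the `register` field, `Vm`, and the exact meaning of `store` and `load` are guesses made here, not something these notes pin down): each instruction swaps a value with `Nil` instead of pushing or popping.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Nil,
    Bool(bool),
}

struct Vm {
    stack: Vec<Value>,
    register: Value, // hypothetical single staging register
}

impl Vm {
    /// Move the top of the stack into the register, leaving `Nil`
    /// behind in the slot so the stack length does not change.
    fn store(&mut self) {
        if let Some(top) = self.stack.last_mut() {
            self.register = std::mem::replace(top, Value::Nil);
        }
    }

    /// Move the register into the top slot, leaving `Nil` in the
    /// register. Again, the stack length does not change.
    fn load(&mut self) {
        if let Some(top) = self.stack.last_mut() {
            *top = std::mem::replace(&mut self.register, Value::Nil);
        }
    }
}
```

`std::mem::replace` is what makes the swap cheap: the value is moved out without cloning, and the slot is never left uninitialized.
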