Compare commits


21 Commits

Author SHA1 Message Date
Scott Richmond
6c803cdf5a take some loop notes 2024-12-27 00:54:31 -05:00
Scott Richmond
a7ee8f8e57 vm::run is now a loop, not vm::interpret as a tailcall 2024-12-27 00:47:22 -05:00
Scott Richmond
8908630a21 add match_depth to vm 2024-12-27 00:22:01 -05:00
Scott Richmond
6f582bff06 refactor if/else to match in guard compilation 2024-12-26 23:48:38 -05:00
Scott Richmond
f5965fdb44 compile guards in match forms 2024-12-26 23:46:06 -05:00
Scott Richmond
cfe0b83192 fix block compilation; compile & run repeat 2024-12-26 23:33:57 -05:00
Scott Richmond
4fa2ce5e78 separate compiler & chunk 2024-12-26 19:03:09 -05:00
Scott Richmond
40d4f48878 notes and comments 2024-12-26 18:41:54 -05:00
Scott Richmond
ef0ac40dbe working & thinking 2024-12-24 12:35:44 -05:00
Scott Richmond
a4f12c8f7d continue work on compiling functions 2024-12-23 10:55:28 -05:00
Scott Richmond
9f4e630544 get lifetime out of Chunk, thus out of Value 2024-12-22 19:51:02 -05:00
Scott Richmond
be23ee6c44 get simple match forms done 2024-12-22 19:33:59 -05:00
Scott Richmond
d943185db8 do lots of work 2024-12-22 19:07:42 -05:00
Scott Richmond
d4342b0623 get binding & pretty debugging working 2024-12-18 01:28:23 -05:00
Scott Richmond
48754f92a4 do work 2024-12-17 23:45:39 -05:00
Scott Richmond
096d8d00bc add untracked from opening bytecode branch 2024-12-15 23:50:12 -05:00
Scott Richmond
9c3205d4c1 DRY out validator, simplify code 2024-12-15 23:49:43 -05:00
Scott Richmond
6c78cffe56 finish list of valid types 2024-12-15 23:49:27 -05:00
Scott Richmond
35fc591c76 make some progress: atoms and ifs 2024-12-15 23:28:57 -05:00
Scott Richmond
eff2ed90d5 some simple bytecodes! 2024-12-15 17:54:40 -05:00
Scott Richmond
86aea78c21 start working on a bytecode interpreter! 2024-12-15 16:37:51 -05:00
12 changed files with 2452 additions and 998 deletions


@ -12,3 +12,8 @@ imbl = "3.0.0"
struct_scalpel = "0.1.1"
ran = "2.0.1"
rust-embed = "8.5.0"
boxing = "0.1.2"
ordered-float = "4.5.0"
index_vec = "0.1.4"
num-derive = "0.4.2"
num-traits = "0.2.19"

239
bytecode_thoughts.md Normal file

@ -0,0 +1,239 @@
# Working notes on bytecode stuff
### 2024-12-15
So far, I've done the easy stuff: constants, and ifs.
There's still some easy stuff left:
* [ ] lists
* [ ] dicts
* [ ] when
* [ ] panic
So I'll do those next.
But then we've got two doozies: patterns and bindings, and tuples.
#### Tuples make things hard
In fact, it's tuples that make things hard.
The idea is that, when possible, tuples should be stored on the stack.
That makes them a different creature than anything else.
But the goal is to be able, in a function call, to just push a tuple onto the stack, and then match against it.
Because a tuple _isn't_ just another `Value`, that makes things challenging.
BUT: matching against all other `Values` should be straightforward enough?
I think that the way to do this is to reify patterns.
Rather than try to emit bytecodes to embody patterns, the patterns are some kind of data that get compiled and pushed onto a stack like keywords and interned strings and whatnot.
And then you can push a pattern onto the stack right behind a value, and then have a `match` opcode that pops them off.
Things get a bit gnarly since patterns can be nested. I'll start with the basic cases and run from there.
But where things get *very* gnarly is when considering tuples on the stack.
How do you pop off a tuple?
Two thoughts:
1. Just put tuples on the heap. And treat function arguments/matching differently.
2. Have a "register" that stages values to be pattern matched.
##### Regarding the first option
I recall seeing somebody somewhere make a comment that trying to represent function arguments as tuples caused tons of pain.
I can see why that would be the case, from an implementation standpoint.
We should have _values_, and not do fancy bookkeeping if we don't have to.
_Conceptually_, it makes a great deal of sense to think of tuples as being deeply the same as function invocation.
But _practically_, they are different things, especially with Rust underneath.
This feels like it cuts along the grain, and so this is what I will try.
I suspect that I'll end up specializing a lot around function arguments and calling, but that feels more tractable than the bookkeeping around stack-based tuples.
### 2024-12-17
Next thoughts: take some things systematically rather than choosing an approach first.
#### Things that always match
* Placeholder.
- I _think_ this is just a no-op. A `let` expression leaves its rhs pushed on the stack.
* Word: put something on the stack, and bind a name.
- This should follow the logic of locals as articulated in _Crafting Interpreters_.
In both of these cases, there's no conditional logic, simply a bind.
#### Things that never bind
* Atomic values: put the rhs on the stack, then do an equality check, and panic if it fails. Leave the thing on the stack.
#### Analysis
In terms of bytecode, I think one thing to do, in the simple case, is to do the following:
* `push` a `pattern` onto the stack
* `match`--pops the pattern and the value off the stack, and then applies the pattern to the value. It leaves the value on the stack, and pushes a special value onto the stack representing a match, or not.
- We'll probably want `match-1`, `match-2`, `match-3`, etc., opcodes for matching a value that's that far back in the stack. E.g., `match-1` matches against not the top element, but the `top - 1` element.
- This is _specifically_ for matching function arguments and `loop` forms.
* There are a few different things we might do from here:
- `panic_if_no_match`: panic if the last thing is a `no_match`, or just keep going if not.
- `jump_if_no_match`: in a `match` form or a function, we'll want to move to the next clause if there's no match, so jump to the next clause's `pattern` `push` code.
* Compound patterns are going to be more complex.
- I think we'll need opcodes that work on our data structures: for example, a `match_compound` opcode that starts digging into the pattern.
* Compound patterns are specifically _data structures_. So simple structures should be stack-allocated, and complex structures should be pointers to something on the heap. Maybe?
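A minimal sketch of the simple case described above, to make the stack effects concrete. Everything here is hypothetical: `Slot`, `Pattern`, `op_match`, and `jump_if_no_match` are illustration-only names, numbers stand in for all values, and having `jump_if_no_match` pop the match result is an assumption the notes don't settle.

```rust
// Hypothetical types for illustration; not the actual Ludus `Value` or `Op`.
#[derive(Clone, Debug, PartialEq)]
enum Pattern {
    Placeholder, // always matches
    Number(i64), // matches only an equal number
}

#[derive(Clone, Debug, PartialEq)]
enum Slot {
    Number(i64),
    Pattern(Pattern),
    Matched(bool), // the "special value" recording match / no match
}

/// `match`: pop the pattern and the value, apply the pattern,
/// then push the value back followed by the match result.
fn op_match(stack: &mut Vec<Slot>) {
    let Some(Slot::Pattern(pattern)) = stack.pop() else {
        panic!("expected a pattern on top of the stack");
    };
    let value = stack.pop().expect("expected a value under the pattern");
    let matched = match (&pattern, &value) {
        (Pattern::Placeholder, _) => true,
        (Pattern::Number(p), Slot::Number(v)) => p == v,
        _ => false,
    };
    stack.push(value);
    stack.push(Slot::Matched(matched));
}

/// `jump_if_no_match`: pop the match result (an assumption) and
/// return the index of the next instruction to run.
fn jump_if_no_match(stack: &mut Vec<Slot>, ip: usize, offset: usize) -> usize {
    match stack.pop() {
        Some(Slot::Matched(true)) => ip + 1,
        Some(Slot::Matched(false)) => ip + 1 + offset,
        _ => panic!("expected a match result on top of the stack"),
    }
}
```

Either way, the matched value stays on the stack, so the next clause can try its own pattern against it.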
#### A little note
For instructions that need more than 256 possibilities, we'll need to mush two `u8`s together into a `u16`. The one liner for this is:
```rust
let number = ((first as u16) << 8) | second as u16;
```
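For reference, a small self-contained version of the same idea, with the inverse operation for when the compiler emits a two-byte operand. The function names `join_u16` and `split_u16` are made up for this sketch; the high-byte-first order follows the one-liner above.

```rust
/// Join two bytes into a u16, high byte first (big-endian order).
fn join_u16(high: u8, low: u8) -> u16 {
    ((high as u16) << 8) | (low as u16)
}

/// Split a u16 back into its (high, low) bytes, e.g. to emit it as an operand.
fn split_u16(n: u16) -> (u8, u8) {
    ((n >> 8) as u8, (n & 0xff) as u8)
}
```

The standard library's `u16::from_be_bytes([high, low])` and `u16::to_be_bytes` do the same thing, and might read more clearly in the VM.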
#### Oy, stacks and expressions
One thing that's giving me grief is when to pop, and when not to, on the value stack.
So, like, we need to make sure that a line of code leaves the stack exactly where it was before it ran, with the exception of binding forms: `let`, `fn`, `box`, etc. Those leave one (or more!) items on the stack.
In the simplest case, we have a line of code that's just a constant:
```
false
```
This should emit the bytecode instructions (more or less):
```
push false
pop
```
The push comes from the `false` value.
The pop comes from the end of a (nonbinding) line.
The problem is that there's no way (at all, in Ludus) to distinguish between an expression that's just a constant and a complete line of code that happens to be an expression.
So if we have the following:
```
let foo = false
```
We want:
```
push false
```
Or, rather, given that `foo` is a word pattern, what we actually want is:
```
push false # constant
push pattern/word # load pattern
pop
pop # compare
push false # for the binding
```
But it's worth it here to explore Ludus's semantics.
It's the case that there are actually only three binding forms (for now): `let`, `fn`, and `box`.
Figuring out `let` will help a great deal.
Match also binds things, but at the very least, match doesn't bind with expressions on the rhs, but a single value.
Think, too, about expressions: everything comes down to a single value (of course), even tuples (especially now that I'm separating function calls from tuple values (probably)).
So: anything that *isn't* a binding form should, before the `pop` from the end of a line, only leave a single value on the stack.
Which suggests that, as odd as it is, pushing a single `nil` onto the stack, just to pop it, might make sense.
Or, perhaps the thing to do is to peek: check whether the line in question is a binding form, and emit different bytecode accordingly.
That's probably the thing to do. Jesus, Scott.
And **another** thing worth internalizing: every single instruction that's not an explicit push or pop should leave the stack length unchanged.
So store and load always need to swap in a `nil`.
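A toy sketch of the "peek at whether the line binds" idea, under heavy assumptions: `Instr`, `Line`, and `compile_lines` are hypothetical names, and each line is reduced to a single pushed expression plus a flag saying whether it is a binding form.

```rust
// Hypothetical instruction set: just enough to show the stack discipline.
#[derive(Debug, PartialEq)]
enum Instr {
    Push(&'static str),
    Pop,
}

/// A toy "line": the expression it evaluates, and whether it is a binding form.
struct Line {
    expr: &'static str,
    binds: bool,
}

/// Non-binding lines get a trailing `Pop`, so the stack ends up where it started.
/// Binding lines leave their value on the stack for later name resolution.
fn compile_lines(lines: &[Line]) -> Vec<Instr> {
    let mut out = Vec::new();
    for line in lines {
        out.push(Instr::Push(line.expr));
        if !line.binds {
            out.push(Instr::Pop);
        }
    }
    out
}
```

With a bare `false` followed by `let foo = false`, this gives `push false`, `pop`, `push false`: only the binding's value survives on the stack.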
### 2024-12-23
Compiling functions.
So I'm working through the functions chapter of _CI_, and there are a few things that I'm trying to wrap my head around.
First, I'm thinking that since we're not using raw pointers, we'll need some functional indirection to get our current byte.
So one of the hard things here is that, unlike with Lox, Ludus doesn't have fixed-arity functions. That means that the bindings for function calls can't be as dead simple as in Lox. More to the point, because we don't know everything statically, we'll need to do some dynamic magic.
The Bob Nystrom program uses three useful auxiliary constructs to make functions straightforward:
* `CallFrame`s, which know which function is being called, have their own instruction pointer, and hold an offset for the first stack slot that can be used by the function.
```c
typedef struct {
ObjFunction* function;
uint8_t* ip;
Value* slots;
} CallFrame;
```
Or the Rust equivalent:
```rust
struct CallFrame {
function: LFn,
ip: usize,
stack_root: usize,
}
```
* `Closure`s, which are actual objects that live alongside functions. They have a reference to a function and to an array of "upvalues"...
* `Upvalue`s, which are ways of pointing to values _below_ the `stack_root` of the call frame.
##### Digression: Prelude
I decided to skip the Prelude resolution in the compiler and only work with locals. But actually, closures, arguments, and the prelude are kind of the same problem: referring to values that aren't currently available on the stack.
We do, however, know at compile time the following:
* If a binding's target is on the stack, in a closure, or in the prelude.
* This does, however, require that the function arguments work in a different way.
The way to do this, I reckon, is this:
* Limit arguments (to, say, no more than 7).
* A `CallFrame` includes an arity field.
* It also includes an array of length 7.
* Each `match` operation in function arguments clones from the call frame, and the first instruction for any given body (i.e. once we've done the match) is to clear the arguments registers in the `CallFrame`, thus decrementing all the refcounts of all the heap-allocated objects.
* And the current strategy of scoping and popping in the current implementation of `match` will work just fine!
Meanwhile, we don't actually need upvalues, because bindings cannot change in Ludus. So instead of upvalues and their indirection, we can just emit a bunch of instructions to have a `values` field on a closure. The compiler, meanwhile, will know how to extract and emit instructions both to emit those values *and* to offer correct offsets.
The only part I haven't figured out quite yet is how to encode access to what's stored in a closure.
Also, I'm not certain we need the indirection of a closure object in Ludus. The function object itself can do the work, no?
And the compiler knows which function it's closing over, and we can emit a bunch of instructions to close stuff over easily, after compiling the function and putting it in the constants table. The way to do this is to yank the value to the top of the stack using normal name resolution procedures, and then use a two-byte operand, `Op::Close` + index of the function in the constants table.
##### End of digression.
And, because we know exactly what is bound in a given closure, we can actually emit instructions to close over a given value easily.
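A rough sketch of what closing over a value could look like at runtime, assuming a `close` instruction that pops the value yanked to the top of the stack and copies it into the function's closed-over values. `Function`, `op_close`, and the `i64` stand-in for values are all hypothetical; because bindings never change, a plain copy is enough and no upvalue indirection is needed.

```rust
// Hypothetical, simplified function object; `i64` stands in for a real value type.
#[derive(Debug, PartialEq)]
struct Function {
    name: &'static str,
    closed: Vec<i64>,
}

/// `close`: pop the value on top of the stack and store it
/// in the function at `fn_index` in the constants table.
fn op_close(stack: &mut Vec<i64>, constants: &mut [Function], fn_index: usize) {
    let value = stack.pop().expect("expected a value to close over");
    constants[fn_index].closed.push(value);
}
```

The compiler would emit one "push binding" plus one `close` per captured name, in a fixed order, so the position in `closed` doubles as the offset used inside the function body.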
#### A small optimization
The lifetimes make things complicated; but I'm not sure that I would want to actually manage them manually, given how much they make my head hurt with Rust. I do get the sense that we will, at some point, need some lifetimes. A `Chunk` right now is chunky, with lots of owned `vec`s.
Uncle Bob separates `Chunk`s and `Compiler`s, which, yes! But then we have a problem: all of the information to climb back to source code is in the `Compiler` and not in the `Chunk`. How to manage that encoding?
(Also the keyword and string intern tables should be global, and not only in a single compiler, since we're about to get nested compilers...)
### 2024-12-24
Other interesting optimizations abound:
* `add`, `sub`, `inc`, `dec`, `type`, and other extremely frequently used, simple functions can be compiled directly to built-in opcodes. We still need functions for them, with the same arities, for higher order function use.
- The special-case logic is in the `Synthetic` compiler branch, rather than anywhere else.
- It's probably best to disallow re-binding these names anywhere _except_ Prelude, where we'll want them shadowed.
- We can enforce this in `Validator` rather than `Compiler`.
* `or` and `and` are likewise built-in, but because they don't evaluate their arguments eagerly, that's another, different special case that's a series of eval, `jump_if_false`, eval, `jump_if_false`, instructions.
* More to the point, the difference between `or` and `and` here and the built-ins is that `or` and `and` are variadic, where I was originally thinking about `and` and co. as fixed-arity, with variadic behaviours defined by a shadowing/backing Ludus function. That isn't necessary, I don't think.
* Meanwhile, `and` and `or` will also, of necessity, have backing shadowing functions.
#### More on CallFrames and arg passing
* We don't actually need the arguments register! I was complicating things. The stack between the `stack_root` and the top will be _exactly_ the same as an arguments register would have been in my imagination. So we can determine the number of arguments passed in with `stack.len() - stack_root`, and we can access argument positions with `stack_root + n`, since the first argument is at `stack_root`.
- This has the added benefit of not having to do any dances to keep the refcount of any heap-allocated objects as low as possible. No extra `Clone`s here.
* In addition, we need two `check_arity` ops: one for fixed-arity clauses, and one for clauses with splatterns. Easily enough done. Remember: opcodes are for special cases!
#### Tail calls
* The way to implement tail calls is actually now really straightforward! The idea is to simply have a `TailCall` rather than a `Call` opcode. In place of creating a new stack frame and pushing it to the call stack on top of the old call frame, you pop the old call frame, then push the new one to the call stack.
* That does mean the `Compiler` will need to keep track of tail calls. This should be pretty straightforward, actually, and the logic is already there in `Validator`.
* The thing here is that the new stack frame simply requires the same return location as the old one it's replacing.
* That reminds me that there's an issue in terms of keeping track of not just the IP, but the chunk. In Lox, the IP is a pointer to a `u8`, which works great in C. But in Rust, we're not using a raw pointer like that, but an index into a `Vec<u8>`. Which means the return location needs both a chunk and an index, not just a `u8` pointer:
```rust
struct StackFrame<'a> {
function: LFn,
stack_root: usize,
// `return` is a reserved word in Rust, so the field needs another name
return_to: (&'a Chunk, usize),
}
```
(I hate that there's a lifetime here.)
This gives us a way to access everything we need: where to return to, the root of the stack, the chunk (function->chunk), the closures (function->closures).
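To make the `Call` versus `TailCall` difference concrete, here is a minimal sketch of just the call-stack bookkeeping. It ignores the value stack entirely, reduces the return location to a bare index, and uses hypothetical names (`Frame`, `call`, `tail_call`).

```rust
// Hypothetical frame: a function name and where to resume in the caller.
#[derive(Debug, PartialEq)]
struct Frame {
    function: &'static str,
    return_to: usize,
}

/// A normal call pushes a new frame on top of the caller's.
fn call(frames: &mut Vec<Frame>, function: &'static str, return_to: usize) {
    frames.push(Frame { function, return_to });
}

/// A tail call replaces the current frame, inheriting its return location,
/// so the call stack does not grow.
fn tail_call(frames: &mut Vec<Frame>, function: &'static str) {
    let old = frames.pop().expect("tail call with an empty call stack");
    frames.push(Frame {
        function,
        return_to: old.return_to,
    });
}
```

However many tail calls run in sequence, the depth stays the same, and the final return lands wherever the original caller expected.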
### 2024-12-26
One particular concern here, which needs some work: recursion is challenging.
In particular, the issue is that if, as I have been planning, a function closes over all its values at the moment it is compiled, the only value type that requires updating is a function. A function can be declared but not yet defined, and then when another function that uses that function is defined, the closed-over value will point to the declaration, not the definition.
One way to handle this, I think, is using `std::cell::OnceCell`. Unlike a `RefCell`, a `OnceCell` does no runtime borrow tracking. Instead, what happens is you effectively put a `None` in the cell. Then, once you have the value you want to put in there, you call `set` on the `OnceCell`, and it does what it needs to.
This allows for the closures to be closed over right after compilation.
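A minimal sketch of the declare-then-define flow with `OnceCell`. The `LFn` here is a hypothetical stand-in with only a name, and `declare` and `define` are made-up helper names; the `OnceCell` calls (`new`, `set`, `get`) are the standard library's.

```rust
use std::cell::OnceCell;
use std::rc::Rc;

// Hypothetical stand-in for the real function object.
#[derive(Debug)]
struct LFn {
    name: &'static str,
}

/// Declaring a function creates an empty, shareable cell.
fn declare() -> Rc<OnceCell<LFn>> {
    Rc::new(OnceCell::new())
}

/// Defining it fills the cell exactly once; a second definition is rejected
/// and hands the unused value back in the `Err`.
fn define(cell: &Rc<OnceCell<LFn>>, lfn: LFn) -> Result<(), LFn> {
    cell.set(lfn)
}
```

Any closure that cloned the `Rc` while the function was only declared sees the definition as soon as `define` runs, since both point at the same cell.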

688
src/compiler.rs Normal file

@ -0,0 +1,688 @@
use crate::parser::Ast;
use crate::spans::Spanned;
use crate::value::*;
use chumsky::prelude::SimpleSpan;
use num_derive::{FromPrimitive, ToPrimitive};
use num_traits::FromPrimitive;
use std::cell::OnceCell;
use std::rc::Rc;
#[derive(Copy, Clone, Debug, PartialEq, Eq, FromPrimitive, ToPrimitive)]
pub enum Op {
Nil,
True,
False,
Constant,
Jump,
JumpIfFalse,
Pop,
PushBinding,
Store,
Load,
ResetMatch,
MatchNil,
MatchTrue,
MatchFalse,
MatchWord,
PanicIfNoMatch,
MatchConstant,
MatchTuple,
PushTuple,
PushList,
PushDict,
PushBox,
GetKey,
PanicNoWhen,
JumpIfNoMatch,
PanicNoMatch,
TypeOf,
JumpBack,
JumpIfZero,
Duplicate,
Decrement,
Truncate,
MatchDepth,
}
impl std::fmt::Display for Op {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
use Op::*;
let rep = match self {
Nil => "nil",
True => "true",
False => "false",
Constant => "constant",
Jump => "jump",
JumpIfFalse => "jump_if_false",
Pop => "pop",
PushBinding => "push_binding",
Store => "store",
Load => "load",
MatchNil => "match_nil",
MatchTrue => "match_true",
MatchFalse => "match_false",
MatchWord => "match_word",
ResetMatch => "reset_match",
PanicIfNoMatch => "panic_if_no_match",
MatchConstant => "match_constant",
MatchTuple => "match_tuple",
PushTuple => "push_tuple",
PushList => "push_list",
PushDict => "push_dict",
PushBox => "push_box",
GetKey => "get_key",
PanicNoWhen => "panic_no_when",
JumpIfNoMatch => "jump_if_no_match",
PanicNoMatch => "panic_no_match",
TypeOf => "type_of",
JumpBack => "jump_back",
JumpIfZero => "jump_if_zero",
Decrement => "decrement",
Truncate => "truncate",
Duplicate => "duplicate",
MatchDepth => "match_depth",
};
write!(f, "{rep}")
}
}
#[derive(Clone, Debug, PartialEq)]
pub struct Binding {
name: &'static str,
depth: isize,
}
#[derive(Clone, Debug, PartialEq)]
pub struct Chunk {
pub constants: Vec<Value>,
pub bytecode: Vec<u8>,
pub strings: Vec<&'static str>,
pub keywords: Vec<&'static str>,
}
impl Chunk {
pub fn dissasemble_instr(&self, i: usize) {
let op = Op::from_u8(self.bytecode[i]).unwrap();
use Op::*;
match op {
Pop | Store | Load | Nil | True | False | MatchNil | MatchTrue | MatchFalse
| PanicIfNoMatch | MatchWord | ResetMatch | GetKey | PanicNoWhen | PanicNoMatch
| TypeOf | Duplicate | Decrement | Truncate => {
println!("{i:04}: {op}")
}
Constant | MatchConstant => {
let next = self.bytecode[i + 1];
let value = &self.constants[next as usize].show(self);
println!("{i:04}: {:16} {next:04}: {value}", op.to_string());
}
PushBinding | MatchTuple | PushTuple | PushDict | PushList | PushBox | Jump
| JumpIfFalse | JumpIfNoMatch | JumpBack | JumpIfZero | MatchDepth => {
let next = self.bytecode[i + 1];
println!("{i:04}: {:16} {next:04}", op.to_string());
}
}
}
pub fn kw_from(&self, kw: &str) -> Option<Value> {
self.kw_index_from(kw).map(Value::Keyword)
}
pub fn kw_index_from(&self, kw: &str) -> Option<usize> {
self.keywords.iter().position(|s| *s == kw)
}
}
pub struct Compiler {
pub chunk: Chunk,
pub bindings: Vec<Binding>,
scope_depth: isize,
num_bindings: usize,
pub spans: Vec<SimpleSpan>,
pub nodes: Vec<&'static Ast>,
pub ast: &'static Ast,
pub span: SimpleSpan,
pub src: &'static str,
pub name: &'static str,
loop_idxes: Vec<usize>,
}
fn is_binding(expr: &Spanned<Ast>) -> bool {
let (ast, _) = expr;
use Ast::*;
match ast {
Let(..) | LBox(..) => true,
Fn(name, ..) => !name.is_empty(),
_ => false,
}
}
impl Compiler {
pub fn new(ast: &'static Spanned<Ast>, name: &'static str, src: &'static str) -> Compiler {
let chunk = Chunk {
constants: vec![],
bytecode: vec![],
strings: vec![],
keywords: vec![
"nil", "bool", "number", "keyword", "string", "tuple", "list", "dict", "box", "fn",
],
};
Compiler {
chunk,
bindings: vec![],
scope_depth: -1,
num_bindings: 0,
spans: vec![],
nodes: vec![],
ast: &ast.0,
span: ast.1,
loop_idxes: vec![],
src,
name,
}
}
pub fn kw_from(&self, kw: &str) -> Option<Value> {
self.kw_index_from(kw).map(Value::Keyword)
}
pub fn kw_index_from(&self, kw: &str) -> Option<usize> {
self.chunk.keywords.iter().position(|s| *s == kw)
}
pub fn visit(&mut self, node: &'static Spanned<Ast>) {
let root_node = self.ast;
let root_span = self.span;
let (ast, span) = node;
self.ast = ast;
self.span = *span;
self.compile();
self.ast = root_node;
self.span = root_span;
}
fn emit_constant(&mut self, val: Value) {
let constant_index = self.chunk.constants.len();
if constant_index > u8::MAX as usize {
panic!(
"internal Ludus compiler error: too many constants in chunk:{}:: {}",
self.span, self.ast
)
}
self.chunk.constants.push(val);
self.chunk.bytecode.push(Op::Constant as u8);
self.spans.push(self.span);
self.chunk.bytecode.push(constant_index as u8);
self.spans.push(self.span);
}
fn match_constant(&mut self, val: Value) {
let constant_index = match self.chunk.constants.iter().position(|v| *v == val) {
Some(idx) => idx,
None => self.chunk.constants.len(),
};
if constant_index > u8::MAX as usize {
panic!(
"internal Ludus compiler error: too many constants in chunk:{}:: {}",
self.span, self.ast
)
}
if constant_index == self.chunk.constants.len() {
self.chunk.constants.push(val);
}
self.chunk.bytecode.push(Op::MatchConstant as u8);
self.spans.push(self.span);
self.chunk.bytecode.push(constant_index as u8);
self.spans.push(self.span);
self.bind("");
}
fn emit_op(&mut self, op: Op) {
self.chunk.bytecode.push(op as u8);
self.spans.push(self.span);
}
fn emit_byte(&mut self, byte: usize) {
self.chunk.bytecode.push(byte as u8);
self.spans.push(self.span);
}
fn len(&self) -> usize {
self.chunk.bytecode.len()
}
fn bind(&mut self, name: &'static str) {
self.bindings.push(Binding {
name,
depth: self.scope_depth,
});
}
fn enter_loop(&mut self) {
self.loop_idxes.push(self.len());
}
fn leave_loop(&mut self) {
self.loop_idxes.pop();
}
fn loop_idx(&mut self) -> usize {
*self.loop_idxes.last().unwrap()
}
pub fn compile(&mut self) {
use Ast::*;
match self.ast {
Error => unreachable!(),
Nil => self.emit_op(Op::Nil),
Number(n) => self.emit_constant(Value::Number(*n)),
Boolean(b) => self.emit_op(if *b { Op::True } else { Op::False }),
String(s) => {
let existing_str = self.chunk.strings.iter().position(|e| e == s);
let str_index = match existing_str {
Some(idx) => idx,
None => self.chunk.strings.len(),
};
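// only intern the string if it isn't already in the table
if str_index == self.chunk.strings.len() {
self.chunk.strings.push(s);
}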
self.emit_constant(Value::Interned(str_index));
}
Keyword(s) => {
let existing_kw = self.chunk.keywords.iter().position(|kw| kw == s);
let kw_index = match existing_kw {
Some(index) => index,
None => self.chunk.keywords.len(),
};
if kw_index == self.chunk.keywords.len() {
self.chunk.keywords.push(s);
}
self.emit_constant(Value::Keyword(kw_index));
}
Block(lines) => {
self.scope_depth += 1;
for expr in lines.iter().take(lines.len() - 1) {
if is_binding(expr) {
self.visit(expr);
} else {
self.visit(expr);
self.emit_op(Op::Pop);
}
}
let last_expr = lines.last().unwrap();
if is_binding(last_expr) {
self.visit(last_expr);
self.emit_op(Op::Duplicate);
} else {
self.visit(last_expr);
}
self.emit_op(Op::Store);
self.scope_depth -= 1;
while let Some(binding) = self.bindings.last() {
if binding.depth > self.scope_depth {
self.emit_op(Op::Pop);
self.bindings.pop();
} else {
break;
}
}
self.emit_op(Op::Pop);
self.emit_op(Op::Load);
}
If(cond, then, r#else) => {
self.visit(cond);
let jif_idx = self.len();
self.emit_op(Op::JumpIfFalse);
self.emit_byte(0xff);
self.visit(then);
let jump_idx = self.len();
self.emit_op(Op::Jump);
self.emit_byte(0xff);
self.visit(r#else);
let end_idx = self.len();
let jif_offset = jump_idx - jif_idx;
let jump_offset = end_idx - jump_idx - 2;
self.chunk.bytecode[jif_idx + 1] = jif_offset as u8;
self.chunk.bytecode[jump_idx + 1] = jump_offset as u8;
}
Let(patt, expr) => {
self.emit_op(Op::ResetMatch);
self.visit(expr);
self.visit(patt);
self.emit_op(Op::PanicIfNoMatch);
}
WordPattern(name) => {
self.emit_op(Op::MatchWord);
self.bind(name);
}
Word(name) => {
self.emit_op(Op::PushBinding);
let biter = self.bindings.iter().enumerate().rev();
for (i, binding) in biter {
if binding.name == *name {
self.emit_byte(i);
break;
}
}
}
PlaceholderPattern => {
self.emit_op(Op::MatchWord);
self.bind("");
}
NilPattern => {
self.emit_op(Op::MatchNil);
self.bind("");
}
BooleanPattern(b) => {
if *b {
self.emit_op(Op::MatchTrue);
self.bind("");
} else {
self.emit_op(Op::MatchFalse);
self.bind("");
}
}
NumberPattern(n) => {
self.match_constant(Value::Number(*n));
}
KeywordPattern(s) => {
let existing_kw = self.chunk.keywords.iter().position(|kw| kw == s);
let kw_index = match existing_kw {
Some(index) => index,
None => self.chunk.keywords.len(),
};
if kw_index == self.chunk.keywords.len() {
self.chunk.keywords.push(s);
}
self.match_constant(Value::Keyword(kw_index));
}
StringPattern(s) => {
let existing_str = self.chunk.strings.iter().position(|e| e == s);
let str_index = match existing_str {
Some(idx) => idx,
None => self.chunk.strings.len(),
};
if str_index == self.chunk.strings.len() {
self.chunk.strings.push(s)
}
self.match_constant(Value::Interned(str_index));
}
Tuple(members) => {
for member in members {
self.visit(member);
}
self.emit_op(Op::PushTuple);
self.emit_byte(members.len());
}
List(members) => {
for member in members {
self.visit(member);
}
self.emit_op(Op::PushList);
self.emit_byte(members.len());
}
LBox(name, expr) => {
self.visit(expr);
self.emit_op(Op::PushBox);
self.bind(name);
}
Dict(pairs) => {
for pair in pairs {
self.visit(pair);
}
self.emit_op(Op::PushDict);
self.emit_byte(pairs.len());
}
Pair(key, value) => {
let existing_kw = self.chunk.keywords.iter().position(|kw| kw == key);
let kw_index = match existing_kw {
Some(index) => index,
None => self.chunk.keywords.len(),
};
if kw_index == self.chunk.keywords.len() {
self.chunk.keywords.push(key);
}
self.emit_constant(Value::Keyword(kw_index));
self.visit(value);
}
Synthetic(first, second, rest) => {
match (&first.0, &second.0) {
(Word(_), Keyword(_)) => {
self.visit(first);
self.visit(second);
self.emit_op(Op::GetKey);
}
(Keyword(_), Arguments(args)) => {
self.visit(&args[0]);
self.visit(first);
self.emit_op(Op::GetKey);
}
(Word(_), Arguments(_)) => {
todo!()
}
_ => unreachable!(),
}
// TODO: implement longer synthetic expressions
for term in rest {
todo!()
}
}
When(clauses) => {
let mut jump_idxes = vec![];
let mut clauses = clauses.iter();
while let Some((WhenClause(cond, body), _)) = clauses.next() {
self.visit(cond.as_ref());
self.emit_op(Op::JumpIfFalse);
let jif_jump_idx = self.len();
self.emit_byte(0xff);
self.visit(body);
self.emit_op(Op::Jump);
jump_idxes.push(self.len());
self.emit_byte(0xff);
self.chunk.bytecode[jif_jump_idx] = self.len() as u8 - jif_jump_idx as u8 - 1;
}
self.emit_op(Op::PanicNoWhen);
for idx in jump_idxes {
self.chunk.bytecode[idx] = self.len() as u8 - idx as u8 + 1;
}
}
WhenClause(..) => unreachable!(),
Match(scrutinee, clauses) => {
self.visit(scrutinee.as_ref());
let mut jump_idxes = vec![];
let mut clauses = clauses.iter();
while let Some((MatchClause(pattern, guard, body), _)) = clauses.next() {
self.scope_depth += 1;
self.visit(pattern);
self.emit_op(Op::JumpIfNoMatch);
let jnm_jump_idx = self.len();
self.emit_byte(0xff);
// conditional compilation of guards
// hard to DRY out
match guard.as_ref() {
Some(expr) => {
self.visit(expr);
self.emit_op(Op::JumpIfFalse);
let jif_idx = self.len();
self.emit_byte(0xff);
self.visit(body);
self.emit_op(Op::Store);
self.scope_depth -= 1;
while let Some(binding) = self.bindings.last() {
if binding.depth > self.scope_depth {
self.emit_op(Op::Pop);
self.bindings.pop();
} else {
break;
}
}
self.emit_op(Op::Jump);
jump_idxes.push(self.len());
self.emit_byte(0xff);
self.chunk.bytecode[jnm_jump_idx] =
self.len() as u8 - jnm_jump_idx as u8 - 1;
self.chunk.bytecode[jif_idx] = self.len() as u8 - jif_idx as u8 - 1;
}
None => {
self.visit(body);
self.emit_op(Op::Store);
self.scope_depth -= 1;
while let Some(binding) = self.bindings.last() {
if binding.depth > self.scope_depth {
self.emit_op(Op::Pop);
self.bindings.pop();
} else {
break;
}
}
self.emit_op(Op::Jump);
jump_idxes.push(self.len());
self.emit_byte(0xff);
self.chunk.bytecode[jnm_jump_idx] =
self.len() as u8 - jnm_jump_idx as u8 - 1;
}
}
}
self.emit_op(Op::PanicNoMatch);
self.emit_op(Op::Load);
for idx in jump_idxes {
self.chunk.bytecode[idx] = self.len() as u8 - idx as u8;
}
}
MatchClause(..) => unreachable!(),
Fn(name, body, doc) => {
// first, declare the function
// TODO: or, check if the function has already been declared!
let init_val = Value::Fn(Rc::new(OnceCell::new()));
self.emit_constant(init_val);
self.bind(name);
// compile the function
let mut compiler = Compiler::new(body, self.name, self.src);
compiler.compile();
if crate::DEBUG_COMPILE {
println!("==function: {name}==");
compiler.disassemble();
}
let lfn = crate::value::LFn {
name,
doc: *doc,
chunk: compiler.chunk,
closed: vec![],
};
// TODO: close over everything accessed in the function
// TODO: pull the function off the stack, and set the OnceCell.
}
FnDeclaration(name) => {
let lfn = Value::Fn(Rc::new(OnceCell::new()));
self.emit_constant(lfn);
self.bind(name);
}
FnBody(clauses) => {
self.emit_op(Op::ResetMatch);
}
Repeat(times, body) => {
self.visit(times);
self.emit_op(Op::Truncate);
// skip the decrement the first time
self.emit_op(Op::Jump);
self.emit_byte(1);
// begin repeat
self.emit_op(Op::Decrement);
let repeat_begin = self.len();
self.emit_op(Op::Duplicate);
self.emit_op(Op::JumpIfZero);
self.emit_byte(0xff);
// compile the body
self.visit(body);
// pop whatever value the body returns
self.emit_op(Op::Pop);
self.emit_op(Op::JumpBack);
// set jump points
let repeat_end = self.len();
self.emit_byte(repeat_end - repeat_begin);
self.chunk.bytecode[repeat_begin + 2] = (repeat_end - repeat_begin - 2) as u8;
// pop the counter
self.emit_op(Op::Pop);
// and emit nil
self.emit_constant(Value::Nil);
}
Loop(value, clauses) => {
//algo:
//first, put the values on the stack
let (Ast::Tuple(members), _) = value.as_ref() else {
unreachable!()
};
for member in members {
self.visit(member);
}
let arity = members.len();
//then, save the beginning of the loop
self.enter_loop();
self.emit_op(Op::ResetMatch);
//next, compile each clause:
let mut clauses = clauses.iter();
while let Some((Ast::MatchClause(pattern, _, body), _)) = clauses.next() {
self.scope_depth += 1;
let (Ast::TuplePattern(members), _) = pattern.as_ref() else {
unreachable!()
};
// TODO: finish compiling match clauses
// I just added "match depth" to the VM
// this will set match depth to arity
// and decrement it each pattern
// the compiler will need to know about match depth for binding to work
// we should match against ALL args first
// rather than jump_no_matching after every arg check
// compile the body
// and then jump_no_match to the next clause
// at the end, panic_no_match
}
//match against the values on the stack
//we know the (fixed) arity, so we should know where to look
//compile the clauses exactly as in `match`
}
Recur(args) => {}
Interpolated(..)
| Arguments(..)
| Placeholder
| Panic(..)
| Do(..)
| Splat(..)
| InterpolatedPattern(..)
| AsPattern(..)
| Splattern(..)
| TuplePattern(..)
| ListPattern(..)
| PairPattern(..)
| DictPattern(..) => todo!(),
}
}
pub fn disassemble(&self) {
println!("=== chunk: {} ===", self.name);
println!("IDX | CODE | INFO");
let mut codes = self.chunk.bytecode.iter().enumerate();
while let Some((i, byte)) = codes.next() {
let op = Op::from_u8(*byte).unwrap();
use Op::*;
match op {
Pop | Store | Load | Nil | True | False | MatchNil | MatchTrue | MatchFalse
| MatchWord | ResetMatch | PanicIfNoMatch | GetKey | PanicNoWhen | PanicNoMatch
| TypeOf | Duplicate | Truncate | Decrement => {
println!("{i:04}: {op}")
}
Constant | MatchConstant => {
let (_, next) = codes.next().unwrap();
let value = &self.chunk.constants[*next as usize].show(&self.chunk);
println!("{i:04}: {:16} {next:04}: {value}", op.to_string());
}
PushBinding | MatchTuple | PushTuple | PushDict | PushList | PushBox | Jump
| JumpIfFalse | JumpIfNoMatch | JumpBack | JumpIfZero | MatchDepth => {
let (_, next) = codes.next().unwrap();
println!("{i:04}: {:16} {next:04}", op.to_string());
}
}
}
}
}
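Note on the `repeat` compilation above: it emits a placeholder operand (`0xff`) for the forward jump and overwrites it once the end of the loop is known. Below is a minimal, standalone sketch of that backpatching idea. The opcode constants, the `emit_loop` helper, and the offset convention (offsets measured from the byte after the operand) are assumptions for illustration, not the actual `Op` enum or VM semantics of this branch.

```rust
// Hypothetical opcodes for this sketch only; not the real `Op` enum.
const DUPLICATE: u8 = 0;
const JUMP_IF_ZERO: u8 = 1;
const POP: u8 = 2;
const JUMP_BACK: u8 = 3;

/// Emits a loop skeleton around `body`.
/// Returns `None` if a jump distance does not fit in one byte.
fn emit_loop(body: &[u8]) -> Option<Vec<u8>> {
    let mut code: Vec<u8> = Vec::new();
    let loop_begin = code.len();
    code.push(DUPLICATE);
    code.push(JUMP_IF_ZERO);
    // Placeholder operand; patched once the loop end is known.
    let patch_at = code.len();
    code.push(0xff);
    code.extend_from_slice(body);
    code.push(POP);
    code.push(JUMP_BACK);
    // Backward distance, measured from the byte after this operand.
    let back = code.len() + 1 - loop_begin;
    code.push(u8::try_from(back).ok()?);
    // Forward distance, measured from the byte after the placeholder.
    let loop_end = code.len();
    code[patch_at] = u8::try_from(loop_end - (patch_at + 1)).ok()?;
    Some(code)
}
```

Using `u8::try_from` instead of `as u8` makes an over-long jump an explicit failure rather than a silently truncated offset.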

@ -1,148 +1,28 @@
// an implementation of Ludus
// currently left undone (and not adding for a while yet):
// * sets
// * interpolated strings & string patterns
// * pkgs, namespaces, imports, `use` forms
// * with forms
// * test forms
// * ignored words
// todo:
// * [x] rewrite fn parser to use chumsky::Recursive::declare/define
// - [x] do this to extract/simplify/DRY things like tuple patterns, fn clauses, etc.
// * [x] Work around chumsky::Stream::from_iter().spanned disappearing in most recent version
// * [x] investigate using labels (which is behind a compiler flag, somehow)
// * [ ] write parsing errors
// * [ ] wire up Ariadne parsing errors
// * [x] add stack traces and code locations to panics
// * [x] validation
// * [x] break this out into multiple files
// * [x] write a tree-walk VM
// - [x] learn how to deal with lifetimes
// - [x] with stack mechanics and refcounting
// - [ ] with tail-call optimization (nb: this may not be possible w/ a TW-VM)
// - [ ] with all the necessary forms for current Ludus
// * [x] guards in match clauses
// * [x] `as` patterns
// * [x] splat patterns in tuples, lists, dicts
// * [x] splats in list and dict literals
// * [x] `loop` and `recur`
// * [x] string patterns
// * [x] string interpolation
// * [x] docstrings
// * [x] write `base` in Rust
// * [ ] turn this into a library function
// * [ ] compile this into WASM
// * [ ] perf testing
use chumsky::{input::Stream, prelude::*}; use chumsky::{input::Stream, prelude::*};
use rust_embed::Embed;
const DEBUG_COMPILE: bool = true;
const DEBUG_RUN: bool = true;
mod memory_sandbox;
mod spans; mod spans;
use crate::spans::Spanned;
mod lexer; mod lexer;
use crate::lexer::*; use crate::lexer::lexer;
mod value;
use crate::value::*;
mod parser; mod parser;
use crate::parser::*; use crate::parser::{parser, Ast};
mod base;
use crate::base::*;
mod validator; mod validator;
use crate::validator::*;
mod process; mod compiler;
use crate::process::*; use crate::compiler::Compiler;
mod errors; mod value;
use crate::errors::*;
#[derive(Embed)] mod vm;
#[folder = "assets/"] use vm::Vm;
struct Asset;
pub fn prelude<'src>() -> (
Vec<(String, Value<'src>)>,
std::collections::HashMap<*const Ast, FnInfo>,
) {
let prelude = Asset::get("prelude.ld").unwrap().data.into_owned();
// we know for sure Prelude should live through the whole run of the program
let leaked = Box::leak(Box::new(prelude));
let prelude = std::str::from_utf8(leaked).unwrap();
let (ptoks, perrs) = lexer().parse(prelude).into_output_errors();
if !perrs.is_empty() {
println!("Errors lexing Prelude");
println!("{:?}", perrs);
panic!();
}
let ptoks = ptoks.unwrap();
let (p_ast, perrs) = parser()
.parse(Stream::from_iter(ptoks).map((0..prelude.len()).into(), |(t, s)| (t, s)))
.into_output_errors();
if !perrs.is_empty() {
println!("Errors parsing Prelude");
println!("{:?}", perrs);
panic!();
}
let prelude_parsed = Box::leak(Box::new(p_ast.unwrap()));
let base_pkg = base();
let mut v6or = Validator::new(
&prelude_parsed.0,
prelude_parsed.1,
"prelude",
prelude,
&base_pkg,
);
v6or.validate();
if !v6or.errors.is_empty() {
report_invalidation(v6or.errors);
panic!("internal Ludus error: invalid prelude")
}
let mut base_ctx = Process::<'src> {
input: "prelude",
src: prelude,
locals: base_pkg.clone(),
ast: &prelude_parsed.0,
span: prelude_parsed.1,
prelude: vec![],
fn_info: v6or.fn_info,
};
let prelude = base_ctx.eval();
let mut p_ctx = vec![];
match prelude {
Ok(Value::Dict(p_dict)) => {
for (key, value) in p_dict.iter() {
p_ctx.push((key.to_string(), value.clone()))
}
}
Ok(_) => {
println!("Bad Prelude export");
panic!();
}
Err(LErr { msg, .. }) => {
println!("Error running Prelude");
println!("{:?}", msg);
panic!();
}
};
(p_ctx, base_ctx.fn_info)
}
pub fn run(src: &'static str) { pub fn run(src: &'static str) {
let (tokens, lex_errs) = lexer().parse(src).into_output_errors(); let (tokens, lex_errs) = lexer().parse(src).into_output_errors();
@ -150,59 +30,48 @@ pub fn run(src: &'static str) {
println!("{:?}", lex_errs); println!("{:?}", lex_errs);
return; return;
} }
let tokens = tokens.unwrap(); let tokens = tokens.unwrap();
let to_parse = tokens.clone();
let (parse_result, parse_errors) = parser() let (parse_result, parse_errors) = parser()
.parse(Stream::from_iter(to_parse).map((0..src.len()).into(), |(t, s)| (t, s))) .parse(Stream::from_iter(tokens).map((0..src.len()).into(), |(t, s)| (t, s)))
.into_output_errors(); .into_output_errors();
if !parse_errors.is_empty() { if !parse_errors.is_empty() {
println!("{:?}", parse_errors); println!("{:?}", parse_errors);
return; return;
} }
let parsed = parse_result.unwrap(); // ::sigh:: The AST should be 'static
// This simplifies lifetimes, and
// in any event, the AST should live forever
let parsed: &'static Spanned<Ast> = Box::leak(Box::new(parse_result.unwrap()));
let (prelude_ctx, mut prelude_fn_info) = prelude(); let mut compiler = Compiler::new(parsed, "test", src);
compiler.compile();
let mut v6or = Validator::new(&parsed.0, parsed.1, "script", src, &prelude_ctx); if DEBUG_COMPILE {
compiler.disassemble();
v6or.validate(); println!("\n\n")
if !v6or.errors.is_empty() {
report_invalidation(v6or.errors);
return;
} }
prelude_fn_info.extend(&mut v6or.fn_info.into_iter()); if DEBUG_RUN {
println!("=== vm run: test ===");
}
let mut proc = Process { let mut vm = Vm::new(&compiler.chunk);
input: "script", let result = vm.run();
src, let output = match result {
locals: vec![], Ok(val) => val.show(&compiler.chunk),
prelude: prelude_ctx, Err(panic) => format!("{:?}", panic),
ast: &parsed.0,
span: parsed.1,
fn_info: prelude_fn_info,
}; };
println!("{output}");
let result = proc.eval();
match result {
Ok(result) => println!("{}", result),
Err(err) => report_panic(err),
}
} }
pub fn main() { pub fn main() {
let src = " let src = "
loop (100000, 1) with { match :foo with {
(1, acc) -> acc :bar -> :no
(n, acc) -> recur (dec (n), add (n, acc)) :foo -> :yes
} }
"; ";
run(src); run(src);
// struct_scalpel::print_dissection_info::<value::Value>()
// struct_scalpel::print_dissection_info::<parser::Ast>();
// println!("{}", std::mem::size_of::<parser::Ast>())
} }

src/memory_sandbox.rs Normal file
@ -0,0 +1,58 @@
use imbl::{HashMap, Vector};
use index_vec::Idx;
use std::cell::RefCell;
use std::ops::Range;
use std::rc::Rc;
struct Word(&'static str);
struct Keyword(&'static str);
struct Interned(&'static str);
enum StringPart {
Word(&'static str),
Data(&'static str),
Inline(&'static str),
}
#[derive(Clone, Debug, PartialEq)]
struct LBox {
name: usize,
cell: RefCell<Value>,
}
#[derive(Clone, Debug, PartialEq)]
struct Fn {
name: &'static str,
body: Vec<String>,
//...etc
}
#[derive(Clone, Debug, PartialEq)]
enum Value {
Nil,
Placeholder,
Boolean(bool),
Keyword(usize),
Interned(usize),
FnDecl(usize),
String(Rc<String>),
Number(f64),
Tuple(Rc<Vec<Value>>),
List(Box<Vector<Value>>),
Dict(Box<HashMap<&'static str, Value>>),
Box(Rc<LBox>),
Fn(Rc<RefCell<Fn>>),
}
fn futz() {
let foo: &'static str = "foo";
let baz: Vec<u8> = vec![];
let bar: Range<usize> = 1..3;
let quux: Vector<u8> = Vector::new();
let fuzz = Rc::new(quux);
let blah = Box::new(foo);
let val = Value::Number(12.09);
let foo: f64 = 12.0;
}

src/old_main.rs Normal file
@ -0,0 +1,212 @@
// an implementation of Ludus
// currently left undone (and not adding for a while yet):
// * sets
// * interpolated strings & string patterns
// * pkgs, namespaces, imports, `use` forms
// * with forms
// * test forms
// * ignored words
// todo:
// * [x] rewrite fn parser to use chumsky::Recursive::declare/define
// - [x] do this to extract/simplify/DRY things like tuple patterns, fn clauses, etc.
// * [x] Work around chumsky::Stream::from_iter().spanned disappearing in most recent version
// * [x] investigate using labels (which is behind a compiler flag, somehow)
// * [ ] write parsing errors
// * [ ] wire up Ariadne parsing errors
// * [x] add stack traces and code locations to panics
// * [x] validation
// * [x] break this out into multiple files
// * [x] write a tree-walk VM
// - [x] learn how to deal with lifetimes
// - [x] with stack mechanics and refcounting
// - [ ] with tail-call optimization (nb: this may not be possible w/ a TW-VM)
// - [ ] with all the necessary forms for current Ludus
// * [x] guards in match clauses
// * [x] `as` patterns
// * [x] splat patterns in tuples, lists, dicts
// * [x] splats in list and dict literals
// * [x] `loop` and `recur`
// * [x] string patterns
// * [x] string interpolation
// * [x] docstrings
// * [x] write `base` in Rust
// * [ ] turn this into a library function
// * [ ] compile this into WASM
// * [ ] perf testing
use chumsky::{input::Stream, prelude::*};
use rust_embed::Embed;
mod spans;
mod lexer;
use crate::lexer::*;
mod value;
use crate::value::*;
mod parser;
use crate::parser::*;
mod base;
use crate::base::*;
mod validator;
use crate::validator::*;
mod process;
use crate::process::*;
mod errors;
use crate::errors::*;
mod byte_values;
mod compiler;
mod memory_sandbox;
#[derive(Embed)]
#[folder = "assets/"]
struct Asset;
pub fn prelude<'src>() -> (
Vec<(String, Value<'src>)>,
std::collections::HashMap<*const Ast, FnInfo>,
) {
let prelude = Asset::get("prelude.ld").unwrap().data.into_owned();
// we know for sure Prelude should live through the whole run of the program
let leaked = Box::leak(Box::new(prelude));
let prelude = std::str::from_utf8(leaked).unwrap();
let (ptoks, perrs) = lexer().parse(prelude).into_output_errors();
if !perrs.is_empty() {
println!("Errors lexing Prelude");
println!("{:?}", perrs);
panic!();
}
let ptoks = ptoks.unwrap();
let (p_ast, perrs) = parser()
.parse(Stream::from_iter(ptoks).map((0..prelude.len()).into(), |(t, s)| (t, s)))
.into_output_errors();
if !perrs.is_empty() {
println!("Errors parsing Prelude");
println!("{:?}", perrs);
panic!();
}
let prelude_parsed = Box::leak(Box::new(p_ast.unwrap()));
let base_pkg = base();
let mut v6or = Validator::new(
&prelude_parsed.0,
prelude_parsed.1,
"prelude",
prelude,
&base_pkg,
);
v6or.validate();
if !v6or.errors.is_empty() {
report_invalidation(v6or.errors);
panic!("internal Ludus error: invalid prelude")
}
let mut base_ctx = Process::<'src> {
input: "prelude",
src: prelude,
locals: base_pkg.clone(),
ast: &prelude_parsed.0,
span: prelude_parsed.1,
prelude: vec![],
fn_info: v6or.fn_info,
};
let prelude = base_ctx.eval();
let mut p_ctx = vec![];
match prelude {
Ok(Value::Dict(p_dict)) => {
for (key, value) in p_dict.iter() {
p_ctx.push((key.to_string(), value.clone()))
}
}
Ok(_) => {
println!("Bad Prelude export");
panic!();
}
Err(LErr { msg, .. }) => {
println!("Error running Prelude");
println!("{:?}", msg);
panic!();
}
};
(p_ctx, base_ctx.fn_info)
}
pub fn run(src: &'static str) {
let (tokens, lex_errs) = lexer().parse(src).into_output_errors();
if !lex_errs.is_empty() {
println!("{:?}", lex_errs);
return;
}
let tokens = tokens.unwrap();
let to_parse = tokens.clone();
let (parse_result, parse_errors) = parser()
.parse(Stream::from_iter(to_parse).map((0..src.len()).into(), |(t, s)| (t, s)))
.into_output_errors();
if !parse_errors.is_empty() {
println!("{:?}", parse_errors);
return;
}
let parsed = parse_result.unwrap();
let (prelude_ctx, mut prelude_fn_info) = prelude();
let mut v6or = Validator::new(&parsed.0, parsed.1, "script", src, &prelude_ctx);
v6or.validate();
if !v6or.errors.is_empty() {
report_invalidation(v6or.errors);
return;
}
prelude_fn_info.extend(&mut v6or.fn_info.into_iter());
let mut proc = Process {
input: "script",
src,
locals: vec![],
prelude: prelude_ctx,
ast: &parsed.0,
span: parsed.1,
fn_info: prelude_fn_info,
};
let result = proc.eval();
match result {
Ok(result) => println!("{}", result),
Err(err) => report_panic(err),
}
}
pub fn main() {
let src = "
loop (100000, 1) with {
(1, acc) -> acc
(n, acc) -> recur (dec (n), add (n, acc))
}
";
run(src);
// struct_scalpel::print_dissection_info::<value::Value>()
// struct_scalpel::print_dissection_info::<parser::Ast>();
// println!("{}", std::mem::size_of::<parser::Ast>())
}

src/old_value.rs Normal file
@ -0,0 +1,193 @@
use crate::base::*;
use crate::parser::*;
use crate::spans::*;
use imbl::*;
use std::cell::RefCell;
use std::fmt;
use std::rc::Rc;
use struct_scalpel::Dissectible;
#[derive(Clone, Debug)]
pub struct Fn<'src> {
pub name: String,
pub body: &'src Vec<Spanned<Ast>>,
pub doc: Option<String>,
pub enclosing: Vec<(String, Value<'src>)>,
pub has_run: bool,
pub input: &'static str,
pub src: &'static str,
}
#[derive(Debug, Dissectible)]
pub enum Value<'src> {
Nil,
Placeholder,
Boolean(bool),
Number(f64),
Keyword(&'static str),
InternedString(&'static str),
AllocatedString(Rc<String>),
// on the heap for now
Tuple(Rc<Vec<Self>>),
Args(Rc<Vec<Self>>),
List(Vector<Self>),
Dict(HashMap<&'static str, Self>),
Box(&'static str, Rc<RefCell<Self>>),
Fn(Rc<RefCell<Fn<'src>>>),
FnDecl(&'static str),
Base(BaseFn<'src>),
Recur(Vec<Self>),
// Set(HashSet<Self>),
// Sets are hard
// Sets require Eq
// Eq is not implemented on f64, because NaNs
// We could use ordered_float::NotNan
// Let's defer that
// We're not really using sets in Ludus
// Other things we're not implementing yet:
// pkgs, nses, tests
}
impl<'src> Clone for Value<'src> {
fn clone(&self) -> Value<'src> {
match self {
Value::Nil => Value::Nil,
Value::Boolean(b) => Value::Boolean(*b),
Value::InternedString(s) => Value::InternedString(s),
Value::AllocatedString(s) => Value::AllocatedString(s.clone()),
Value::Keyword(s) => Value::Keyword(s),
Value::Number(n) => Value::Number(*n),
Value::Tuple(t) => Value::Tuple(t.clone()),
Value::Args(a) => Value::Args(a.clone()),
Value::Fn(f) => Value::Fn(f.clone()),
Value::FnDecl(name) => Value::FnDecl(name),
Value::List(l) => Value::List(l.clone()),
Value::Dict(d) => Value::Dict(d.clone()),
Value::Box(name, b) => Value::Box(name, b.clone()),
Value::Placeholder => Value::Placeholder,
Value::Base(b) => Value::Base(b.clone()),
Value::Recur(..) => unreachable!(),
}
}
}
impl fmt::Display for Value<'_> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Value::Nil => write!(f, "nil"),
Value::Boolean(b) => write!(f, "{b}"),
Value::Number(n) => write!(f, "{n}"),
Value::Keyword(k) => write!(f, ":{k}"),
Value::InternedString(s) => write!(f, "\"{s}\""),
Value::AllocatedString(s) => write!(f, "\"{s}\""),
Value::Fn(fun) => write!(f, "fn {}", fun.borrow().name),
Value::FnDecl(name) => write!(f, "fn {name}"),
Value::Tuple(t) | Value::Args(t) => write!(
f,
"({})",
t.iter()
.map(|x| x.to_string())
.collect::<Vec<_>>()
.join(", ")
),
Value::List(l) => write!(
f,
"[{}]",
l.iter()
.map(|x| x.to_string())
.collect::<Vec<_>>()
.join(", ")
),
Value::Dict(d) => write!(
f,
"#{{{}}}",
d.iter()
.map(|(k, v)| format!(":{k} {v}"))
.collect::<Vec<_>>()
.join(", ")
),
Value::Box(name, value) => {
write!(
f,
"box {}: [{}]",
name,
&value.try_borrow().unwrap().to_string()
)
}
Value::Placeholder => write!(f, "_"),
Value::Base(..) => unreachable!(),
Value::Recur(..) => unreachable!(),
}
}
}
impl Value<'_> {
pub fn bool(&self) -> bool {
!matches!(self, Value::Nil | Value::Boolean(false))
}
}
impl<'src> PartialEq for Value<'src> {
fn eq(&self, other: &Value<'src>) -> bool {
match (self, other) {
// value equality types
(Value::Nil, Value::Nil) => true,
(Value::Boolean(x), Value::Boolean(y)) => x == y,
(Value::Number(x), Value::Number(y)) => x == y,
(Value::InternedString(x), Value::InternedString(y)) => x == y,
(Value::AllocatedString(x), Value::AllocatedString(y)) => x == y,
(Value::InternedString(x), Value::AllocatedString(y)) => *x == **y,
(Value::AllocatedString(x), Value::InternedString(y)) => **x == *y,
(Value::Keyword(x), Value::Keyword(y)) => x == y,
(Value::Tuple(x), Value::Tuple(y)) => x == y,
(Value::List(x), Value::List(y)) => x == y,
(Value::Dict(x), Value::Dict(y)) => x == y,
// reference equality types
(Value::Fn(x), Value::Fn(y)) => {
Rc::<RefCell<Fn<'_>>>::as_ptr(x) == Rc::<RefCell<Fn<'_>>>::as_ptr(y)
}
(Value::Box(_, x), Value::Box(_, y)) => {
Rc::<RefCell<Value<'_>>>::as_ptr(x) == Rc::<RefCell<Value<'_>>>::as_ptr(y)
}
_ => false,
}
}
}
impl Eq for Value<'_> {}
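Note on the `PartialEq` impl above: it compares most variants by value, but functions and boxes by pointer identity. The sketch below shows the same split on a reduced, hypothetical `Val` enum (not the real `Value`), using `Rc::ptr_eq`, which should be equivalent to comparing the `Rc::as_ptr` results for these sized types.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Reduced stand-in for `Value`; the variant set is an assumption.
#[derive(Debug)]
enum Val {
    Number(f64),
    Tuple(Rc<Vec<Val>>),
    Box(Rc<RefCell<Val>>),
}

impl PartialEq for Val {
    fn eq(&self, other: &Val) -> bool {
        match (self, other) {
            // value equality: compare contents
            (Val::Number(x), Val::Number(y)) => x == y,
            (Val::Tuple(x), Val::Tuple(y)) => x == y,
            // reference equality: same allocation, regardless of contents
            (Val::Box(x), Val::Box(y)) => Rc::ptr_eq(x, y),
            _ => false,
        }
    }
}
```

Two boxes holding equal numbers are therefore unequal unless they share one `Rc`, while two separately allocated tuples with equal members compare equal.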
impl Value<'_> {
pub fn interpolate(&self) -> String {
match self {
Value::Nil => String::new(),
Value::Boolean(b) => format!("{b}"),
Value::Number(n) => format!("{n}"),
Value::Keyword(k) => format!(":{k}"),
Value::AllocatedString(s) => format!("{s}"),
Value::InternedString(s) => s.to_string(),
Value::Box(_, x) => x.borrow().interpolate(),
Value::Tuple(xs) => xs
.iter()
.map(|x| x.interpolate())
.collect::<Vec<_>>()
.join(", "),
Value::List(xs) => xs
.iter()
.map(|x| x.interpolate())
.collect::<Vec<_>>()
.join(", "),
Value::Dict(xs) => xs
.iter()
.map(|(k, v)| format!(":{} {}", k, v.interpolate()))
.collect::<Vec<_>>()
.join(", "),
Value::Fn(x) => format!("fn {}", x.borrow().name),
Value::FnDecl(name) => format!("fn {name}"),
Value::Placeholder => unreachable!(),
Value::Args(_) => unreachable!(),
Value::Recur(_) => unreachable!(),
Value::Base(_) => unreachable!(),
}
}
}

src/old_vm.rs Normal file
@ -0,0 +1,533 @@
use crate::base::*;
use crate::parser::*;
use crate::value::*;
use imbl::HashMap;
use imbl::Vector;
use std::cell::RefCell;
use std::rc::Rc;
#[derive(Clone, Debug)]
pub struct LudusError {
pub msg: String,
}
// oy
// lifetimes are a mess
// I need 'src kind of everywhere
// But (maybe) using 'src in eval
// for ctx
// means I can't borrow it mutably
// I guess the question is how to get
// the branches for Ast::Block and Ast::If
// to work with a mutable borrow of ctx
// pub struct Ctx<'src> {
// pub locals: Vec<(&'src str, Value<'src>)>,
// // pub names: Vec<&'src str>,
// // pub values: Vec<Value<'src>>,
// }
// impl<'src> Ctx<'src> {
// pub fn resolve(&self, name: &'src str) -> Value {
// if let Some((_, val)) = self.locals.iter().rev().find(|(bound, _)| *bound == name) {
// val.clone()
// } else {
// unreachable!()
// }
// }
// pub fn store(&mut self, name: &'src str, value: Value<'src>) {
// self.locals.push((name, value));
// }
// }
type Context<'src> = Vec<(String, Value<'src>)>;
pub fn match_eq<T, U>(x: T, y: T, z: U) -> Option<U>
where
T: PartialEq,
{
if x == y {
Some(z)
} else {
None
}
}
pub fn match_pattern<'src, 'a>(
patt: &Pattern,
val: &Value<'src>,
ctx: &'a mut Context<'src>,
) -> Option<&'a mut Context<'src>> {
match (patt, val) {
(Pattern::Nil, Value::Nil) => Some(ctx),
(Pattern::Placeholder, _) => Some(ctx),
(Pattern::Number(x), Value::Number(y)) => match_eq(x, y, ctx),
(Pattern::Boolean(x), Value::Boolean(y)) => match_eq(x, y, ctx),
(Pattern::Keyword(x), Value::Keyword(y)) => match_eq(x, y, ctx),
(Pattern::String(x), Value::InternedString(y)) => match_eq(x, y, ctx),
(Pattern::String(x), Value::AllocatedString(y)) => match_eq(&x.to_string(), y, ctx),
(Pattern::Interpolated(_, StringMatcher(matcher)), Value::InternedString(y)) => {
match matcher(y.to_string()) {
Some(matches) => {
let mut matches = matches
.iter()
.map(|(word, string)| {
(
word.clone(),
Value::AllocatedString(Rc::new(string.clone())),
)
})
.collect::<Vec<_>>();
ctx.append(&mut matches);
Some(ctx)
}
None => None,
}
}
(Pattern::Word(w), val) => {
ctx.push((w.to_string(), val.clone()));
Some(ctx)
}
(Pattern::As(word, type_str), value) => {
let ludus_type = r#type(value);
let type_kw = Value::Keyword(type_str);
if type_kw == ludus_type {
ctx.push((word.to_string(), value.clone()));
Some(ctx)
} else {
None
}
}
// todo: add splats to these match clauses
(Pattern::Tuple(x), Value::Tuple(y)) => {
let has_splat = x
.iter()
.any(|patt| matches!(patt, (Pattern::Splattern(_), _)));
if x.len() > y.len() || (!has_splat && x.len() != y.len()) {
return None;
};
let to = ctx.len();
for i in 0..x.len() {
if let Pattern::Splattern(patt) = &x[i].0 {
let mut list = Vector::new();
for i in i..y.len() {
list.push_back(y[i].clone())
}
let list = Value::List(list);
match_pattern(&patt.0, &list, ctx);
} else if match_pattern(&x[i].0, &y[i], ctx).is_none() {
while ctx.len() > to {
ctx.pop();
}
return None;
}
}
Some(ctx)
}
(Pattern::List(x), Value::List(y)) => {
let has_splat = x
.iter()
.any(|patt| matches!(patt, (Pattern::Splattern(_), _)));
if x.len() > y.len() || (!has_splat && x.len() != y.len()) {
return None;
};
let to = ctx.len();
for (i, (patt, _)) in x.iter().enumerate() {
if let Pattern::Splattern(patt) = &patt {
let list = Value::List(y.skip(i));
match_pattern(&patt.0, &list, ctx);
} else if match_pattern(patt, y.get(i).unwrap(), ctx).is_none() {
while ctx.len() > to {
ctx.pop();
}
return None;
}
}
Some(ctx)
}
// TODO: optimize this on several levels
// - [ ] opportunistic mutation
// - [ ] get rid of all the pointer indirection in word splats
(Pattern::Dict(x), Value::Dict(y)) => {
let has_splat = x
.iter()
.any(|patt| matches!(patt, (Pattern::Splattern(_), _)));
if x.len() > y.len() || (!has_splat && x.len() != y.len()) {
return None;
};
let to = ctx.len();
let mut matched = vec![];
for (pattern, _) in x {
match pattern {
Pattern::Pair(key, patt) => {
if let Some(val) = y.get(key) {
if match_pattern(&patt.0, val, ctx).is_none() {
while ctx.len() > to {
ctx.pop();
}
return None;
} else {
matched.push(key);
}
} else {
return None;
};
}
Pattern::Splattern(pattern) => match pattern.0 {
Pattern::Word(w) => {
// TODO: find a way to take ownership
// this will ALWAYS make structural changes, because of this clone
// we want opportunistic mutation if possible
let mut unmatched = y.clone();
for key in matched.iter() {
unmatched.remove(*key);
}
ctx.push((w.to_string(), Value::Dict(unmatched)));
}
Pattern::Placeholder => (),
_ => unreachable!(),
},
_ => unreachable!(),
}
}
Some(ctx)
}
_ => None,
}
}
pub fn match_clauses<'src>(
value: &Value<'src>,
clauses: &'src [MatchClause],
ctx: &mut Context<'src>,
) -> Result<Value<'src>, LudusError> {
let to = ctx.len();
for MatchClause { patt, body, guard } in clauses.iter() {
if let Some(ctx) = match_pattern(&patt.0, value, ctx) {
let pass_guard = match guard {
None => true,
Some((ast, _)) => {
let guard_res = eval(ast, ctx);
match &guard_res {
Err(_) => return guard_res,
Ok(val) => val.bool(),
}
}
};
if !pass_guard {
while ctx.len() > to {
ctx.pop();
}
continue;
}
let res = eval(&body.0, ctx);
while ctx.len() > to {
ctx.pop();
}
return res;
}
}
Err(LudusError {
msg: "no match".to_string(),
})
}
pub fn apply<'src>(
callee: Value<'src>,
caller: Value<'src>,
ctx: &mut Context,
) -> Result<Value<'src>, LudusError> {
match (callee, caller) {
(Value::Keyword(kw), Value::Dict(dict)) => {
if let Some(val) = dict.get(kw) {
Ok(val.clone())
} else {
Ok(Value::Nil)
}
}
(Value::Dict(dict), Value::Keyword(kw)) => {
if let Some(val) = dict.get(kw) {
Ok(val.clone())
} else {
Ok(Value::Nil)
}
}
(Value::Fn(f), Value::Tuple(args)) => {
let args = Value::Tuple(args);
match_clauses(&args, f.body, ctx)
}
(Value::Fn(_f), Value::Args(_args)) => todo!(),
(_, Value::Keyword(_)) => Ok(Value::Nil),
(_, Value::Args(_)) => Err(LudusError {
msg: "you may only call a function".to_string(),
}),
(Value::Base(f), Value::Tuple(args)) => match f {
Base::Nullary(f) => {
if args.len() != 0 {
Err(LudusError {
msg: "wrong arity: expected 0 arguments".to_string(),
})
} else {
Ok(f())
}
}
Base::Unary(f) => {
if args.len() != 1 {
Err(LudusError {
msg: "wrong arity: expected 1 argument".to_string(),
})
} else {
Ok(f(&args[0]))
}
}
Base::Binary(r#fn) => {
if args.len() != 2 {
Err(LudusError {
msg: "wrong arity: expected 2 arguments".to_string(),
})
} else {
Ok(r#fn(&args[0], &args[1]))
}
}
Base::Ternary(f) => {
if args.len() != 3 {
Err(LudusError {
msg: "wrong arity: expected 3 arguments".to_string(),
})
} else {
Ok(f(&args[0], &args[1], &args[2]))
}
}
},
_ => unreachable!(),
}
}
pub fn eval<'src, 'a>(
ast: &'src Ast,
ctx: &'a mut Vec<(String, Value<'src>)>,
) -> Result<Value<'src>, LudusError> {
match ast {
Ast::Nil => Ok(Value::Nil),
Ast::Boolean(b) => Ok(Value::Boolean(*b)),
Ast::Number(n) => Ok(Value::Number(*n)),
Ast::Keyword(k) => Ok(Value::Keyword(k)),
Ast::String(s) => Ok(Value::InternedString(s)),
Ast::Interpolated(parts) => {
let mut interpolated = String::new();
for part in parts {
match &part.0 {
StringPart::Data(s) => interpolated.push_str(s.as_str()),
StringPart::Word(w) => {
let val = if let Some((_, value)) =
ctx.iter().rev().find(|(name, _)| w == name)
{
value.clone()
} else {
return Err(LudusError {
msg: format!("unbound name {w}"),
});
};
interpolated.push_str(val.interpolate().as_str())
}
StringPart::Inline(_) => unreachable!(),
}
}
Ok(Value::AllocatedString(Rc::new(interpolated)))
}
Ast::Block(exprs) => {
let to = ctx.len();
let mut result = Value::Nil;
for (expr, _) in exprs {
result = eval(expr, ctx)?;
}
while ctx.len() > to {
ctx.pop();
}
Ok(result)
}
Ast::If(cond, if_true, if_false) => {
let truthy = eval(&cond.0, ctx)?.bool();
if truthy {
eval(&if_true.0, ctx)
} else {
eval(&if_false.0, ctx)
}
}
Ast::List(members) => {
let mut vect = Vector::new();
for member in members {
if let Ast::Splat(_) = member.0 {
let to_splat = eval(&member.0, ctx)?;
match to_splat {
Value::List(list) => vect.append(list),
_ => {
return Err(LudusError {
msg: "only lists may be splatted into lists".to_string(),
})
}
}
} else {
vect.push_back(eval(&member.0, ctx)?)
}
}
Ok(Value::List(vect))
}
Ast::Tuple(members) => {
let mut vect = Vec::new();
for member in members {
vect.push(eval(&member.0, ctx)?);
}
Ok(Value::Tuple(Rc::new(vect)))
}
Ast::Word(w) | Ast::Splat(w) => {
let val = if let Some((_, value)) = ctx.iter().rev().find(|(name, _)| w == name) {
value.clone()
} else {
return Err(LudusError {
msg: format!("unbound name {w}"),
});
};
Ok(val)
}
Ast::Let(patt, expr) => {
let val = eval(&expr.0, ctx)?;
match match_pattern(&patt.0, &val, ctx) {
Some(_) => Ok(val),
None => Err(LudusError {
msg: "No match".to_string(),
}),
}
}
Ast::Placeholder => Ok(Value::Placeholder),
Ast::Error => unreachable!(),
Ast::Arguments(a) => {
let mut args = vec![];
for (arg, _) in a.iter() {
let arg = eval(arg, ctx)?;
args.push(arg);
}
if args.iter().any(|arg| matches!(arg, Value::Placeholder)) {
Ok(Value::Args(Rc::new(args)))
} else {
Ok(Value::Tuple(Rc::new(args)))
}
}
Ast::Dict(terms) => {
let mut dict = HashMap::new();
for term in terms {
let (term, _) = term;
match term {
Ast::Pair(key, value) => {
let value = eval(&value.0, ctx)?;
dict.insert(*key, value);
}
Ast::Splat(_) => {
let resolved = eval(term, ctx)?;
let Value::Dict(to_splat) = resolved else {
return Err(LudusError {
msg: "cannot splat non-dict into dict".to_string(),
});
};
dict = to_splat.union(dict);
}
_ => unreachable!(),
}
}
Ok(Value::Dict(dict))
}
Ast::Box(name, expr) => {
let val = eval(&expr.0, ctx)?;
let boxed = Value::Box(name, Rc::new(RefCell::new(val)));
ctx.push((name.to_string(), boxed.clone()));
Ok(boxed)
}
Ast::Synthetic(root, first, rest) => {
let root = eval(&root.0, ctx)?;
let first = eval(&first.0, ctx)?;
let mut curr = apply(root, first, ctx)?;
for term in rest.iter() {
let next = eval(&term.0, ctx)?;
curr = apply(curr, next, ctx)?;
}
Ok(curr)
}
Ast::When(clauses) => {
for clause in clauses.iter() {
let WhenClause { cond, body } = &clause.0;
if eval(&cond.0, ctx)?.bool() {
return eval(&body.0, ctx);
};
}
Err(LudusError {
msg: "no match".to_string(),
})
}
Ast::Match(value, clauses) => {
let value = eval(&value.0, ctx)?;
match_clauses(&value, clauses, ctx)
}
Ast::Fn(name, clauses, doc) => {
let doc = doc.map(|s| s.to_string());
let the_fn = Value::Fn::<'src>(Rc::new(Fn::<'src> {
name: name.to_string(),
body: clauses,
doc,
}));
ctx.push((name.to_string(), the_fn.clone()));
Ok(the_fn)
}
Ast::FnDeclaration(_name) => todo!(),
Ast::Panic(msg) => {
let msg = eval(&msg.0, ctx)?;
Err(LudusError {
msg: msg.to_string(),
})
}
Ast::Repeat(times, body) => {
let times_num = match eval(&times.0, ctx) {
Ok(Value::Number(n)) => n as usize,
_ => {
return Err(LudusError {
msg: "repeat may only take numbers".to_string(),
})
}
};
for _ in 0..times_num {
eval(&body.0, ctx)?;
}
Ok(Value::Nil)
}
Ast::Do(terms) => {
let mut result = eval(&terms[0].0, ctx)?;
for (term, _) in terms.iter().skip(1) {
let next = eval(term, ctx)?;
let arg = Value::Tuple(Rc::new(vec![result]));
result = apply(next, arg, ctx)?;
}
Ok(result)
}
Ast::Pair(..) => {
unreachable!()
}
Ast::Loop(init, clauses) => {
let mut args = eval(&init.0, ctx)?;
loop {
let result = match_clauses(&args, clauses, ctx)?;
if let Value::Recur(recur_args) = result {
args = Value::Tuple(Rc::new(recur_args));
} else {
return Ok(result);
}
}
}
Ast::Recur(args) => {
let mut vect = Vec::new();
for arg in args {
vect.push(eval(&arg.0, ctx)?);
}
Ok(Value::Recur(vect))
}
}
}
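Note on `Ast::Loop` and `Ast::Recur` above: `recur` does not call back into the loop. It returns a `Value::Recur` marker, and the enclosing `loop` rebinds its arguments and runs the clauses again, so the host stack does not grow. A minimal standalone sketch of that trampoline follows; `Step`, `run_loop`, and `sum_to` are hypothetical names, and plain `f64` arguments stand in for `Value`.

```rust
/// Result of one pass through the loop body.
enum Step {
    /// The loop is finished with this value.
    Done(f64),
    /// Run the body again with these arguments.
    Recur(Vec<f64>),
}

/// Drives `body` until it stops asking to recur.
fn run_loop(init: Vec<f64>, body: impl Fn(&[f64]) -> Step) -> f64 {
    let mut args = init;
    loop {
        match body(&args) {
            Step::Done(value) => return value,
            Step::Recur(next) => args = next,
        }
    }
}

/// Mirrors the Ludus test program:
/// loop (n, 1) with { (1, acc) -> acc; (n, acc) -> recur (dec (n), add (n, acc)) }
fn sum_to(n: f64) -> f64 {
    run_loop(vec![n, 1.0], |args| {
        if args[0] == 1.0 {
            Step::Done(args[1])
        } else {
            Step::Recur(vec![args[0] - 1.0, args[0] + args[1]])
        }
    })
}
```

Because each iteration returns to `run_loop` before the next one starts, 100000 iterations use constant stack depth.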

@ -51,6 +51,12 @@ impl fmt::Display for StringPart {
} }
} }
pub struct LFn {
name: &'static str,
clauses: Vec<Spanned<Ast>>,
doc: Option<&'static str>,
}
#[derive(Clone, Debug, PartialEq, Dissectible)] #[derive(Clone, Debug, PartialEq, Dissectible)]
pub enum Ast { pub enum Ast {
// a special Error node // a special Error node
@ -83,7 +89,8 @@ pub enum Ast {
Box<Option<Spanned<Self>>>, Box<Option<Spanned<Self>>>,
Box<Spanned<Self>>, Box<Spanned<Self>>,
), ),
Fn(&'static str, Vec<Spanned<Self>>, Option<&'static str>), Fn(&'static str, Box<Spanned<Ast>>, Option<&'static str>),
FnBody(Vec<Spanned<Ast>>),
FnDeclaration(&'static str), FnDeclaration(&'static str),
Panic(Box<Spanned<Self>>), Panic(Box<Spanned<Self>>),
Do(Vec<Spanned<Self>>), Do(Vec<Spanned<Self>>),
@ -205,11 +212,10 @@ impl fmt::Display for Ast {
.join("\n") .join("\n")
) )
} }
Fn(name, clauses, _) => { FnBody(clauses) => {
write!( write!(
f, f,
"fn: {}\n{}", "{}",
name,
clauses clauses
.iter() .iter()
.map(|clause| clause.0.to_string()) .map(|clause| clause.0.to_string())
@ -217,6 +223,9 @@ impl fmt::Display for Ast {
.join("\n") .join("\n")
) )
} }
Fn(name, body, ..) => {
write!(f, "fn: {name}\n{}", body.0)
}
FnDeclaration(_name) => todo!(), FnDeclaration(_name) => todo!(),
Panic(_expr) => todo!(), Panic(_expr) => todo!(),
Do(terms) => { Do(terms) => {
@ -922,7 +931,12 @@ where
let lambda = just(Token::Reserved("fn")) let lambda = just(Token::Reserved("fn"))
.ignore_then(fn_unguarded.clone()) .ignore_then(fn_unguarded.clone())
.map_with(|clause, e| (Fn("anonymous", vec![clause], None), e.span())); .map_with(|clause, e| {
(
Fn("", Box::new((Ast::FnBody(vec![clause]), e.span())), None),
e.span(),
)
});
let fn_clauses = fn_clause let fn_clauses = fn_clause
.clone() .clone()
@ -1010,7 +1024,10 @@ where
} else { } else {
unreachable!() unreachable!()
}; };
(Fn(name, vec![clause], None), e.span()) (
Fn(name, Box::new((Ast::FnBody(vec![clause]), e.span())), None),
e.span(),
)
}); });
let docstr = select! {Token::String(s) => s}; let docstr = select! {Token::String(s) => s};
@ -1032,7 +1049,10 @@ where
} else { } else {
unreachable!() unreachable!()
}; };
(Fn(name, clauses, docstr), e.span()) (
Fn(name, Box::new((Ast::FnBody(clauses), e.span())), docstr),
e.span(),
)
}); });
let fn_ = fn_named.or(fn_compound).or(fn_decl); let fn_ = fn_named.or(fn_compound).or(fn_decl);

@ -1,5 +1,5 @@
use crate::parser::*; use crate::parser::*;
use crate::spans::Span; use crate::spans::{Span, Spanned};
use crate::value::Value; use crate::value::Value;
use std::collections::{HashMap, HashSet}; use std::collections::{HashMap, HashSet};
@ -53,9 +53,9 @@ fn match_arities(arities: &HashSet<Arity>, num_args: u8) -> bool {
} }
#[derive(Debug, PartialEq)] #[derive(Debug, PartialEq)]
pub struct Validator<'a, 'src> { pub struct Validator<'a> {
pub locals: Vec<(String, Span, FnInfo)>, pub locals: Vec<(String, Span, FnInfo)>,
pub prelude: &'a Vec<(String, Value<'src>)>, pub prelude: &'a Vec<(&'static str, Value)>,
pub input: &'static str, pub input: &'static str,
pub src: &'static str, pub src: &'static str,
pub ast: &'a Ast, pub ast: &'a Ast,
@ -65,14 +65,14 @@ pub struct Validator<'a, 'src> {
status: VStatus, status: VStatus,
} }
impl<'a, 'src: 'a> Validator<'a, 'src> { impl<'a> Validator<'a> {
pub fn new( pub fn new(
ast: &'a Ast, ast: &'a Ast,
span: Span, span: Span,
input: &'static str, input: &'static str,
src: &'static str, src: &'static str,
prelude: &'a Vec<(String, Value<'src>)>, prelude: &'a Vec<(&'static str, Value)>,
) -> Validator<'a, 'src> { ) -> Validator<'a> {
Validator { Validator {
input, input,
src, src,
@ -109,7 +109,7 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
fn resolved(&self, name: &str) -> bool { fn resolved(&self, name: &str) -> bool {
self.locals.iter().any(|(bound, ..)| name == bound.as_str()) self.locals.iter().any(|(bound, ..)| name == bound.as_str())
|| self.prelude.iter().any(|(bound, _)| name == bound.as_str()) || self.prelude.iter().any(|(bound, _)| name == *bound)
} }
fn bound(&self, name: &str) -> Option<&(String, Span, FnInfo)> { fn bound(&self, name: &str) -> Option<&(String, Span, FnInfo)> {
@ -143,6 +143,13 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
} }
} }
fn visit(&mut self, node: &'a Spanned<Ast>) {
let (expr, span) = node;
self.ast = expr;
self.span = *span;
self.validate();
}
pub fn validate(&mut self) { pub fn validate(&mut self) {
use Ast::*; use Ast::*;
let root = self.ast; let root = self.ast;
@ -179,18 +186,13 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
} }
let to = self.locals.len(); let to = self.locals.len();
let tailpos = self.status.tail_position; let tailpos = self.status.tail_position;
for (expr, span) in block.iter().take(block.len() - 1) { for line in block.iter().take(block.len() - 1) {
self.status.tail_position = false; self.status.tail_position = false;
self.ast = expr; self.visit(line);
self.span = *span;
self.validate();
} }
let (expr, span) = block.last().unwrap();
self.ast = expr;
self.span = *span;
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
self.validate(); self.visit(block.last().unwrap());
let block_bindings = self.locals.split_off(to); let block_bindings = self.locals.split_off(to);
@ -207,22 +209,12 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
let tailpos = self.status.tail_position; let tailpos = self.status.tail_position;
self.status.tail_position = false; self.status.tail_position = false;
let (expr, span) = cond.as_ref(); self.visit(cond.as_ref());
self.ast = expr;
self.span = *span;
self.validate();
// pass through tailpos only to then/else // pass through tailpos only to then/else
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
let (expr, span) = then.as_ref(); self.visit(then.as_ref());
self.ast = expr; self.visit(r#else.as_ref());
self.span = *span;
self.validate();
let (expr, span) = r#else.as_ref();
self.ast = expr;
self.span = *span;
self.validate();
} }
Tuple(members) => { Tuple(members) => {
if members.is_empty() { if members.is_empty() {
@ -230,10 +222,8 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
} }
let tailpos = self.status.tail_position; let tailpos = self.status.tail_position;
self.status.tail_position = false; self.status.tail_position = false;
for (expr, span) in members { for member in members {
self.ast = expr; self.visit(member);
self.span = *span;
self.validate();
} }
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
} }
@ -244,10 +234,8 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
} }
let tailpos = self.status.tail_position; let tailpos = self.status.tail_position;
self.status.tail_position = false; self.status.tail_position = false;
for (expr, span) in args { for arg in args {
self.ast = expr; self.visit(arg);
self.span = *span;
self.validate();
} }
self.status.has_placeholder = false; self.status.has_placeholder = false;
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
@ -267,30 +255,21 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
} }
let tailpos = self.status.tail_position; let tailpos = self.status.tail_position;
self.status.tail_position = false; self.status.tail_position = false;
for (expr, span) in list { for member in list {
self.ast = expr; self.visit(member);
self.span = *span;
self.validate();
} }
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
} }
Pair(_, value) => { Pair(_, value) => self.visit(value.as_ref()),
let (expr, span) = value.as_ref();
self.ast = expr;
self.span = *span;
self.validate();
}
Dict(dict) => { Dict(dict) => {
if dict.is_empty() { if dict.is_empty() {
return; return;
} }
let tailpos = self.status.tail_position; let tailpos = self.status.tail_position;
self.status.tail_position = false; self.status.tail_position = false;
for (expr, span) in dict { for pair in dict {
self.ast = expr; self.visit(pair)
self.span = *span;
self.validate();
} }
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
} }
@ -299,31 +278,16 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
// check arity against fn info if first term is word and second term is args // check arity against fn info if first term is word and second term is args
Synthetic(first, second, rest) => { Synthetic(first, second, rest) => {
match (&first.0, &second.0) { match (&first.0, &second.0) {
(Ast::Word(_), Ast::Keyword(_)) => { (Ast::Word(_), Ast::Keyword(_)) => self.visit(first.as_ref()),
let (expr, span) = first.as_ref();
self.ast = expr;
self.span = *span;
self.validate();
}
(Ast::Keyword(_), Ast::Arguments(args)) => { (Ast::Keyword(_), Ast::Arguments(args)) => {
if args.len() != 1 { if args.len() != 1 {
self.err("called keywords may only take one argument".to_string()) self.err("called keywords may only take one argument".to_string())
} }
let (expr, span) = second.as_ref(); self.visit(second.as_ref());
self.ast = expr;
self.span = *span;
self.validate();
} }
(Ast::Word(name), Ast::Arguments(args)) => { (Ast::Word(name), Ast::Arguments(args)) => {
let (expr, span) = first.as_ref(); self.visit(first.as_ref());
self.ast = expr; self.visit(second.as_ref());
self.span = *span;
self.validate();
let (expr, span) = second.as_ref();
self.ast = expr;
self.span = *span;
self.validate();
//TODO: check arities of prelude fns, too //TODO: check arities of prelude fns, too
let fn_binding = self.bound(name); let fn_binding = self.bound(name);
@ -337,32 +301,20 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
_ => unreachable!(), _ => unreachable!(),
} }
for term in rest { for term in rest {
let (expr, span) = term; self.visit(term);
self.ast = expr;
self.span = *span;
self.validate();
} }
} }
WhenClause(cond, body) => { WhenClause(cond, body) => {
let tailpos = self.status.tail_position; let tailpos = self.status.tail_position;
self.status.tail_position = false; self.status.tail_position = false;
let (expr, span) = cond.as_ref(); self.visit(cond.as_ref());
self.ast = expr; //pass through tail position for when bodies
self.span = *span;
self.validate();
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
let (expr, span) = body.as_ref(); self.visit(body.as_ref());
self.ast = expr;
self.span = *span;
self.validate();
} }
When(clauses) => { When(clauses) => {
for clause in clauses { for clause in clauses {
let (expr, span) = clause; self.visit(clause);
self.ast = expr;
self.span = *span;
self.validate();
} }
} }
@ -374,54 +326,30 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
} else { } else {
self.bind(name.to_string()); self.bind(name.to_string());
} }
let (expr, span) = boxed.as_ref(); self.visit(boxed.as_ref());
self.ast = expr;
self.span = *span;
self.validate();
} }
Let(lhs, rhs) => { Let(lhs, rhs) => {
let (expr, span) = rhs.as_ref(); self.visit(rhs.as_ref());
self.ast = expr; self.visit(lhs.as_ref());
self.span = *span;
self.validate();
let (expr, span) = lhs.as_ref();
self.ast = expr;
self.span = *span;
self.validate();
} }
MatchClause(pattern, guard, body) => { MatchClause(pattern, guard, body) => {
let to = self.locals.len(); let to = self.locals.len();
let (patt, span) = pattern.as_ref(); self.visit(pattern.as_ref());
self.ast = patt;
self.span = *span;
self.validate();
if let Some((expr, span)) = guard.as_ref() { if let Some(guard) = guard.as_ref() {
self.ast = expr; self.visit(guard);
self.span = *span;
self.validate();
} }
let (expr, span) = body.as_ref(); self.visit(body.as_ref());
self.ast = expr;
self.span = *span;
self.validate();
self.locals.truncate(to); self.locals.truncate(to);
} }
Match(scrutinee, clauses) => { Match(scrutinee, clauses) => {
let (expr, span) = scrutinee.as_ref(); self.visit(scrutinee.as_ref());
self.ast = expr;
self.span = *span;
self.validate();
for clause in clauses { for clause in clauses {
let (expr, span) = clause; self.visit(clause);
self.ast = expr;
self.span = *span;
self.validate();
} }
} }
FnDeclaration(name) => { FnDeclaration(name) => {
@ -434,7 +362,8 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
self.declare_fn(name.to_string()); self.declare_fn(name.to_string());
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
} }
Fn(name, clauses, ..) => { FnBody(..) => unreachable!(),
Fn(name, body, ..) => {
let mut is_declared = false; let mut is_declared = false;
match self.bound(name) { match self.bound(name) {
Some((_, _, FnInfo::Declared)) => is_declared = true, Some((_, _, FnInfo::Declared)) => is_declared = true,
@ -452,8 +381,12 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
let from = self.status.used_bindings.len(); let from = self.status.used_bindings.len();
let mut arities = HashSet::new(); let mut arities = HashSet::new();
let (Ast::FnBody(clauses), _) = body.as_ref() else {
unreachable!()
};
for clause in clauses { for clause in clauses {
// TODO: validate all parts of clauses // we have to do this explicitly here because of arity checking
let (expr, span) = clause; let (expr, span) = clause;
self.ast = expr; self.ast = expr;
self.span = *span; self.span = *span;
@ -462,12 +395,7 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
self.validate(); self.validate();
} }
// this should be right // collect info about what the function closes over
// we can't bind anything that's already bound,
// even in arg names
// so anything that is already bound and used
// will, of necessity, be closed over
// we don't want to try to close over locals in functions
let mut closed_over = HashSet::new(); let mut closed_over = HashSet::new();
for binding in self.status.used_bindings.iter().skip(from) { for binding in self.status.used_bindings.iter().skip(from) {
if self.bound(binding.as_str()).is_some() { if self.bound(binding.as_str()).is_some() {
@ -488,10 +416,7 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
Panic(msg) => { Panic(msg) => {
let tailpos = self.status.tail_position; let tailpos = self.status.tail_position;
self.status.tail_position = false; self.status.tail_position = false;
let (expr, span) = msg.as_ref(); self.visit(msg.as_ref());
self.ast = expr;
self.span = *span;
self.validate();
self.status.tail_position = tailpos; self.status.tail_position = tailpos;
} }
// TODO: fix the tail call here? // TODO: fix the tail call here?
@ -500,39 +425,23 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
return self.err("do expressions must have at least two terms".to_string()); return self.err("do expressions must have at least two terms".to_string());
} }
for term in terms.iter().take(terms.len() - 1) { for term in terms.iter().take(terms.len() - 1) {
let (expr, span) = term; self.visit(term);
self.ast = expr;
self.span = *span;
self.validate();
} }
let last = terms.last().unwrap();
let (expr, span) = terms.last().unwrap(); self.visit(last);
self.ast = expr; if matches!(last.0, Ast::Recur(_)) {
self.span = *span;
if matches!(expr, Ast::Recur(_)) {
self.err("`recur` may not be used in `do` forms".to_string()); self.err("`recur` may not be used in `do` forms".to_string());
} }
self.validate();
} }
Repeat(times, body) => { Repeat(times, body) => {
self.status.tail_position = false; self.status.tail_position = false;
let (expr, span) = times.as_ref(); self.visit(times.as_ref());
self.ast = expr; self.visit(body.as_ref());
self.span = *span;
self.validate();
let (expr, span) = body.as_ref();
self.ast = expr;
self.span = *span;
self.validate();
} }
Loop(with, body) => { Loop(with, body) => {
let (expr, span) = with.as_ref(); self.visit(with.as_ref());
self.span = *span;
self.ast = expr;
self.validate();
let Ast::Tuple(input) = expr else { let Ast::Tuple(input) = &with.0 else {
unreachable!() unreachable!()
}; };
@ -588,10 +497,7 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
self.status.tail_position = false; self.status.tail_position = false;
for arg in args { for arg in args {
let (expr, span) = arg; self.visit(arg);
self.ast = expr;
self.span = *span;
self.validate();
} }
} }
WordPattern(name) => match self.bound(name) { WordPattern(name) => match self.bound(name) {
@ -654,25 +560,15 @@ impl<'a, 'src: 'a> Validator<'a, 'src> {
return; return;
} }
for term in terms.iter().take(terms.len() - 1) { for term in terms.iter().take(terms.len() - 1) {
let (patt, span) = term; self.visit(term);
self.ast = patt;
self.span = *span;
self.validate();
} }
self.status.last_term = true; self.status.last_term = true;
let (patt, span) = terms.last().unwrap(); let last = terms.last().unwrap();
self.ast = patt; self.visit(last);
self.span = *span;
self.validate();
self.status.last_term = false; self.status.last_term = false;
} }
PairPattern(_, patt) => { PairPattern(_, patt) => self.visit(patt.as_ref()),
let (patt, span) = patt.as_ref();
self.ast = patt;
self.span = *span;
self.validate();
}
// terminals can never be invalid // terminals can never be invalid
Nil | Boolean(_) | Number(_) | Keyword(_) | String(_) => (), Nil | Boolean(_) | Number(_) | Keyword(_) | String(_) => (),
// terminal patterns can never be invalid // terminal patterns can never be invalid

@ -1,193 +1,148 @@
use crate::base::*; use crate::compiler::Chunk;
use crate::parser::*; use crate::parser::Ast;
use crate::spans::*; use crate::spans::Spanned;
use imbl::*; use imbl::{HashMap, Vector};
use std::cell::RefCell; use std::cell::{OnceCell, RefCell};
use std::fmt;
use std::rc::Rc; use std::rc::Rc;
use struct_scalpel::Dissectible;
#[derive(Clone, Debug)] #[derive(Clone, Debug, PartialEq)]
pub struct Fn<'src> { pub struct LFn {
pub name: String, pub name: &'static str,
pub body: &'src Vec<Spanned<Ast>>, pub doc: Option<&'static str>,
pub doc: Option<String>, // pub enclosing: Vec<(usize, Value)>,
pub enclosing: Vec<(String, Value<'src>)>, // pub has_run: bool,
pub has_run: bool, // pub input: &'static str,
pub input: &'static str, // pub src: &'static str,
pub src: &'static str, pub chunk: Chunk,
pub closed: Vec<Value>,
} }
#[derive(Debug, Dissectible)] impl LFn {
pub enum Value<'src> { pub fn close(&mut self, val: Value) {
Nil, self.closed.push(val);
Placeholder,
Boolean(bool),
Number(f64),
Keyword(&'static str),
InternedString(&'static str),
AllocatedString(Rc<String>),
// on the heap for now
Tuple(Rc<Vec<Self>>),
Args(Rc<Vec<Self>>),
List(Vector<Self>),
Dict(HashMap<&'static str, Self>),
Box(&'static str, Rc<RefCell<Self>>),
Fn(Rc<RefCell<Fn<'src>>>),
FnDecl(&'static str),
Base(BaseFn<'src>),
Recur(Vec<Self>),
// Set(HashSet<Self>),
// Sets are hard
// Sets require Eq
// Eq is not implemented on f64, because NaNs
// We could use ordered_float::NotNan
// Let's defer that
// We're not really using sets in Ludus
// Other things we're not implementing yet:
// pkgs, nses, tests
}
impl<'src> Clone for Value<'src> {
fn clone(&self) -> Value<'src> {
match self {
Value::Nil => Value::Nil,
Value::Boolean(b) => Value::Boolean(*b),
Value::InternedString(s) => Value::InternedString(s),
Value::AllocatedString(s) => Value::AllocatedString(s.clone()),
Value::Keyword(s) => Value::Keyword(s),
Value::Number(n) => Value::Number(*n),
Value::Tuple(t) => Value::Tuple(t.clone()),
Value::Args(a) => Value::Args(a.clone()),
Value::Fn(f) => Value::Fn(f.clone()),
Value::FnDecl(name) => Value::FnDecl(name),
Value::List(l) => Value::List(l.clone()),
Value::Dict(d) => Value::Dict(d.clone()),
Value::Box(name, b) => Value::Box(name, b.clone()),
Value::Placeholder => Value::Placeholder,
Value::Base(b) => Value::Base(b.clone()),
Value::Recur(..) => unreachable!(),
}
} }
} }
impl fmt::Display for Value<'_> { #[derive(Clone, Debug, PartialEq)]
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { pub enum Value {
Nil,
True,
False,
Keyword(usize),
Interned(usize),
FnDecl(usize),
String(Rc<String>),
Number(f64),
Tuple(Rc<Vec<Value>>),
List(Box<Vector<Value>>),
Dict(Box<HashMap<usize, Value>>),
Box(Rc<RefCell<Value>>),
Fn(Rc<OnceCell<LFn>>),
}
impl std::fmt::Display for Value {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
use Value::*;
match self { match self {
Value::Nil => write!(f, "nil"), Nil => write!(f, "nil"),
Value::Boolean(b) => write!(f, "{b}"), True => write!(f, "true"),
Value::Number(n) => write!(f, "{n}"), False => write!(f, "false"),
Value::Keyword(k) => write!(f, ":{k}"), Keyword(idx) => write!(f, ":{idx}"),
Value::InternedString(s) => write!(f, "\"{s}\""), Interned(idx) => write!(f, "\"@{idx}\""),
Value::AllocatedString(s) => write!(f, "\"{s}\""), Number(n) => write!(f, "{n}"),
Value::Fn(fun) => write!(f, "fn {}", fun.borrow().name), Tuple(members) => write!(
Value::FnDecl(name) => write!(f, "fn {name}"),
Value::Tuple(t) | Value::Args(t) => write!(
f, f,
"({})", "({})",
t.iter() members
.iter()
.map(|x| x.to_string()) .map(|x| x.to_string())
.collect::<Vec<_>>() .collect::<Vec<_>>()
.join(", ") .join(", ")
), ),
Value::List(l) => write!( List(members) => write!(
f, f,
"[{}]", "[{}]",
l.iter() members
.iter()
.map(|x| x.to_string()) .map(|x| x.to_string())
.collect::<Vec<_>>() .collect::<Vec<_>>()
.join(", ") .join(", ")
), ),
Value::Dict(d) => write!( Dict(members) => write!(
f, f,
"#{{{}}}", "#{{{}}}",
d.iter() members
.map(|(k, v)| format!(":{k} {v}")) .iter()
.map(|(k, v)| format!("{k} {v}"))
.collect::<Vec<_>>() .collect::<Vec<_>>()
.join(", ") .join(", ")
), ),
Value::Box(name, value) => { Box(value) => write!(f, "box {}", value.as_ref().borrow()),
write!( Fn(lfn) => write!(f, "fn {}", lfn.get().unwrap().name),
f, _ => todo!(),
"box {}: [{}]",
name,
&value.try_borrow().unwrap().to_string()
)
}
Value::Placeholder => write!(f, "_"),
Value::Base(..) => unreachable!(),
Value::Recur(..) => unreachable!(),
} }
} }
} }
impl Value<'_> { impl Value {
pub fn bool(&self) -> bool { pub fn show(&self, ctx: &Chunk) -> String {
!matches!(self, Value::Nil | Value::Boolean(false)) use Value::*;
} match &self {
} Nil => "nil".to_string(),
True => "true".to_string(),
impl<'src> PartialEq for Value<'src> { False => "false".to_string(),
fn eq(&self, other: &Value<'src>) -> bool { Number(n) => format!("{n}"),
match (self, other) { Interned(i) => {
// value equality types let str_str = ctx.strings[*i];
(Value::Nil, Value::Nil) => true, format!("\"{str_str}\"")
(Value::Boolean(x), Value::Boolean(y)) => x == y,
(Value::Number(x), Value::Number(y)) => x == y,
(Value::InternedString(x), Value::InternedString(y)) => x == y,
(Value::AllocatedString(x), Value::AllocatedString(y)) => x == y,
(Value::InternedString(x), Value::AllocatedString(y)) => *x == **y,
(Value::AllocatedString(x), Value::InternedString(y)) => **x == *y,
(Value::Keyword(x), Value::Keyword(y)) => x == y,
(Value::Tuple(x), Value::Tuple(y)) => x == y,
(Value::List(x), Value::List(y)) => x == y,
(Value::Dict(x), Value::Dict(y)) => x == y,
// reference equality types
(Value::Fn(x), Value::Fn(y)) => {
Rc::<RefCell<Fn<'_>>>::as_ptr(x) == Rc::<RefCell<Fn<'_>>>::as_ptr(y)
} }
(Value::Box(_, x), Value::Box(_, y)) => { Keyword(i) => {
Rc::<RefCell<Value<'_>>>::as_ptr(x) == Rc::<RefCell<Value<'_>>>::as_ptr(y) let kw_str = ctx.keywords[*i];
format!(":{kw_str}")
} }
_ => false, Tuple(t) => {
let members = t.iter().map(|e| e.show(ctx)).collect::<Vec<_>>().join(", ");
format!("({members})")
}
List(l) => {
let members = l.iter().map(|e| e.show(ctx)).collect::<Vec<_>>().join(", ");
format!("[{members}]")
}
Dict(d) => {
let members = d
.iter()
.map(|(k, v)| {
let key_show = Value::Keyword(*k).show(ctx);
let value_show = v.show(ctx);
format!("{key_show} {value_show}")
})
.collect::<Vec<_>>()
.join(", ");
format!("#{{{members}}}")
}
String(s) => s.as_ref().clone(),
Box(x) => format!("box {{ {} }}", x.as_ref().borrow().show(ctx)),
Fn(lfn) => format!("fn {}", lfn.get().unwrap().name),
_ => todo!(),
} }
} }
}
impl Eq for Value<'_> {} pub fn type_of(&self) -> &'static str {
use Value::*;
impl Value<'_> {
pub fn interpolate(&self) -> String {
match self { match self {
Value::Nil => String::new(), Nil => "nil",
Value::Boolean(b) => format!("{b}"), True => "bool",
Value::Number(n) => format!("{n}"), False => "bool",
Value::Keyword(k) => format!(":{k}"), Keyword(..) => "keyword",
Value::AllocatedString(s) => format!("{s}"), Interned(..) => "string",
Value::InternedString(s) => s.to_string(), FnDecl(..) => "fn",
Value::Box(_, x) => x.borrow().interpolate(), String(..) => "string",
Value::Tuple(xs) => xs Number(..) => "number",
.iter() Tuple(..) => "tuple",
.map(|x| x.interpolate()) List(..) => "list",
.collect::<Vec<_>>() Dict(..) => "dict",
.join(", "), Box(..) => "box",
Value::List(xs) => xs Fn(..) => "fn",
.iter()
.map(|x| x.interpolate())
.collect::<Vec<_>>()
.join(", "),
Value::Dict(xs) => xs
.iter()
.map(|(k, v)| format!(":{} {}", k, v.interpolate()))
.collect::<Vec<_>>()
.join(", "),
Value::Fn(x) => format!("fn {}", x.borrow().name),
Value::FnDecl(name) => format!("fn {name}"),
Value::Placeholder => unreachable!(),
Value::Args(_) => unreachable!(),
Value::Recur(_) => unreachable!(),
Value::Base(_) => unreachable!(),
} }
} }
} }

778
src/vm.rs
@ -1,533 +1,319 @@
use crate::base::*; use crate::compiler::{Chunk, Op};
use crate::parser::*; use crate::parser::Ast;
use crate::value::*; use crate::spans::Spanned;
use imbl::HashMap; use crate::value::Value;
use imbl::Vector; use chumsky::prelude::SimpleSpan;
use imbl::{HashMap, Vector};
use num_traits::FromPrimitive;
use std::cell::RefCell; use std::cell::RefCell;
use std::mem::swap;
use std::rc::Rc; use std::rc::Rc;
#[derive(Clone, Debug)] #[derive(Debug, Clone, PartialEq)]
pub struct LudusError { // pub struct Panic {
pub msg: String, // pub input: &'static str,
} // pub src: &'static str,
// pub msg: String,
// oy // pub span: SimpleSpan,
// lifetimes are a mess // pub trace: Vec<Trace>,
// I need 'src kind of everywhere // pub extra: String,
// But (maybe) using 'src in eval
// for ctx
// means I can't borrow it mutably
// I guess the question is how to get
// the branches for Ast::Block and Ast::If
// to work with a mutable borrow of ctx
// pub struct Ctx<'src> {
// pub locals: Vec<(&'src str, Value<'src>)>,
// // pub names: Vec<&'src str>,
// // pub values: Vec<Value<'src>>,
// } // }
pub struct Panic(&'static str);
// impl<'src> Ctx<'src> { #[derive(Debug, Clone, PartialEq)]
// pub fn resolve(&self, name: &'src str) -> Value { pub struct Trace {
// if let Some((_, val)) = self.locals.iter().rev().find(|(bound, _)| *bound == name) { pub callee: Spanned<Ast>,
// val.clone() pub caller: Spanned<Ast>,
// } else { pub function: Value,
// unreachable!() pub arguments: Value,
// } pub input: &'static str,
// } pub src: &'static str,
// pub fn store(&mut self, name: &'src str, value: Value<'src>) {
// self.locals.push((name, value));
// }
// }
type Context<'src> = Vec<(String, Value<'src>)>;
pub fn match_eq<T, U>(x: T, y: T, z: U) -> Option<U>
where
T: PartialEq,
{
if x == y {
Some(z)
} else {
None
}
} }
pub fn match_pattern<'src, 'a>( pub struct Vm<'a> {
patt: &Pattern, pub stack: Vec<Value>,
val: &Value<'src>, pub chunk: &'a Chunk,
ctx: &'a mut Context<'src>, pub ip: usize,
) -> Option<&'a mut Context<'src>> { pub return_register: Value,
match (patt, val) { pub matches: bool,
(Pattern::Nil, Value::Nil) => Some(ctx), pub match_depth: u8,
(Pattern::Placeholder, _) => Some(ctx), pub result: Option<Result<Value, Panic>>,
(Pattern::Number(x), Value::Number(y)) => match_eq(x, y, ctx),
(Pattern::Boolean(x), Value::Boolean(y)) => match_eq(x, y, ctx),
(Pattern::Keyword(x), Value::Keyword(y)) => match_eq(x, y, ctx),
(Pattern::String(x), Value::InternedString(y)) => match_eq(x, y, ctx),
(Pattern::String(x), Value::AllocatedString(y)) => match_eq(&x.to_string(), y, ctx),
(Pattern::Interpolated(_, StringMatcher(matcher)), Value::InternedString(y)) => {
match matcher(y.to_string()) {
Some(matches) => {
let mut matches = matches
.iter()
.map(|(word, string)| {
(
word.clone(),
Value::AllocatedString(Rc::new(string.clone())),
)
})
.collect::<Vec<_>>();
ctx.append(&mut matches);
Some(ctx)
}
None => None,
}
}
(Pattern::Word(w), val) => {
ctx.push((w.to_string(), val.clone()));
Some(ctx)
}
(Pattern::As(word, type_str), value) => {
let ludus_type = r#type(value);
let type_kw = Value::Keyword(type_str);
if type_kw == ludus_type {
ctx.push((word.to_string(), value.clone()));
Some(ctx)
} else {
None
}
}
// todo: add splats to these match clauses
(Pattern::Tuple(x), Value::Tuple(y)) => {
let has_splat = x
.iter()
.any(|patt| matches!(patt, (Pattern::Splattern(_), _)));
if x.len() > y.len() || (!has_splat && x.len() != y.len()) {
return None;
};
let to = ctx.len();
for i in 0..x.len() {
if let Pattern::Splattern(patt) = &x[i].0 {
let mut list = Vector::new();
for i in i..y.len() {
list.push_back(y[i].clone())
}
let list = Value::List(list);
match_pattern(&patt.0, &list, ctx);
} else if match_pattern(&x[i].0, &y[i], ctx).is_none() {
while ctx.len() > to {
ctx.pop();
}
return None;
}
}
Some(ctx)
}
(Pattern::List(x), Value::List(y)) => {
let has_splat = x
.iter()
.any(|patt| matches!(patt, (Pattern::Splattern(_), _)));
if x.len() > y.len() || (!has_splat && x.len() != y.len()) {
return None;
};
let to = ctx.len();
for (i, (patt, _)) in x.iter().enumerate() {
if let Pattern::Splattern(patt) = &patt {
let list = Value::List(y.skip(i));
match_pattern(&patt.0, &list, ctx);
} else if match_pattern(patt, y.get(i).unwrap(), ctx).is_none() {
while ctx.len() > to {
ctx.pop();
}
return None;
}
}
Some(ctx)
}
// TODO: optimize this on several levels
// - [ ] opportunistic mutation
// - [ ] get rid of all the pointer indirection in word splats
(Pattern::Dict(x), Value::Dict(y)) => {
let has_splat = x
.iter()
.any(|patt| matches!(patt, (Pattern::Splattern(_), _)));
if x.len() > y.len() || (!has_splat && x.len() != y.len()) {
return None;
};
let to = ctx.len();
let mut matched = vec![];
for (pattern, _) in x {
match pattern {
Pattern::Pair(key, patt) => {
if let Some(val) = y.get(key) {
if match_pattern(&patt.0, val, ctx).is_none() {
while ctx.len() > to {
ctx.pop();
}
return None;
} else {
matched.push(key);
}
} else {
return None;
};
}
Pattern::Splattern(pattern) => match pattern.0 {
Pattern::Word(w) => {
// TODO: find a way to take ownership
// this will ALWAYS make structural changes, because of this clone
// we want opportunistic mutation if possible
let mut unmatched = y.clone();
for key in matched.iter() {
unmatched.remove(*key);
}
ctx.push((w.to_string(), Value::Dict(unmatched)));
}
Pattern::Placeholder => (),
_ => unreachable!(),
},
_ => unreachable!(),
}
}
Some(ctx)
}
_ => None,
}
} }
pub fn match_clauses<'src>( impl<'a> Vm<'a> {
value: &Value<'src>, pub fn new(chunk: &'a Chunk) -> Vm<'a> {
clauses: &'src [MatchClause], Vm {
ctx: &mut Context<'src>, chunk,
) -> Result<Value<'src>, LudusError> { stack: vec![],
let to = ctx.len(); ip: 0,
for MatchClause { patt, body, guard } in clauses.iter() { return_register: Value::Nil,
if let Some(ctx) = match_pattern(&patt.0, value, ctx) { matches: false,
let pass_guard = match guard { match_depth: 0,
None => true, result: None,
Some((ast, _)) => {
let guard_res = eval(ast, ctx);
match &guard_res {
Err(_) => return guard_res,
Ok(val) => val.bool(),
}
}
};
if !pass_guard {
while ctx.len() > to {
ctx.pop();
}
continue;
}
let res = eval(&body.0, ctx);
while ctx.len() > to {
ctx.pop();
}
return res;
} }
} }
Err(LudusError {
msg: "no match".to_string(),
})
}
pub fn apply<'src>( pub fn push(&mut self, value: Value) {
callee: Value<'src>, self.stack.push(value);
caller: Value<'src>,
ctx: &mut Context,
) -> Result<Value<'src>, LudusError> {
match (callee, caller) {
(Value::Keyword(kw), Value::Dict(dict)) => {
if let Some(val) = dict.get(kw) {
Ok(val.clone())
} else {
Ok(Value::Nil)
}
}
(Value::Dict(dict), Value::Keyword(kw)) => {
if let Some(val) = dict.get(kw) {
Ok(val.clone())
} else {
Ok(Value::Nil)
}
}
(Value::Fn(f), Value::Tuple(args)) => {
let args = Value::Tuple(args);
match_clauses(&args, f.body, ctx)
}
(Value::Fn(_f), Value::Args(_args)) => todo!(),
(_, Value::Keyword(_)) => Ok(Value::Nil),
(_, Value::Args(_)) => Err(LudusError {
msg: "you may only call a function".to_string(),
}),
(Value::Base(f), Value::Tuple(args)) => match f {
Base::Nullary(f) => {
if args.len() != 0 {
Err(LudusError {
msg: "wrong arity: expected 0 arguments".to_string(),
})
} else {
Ok(f())
}
}
Base::Unary(f) => {
if args.len() != 1 {
Err(LudusError {
msg: "wrong arity: expected 1 argument".to_string(),
})
} else {
Ok(f(&args[0]))
}
}
Base::Binary(r#fn) => {
if args.len() != 2 {
Err(LudusError {
msg: "wrong arity: expected 2 arguments".to_string(),
})
} else {
Ok(r#fn(&args[0], &args[1]))
}
}
Base::Ternary(f) => {
if args.len() != 3 {
Err(LudusError {
msg: "wrong arity: expected 3 arguments".to_string(),
})
} else {
Ok(f(&args[0], &args[1], &args[2]))
}
}
},
_ => unreachable!(),
} }
}
pub fn eval<'src, 'a>( pub fn pop(&mut self) -> Value {
ast: &'src Ast, self.stack.pop().unwrap()
ctx: &'a mut Vec<(String, Value<'src>)>, }
) -> Result<Value<'src>, LudusError> {
match ast { pub fn peek(&self) -> &Value {
Ast::Nil => Ok(Value::Nil), self.stack.last().unwrap()
Ast::Boolean(b) => Ok(Value::Boolean(*b)), }
Ast::Number(n) => Ok(Value::Number(*n)),
Ast::Keyword(k) => Ok(Value::Keyword(k)), fn print_stack(&self) {
Ast::String(s) => Ok(Value::InternedString(s)), let inner = self
Ast::Interpolated(parts) => { .stack
let mut interpolated = String::new(); .iter()
for part in parts { .map(|val| val.to_string())
match &part.0 { .collect::<Vec<_>>()
StringPart::Data(s) => interpolated.push_str(s.as_str()), .join("|");
StringPart::Word(w) => { println!("{:04}: [{inner}] {}", self.ip, self.return_register);
let val = if let Some((_, value)) = }
ctx.iter().rev().find(|(name, _)| w == name)
{ fn print_debug(&self) {
value.clone() self.print_stack();
} else { self.chunk.dissasemble_instr(self.ip);
return Err(LudusError { }
msg: format!("unbound name {w}"),
}); pub fn run(&mut self) -> &Result<Value, Panic> {
}; while self.result.is_none() {
interpolated.push_str(val.interpolate().as_str()) self.interpret();
}
self.result.as_ref().unwrap()
}
pub fn panic(&mut self, msg: &'static str) {
self.result = Some(Err(Panic(msg)));
}
pub fn interpret(&mut self) {
let Some(byte) = self.chunk.bytecode.get(self.ip) else {
self.result = Some(Ok(self.stack.pop().unwrap()));
return;
};
if crate::DEBUG_RUN {
self.print_debug();
}
let op = Op::from_u8(*byte).unwrap();
use Op::*;
match op {
Nil => {
self.push(Value::Nil);
self.ip += 1;
}
True => {
self.push(Value::True);
self.ip += 1;
}
False => {
self.push(Value::False);
self.ip += 1;
}
Constant => {
let const_idx = self.chunk.bytecode[self.ip + 1];
let value = self.chunk.constants[const_idx as usize].clone();
self.push(value);
self.ip += 2;
}
Jump => {
let jump_len = self.chunk.bytecode[self.ip + 1];
self.ip += jump_len as usize + 2;
}
JumpBack => {
let jump_len = self.chunk.bytecode[self.ip + 1];
self.ip -= jump_len as usize;
}
JumpIfFalse => {
let jump_len = self.chunk.bytecode[self.ip + 1];
let cond = self.pop();
match cond {
Value::Nil | Value::False => {
self.ip += jump_len as usize + 2;
}
_ => {
self.ip += 2;
} }
StringPart::Inline(_) => unreachable!(),
} }
} }
Ok(Value::AllocatedString(Rc::new(interpolated))) JumpIfZero => {
} let jump_len = self.chunk.bytecode[self.ip + 1];
Ast::Block(exprs) => { let cond = self.pop();
let to = ctx.len(); match cond {
let mut result = Value::Nil; Value::Number(0.0) => {
for (expr, _) in exprs { self.ip += jump_len as usize + 2;
result = eval(expr, ctx)?; self.interpret()
}
while ctx.len() > to {
ctx.pop();
}
Ok(result)
}
Ast::If(cond, if_true, if_false) => {
let truthy = eval(&cond.0, ctx)?.bool();
if truthy {
eval(&if_true.0, ctx)
} else {
eval(&if_false.0, ctx)
}
}
Ast::List(members) => {
let mut vect = Vector::new();
for member in members {
if let Ast::Splat(_) = member.0 {
let to_splat = eval(&member.0, ctx)?;
match to_splat {
Value::List(list) => vect.append(list),
_ => {
return Err(LudusError {
msg: "only lists may be splatted into lists".to_string(),
})
}
} }
} else { Value::Number(..) => {
vect.push_back(eval(&member.0, ctx)?) self.ip += 2;
self.interpret()
}
_ => self.panic("repeat requires a number"),
} }
} }
Ok(Value::List(vect)) Pop => {
} self.pop();
Ast::Tuple(members) => { self.ip += 1;
let mut vect = Vec::new();
for member in members {
vect.push(eval(&member.0, ctx)?);
} }
Ok(Value::Tuple(Rc::new(vect))) PushBinding => {
} let binding_idx = self.chunk.bytecode[self.ip + 1] as usize;
Ast::Word(w) | Ast::Splat(w) => { let binding_value = self.stack[binding_idx].clone();
let val = if let Some((_, value)) = ctx.iter().rev().find(|(name, _)| w == name) { self.push(binding_value);
value.clone() self.ip += 2;
} else {
return Err(LudusError {
msg: format!("unbound name {w}"),
});
};
Ok(val)
}
Ast::Let(patt, expr) => {
let val = eval(&expr.0, ctx)?;
match match_pattern(&patt.0, &val, ctx) {
Some(_) => Ok(val),
None => Err(LudusError {
msg: "No match".to_string(),
}),
} }
} Store => {
Ast::Placeholder => Ok(Value::Placeholder), self.return_register = self.pop();
Ast::Error => unreachable!(), self.push(Value::Nil);
Ast::Arguments(a) => { self.ip += 1;
let mut args = vec![];
for (arg, _) in a.iter() {
let arg = eval(arg, ctx)?;
args.push(arg);
} }
if args.iter().any(|arg| matches!(arg, Value::Placeholder)) { Load => {
Ok(Value::Args(Rc::new(args))) let mut value = Value::Nil;
} else { swap(&mut self.return_register, &mut value);
Ok(Value::Tuple(Rc::new(args))) self.push(value);
self.ip += 1;
} }
} ResetMatch => {
Ast::Dict(terms) => { self.matches = false;
let mut dict = HashMap::new(); self.match_depth = 0;
for term in terms { self.ip += 1;
let (term, _) = term;
match term {
Ast::Pair(key, value) => {
let value = eval(&value.0, ctx)?;
dict.insert(*key, value);
}
Ast::Splat(_) => {
let resolved = eval(term, ctx)?;
let Value::Dict(to_splat) = resolved else {
return Err(LudusError {
msg: "cannot splat non-dict into dict".to_string(),
});
};
dict = to_splat.union(dict);
}
_ => unreachable!(),
}
} }
Ok(Value::Dict(dict)) MatchWord => {
} self.matches = true;
Ast::Box(name, expr) => { self.ip += 1;
let val = eval(&expr.0, ctx)?;
let boxed = Value::Box(name, Rc::new(RefCell::new(val)));
ctx.push((name.to_string(), boxed.clone()));
Ok(boxed)
}
Ast::Synthetic(root, first, rest) => {
let root = eval(&root.0, ctx)?;
let first = eval(&first.0, ctx)?;
let mut curr = apply(root, first, ctx)?;
for term in rest.iter() {
let next = eval(&term.0, ctx)?;
curr = apply(curr, next, ctx)?;
} }
Ok(curr) MatchNil => {
} let idx = self.stack.len() - self.match_depth as usize - 1;
Ast::When(clauses) => { if self.stack[idx] == Value::Nil {
for clause in clauses.iter() { self.matches = true;
let WhenClause { cond, body } = &clause.0;
if eval(&cond.0, ctx)?.bool() {
return eval(&body.0, ctx);
}; };
self.ip += 1;
self.interpret()
} }
Err(LudusError { MatchTrue => {
msg: "no match".to_string(), let idx = self.stack.len() - self.match_depth as usize - 1;
}) if self.stack[idx] == Value::True {
} self.matches = true;
Ast::Match(value, clauses) => { };
let value = eval(&value.0, ctx)?; self.ip += 1;
match_clauses(&value, clauses, ctx) }
} MatchFalse => {
Ast::Fn(name, clauses, doc) => { let idx = self.stack.len() - self.match_depth as usize - 1;
let doc = doc.map(|s| s.to_string()); if self.stack[idx] == Value::False {
let the_fn = Value::Fn::<'src>(Rc::new(Fn::<'src> { self.matches = true;
name: name.to_string(),
body: clauses,
doc,
}));
ctx.push((name.to_string(), the_fn.clone()));
Ok(the_fn)
}
Ast::FnDeclaration(_name) => todo!(),
Ast::Panic(msg) => {
let msg = eval(&msg.0, ctx)?;
Err(LudusError {
msg: msg.to_string(),
})
}
Ast::Repeat(times, body) => {
let times_num = match eval(&times.0, ctx) {
Ok(Value::Number(n)) => n as usize,
_ => {
return Err(LudusError {
msg: "repeat may only take numbers".to_string(),
})
} }
}; self.ip += 1;
for _ in 0..times_num {
eval(&body.0, ctx)?;
} }
Ok(Value::Nil) PanicIfNoMatch => {
} if !self.matches {
Ast::Do(terms) => { self.panic("no match");
let mut result = eval(&terms[0].0, ctx)?;
for (term, _) in terms.iter().skip(1) {
let next = eval(term, ctx)?;
let arg = Value::Tuple(Rc::new(vec![result]));
result = apply(next, arg, ctx)?;
}
Ok(result)
}
Ast::Pair(..) => {
unreachable!()
}
Ast::Loop(init, clauses) => {
let mut args = eval(&init.0, ctx)?;
loop {
let result = match_clauses(&args, clauses, ctx)?;
if let Value::Recur(recur_args) = result {
args = Value::Tuple(Rc::new(recur_args));
} else { } else {
return Ok(result); self.ip += 1;
} }
} }
} MatchConstant => {
Ast::Recur(args) => { let const_idx = self.chunk.bytecode[self.ip + 1];
let mut vect = Vec::new(); let idx = self.stack.len() - self.match_depth as usize - 1;
for arg in args { self.matches = self.stack[idx] == self.chunk.constants[const_idx as usize];
vect.push(eval(&arg.0, ctx)?); self.ip += 2;
} }
Ok(Value::Recur(vect)) PushTuple => {
let tuple_len = self.chunk.bytecode[self.ip + 1];
let tuple_members = self.stack.split_off(self.stack.len() - tuple_len as usize);
let tuple = Value::Tuple(Rc::new(tuple_members));
self.stack.push(tuple);
self.ip += 2;
}
PushList => {
let list_len = self.chunk.bytecode[self.ip + 1];
let list_members = self.stack.split_off(self.stack.len() - list_len as usize);
let list = Value::List(Box::new(Vector::from(list_members)));
self.stack.push(list);
self.ip += 2;
}
PushDict => {
let dict_len = self.chunk.bytecode[self.ip + 1] as usize * 2;
let dict_members = self.stack.split_off(self.stack.len() - dict_len);
let mut dict = HashMap::new();
let mut dict_iter = dict_members.iter();
while let Some(kw) = dict_iter.next() {
let Value::Keyword(key) = kw else {
unreachable!()
};
let value = dict_iter.next().unwrap();
dict.insert(*key, value.clone());
}
self.stack.push(Value::Dict(Box::new(dict)));
self.ip += 2;
}
PushBox => {
let val = self.pop();
self.stack.push(Value::Box(Rc::new(RefCell::new(val))));
self.ip += 1;
}
GetKey => {
let key = self.pop();
let Value::Keyword(idx) = key else {
unreachable!()
};
let dict = self.pop();
let value = match dict {
Value::Dict(d) => d.as_ref().get(&idx).unwrap_or(&Value::Nil).clone(),
_ => Value::Nil,
};
self.push(value);
self.ip += 1;
}
MatchTuple => {
todo!()
}
JumpIfNoMatch => {
let jump_len = self.chunk.bytecode[self.ip + 1] as usize;
if !self.matches {
self.ip += jump_len + 2;
} else {
self.ip += 2;
}
}
TypeOf => {
let val = self.pop();
let type_of = self.chunk.kw_from(val.type_of()).unwrap();
self.push(type_of);
self.ip += 1;
}
Truncate => {
let val = self.pop();
if let Value::Number(x) = val {
self.push(Value::Number(x as usize as f64));
self.ip += 1;
} else {
self.panic("repeat requires a number");
}
}
Decrement => {
let val = self.pop();
if let Value::Number(x) = val {
self.push(Value::Number(x - 1.0));
self.ip += 1;
self.interpret()
} else {
self.panic("you may only decrement a number");
}
}
Duplicate => {
self.push(self.peek().clone());
self.ip += 1;
self.interpret()
}
MatchDepth => {
self.match_depth = self.chunk.bytecode[self.ip + 1];
self.ip += 2;
self.interpret()
}
PanicNoWhen | PanicNoMatch => self.panic("no match"),
} }
} }
} }