Skip to content
This repository was archived by the owner on Feb 22, 2024. It is now read-only.

Play with explicit AST #20

Merged
merged 7 commits into from
Oct 8, 2020
Merged

Play with explicit AST #20

merged 7 commits into from
Oct 8, 2020

Conversation

mkmik
Copy link
Member

@mkmik mkmik commented Oct 6, 2020

This is just a draft to see whether walking the pest parse tree becomes easier to reason about if we implement the interpreter around an abstract syntax tree instead of using a chain of selectors.

Probably it could become even simpler if I tweaked the grammar a bit, but I resisted doing that in the first pass.

One of the most interesting aspects of this approach is that the pest parse tree could be, in principle, mechanically mapped on on an abstract tree, using tools such as pest_ast+pest_derive

@glyn
Copy link
Contributor

glyn commented Oct 6, 2020

I quite like having an AST, but since this AST represents a strict superset of valid selectors, it adds a bit of complexity as I worry about whether an AST could be produced which doesn't corresponding to a valid selector. I guess that issue is implicit in the current implementation, but at least it doesn't stand out so much.

@mkmik
Copy link
Member Author

mkmik commented Oct 7, 2020

I quite like having an AST, but since this AST represents a strict superset of valid selectors, it adds a bit of complexity as I worry about whether an AST could be produced which doesn't corresponding to a valid selector. I guess that issue is implicit in the current implementation, but at least it doesn't stand out so much.

ideally a good AST should make invalid selectors unrepresentable; just to be clear, the case I can see is a "rootless" selector.
are there other things you can see?

@mkmik mkmik force-pushed the ast branch 2 times, most recently from 222cc12 to 1fbd1e0 Compare October 7, 2020 08:17
@mkmik
Copy link
Member Author

mkmik commented Oct 7, 2020

hmm, no actually the Root path enum variant must be present otherwise the type recursion doesn't terminate;
so I guess don't understand what you meant :-)

@mkmik
Copy link
Member Author

mkmik commented Oct 7, 2020

is it because the this AST allows: Selector::Dot(Index::Number(0)) ?

@glyn
Copy link
Contributor

glyn commented Oct 7, 2020

is it because the this AST allows: Selector::Dot(Index::Number(0)) ?

Yes, precisely.

@mkmik mkmik force-pushed the ast branch 2 times, most recently from df1f6c6 to 1641fd5 Compare October 7, 2020 09:19
Copy link
Contributor

@glyn glyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Just a couple of nits on the tree diagrams.

@mkmik
Copy link
Member Author

mkmik commented Oct 7, 2020

ok, so now I fixed the AST data model so that .4 is not representable;

I'd like to see how hard it would be to also make [1,"foo"] unrepresentable

@glyn
Copy link
Contributor

glyn commented Oct 7, 2020

I'd like to see how hard it would be to also make [1,"foo"] unrepresentable

That selector is valid and should be representable. Try the following tests:

  }, {
    "name": "mixed union, applied to array",
    "selector": "$[1,\"foo\"]",
    "document": ["first", "second"],
    "result": ["second"]
  }, {
    "name": "mixed union, applied to object",
    "selector": "$[1,\"foo\"]",
    "document": {"foo" : "A", "b" : "B"},
    "result": ["A"]

src/ast.rs Outdated
/// ^ \
/// / \ [Number(1), Number(2)]
/// / \
/// Root ___/ \___ DotName("foo")
Copy link
Contributor

@glyn glyn Oct 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two diagrams still seem inconsistent. I'd expect the part of the tree nearest the root to be identical in both cases since $.foo is a common prefix of both selectors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because the expressions have right to left precedence

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still don't get it. $.foo.bar produces the AST:

Sel(Sel(Root, DotName("foo")), DotName("bar"))

and $.foo[1,2]["bar"] produces:

Sel(Sel(Sel(Root, DotName("foo")), Union([Number(1), Number(2)])), Union([Field("bar")]))

Root appears inside Sel(Root, DotName("foo")) in both cases. Shouldn't this subtree look the same in both diagrams?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh :brainfart:; fixed

Value::Object(m) => Box::new(m.values()),
Value::Array(a) => Box::new(a.iter()),
_ => Box::new(std::iter::empty()),
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the readability of this approach. (It's a pity Box::new can't be factored out of the outer match, but that's rust for you.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully soon rust will have the box keyword which at least removes the need for extra parentheses

@mkmik
Copy link
Member Author

mkmik commented Oct 7, 2020

That selector is valid and should be representable. Try the following tests:

Ah yes, you're right

@mkmik mkmik force-pushed the ast branch 2 times, most recently from 1031ec9 to d35dad7 Compare October 7, 2020 21:10
/// ```
///
/// Selectors are left associative, thus `$.foo[1,2]["bar"]` behaves
/// like (pseudocode) `(($.foo)[1,2])["bar"]`; thus the root of the resulting
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth pointing out that the Root selector ($) is not related to the root of the AST? Might confuse people otherwise.

Copy link
Contributor

@glyn glyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few naming issues. What else do you have in mind for this PR before it is ready for merging?

@mkmik
Copy link
Member Author

mkmik commented Oct 8, 2020

A few naming issues. What else do you have in mind for this PR before it is ready for merging?

nothing that can't be added in smaller subsequent PRs

Copy link
Contributor

@glyn glyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks!

@glyn glyn merged commit 5f537a2 into jsonpath-standard:main Oct 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants