
Moo-ignore

Moo-ignore (🐄) is a wrapper around the moo tokenizer/lexer generator that provides a nearley.js-compatible lexer with the ability to ignore specified tokens.

Usage

Install it:

$ npm install moo-ignore

Exports

This module exports an object containing the makeLexer factory function and the moo object itself (the same object you would get from const moo = require("moo")):

const { makeLexer, moo } = require("moo-ignore");

Ignoring tokens

You can then use it in your Nearley.js grammar and ignore tokens such as whitespace and comments:

@{%
const tokens = require("./tokens");
const { makeLexer } = require("moo-ignore");

let lexer = makeLexer(tokens);
lexer.ignore("ws", "comment");

const getType = ([t]) => t.type;
%}

@lexer lexer

S -> FUN LP name COMMA name COMMA name RP 
      DO 
        DO  END SEMICOLON 
        DO END 
      END
     END

name  ->      %identifier {% getType %}
COMMA ->       ","        {% getType %}
LP    ->       "("        {% getType %}
RP    ->       ")"        {% getType %}
END   ->      %end        {% getType %}
DO    ->      %dolua      {% getType %}
FUN   ->      %fun        {% getType %}
SEMICOLON ->  ";"         {% getType %}

Alternatively, you can specify the tokens to ignore at construction time, in the call to makeLexer:

let lexer = makeLexer(tokens, ["ws", "comment"]);

You can also combine both approaches:

let lexer = makeLexer(tokens, ["ws"]);
lexer.ignore("comment");
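Conceptually, the ignore mechanism can be sketched in plain JavaScript. The wrapLexer function below is a simplified, hypothetical illustration (not moo-ignore's actual source): the wrapped next() silently skips any token whose type is in the ignore set, so the parser never sees those tokens.

```javascript
// Hypothetical sketch of token ignoring (not moo-ignore's real implementation).
// Wraps a lexer's next() so that tokens with ignored types are skipped.
function wrapLexer(lexer, ...toIgnore) {
  const ignored = new Set(toIgnore);
  const rawNext = lexer.next.bind(lexer);
  lexer.next = () => {
    let tok = rawNext();
    // Keep pulling tokens until one is not in the ignore set (or input ends).
    while (tok && ignored.has(tok.type)) tok = rawNext();
    return tok;
  };
  // ignore() adds more types to the set, mirroring moo-ignore's API.
  lexer.ignore = (...types) => types.forEach((t) => ignored.add(t));
  return lexer;
}

// Usage with a fake token stream standing in for a moo lexer:
const fake = {
  tokens: [{ type: "ws" }, { type: "identifier" }, { type: "comment" }, { type: "rp" }],
  next() { return this.tokens.shift(); },
};
const lexer = wrapLexer(fake, "ws", "comment");
console.log(lexer.next().type); // "identifier"
console.log(lexer.next().type); // "rp"
```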

For the sake of completeness, here are the contents of the tokens.js file used in the code above:

const { moo } = require("moo-ignore");

module.exports = {
    ws: { match: /\s+/, lineBreaks: true },
    comment: /#[^\n]*/,
    lp: "(",
    rp: ")",
    comma: ",",
    semicolon: ";",
    identifier: {
        match: /[a-z_][a-z_0-9]*/,
        type: moo.keywords({
            fun: "fun",
            end: "end",
            dolua: "do"
        })
    }
}
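The moo.keywords call in the identifier rule is what turns the matched words fun, end, and do into their own token types. A simplified sketch of that idea (illustrative only, not moo's actual implementation) looks like this:

```javascript
// Hypothetical sketch of keyword resolution (not moo's real code).
// The matched identifier text is looked up in a reverse map; a hit
// promotes the token from "identifier" to the keyword's own type.
function makeKeywordResolver(map) {
  const byWord = new Map(Object.entries(map).map(([type, word]) => [word, type]));
  return (text) => byWord.get(text); // undefined → token stays an identifier
}

const resolve = makeKeywordResolver({ fun: "fun", end: "end", dolua: "do" });
console.log(resolve("do"));     // "dolua"
console.log(resolve("myname")); // undefined
```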

See the tests folder in this distribution for more usage examples. Here is a program that tests the example above:

const nearley = require("nearley");
const grammar = require("./test-grammar.js");

let s = `
fun (id, idtwo, idthree)  
  do   #hello
    do end;
    do end # another comment
  end 
end`;

try {
  const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));
  parser.feed(s);
  console.log(parser.results[0]); /* [ 'fun', 'lp', 'identifier', 'comma',
          'identifier', 'comma', 'identifier', 'rp',
          'dolua',      'dolua', 'end', 'semicolon',
          'dolua',      'end', 'end', 'end' ] */
} catch (e) {
    console.log(e);
}

The eof option: Emitting a token to signal the End Of File

The last argument of makeLexer is an object with configuration options:

let lexer = makeLexer(tokens, [ /* tokens to ignore */ ], { /* options */ });

Currently, the only supported option is eof.

Remember that lexers generated by moo emit undefined when the end of the input is reached. This option changes this behavior.

If the option { eof: true } is specified, and a token with the name EOF: "termination string" appears in the tokens specification, moo-ignore will append the "termination string" to the end of the input stream.

const { makeLexer } = require("moo-ignore");
const Tokens = {
  EOF: "__EOF__",
  WHITES: { match: /\s+/, lineBreaks: true },
  /* etc. */
};

let lexer = makeLexer(Tokens, ["WHITES"], { eof: true });

The generated lexer will emit this EOF token when the end of the input is reached.
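A plain-JavaScript sketch of this behavior (a simplified assumption, not the library's actual source) is that reset() appends the EOF marker string to the input, so the lexer's EOF rule matches it as one final token before next() starts returning undefined:

```javascript
// Hypothetical sketch of the { eof: true } option (not moo-ignore's real code).
// Wraps reset() so the EOF marker string is appended to every input.
function withEof(lexer, eofString) {
  const rawReset = lexer.reset.bind(lexer);
  lexer.reset = (data, state) => rawReset(data + eofString, state);
  return lexer;
}

// Usage with a stub standing in for a moo lexer:
const stub = { data: "", reset(d) { this.data = d; } };
withEof(stub, "__EOF__").reset("fun end");
console.log(stub.data); // "fun end__EOF__"
```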

Inside your grammar you'll have to make explicit use of the EOF token. Something like this:

@{%
const { lexer } = require('./lex.js');
%}
@lexer lexer
program -> expression %EOF {% id %}
# ... other rules

Version: 2.5.3 • License: ISC • Collaborators: crguezl, daniel-del-castillo