Case sensitivity handling and a note about it in the docs #117

@autioch

Hi!
First of all, I'd like to thank you for creating such a fast lexer. I've been using it along with nearley.js in various projects. It really changed my way of approaching any topic related to text parsing.

I'm currently working on a language that requires all tokens to be case-insensitive, without exception. For now, following tips I've found on the internet (and in issues in this repo), I've been using custom helpers that transform token text into a case-insensitive regex without the /i flag. This works, but it's not pretty. Also, even if the concern is unfounded, I have doubts about the overall performance of my parser.

Why am I creating this issue? I would like a concise description of how to approach situations where all (or some) tokens are case-insensitive. An example would be nice as well.

Let's say that my lexer usage looks like this:

import moo from 'moo';

const lexer = moo.compile({
  /* This doesn't care about case sensitivity. */
  STRING: /"(?:[^\\]|\\.)*?"/,

  /* Case sensitivity doesn't apply here. */
  NUMBER: /(?:\.\d+|\d+\.?\d*)/,

  /* Case sensitivity doesn't apply here. */
  ADD: '+',

  /* Manually force case insensitivity */
  IN: ['in', 'iN', 'In', 'IN'],

  /* Use a helper */
  ABS: textToCaseInsensitiveRegex('ABS')
});
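
For context, the helper used for ABS above is not shown. A minimal sketch of what such a helper might look like follows. The function body is an assumption made for illustration (only the name comes from the snippet above), and it assumes plain ASCII keywords. It expands each letter into a two-character class so that no /i flag is needed:

```javascript
// Hypothetical implementation: only the name appears in the snippet above,
// the body is an illustrative assumption for ASCII keywords.
function textToCaseInsensitiveRegex(text) {
  const source = text
    .split('')
    .map((char) => {
      const lower = char.toLowerCase();
      const upper = char.toUpperCase();
      if (lower === upper) {
        // Not a cased letter: escape regex metacharacters and match literally.
        return char.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
      }
      // Cased letter: accept either form, e.g. 'a' becomes [aA].
      return `[${lower}${upper}]`;
    })
    .join('');
  return new RegExp(source);
}

const absPattern = textToCaseInsensitiveRegex('ABS');
console.log(absPattern.source); // → [aA][bB][sS]
```

With something like this, a rule such as IN could be written as textToCaseInsensitiveRegex('in') instead of listing every spelling by hand, though whether that is the recommended approach is exactly the question here.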
