Description
Version: 1.26.1
Commit: 493869e
Date: 2018-08-16T18:38:57.434Z
Electron: 2.0.5
Chrome: 61.0.3163.100
Node.js: 8.9.3
V8: 6.1.534.41
Architecture: x64
While working on a grammar issue for PowerShell, I tried a regex solution that looks like it should work, but doesn't. The regex works in .NET (i.e. PowerShell itself), but not in VS Code. I know the two regex engines are not the same, but I have had success with the same pattern before in a slightly different context.
I need to be able to differentiate between:
$variable-split "`n"
some-file-named-split
I am trying to catch the -split operator directly after the variable, with no space in between, without also latching onto the -split portion of a command name or file name.
The following regex demonstrates the idea: when used in PowerShell, it matches both the variable and the -split operator, or neither of them. This is just a demonstration; the actual syntax processing is much more complicated. Note the alternation between a negative lookbehind for \w and an anchor at the end of the last match (\G).
PS C:\> $match = "(\$[a-zA-Z][a-zA-Z?_]*)|((?i:(?<!\w)|\G)-split\b)"
PS C:\> [regex]::matches("`$hello-split", $match)
Groups : {0, 1, 2}
Success : True
Name : 0
Captures : {0}
Index : 0
Length : 6
Value : $hello
Groups : {0, 1, 2}
Success : True
Name : 0
Captures : {0}
Index : 6
Length : 6
Value : -split
PS C:\> [regex]::matches("hello-split", $match)
(no matches returned)
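As a rough illustration of the behavior shown in the transcript above, the sketch below emulates the `(?<!\w)|\G` check by hand in Python. Python's standard `re` module has no `\G`, so the anchor is tracked in a plain variable. The names `scan`, `VARIABLE`, `OPERATOR`, and `WORD_CHAR` are hypothetical, and this only models the .NET result above, not how VS Code tokenizes.

```python
import re

# Same sub-patterns as the demonstration regex, minus the position check.
VARIABLE = re.compile(r"\$[a-zA-Z][a-zA-Z?_]*")
OPERATOR = re.compile(r"-split\b", re.IGNORECASE)
WORD_CHAR = re.compile(r"\w")


def scan(text):
    """Return (kind, value, index) tuples for variables and -split operators."""
    tokens = []
    pos = 0
    anchor = 0  # stand-in for \G: end of the previous match, or 0 at the start
    while pos < len(text):
        kind, m = "variable", VARIABLE.match(text, pos)
        if m is None:
            kind, m = "operator", OPERATOR.match(text, pos)
            if m is not None:
                after_word = pos > 0 and WORD_CHAR.match(text[pos - 1]) is not None
                # Reject unless the lookbehind holds or we sit exactly on the anchor.
                if after_word and pos != anchor:
                    m = None
        if m is None:
            pos += 1  # nothing matched here; the anchor stays where it was
            continue
        tokens.append((kind, m.group(), m.start()))
        pos = anchor = m.end()  # the next \G position is the end of this match
    return tokens
```

With this sketch, `scan("$hello-split")` yields the variable at index 0 and the operator at index 6, while `scan("hello-split")` yields nothing, because the `-split` there follows a word character and does not sit on the anchor.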
I know VS Code supports \G, as I have used it in a repository item that was included in another pattern's content. There it successfully handled comment-based help keywords in PowerShell. Once you start a line comment, you may use additional #s and then additional whitespace, but after that a period must start the comment-based help keyword. The same keyword can also be used in a multiline comment, either right after the start (but with no extra #s) or on a new line, as long as only whitespace appears before the period. Because it is a single repository item, a combination of (\G|^)\s*\. was able to work for all three conditions.
### .synopsis
<# .description
.notes
#>
Even GitHub's parser doesn't get them all.
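To make the three conditions concrete, here is a small Python sketch of the `(\G|^)\s*\.` idea. It assumes the enclosing comment rule has already consumed the comment opener and hands over the position where it stopped. That position plays the role of `\G`, since Python's `re` has no such anchor. The function name `help_keyword`, the `anchor` parameter, and the `(\w+)` capture for the keyword name are additions for illustration and are not part of the actual grammar.

```python
import re

# Optional whitespace, a period, then the keyword name.
# The name capture is an addition for this sketch.
HELP_KEYWORD = re.compile(r"\s*\.(\w+)")


def help_keyword(line, anchor=None):
    """Return the keyword found at the anchor or at column 0, else None."""
    # Try the anchor position first (the \G case), then the line start (the ^ case).
    starts = [0] if anchor is None else [anchor, 0]
    for start in starts:
        m = HELP_KEYWORD.match(line, start)
        if m is not None:
            return m.group(1)
    return None
```

For example, `help_keyword("### .synopsis", 3)` and `help_keyword("<# .description", 2)` take the anchor path, `help_keyword("   .notes")` takes the line-start path, and `help_keyword("# see .notes", 1)` returns `None` because other text precedes the period.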
Here the difference is that the $variable is matched by its own repository item and the -split operator is captured by its own item, both at the root level of the syntax. In this arrangement, \G appears to do nothing and causes no matches. In my testing, the only place it appears to cause a match is at the very beginning of the very first line of a file.
I can understand the difference between the two scenarios above (the comment-based help keyword is an included item in both the comment line and the comment block, while the variable and the operator are separate items included in the root of the syntax), but being able to tie a match to the end of a previous match shouldn't be restricted to items included inside the previous match's scope.
My actual match string (just an edit of the original):
"match": "(?:(?<!\\w|!)|\\G)-(?i:join|split)(?!\\p{L})"
Unless I have overlooked some obvious solution to this problem, I expect to hear that this is a limitation of the regex engine used by VS Code's TextMate syntax system. Hopefully that's not the consensus, because TextMate is already limited enough.
My current PowerShell.tmLanguage.json file can be found in the wip_goal branch of msftrncs/PowerShell.tmLanguage. I've been working through the issues reported on the PowerShell/EditorSyntax repository, but have not yet started to generate the edits and pull requests against that repository.