This is a note about an idea I posted in Twitter threads (here, here, and here). I think it’s important to capture the idea before it gets lost when blogs and Twitter disappear.
The problem with “semantic highlighting” is that it’s nearly impossible for a programmer to augment their editor with rules that perform the kind of check they want. (I would just call it syntactic highlighting, because there are really many levels of highlighting: lexical, CFG, static semantics, and dynamic semantics.) TextMate highlights the lexical syntax of a program. LSP “semantic highlighting” considers the static semantics of the program. But if you would like the editor to highlight something more interesting, like the live/dead analysis of a variable, or constant propagation, you’re basically out of luck.
Building an entire parse tree does not, by itself, solve the problem of identifying the parts of the program that you are interested in. What you are really interested in is paths through the tree. XPath, a language for selecting tree nodes with path expressions, is the best solution here.
With a grammar and a parse tree decorated with the results of semantic analysis, many types of highlighting are now possible using an XPath-based solution. For example, using Antlr’s notation for lexical and CFG symbols, comments could be tagged with “//COMMENT => green”, keywords tagged with “//keyword => blue”, and fields tagged with “//field_declaration//variable_declarator/identifier => pink”. To employ a new highlighting, one would simply tell the editor to re-tag the text using a new collection of rules.
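To make the rule idea concrete, here is a minimal sketch in Python. It assumes the parse tree has been serialized as XML with one element per grammar symbol. The toy tree, the `RULES` list, and the `tag_tree` helper are all hypothetical names invented for illustration, not part of any editor or of Antlr. Python's standard `xml.etree.ElementTree` supports only a subset of XPath and only relative paths, so each rule is prefixed with `.` before it is evaluated.

```python
import xml.etree.ElementTree as ET

# Toy parse tree serialized as XML. The element names stand in for
# lexical and CFG symbols; this is an illustrative assumption, not the
# output of a real parser.
TREE_XML = """
<compilation_unit>
  <COMMENT>// counter</COMMENT>
  <class_declaration>
    <keyword>class</keyword>
    <identifier>C</identifier>
    <field_declaration>
      <keyword>int</keyword>
      <variable_declarator>
        <identifier>count</identifier>
      </variable_declarator>
    </field_declaration>
  </class_declaration>
</compilation_unit>
"""

# Highlighting rules: XPath expression => color, as in the text above.
RULES = [
    ("//COMMENT", "green"),
    ("//keyword", "blue"),
    ("//field_declaration//variable_declarator/identifier", "pink"),
]


def tag_tree(root, rules):
    """Return (text, color) pairs for every node selected by a rule."""
    tags = []
    for xpath, color in rules:
        # ElementTree rejects absolute paths, so anchor each rule at the root.
        for node in root.findall("." + xpath):
            text = "".join(node.itertext()).strip()
            tags.append((text, color))
    return tags


root = ET.fromstring(TREE_XML)
for text, color in tag_tree(root, RULES):
    print(f"{color:5} {text}")
```

Note that the class name `C` is not tagged, because the third rule only selects identifiers that sit under a `field_declaration`. Switching to a different highlighting scheme amounts to calling `tag_tree` again with a different rule list.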
The only problem with this idea is that someone still has to implement the static semantic analysis for whatever you are interested in highlighting.
–Ken