I’ve written a basic Bison import for Antlrvsix, and now I’m in the guts of writing a Flex import. While writing the Bison import, I soon ran into a problem that I could not deal with immediately: many people write very undisciplined Bison code. The grammar file often contains driver code (for example, calc.y from the official GNU repository), and the static semantics are often wrapped up with the tree construction. While this works, it means the grammar contains target-language code, which is not a great way to write grammars. If you really want an abstract syntax tree instead of a parse tree for semantic checks, you should keep the semantic checks separate from tree construction. For now, I do not copy code blocks to the generated Antlr grammar.
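To make that concrete, here is a rough sketch of what dropping the code blocks looks like. The Bison fragment in the comment is a made-up calculator rule (not the actual calc.y), and the grammar and rule names are my own placeholders; the real import may format its output differently.

```antlr
grammar CalcSketch;

// Bison input (hypothetical):
//   exp : exp '+' exp   { $$ = $1 + $3; }
//       | NUM           { $$ = $1; }
//       ;
//
// Only the grammar structure is carried over; the { ... } actions are left out.

exp : exp '+' exp
    | NUM
    ;

NUM : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

The evaluation that the Bison actions performed would then live in a visitor or listener over the parse tree, outside the grammar.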
For Flex, the problem is that the embedded code exclusively performs semantic checks at the lexer phase, so this code must be included in the generated Antlr lexer grammar. The good news is that it’s not too difficult to convert the patterns in Flex into Antlr lexer rules and modes. You can then convert the code blocks in a Flex rule into your target language. Whether this works or not, I will find out: I’m translating the Flex lexical grammar line by line into an Antlr lexer grammar, and so far, it looks good.
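As a small illustration of the kind of mapping I mean, here is a sketch with a hypothetical Flex fragment in the comment (it is not taken from scan.l). It assumes an exclusive start state becomes a lexer mode, BEGIN becomes a mode command, and the embedded C code is rewritten by hand as a target-language action. The names FlexSketch, COMMENT, and count are invented for the example.

```antlr
lexer grammar FlexSketch;

// Flex input (hypothetical):
//   %x COMMENT
//   %%
//   "/*"             { BEGIN(COMMENT); }
//   <COMMENT>"*/"    { BEGIN(INITIAL); }
//   <COMMENT>.|\n    { /* discard */ }
//   [0-9]+           { count++; return NUMBER; }

// Counter used by the rewritten action; this declaration works for the Java and C# targets.
@members { int count = 0; }

// BEGIN(COMMENT) becomes a mode command.
COMMENT_OPEN : '/*' -> skip, mode(COMMENT) ;

// "return NUMBER" becomes the rule name; the remaining code becomes an action.
NUMBER : [0-9]+ { count++; } ;

// The exclusive start state becomes a lexer mode.
mode COMMENT;
COMMENT_CLOSE : '*/' -> skip, mode(DEFAULT_MODE) ;
COMMENT_TEXT  : . -> skip ;   // '.' in an Antlr lexer rule also matches newlines
```

The pattern-to-rule part is mechanical; the action bodies are the part that still depends on the target language.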
As an aside, I’ve been using Antlrvsix quite a bit myself to write the Flex grammars. I added a refactoring to sort lexer modes alphabetically, since the original Flex “scan.l” code defined the start states in a haphazard order. And, since I’ve switched to the “Dark mode” theme in Visual Studio 2019, I added color selections for grammar symbols to the options so I can see the symbol text more clearly.
–Ken