Tips wanted on regular expression engine implementation
$30-5000 USD
Closed
Posted about 21 years ago
Paid on delivery
Hello fellow coders,

I've been working on a POSIX-compliant regular expression matcher for 18 months and have made some progress. It must be FAST. I need tips and second opinions from other programmers on an algorithm (pseudocode or a plain textual description is fine) for compiling a regular expression pattern into a form of bytecode that the matching engine can execute. Specifically, the engine needs to be POSIX compliant, and I want to hear about capturing subexpressions and about parsing the expression ready for compilation.

This won't turn into a coding project, since I am only looking for tips and advice for my own coding. Thanks so much for your advice. I really want to make this the best regexp the world has ever seen... :)

PS. I have the POSIX docs on the latest regex spec if you'd like to study it.

PPS. Study the source code (in Java) of the Apache Jakarta Project's regexp (package [login to view URL] - download Winzip file here: [login to view URL]) and let me know what you think of its recursive-descent technique, and of using a plain old array for bytecodes as opposed to, for example, a linked list of opcodes and opdata.
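To make the "recursive descent into a plain array of bytecodes" question concrete, here is a minimal sketch in ANSI C. Everything in it is an assumption made for illustration: the opcode names, the `Inst` and `Compiler` types, the `put` helper, and the `MAX_PROG` limit are hypothetical and are not taken from the Jakarta code or from POSIX. It handles only literals, capturing groups, `|` and `*`, with no bracket expressions and no error reporting; `+` and `?` would follow the same pattern. Jump operands are stored relative to the instruction's own index, which is what lets the parser slide a `SPLIT` in front of code it has already emitted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical opcode set, for illustration only. */
enum { OP_CHAR, OP_SPLIT, OP_JMP, OP_SAVE, OP_MATCH };

/* x, y: character, capture slot, or jump offset relative to this instruction. */
typedef struct { int op; int x; int y; } Inst;

#define MAX_PROG 256
typedef struct {
    const char *p;        /* next unread pattern character */
    Inst prog[MAX_PROG];  /* the flat bytecode array */
    int pc;               /* number of instructions emitted so far */
    int ngroups;          /* capturing groups seen so far */
} Compiler;

static void parse_alt(Compiler *c);

/* Insert one instruction at index `at` (at == c->pc appends) and return `at`.
   Offsets are relative, so jumps inside the shifted code stay correct. */
static int put(Compiler *c, int at, int op, int x, int y)
{
    assert(c->pc < MAX_PROG);
    memmove(&c->prog[at + 1], &c->prog[at], (size_t)(c->pc - at) * sizeof(Inst));
    c->prog[at].op = op;
    c->prog[at].x = x;
    c->prog[at].y = y;
    c->pc++;
    return at;
}

/* atom := '(' alt ')' | literal */
static void parse_atom(Compiler *c)
{
    int n;
    if (*c->p == '(') {
        c->p++;
        n = ++c->ngroups;
        put(c, c->pc, OP_SAVE, 2 * n, 0);        /* group start slot */
        parse_alt(c);
        if (*c->p == ')') c->p++;                /* sketch: no error reporting */
        put(c, c->pc, OP_SAVE, 2 * n + 1, 0);    /* group end slot */
    } else {
        put(c, c->pc, OP_CHAR, (unsigned char)*c->p, 0);
        c->p++;
    }
}

/* concat := (atom '*'?)* */
static void parse_concat(Compiler *c)
{
    int start;
    while (*c->p != '\0' && *c->p != '|' && *c->p != ')') {
        start = c->pc;
        parse_atom(c);
        if (*c->p == '*') {
            c->p++;
            put(c, start, OP_SPLIT, 1, 0);           /* prefer the loop body */
            put(c, c->pc, OP_JMP, start - c->pc, 0); /* loop back to the SPLIT */
            c->prog[start].y = c->pc - start;        /* exit is just past the JMP */
        }
    }
}

/* alt := concat ('|' concat)* */
static void parse_alt(Compiler *c)
{
    int start = c->pc, jmp;
    parse_concat(c);
    while (*c->p == '|') {
        c->p++;
        put(c, start, OP_SPLIT, 1, 0);       /* try the left branch first */
        jmp = put(c, c->pc, OP_JMP, 0, 0);   /* left branch skips the right one */
        c->prog[start].y = c->pc - start;    /* right branch starts here */
        parse_concat(c);
        c->prog[jmp].x = c->pc - jmp;        /* patch the forward jump */
    }
}

/* Compile `pattern`; slots 0 and 1 bracket the whole match. Returns the length. */
static int compile(Compiler *c, const char *pattern)
{
    c->p = pattern;
    c->pc = c->ngroups = 0;
    put(c, c->pc, OP_SAVE, 0, 0);
    parse_alt(c);
    put(c, c->pc, OP_SAVE, 1, 0);
    put(c, c->pc, OP_MATCH, 0, 0);
    return c->pc;
}
```

Calling `compile(&c, "a(b|c)*d")` on a `Compiler c` fills `c.prog` with 13 instructions. The trade-off against a linked list of opcodes is visible in `put`: inserting in front of an atom costs a `memmove`, but the result is one contiguous block that the matcher can walk with plain index arithmetic.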
## Deliverables
1) Tips/opinions/advice on implementation of a POSIX-compliant regular expression matcher.
2) Outline of functions regcomp and regexec.
3) Bytecode format.
4) Parsing technique.
5) Execution engine: correctly implementing greedy and reluctant (lazy) matching, capturing subexpressions, and optimisation advice (i.e. making it FAST!).
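As one possible starting point for item 5, the sketch below shows a small backtracking interpreter over a flat instruction array, written in ANSI C. The opcode names, the `Inst` layout with relative jump offsets, `run`, `star_demo` and `group1_len` are all hypothetical names chosen for this example. In this scheme the only difference between a greedy and a reluctant quantifier is which branch of `SPLIT` is tried first, and captures are undone when a branch fails. Note the assumption being made: this is priority-ordered (Perl-style) matching, which is not the same as the leftmost-longest rule POSIX requires, and POSIX itself does not define reluctant quantifiers. A backtracking loop like this can also take exponential time on some patterns, so it is a way to discuss semantics rather than a recipe for speed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical opcode set, for illustration only. */
enum { OP_CHAR, OP_SPLIT, OP_JMP, OP_SAVE, OP_MATCH };

/* x, y: character, capture slot, or jump offset relative to this instruction. */
typedef struct { int op; int x; int y; } Inst;

#define MAX_SLOTS 4

/* Backtracking interpreter: returns 1 if prog matches at sp, filling caps[]. */
static int run(const Inst *prog, int pc, const char *sp, const char **caps)
{
    const char *old;
    for (;;) {
        const Inst *i = &prog[pc];
        switch (i->op) {
        case OP_CHAR:
            if ((unsigned char)*sp != i->x) return 0;
            sp++; pc++;
            break;
        case OP_JMP:
            pc += i->x;
            break;
        case OP_SPLIT:                     /* x is tried first, y on failure */
            if (run(prog, pc + i->x, sp, caps)) return 1;
            pc += i->y;
            break;
        case OP_SAVE:
            assert(i->x >= 0 && i->x < MAX_SLOTS);
            old = caps[i->x];
            caps[i->x] = sp;
            if (run(prog, pc + 1, sp, caps)) return 1;
            caps[i->x] = old;              /* undo the capture on backtrack */
            return 0;
        case OP_MATCH:
            return 1;
        default:
            return 0;
        }
    }
}

/* Hand-assembled code for (a*)a* with a greedy first star. */
static const Inst star_demo[9] = {
    { OP_SAVE, 2, 0 },      /* 0: group 1 start */
    { OP_SPLIT, 1, 3 },     /* 1: greedy, prefer the body at 2 over the exit at 4 */
    { OP_CHAR, 'a', 0 },    /* 2 */
    { OP_JMP, -2, 0 },      /* 3: back to 1 */
    { OP_SAVE, 3, 0 },      /* 4: group 1 end */
    { OP_SPLIT, 1, 3 },     /* 5: second star, always greedy here */
    { OP_CHAR, 'a', 0 },    /* 6 */
    { OP_JMP, -2, 0 },      /* 7: back to 5 */
    { OP_MATCH, 0, 0 }      /* 8 */
};

/* Length captured by group 1, or -1 on no match. A non-zero `lazy` swaps the
   branch order of the first SPLIT, turning (a*) into (a*?). */
static long group1_len(const char *text, int lazy)
{
    Inst prog[9];
    const char *caps[MAX_SLOTS] = { 0 };
    memcpy(prog, star_demo, sizeof prog);
    if (lazy) { prog[1].x = 3; prog[1].y = 1; }
    if (!run(prog, 0, text, caps)) return -1;
    return (long)(caps[3] - caps[2]);
}
```

On the input `"aaa"`, `group1_len("aaa", 0)` returns 3 because the greedy group takes every `a`, while `group1_len("aaa", 1)` returns 0 because the reluctant group gives everything to the second star. A strictly POSIX engine would need a different strategy for choosing among submatches, so treat the branch-order trick as a description of the Perl-style behaviour only.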
## Platform
All platforms, i.e. strictly ANSI/ISO C.