Natural Language Parsing with Ruby
Hi! Today I want to share a way to parse natural language with Ruby, using the Treetop gem.
Why Do I Need It?
Imagine that a user of our application needs to enter a rule or condition to solve some problem. We could use a traditional field like [type="text"] to get it, right?
But what if the input is so complex (many logical and comparison operators, for instance) that you would need a lot of fields to pull it off?
I ran into this situation some time ago, and the best solution I found was to let users write the rules in natural language, in their own native language.
Treetop lets us define the syntax that is going to be parsed, so we need to create a treetop file containing the desired rules.
Let's suppose we need to take some action only if the following rule is true: "if number of orders is greater than X" (where X is an integer).
Below, we can see a treetop file describing the statement above.
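A minimal sketch of what such a treetop file might look like. The grammar name OrderRule, the rule names, and the node class names in angle brackets (Subject, GreaterThan, IntegerLiteral) are assumptions made for illustration, not the original file.

```treetop
grammar OrderRule
  # The whole statement: "if number of orders is greater than X"
  rule condition
    'if' space subject space 'is' space comparison space integer
  end

  rule subject
    'number of orders' <Subject>
  end

  rule comparison
    'greater than' <GreaterThan>
  end

  rule integer
    [0-9]+ <IntegerLiteral>
  end

  rule space
    ' '+
  end
end
```

Each name in angle brackets tells Treetop which node class to instantiate when that rule matches, so those classes have to exist before the parser runs.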
OK, the syntax is set; now we need to parse and evaluate the statement.
Our system does not yet know how to interpret the assertion, so we need to help it. Let's create a file with some helper code.
The text_value method of each class inheriting from Treetop::Runtime::SyntaxNode represents the value returned when a statement is parsed, so here we tell it to return ">" whenever the parser finds the snippet "greater than" (which is linked to the corresponding node class).
If a class does not override the text_value method, it returns the text exactly as it appears in the assertion. To get that value back, however, the class still has to be created.
OK, now that our system knows how to deal with the statement, let's parse it:
Here we call the parser, passing the statement with the already known number of orders, and then clean the tree (as described in this post).
We iterate over the values extracted from our assertion, build a valid Ruby statement, and then evaluate it.
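A hedged sketch of this last step, in plain Ruby. The helper evaluate_rule, the ALLOWED_OPERATORS list, and the idea of passing the known number of orders as a separate argument are assumptions for illustration. Because eval runs arbitrary code, the sketch validates every token before building the statement.

```ruby
# Operators the rule is allowed to produce (hypothetical whitelist).
ALLOWED_OPERATORS = %w[> < >= <= ==].freeze

# Hypothetical helper: builds a Ruby expression from the extracted values
# (subject, operator, threshold) and evaluates it.
def evaluate_rule(values, number_of_orders)
  _subject, operator, threshold = values

  unless ALLOWED_OPERATORS.include?(operator)
    raise ArgumentError, "unsupported operator: #{operator}"
  end

  # Integer() raises ArgumentError on anything that is not a number;
  # base 10 keeps inputs such as "08" from being read as octal.
  statement = "#{Integer(number_of_orders)} #{operator} #{Integer(threshold, 10)}"

  # Only reached once every token has been validated above.
  eval(statement)
end

puts evaluate_rule(['number of orders', '>', '3'], 5)
puts evaluate_rule(['number of orders', '>', '3'], 2)
```

For a fixed set of comparison operators, calling the operator with public_send on the integer would avoid eval altogether; eval is kept here only because the text builds and evaluates a Ruby statement.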
This is a simple example, but the same approach extends to more complex rules for parsing richer statements.
You can check out the source code here.