Boost Spirit Semantic Action Assignments

This example demonstrates:

  • The Symbol Table
  • Non-terminal rules

Symbol Table

The symbol table holds a dictionary of symbols where each symbol is a sequence of characters. The symbols template class can work efficiently with 8-, 16-, 32- and even 64-bit characters. Mutable data of type T are associated with each symbol.

Traditionally, symbol table management is maintained separately, outside the BNF grammar, through semantic actions. Contrary to standard practice, the Spirit symbol table class is itself a parser, and an object of this class may be used anywhere in the EBNF grammar specification. It is an example of a dynamic parser. A dynamic parser is characterized by its ability to modify its behavior at run time. Initially, an empty symbols object matches nothing. At any time, symbols may be added or removed, thus dynamically altering its behavior.

Each entry in a symbol table may have an associated mutable data slot. In this regard, one can view the symbol table as an associative container (or map) of key-value pairs where the keys are strings.

The symbols class expects one template parameter to specify the data type associated with each symbol: its attribute. X3 provides several namespaces containing versions of the symbols class for different character encodings, including ascii, standard, standard_wide, iso8859_1, and unicode. The default symbol parser type in the main x3 namespace is standard.

Here's a parser for roman hundreds (100..900) using the symbol table. Keep in mind that the data associated with each slot is the parser's attribute (which is passed to attached semantic actions).

struct hundreds_ : x3::symbols<unsigned>
{
    hundreds_()
    {
        add
            ("C"    , 100)
            ("CC"   , 200)
            ("CCC"  , 300)
            ("CD"   , 400)
            ("D"    , 500)
            ("DC"   , 600)
            ("DCC"  , 700)
            ("DCCC" , 800)
            ("CM"   , 900)
        ;
    }

} hundreds;

Here's a parser for roman tens (10..90):

struct tens_ : x3::symbols<unsigned>
{
    tens_()
    {
        add
            ("X"    , 10)
            ("XX"   , 20)
            ("XXX"  , 30)
            ("XL"   , 40)
            ("L"    , 50)
            ("LX"   , 60)
            ("LXX"  , 70)
            ("LXXX" , 80)
            ("XC"   , 90)
        ;
    }

} tens;

and, finally, for ones (1..9):

struct ones_ : x3::symbols<unsigned>
{
    ones_()
    {
        add
            ("I"    , 1)
            ("II"   , 2)
            ("III"  , 3)
            ("IV"   , 4)
            ("V"    , 5)
            ("VI"   , 6)
            ("VII"  , 7)
            ("VIII" , 8)
            ("IX"   , 9)
        ;
    }

} ones;

Now we can use hundreds, tens and ones anywhere in our parser expressions. They are all parsers.

Rules

Up until now, we've been inlining our parser expressions, passing them directly to the phrase_parse function. The expression evaluates into a temporary, unnamed parser which is passed into the phrase_parse function, used, and then destroyed. This is fine for small parsers. When the expressions get complicated, you'd want to break them into smaller, easier-to-understand pieces, name them, and refer to them from other parser expressions by name.

A parser expression can be assigned to what is called a "rule". There are various ways to declare rules. The simplest form is:

rule<ID> const r = "some-name";

At the very least, the rule needs an identification tag. This ID can be any struct or class type and need not be defined; a forward declaration suffices. The name is optional, but is useful for debugging and error handling, as we'll see later. Notice that rule r is declared const. Rules are immutable and are best declared as const.

Note

Unlike Qi (Spirit V2), X3 rules can be used with both phrase_parse and parse without having to specify the skip parser.

For our next example, there's one more rule form you should know about:

rule<ID, Attribute> const r = "some-name";

The Attribute specifies the attribute of the rule. You've seen that our parsers can have an attribute. Recall that the double_ parser has an attribute of double. To be precise, these are synthesized attributes. The parser "synthesizes" the attribute value. Think of them as function return values.

After having declared a rule, you need a definition for the rule. Example:

auto const r_def = double_ >> *(',' >> double_);

By convention, rule definitions have a _def suffix. Like rules, rule definitions are immutable and are best declared as const. Now that we have a rule and its definition, we tie the rule to its definition using the BOOST_SPIRIT_DEFINE macro:

BOOST_SPIRIT_DEFINE(r);

Note

BOOST_SPIRIT_DEFINE is variadic and may be used for one or more rules. Example:

BOOST_SPIRIT_DEFINE(r1, r2, r3);

Grammars

Unlike Qi (Spirit V2), X3 discards the notion of a grammar as a concrete entity for encapsulating rules. In X3, a grammar is simply a logical group of rules that work together, typically with a single top-level start rule which serves as the main entry point. X3 grammars are grouped using namespaces. The roman numeral grammar is a very nice and simple example of a grammar:

namespace parser
{
    using x3::eps;
    using x3::lit;
    using x3::_val;
    using x3::_attr;
    using ascii::char_;

    // Captureless lambdas: a lambda at namespace scope may not use a capture-default such as [&].
    auto set_zero = [](auto& ctx) { _val(ctx) = 0; };
    auto add1000 = [](auto& ctx) { _val(ctx) += 1000; };
    auto add = [](auto& ctx) { _val(ctx) += _attr(ctx); };

    x3::rule<class roman, unsigned> const roman = "roman";

    auto const roman_def =
        eps                 [set_zero]
        >>
        (
            -(+lit('M')     [add1000])
            >>  -hundreds   [add]
            >>  -tens       [add]
            >>  -ones       [add]
        )
    ;

    BOOST_SPIRIT_DEFINE(roman);
}

Things to take notice of:

  • The start rule's attribute is unsigned.
  • _val(ctx) gets a reference to the rule's synthesized attribute.
  • _attr(ctx) gets a reference to the parser's synthesized attribute.
  • eps is a special Spirit parser that consumes no input but always succeeds. We use it to initialize the rule's synthesized attribute to zero before anything else. The actual parser starts at +lit('M'), parsing roman thousands. Using eps this way is good for doing pre- and post-initializations.
  • The rule roman and the definition roman_def are const objects.

Let's Parse!

bool r = parse(iter, end, roman, result);

if (r && iter == end)
{
    std::cout << "-------------------------\n";
    std::cout << "Parsing succeeded\n";
    std::cout << "result = " << result << std::endl;
    std::cout << "-------------------------\n";
}
else
{
    std::string rest(iter, end);
    std::cout << "-------------------------\n";
    std::cout << "Parsing failed\n";
    std::cout << "stopped at: \": " << rest << "\"\n";
    std::cout << "-------------------------\n";
}

roman is our roman numeral parser. This time around we are using the no-skipping version of the parse functions. We do not want to skip any spaces! We are also passing in an attribute, unsigned result, which will receive the parsed value.

The full cpp file for this example can be found here: ../../../example/x3/roman.cpp


I have a rule that doesn't have any semantic actions. I just used = as I had forgotten to specify %=, but the rule worked fine; the attribute was assigned.

I'm confused as I was under the impression the longer-winded way of doing this was to use r = grammar[_val = _1], but one could use r %= grammar. Why then does my r = grammar work?

Although I don't think it's required, see below for some standalone program code. Repeated question marks show where %= and = are interchangeable.

#include <iostream>
#include <string>

#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/optional.hpp>
#include <boost/spirit/include/qi.hpp>

struct Info
{
    boost::optional<std::string> value_;
};

BOOST_FUSION_ADAPT_STRUCT(
    Info,
    (boost::optional<std::string>, value_)
)

template <typename Iterator>
struct field_value_grammar : boost::spirit::qi::grammar<Iterator, std::string()>
{
    field_value_grammar() : field_value_grammar::base_type(field_value)
    {
        using namespace boost::spirit;
        using qi::char_;
        using qi::space;

        single_word   %= +(char_ - (space | '"'));     // No skipping (implicit lexeme[])
        quoted_string %= '"' > *(char_ - '"') > '"';   // No skipping (implicit lexeme[])
        field_value   %= single_word | quoted_string;  // No skipping (implicit lexeme[])
    }

    boost::spirit::qi::rule<Iterator, std::string()> single_word;
    boost::spirit::qi::rule<Iterator, std::string()> quoted_string;
    boost::spirit::qi::rule<Iterator, std::string()> field_value;
};

template <typename Iterator, typename Skipper>
struct info_grammar : boost::spirit::qi::grammar<Iterator, Info(), Skipper>
{
    info_grammar() : info_grammar::base_type(info)
    {
        using namespace boost::spirit;

        info = lit("Info") > '{' >> -("string" > field_value) > '}';  // ??????????????????????
    }

    field_value_grammar<Iterator> field_value;
    boost::spirit::qi::rule<Iterator, Info(), Skipper> info;
};

template <typename Iterator>
bool ParseInfo(Iterator first, Iterator last, Info& data)
{
    typedef boost::spirit::ascii::space_type Skipper;
    Skipper skipper;

    bool r = boost::spirit::qi::phrase_parse(
        first, last,
        info_grammar<Iterator, Skipper>(),
        skipper,
        data);

    // Succeed only if the whole input was consumed.
    return r && first == last;
}

void TestInfo()
{
    try
    {
        std::cout << "TESTING INFO" << std::endl;
        std::cout << "============" << std::endl;

        // Adjacent literals produce the same text as the original line-continued string.
        std::string s =
            "Info {\n"
            "string \"informational text\"\n"
            " }";

        Info data;
        bool b = ParseInfo(s.begin(), s.end(), data);

        if (b)
            std::cout << "parse success" << std::endl;
        else
            std::cout << "parse FAILURE !!!" << std::endl;

        if (data.value_)
            std::cout << '[' << *data.value_ << ']' << std::endl;
        else
            std::cout << "info string absent" << std::endl;
    }
    catch (const std::exception& e)
    {
        std::cout << "error: " << e.what() << std::endl;
    }
}

int main()
{
    TestInfo();
}


_______________________________________________
Spirit-general mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/spirit-general
