Meditation, The Art of Exploitation

Thinking? At last I have discovered it--thought; this alone is inseparable from me. I am, I exist--that is certain. But for how long? For as long as I am thinking. For it could be, that were I totally to cease from thinking, I should totally cease to exist....I am, then, in the strict sense only a thing that thinks.

Tuesday, October 23, 2007

C++ boost lambda internals

Theoretical Background:

In computer science, lambda as an abbreviation refers to lambda function in lambda calculus (http://en.wikipedia.org/wiki/Lambda_calculus). lambda calculus is a fundamental concept in theory of computability developed by Alonzo Church. It's the theoretical foundation of many functional programming languages.

Informally, in lambda calculus, every expression stands for a function with a single input, called its argument; the argument of the function is in turn a function with a single argument, and the value of the function is another function with a single argument. A function is anonymously defined by a lambda expression which expresses the function's action on its argument. For instance, the "add-two" function f such that f(x) = x + 2 would be expressed in lambda calculus as λ x. x + 2 (or equivalently as λ y. y + 2; the name of the formal argument is immaterial) and the application of the function f(3) would be written as (λ x. x + 2) 3. Function application is left associative: f x y = (f x) y.

Formally lambda calculus can be expressed in the following BNF rules:

::=
::= (λ . )
::= ( )

The first 2 rules are used to describe logical forms to construct lambda functions. The last rule describe applications of lambda function.

Two note worthy observations from the above description:

1) C++ non member function, static member function, non static member function, in general a C++ function that assumes the form of return_type func(argument) DOES not qualify as a lambda function because return_type cannot be function in C++. return_type could be a function pointer or a functor, used in a form similar to a function. Due to this reason, when we talk about C++ lambda function our relaxed defintion often does not conform to the precise definition given in lambda calculus.

2) The 3rd rule implies that it's possible to have a lambda function that takes itself as argument and thus results in infinite recursion (left recursion) depending on the action the lambda function performs.


lambda in C++

A popular implementation of lambda calculus in C++ is provided by lambda library in boost distribution (http://www.boost.org/doc/html/lambda.html). First we start with a boost lambda example and demonstrate the internals of boost lambda. During investigation, the following tools are used: vim, grep, totalview, ksnapshot, g++

#include "boost/lambda/lambda.hpp"

#include
#include

using namespace std;
using namespace boost::lambda;

int main(){

int x[5] = {1,2,3,4,5};

for_each(x, x+5, cout << _1 << '\n');
}

In this example, we defined 2 lambdas (lambda and function with previous clarification are inter-exchangeable terms from now on) and applied lambda for_each on the 2nd user defined lambda 'cout << _1 << '\n'.

_1 is a place holder defined in /usr/include/boost/lambda/core.hpp as:

namespace boost {
namespace lambda {

namespace {

// These are constants types and need to be initialised
boost::lambda::placeholder1_type free1 = boost::lambda::placeholder1_type();
boost::lambda::placeholder2_type free2 = boost::lambda::placeholder2_type();
boost::lambda::placeholder3_type free3 = boost::lambda::placeholder3_type();

boost::lambda::placeholder1_type& _1 = free1;
boost::lambda::placeholder2_type& _2 = free2;
boost::lambda::placeholder3_type& _3 = free3;
// _1, _2, ... naming scheme by Peter Dimov
} // unnamed

} // lambda
} // boost


/usr/include/boost/lambda/detail/lambda_functors.hpp defines the placeholder1_type as:


typedef const lambda_functor< placeholder > placeholder1_type;
template struct placeholder;

template<> struct placeholder {

template struct sig {
typedef typename detail::get_element_or_null_type<0, SigArgs>::type type;
};

template
RET call(CALL_FORMAL_ARGS) const {
BOOST_STATIC_ASSERT(boost::is_reference::value);
CALL_USE_ARGS; // does nothing, prevents warnings for unused args
return a;
}
};

Or in the form after macro expansion:

template struct placeholder;

template<> struct placeholder {

template struct sig {
typedef typename detail::get_element_or_null_type<0, SigArgs>::type type;
};

template
RET call(A& a, B& b, C& c, Env& env) const {
typedef ::boost::static_assert_test< sizeof(::boost::STATIC_ASSERTION_FAILURE< (bool)( boost::is_reference::value ) >)> boost_static_assert_typedef_60;
::boost::lambda::detail::do_nothing(a, b, c, env);
return a;
}
};



Notable keywords here are: sig, call, as they clue us eventually to how the lambda library implements for_each lambda call. The macro expanded form is obtained by supplying "--save-temps' argument to g++. Similar option is available for other c++ compilers to aid analysis of complex c++ template and macro laiden programs.

Compile time C++ parser picks up lambda 'cout << _1 << '\n'' and uses Koenig name look up rule (argument dependent name lookup) to resolve first '<<' into an operator overloaded in lambda namespace. The overloaded operator is defined in /usr/include/boost/lambda/detail/operators.hpp


#define BOOST_LAMBDA_BE2(OPER_NAME, ACTION, CONSTA, CONSTB, CONVERSION) \
template \
inline const \
lambda_functor< \
lambda_functor_base< \
ACTION, \
tuple< typename CONVERSION ::type, lambda_functor > \
> \
> \
OPER_NAME (CONSTA& a, const lambda_functor& b) { \
return \
lambda_functor_base< \
ACTION, \
tuple< typename CONVERSION ::type, lambda_functor > \
> \
(tuple< typename CONVERSION ::type, lambda_functor >(a, b)); \
}
BOOST_LAMBDA_BE2(BOOST_LAMBDA_COMMA_OPERATOR_NAME, other_action, const A, const B, const_copy_argument)

Or in the macro expanded form:

template inline const lambda_functor< lambda_functor_base< bitwise_action, tuple< typename const_copy_argument ::type, lambda_functor > > > operator<< (const A& a, const lambda_functor& b) { return lambda_functor_base< bitwise_action, tuple< typename const_copy_argument ::type, lambda_functor > > (tuple< typename const_copy_argument ::type, lambda_functor >(a, b)); }


Instantiation of lambda_functor requires instantiation of the following classes: lambda_functor_base, boost::tuples::tuple, boost::tuples::cons. lambda_functor_base is defined in
/usr/include/boost/lambda/detail/operator_lambda_func_base.hpp as yet another macro:
BOOST_LAMBDA_BINARY_ACTION(<<,bitwise_action)

The relevant definitions of bitwise_action and leftshift_action can be found in /usr/include/boost/lambda/detail/operator_actions.hpp


#define BOOST_LAMBDA_BINARY_ACTION(SYMBOL, ACTION_CLASS) \
template \
class lambda_functor_base { \
public: \
Args args; \
public: \
explicit lambda_functor_base(const Args& a) : args(a) {} \
\
template \
RET call(CALL_FORMAL_ARGS) const { \
return detail::select(boost::tuples::get<0>(args), CALL_ACTUAL_ARGS) \
SYMBOL \
detail::select(boost::tuples::get<1>(args), CALL_ACTUAL_ARGS); \
} \
template struct sig { \
typedef typename \
detail::binary_rt::type type; \
}; \
};

Or in macro expanded form:
template
class lambda_functor_base< explicit_return_type_action, Args>
{
public:
Args args;

explicit lambda_functor_base(const Args& a) : args(a) {}

template struct sig { typedef RET type; };

template
RET call(A& a, B& b, C& c, Env& env) const
{
return detail::constify_rvals::go(
detail::r_select::go(boost::tuples::get<0>(args), a, b, c, env));
}
};



Inside std::for_each, __Function is expanded into a lambda_functor_base data structure. When invoked as a functor, it resolves to:

template
typename inherited::template sig< tuple >::type
operator()(A& a) const {
return inherited::template call<
typename inherited::template sig< tuple >::type
>(a, cnull_type(), cnull_type(), cnull_type());
}


At this point, A& a contains the value of array element in x.

The complete type of lambda_functor is defined as:

boost::lambda::lambda_functor< boost::lambda::lambda_functor_base< boost::lambda::bitwise_action< boost::lambda::leftshift_action>,boost::tuples::tuple< boost::lambda::lambda_functor< boost::lambda::lambda_functor_base< boost::lambda::bitwise_action< boost::lambda::leftshift_action>,boost::tuples::tuple< std::ostream&,boost::lambda::lambda_functor< boost::lambda::placeholder<1> >,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type> > >,const char,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type> > >::operator ()



Conclusion

Through source code analysis and debugging, we glimpsed through part of the internals of boost lambda libary. This library uses macro and template metaprogramming extensively. It's tightly coupled to boost tuple library. It uses type traits technique and other generative programming techniques freely.

We use overloaded '<<' to demonstrate the inner working of lambda. This analysis does not show a complete a picture of lambda library, e.g without lambda::bind. But it provides sufficient insight into the library and how to analyze similar C++ library if needed.