Meditation, The Art of Exploitation

Thinking? At last I have discovered it--thought; this alone is inseparable from me. I am, I exist--that is certain. But for how long? For as long as I am thinking. For it could be, that were I totally to cease from thinking, I should totally cease to exist....I am, then, in the strict sense only a thing that thinks.

Wednesday, October 24, 2007

Wolfram's 2,3 turing machine was proven to be universal

http://www.wolframscience.com/prizes/tm23/solution_news.html
http://www.wolframscience.com/prizes/tm23/TM23Proof.pdf

A system is "Universal" if it can, given infinite memory and an appropriate program, compute any computable function. A previous system even more simple than this one, Rule 110, was proven to be Universal by one of Wolfram's associates (Wolfram had the idea that it might be, Matthew Cook discovered the proof). However, a Universal Turing machine has some extra requirements with regards to the implementation. So this is the simplest Universal Turing Machine.

If you already know programming, then here's how to think of it:

A Turing machine is a programming language.
A Universal Turing machine is a language that is sufficiently flexible enough to perform any computation (including emulate any other Turing machine - i.e. language)
A Non-Universal Turing machine is a language that is built to do a specific purpose well, but does not have enough flexibility to perform any computation.

For computer programmers, nearly every general-purpose language we deal with on a daily basis is Turing-complete. An example of a non-Turing Complete language might be a configuration file. It has a language, you may even be able to do some basic scripting in it, but unless it is built out of a general-purpose language you cannot perform any computation in it.

What they think it is useful for is to help us to make nanomachines easier. If we can construct a Turing machine with simpler parts, we can have a compiler that can pick up the slack in making the program.

Totalview debugging tips

Totalview is a multi platform debugger that has the best support for parallel software debugging. Here are a couple of tricks I learnt recently:

1. mpi application startup. With openmpi, use command 'mpirun -tv -np N program'. With mpich2, one can use either 'mpirun -tv -np N program' or 'totalview python -a `which mpiexec` -tvsu -np N program'. Totalview cannot restart program with the first command. Totalview GUI provides an interface to launch MPI application directly, one can specify the MPI library type, NP from the interface. This is the best way to start MPI program with totalview without platform dependent knowledge.

2. Mixed language program debugging. This gem was provided by totalview tech support. Quote:

One method that may be easier than others is to
set a breakpoint in the C++ code where the variable is initialized
(assuming it's in the C++ code where this happens). Then you can dive
on the variable, and if it gets returned to the Fortran code, you
already have a handle on it. Of course, sometimes it's not that simple.
But the basic idea is easy to follow.

When in the Fortran code, dive on the variable in question. In the data
window, you should see a button that says More (and Less) If you are
using 8.3, it shows now as an arrow pointing down with a + sign next to
it. Clicking on this or the more button expands the header, and allows
you to change the language to C or C++. You should then be able to cast
the type to a pointer to the appropriate structure. I've used the
previous method (breakpointing in the C++ initialization routine) just
to find the right type to cast to, but that was basically because I was
unfamiliar with the code being looked at.

Tuesday, October 23, 2007

C++ boost lambda internals

Theoretical Background:

In computer science, lambda as an abbreviation refers to lambda function in lambda calculus (http://en.wikipedia.org/wiki/Lambda_calculus). lambda calculus is a fundamental concept in theory of computability developed by Alonzo Church. It's the theoretical foundation of many functional programming languages.

Informally, in lambda calculus, every expression stands for a function with a single input, called its argument; the argument of the function is in turn a function with a single argument, and the value of the function is another function with a single argument. A function is anonymously defined by a lambda expression which expresses the function's action on its argument. For instance, the "add-two" function f such that f(x) = x + 2 would be expressed in lambda calculus as λ x. x + 2 (or equivalently as λ y. y + 2; the name of the formal argument is immaterial) and the application of the function f(3) would be written as (λ x. x + 2) 3. Function application is left associative: f x y = (f x) y.

Formally lambda calculus can be expressed in the following BNF rules:

::=
::= (λ . )
::= ( )

The first 2 rules are used to describe logical forms to construct lambda functions. The last rule describe applications of lambda function.

Two note worthy observations from the above description:

1) C++ non member function, static member function, non static member function, in general a C++ function that assumes the form of return_type func(argument) DOES not qualify as a lambda function because return_type cannot be function in C++. return_type could be a function pointer or a functor, used in a form similar to a function. Due to this reason, when we talk about C++ lambda function our relaxed defintion often does not conform to the precise definition given in lambda calculus.

2) The 3rd rule implies that it's possible to have a lambda function that takes itself as argument and thus results in infinite recursion (left recursion) depending on the action the lambda function performs.


lambda in C++

A popular implementation of lambda calculus in C++ is provided by lambda library in boost distribution (http://www.boost.org/doc/html/lambda.html). First we start with a boost lambda example and demonstrate the internals of boost lambda. During investigation, the following tools are used: vim, grep, totalview, ksnapshot, g++

#include "boost/lambda/lambda.hpp"

#include
#include

using namespace std;
using namespace boost::lambda;

int main(){

int x[5] = {1,2,3,4,5};

for_each(x, x+5, cout << _1 << '\n');
}

In this example, we defined 2 lambdas (lambda and function with previous clarification are inter-exchangeable terms from now on) and applied lambda for_each on the 2nd user defined lambda 'cout << _1 << '\n'.

_1 is a place holder defined in /usr/include/boost/lambda/core.hpp as:

namespace boost {
namespace lambda {

namespace {

// These are constants types and need to be initialised
boost::lambda::placeholder1_type free1 = boost::lambda::placeholder1_type();
boost::lambda::placeholder2_type free2 = boost::lambda::placeholder2_type();
boost::lambda::placeholder3_type free3 = boost::lambda::placeholder3_type();

boost::lambda::placeholder1_type& _1 = free1;
boost::lambda::placeholder2_type& _2 = free2;
boost::lambda::placeholder3_type& _3 = free3;
// _1, _2, ... naming scheme by Peter Dimov
} // unnamed

} // lambda
} // boost


/usr/include/boost/lambda/detail/lambda_functors.hpp defines the placeholder1_type as:


typedef const lambda_functor< placeholder > placeholder1_type;
template struct placeholder;

template<> struct placeholder {

template struct sig {
typedef typename detail::get_element_or_null_type<0, SigArgs>::type type;
};

template
RET call(CALL_FORMAL_ARGS) const {
BOOST_STATIC_ASSERT(boost::is_reference::value);
CALL_USE_ARGS; // does nothing, prevents warnings for unused args
return a;
}
};

Or in the form after macro expansion:

template struct placeholder;

template<> struct placeholder {

template struct sig {
typedef typename detail::get_element_or_null_type<0, SigArgs>::type type;
};

template
RET call(A& a, B& b, C& c, Env& env) const {
typedef ::boost::static_assert_test< sizeof(::boost::STATIC_ASSERTION_FAILURE< (bool)( boost::is_reference::value ) >)> boost_static_assert_typedef_60;
::boost::lambda::detail::do_nothing(a, b, c, env);
return a;
}
};



Notable keywords here are: sig, call, as they clue us eventually to how the lambda library implements for_each lambda call. The macro expanded form is obtained by supplying "--save-temps' argument to g++. Similar option is available for other c++ compilers to aid analysis of complex c++ template and macro laiden programs.

Compile time C++ parser picks up lambda 'cout << _1 << '\n'' and uses Koenig name look up rule (argument dependent name lookup) to resolve first '<<' into an operator overloaded in lambda namespace. The overloaded operator is defined in /usr/include/boost/lambda/detail/operators.hpp


#define BOOST_LAMBDA_BE2(OPER_NAME, ACTION, CONSTA, CONSTB, CONVERSION) \
template \
inline const \
lambda_functor< \
lambda_functor_base< \
ACTION, \
tuple< typename CONVERSION ::type, lambda_functor > \
> \
> \
OPER_NAME (CONSTA& a, const lambda_functor& b) { \
return \
lambda_functor_base< \
ACTION, \
tuple< typename CONVERSION ::type, lambda_functor > \
> \
(tuple< typename CONVERSION ::type, lambda_functor >(a, b)); \
}
BOOST_LAMBDA_BE2(BOOST_LAMBDA_COMMA_OPERATOR_NAME, other_action, const A, const B, const_copy_argument)

Or in the macro expanded form:

template inline const lambda_functor< lambda_functor_base< bitwise_action, tuple< typename const_copy_argument ::type, lambda_functor > > > operator<< (const A& a, const lambda_functor& b) { return lambda_functor_base< bitwise_action, tuple< typename const_copy_argument ::type, lambda_functor > > (tuple< typename const_copy_argument ::type, lambda_functor >(a, b)); }


Instantiation of lambda_functor requires instantiation of the following classes: lambda_functor_base, boost::tuples::tuple, boost::tuples::cons. lambda_functor_base is defined in
/usr/include/boost/lambda/detail/operator_lambda_func_base.hpp as yet another macro:
BOOST_LAMBDA_BINARY_ACTION(<<,bitwise_action)

The relevant definitions of bitwise_action and leftshift_action can be found in /usr/include/boost/lambda/detail/operator_actions.hpp


#define BOOST_LAMBDA_BINARY_ACTION(SYMBOL, ACTION_CLASS) \
template \
class lambda_functor_base { \
public: \
Args args; \
public: \
explicit lambda_functor_base(const Args& a) : args(a) {} \
\
template \
RET call(CALL_FORMAL_ARGS) const { \
return detail::select(boost::tuples::get<0>(args), CALL_ACTUAL_ARGS) \
SYMBOL \
detail::select(boost::tuples::get<1>(args), CALL_ACTUAL_ARGS); \
} \
template struct sig { \
typedef typename \
detail::binary_rt::type type; \
}; \
};

Or in macro expanded form:
template
class lambda_functor_base< explicit_return_type_action, Args>
{
public:
Args args;

explicit lambda_functor_base(const Args& a) : args(a) {}

template struct sig { typedef RET type; };

template
RET call(A& a, B& b, C& c, Env& env) const
{
return detail::constify_rvals::go(
detail::r_select::go(boost::tuples::get<0>(args), a, b, c, env));
}
};



Inside std::for_each, __Function is expanded into a lambda_functor_base data structure. When invoked as a functor, it resolves to:

template
typename inherited::template sig< tuple >::type
operator()(A& a) const {
return inherited::template call<
typename inherited::template sig< tuple >::type
>(a, cnull_type(), cnull_type(), cnull_type());
}


At this point, A& a contains the value of array element in x.

The complete type of lambda_functor is defined as:

boost::lambda::lambda_functor< boost::lambda::lambda_functor_base< boost::lambda::bitwise_action< boost::lambda::leftshift_action>,boost::tuples::tuple< boost::lambda::lambda_functor< boost::lambda::lambda_functor_base< boost::lambda::bitwise_action< boost::lambda::leftshift_action>,boost::tuples::tuple< std::ostream&,boost::lambda::lambda_functor< boost::lambda::placeholder<1> >,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type> > >,const char,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type,boost::tuples::null_type> > >::operator ()



Conclusion

Through source code analysis and debugging, we glimpsed through part of the internals of boost lambda libary. This library uses macro and template metaprogramming extensively. It's tightly coupled to boost tuple library. It uses type traits technique and other generative programming techniques freely.

We use overloaded '<<' to demonstrate the inner working of lambda. This analysis does not show a complete a picture of lambda library, e.g without lambda::bind. But it provides sufficient insight into the library and how to analyze similar C++ library if needed.

Thursday, October 11, 2007

Generic programming: fundamental concepts in C++ template metaprogramming

Procedural programming has a few general and important constructs: conditional, loop, recursion.

In Functional programming, loop is not allowed because variable is not mutable. To loop, is to alter counter variables. In Functional programming, recursion is used to accumulate or iterate, in recursion the rule 'immutable variable' is not violated.

A programming construct is a fundamental concept of programming, it's a logical unit. A program is a hierarchical structure of programming constructs. There are 4 basic constructs: sequential, conditional, loop, function (lambda). What a program does is side effect of a program executing these constructs.

1. Sequential construct:
do A
do B
do C

2. What's a conditional construct?

if (condition) then
do A
else
do B
end if

This construct is allowed in both procedural and functional programming.

3. What's a loop? there are 2 kinds of loops based on its result (not side effect), infinite loop, finite loop. Often a loop can be decomposed into sequential and conditional constructs. For example a infinite loop

condition = true
while(condition) then
do A
do B
end while

a finite loop
condition = true
while(condition) then
do A
if(result=do B) condition = false // note that the value of condition is altered
end while

In functional programming, variable value is immutable once defined, to rewrite the above example with recursion, we need the help of a function construct.

4. define a function or a lambda

return_type function(argument_type)
function body
end function

bool do_b()
bool result = do B
return result
end do_b

void do_a_b()
do A
bool result = do_b
if(!result)
do_a_b
end do_a_b

As you can see, no variable value is ever changed after the point of definition (declaration and initialization), a key characteristic of functional programming.

C++ template provides powerful means to do functional programming, interestingly by design it has the same 'immutable variable' requirement. Let's see how we again transform the above example into C++ templates

bool do_B(){
}

template
class do_a_b{

do_a_b(){
do_A;
bool result = do_B;
do_a_b();
}

};

template
class do_a_b{
};

In summary, conditional is implemented with partial specialization, and loop is implemented with recursion. Why bother doing this? Functional programming as a form of generic programming, provides a higher level of concept correctness through programming logic.

References:
http://en.wikipedia.org/wiki/Tail_recursion

Tuesday, October 02, 2007

C++ template programming name lookup

Inside of a template, the compiler performs two-phase name lookup for any name encountered:

The first occurs when the compiler initially parses the template
definition. In this phase, the compiler tries to determine which names
do not depend on the template arguments, and it tries to resolve those
names. (Non-dependent name) One thing to note is that during the first phase, inherited names from depdent class (base class is a template) don't get resolved as dependent names since the compiler cannot know yet if the base has the name declared.

The second phase occurs when you actually try to use the template. As
now all the template parameters are also known, the compiler can
resolve the rest of the names. (Dependent name)

Consider the following example



template struct S;
template <> struct S<0> {
S<0>(void): y(0) {}
int y;
};

template struct S: public S {
S(void) {y++;}

};



S<0>::y is a non-dependent name, so is S::y. The compiler will not be able to resolve S::y in the first phase of the two phase name lookup. Too fix this problem, one must convert S::y to a dependent name. There are two methods,

The compiler is right and the reason is that by deriving
S from S the base class is a type-dependent and
the compiler is not allowed to assume anything about it's
members (Or to say: Name look-up does not occur in the
dependent base classes), since there is no guarantee that
they will exist (you can change the definition of an
arbitrary S by explicit or partial specialization over
a set of z).
To ensure that the compiler does not *immediatly* try to
resolve the entity named "y" inside the scope of S, you
make itself a type-dependent expression, e.g. by replacing
the unqualified y by

this->y

or by

S::y (or S or S)

1) S() { this->y++; }, this forces the compiler to instantiate the base dependent names and so on and so forth

2) S::y simply turns y into a dependent name and will only be resolved after all templates are instantiated.