.. _design:

Design
======

This section elaborates on what a *transducer* is.  It will help you
understand how to use them, but also how to write your own transducers
and processes.

What is a transducer?
---------------------

.. image:: https://public.sinusoid.es/talks/transducers-cppcon15/pic/rich2.jpg
   :align: right
   :width: 40%

..

   .. centered:: "Transducers extract the essence of *map* and *filter*"

   | Rich Hickey, author of Clojure, *on Transducers*
   | `(video) <https://www.youtube.com/watch?v=6mTbuzafcII>`_ `(blog) <https://blog.cognitect.com/blog/2014/8/6/transducers-are-coming>`_

.. _accumulate: https://en.cppreference.com/w/cpp/algorithm/accumulate
.. _reduce: https://clojuredocs.org/clojure.core/reduce
.. _fold: https://wiki.haskell.org/Fold

Transducers come after the realization that the most general algorithm
over sequences is what in C++ we call accumulate_, but in other
languages is often called reduce_ or fold_.  Accumulate applies a
*reducing function* iteratively over a sequence.  The reducing function
takes a *state* and some *input* as arguments, and returns an updated
state.

For example, we can define a *reducing function* ``output`` that takes
an output iterator as *state* and some value as *input*; when invoked,
it outputs the value through the iterator and returns the iterator
advanced to the next position:

.. code-block:: c++

   auto output = [](auto state, auto input) {
       *state = input;
       return ++state;
   };

Using this function, we can implement ``std::copy`` in terms of
``accumulate``:

.. code-block:: c++

   template <typename InputIt, typename OutputIt>
   OutputIt copy(InputIt first, InputIt last, OutputIt out)
   {
       return accumulate(first, last, out, output);
   }

We can also use this structure to implement ``transform`` and
``filter``:

.. code-block:: c++

   template <typename InputIt, typename OutputIt, typename Mapping>
   OutputIt transform(InputIt first, InputIt last, OutputIt out,
                      Mapping mapping)
   {
       return accumulate(first, last, out, [&](auto state, auto input) {
           return output(state, mapping(input));
       });
   }

   template <typename InputIt, typename OutputIt, typename Predicate>
   OutputIt filter(InputIt first, InputIt last, OutputIt out,
                   Predicate pred)
   {
       return accumulate(first, last, out, [&](auto state, auto input) {
           return pred(input) ? output(state, input) : state;
       });
   }

Notice something?  The two functions look almost identical.  The
identical part is the calls to ``accumulate`` and ``output``, which
specify the input and output parts of the :ref:`process`.  The meat of
the algorithm is sandwiched in between.  We can extract it by writing
functions that take a *reducing function* (like ``output``) as an
argument and return, as a result, another *reducing function* that, for
example, maps the input before passing it on to the next reducing
function.

A **transducer** is *a function* that takes a *reducing function*, and
returns another *reducing function* that wraps it.

.. code-block:: c++

   template <typename Mapping>
   auto map(Mapping mapping)
   {
       return [=] (auto step) {
           return [=] (auto s, auto... ins) {
               return step(s, mapping(ins...));
           };
       };
   }

   template <typename Predicate>
   auto filter(Predicate pred)
   {
       return [=] (auto step) {
           return [=] (auto s, auto... ins) {
               return pred(ins...) ? step(s, ins...) : s;
           };
       };
   }

.. warning:: This code could be improved by leveraging **move
   semantics** and **perfect forwarding** wherever we pass stuff to the
   underlying transducer.  This, however, adds too much noise to the
   code examples, and these concepts are already hard enough to
   understand for those uninitiated in functional programming.  As an
   exercise, you can think about how you would use ``std::move`` and
   ``std::forward`` in these examples.  You can check the
   implementation of these transducers in the library itself for some
   inspiration.
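As a taste of that exercise, here is one possible way to add forwarding
to ``map``.  This is a minimal sketch of the idea, not the library's
actual implementation:

.. code-block:: c++

   #include <utility>

   template <typename Mapping>
   auto map(Mapping mapping)
   {
       return [mapping = std::move(mapping)] (auto step) {
           return [=] (auto&& s, auto&&... ins) {
               // Forward the state and the inputs instead of copying
               // them; the result of `mapping` is a prvalue and is
               // moved into `step` naturally.
               return step(std::forward<decltype(s)>(s),
                           mapping(std::forward<decltype(ins)>(ins)...));
           };
       };
   }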
The ``map`` and ``filter`` functions above each return a *transducer*:
a function that takes a *reducing function* (called ``step``) as an
argument, and returns another reducing function.  We can revise our
implementations of the standard algorithms using them:

.. code-block:: c++

   template <typename InputIt, typename OutputIt, typename Mapping>
   OutputIt transform(InputIt first, InputIt last, OutputIt out,
                      Mapping mapping)
   {
       return accumulate(first, last, out, map(mapping)(output));
   }

   template <typename InputIt, typename OutputIt, typename Predicate>
   OutputIt filter(InputIt first, InputIt last, OutputIt out,
                   Predicate pred)
   {
       return accumulate(first, last, out, filter(pred)(output));
   }

What if you wanted to write a function that filters and transforms all
in one go?

.. code-block:: c++

   template <typename InputIt, typename OutputIt,
             typename Predicate, typename Mapping>
   OutputIt filter_and_transform(InputIt first, InputIt last,
                                 OutputIt out,
                                 Predicate pred, Mapping mapping)
   {
       return accumulate(first, last, out,
                         filter(pred)(map(mapping)(output)));
   }

Or, using piping for function composition and extracting out the
transducer:

.. code-block:: c++

   template <typename InputIt, typename OutputIt,
             typename Predicate, typename Mapping>
   OutputIt filter_and_transform(InputIt first, InputIt last,
                                 OutputIt out,
                                 Predicate pred, Mapping mapping)
   {
       auto xf = filter(pred) | map(mapping);
       return accumulate(first, last, out, xf(output));
   }

.. admonition:: Tip
   :class: tip

   The library provides a function ``comp`` for function composition,
   but also ``operator|`` for a more natural reading syntax.  You can
   also turn any function ``f`` into a pipeable one by using
   ``comp(f)``.

.. admonition:: Order of composition

   The ``filter_and_transform`` function above performs filtering and
   mapping precisely in that order: it filters first, and only if the
   value passes the predicate is it mapped at all.  Thus, *composition
   of transducers reads from left to right*.  This may be unintuitive,
   given that this is simply function composition, and function
   composition reads from right to left.

   However, remember that a transducer, as a function, is not a
   transformation over a sequence, but a transformation over an
   operation.  Each transducer is, indeed, called from right to left.
   Each transducer transforms the operation, wrapping it into a more
   complex operation---like an onion, each transducer adds a new layer.
   When data is fed into the resulting operation, it enters through the
   outermost layer of the onion, the one added by the last transducer
   (the leftmost!), and traverses the layers in reverse order until it
   reaches the core: the rightmost transducer and, finally, the
   original *reducing function*.

   .. image:: _static/composition-flat.svg
      :width: 100%

Dealing with state
------------------

Now you know how to write a simple transducer.  But what if you want to
do something more advanced?  For example, you may want to write a
transducer like Python's enumerate_ that, given a sequence of values,
pairs each element with an index.  You may be tempted to do it like
this:

.. code-block:: c++

   auto enumerate = [] (auto step) {
       return [=, idx = 0] (auto s, auto... ins) mutable {
           return step(s, idx++, ins...);
       };
   };

This works, but it has one problem: it hides a mutable variable in the
reducer.  The lambda needs to be marked ``mutable`` and it is no longer
a `pure function`_; we have to be careful about which instance we are
using, and when.  But we already have a functional, explicit *state*:
the variable ``s``.  The problem is that we can not make any
assumptions about its type or manipulate it in any way---only somewhere
down the chain, in the next reducing function ``step``, can it be
understood.

.. _pure function: https://en.wikipedia.org/wiki/Pure_function
.. _enumerate: https://book.pythontips.com/en/latest/enumerate.html
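To make the hazard concrete, here is a small, self-contained sketch;
the ``print`` reducer and the usage are illustrative, not part of the
library.  Copying the reducing function silently forks its hidden
counter:

.. code-block:: c++

   #include <cstdio>

   int main()
   {
       // The naive enumerate from above: the index lives *inside* the
       // closure.
       auto enumerate = [] (auto step) {
           return [=, idx = 0] (auto s, auto... ins) mutable {
               return step(s, idx++, ins...);
           };
       };
       // A reducing function that just prints its inputs.
       auto print = [] (int s, int idx, int value) {
           std::printf("%d: %d\n", idx, value);
           return s;
       };
       auto rf  = enumerate(print);
       auto rf2 = rf;  // copies the closure, and with it the hidden idx
       rf(0, 10);      // prints "0: 10"
       rf(0, 20);      // prints "1: 20"
       rf2(0, 30);     // prints "0: 30" -- rf2 kept its own counter!
   }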
The solution is instead to *wrap* the incoming state, adding any
additional data you need next to it.  There is some nuance to it: the
first time the reducing function is invoked, the state you receive is
unwrapped, but on subsequent invocations you will receive a wrapped
state.  The library provides some tools to deal with this:

- ``wrap_state(s, d)`` returns a new wrapped state containing both the
  state ``s`` and the additional data ``d``.

- ``state_unwrap(s)`` returns the underlying state from the potentially
  wrapped state ``s``.

- ``state_data(s, [] { return ...; })`` returns the data wrapped in the
  state ``s``.  If none exists, it produces an initial value using the
  supplied callback.

Using these tools we can now improve our implementation of
``enumerate``:

.. code-block:: c++

   auto enumerate = [] (auto step) {
       return [=] (auto s, auto... ins) {
           auto idx  = state_data(s, [] { return 0; });
           auto next = step(state_unwrap(s), idx, ins...);
           return wrap_state(next, ++idx);
       };
   };

.. note:: In Clojure, stateful transducers hide state within the
   transducer in the way we showed in the initial example.  The reason
   they do not follow our approach, I believe, is that since Clojure is
   a dynamic language, it would add too much runtime overhead: one
   would need to constantly check whether we are dealing with wrapped
   or unwrapped data.  In C++ all this happens at compile time, with no
   runtime overhead.  Ironically, C++ happens to be *more* functional
   than Clojure in its transducers implementation 😉

Early termination
-----------------

Some transducers stop producing any output once a certain condition is
met.  One example is ``take(n)``, which lets at most ``n`` elements of
the sequence through.  How can we signal to the reducing process, from
within the reducing function, that there is no more work to be done?

One option would be to throw an exception that may be caught by the
reducing process.  But exceptions are an inefficient means of control
flow in C++.  Instead, we may realize that termination---whether we are
done or not---is a property of the state.  Think about it: the
transducer ``take`` is going to annotate the state with some kind of
*count*.  The transducer is done when this count reaches zero.  We say
that a state that represents a finished computation is *reduced*.  We
can express this in the library like this:

.. code-block:: c++

   struct take_tag {};

   bool state_wrapper_data_is_reduced(take_tag, int n)
   {
       return n <= 0;
   }

   auto take(int n)
   {
       return comp([=] (auto step) {
           return [=] (auto s, auto... is) {
               return wrap_state<take_tag>(
                   step(state_unwrap(s), is...),
                   state_data(s, [=] { return n; }) - 1);
           };
       });
   }

Note how we added a ``take_tag`` to identify our state wrapper, and how
we introduced ``state_wrapper_data_is_reduced`` to specify whether the
data associated with a particular state indicates termination.

.. admonition:: Is ``accumulate`` a valid reducing process?

   You may have noted that we are increasingly adding requirements to
   how the reducing *process* interacts with the *reducing function*.
   For one, the type of the state changes after the first iteration.
   Also, we now have to check whether the state is reduced (the library
   provides ``state_is_reduced`` for that) and stop the process in that
   case.  For this reason, ``std::accumulate`` may not work, in
   general, with the reducing functions produced by our transducers.
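To illustrate what such a process must do, here is a simplified,
hypothetical sketch of a transducer-aware reduction loop.  It assumes
the library's ``state_is_reduced`` and ``state_complete`` (which
unwraps a state back to the bare accumulated value); the real
implementation handles more cases:

.. code-block:: c++

   #include <zug/state.hpp>

   #include <utility>

   template <typename ReducingFn, typename State, typename InputIt>
   auto reduce_sketch(ReducingFn step, State initial,
                      InputIt first, InputIt last)
   {
       if (first == last)
           return initial;
       // The first application may return a *wrapped* state of a
       // different type; after that, the type stabilizes and we can
       // loop normally, checking for early termination as we go.
       auto state = step(std::move(initial), *first++);
       while (first != last && !zug::state_is_reduced(state))
           state = step(std::move(state), *first++);
       // Unwrap so that both return paths yield the same type.
       return zug::state_complete(std::move(state));
   }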
The library provides its own function, ``reduce()``, and the
convenience ``transduce()``, which work like *accumulate* while taking
care of all these state-management quirks.

Conditional invocation
----------------------

In the presence of state, we need to reconsider transducers that
conditionally invoke the next reducing function, like ``filter``.  The
problem is: if the next reducing function wraps the state with
additional data, we may need to return something of a different type
depending on whether we call the reducing function (we get a wrapped
state) or don't (we may or may not have a wrapped state at hand
already).  The solution is to return something like a ``variant`` of a
wrapped or an unwrapped state.  To ease this, the library provides the
functions ``call`` and ``skip``, which take the same arguments (an
invocation of the next reducing function) and return the same type, but
the latter does not actually call the reducing function.  This may be
clearer considering this improved implementation of ``filter``:

.. code-block:: c++

   template <typename Predicate>
   auto filter(Predicate pred)
   {
       return [=] (auto step) {
           return [=] (auto s, auto... ins) {
               return pred(ins...)
                   ? call(step, s, ins...)
                   : skip(step, s, ins...);
           };
       };
   }

Inspecting the state
--------------------

We have seen how we can use ``state_data``, ``state_unwrap`` and
``wrap_state`` to add our own state to the computation without special
control flow for the first invocation of the reducing function.
However, sometimes we may indeed want to do something different when we
receive an unwrapped state, signaling that we have not yet done any
work.

One example is ``dedupe``, which removes consecutive duplicate elements
from the sequence.  On the first invocation we know that the element is
not a duplicate, because there was no element before it, so we can call
the next reducing function unconditionally.  The library provides a
``with_state(s, fn1, fn2)`` function that takes a state ``s`` and calls
``fn1`` if ``s`` is still unwrapped, or ``fn2`` if it is already
wrapped.  We can use it to implement ``dedupe`` as follows:

.. code-block:: c++

   auto dedupe = [] (auto step) {
       return [=] (auto s, auto... is) {
           return with_state(
               s,
               [&] (auto s) {
                   // first invocation: nothing to compare against
                   auto next = std::make_tuple(is...);
                   return wrap_state(step(state_unwrap(s), is...), next);
               },
               [&] (auto s) {
                   auto next = std::make_tuple(is...);
                   return next == state_wrapper_data(s)
                       ? s
                       : wrap_state(step(state_unwrap(s), is...), next);
               });
       };
   };

Variadics
---------

Note how in the *reducing functions* above we are always receiving and
passing the inputs using ``ins...``.  This is because *transducers are
variadic*, that is, an input may consist of none, one, or many values.
This allows for some interesting use cases.

Generators
~~~~~~~~~~

A transducer that takes no input can act as a generator: with every
"pulse" it receives no input, but produces a value.  For example, we
can use a transducer to create a vector with 10 pseudo-random numbers
like this:

.. code-block:: c++

   auto v = zug::into(std::vector<int>{},
                      zug::map(&std::rand) | zug::take(10));
   assert(v.size() == 10);

Note that if we did not use ``take(10)`` there, that expression would
never terminate.  However, we may alternatively define an infinite lazy
range of pseudo-random numbers using:

.. code-block:: c++

   auto s = zug::sequence(zug::map(&std::rand));
   std::copy(s.begin(), s.begin() + 10, ...);
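Putting the generator pieces together, here is a hypothetical complete
program; the header paths and the seeding are assumptions, while
``into``, ``map``, ``take`` and ``operator|`` are the library names
used above:

.. code-block:: c++

   #include <zug/into.hpp>
   #include <zug/transducer/map.hpp>
   #include <zug/transducer/take.hpp>

   #include <cstdlib>
   #include <iostream>
   #include <vector>

   int main()
   {
       std::srand(42); // illustrative, for repeatable output

       // map(&std::rand) consumes zero inputs per pulse and produces
       // one value; take(10) marks the state reduced after ten pulses,
       // which is what stops the otherwise endless generation.
       auto v = zug::into(std::vector<int>{},
                          zug::map(&std::rand) | zug::take(10));
       for (auto x : v)
           std::cout << x << '\n';
   }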
Zipping
~~~~~~~

Alternatively, we can use transducers to combine multiple sequences
into one, for example, by pairing every two elements.  When using
ranges this can be slightly cumbersome: since every range must have one
and only one value type, the elements must be combined into a tuple.
With transducers, the two elements can be passed to the next reducing
function, which may combine them however it wants.  For example, if we
have two vectors of integers, we can produce a vector with the sum of
every two elements like this:

.. code-block:: c++

   auto v1 = std::vector<int>{ 1,  2,  3,  4};
   auto v2 = std::vector<int>{10, 20, 30, 40};
   auto xf = zug::map(std::plus<>{});
   auto r  = zug::into(std::vector<int>{}, xf, v1, v2);
   assert(r == (std::vector<int>{11, 22, 33, 44}));

Of course, if what you want is, in fact, to generate tuples, the
library has you covered:

.. code-block:: c++

   auto v1 = std::vector<int>{ 1,  2,  3,  4};
   auto v2 = std::vector<int>{10, 20, 30, 40};
   auto r  = zug::into(std::vector<std::tuple<int, int>>{},
                       zug::zip, v1, v2);
   assert(r == (std::vector<std::tuple<int, int>>{
       {1, 10}, {2, 20}, {3, 30}, {4, 40}}));

Or maybe you have tuples, and want to operate on their elements:

.. code-block:: c++

   auto v  = std::vector<std::tuple<int, int>>{
       {1, 10}, {2, 20}, {3, 30}, {4, 40}};
   auto xf = zug::unzip | zug::map(std::plus<>{});
   auto r  = zug::into(std::vector<int>{}, xf, v);
   assert(r == (std::vector<int>{11, 22, 33, 44}));

Performance
-----------

After being amazed, or maybe horrified, by these castles of lambda
functions, you may be wondering: *but how the heck does this perform?*

The short answer is: **very well**.  Most of the time, a combination of
transducers can be as fast as a hand-rolled loop that performs the same
transformation.  At every step, a transducer is just a simple function,
and it is very easy to inline.  In fact, when transforming a range
eagerly, it is normally more efficient than an equivalent C++20 range
adaptor, since there is no intermediate chain of polling iterators:
the data is passed directly through a chain of reducing functions.

However, some things might be slightly slower: using ``zug::sequence``
to create a lazy iterator based on a transducer can often be slower
than a range adaptor.  The comparison is somewhat apples to oranges,
since they are different abstractions.  In fact, while transducers can
indeed be convenient to express transformations over containers, they
can also express, with minimal overhead, transformations over
push-based temporal sequences.  This is something that is simply
impossible with the standard library.  Check `Lager`_ for an
application of this ability of transducers.

.. _Lager: https://github.com/arximboldi/lager

.. admonition:: Contribute!

   These conclusions are based on ad-hoc inspection of the generated
   assembly and benchmarking done during the development of the
   library.  Sadly, there is no systematic benchmark suite included
   with the library.  Of course, developing one would be a good
   learning exercise: `contact us`_ if you would like to contribute
   one!

.. _contact us: https://github.com/arximboldi/zug