<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Alexander Gallego</title>
    <description>Alex Gallego is a principal engineer at Akamai. Previously he was the CTO of &lt;a href=http://concord.io&gt;concord&lt;/a&gt;, a real-time distributed stream processor.
</description>
    <link>http://alexgallego.org/</link>
    <atom:link href="http://alexgallego.org/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Sun, 08 Jul 2018 09:20:38 -0400</pubDate>
    <lastBuildDate>Sun, 08 Jul 2018 09:20:38 -0400</lastBuildDate>
    <generator>Jekyll v3.8.2</generator>
    
      <item>
        <title>The Effects of CPU Turbo: 768X stddev</title>
        <description>&lt;p&gt;I build and maintain &lt;a href=&quot;http://github.com/senior7515/smf&quot;&gt;smf - the fastest RPC in the west&lt;/a&gt;.
The main language supported is C++. We use &lt;a href=&quot;https://github.com/scylladb/seastar&quot;&gt;Seastar&lt;/a&gt; as
the actual asynchronous framework.&lt;/p&gt;

&lt;p&gt;The RPC protocol uses &lt;code class=&quot;highlighter-rouge&quot;&gt;flatbuffers&lt;/code&gt; to make cross-language support possible, in addition to
its main goal of providing zero-cost deserialization on the receiving side.
Currently, Go and Java are in the process of being added to the repo,
with partial support already merged.&lt;/p&gt;

&lt;p&gt;On Friday, June 28 2018, I got a text from a friend
that said: &lt;code class=&quot;highlighter-rouge&quot;&gt;last time I checked flatbuffers was slow vs cap 'n proto&lt;/code&gt;.
At first, I was suspicious, since I’ve &lt;em&gt;never&lt;/em&gt; seen flatbuffers show up
near the top of a &lt;code class=&quot;highlighter-rouge&quot;&gt;$&amp;gt; perf &lt;/code&gt; profile on real production applications. However,
I had no numbers or benchmarks to prove it.&lt;/p&gt;

&lt;p&gt;I thought I had a quick-and-dirty hack to amortize the cost of object graphs -
 it never occurred to me to measure large type trees or large buffers.&lt;/p&gt;

&lt;h1 id=&quot;bluf---bottom-line-up-front&quot;&gt;BLUF - Bottom Line Up Front.&lt;/h1&gt;

&lt;blockquote&gt;
  &lt;p&gt;I had introduced a &lt;strong&gt;performance optimization&lt;/strong&gt; that actually turned out to be
bad for large buffers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To be precise:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;On very small buffers it was ~6% &lt;strong&gt;faster&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;On large buffers it was ~41% &lt;strong&gt;slower&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;nooooooooooooooooooooooooooo!&lt;/p&gt;

&lt;h1 id=&quot;the-beginning-of-yak-shaving&quot;&gt;The beginning of yak-shaving&lt;/h1&gt;

&lt;p&gt;I found a project that had benchmarked
&lt;code class=&quot;highlighter-rouge&quot;&gt;cap'n proto vs flatbuffers&lt;/code&gt;. In particular, it measures buffer construction,
since for both cap’n proto and flatbuffers &lt;em&gt;deserialization is effectively a pointer cast&lt;/em&gt;. Yay
for enforced little-endian encoding into an aligned byte array.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/thekvs/cpp-serializers&quot;&gt;cpp-serializers&lt;/a&gt; project
measures cap’n proto vs flatbuffers encoding and, to my surprise,
flatbuffers measured 2.5X slower.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://raw.githubusercontent.com/thekvs/cpp-serializers/master/images/time2.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/thekvs/cpp-serializers/blob/master/benchmark.cpp#L440-L463&quot;&gt;Their test&lt;/a&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chrono&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;high_resolution_clock&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;flatbuffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FlatBufferBuilder&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;strings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kStringsCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;strings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;push_back&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CreateString&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kStringValue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ids_vec&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CreateVector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kIntegers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strings_vec&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CreateVector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CreateRecord&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ids_vec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strings_vec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Finish&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetBufferPointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GetRecord&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ids&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReleaseBufferPointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;finish&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chrono&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;high_resolution_clock&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;duration&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chrono&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;duration_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chrono&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;milliseconds&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;finish&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;As you can see, the benchmark creates a flatbuffers builder (a kind of &lt;code class=&quot;highlighter-rouge&quot;&gt;std::vector&lt;/code&gt;). It stresses
the builder by allocating an array of strings and ints… and that’s about it.&lt;/p&gt;

&lt;p&gt;Internally, the &lt;code class=&quot;highlighter-rouge&quot;&gt;flatbuffers::FlatBufferBuilder&lt;/code&gt; encodes from top-to-bottom (downward growth) and when
it reaches the bottom, it reallocs and grows the size of the underlying array to fit the contents.&lt;/p&gt;

&lt;p&gt;The actual code for reallocation is:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;reallocate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;FLATBUFFERS_ASSERT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;allocator_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old_reserved&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reserved_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old_scratch_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;scratch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;reserved_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                            &lt;span class=&quot;n&quot;&gt;old_reserved&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old_reserved&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;initial_size_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;reserved_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reserved_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buffer_minalign_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer_minalign_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;buf_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allocator_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reallocate_downward&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old_reserved&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reserved_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                             &lt;span class=&quot;n&quot;&gt;old_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old_scratch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;buf_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allocator_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;allocate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reserved_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cur_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reserved_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;scratch_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old_scratch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

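&lt;p&gt;Each reallocation grows the buffer by the larger of the requested &lt;code class=&quot;highlighter-rouge&quot;&gt;len&lt;/code&gt; and half of the current capacity (roughly 1.5X growth),
or by &lt;code class=&quot;highlighter-rouge&quot;&gt;initial_size_&lt;/code&gt;, 1024 bytes by default, on the first allocation. The result is then rounded up to a multiple of
&lt;code class=&quot;highlighter-rouge&quot;&gt;buffer_minalign_&lt;/code&gt; via &lt;code class=&quot;highlighter-rouge&quot;&gt;(reserved_ + buffer_minalign_ - 1) &amp;amp; ~(buffer_minalign_ - 1);&lt;/code&gt;&lt;/p&gt;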
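
&lt;p&gt;As a rough sketch, the capacity arithmetic from &lt;code class=&quot;highlighter-rouge&quot;&gt;reallocate&lt;/code&gt; above can be replayed in isolation.
The names &lt;code class=&quot;highlighter-rouge&quot;&gt;next_reserved&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;kInitialSize&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;kMinAlign&lt;/code&gt; are hypothetical,
the alignment of 8 is an assumption, and the rounding is written with a division instead of the bit mask
(equivalent when the alignment is a power of two):&lt;/p&gt;

```cpp
// Sketch only: mirrors the capacity arithmetic of the reallocate() shown above.
// kInitialSize matches the 1024-byte default mentioned in the post;
// kMinAlign = 8 is an assumed value for buffer_minalign_.
constexpr unsigned long kInitialSize = 1024;
constexpr unsigned long kMinAlign = 8;

// Larger of two values, kept local so the sketch needs no headers.
constexpr unsigned long max_ul(unsigned long a, unsigned long b) {
  return a > b ? a : b;
}

// Round n up to the next multiple of kMinAlign.
constexpr unsigned long align_up(unsigned long n) {
  return (n + kMinAlign - 1) / kMinAlign * kMinAlign;
}

// New capacity after one reallocate(len) call starting from old_reserved.
constexpr unsigned long next_reserved(unsigned long old_reserved,
                                      unsigned long len) {
  return align_up(old_reserved +
                  max_ul(len, old_reserved ? old_reserved / 2 : kInitialSize));
}
```

&lt;p&gt;Under those assumptions a fresh builder reserves 1024 bytes, then 1536, then 2304, unless a single request is
larger than half of the current capacity, in which case the buffer grows by the requested length instead.&lt;/p&gt;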

&lt;h1 id=&quot;setup&quot;&gt;Setup&lt;/h1&gt;

&lt;p&gt;Let’s start with a Flatbuffers IDL of a simple key=value struct.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kvpair&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The C++ API has a nice object API on top of the raw &lt;code class=&quot;highlighter-rouge&quot;&gt;flatbuffers::FlatBufferBuilder&lt;/code&gt;, so
a quick Google Benchmark looks something like this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;&lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;inline&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kvpairT&lt;/span&gt;
&lt;span class=&quot;nf&quot;&gt;gen_kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;uint32_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;kvpairT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;'x'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;'y'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt;
&lt;span class=&quot;nf&quot;&gt;BM_alloc_simple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;State&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PauseTiming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen_kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ResumeTiming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// build it!&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;flatbuffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FlatBufferBuilder&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Finish&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kvpair&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Release&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temporary_buffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_object_deleter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DoNotOptimize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BENCHMARK&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BM_alloc_simple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;A few things to note:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;We use &lt;code class=&quot;highlighter-rouge&quot;&gt;auto mem = bdr.Release();&lt;/code&gt; and wrap the result in a &lt;code class=&quot;highlighter-rouge&quot;&gt;seastar::temporary_buffer&amp;lt;char&amp;gt;&lt;/code&gt; with
effectively zero copies (give or take a few pointer assignments).&lt;/li&gt;
  &lt;li&gt;This is important because our entire messaging API is about bridging the generated code from
flatbuffers with seastar.&lt;/li&gt;
  &lt;li&gt;In addition to this, our RPC mechanism generates the seastar &amp;lt;-&amp;gt; flatbuffers glue code.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;---------------------------------------------------------------------------
Benchmark                                    Time           CPU Iterations
---------------------------------------------------------------------------
...

BM_alloc_simple/2/2/threads:1                        586 ns        588 ns    1184218
BM_alloc_simple/4/4/threads:1                        586 ns        589 ns    1181340
BM_alloc_simple/16/16/threads:1                      598 ns        600 ns    1153577
BM_alloc_simple/256/256/threads:1                    601 ns        602 ns    1147963
BM_alloc_simple/4096/4096/threads:1                  824 ns        828 ns     837036
BM_alloc_simple/65536/65536/threads:1               6825 ns       6845 ns     101321

...

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Encoding &lt;code class=&quot;highlighter-rouge&quot;&gt;2 x 64KB&lt;/code&gt; buffers takes 6.8 microseconds
(as of flatbuffers commit &lt;code class=&quot;highlighter-rouge&quot;&gt;34cb163e389e928db08ed2bd0e16ee0ac53ab1ce&lt;/code&gt;). Yikes! But is this bad?
Note that this is &lt;em&gt;only&lt;/em&gt; about 3X the cost of &lt;code class=&quot;highlighter-rouge&quot;&gt;std::malloc&lt;/code&gt; + &lt;code class=&quot;highlighter-rouge&quot;&gt;std::memset&lt;/code&gt;, which is the fastest
thing I can think of as a baseline.&lt;/p&gt;

&lt;p&gt;Before we dive into possible optimizations, let’s fix our desktop for these
micro optimizations.&lt;/p&gt;

&lt;h1 id=&quot;fixing-the-environment&quot;&gt;Fixing the environment&lt;/h1&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;k&quot;&gt;function &lt;/span&gt;cpu_disable_performance_cpupower_state&lt;span class=&quot;o&quot;&gt;(){&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;cpupower frequency-set &lt;span class=&quot;nt&quot;&gt;--governor&lt;/span&gt; powersave
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;function &lt;/span&gt;cpu_enable_performance_cpupower_state&lt;span class=&quot;o&quot;&gt;(){&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;cpupower frequency-set &lt;span class=&quot;nt&quot;&gt;--governor&lt;/span&gt; performance
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;function &lt;/span&gt;cpu_available_frequencies&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;i &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; /sys/devices/system/cpu/cpu[0-9]&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
        &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&quot;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/cpufreq/scaling_min_freq: &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;/cpufreq/scaling_min_freq&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/cpufreq/scaling_max_freq: &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;/cpufreq/scaling_max_freq&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;function &lt;/span&gt;cpu_set_min_frequencies&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;local &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;freq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$freq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;return &lt;/span&gt;1&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fi
    for &lt;/span&gt;i &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; /sys/devices/system/cpu/cpu[0-9]&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
        &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&quot;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/cpufreq/scaling_min_freq: &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;/cpufreq/scaling_min_freq&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$freq&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;sudo tee&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/cpufreq/scaling_min_freq&quot;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/cpufreq/scaling_min_freq: &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;/cpufreq/scaling_min_freq&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;function &lt;/span&gt;cpu_set_max_frequencies&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;local &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;freq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$freq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;return &lt;/span&gt;1&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fi
    for &lt;/span&gt;i &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; /sys/devices/system/cpu/cpu[0-9]&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
        &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;:&quot;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/cpufreq/scaling_max_freq: &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;/cpufreq/scaling_max_freq&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$freq&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;sudo tee&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/cpufreq/scaling_max_freq&quot;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/cpufreq/scaling_max_freq: &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$i&lt;/span&gt;/cpufreq/scaling_max_freq&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;My CPU is a
&lt;a href=&quot;https://ark.intel.com/products/97468/Intel-Xeon-Processor-E3-1535M-v6-8M-Cache-3_10-GHz&quot;&gt;Intel(R) Xeon(R) CPU E3-1535M v6 @ 3.10GHz&lt;/a&gt;
with a turbo of 4.20GHz.
That’s great for a desktop experience, where interactivity matters, and terrible
for performance benchmarking, because the clock speed can change from one run to the next.&lt;/p&gt;

&lt;p&gt;When I first tried to show the results to my partner
(non CS Major, but puts up w/ me asking her to stare at my screen) I had
unpredictable results.&lt;/p&gt;

&lt;p&gt;Armed w/ the shell funcs above, my checklist is now as follows:&lt;/p&gt;

&lt;h2 id=&quot;settings&quot;&gt;Settings&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Ensure that your BIOS says performance when connected to AC&lt;/li&gt;
  &lt;li&gt;Check your CPU frequencies via &lt;code class=&quot;highlighter-rouge&quot;&gt;cat /proc/cpuinfo | grep model&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Set your CPU governor via &lt;code class=&quot;highlighter-rouge&quot;&gt;cpu_enable_performance_cpupower_state&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Set the min frequency to the frequency reported by your CPU/model/vendor via: &lt;code class=&quot;highlighter-rouge&quot;&gt;cpu_set_min_frequencies 3100000&lt;/code&gt; in my case&lt;/li&gt;
  &lt;li&gt;Set the max frequency to the frequency reported by your CPU/model/vendor via: &lt;code class=&quot;highlighter-rouge&quot;&gt;cpu_set_max_frequencies 3100000&lt;/code&gt; in my case&lt;/li&gt;
  &lt;li&gt;Verify that you always build in Release mode&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Verifying the frequencies is now simple via &lt;code class=&quot;highlighter-rouge&quot;&gt;cpu_available_frequencies&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; &lt;span class=&quot;nv&quot;&gt;$&amp;gt;&lt;/span&gt; cpu_available_frequencies
/sys/devices/system/cpu/cpu0:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq: 3100000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: 3100000
/sys/devices/system/cpu/cpu1:
/sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq: 3100000
/sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq: 3100000
/sys/devices/system/cpu/cpu2:
/sys/devices/system/cpu/cpu2/cpufreq/scaling_min_freq: 3100000
/sys/devices/system/cpu/cpu2/cpufreq/scaling_max_freq: 3100000
/sys/devices/system/cpu/cpu3:
/sys/devices/system/cpu/cpu3/cpufreq/scaling_min_freq: 3100000
/sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq: 3100000
/sys/devices/system/cpu/cpu4:
/sys/devices/system/cpu/cpu4/cpufreq/scaling_min_freq: 3100000
/sys/devices/system/cpu/cpu4/cpufreq/scaling_max_freq: 3100000
/sys/devices/system/cpu/cpu5:
/sys/devices/system/cpu/cpu5/cpufreq/scaling_min_freq: 3100000
/sys/devices/system/cpu/cpu5/cpufreq/scaling_max_freq: 3100000
/sys/devices/system/cpu/cpu6:
/sys/devices/system/cpu/cpu6/cpufreq/scaling_min_freq: 3100000
/sys/devices/system/cpu/cpu6/cpufreq/scaling_max_freq: 3100000
/sys/devices/system/cpu/cpu7:
/sys/devices/system/cpu/cpu7/cpufreq/scaling_min_freq: 3100000
/sys/devices/system/cpu/cpu7/cpufreq/scaling_max_freq: 3100000&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
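
&lt;p&gt;Rather than eyeballing that output, the same check can be scripted. The function below is a minimal sketch and
not one of the shell funcs above: the name &lt;code class=&quot;highlighter-rouge&quot;&gt;cpu_frequencies_pinned&lt;/code&gt; and its optional
sysfs-root argument are hypothetical. It succeeds only when every core reports the same min and max scaling frequency.&lt;/p&gt;

```shell
# Hypothetical helper, not part of the original shell funcs.
# Succeeds only if every core has scaling_min_freq == scaling_max_freq.
# The optional first argument overrides the sysfs root, which makes it
# possible to try the function against a fake directory tree.
cpu_frequencies_pinned() {
    local root="${1:-/sys/devices/system/cpu}"
    local checked=0 min max i
    for i in "$root"/cpu[0-9]*; do
        # Skip entries that do not expose cpufreq (or an unmatched glob).
        [ -r "$i/cpufreq/scaling_min_freq" ] || continue
        min=$(cat "$i/cpufreq/scaling_min_freq")
        max=$(cat "$i/cpufreq/scaling_max_freq")
        if [ "$min" != "$max" ]; then
            echo "$i: min=$min max=$max (not pinned)"
            return 1
        fi
        checked=$((checked + 1))
    done
    # Fail if no core exposed cpufreq at all.
    [ "$checked" -gt 0 ]
}
```

&lt;p&gt;Called with no arguments it inspects the real sysfs tree; a non-zero exit status means at least one core can still
change frequency, so the benchmark numbers should not be trusted yet.&lt;/p&gt;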

&lt;p&gt;The takeaway is that you don’t necessarily want speed, you want predictability.&lt;/p&gt;

&lt;h1 id=&quot;next---compiler-explorer---my-new-fav-tool&quot;&gt;Next - Compiler Explorer - my new fav tool&lt;/h1&gt;

&lt;p&gt;First, let’s execute the commands above and get the compiler flags that we’ll need for compiler explorer.&lt;/p&gt;

&lt;p&gt;If you are using CMake, follow these screencasts and don’t forget to add
&lt;code class=&quot;highlighter-rouge&quot;&gt;set(CMAKE_EXPORT_COMPILE_COMMANDS 1)&lt;/code&gt; to your main &lt;code class=&quot;highlighter-rouge&quot;&gt;CMakeLists.txt&lt;/code&gt; file.&lt;/p&gt;

&lt;script src=&quot;https://asciinema.org/a/wUsxTCQ74QLM9TUfX0laRcr8y.js?speed=2&quot; id=&quot;asciicast-wUsxTCQ74QLM9TUfX0laRcr8y&quot; async=&quot;&quot; data-autoplay=&quot;true&quot; data-size=&quot;small&quot;&gt;
  &lt;/script&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Now that we have the compiler flags ready, let’s fire up compiler explorer:&lt;/p&gt;

&lt;script src=&quot;https://asciinema.org/a/hrLAAsSVAXCGDkjM3aJp8oBBG.js&quot; id=&quot;asciicast-hrLAAsSVAXCGDkjM3aJp8oBBG&quot; async=&quot;&quot;&gt;&lt;/script&gt;

&lt;p&gt;When you navigate to &lt;code class=&quot;highlighter-rouge&quot;&gt;localhost:1024&lt;/code&gt; you are greeted by the familiar, friendly Compiler Explorer UI.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/compiler_explorer.png&quot; alt=&quot;emacs&quot; /&gt;&lt;/p&gt;

&lt;h1 id=&quot;with-these-tools-we-can-now-benchmark--optimize-our-code&quot;&gt;With these tools we can now benchmark &amp;amp; optimize our code&lt;/h1&gt;

&lt;h2 id=&quot;the-results&quot;&gt;The results&lt;/h2&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;Running /home/agallego/workspace/smf/build/release/src/benchmarks/fbs_alloc/smf_fbsalloc_benchmark_test
Run on &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;8 X 4200 MHz CPU s&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
CPU Caches:
  L1 Data 32K &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;x4&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  L1 Instruction 32K &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;x4&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  L2 Unified 256K &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;x4&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  L3 Unified 8192K &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;x1&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;---------------------------------------------------------------------------&lt;/span&gt;
Benchmark                                    Time           CPU Iterations
&lt;span class=&quot;nt&quot;&gt;---------------------------------------------------------------------------&lt;/span&gt;
BM_malloc_base/2/2/threads:1                          11 ns         11 ns   62858390
BM_malloc_base/4/4/threads:1                          11 ns         11 ns   61565906
BM_malloc_base/16/16/threads:1                        11 ns         11 ns   64118519
BM_malloc_base/256/256/threads:1                      20 ns         20 ns   35574778
BM_malloc_base/4096/4096/threads:1                    98 ns         98 ns    7123958
BM_malloc_base/65536/65536/threads:1                2339 ns       2336 ns     281254
BM_malloc_base/262144/262144/threads:1             11648 ns      11635 ns      58503
BM_alloc_simple/2/2/threads:1                        586 ns        588 ns    1184218
BM_alloc_simple/4/4/threads:1                        586 ns        589 ns    1181340
BM_alloc_simple/16/16/threads:1                      598 ns        600 ns    1153577
BM_alloc_simple/256/256/threads:1                    601 ns        602 ns    1147963
BM_alloc_simple/4096/4096/threads:1                  824 ns        828 ns     837036
BM_alloc_simple/65536/65536/threads:1               6825 ns       6845 ns     101321
BM_alloc_simple/262144/262144/threads:1            32838 ns      32784 ns      21351
BM_alloc_thread_local/2/2/threads:1                  557 ns        559 ns    1220122
BM_alloc_thread_local/4/4/threads:1                  557 ns        559 ns    1221065
BM_alloc_thread_local/16/16/threads:1                569 ns        571 ns    1198623
BM_alloc_thread_local/256/256/threads:1              578 ns        580 ns    1181208
BM_alloc_thread_local/4096/4096/threads:1            806 ns        819 ns     845698
BM_alloc_thread_local/65536/65536/threads:1        10182 ns      10237 ns      68206
BM_alloc_thread_local/262144/262144/threads:1      43066 ns      42998 ns      16279
BM_alloc_hybrid/2/2/threads:1                        563 ns        565 ns    1212235
BM_alloc_hybrid/4/4/threads:1                        563 ns        566 ns    1209207
BM_alloc_hybrid/16/16/threads:1                      577 ns        579 ns    1185042
BM_alloc_hybrid/256/256/threads:1                    587 ns        589 ns    1166182
BM_alloc_hybrid/4096/4096/threads:1                  843 ns        847 ns     819194
BM_alloc_hybrid/65536/65536/threads:1               6810 ns       6832 ns     101497
BM_alloc_hybrid/262144/262144/threads:1            32838 ns      32774 ns      21368&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;As expected, for small buffers there is a large fixed cost relative to the malloc+memset baseline, and
at the larger sizes the allocation &amp;amp; byte traversal start to dominate.
Going by the Time column, the overhead ranges from ~54X (worst, the 16-byte case) down to ~2.8X (best, the 256KB case).&lt;/p&gt;
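
&lt;p&gt;To recompute those ratios from the table, a throwaway helper is enough. This is only a sketch: the function name
&lt;code class=&quot;highlighter-rouge&quot;&gt;slowdown&lt;/code&gt; is hypothetical, and the numbers are the Time column of
&lt;code class=&quot;highlighter-rouge&quot;&gt;BM_alloc_simple&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;BM_malloc_base&lt;/code&gt; copied by hand from the output above.&lt;/p&gt;

```shell
# Hypothetical helper: how many times slower a benchmark is than the baseline.
# Both arguments are times in nanoseconds.
slowdown() {
    awk -v bench="$1" -v base="$2" 'BEGIN { printf "%.1f\n", bench / base }'
}

# BM_alloc_simple vs BM_malloc_base, Time column copied from the results.
slowdown 586 11        # 2-byte case,   prints 53.3
slowdown 598 11        # 16-byte case,  prints 54.4
slowdown 6825 2339     # 64KB case,     prints 2.9
slowdown 32838 11648   # 256KB case,    prints 2.8
```

&lt;p&gt;Keep in mind that the 11 ns baseline is itself rounded, so the ratios for the small sizes are only good to a couple of units.&lt;/p&gt;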

&lt;h2 id=&quot;our-benched-code&quot;&gt;Our benched code&lt;/h2&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;&lt;span class=&quot;cp&quot;&gt;#include &amp;lt;cstring&amp;gt;
#include &amp;lt;memory&amp;gt;
#include &amp;lt;thread&amp;gt;
&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#include &amp;lt;benchmark/benchmark.h&amp;gt;
#include &amp;lt;core/print.hh&amp;gt;
#include &amp;lt;core/sstring.hh&amp;gt;
#include &amp;lt;core/temporary_buffer.hh&amp;gt;
&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#include &quot;kv_generated.h&quot;
#include &quot;smf/native_type_utils.h&quot;
&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;inline&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kvpairT&lt;/span&gt;
&lt;span class=&quot;nf&quot;&gt;gen_kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;uint32_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;kvpairT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;'x'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;'y'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt;
&lt;span class=&quot;nf&quot;&gt;BM_malloc_base&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;State&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;malloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;'x'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DoNotOptimize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;free&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BENCHMARK&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BM_malloc_base&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Threads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt;
&lt;span class=&quot;nf&quot;&gt;BM_alloc_simple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;State&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PauseTiming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen_kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ResumeTiming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// build it!&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;flatbuffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FlatBufferBuilder&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Finish&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kvpair&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Release&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temporary_buffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_object_deleter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DoNotOptimize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BENCHMARK&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BM_alloc_simple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Threads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt;
&lt;span class=&quot;nf&quot;&gt;BM_alloc_thread_local&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;State&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;thread_local&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flatbuffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FlatBufferBuilder&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PauseTiming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen_kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ResumeTiming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// key operations&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Clear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Finish&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kvpair&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// end key operations&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// std::cout &amp;lt;&amp;lt; &quot;Size to copy: &quot; &amp;lt;&amp;lt; bdr.GetSize() &amp;lt;&amp;lt; std::endl;&lt;/span&gt;

    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;posix_memalign&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ENOMEM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;throw&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bad_alloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EINVAL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;throw&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;runtime_error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sprint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;Invalid alignment of %d; allocating %d bytes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;DLOG_THROW_IF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;&quot;ERRNO: {}, Bad aligned allocation of {} with alignment: {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memcpy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetBufferPointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()),&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DoNotOptimize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;free&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BENCHMARK&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BM_alloc_thread_local&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Threads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;template&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RootType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temporary_buffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;hybrid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RootType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NativeTableType&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;thread_local&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_unique&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flatbuffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FlatBufferBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Clear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Finish&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RootType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SMF_UNLIKELY&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2048&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Release&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// replace the thread-local builder, whose buffer was just released&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_unique&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flatbuffers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FlatBufferBuilder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temporary_buffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_object_deleter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)));&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;


  &lt;span class=&quot;c1&quot;&gt;// always align the allocation to 8 bytes, the size of the largest member&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;posix_memalign&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ENOMEM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;throw&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bad_alloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EINVAL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;throw&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;runtime_error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sprint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;&quot;Invalid alignment of %d; allocating %d bytes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;DLOG_THROW_IF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;ERRNO: {}, Bad aligned allocation of {} with alignment: {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temporary_buffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;retval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seastar&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_free_deleter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memcpy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;retval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bdr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GetBufferPointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;retval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;retval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BM_alloc_hybrid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;State&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PauseTiming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen_kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ResumeTiming&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hybrid&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kvpair&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;benchmark&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DoNotOptimize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BENCHMARK&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BM_alloc_hybrid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Threads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;BENCHMARK_MAIN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h1 id=&quot;lessons-learned-benchmarking-with-turbo-is-dishonest&quot;&gt;Lessons Learned: Benchmarking with turbo is dishonest.&lt;/h1&gt;

&lt;p&gt;Not that people writing OSS (or any software really) are out to get you.
It is just easy to forget to tune your machine specifically for benchmarking.
There is no such thing as a quick and dirty benchmark, especially when the
results are not categorically different (say, 1 minute vs 10 minutes vs 1 hour).&lt;/p&gt;

&lt;p&gt;Let’s compare the stability of multiple runs with turbo vs without.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/cpu_jitter.png&quot; alt=&quot;cpu jitter&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/cpu_turbo_vs_no_turbo_fbsalloc.png&quot; alt=&quot;fbsalloc&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;thats-a-stddev-difference-of-up-to-768x-&quot;&gt;That’s a stddev difference of up to 768X !!!&lt;/h3&gt;

&lt;p&gt;No-turbo means precisely this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;cpu_available_frequencies

/sys/devices/system/cpu/cpu0:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq: 800000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: 4200000
/sys/devices/system/cpu/cpu1:
/sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq: 800000
/sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq: 4200000
/sys/devices/system/cpu/cpu2:
/sys/devices/system/cpu/cpu2/cpufreq/scaling_min_freq: 800000
/sys/devices/system/cpu/cpu2/cpufreq/scaling_max_freq: 4200000
/sys/devices/system/cpu/cpu3:
/sys/devices/system/cpu/cpu3/cpufreq/scaling_min_freq: 800000
/sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq: 4200000
/sys/devices/system/cpu/cpu4:
/sys/devices/system/cpu/cpu4/cpufreq/scaling_min_freq: 800000
/sys/devices/system/cpu/cpu4/cpufreq/scaling_max_freq: 4200000
/sys/devices/system/cpu/cpu5:
/sys/devices/system/cpu/cpu5/cpufreq/scaling_min_freq: 800000
/sys/devices/system/cpu/cpu5/cpufreq/scaling_max_freq: 4200000
/sys/devices/system/cpu/cpu6:
/sys/devices/system/cpu/cpu6/cpufreq/scaling_min_freq: 800000
/sys/devices/system/cpu/cpu6/cpufreq/scaling_max_freq: 4200000
/sys/devices/system/cpu/cpu7:
/sys/devices/system/cpu/cpu7/cpufreq/scaling_min_freq: 800000
/sys/devices/system/cpu/cpu7/cpufreq/scaling_max_freq: 4200000&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h1 id=&quot;smf&quot;&gt;SMF&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;smf&lt;/strong&gt; just got 40% faster for large buffers thanks to these benchmarks.
If you give it a shot, let me know. Stay tuned
for the Java and Go code generators and performance benchmarks coming soon.&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Let me know if you found this useful, spotted any misinformation, or have additional performance tuning tips
on Twitter &lt;a href=&quot;https://twitter.com/emaxerrno&quot;&gt;@emaxerrno&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Special thanks to my partner Sarah Rohrbach, as well as Chris Heller and Noah Watkins,
for reading earlier drafts of this post.&lt;/p&gt;

</description>
        <pubDate>Sat, 30 Jun 2018 20:00:00 -0400</pubDate>
        <link>http://alexgallego.org/perf/compiler/explorer/flatbuffers/smf/2018/06/30/effects-cpu-turbo.html</link>
        <guid isPermaLink="true">http://alexgallego.org/perf/compiler/explorer/flatbuffers/smf/2018/06/30/effects-cpu-turbo.html</guid>
        
        
        <category>perf</category>
        
        <category>compiler</category>
        
        <category>explorer</category>
        
        <category>flatbuffers</category>
        
        <category>smf</category>
        
      </item>
    
      <item>
        <title>A tale of performance debugging: from 1.3X slower to 48X faster than Apache Kafka</title>
        <description>&lt;p&gt;I recently wrote a
&lt;a href=&quot;/concurrency/o_direct/2018/02/02/O_DIRECT.html&quot;&gt;log file writer&lt;/a&gt;
that allows for multiple non-sequential writes to an 
&lt;code class=&quot;highlighter-rouge&quot;&gt;O_DIRECT&lt;/code&gt; file handle on XFS. Specifically, it dispatches 4 (configurable)
concurrent writes, then does a join of the page-aligned buffer
writes. That file writer also handles a time-based flushing policy,
so that you can have a reasonable idea of when your data actually makes it
to disk: either when the buffer exceeds its capacity (say 1MB), or when a timeout
(say 1 second) expires.&lt;/p&gt;
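
&lt;p&gt;A minimal sketch of that flush decision, assuming the writer tracks how many
bytes are buffered and how old the oldest unflushed write is. The name
&lt;code class=&quot;highlighter-rouge&quot;&gt;flush_policy&lt;/code&gt; and its fields are hypothetical and are not taken from smf;
the defaults simply mirror the 1MB and 1 second examples.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;#include &amp;lt;chrono&amp;gt;
#include &amp;lt;cstddef&amp;gt;

// Hypothetical policy object: flush on size OR on age, whichever comes first.
struct flush_policy {
  std::size_t max_bytes = 1 &amp;lt;&amp;lt; 20;            // assumed capacity: 1MB
  std::chrono::milliseconds max_age{1000};  // assumed timeout: 1 second

  // True when the buffered data should be written out now.
  bool should_flush(std::size_t buffered_bytes,
                    std::chrono::milliseconds age_of_oldest_write) const {
    return buffered_bytes &amp;gt;= max_bytes || age_of_oldest_write &amp;gt;= max_age;
  }
};&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With the defaults, a 512 byte buffer that is 10ms old is left alone, while the same
buffer is flushed once it is 1 second old.&lt;/p&gt;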

&lt;p&gt;All of this work was &lt;strong&gt;to design a (very?) fast Write Ahead Log&lt;/strong&gt;
(the astute reader will be quick to point out that if you don’t flush
at every write, it is not really a WAL, but just a log writer). After all
of this work (~2K LOC), my heart was broken to realize I was actually
&lt;strong&gt;1.3X slower&lt;/strong&gt; than Apache Kafka after the big re-write! &lt;strong&gt;sigh…&lt;/strong&gt;
(tested with the non-DPDK runtime, for those of you
who have been following along).&lt;/p&gt;

&lt;p&gt;For the impatient, what follows is the performance debugging process
I used to understand my bottlenecks and slowly bring performance to a new
high of &lt;strong&gt;48X faster than Apache Kafka, which was my baseline&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;1) I used Apache Kafka tail latency as my largest latency budget.&lt;/p&gt;

  &lt;p&gt;2) These tests did not use the DPDK runtime. Only epoll + new aio&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1 id=&quot;max-latency-budget-kafka-on-nvme--xfs--kernel-41416-300fc27x86_64&quot;&gt;Max latency budget: Kafka on NVMe + XFS + kernel 4.14.16-300.fc27.x86_64&lt;/h1&gt;

&lt;p&gt;To have a useful baseline for my largest latency budget, I
enabled lz4 compression for Kafka (&lt;em&gt;much&lt;/em&gt; slower w/out it),
and also increased the number of partitions to 32.
This is the same number of partitions that our broker used for the tests.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;# config/server.properties&lt;/span&gt;
compression.codec&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;3
num.partitions&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;32&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;I was using the latest 1.0 release: &lt;code class=&quot;highlighter-rouge&quot;&gt;kafka_2.11-1.0.0&lt;/code&gt;.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;# produce 100MM records&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt; ./bin/kafka-run-class.sh   &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    org.apache.kafka.tools.ProducerPerformance &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
        &lt;span class=&quot;nt&quot;&gt;--topic&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;test&lt;/span&gt;            &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
        &lt;span class=&quot;nt&quot;&gt;--num-records&lt;/span&gt; 100000000 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
        &lt;span class=&quot;nt&quot;&gt;--record-size&lt;/span&gt; 100       &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
        &lt;span class=&quot;nt&quot;&gt;--throughput&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-1&lt;/span&gt;         &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
        &lt;span class=&quot;nt&quot;&gt;--producer-props&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;acks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;1 bootstrap.servers&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;localhost:9092&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         Latency distribution for writing 100MM
         records with the command above to Apache Kafka 1.0.0 (kafka_2.11-1.0.0).
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/kafka.fastestwal.latency.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;blockquote&gt;
  &lt;p&gt;Caveats:&lt;/p&gt;

  &lt;p&gt;1) I disabled all other apps, browsers, etc.&lt;/p&gt;

  &lt;p&gt;2) localhost benefits from in-kernel optimizations; a prod deployment, which
   will likely go through a network rack, will only &lt;em&gt;add&lt;/em&gt; latency&lt;/p&gt;

  &lt;p&gt;3) This ignores CPU scaling (power states) among others&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1 id=&quot;perf-statrecordlist--toplev-pmu_tools&quot;&gt;perf {stat,record,list} &amp;amp; toplev (PMU_tools)&lt;/h1&gt;

&lt;p&gt;When in doubt, measure. When 100% sure, still measure. 
90% of the improvements actually came from an area I wasn’t expecting. 
10% did come from a code path I intuited needed some TLC.&lt;/p&gt;

&lt;p&gt;In fact, to be 100% honest, I am still in awe at the results. In part
because the starting profile and ending profile look superficially
similar, as you will soon see. In part because this was the first time
I used hardware counters to actually debug
a program, and I realized that I would not have missed out on some
sleep had I learned this earlier in life.&lt;/p&gt;

&lt;p&gt;My weight-loss program is as follows:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;perf record -g -p &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;lt;pid&amp;gt;&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;perf report&lt;/li&gt;
  &lt;li&gt;If nothing obvious shows up, then:
    &lt;ul&gt;
      &lt;li&gt;perf stat -d -p &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;lt;pid&amp;gt;&lt;/code&gt;&lt;/li&gt;
      &lt;li&gt;perf record -g -p &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;lt;pid&amp;gt;&lt;/code&gt;&lt;/li&gt;
      &lt;li&gt;perf report&lt;/li&gt;
      &lt;li&gt;CPU Flamegraph&lt;/li&gt;
      &lt;li&gt;toplev.py&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;1st-non-obvious-performance-bottleneck-stdsort&quot;&gt;1st non-obvious performance bottleneck: std::sort&lt;/h2&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-diff&quot; data-lang=&quot;diff&quot;&gt;&lt;span class=&quot;gh&quot;&gt;diff --git a/src/filesystem/wal_write_behind_cache.cc b/src/filesystem/wal_write_behind_cache.cc
index a35f8ccd..dab6711f 100644
&lt;/span&gt;&lt;span class=&quot;gd&quot;&gt;--- a/src/filesystem/wal_write_behind_cache.cc
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ b/src/filesystem/wal_write_behind_cache.cc
&lt;/span&gt;&lt;span class=&quot;gu&quot;&gt;@@ -68,8 +68,6 @@ wal_write_behind_cache::put(uint64_t offset, item_ptr data) {
&lt;/span&gt;   stats_.bytes_written += data-&amp;gt;on_disk_size();
   puts_.emplace(offset, data);
   keys_.push_back(offset);
&lt;span class=&quot;gd&quot;&gt;-  // keeping order is hugely important for removing data.
-  std::sort(keys_.begin(), keys_.end());
&lt;/span&gt; }
 &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;blockquote&gt;
  &lt;p&gt;If you decide to repro at home, please comment out wal_write_ahead_log.cc:57
and wal_write_ahead_log.cc:23&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This was a very slow commit, and it is when I found out I &lt;em&gt;needed&lt;/em&gt;
to do something about it immediately. Look at these numbers! Yikes.&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.first.bugfix.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;… this is a 24x improvement!!! From one line of code.&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id=&quot;finding-the-culprit&quot;&gt;Finding the culprit!&lt;/h2&gt;

&lt;p&gt;Going back to our weight-loss program:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;# assume 24776 is the process id&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
perf record &lt;span class=&quot;nt&quot;&gt;-F&lt;/span&gt; 99 &lt;span class=&quot;nt&quot;&gt;-g&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt;  24776

&lt;span class=&quot;c&quot;&gt;# run the client test then call:&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
perf report&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         perf report for the first non-intuitive bugfix 
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/std.vector.iterator.perfbug.4.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/std.vector.iterator.perfbug.3.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;… woah! 74% of the time is spent sorting &lt;code class=&quot;highlighter-rouge&quot;&gt;uint64_t&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         perf report AFTER the change    
     &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/std.vector.iterator.perfbug.5.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;… Yay!! no more bottleneck… but wait! Doesn’t this change break
correctness? No, this was leftover code from a previous implementation
and, honestly, pretty harmless for the behavior of the program, other than
making it unbearably slow!&lt;/p&gt;
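
&lt;p&gt;Why the sort could go can be sketched as follows, assuming offsets reach
&lt;code class=&quot;highlighter-rouge&quot;&gt;put()&lt;/code&gt; in increasing order (the post notes further down that writes happen
monotonically). &lt;code class=&quot;highlighter-rouge&quot;&gt;offset_index&lt;/code&gt; is a hypothetical stand-in for the key list, not the
smf class: appending to an already ordered vector keeps it ordered, so a full
&lt;code class=&quot;highlighter-rouge&quot;&gt;std::sort&lt;/code&gt; on every insertion only burns cycles.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;#include &amp;lt;algorithm&amp;gt;
#include &amp;lt;cassert&amp;gt;
#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;vector&amp;gt;

// Hypothetical stand-in for the key list of the cache; not the smf class.
struct offset_index {
  std::vector&amp;lt;std::uint64_t&amp;gt; keys;

  // Appending a key that is not smaller than the current tail keeps the
  // vector sorted, so there is nothing left for std::sort to do.
  void put(std::uint64_t offset) {
    assert(keys.empty() || keys.back() &amp;lt;= offset);
    keys.push_back(offset);
  }

  // The invariant the removed std::sort re-established on every put.
  bool ordered() const { return std::is_sorted(keys.begin(), keys.end()); }
};&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;If offsets could arrive out of order, the assertion in this sketch would fire and the
sort (or an ordered insert) would be needed again.&lt;/p&gt;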

&lt;h1 id=&quot;2nd-improvement-poor-data-locality---code-slaying&quot;&gt;2nd improvement: poor data locality - code slaying&lt;/h1&gt;

&lt;p&gt;Remember our recipe from above? The &lt;code class=&quot;highlighter-rouge&quot;&gt;perf report&lt;/code&gt; no longer shows
something obvious! … what to do!!??!&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;perf stat -d -p &amp;lt;pid&amp;gt;&lt;/code&gt; to the rescue!&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         perf stat -d -p 
     &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.stat.bad.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;The issue is then… what does this actually mean? LLC stands for
last-level cache, and a miss there means you are fetching from main memory
… a lot.&lt;/p&gt;

&lt;p&gt;This was the first time I had used this tool, and instead of doing
a &lt;code class=&quot;highlighter-rouge&quot;&gt;perf record -e LLC-cache-misses&lt;/code&gt; I decided to jump straight to fixing it.
I went ahead and said: oh, I know better, it’s basically a bunch of
pointer chasing (correct), let me fix it &lt;em&gt;right here…&lt;/em&gt; (incorrect)&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         perf stat -d -p 
     &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.stat.bad.2.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Changing code from pointers to stack-allocated structures &lt;em&gt;did&lt;/em&gt;
improve the LLC-cache-misses, but not nearly enough. I ended up ‘fixing’
code that was not broken because I thought I knew better than my profiler.&lt;/p&gt;

&lt;h1 id=&quot;3rd-improvement-poor-data-locality---non-intuitive-fix-2&quot;&gt;3rd improvement: poor data locality - non intuitive fix 2&lt;/h1&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;perf record &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
   &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; LLC-loads,LLC-load-misses,instructions &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
   &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; cycles,branch-load-misses,faults &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
   &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; bus-cycles,mem-loads,mem-stores &lt;span class=&quot;nt&quot;&gt;-a&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; &amp;lt;pid&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         perf report with --poll-mode 
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/llc.cache.misses.perf.report.png&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/llc.cache.misses.perf.report.2.png&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;
&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         perf report without --poll-mode 
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/llc.cache.misses.perf.report.3.png&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;It took a long time playing with the &lt;strong&gt;annotated perf report&lt;/strong&gt; and
reading the assembly to find the culprits for the LLC-cache-misses, but the
&lt;strong&gt;biggest contributor was moving from an indexed data structure to a simple array&lt;/strong&gt;.
Effectively, the map was already pre-ordered, as writes happen
monotonically, so I switched from &lt;code class=&quot;highlighter-rouge&quot;&gt;map.find()&lt;/code&gt; to &lt;code class=&quot;highlighter-rouge&quot;&gt;std::lower_bound&lt;/code&gt;
over a &lt;code class=&quot;highlighter-rouge&quot;&gt;std::deque&amp;lt;&amp;gt;&lt;/code&gt;.&lt;/p&gt;
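
&lt;p&gt;A minimal sketch of that lookup, assuming entries are appended in increasing offset
order so the deque stays sorted. The &lt;code class=&quot;highlighter-rouge&quot;&gt;entry&lt;/code&gt; type and
&lt;code class=&quot;highlighter-rouge&quot;&gt;find_entry&lt;/code&gt; helper are hypothetical names for illustration, not the smf code.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;#include &amp;lt;algorithm&amp;gt;
#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;deque&amp;gt;

// Hypothetical entry type; the real cache stores richer items.
struct entry {
  std::uint64_t offset;
  std::uint64_t payload;
};

// Binary search over a deque kept sorted by offset (entries are appended in
// increasing offset order). Returns nullptr when the offset is absent.
inline const entry *find_entry(const std::deque&amp;lt;entry&amp;gt; &amp;amp;log, std::uint64_t offset) {
  auto it = std::lower_bound(
    log.begin(), log.end(), offset,
    [](const entry &amp;amp;e, std::uint64_t off) { return e.offset &amp;lt; off; });
  if (it == log.end() || it-&amp;gt;offset != offset) {
    return nullptr;
  }
  return &amp;amp;*it;
}&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The search still takes a logarithmic number of comparisons, but it walks a few large
blocks of entries instead of chasing one heap node per step.&lt;/p&gt;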

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         non-intuitive fix 2 
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/non.intuitive.fix.2.png&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;The second contributor was caching the size of the
item in a wrapper data structure. That is, I moved from
&lt;code class=&quot;highlighter-rouge&quot;&gt;key.ptr-&amp;gt;get_size_on_disk()&lt;/code&gt; to a struct that remembered the &lt;strong&gt;size_on_disk&lt;/strong&gt;
such that you could simply do
&lt;code class=&quot;highlighter-rouge&quot;&gt;key.size_on_disk&lt;/code&gt;. The size was cached upon insertion into the cache.&lt;/p&gt;
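
&lt;p&gt;The wrapper idea can be sketched like this. Only &lt;code class=&quot;highlighter-rouge&quot;&gt;ptr&lt;/code&gt;,
&lt;code class=&quot;highlighter-rouge&quot;&gt;size_on_disk&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;get_size_on_disk()&lt;/code&gt; come from the text above; the
&lt;code class=&quot;highlighter-rouge&quot;&gt;item&lt;/code&gt; layout, the &lt;code class=&quot;highlighter-rouge&quot;&gt;cache_key&lt;/code&gt; name and the use of
&lt;code class=&quot;highlighter-rouge&quot;&gt;std::shared_ptr&lt;/code&gt; are assumptions made for illustration.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-cpp&quot; data-lang=&quot;cpp&quot;&gt;#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;memory&amp;gt;
#include &amp;lt;utility&amp;gt;

// Hypothetical item type standing in for the cached payload.
struct item {
  std::uint64_t header_bytes;
  std::uint64_t body_bytes;
  std::uint64_t get_size_on_disk() const { return header_bytes + body_bytes; }
};

// Wrapper stored in the cache. size_on_disk is copied once at insertion,
// so later scans read it from the wrapper without dereferencing ptr.
struct cache_key {
  std::uint64_t offset;
  std::uint64_t size_on_disk;
  std::shared_ptr&amp;lt;item&amp;gt; ptr;

  // Members are initialized in declaration order, so the size is read
  // from p before p is moved into ptr.
  cache_key(std::uint64_t off, std::shared_ptr&amp;lt;item&amp;gt; p)
    : offset(off), size_on_disk(p-&amp;gt;get_size_on_disk()), ptr(std::move(p)) {}
};&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Summing sizes over many keys then touches only the wrappers, which sit next to each
other in the container, rather than one separately allocated item per key.&lt;/p&gt;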

&lt;p&gt;This was the most mind-blowing part of this whole thing, and in
retrospect it makes total sense. Jumping around memory
is slow and causes CPU stalls if the access patterns are pseudo-random
and the CPU prefetcher cannot help you.&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         latency distribution non-intuitive fix 2 
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/non.intuitive.fix.3.png&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;h1 id=&quot;results&quot;&gt;Results&lt;/h1&gt;

&lt;p&gt;At this stage you can tell:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Kafka.Slowest(1618ms) / SMF.Slowest(34ms) ~= 48x lower latency&lt;/code&gt;&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;# CLIENT: inside smf/build_release&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
./src/smfb/client/smfb_low_level_client &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
              &lt;span class=&quot;nt&quot;&gt;--req-num&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;12202           &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
              &lt;span class=&quot;nt&quot;&gt;--batch-size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;8196         &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
              &lt;span class=&quot;nt&quot;&gt;--cpuset&lt;/span&gt; 3,4              &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
              &lt;span class=&quot;nt&quot;&gt;--key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
              &lt;span class=&quot;nt&quot;&gt;--value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy    &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
              &lt;span class=&quot;nt&quot;&gt;--poll-mode&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;# SERVER: inside smf/build_release&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
../src/smfb/smfb &lt;span class=&quot;nt&quot;&gt;--poll-aio&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;1             &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
                 &lt;span class=&quot;nt&quot;&gt;--write-ahead-log-dir&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;.&lt;/span&gt;  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
                 &lt;span class=&quot;nt&quot;&gt;--cpuset&lt;/span&gt; 1,2             &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
                 &lt;span class=&quot;nt&quot;&gt;--poll-mode&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         SMF has 48x lower tail latency than Apache Kafka. In fact, SMF
         was at a slight disadvantage in that it wrote more than 
         100MM records (100'007'592 total records)
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.final.comparison.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;
&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         End-To-End consistent latency
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.final.comparison.e2e.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         Server observed latency
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.final.server.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;h1 id=&quot;performance-riddle-me-this&quot;&gt;Performance, riddle me this!&lt;/h1&gt;

&lt;p&gt;At this point I was excited that we were back on track for having 
possibly the fastest open source write ahead log … so I
decided to take one more pass with &lt;code class=&quot;highlighter-rouge&quot;&gt;perf stat -F 99 -d -p &amp;lt;pid&amp;gt;&lt;/code&gt;
… bad idea.&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         Original perf stat 
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.stat.bad.png&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         perf stat after the bugfixes 
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.final.stat.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;…………WHATTTTTT!!!!!! HULK SMASH KEYBOARD!!!!!&lt;/p&gt;

&lt;p&gt;After &lt;em&gt;all&lt;/em&gt; this work I have &lt;em&gt;more&lt;/em&gt; Last-Level-Cache misses?!!?
I was honestly going in circles at this point. 
I asked my friend &lt;a href=&quot;https://twitter.com/duarte_nunes&quot;&gt;@duarte_nunes&lt;/a&gt;
what he thought, and he recommended that I take a look at 
toplev - a program from Andi Kleen to measure and make sense
of CPU hardware counters.&lt;/p&gt;

&lt;h1 id=&quot;toplev-next&quot;&gt;Toplev next…&lt;/h1&gt;

&lt;p&gt;This is an area of ongoing effort, and I hope to write more about 
toplev and how I got another 10X performance improvement
(fingers crossed). What prevented me from digging a bit further
is that my toplev profiles are difficult to understand.&lt;/p&gt;

&lt;p&gt;Depending on the queue workload, the profiles change drastically, from
0% Front End bound to 21% Front End bound.&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         On a benchmark to write 100MM messages with 8196 messages in
         each batch, my lz4 compression filter is guaranteed 
         to be hit and therefore I'm mostly Back End bound.
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.toplev.be.png&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         On a benchmark to write 100MM messages with 100 messages 
         in each batch, I'm 21% Front End bound.
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.toplev.fe.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Before I dug deeper into the issues, I created a CPU FlameGraph 
to see if maybe there was something that I had missed while recording 
traces &amp;amp; probes… the results were surprisingly positive:&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         FlameGraph of the current runtime without --poll-mode, 
         using the normal Linux epoll &amp;amp; the new libaio 
         that Avi re-wrote for the seastar framework.
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/wal.perf.final.svg&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;When I accidentally created this graph, I realized that the
LLC-cache-misses had been coming from the &lt;code class=&quot;highlighter-rouge&quot;&gt;--poll-mode&lt;/code&gt; flag all along. 
Without it, my &lt;code class=&quot;highlighter-rouge&quot;&gt;perf stat -F 99 -d -p &amp;lt;pid&amp;gt;&lt;/code&gt; looked pretty good!&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
    &lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
         perf stat after running smfb without the --poll-mode flag
    &lt;/caption&gt;
&lt;tr&gt;
  &lt;td&gt;&lt;img src=&quot;/images/perf.stat.final.stat.nopoll.png&quot; /&gt;
  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Low latency queueing might sound like a silly goal to some, given that 
most BigData systems are used for offline flows. Dequeue, process, re-enqueue is the
usual paradigm for most log brokers, where processing is in the 100s of milliseconds.&lt;/p&gt;

&lt;p&gt;I do not think a system exists today that can handle the load of the next
generation of information flows - Terabits/second - in a cost-efficient manner. 
Low latency and high 
throughput will be a requirement to process drone logs, handle new security
attacks, etc. I hope that by then either SMF or a system like it, designed to take
advantage of every core and every storage device (of varying speeds) with a multitude of 
SLAs, can step up to the challenge.&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Let me know if you found this useful on Twitter &lt;a href=&quot;https://twitter.com/emaxerrno&quot;&gt;@emaxerrno&lt;/a&gt;
or in the comments.&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Special thanks to my partner Sarah Rohrbach for reading earlier drafts of this post.&lt;/p&gt;

&lt;h1 id=&quot;references&quot;&gt;References&lt;/h1&gt;
</description>
        <pubDate>Thu, 08 Feb 2018 19:00:00 -0500</pubDate>
        <link>http://alexgallego.org/perf/toplev/2018/02/08/performance-debugging.html</link>
        <guid isPermaLink="true">http://alexgallego.org/perf/toplev/2018/02/08/performance-debugging.html</guid>
        
        
        <category>perf</category>
        
        <category>toplev</category>
        
      </item>
    
      <item>
        <title>Concurrent writes to the same file with O_DIRECT</title>
        <description>&lt;p&gt;Imagine you were trying to write a very fast Write Ahead Log&lt;/p&gt;

&lt;p&gt;…like - dunno - this: &lt;a href=&quot;https://github.com/senior7515/smf&quot;&gt;smf&lt;/a&gt;…&lt;/p&gt;

&lt;p&gt;…well you pack up, use it and go home.&lt;/p&gt;

&lt;p&gt;Unless, of course, you are still waiting for your build system to 
finish and then you start digging deeper.&lt;/p&gt;

&lt;p&gt;I wrote a &lt;code class=&quot;highlighter-rouge&quot;&gt;wal_segment&lt;/code&gt; which basically allows you to write to a file in append-only mode.
Prior to it, I couldn’t control the flush rate to the disk. That is, by definition of O_DIRECT 
you can only flush page-aligned memory buffers, and the &lt;code class=&quot;highlighter-rouge&quot;&gt;seastar::file_output_stream()&lt;/code&gt;
does not allow you to flush unfinished pages.&lt;/p&gt;

&lt;p&gt;You deal with that by zeroing out the tail of the unfinished page, writing the whole page, and then
calling &lt;code class=&quot;highlighter-rouge&quot;&gt;truncate&lt;/code&gt; to set the file to the correct size - minus all the zeros you wrote.
I know the zeros are not strictly necessary, but I wanted to make the reads safer - more on that later.&lt;/p&gt;
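
&lt;p&gt;As a rough illustration of the pad, write, truncate sequence described above, here is a minimal Python sketch.
It is not the smf or seastar code: it uses ordinary buffered I/O instead of O_DIRECT, and the 4096-byte page size
and the name &lt;code class=&quot;highlighter-rouge&quot;&gt;padded_write_then_truncate&lt;/code&gt; are assumptions made only for this example.&lt;/p&gt;

```python
import os

PAGE_SIZE = 4096  # assumed page size; the real alignment depends on the device


def padded_write_then_truncate(path, payload):
    """Write payload in whole pages, then trim the zero padding off the file."""
    remainder = len(payload) % PAGE_SIZE
    padding = (PAGE_SIZE - remainder) % PAGE_SIZE
    # Zero out the tail of the unfinished page.
    page_aligned = payload + b"\0" * padding

    # Plain buffered I/O: O_DIRECT is deliberately left out of this sketch.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(page_aligned)
        while view:  # keep going if the kernel accepts only part of the buffer
            written = os.write(fd, view)
            view = view[written:]
        # Set the file back to its logical size, dropping the zeros.
        os.ftruncate(fd, len(payload))
    finally:
        os.close(fd)
    return len(page_aligned)
```

&lt;p&gt;Calling it with a 5000-byte payload writes two full pages and leaves a 5000-byte file on disk.&lt;/p&gt;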

&lt;p&gt;However, you are in a world where concurrency is the norm (as well as parallelism, but let’s focus on 
the structure - concurrency - &lt;em&gt;not&lt;/em&gt; the simultaneous execution - parallelism) and right before you
close the file, you want to flush all the remaining pages as fast as possible.&lt;/p&gt;

&lt;p&gt;I couldn’t find any information on the web that specifically answered the question of&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Can you write multiple pages to the same file handle at the same time (&lt;em&gt;dispatched&lt;/em&gt; at the same time),
where the writes need not be sequential?&lt;/p&gt;

  &lt;p&gt;That is, on pages 1,2,3,4 - write a sequence of 4,3,2,1 (worst case scenario)?&lt;/p&gt;

  &lt;p&gt;The answer for my SSD (INTEL SSDSC2BP48), with XFS, is yes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What follows are the tests I wrote to prove it with my favorite systems framework 
&lt;a href=&quot;https://github.com/scylladb/seastar&quot;&gt;seastar&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The strategy below is as follows:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Write 10 pages&lt;/li&gt;
  &lt;li&gt;Allow for at most 4 concurrent page writes&lt;/li&gt;
  &lt;li&gt;Always skip the first page - proving the point explicitly, even though concurrent execution already proves it implicitly.&lt;/li&gt;
  &lt;li&gt;Make sure that you can read the file on the file system with &lt;code class=&quot;highlighter-rouge&quot;&gt;less&lt;/code&gt; afterwards&lt;/li&gt;
&lt;/ul&gt;

&lt;script src=&quot;https://gist.github.com/c80a4f3563bc87b6de4099e2c4527e77.js&quot;&gt; &lt;/script&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
&lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
This is what it looks like when you `less` the file on your terminal at
page boundaries
&lt;/caption&gt;
&lt;tr&gt;&lt;td&gt;&lt;img src=&quot;/images/skipped_page_boundary_on_disk.png&quot; alt=&quot;page boundary of skipped page&quot; /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
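
&lt;p&gt;For readers without a seastar build at hand, the same strategy can be approximated in a few lines of Python.
This is only a sketch under stated assumptions: it uses buffered &lt;code class=&quot;highlighter-rouge&quot;&gt;os.pwrite&lt;/code&gt; calls from a thread pool
rather than O_DIRECT and seastar, and the 4096-byte page size and the function name are hypothetical.&lt;/p&gt;

```python
import os
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 4096   # assumed page size
PAGES = 10         # total pages in the file
MAX_IN_FLIGHT = 4  # at most 4 page writes dispatched at once


def write_pages_out_of_order(path):
    """Write pages 9..1 through one descriptor, skipping page 0 entirely."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        def write_page(index):
            # Fill each page with its own letter so boundaries are easy to spot.
            body = bytes([ord("a") + index]) * PAGE_SIZE
            # pwrite takes an explicit offset, so writes need not be sequential.
            return os.pwrite(fd, body, index * PAGE_SIZE)

        reverse_order = range(PAGES - 1, 0, -1)  # 9, 8, ..., 1 (worst case)
        with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
            written = list(pool.map(write_page, reverse_order))
    finally:
        os.close(fd)
    return sum(written)
```

&lt;p&gt;The skipped first page is never written, so it reads back as zeros, while pages 1 through 9 each hold their own letter.&lt;/p&gt;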

&lt;p&gt;A follow-up to this post will cover how to &lt;em&gt;safely&lt;/em&gt; dispatch half-written pages and how to distribute the 
lock/semaphore contention - Spoiler alert: &lt;code class=&quot;highlighter-rouge&quot;&gt;jump_consistent_hash()&lt;/code&gt;&lt;/p&gt;

&lt;table style=&quot;padding-top:20px;&quot;&gt;
&lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; padding-bottom:20px;&quot;&gt;
Comparing Lemire's fastrange.h vs Google's jump-consistent-hashing
&lt;/caption&gt;
&lt;tr&gt;&lt;td&gt;&lt;img src=&quot;/images/semaphore_load.png&quot; class=&quot;site-image&quot; alt=&quot;semaphore load comparison&quot; /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;This idea came from a set of unit tests I was writing to prove that my wal_segment behaved - at a high level -
like the &lt;code class=&quot;highlighter-rouge&quot;&gt;seastar::make_file_output_stream()&lt;/code&gt; handle seastar provides.&lt;/p&gt;

&lt;p&gt;The exciting news is that you should expect an update with this concurrency primitive turned on
for &lt;code class=&quot;highlighter-rouge&quot;&gt;smf&lt;/code&gt; in the next month or so.&lt;/p&gt;

&lt;p&gt;Let me know if you found this useful!&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://groups.google.com/forum/#!forum/smf-dev&quot;&gt;Join the smf mailing list&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;appendix&quot;&gt;Appendix&lt;/h2&gt;

&lt;p&gt;The full integration test:&lt;/p&gt;

&lt;script src=&quot;https://gist.github.com/960ab33c64bad14c369bd69d6230a1b8.js&quot;&gt; &lt;/script&gt;

</description>
        <pubDate>Fri, 02 Feb 2018 19:00:00 -0500</pubDate>
        <link>http://alexgallego.org/concurrency/o_direct/2018/02/02/O_DIRECT.html</link>
        <guid isPermaLink="true">http://alexgallego.org/concurrency/o_direct/2018/02/02/O_DIRECT.html</guid>
        
        
        <category>concurrency</category>
        
        <category>O_DIRECT</category>
        
      </item>
    
      <item>
        <title>seastar: the future&lt;&gt; is here</title>
        <description>&lt;p&gt;On June 8, 2016, Avi Kivity came to NYC to present ScyllaDB. 
During his search for a quick open desk to do some work, 
I volunteered open spaces we had at Concord&lt;sup id=&quot;fnref:6&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.
We talked lock-free algorithms, 
memory reclamation techniques, threading models, Concord and distributed
streaming engines, even C vs C++. Five hours later I was convinced that
seastar was the best systems framework I’d ever come across.&lt;/p&gt;

&lt;p&gt;I’ve now been using seastar for almost two years and I haven’t changed my mind.&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h1 id=&quot;the-future-is-all-about-concurrency&quot;&gt;The future&amp;lt;&amp;gt; is all about concurrency.&lt;/h1&gt;

&lt;p&gt;For the truly impatient, the &lt;code class=&quot;highlighter-rouge&quot;&gt;future&amp;lt;&amp;gt;&lt;/code&gt; is here&lt;sup id=&quot;fnref:10&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;In 1978&lt;sup id=&quot;fnref:8&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, T. Hoare prophetically said the future was about computers getting more cores and not 
increasing in clock speed. In 2004, Herb Sutter gave the same trend a name: The Free Lunch Is Over&lt;sup id=&quot;fnref:9&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;.
The &lt;code class=&quot;highlighter-rouge&quot;&gt;seastar::future&amp;lt;&amp;gt;&lt;/code&gt; is a tool to take advantage of multi-core, multi-socket
machines - a way to structure your software to grow gracefully with your hardware.
There are many other tools that fit this new modality, from lock-free algorithms 
to co-routines, to channels&lt;sup id=&quot;fnref:8:1&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, not to mention actor-style message passing, 
among many other paradigms 
like full-on distributed programming languages&lt;sup id=&quot;fnref:12&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; &lt;sup id=&quot;fnref:13&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; &lt;sup id=&quot;fnref:14&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; &lt;sup id=&quot;fnref:15&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;blockquote&gt;

  &lt;p&gt;instead of driving clock speeds and straight-line instruction throughput ever
higher, they are instead turning en masse to hyperthreading and multicore 
architectures 
– &lt;cite&gt;Herb Sutter&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;seastar::future&amp;lt;&amp;gt;’s are for &lt;strong&gt;concurrent software construction&lt;/strong&gt;. In addition, 
their design makes them &lt;em&gt;composable&lt;/em&gt;.
You can take any 2 futures and chain them together via the &lt;code class=&quot;highlighter-rouge&quot;&gt;.then()&lt;/code&gt; 
operator to yield a new future&lt;sup id=&quot;fnref:5&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;. Although you can combine, mix, map-reduce, 
filter, chain, fail, complete, generate, fulfill, sleep, expire futures, etc, they are 
fundamentally about program structure. Such program structure can execute in 
parallel, but doesn’t have to. When you have concurrent structure, 
parallelism is a free variable&lt;sup id=&quot;fnref:11&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;.
That is to say, you can turn up or down the number of &lt;code class=&quot;highlighter-rouge&quot;&gt;simultaneous&lt;/code&gt; execution
units/cores/threads without changing your program. In this paradigm, you worry
about correct program structure and someone else worries about the execution.&lt;/p&gt;
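
&lt;p&gt;To make &quot;parallelism is a free variable&quot; concrete, here is a small sketch that uses Python’s standard 
&lt;code class=&quot;highlighter-rouge&quot;&gt;concurrent.futures&lt;/code&gt; module as a stand-in. It is an analogy only: these are not 
seastar futures, there is no &lt;code class=&quot;highlighter-rouge&quot;&gt;.then()&lt;/code&gt; here, and the function names are made up for the example.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def checksum(page_index):
    # Stand-in for any independent unit of work.
    return sum(range(page_index * 1000))


def run(workers):
    # The program structure is identical for every worker count;
    # only the amount of simultaneous execution changes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(checksum, index) for index in range(8)]
        return [future.result() for future in futures]
```

&lt;p&gt;Running &lt;code class=&quot;highlighter-rouge&quot;&gt;run(1)&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;run(4)&lt;/code&gt; returns the same list; 
the worker count is turned up or down without touching the rest of the program.&lt;/p&gt;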

&lt;table style=&quot;padding-top:20px;&quot;&gt;
&lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; font-size: 11px; padding-bottom:20px;&quot;&gt;
The promise is the producer end of the channel, the 
future is the consumer end. A note to users: when you choose a future library,
you are implicitly choosing a threading model - protect your cacheline.
&lt;/caption&gt;
&lt;tr&gt;&lt;td&gt;&lt;img src=&quot;/images/future_channel.png&quot; alt=&quot;Future producer consumer&quot; /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;&lt;br /&gt;
Seastar is
an &lt;em&gt;intrusive&lt;/em&gt; building block. Once you start composing seastar-driven
asynchronous building blocks, 
you have to go out of your way - &lt;em&gt;really&lt;/em&gt; - to build anything synchronous, 
and that’s powerful. 
Structurally, seastar has the same effect as actor frameworks like
Akka&lt;sup id=&quot;fnref:12:1&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;, Orleans&lt;sup id=&quot;fnref:13:1&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;, or even languages like 
Pony&lt;sup id=&quot;fnref:14:1&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; or Erlang&lt;sup id=&quot;fnref:15:1&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; have. 
Once you have one actor, actors spread &lt;strong&gt;virally&lt;/strong&gt; through your system,
making everything an actor.&lt;/p&gt;

&lt;p&gt;Philosophically, actor frameworks and distributed languages differ from seastar. 
While the former try to give the programmer higher abstractions and a runtime
to hide machine details like IO or CPU scheduling, seastar takes the opposite 
approach. It gives you - the wise programmer - abilities to tune and control
almost every part of the future&amp;lt;&amp;gt; runtime. This includes IO shares scheduling and 
CPU shares scheduling, in addition to a batteries-included approach when it comes
to taking advantage of hardware for dealing with filesystems, networking, DMA, etc.&lt;/p&gt;

&lt;p&gt;Both approaches, however, are intrinsically safe. The programmer worries about 
correctness and 
construction while the frameworks worry about efficient execution. Counter to 
general wisdom, the asynchronous approach is actually faster and more scalable than the synchronous 
approach. While the machine does more work, it is executing your code simultaneously.
This simultaneity is the key to finishing work sooner.&lt;/p&gt;

&lt;p&gt;At its core, from the project site, seastar promises:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Shared-nothing design&lt;sup id=&quot;fnref:16&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;High-performance networking&lt;sup id=&quot;fnref:17&quot;&gt;&lt;a href=&quot;#fn:17&quot; class=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Futures and promises&lt;sup id=&quot;fnref:18&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Message passing&lt;sup id=&quot;fnref:19&quot;&gt;&lt;a href=&quot;#fn:19&quot; class=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;… but it is much more, so let’s get technical and find out how Seastar
executes these concurrency building blocks.&lt;/p&gt;

&lt;h1 id=&quot;enter-seastar-at-your-own-risk-you-might-not-come-back&quot;&gt;Enter Seastar… at your own risk, you might not come back&lt;/h1&gt;

&lt;p&gt;In a past life, I helped build Concord.io with facebook’s folly::futures&lt;sup id=&quot;fnref:20&quot;&gt;&lt;a href=&quot;#fn:20&quot; class=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt;, and 
wangle&lt;sup id=&quot;fnref:21&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt; for networking and async execution. While these libraries enabled us
to deliver high performance code using similar primitives, their use of asynchronous
operations is not as pervasive as that of seastar. 
They are libraries and not frameworks, which is the first distinction. That is, you 
can use the parts of the libraries that you need without needing to include or use 
the rest. You can tick your own clocks, your own IOEventLoops, your own CPU Scheduling, 
your own &lt;code class=&quot;highlighter-rouge&quot;&gt;syscall()&lt;/code&gt; thread pool, etc. Seastar, on the contrary, tells you that 
you have to operate within its framework. It is not possible to take parts of seastar
and use them in your code base without the IO subsystem or the CPU subsystem.&lt;/p&gt;

&lt;p&gt;While this decision seems like a disadvantage, it is actually an enforcer of 
asynchronicity - very much like actors. It is front and center to everything 
you do. This is a good thing.&lt;/p&gt;

&lt;h1 id=&quot;no-locks-atomics-cache-polluting-primitives&quot;&gt;No locks, atomics, cache polluting primitives&lt;/h1&gt;

&lt;p&gt;Seastar takes an extreme approach to data locality. It uses almost no locks, atomics, 
or any other form of implicit memory sharing with other cores. Your view into any application
starts with a &lt;code class=&quot;highlighter-rouge&quot;&gt;seastar::distributed&amp;lt;T&amp;gt;&lt;/code&gt; type. This means each core gets its own thread-local copy of &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;They of course cover all the basics for high performance applications:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Small type optimizations (although &lt;code class=&quot;highlighter-rouge&quot;&gt;seastar::small_set&amp;lt;T&amp;gt; and seastar::small_map&amp;lt;K,V&amp;gt;&lt;/code&gt; are missing).&lt;/li&gt;
  &lt;li&gt;Non thread safe non-polymorphic shared pointer (local to core) via &lt;code class=&quot;highlighter-rouge&quot;&gt;seastar::lw_shared_ptr&amp;lt;T&amp;gt;&lt;/code&gt;&lt;sup id=&quot;fnref:22&quot;&gt;&lt;a href=&quot;#fn:22&quot; class=&quot;footnote&quot;&gt;17&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Non thread safe polymorphic shared pointer (local to core) via &lt;code class=&quot;highlighter-rouge&quot;&gt;seastar::shared_ptr&amp;lt;T&amp;gt;&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;String with small type optimizations&lt;sup id=&quot;fnref:23&quot;&gt;&lt;a href=&quot;#fn:23&quot; class=&quot;footnote&quot;&gt;18&lt;/a&gt;&lt;/sup&gt; and without atomics like the libc++&lt;sup id=&quot;fnref:24&quot;&gt;&lt;a href=&quot;#fn:24&quot; class=&quot;footnote&quot;&gt;19&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Move only bag-o’-bytes&lt;sup id=&quot;fnref:25&quot;&gt;&lt;a href=&quot;#fn:25&quot; class=&quot;footnote&quot;&gt;20&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Circular buffers&lt;/li&gt;
  &lt;li&gt;Linux DAIO&lt;/li&gt;
  &lt;li&gt;and many many more! 🎉🎉🎉🎉&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src=&quot;/images/moar_details.jpg&quot; alt=&quot;moar details&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h2 id=&quot;a-mental-model&quot;&gt;A mental model&lt;/h2&gt;

&lt;table class=&quot;seastar_model_img&quot; style=&quot;padding-top:20px;&quot;&gt;
&lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; font-size: 11px; padding-bottom:20px;&quot;&gt;
Figure 1: Seastar Mental Model.
Everything in seastar happens in a `thread_local' (per hyper-thread) with the 
exception of explicit cross core communication. 
As with all mental models, this is a simplification and omits details. 
&lt;/caption&gt;
&lt;tr&gt;&lt;td&gt;&lt;img src=&quot;/images/seastar_model.png&quot; alt=&quot;seastar mental model&quot; /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;h1 id=&quot;irl-impact&quot;&gt;IRL Impact&lt;/h1&gt;

&lt;p&gt;I’ve been using seastar for a year and a half on a project called
&lt;strong&gt;smf&lt;/strong&gt;&lt;sup id=&quot;fnref:26&quot;&gt;&lt;a href=&quot;#fn:26&quot; class=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt; and it has been eye-opening.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;➜  smf git:&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;master&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; cloc &lt;span class=&quot;nt&quot;&gt;--vcs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;git &lt;span class=&quot;nb&quot;&gt;.&lt;/span&gt;                                                       
&lt;span class=&quot;nt&quot;&gt;--------------------------------------------------------------------------------&lt;/span&gt;
Language                      files          blank        comment           code
&lt;span class=&quot;nt&quot;&gt;--------------------------------------------------------------------------------&lt;/span&gt;
C++                              61            712            512           3607
C/C++ Header                     77            650            824           2761&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;smf is a set of libraries and utilities 
(like boost:: for C++ or guava for Java) designed to be the 
building blocks of your next distributed system.&lt;/p&gt;

&lt;p&gt;Current benchmarks in microseconds make smf’s RPC (seastar-backed through DPDK) the 
lowest tail latency system I’ve tested - including gRPC, Thrift, Cap’n Proto, 
etc. What matters, however, is not that I’ve managed to build a fast RPC, but 
the fact that doing it with seastar was no more work than doing the same thing
with facebook::folly and facebook::wangle, boost::asio, or libevent.&lt;/p&gt;

&lt;table class=&quot;smf_rpc_img&quot; style=&quot;padding-top:20px;&quot;&gt;
&lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; font-size: 11px; padding-bottom:20px;&quot;&gt;
Figure 3: smf end-to-end latency.
&lt;/caption&gt;
&lt;tr&gt;&lt;td&gt;&lt;img src=&quot;/images/annotated_rpc.png&quot; class=&quot;site-image&quot; alt=&quot;smf rpc end to end latency&quot; /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;In addition to the RPC, smf has its own Write Ahead Log (WAL).&lt;/p&gt;

&lt;p&gt;It is a write ahead log with an interface modeled after Apache Kafka or 
Apache Pulsar. It has topics, partitions, etc. It is designed to have a
single reader/writer per topic/partition.&lt;/p&gt;

&lt;p&gt;Current benchmarks in milliseconds ==&amp;gt; 41X faster than Apache Kafka&lt;/p&gt;

&lt;table class=&quot;smf_wal_img&quot; style=&quot;padding-top:20px;&quot;&gt;
&lt;caption align=&quot;bottom&quot; style=&quot;font-weight: bold; font-size: 11px; padding-bottom:20px;&quot;&gt;
Figure 4: smf vs kafka - 3 producers.
&lt;/caption&gt;
&lt;tr&gt;&lt;td&gt;&lt;img src=&quot;/images/wal_kafka_latency_vs_smf_3producers.png&quot; class=&quot;site-image&quot; alt=&quot;smf WAL 3 producers comparison vs kafka 3 producers&quot; /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;&lt;br /&gt;
These massive gains should be expected of many server side applications.&lt;/p&gt;

&lt;p&gt;If you are looking for an even more batteries included system, please stop by 
and ask us questions on our mailing list.
&lt;br /&gt;&lt;/p&gt;

&lt;h1 id=&quot;-join-the-mailing-list-&quot;&gt;&lt;a href=&quot;https://groups.google.com/forum/#!forum/smf-dev&quot;&gt; Join The Mailing List &lt;/a&gt;&lt;/h1&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Special thanks to &lt;a href=&quot;https://twitter.com/duarte_nunes&quot;&gt;Duarte Nunes&lt;/a&gt; and Sarah Rohrbach for reading
drafts of this post.
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;
&lt;h1 id=&quot;references&quot;&gt;References&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:6&quot;&gt;
      &lt;p&gt;concord - my previous startup - http://concord.io &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot;&gt;
      &lt;p&gt;future header file - https://github.com/scylladb/seastar/blob/master/core/future.hh &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot;&gt;
      &lt;p&gt;csp - t hoare - http://weblab.cs.uml.edu/~bill/cs515/CSP_Hoare_78.pdf &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:8:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot;&gt;
      &lt;p&gt;herb sutter  - free lunch is over - https://www.cs.utexas.edu/~lin/cs380p/Free_Lunch.pdf &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot;&gt;
      &lt;p&gt;akka - actor framework for the jvm - http://akka.io/ &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:12:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot;&gt;
      &lt;p&gt;orleans - actor framework by msft - https://github.com/dotnet/orleans &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:13:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot;&gt;
      &lt;p&gt;pony - actor language - https://www.ponylang.org/ &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:14:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot;&gt;
      &lt;p&gt;erlang - distributed programming lang - https://www.erlang.org/ &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:15:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot;&gt;
      &lt;p&gt;continuations docs - https://github.com/scylladb/seastar/blob/master/doc/tutorial.md#continuations &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot;&gt;
      &lt;p&gt;parallelism is a free variable - rob pike - https://www.youtube.com/watch?v=cN_DpYBzKso &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot;&gt;
      &lt;p&gt;seastar shared nothing - http://www.seastar-project.org/shared-nothing/ &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:17&quot;&gt;
      &lt;p&gt;seastar networking - http://www.seastar-project.org/networking/ &lt;a href=&quot;#fnref:17&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:18&quot;&gt;
      &lt;p&gt;seastar promises - http://www.seastar-project.org/futures-promises/ &lt;a href=&quot;#fnref:18&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:19&quot;&gt;
      &lt;p&gt;seastar message passing - http://www.seastar-project.org/message-passing/ &lt;a href=&quot;#fnref:19&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:20&quot;&gt;
      &lt;p&gt;facebook’s folly::futures - https://github.com/facebook/folly &lt;a href=&quot;#fnref:20&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot;&gt;
      &lt;p&gt;facebook wangle - https://github.com/facebook/wangle &lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:22&quot;&gt;
      &lt;p&gt;non-thread-safe shared ptr - https://github.com/scylladb/seastar/blob/master/core/shared_ptr.hh &lt;a href=&quot;#fnref:22&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:23&quot;&gt;
      &lt;p&gt;seastar::sstring - string with small string optimization - https://github.com/scylladb/seastar/blob/40a68fa50ebeeb17cd3797af7cddbbcdf07ce61a/core/sstring.hh &lt;a href=&quot;#fnref:23&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:24&quot;&gt;
      &lt;p&gt;std::basic_string - https://gcc.gnu.org/onlinedocs/gcc-6.2.0/libstdc++/api/a01076_source.html &lt;a href=&quot;#fnref:24&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:25&quot;&gt;
      &lt;p&gt;seastar::temporary_buffer - https://github.com/scylladb/seastar/blob/743723fc79d8f40a926908181026a709a8cbe719/core/temporary_buffer.hh &lt;a href=&quot;#fnref:25&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:26&quot;&gt;
      &lt;p&gt;smf - the fastest RPC in the west - https://github.com/senior7515/smf &lt;a href=&quot;#fnref:26&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
        <pubDate>Sat, 16 Dec 2017 19:00:00 -0500</pubDate>
        <link>http://alexgallego.org/concurrency/smf/2017/12/16/future.html</link>
        <guid isPermaLink="true">http://alexgallego.org/concurrency/smf/2017/12/16/future.html</guid>
        
        
        <category>concurrency</category>
        
        <category>smf</category>
        
      </item>
    
      <item>
        <title>Emacs: No modeline</title>
        <description>&lt;h1 id=&quot;tldr-distraction-free-writing&quot;&gt;tl;dr: distraction-free writing&lt;/h1&gt;

&lt;p&gt;For the impatient, here is how my emacs looks now:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/emacs_no_modeline.png&quot; alt=&quot;emacs&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Having less fluff helps me concentrate. I don’t really need
scroll bars, copy and paste buttons, file open/close, etc.&lt;/p&gt;

&lt;p&gt;When I’m in emacs, I spend over 80% of my time reading code.
I’m optimizing for that use case.&lt;/p&gt;

&lt;p&gt;My emacs opens up fully maximized and I immediately hit F11 because I’m not
even interested in having edges. Really, what’s the point of looking
at the scroll bar?&lt;/p&gt;

&lt;h1 id=&quot;dont-worry-it-is-still-fully-functional&quot;&gt;Don’t worry, it is still fully functional…&lt;/h1&gt;

&lt;p&gt;&lt;img src=&quot;/images/emacs_no_modeline.gif&quot; alt=&quot;emacs&quot; /&gt;&lt;/p&gt;

&lt;h1 id=&quot;emacs-you-bae&quot;&gt;Emacs! you bae&lt;/h1&gt;

&lt;p&gt;I’ve been tweaking my emacs setup for almost 7 years now. My operating systems
instructor used it once and he was a hero to me - I thought,
as did every other kid in the class, that if I used this editor I would be as good.&lt;/p&gt;

&lt;p&gt;During my internships people were always surprised I was using emacs.
I was pretty much stuck. I couldn’t go back to eclipse, netbeans, jetbrains,
visual studio, etc. Seven years have passed and I am
&lt;strong&gt;so very slow&lt;/strong&gt; with the default setup. I have to go through the manual
to figure out how to pipe things from a shell into my buffer.&lt;/p&gt;

&lt;p&gt;I suppose people realize they are old(er?) when they start to reminisce about
the good ol’ days - e.g. remembering floppy disks, when you had 2 disk entries,
one for your operating system and one for storage… or something like that.
I realize I’m older because I have about 400 key mappings
that are mode dependent. I promise I also do regular/productive work
outside of tweaking my setup.&lt;/p&gt;

&lt;p&gt;I empathize with the ‘pursuit of perfection for your editing life’. Three years ago,
for example, I wrote an RPC client/server between my emacs and a Go (back at
rc-59 when it wasn’t cool and hip) server to get pprof data into my
buffer directly. I dumped it because I decided to just curl
&lt;code class=&quot;highlighter-rouge&quot;&gt;http://localhost:6060/debug/pprof/&lt;/code&gt; into a buffer instead - yay! for
(shell-command-on-region) &lt;code class=&quot;highlighter-rouge&quot;&gt;M-| &lt;/code&gt;&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;cloc ~/.emacs.d/agallego/

http://cloc.sourceforge.net v 1.60  &lt;span class=&quot;nv&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;0.19 s &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;160.6 files/s, 35393.7 lines/s&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;--------------------------------------------------------------------------------&lt;/span&gt;
Language                      files          blank        comment           code
&lt;span class=&quot;nt&quot;&gt;--------------------------------------------------------------------------------&lt;/span&gt;
.....
Lisp                             27            351            442           1056
....
&lt;span class=&quot;nt&quot;&gt;--------------------------------------------------------------------------------&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;That is only about a thousand lines, though, and they are mostly &lt;strong&gt;configuration&lt;/strong&gt;
around these packages:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;defvar&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my-packages&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;thrift&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scala-mode2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; scala&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sbt-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;         &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; ensime depends on this&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ensime&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;           &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; for scala&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scss-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; scss lang&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;haskell-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;     &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; haskell lang&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;protobuf-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; protobuf lang&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;simplezen&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; zencoding for html but much simpler&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yaml-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; .yaml files&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;js2-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;         &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; js files -- auto lints&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yasnippet&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; insert snippets on my code with tab&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;elisp-slime-nav&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; elisp&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;use-package&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cpputils-cmake&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; cmake utils for navigating code&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmake-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;markdown-mode+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;   &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; for markdown&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;multi-web-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;   &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; js, css &amp;amp; html in the same buffer&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jsx-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jsfmt&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;string-inflection&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;   &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; snake_case -&amp;gt; UPPER -&amp;gt; camelCase&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;zygospore&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;           &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; revert delete other windows w/ c-x 1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; god-mode            ;; vi-like navigation&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;demangle-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; demangle c++ symbols using c++filt&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vlf&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; view very large files&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ws-butler&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; clean up only modified lines for whitespace&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clean-aindent-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; better auto indent&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer-move&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; swap buffer position&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shift-text&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;          &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; for left,right,up,down blocks shift&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clang-format&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;google-c-style&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;company&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;goto-last-change&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;web-beautify&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; for unminifying javascript,etc&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;               &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; incremental search&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-ag&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; helm for ag, the silver searcher&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-c-yasnippet&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;   &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-helm-commands&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; help with common helm commands&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-ls-git&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; inc search on git repos&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-ls-hg&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; for mercurial repos&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-git-grep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; grep in git repos through helm -&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-swoop&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; groups similar word in buffer&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-flx&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;         &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; flx matching on helm results.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;helm-hayoo&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; haskell package search&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;anzu&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;             &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; counts the occurrences for isearch&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nyan-mode&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; progress bar for buffer scroll&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;magit&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;            &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; git interface for emacs&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;git-timemachine&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;git-messenger&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mark-multiple&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; allow for multiple marks on a buffer&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;undo-tree&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; allow to undo changes on a file&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;elscreen&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;         &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; allow to split screens in tabs&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yagist&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;           &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; post to github gists&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expand-region&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; grow the selection by semantic units&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;avy&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;              &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; jump to anything you see with one keystroke&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ace-window&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; jump like ace but for windows&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rainbow-delimiters&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; colors parens&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;browse-kill-ring&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; amazing visual kill ring search&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exec-path-from-shell&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; fix mac not reading the shell issue&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diminish&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;         &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;;  cleans up your powerline&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;smartparens&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;      &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; themes&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color-theme-sanityinc-solarized&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spacemacs-theme&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cyberpunk-theme&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; high contrast&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gandalf-theme&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; blue'ish background - day theme&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vimgolf&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;;; fun! game&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;A list of packages to ensure are installed at launch.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h1 id=&quot;it-all-started-with-an-openvms-terminal&quot;&gt;It all started with an OpenVMS terminal&lt;/h1&gt;

&lt;p&gt;During one of my internships I had to ssh into these &lt;strong&gt;OpenVMS&lt;/strong&gt; machines.
Apart from the terrifying operating system and weird &lt;code class=&quot;highlighter-rouge&quot;&gt;command /line /switches=1&lt;/code&gt;,
I really enjoyed having one full terminal for my editor. You F11 your way into a
pretty enjoyable and quiet programming environment. At the time we were using
emacs 18 or 19 - so broken.&lt;/p&gt;

&lt;p&gt;It took me about a month to get syntax highlighting. I remember having
conversations with people that used nano (I’m almost certain.
Definitely not emacs or vi) because…
well… getting emacs to work in this environment was just close to impossible.&lt;/p&gt;

&lt;p&gt;Honestly, I still had doubts about using emacs then. I first started
enjoying emacs when I realized that the results of searches would appear
in your buffer and you could edit them as text. It had me at &lt;code class=&quot;highlighter-rouge&quot;&gt;find-grep-dired&lt;/code&gt;!&lt;/p&gt;

&lt;p&gt;About 4 years ago, I decided to do all my work over git indexes :boom:
You have no idea how productive you will be once you force your directories to
have a convention. So much so that when a project doesn’t have a git index, I
manually add one, along with all the files in it, so I can use &lt;code class=&quot;highlighter-rouge&quot;&gt;git-grep&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;git ls-files&lt;/code&gt;. See my other post about &lt;a href=&quot;/emacs/helm/2015/02/02/helm-notational-velocity.html&quot;&gt;notational-velocity-clone&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;give-me-the-codez---add-hook-prog-mode-hook-no-mini-buffer&quot;&gt;Give me the codez - (add-hook ‘prog-mode-hook ‘no-mini-buffer)&lt;/h1&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;add-hook&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;'prog-mode-hook&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
          &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;lambda&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
              &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;setq&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode-line-format&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
              &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;setq-default&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode-line-format&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
              &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
          &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This is just a minor improvement on the UI. It makes a significant difference
on my laptop. At work, my monitor has 4K resolution and the amount of real estate
that the mode-line covers is pretty insignificant.&lt;/p&gt;
</description>
        <pubDate>Sat, 16 Jan 2016 19:00:00 -0500</pubDate>
        <link>http://alexgallego.org/emacs/productivity/2016/01/16/emacs-no-modeline.html</link>
        <guid isPermaLink="true">http://alexgallego.org/emacs/productivity/2016/01/16/emacs-no-modeline.html</guid>
        
        
        <category>emacs</category>
        
        <category>productivity</category>
        
      </item>
    
      <item>
        <title>The size of the code deployed matters</title>
        <description>&lt;h1 id=&quot;infrastructure-as-code&quot;&gt;Infrastructure as code?&lt;/h1&gt;

&lt;p&gt;I had an interesting conversation with a friend of mine
&lt;a href=&quot;https://twitter.com/dtcb&quot;&gt;@dtcb&lt;/a&gt; about the size of the code deployed.&lt;/p&gt;

&lt;p&gt;In all of the infrastructure efforts I’ve been part of for the last 6
years the size of the code deployed grew, sometimes to the point that the
operational complexity was so high that the code had to be broken apart.
This is in line with the thought process of micro-services as an
architectural solution to break the complexity of components from a
monolithic architecture to a piece-meal composition that in turn is
easier to deploy and debug.&lt;/p&gt;

&lt;p&gt;What struck me at that time is that micro services might in fact be
too big for deployment. What if instead you could deploy classes? Literally,
a simple interface that did one (or very few) things well. What if your
operating system was in a way an API to deploy services (it is!) but the
size of the code deployed was so small that it would in turn be hard to make
mistakes? What if each of these classes had built in monitoring,
tracing and debugging hooks? What if each running operator (see where I’m going?)
had the ability to run multiple versions of itself? What if you allowed a
higher level runtime (higher than the programming language runtime) to worry
about the performance and gluing of components?&lt;/p&gt;

&lt;p&gt;In a way this is exactly what a CQRS/Streaming architecture affords the
developer. Notice that I mentioned ‘architecture’ and not
‘samza, storm, etc’ because as of today all these services require you to deploy
a fat jar or tar.gz file that contains the glue itself. It is not left
up to a higher level construct to build up the ‘Computational Topology’
of your code. As a developer you specify (in pseudo code):&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scala&quot; data-lang=&quot;scala&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// assuming an aggregation here.
//
&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;topology&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_by&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;field&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;typically&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;usually&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;commutative&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;associative&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;like&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;save&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;database&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;emit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;next&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tuple&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;chain&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;more&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;operators&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;.....&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
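
&lt;p&gt;To make the pseudo code above a bit more concrete, here is a minimal in-memory sketch in Scala.
Every name in it (&lt;code class=&quot;highlighter-rouge&quot;&gt;Event&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;Topology&lt;/code&gt;,
&lt;code class=&quot;highlighter-rouge&quot;&gt;groupBy&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;operator&lt;/code&gt;,
&lt;code class=&quot;highlighter-rouge&quot;&gt;save&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;emit&lt;/code&gt;) is hypothetical
and is not the API of samza, storm or any other framework, and the ‘database’ is assumed to be just a mutable map.
It only illustrates the shape of the chain: group by a key, reduce each group with a commutative and
associative operator, save the result, then emit the next tuples.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scala&quot; data-lang=&quot;scala&quot;&gt;import scala.collection.mutable

// Hypothetical sketch: none of these types belong to a real streaming framework.
object TopologySketch {
  final case class Event(userId: String, amount: Int)

  // Entry point of the chain: holds the incoming events.
  final class Topology(events: Seq[Event]) {
    // Partition the events by some field, typically the user id.
    def groupBy(key: Event =&amp;gt; String): Grouped = new Grouped(events.groupBy(key))
  }

  final class Grouped(groups: Map[String, Seq[Event]]) {
    // Reduce each group with a commutative and associative function, e.g. a sum.
    // Groups produced by groupBy are never empty, so reduce is safe here.
    def operator(value: Event =&amp;gt; Int)(combine: (Int, Int) =&amp;gt; Int): Aggregated =
      new Aggregated(groups.map { case (k, es) =&amp;gt; k -&amp;gt; es.map(value).reduce(combine) })
  }

  final class Aggregated(totals: Map[String, Int]) {
    // Stand-in for a database write: copy the totals into a mutable map.
    def save(db: mutable.Map[String, Int]): Aggregated = { db ++= totals; this }
    // Hand the aggregated tuples to whatever operator comes next, sorted by key.
    def emit: Seq[(String, Int)] = totals.toSeq.sortBy(_._1)
  }

  def main(args: Array[String]): Unit = {
    val database = mutable.Map.empty[String, Int]
    val events = Seq(Event(&quot;alice&quot;, 3), Event(&quot;bob&quot;, 5), Event(&quot;alice&quot;, 4))
    val next = new Topology(events)
      .groupBy(_.userId)
      .operator(_.amount)(_ + _)
      .save(database)
      .emit
    // Prints one line per user: alice -&amp;gt; 7, then bob -&amp;gt; 5
    next.foreach { case (user, total) =&amp;gt; println(s&quot;$user -&amp;gt; $total&quot;) }
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Because the operator is commutative and associative, a runtime would be free to split each group
across machines and combine the partial results in any order. That is the part this single-process
sketch leaves out.&lt;/p&gt;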

&lt;h2 id=&quot;so-where-does-the-infrastructure-as-code-comes-into-play&quot;&gt;So where does the ‘infrastructure as code’ come into play?&lt;/h2&gt;

&lt;p&gt;Imagine some magical command line tool that allowed you
to deploy a single class/interface and run it, regardless of the programming
language. More important than just ‘running’ your code,
it would also manage the connections between
components.&lt;/p&gt;

&lt;p&gt;What if you could write a source of data in Python and then have a
heavy machine learning algorithm in Java? What if you empowered each team
in an organization to deploy code, and the code carried enough information
(via some implicit metadata) about the size of the job and how
it would in turn execute in physical space? If you achieve this goal
you have in turn achieved ‘infrastructure as code’. You would have the
ability to modify your running infrastructure by running more code.&lt;/p&gt;

&lt;p&gt;For example, suppose you started with code as follows:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scala&quot; data-lang=&quot;scala&quot;&gt;  &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;topology&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;addStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hdfs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;://&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;println&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;and then you added a simple filter:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-scala&quot; data-lang=&quot;scala&quot;&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;topology&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;addStream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hdfs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;://&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;blue you know you're my boy!&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;println&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;What if, instead of just tearing everything down, there was a system that
allowed you to modify the runtime of this application to run
the filter function &lt;strong&gt;either in the same process or on the same machine
or on a series of machines without having to take down your running code?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I do know that the erlang vm allows you to do byte code deployment on a
running vm. And I know that spark gives you this native (scala) interface
for dealing with streams in multiple machines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I want both&lt;/strong&gt; - really, I want the ability to redeploy the map function
at runtime, pausing all its upstream operators, deploy a new one
and keep producing. I &lt;strong&gt;also&lt;/strong&gt; want the ability to add some arbitrary computation
graph above the mapping function &lt;strong&gt;without&lt;/strong&gt; affecting anything else. In fact
I don’t even want to write them in the same file. I just want to refer to
these functions by name.&lt;/p&gt;
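
&lt;p&gt;To make the ‘refer to functions by name’ idea concrete, here is a minimal single-process sketch in Python. The &lt;code class=&quot;highlighter-rouge&quot;&gt;OPERATORS&lt;/code&gt; registry and the &lt;code class=&quot;highlighter-rouge&quot;&gt;deploy&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; helpers are hypothetical names made up for illustration, not part of any real system. A real deployment would also have to pause upstream operators and move work across machines, which this toy ignores.&lt;/p&gt;

```python
# Hypothetical sketch: none of these names come from a real library.

# Registry mapping an operator name to its current implementation.
OPERATORS = {}


def deploy(name, fn):
    """Register or replace the implementation behind a name."""
    OPERATORS[name] = fn


def run(pipeline, records):
    """Push each record through the operators named in the pipeline.

    The lookup happens per record, so a redeploy takes effect on the
    next record without rebuilding the pipeline.
    """
    out = []
    for record in records:
        for name in pipeline:
            record = OPERATORS[name](record)
        out.append(record)
    return out


deploy("shout", lambda s: s.upper())
pipeline = ["shout"]
first = run(pipeline, ["blue"])

# Redeploy the operator behind the same name; the pipeline is untouched.
deploy("shout", lambda s: s.upper() + "!")
second = run(pipeline, ["blue"])
```

&lt;p&gt;Because the pipeline stores names rather than function objects, the second run picks up the new implementation without being rebuilt.&lt;/p&gt;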

&lt;p&gt;I suppose I’m trying to solve &lt;strong&gt;some&lt;/strong&gt; of these challenges. I think that
as of today we don’t have the tools (at least I don’t know of them)
that would allow us to have this simple programming paradigm, but we have
&lt;strong&gt;some&lt;/strong&gt; of the tools.&lt;/p&gt;

&lt;p&gt;Obviously not complete, here is a sample of the system I’m working on:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MockTweetComputation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;milliseconds_now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;process_timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tweet&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen_tweets&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;produce_record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'tweets'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'_'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;thrift_to_bytes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tweet&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;milliseconds_now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rbonut&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Metadata&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'mock-tweets-generator'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;istreams&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ostreams&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'tweets'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h1 id=&quot;youve-figured-it-out&quot;&gt;You’ve figured it out!&lt;/h1&gt;

&lt;p&gt;Essentially a globally unique pub-sub system gives you the flexibility I’m
talking about here. A pub-sub system would allow you to refer
to functions by name (same as a program), and the arguments to your function
would be explicitly declared, as would the output of your program, such that
some scheduler/oracle could co-locate/inline/scale work for you.&lt;/p&gt;

&lt;p&gt;In brief, the above operator is something like this in C++:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-c&quot; data-lang=&quot;c&quot;&gt;&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tweet&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;140&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tweet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mockTweetGenerator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tweet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;retval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// ... do lots of work&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;retval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Notice that the istreams of mock-tweets-generator is an empty list.
However, the ostreams (return arguments) is a list of one ‘type’.&lt;/p&gt;
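
&lt;p&gt;As a rough sketch of how a scheduler could use those declarations, the Python below wires computations together purely by stream name. &lt;code class=&quot;highlighter-rouge&quot;&gt;Metadata&lt;/code&gt; here is a stand-in namedtuple, and &lt;code class=&quot;highlighter-rouge&quot;&gt;Scheduler&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;tweet-printer&lt;/code&gt; are hypothetical names. This is an assumption about the general shape of such a system, not the actual implementation.&lt;/p&gt;

```python
# Hypothetical sketch; Metadata and Scheduler are illustrative names only.
from collections import defaultdict, namedtuple

Metadata = namedtuple("Metadata", ["name", "istreams", "ostreams"])


class Scheduler:
    """Connects computations purely by the stream names they declare."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # stream name to computation names
        self.delivered = []                   # log of (computation name, record)

    def register(self, metadata):
        # A computation subscribes to every stream listed in its istreams.
        for stream in metadata.istreams:
            self.subscribers[stream].append(metadata.name)

    def produce(self, stream, record):
        # Fan the record out to every computation subscribed to the stream.
        for name in self.subscribers[stream]:
            self.delivered.append((name, record))


scheduler = Scheduler()
scheduler.register(Metadata("mock-tweets-generator", [], ["tweets"]))
scheduler.register(Metadata("tweet-printer", ["tweets"], []))
scheduler.produce("tweets", "hello")
```

&lt;p&gt;The generator declares no inputs, so it never receives records. The printer is found only because it names the same stream, which is what would let a scheduler decide where to place each piece.&lt;/p&gt;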

&lt;h1 id=&quot;ok-so-whats-that-about-code-size&quot;&gt;Ok… so what’s that about code size?&lt;/h1&gt;

&lt;p&gt;Well.. the argument is that by making the code small, you’ll make fewer
mistakes. You’ll be able to iterate faster on your problems. Same argument
that people have been making for decades.&lt;/p&gt;

&lt;p&gt;What is exciting for me is that we - at work - built a system
that actually empowers you to do just that. For &lt;strong&gt;all&lt;/strong&gt; programming
languages and through a simple pub-sub model you can effectively
change how you chain complex computational models - with some small
added overhead.&lt;/p&gt;

&lt;p&gt;We built the glue, and hopefully our application developers can
just build the application and deploy code without fear. Deploy code fast.&lt;/p&gt;

&lt;p&gt;Note: It is clear to me that this paradigm doesn’t work for all the problems
in the world. You would need a unified runtime and a unified programming
language, which is &lt;strong&gt;&lt;em&gt;almost&lt;/em&gt;&lt;/strong&gt; what erlang gives you. However, this would defeat
one of our goals which is to allow &lt;em&gt;anyone&lt;/em&gt; with any programming
language to take advantage of this model.&lt;/p&gt;

&lt;p&gt;Soon, I’ll talk about our isolation, message guarantees, etc.&lt;/p&gt;

&lt;h1 id=&quot;thoughts---please-let-me-know&quot;&gt;Thoughts? - please let me know!&lt;/h1&gt;
</description>
        <pubDate>Sat, 28 Mar 2015 20:00:00 -0400</pubDate>
        <link>http://alexgallego.org/random/thoughts/2015/03/28/infrastructure-as-code.html</link>
        <guid isPermaLink="true">http://alexgallego.org/random/thoughts/2015/03/28/infrastructure-as-code.html</guid>
        
        
        <category>random</category>
        
        <category>thoughts</category>
        
      </item>
    
      <item>
        <title>Productionizing storm is difficult</title>
        <description>&lt;p&gt;Today I got a call from a friend using storm in production:&lt;/p&gt;

&lt;p&gt;‘Yo I restarted the supervisor process, can’t get the topology
to pick up the new Kafka node while reading it. The producers
are working fine, but there are no consuming nodes yet.’&lt;/p&gt;

&lt;p&gt;First, there is a huge mismatch of expectations there. My friend
didn’t know that you never really need to restart the supervisor
nodes (most of the time…). Second, when you deploy a
storm topology, nimbus creates a &lt;code class=&quot;highlighter-rouge&quot;&gt;.ser&lt;/code&gt; file which contains
all of the environment (.property files for example, log4j.XML etc).
Then this file gets deployed to all supervising nodes that are
going to execute your code.&lt;/p&gt;

&lt;p&gt;The issue is that his intuition of just killing the
process that consumes the properties file and reads from Kafka matches
that of local development. It’s easy to simply kill -9 anything
on your computer and… bam! restart it… (minus the implication
of possible data loss – &lt;code class=&quot;highlighter-rouge&quot;&gt;read_from_start=true&lt;/code&gt; was given)&lt;/p&gt;

&lt;h2 id=&quot;obviously-running-a-distributed-computation-is-not-trivial&quot;&gt;Obviously running a distributed computation is not trivial.&lt;/h2&gt;

&lt;p&gt;Here are a few reasons:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Restarting a process that interacted with external state
(very typical) may not produce the same results when you need to
replay records.&lt;/p&gt;

    &lt;p&gt;In fact, you can’t rely on the results being the same given the presence of
arbitrary external state. period.&lt;/p&gt;

    &lt;p&gt;There is a huge value in the speedy processing of records and them being
processed correctly with enough information for debugging. That’s the
main reason for using a stream processor. Timely and correct processing.&lt;/p&gt;

    &lt;p&gt;Interacting with external state is hard. supa’ hard.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Usually all stream processors provide very different tools for reasoning
about distributing work and processing work locally:&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;Exceptions are hard to reason about in a distributed system:
Do you restart the entire topology? Do you restart the node?
Do you just crash and let the containerizer clean up your resources
and reschedule your worker somewhere else in the cluster, losing data
locality? The list is long.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;Restarting remote processes has very different semantics than restarting local procs.
If the system has no way of detecting failure and replaying records
then you just lost data. If the error propagates upstream and the
source of data cannot replay data, you are f**ed. You just lost data
which usually translates to real money.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;Deploying code in common streaming systems is really awkward. First
you have to plan for downtime. The reason is that you usually can’t
deploy code piecemeal. You need to deploy code in big chunks (DAGs)
which contain the whole computational topology.&lt;/p&gt;

        &lt;p&gt;The issue with this monolithic deployment is fairly obvious.
Topologies are mostly long-lived things; their uptime is months at a
fast-moving startup, literally half a year at bigger companies. This means
that deploying a single component might introduce a bug in other components,
so deployment is a nightmare.&lt;/p&gt;

        &lt;p&gt;Also, think about this bastard proposition. First you have to stop
making money - pause your topology, leave those cpu’s idle. Then you
upload new code and cross your fingers that another team didn’t introduce a bug
and hope that your unit, regression, and integration tests caught all the bugs.
You then wait for about 5 - 10 minutes for the topology to settle
(i.e.: caches are hot, processing is humming along, no stack traces are registered,
nagios hasn’t called you, your dashboard numbers are updating, etc.).
If this process fails, you have to roll back and cross your fingers again.
Not to mention that your team must keep all deployed binaries properly versioned
and in a repository that is easy to get to - not the case for most startups.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;I’ll stop at this note, but say something went wrong, where do you start?
Do you go onto zookeeper and make sure that your topics are being
updated (in the case of the kafka driver for storm)? Do you restart the
process and increase the log level (assuming no jmx hooks)? On a real note,
debugging distributed systems is usually all about printf and tcpdump.
Attaching a debugger to a running process is scary. Press the up arrow
one more time than you intended and you just brought down part of the cluster
accidentally. Don’t attach a debugger and you have almost no visibility into
the process. jmx metrics help, the linux &lt;code class=&quot;highlighter-rouge&quot;&gt;/proc&lt;/code&gt; helps, sending signals to
processes helps, but some things are just logic errors on very specific
code paths, and these are impossible to debug without the old printf, pen &amp;amp; paper.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
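
&lt;p&gt;The first point in the list, that replaying records against external state may not reproduce the same results, can be shown with a toy Python sketch. The dict below stands in for an external store such as a database, and &lt;code class=&quot;highlighter-rouge&quot;&gt;process&lt;/code&gt; is a hypothetical operator. Nothing here is storm-specific.&lt;/p&gt;

```python
# Hypothetical sketch: a dict stands in for an external store such as a database.
external_store = {"count": 0}


def process(record):
    """Increment the external counter and tag the record with its value."""
    external_store["count"] += 1
    return (record, external_store["count"])


records = ["a", "b"]
first_run = [process(r) for r in records]

# Simulate a restart that replays the same records: the external state was
# not rolled back, so the output differs from the first run.
replayed = [process(r) for r in records]
```

&lt;p&gt;The same two records produce different tags on replay, because the store remembers the first attempt even though the process does not.&lt;/p&gt;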

&lt;p&gt;Luckily the solution was trivial. Restart the topology from nimbus (storm master node)
and voilà. New configs serialized and deployed.&lt;/p&gt;

&lt;p&gt;Stay tuned for updates as we make progress on our attempt at solving &lt;code class=&quot;highlighter-rouge&quot;&gt;some&lt;/code&gt; issues
mentioned here.&lt;/p&gt;
</description>
        <pubDate>Sun, 22 Feb 2015 18:10:48 -0500</pubDate>
        <link>http://alexgallego.org/storm/streaming/production/operations/2015/02/22/storm-missmatched-expectations.html</link>
        <guid isPermaLink="true">http://alexgallego.org/storm/streaming/production/operations/2015/02/22/storm-missmatched-expectations.html</guid>
        
        
        <category>storm</category>
        
        <category>streaming</category>
        
        <category>production</category>
        
        <category>operations</category>
        
      </item>
    
      <item>
        <title>Emacs - Helm + Git Grep == Notational Velocity Clone</title>
        <description>&lt;p&gt;If you haven’t used &lt;a href=&quot;http://notational.net/&quot;&gt;notational velocity&lt;/a&gt;
on a Mac OS computer you are missing out :)&lt;/p&gt;

&lt;p&gt;It is an incredibly powerful tool.&lt;/p&gt;

&lt;p&gt;I wanted the same freedom and flexibility while taking notes in meetings
but I obviously wanted to use emacs to do it.&lt;/p&gt;

&lt;h1 id=&quot;emacs-deft&quot;&gt;&lt;a href=&quot;http://jblevins.org/projects/deft/&quot;&gt;emacs deft&lt;/a&gt;&lt;/h1&gt;

&lt;p&gt;First, emacs deft is a great plugin; it did everything I wanted:
filter, search, edit, and create buffers with ease.&lt;/p&gt;

&lt;p&gt;The more I used it, the more I hacked it to do what I wanted. One day
I realized that I was already using a phenomenal incremental search
and narrowing framework -
&lt;a href=&quot;http://emacs-helm.github.io/helm/&quot;&gt;emacs helm!!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’ve been using &lt;code class=&quot;highlighter-rouge&quot;&gt;helm-git-grep&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;helm-ls-git&lt;/code&gt; for a while. Put those
2 functions together and you have exactly what I wanted. A git backed,
incremental search and narrowing framework that searches both &lt;strong&gt;content&lt;/strong&gt; and
&lt;strong&gt;filenames&lt;/strong&gt;. This is key! - In fact this was so useful that I started using
it for all my git based repos.&lt;/p&gt;

&lt;p&gt;This is my default way of navigating all repositories. I just think of ideas,
don’t really care whether it is the name of a file or a comment in a file.
I want it all, in an easy to use interface.&lt;/p&gt;

&lt;h2 id=&quot;this-is-the-result&quot;&gt;This is the result:&lt;/h2&gt;

&lt;script src=&quot;https://gist.github.com/senior7515/205606fa84f5ffd05136.js&quot;&gt; &lt;/script&gt;

</description>
        <pubDate>Mon, 02 Feb 2015 18:10:48 -0500</pubDate>
        <link>http://alexgallego.org/emacs/helm/2015/02/02/helm-notational-velocity.html</link>
        <guid isPermaLink="true">http://alexgallego.org/emacs/helm/2015/02/02/helm-notational-velocity.html</guid>
        
        
        <category>emacs</category>
        
        <category>helm</category>
        
      </item>
    
  </channel>
</rss>
