Jekyll2018-10-28T17:37:01+00:00https://nickmooney.com/Nick MooneySecurity and summer camp (but not at the same time).RNA secondary sequence prediction with the Nussinov algorithm2018-02-07T00:00:00+00:002018-02-07T00:00:00+00:00https://nickmooney.com/rna-secondary-structure-nussinov<p>Last year I had the good fortune of taking an undergraduate algorithms course with <a href="https://homes.cs.washington.edu/~ruzzo/">Larry Ruzzo</a>. One of my favorite parts of the course was applying dynamic programming to approximate a real-world problem in the field of computational biology: RNA secondary structure prediction. I ended up implementing the algorithm in a couple languages (I was learning Nim at the time) and swore I would write it up, so this post will give an introduction to the Nussinov algorithm, along with some of an implementation in Nim.</p>
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<ul id="markdown-toc">
<li><a href="#the-problem-context" id="markdown-toc-the-problem-context">The problem context</a> <ul>
<li><a href="#what-is-a-pairing" id="markdown-toc-what-is-a-pairing">What is a pairing?</a></li>
<li><a href="#the-naive-solution" id="markdown-toc-the-naive-solution">The naive solution</a></li>
</ul>
</li>
<li><a href="#subproblems-and-optimal-substructure" id="markdown-toc-subproblems-and-optimal-substructure">Subproblems and optimal substructure</a> <ul>
<li><a href="#intuition-on-subproblem-decomposition" id="markdown-toc-intuition-on-subproblem-decomposition">Intuition on subproblem decomposition</a></li>
<li><a href="#first-approach-to-dynamic-programming" id="markdown-toc-first-approach-to-dynamic-programming">First approach to dynamic programming</a></li>
<li><a href="#introducing-another-variable" id="markdown-toc-introducing-another-variable">Introducing another variable</a></li>
</ul>
</li>
<li><a href="#implementation" id="markdown-toc-implementation">Implementation</a></li>
<li><a href="#backtracking" id="markdown-toc-backtracking">Backtracking</a></li>
<li><a href="#footnotes" id="markdown-toc-footnotes">Footnotes</a></li>
</ul>
<h2 id="the-problem-context">The problem context</h2>
<p><em>Disclaimer: I am a computer scientist<sup id="fnref:0"><a href="#fn:0" class="footnote">1</a></sup>, not a biologist. I am sure that the basic Nussinov algorithm is a drastic simplification of real-world computational biology, so if you would like to learn more about the biology side of this, ask your favorite (computational) biologist.</em></p>
<p>Most of us are familiar with the structure of DNA: a <a href="https://www.youtube.com/watch?v=tWzJhkrZm5Y">double helix</a> that consists of the bases A, C, T, and G. The double helix is formed because the bases in each strand are paired with their corresponding bases (A/T and C/G) in the other strand – in other words, each strand in DNA is sort of an “inverse” of the other. RNA, on the other hand, is single stranded, so does not have a “natural” structure like DNA. However, the complementary bases (A/U and C/G) still have affinities for each other, so strands of RNA end up folding in on themselves to form a unique secondary structure. Knowledge of this secondary structure gives us insight into the behavior of the molecule, so it is very handy to be able to predict it.</p>
<p>Predicting the secondary structure is also a little tricky: any complementary bases can pair up with each other (with some restrictions we will detail below), but physics tells us that the eventual secondary structure will approximate the one with the optimum total free energy. Total free energy can be approximated by the <em>number</em> of pairings that form the secondary structure, so the “optimal” structure is approximately the structure with the largest number of pairings. Remember that a pairing is defined by two complementary bases “sticking” to each other.</p>
<p>How do we find the secondary structure with the largest number of pairings? This is where the Nussinov algorithm comes in. First, let’s formalize the idea of a pairing.</p>
<h3 id="what-is-a-pairing">What is a pairing?</h3>
<p>Let’s define a pairing in somewhat formal terms<sup id="fnref:1"><a href="#fn:1" class="footnote">2</a></sup>. We can view a single-stranded RNA molecule as a sequence of <script type="math/tex">n</script> symbols drawn from the alphabet <script type="math/tex">\{A, C, G, U\}</script>. Let <script type="math/tex">B = b_1b_2 \ldots b_n</script> represent our RNA strand, where each <script type="math/tex">b_i \in \{A, C, G, U\}</script>. Pairings must form between complementary bases, and each base may pair with at most one other base (i.e. the base pairs form a matching). Using the notation from <em>Algorithm Design</em>, a base pairing is some set <script type="math/tex">S = \{(i, j)\}</script> in which <script type="math/tex">i, j \in \{1, 2, \ldots, n\}</script>. Base pairings must also adhere to the following conditions:</p>
<ol>
<li>No sharp turns. There must be at least 4 bases between any base pair. Formally, if <script type="math/tex">(i, j) \in S</script>, then <script type="math/tex">% <![CDATA[
i < j - 4 %]]></script>.</li>
<li>No crossing. If <script type="math/tex">(i, j)</script> and <script type="math/tex">(k, l)</script> are in <script type="math/tex">S</script>, then we must not have the case that <script type="math/tex">% <![CDATA[
i < k < j < l %]]></script>. Intuitively, this means a base within any given pairing cannot pair with a base outside that pairing.</li>
</ol>
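<p>As a rough illustration of these rules, here is a small Python sketch that checks whether a set of pairs is a legal base pairing. The names <code class="highlighter-rouge">COMPLEMENT</code> and <code class="highlighter-rouge">is_valid_pairing</code> are made up for this example (they are not from the textbook or from my Nim code), and indices are 1-based to match the notation above.</p>

```python
# Hypothetical helper names; indices are 1-based to match the text.
COMPLEMENT = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def is_valid_pairing(bases, pairs):
    """Return True if `pairs` (a set of (i, j) tuples) obeys every rule."""
    used = set()
    for i, j in pairs:
        if not (1 <= i < j <= len(bases)):
            return False
        # Pairings must form between complementary bases.
        if (bases[i - 1], bases[j - 1]) not in COMPLEMENT:
            return False
        # No sharp turns: i < j - 4.
        if not i < j - 4:
            return False
        # Matching: each base appears in at most one pair.
        if i in used or j in used:
            return False
        used.update((i, j))
    # No crossing: never i < k < j < l for two pairs (i, j) and (k, l).
    for i, j in pairs:
        for k, l in pairs:
            if i < k < j < l:
                return False
    return True
```

<p>For the strand <code class="highlighter-rouge">ACCGGUAGU</code>, the nested pairs (1, 9) and (2, 8) pass this check, while (1, 6) together with (3, 8) fails because the two pairs cross.</p>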
<h3 id="the-naive-solution">The naive solution</h3>
<p>It is worth taking a moment to think about what a naive algorithm for this would look like: evaluate each possible set of pairings, and from the ones that follow the rules we’ve laid out (complementary bases, matching, no sharp turns, non-crossing), pick the largest set. There are on the order of <script type="math/tex">n \choose 2</script> possible pairings<sup id="fnref:2"><a href="#fn:2" class="footnote">3</a></sup>, and each set can contain any number of those pairings. This leaves us with somewhere in the neighborhood of <script type="math/tex">2^{n \choose 2} = 2^{\frac{n^2 - n}{2}}</script> candidate sets of pairings, which is… big. <script type="math/tex">\mathcal{O}(2^{n^2})</script> big. As the number of bases grows, this solution quickly becomes infeasible. Clearly, we are going to have to do better.</p>
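<p>For very short strands, the naive approach can still be written down and used as a sanity check. The Python sketch below is one way it might look; the function names are invented for illustration, and it cheats slightly compared to the description above by discarding pairs that are not complementary or that form a sharp turn before enumerating subsets. It is still exponential in the number of remaining candidate pairs.</p>

```python
from itertools import combinations

# Hypothetical names; indices are 1-based to match the text.
COMPLEMENT = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def candidate_pairs(bases):
    """All (i, j) that are complementary and satisfy i < j - 4."""
    n = len(bases)
    return [(i, j)
            for i in range(1, n + 1)
            for j in range(i + 5, n + 1)
            if (bases[i - 1], bases[j - 1]) in COMPLEMENT]


def compatible(p, q):
    """True if two distinct pairs share no base and do not cross."""
    (i, j), (k, l) = sorted((p, q))
    if len({i, j, k, l}) < 4:
        return False
    return not (i < k < j < l)


def brute_force_max_pairs(bases):
    """Try every subset of candidate pairs, largest subsets first."""
    cands = candidate_pairs(bases)
    for size in range(len(cands), 0, -1):
        for subset in combinations(cands, size):
            if all(compatible(p, q) for p, q in combinations(subset, 2)):
                return size
    return 0
```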
<h2 id="subproblems-and-optimal-substructure">Subproblems and optimal substructure</h2>
<h3 id="intuition-on-subproblem-decomposition">Intuition on subproblem decomposition</h3>
<p>Let’s say we know the optimal solution (i.e. the largest set of pairings that follow the rules) for an RNA strand with <script type="math/tex">n</script> bases. Does this knowledge make it any easier to find the solution for an RNA strand with length <script type="math/tex">n + 1</script>? Assume that the first <script type="math/tex">n</script> bases are the same: we have just added the base <script type="math/tex">b_{n+1}</script>.</p>
<p>The simplest case is that <script type="math/tex">b_{n+1}</script> does not pair with any previous bases in the optimal solution, so the optimal solution is unchanged from the <script type="math/tex">n</script>-base solution. Otherwise, <script type="math/tex">b_{n+1}</script> will pair with some previous base <script type="math/tex">b_t</script>, where <script type="math/tex">% <![CDATA[
t < (n + 1) - 4 %]]></script>. Due to the noncrossing condition, the optimal solution for this new strand will consist of the pairing <script type="math/tex">(t, n + 1)</script> and the optimal solutions for <script type="math/tex">b_1b_2 \ldots b_{t-1}</script> and <script type="math/tex">b_{t+1} b_{t+2} \ldots b_n</script>. We’ve broken down the problem into subproblems.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                        -------------------------
                        |                       |
*   *   *   *   *   *   *   *   *   *   *   *   *
1   2               t-1 t   t+1             n   n+1
</code></pre></div></div>
<h3 id="first-approach-to-dynamic-programming">First approach to dynamic programming</h3>
<p>Dynamic programming is a great fit for this problem, since we have shown that the problem has <a href="https://en.wikipedia.org/wiki/Optimal_substructure">optimal substructure</a>: the optimal solution for the <script type="math/tex">n + 1</script>-length problem can be constructed from the optimal solutions of its subproblems. Let’s try to formalize this a little bit. Our first approach may be to define some function <script type="math/tex">OPT(j)</script>, which gives us the number of base pairings in the optimal solution for the strand <script type="math/tex">b_1b_2 \ldots b_j</script>. In this case, <script type="math/tex">OPT(n)</script> would be the number of pairings in the optimal solution for the whole strand.<sup id="fnref:3"><a href="#fn:3" class="footnote">4</a></sup></p>
<p>When we pair <script type="math/tex">b_t</script> with <script type="math/tex">b_j</script>, as discussed earlier, our solution consists of that pairing combined with the optimal solutions for <script type="math/tex">b_1 \ldots b_{t-1}</script> and <script type="math/tex">b_{t+1} \ldots b_{j-1}</script>. The <script type="math/tex">OPT</script> function we defined can give us the optimal value for <script type="math/tex">b_1 \ldots b_{t-1}</script>, but it tells us nothing about the <script type="math/tex">b_{t+1} \ldots b_{j-1}</script> subproblem, which does not start at <script type="math/tex">b_1</script>. We will have to introduce another variable.</p>
<h3 id="introducing-another-variable">Introducing another variable</h3>
<p>Let’s define a new function <script type="math/tex">OPT(i, j)</script> which represents the number of pairings in the optimal solution from <script type="math/tex">b_i</script> to <script type="math/tex">b_j</script>. Both of our subproblems can be represented by this function, and we can define it as a recurrence:</p>
<script type="math/tex; mode=display">OPT(i, j) = \max \begin{cases}
OPT(i, j - 1) \\
\max\limits_{t} 1 + OPT(i, t - 1) + OPT(t + 1, j - 1)
\end{cases}</script>
<p>We maximize over <script type="math/tex">t</script>, constrained to the values of <script type="math/tex">t</script> that follow the rules for allowable pairings we defined above (no sharp turns, complementary bases, max one pairing per base, noncrossing). In English, we can understand this as saying that the optimal solution for the strand <script type="math/tex">b_i \ldots b_j</script> is either:</p>
<ol>
<li>The optimal solution for the strand <script type="math/tex">b_i \ldots b_{j-1}</script> (i.e. <script type="math/tex">b_j</script> doesn’t get paired), or</li>
<li>The pairing <script type="math/tex">(t, j)</script> for the <script type="math/tex">t</script> that follows the rules and maximizes the number of pairings in <script type="math/tex">(i, t - 1)</script> and <script type="math/tex">(t + 1, j - 1)</script>, plus those optimal solutions to <script type="math/tex">(i, t - 1)</script> and <script type="math/tex">(t + 1, j - 1)</script>.</li>
</ol>
<p>We can now find the optimal solution in <script type="math/tex">\mathcal{O}(n^3)</script> time: that’s <script type="math/tex">n^2</script> values of <script type="math/tex">OPT</script> to calculate, and (approximately) <script type="math/tex">n</script> possible values of <script type="math/tex">t</script> to look over for each value of <script type="math/tex">OPT</script>. This is much better than our previous exponential-time solution.</p>
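<p>Before worrying about evaluation order, the recurrence can be transcribed almost literally as a memoized recursive function. This Python sketch is only meant to show the shape of the recurrence; <code class="highlighter-rouge">max_pairs</code> and <code class="highlighter-rouge">COMPLEMENT</code> are names I made up for the example, and indices are 1-based like the math above.</p>

```python
from functools import lru_cache

# Hypothetical names; indices are 1-based to match the recurrence.
COMPLEMENT = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def max_pairs(bases):
    """Number of pairings in an optimal structure, via OPT(i, j)."""

    @lru_cache(maxsize=None)
    def opt(i, j):
        # Too short to contain a pairing without a sharp turn.
        if i >= j - 4:
            return 0
        # Case 1: b_j is left unpaired.
        best = opt(i, j - 1)
        # Case 2: b_j pairs with some b_t where i <= t < j - 4.
        for t in range(i, j - 4):
            if (bases[t - 1], bases[j - 1]) in COMPLEMENT:
                best = max(best, 1 + opt(i, t - 1) + opt(t + 1, j - 1))
        return best

    return opt(1, len(bases))
```

<p>The cache is what keeps this polynomial: each of the roughly <script type="math/tex">n^2</script> distinct <script type="math/tex">(i, j)</script> arguments is computed once. Python’s recursion limit makes this version unsuitable for long strands, which is one reason to fill the table iteratively instead.</p>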
<h2 id="implementation">Implementation</h2>
<p>As is usually the case in dynamic programming, we need to build up solutions to subproblems in an order such that each later subproblem consists of previous, already-calculated subproblems. Looking at our definition of <script type="math/tex">OPT</script> above, we can see that each calculation of <script type="math/tex">OPT(i, j)</script> relies only on subproblems whose size is strictly smaller than <script type="math/tex">j - i</script>. If we calculate values for <script type="math/tex">OPT</script> in order of ascending strand size, we should be good.</p>
<p>Our base cases will be those strands that are too short to contain <em>any</em> pairings. By the “no sharp turns” rule, we can declare that <script type="math/tex">OPT(i, j) = 0</script> for all <script type="math/tex">i \geq j - 4</script>. We will then build up solutions in order of increasing <script type="math/tex">k = j - i</script>, starting at <script type="math/tex">k = 5</script> and going up to <script type="math/tex">k = n - 1</script> (for a strand of length <script type="math/tex">n</script>). In Nim, this looks something like the following:</p>
<div class="language-nim highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># calculate the OPT array for a given set of bases</span>
<span class="k">proc </span><span class="nf">nussinov</span><span class="p">(</span><span class="n">bases</span><span class="p">:</span> <span class="kt">seq</span><span class="o">[</span><span class="n">RNABase</span><span class="o">]</span><span class="p">):</span> <span class="kt">seq</span><span class="o">[</span><span class="kt">seq</span><span class="o">[</span><span class="kt">int</span><span class="o">]]</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">n</span> <span class="o">=</span> <span class="n">bases</span><span class="p">.</span><span class="n">len</span>
  <span class="n">result</span> <span class="o">=</span> <span class="n">newSeqWith</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">newSeq</span><span class="o">[</span><span class="kt">int</span><span class="o">]</span><span class="p">(</span><span class="n">n</span><span class="p">))</span>
  <span class="c"># Initialize OPT[i,j] = 0 where i >= j - 4</span>
  <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="mf">0</span><span class="p">..</span><span class="o"><</span><span class="n">n</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="mf">0</span><span class="p">..</span><span class="o"><</span><span class="n">n</span><span class="p">:</span>
      <span class="k">if</span> <span class="n">i</span> <span class="o">>=</span> <span class="n">j</span> <span class="o">-</span> <span class="mi">4</span><span class="p">:</span>
        <span class="n">result</span><span class="o">[</span><span class="n">i</span><span class="o">][</span><span class="n">j</span><span class="o">]</span> <span class="o">=</span> <span class="mi">0</span>
  <span class="c"># Calculate progressively larger subproblems according</span>
  <span class="c"># to the recurrence given in the textbook</span>
  <span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="mf">5</span><span class="p">..</span><span class="o"><</span><span class="n">n</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="mf">0</span><span class="p">..</span><span class="o"><</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="n">k</span><span class="p">):</span>
      <span class="k">let</span> <span class="n">j</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="n">k</span>
      <span class="k">var</span> <span class="n">max_val</span> <span class="o">=</span> <span class="n">result</span><span class="o">[</span><span class="n">i</span><span class="o">][</span><span class="n">j</span><span class="o">-</span><span class="mi">1</span><span class="o">]</span>
      <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="n">i</span><span class="p">..</span><span class="o"><</span><span class="p">(</span><span class="n">j</span><span class="o">-</span><span class="mi">4</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">complementary</span><span class="p">(</span><span class="n">bases</span><span class="o">[</span><span class="n">t</span><span class="o">]</span><span class="p">,</span> <span class="n">bases</span><span class="o">[</span><span class="n">j</span><span class="o">]</span><span class="p">):</span>
          <span class="k">let</span> <span class="n">before_val</span> <span class="o">=</span> <span class="k">if</span> <span class="n">t</span> <span class="o">></span> <span class="mi">0</span><span class="p">:</span> <span class="n">result</span><span class="o">[</span><span class="n">i</span><span class="o">][</span><span class="n">t</span><span class="o">-</span><span class="mi">1</span><span class="o">]</span>
                           <span class="k">else</span><span class="p">:</span> <span class="mi">0</span>
          <span class="k">let</span> <span class="n">after_val</span> <span class="o">=</span> <span class="k">if</span> <span class="n">j</span> <span class="o">></span> <span class="mi">0</span><span class="p">:</span> <span class="n">result</span><span class="o">[</span><span class="n">t</span><span class="o">+</span><span class="mi">1</span><span class="o">][</span><span class="n">j</span><span class="o">-</span><span class="mi">1</span><span class="o">]</span>
                          <span class="k">else</span><span class="p">:</span> <span class="mi">0</span>
          <span class="k">let</span> <span class="n">cur_val</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">+</span> <span class="n">before_val</span> <span class="o">+</span> <span class="n">after_val</span>
          <span class="n">max_val</span> <span class="o">=</span> <span class="n">max</span><span class="p">(</span><span class="n">max_val</span><span class="p">,</span> <span class="n">cur_val</span><span class="p">)</span>
      <span class="n">result</span><span class="o">[</span><span class="n">i</span><span class="o">][</span><span class="n">j</span><span class="o">]</span> <span class="o">=</span> <span class="n">max_val</span>
</code></pre></div></div>
<p>A matrix is a great way to represent what’s going on as this code runs, so let’s draw it out. Our matrix is “truncated” (<script type="math/tex">j</script> starts at 6) because we only need to consider subproblems of <script type="math/tex">k = 5</script> and up (by the no sharp turns rule again). Let’s see what happens as we fill it in.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Sequence: ACCGGUAGU           Fill in values for k = 5
    4 |0 0 0                      4 |0 0 0 0
    3 |0 0                        3 |0 0 1
    2 |0                          2 |0 0
i = 1 |                       i = 1 |1
      --------                      --------
   j = 6 7 8 9                   j = 6 7 8 9

k = 6                         k = 7
    4 |0 0 0 0                    4 |0 0 0 0
    3 |0 0 1 1                    3 |0 0 1 1
    2 |0 0 1                      2 |0 0 1 1
i = 1 |1 1                    i = 1 |1 1 1
      --------                      --------
   j = 6 7 8 9                   j = 6 7 8 9

k = 8
    4 |0 0 0 0
    3 |0 0 1 1
    2 |0 0 1 1
i = 1 |1 1 1 2
      --------
   j = 6 7 8 9
</code></pre></div></div>
<p>The value in the lower right tells us that the optimal solution has two pairings.</p>
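<p>To double-check the matrices above, here is a rough Python port of the table-filling loop. It is a sketch rather than a translation of my exact Nim code: <code class="highlighter-rouge">nussinov_table</code> and <code class="highlighter-rouge">COMPLEMENT</code> are names chosen for this example, and like the Nim version it uses 0-based indices, so <script type="math/tex">OPT(1, 9)</script> lives at <code class="highlighter-rouge">table[0][8]</code>.</p>

```python
# Hypothetical names; 0-based indices, as in the Nim version.
COMPLEMENT = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def nussinov_table(bases):
    """Bottom-up OPT table: table[i][j] covers bases[i..j] inclusive."""
    n = len(bases)
    # Every entry starts at 0, which covers the base cases i >= j - 4.
    table = [[0] * n for _ in range(n)]
    for k in range(5, n):  # k = j - i, smallest subproblems first
        for i in range(0, n - k):
            j = i + k
            best = table[i][j - 1]  # b_j unpaired
            for t in range(i, j - 4):  # b_j paired with b_t
                if (bases[t], bases[j]) in COMPLEMENT:
                    before = table[i][t - 1] if t > i else 0
                    after = table[t + 1][j - 1]
                    best = max(best, 1 + before + after)
            table[i][j] = best
    return table
```

<p>Running <code class="highlighter-rouge">nussinov_table("ACCGGUAGU")</code> and reading <code class="highlighter-rouge">table[0][8]</code> should give 2, matching the lower-right entry of the final matrix.</p>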
<h2 id="backtracking">Backtracking</h2>
<p>We know how to calculate the <script type="math/tex">OPT</script> matrix, but weren’t we looking for a solution that actually gave us the pairings? To get the pairings, we have to backtrack through our <script type="math/tex">OPT</script> matrix – another common idea in dynamic programming. The pseudocode looks like this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>function traceback(i, j):
    if i >= j:
        return
    if OPT[i][j] == OPT[i][j-1]:
        # optimal solution leaves b_j unpaired
        traceback(i, j - 1)
    else:
        find the t that maximizes 1 + OPT(i, t - 1) + OPT(t + 1, j - 1)
        mark (t, j) as a pairing
        traceback(i, t - 1)
        traceback(t + 1, j - 1)
</code></pre></div></div>
<p>This traceback has an <script type="math/tex">\mathcal{O}(n^2)</script> runtime. If we wanted to improve the runtime of the traceback, we could make space tradeoffs and store the actual optimal pairings (rather than just the number of them) in the <script type="math/tex">OPT</script> matrix. However, this doesn’t improve the overall asymptotic runtime of the full solution, since the <script type="math/tex">\mathcal{O}(n^3)</script> step still dominates the <script type="math/tex">\mathcal{O}(n^2)</script> step.</p>
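<p>For completeness, here is how the traceback might look as runnable Python. This is a sketch under the same assumptions as the rest of the post: the names (<code class="highlighter-rouge">fill</code>, <code class="highlighter-rouge">traceback</code>, <code class="highlighter-rouge">fold</code>) are invented for the example, indices are 0-based, and when several optimal structures exist it simply returns the first one it finds.</p>

```python
# Hypothetical names; 0-based indices.
COMPLEMENT = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def fill(bases):
    """Bottom-up OPT table: table[i][j] covers bases[i..j] inclusive."""
    n = len(bases)
    table = [[0] * n for _ in range(n)]
    for k in range(5, n):
        for i in range(0, n - k):
            j = i + k
            best = table[i][j - 1]
            for t in range(i, j - 4):
                if (bases[t], bases[j]) in COMPLEMENT:
                    before = table[i][t - 1] if t > i else 0
                    best = max(best, 1 + before + table[t + 1][j - 1])
            table[i][j] = best
    return table


def traceback(bases, table, i, j, pairs):
    """Append the pairs of one optimal structure for bases[i..j]."""
    if i >= j - 4:
        return  # too short to contain any pairing
    if table[i][j] == table[i][j - 1]:
        traceback(bases, table, i, j - 1, pairs)  # b_j unpaired
        return
    for t in range(i, j - 4):
        if (bases[t], bases[j]) not in COMPLEMENT:
            continue
        before = table[i][t - 1] if t > i else 0
        if 1 + before + table[t + 1][j - 1] == table[i][j]:
            pairs.append((t, j))
            traceback(bases, table, i, t - 1, pairs)
            traceback(bases, table, t + 1, j - 1, pairs)
            return


def fold(bases):
    """Return a sorted list of 0-based (t, j) pairs in an optimal structure."""
    table = fill(bases)
    pairs = []
    if bases:
        traceback(bases, table, 0, len(bases) - 1, pairs)
    return sorted(pairs)
```

<p>For <code class="highlighter-rouge">ACCGGUAGU</code> this recovers the nested pairs (0, 8) and (1, 7), i.e. <script type="math/tex">(1, 9)</script> and <script type="math/tex">(2, 8)</script> in the 1-based notation used above.</p>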
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes">
<ol>
<li id="fn:0">
<p>Technically, I have no credentials – my graduation date is in about 5 weeks. <a href="#fnref:0" class="reversefootnote">↩</a></p>
</li>
<li id="fn:1">
<p>Much of this is paraphrased from the fantastic textbook <a href="https://www.amazon.com/Algorithm-Design-Jon-Kleinberg/dp/0321295358"><em>Algorithm Design</em></a> by Jon Kleinberg and Éva Tardos. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Many of these pairings will not be valid by our rules, but I don’t believe this changes the asymptotic bound. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>Note that <script type="math/tex">OPT</script> gives us the optimal <em>number</em> of pairings for the strand from <script type="math/tex">b_1</script> to <script type="math/tex">b_n</script>, not the pairings themselves. This is okay – later, we will trace back through our calculated values of <script type="math/tex">OPT</script> to recover the optimal solution. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Jailbreak Detector Detector2018-02-04T00:00:00+00:002018-02-04T00:00:00+00:00https://nickmooney.com/jailbreak-detector-bsides<p>I gave my first talk! Big thanks to the folks organizing BSides Seattle for putting on a great show. It was a real pleasure to speak to an engaged and interested audience, and I really enjoyed the other talks and activities.</p>
<p><a href="https://www.dropbox.com/s/ll54hqup927r7y1/Jailbreak%20Detector%20Detector.pdf?dl=0">Slides</a>!</p>
<p>The title of the talk is “Jailbreak Detector Detector: Countermeasures to jailbreak detection on iOS.” It is mostly an overview of jailbreak detection techniques and the ways in which they are bypassed by end users. I discovered much of this while doing research on jailbreak detection at Duo. Shoutout to <a href="https://www.reddit.com/user/ryley_angus">/u/ryley_angus</a> – I reversed their Liberty app to provide supporting material for the talk.</p>
<p>Slides can be found <a href="https://www.dropbox.com/s/ll54hqup927r7y1/Jailbreak%20Detector%20Detector.pdf?dl=0">here</a>, and I’ll be sure to link a video if one exists (not sure if the talks got recorded).</p>Keybase + PGP + YubiKey2017-10-18T00:00:00+00:002017-10-18T00:00:00+00:00https://nickmooney.com/keybase-pgp-yubikey<p>I wanted to document the way I most recently generated PGP keys to live on my YubiKey. My previous PGP keys expired and my current YubiKey is affected by the <a href="https://crocs.fi.muni.cz/public/papers/rsa_ccs17">Infineon RSA bug</a>, so my goal was to have keys generated on my laptop that live entirely on my YubiKey and that play nicely with Keybase.</p>
<p>This is not the most secure way to do this, but it works adequately for my purposes (in which I am more likely to be using PGP to enable YubiKey-based SSH login, not communicating national secrets). Ideally you would generate a master key on an airgapped computer and generate your subkeys on the YubiKey itself. The fact that your secret key material will be on your hard drive temporarily may not be acceptable depending on your threat model.</p>
<h2 id="generate-your-master-key">Generate your master key</h2>
<p>Keybase has sane defaults (RSA 4096) and makes the PGP key generation process easy, so I just ran a <code class="highlighter-rouge">keybase pgp gen</code> and entered my information. I chose not to upload my private key to Keybase.</p>
<p>This generates a master key with Signing and Authentication capabilities, as well as a subkey with Encryption capabilities. To offload your keys to your YubiKey, you’ll likely want subkeys for encryption, signing, and authentication, so you’re already 1/3 of the way there with the Keybase defaults.</p>
<h2 id="generate-subkeys">Generate subkeys</h2>
<p>Run a <code class="highlighter-rouge">gpg --list-keys</code> to get your key fingerprint, then run <code class="highlighter-rouge">gpg --expert --edit-key <yourfingerprint></code>.</p>
<p>You’ll need to issue two <code class="highlighter-rouge">addkey</code> commands. Generate one subkey as “RSA (sign only)”, and another with only authentication privileges using the “RSA (set your own capabilities)” option. My session looked like the following. Note that I used <code class="highlighter-rouge">16y</code> as the key expiry parameter – you can use whatever expiry timing you want. I just stuck to Keybase’s default of 16 years because I’m likely going to be “revoking” my PGP keys by removing them from my Keybase account rather than by issuing proper GPG revocations. In other words, in the worst case, a key is compromised and doesn’t expire for a long time, but my Keybase account shows that I have asserted that I no longer control that key.</p>
<script src="https://gist.github.com/c207b7845517b95cddd1025c023c445a.js"> </script>
<h2 id="move-your-subkeys-to-your-yubikey">Move your subkeys to your YubiKey</h2>
<p>In the same key editing session, we’re going to select each key and move it to the YubiKey.</p>
<ol>
<li>Select the first subkey with <code class="highlighter-rouge">key 1</code></li>
<li><code class="highlighter-rouge">keytocard</code> to move the key to the YubiKey</li>
<li>Deselect the subkey with <code class="highlighter-rouge">key 1</code> again (as you can only select one subkey at a time when issuing a <code class="highlighter-rouge">keytocard</code>)</li>
<li>Repeat the process with <code class="highlighter-rouge">key 2</code> and <code class="highlighter-rouge">key 3</code></li>
</ol>
<p>Hooray! You’ve moved your subkeys to the YubiKey.</p>
<h2 id="back-up-your-master-secret-key">Back up your master secret key</h2>
<p>Issue a <code class="highlighter-rouge">gpg -a --export-secret-key <yourkeyid> > backup_secret.asc</code>. Ideally <code class="highlighter-rouge">backup_secret.asc</code> lives on a thumb drive you keep in a hole in your back yard or something.</p>
<p>Now delete your secret key from your computer, as you don’t need it any more – the keys you’ll be using live on the YubiKey. You can do this with <code class="highlighter-rouge">gpg --delete-secret-key <yourkeyid></code>.</p>
<h2 id="update-keybase-with-your-new-subkeys">Update Keybase with your new subkeys</h2>
<p>Run a <code class="highlighter-rouge">keybase pgp update</code> to push your updated public key to Keybase. Your master key hasn’t changed, but your public key now reflects that you’ve authorized two additional subkeys.</p>
<h2 id="youre-done">You’re done!</h2>
<p>That’s it! Note that you will still be able to run <code class="highlighter-rouge">gpg --export-secret-key</code> and get some output. This confused me for a little while until I learned that only secret key <em>stubs</em> are being exported. You can inspect the output of the export command by running <code class="highlighter-rouge">gpg -a --export-secret-key <yourkeyid> | gpg --list-packets --verbose</code>. You should see “pkey” entries, but no “skey” entries.</p>
<p>From now on, when you attempt to perform an operation that requires the use of your PGP keys, you should be prompted to insert the YubiKey and then enter the PIN.</p>Using your own hardware with CenturyLink Fiber2016-12-23T00:00:00+00:002016-12-23T00:00:00+00:00https://nickmooney.com/centurylink-fiber-bypass-modem<p>CenturyLink’s FTTH (fiber to the home) service recently became available in our neighborhood in Seattle. After having issues with Comcast for quite a while, it was a welcome change.</p>
<p>When the CenturyLink technician came to install service at our house, he installed two pieces of hardware. The first piece of hardware is the ONT – “optical network terminal” – and the second is a device that CenturyLink refers to as a modem. In our case, the “modem” was a ZyXEL C1100Z. What CenturyLink refers to as a modem is really a router, although it seems that they have intentionally tried to obscure this to make it difficult for consumers to use their own hardware. This is likely valuable to CenturyLink since they charge $9.99/mo for the rental of a “modem.”</p>
<p>Some research had shown that it is possible to connect your own hardware directly to the ONT – <a href="http://kmwoley.com/blog/bypassing-needless-centurylink-wireless-router-on-gigabit-fiber/">others have been successful</a> doing this, so I decided to take a shot at it. My housemates and I are all college students and we don’t like getting ripped off, so I wanted to see if we could use nothing but our existing hardware.</p>
<h2 id="necessary-features">Necessary features</h2>
<p>To connect your own hardware to the CenturyLink ONT, your router needs to support two things:</p>
<ol>
<li>Logging into the ONT via PPPoE</li>
<li>VLAN tagging over the WAN port, since the ONT expects all packets between it and the router to be tagged with VLAN ID 201</li>
</ol>
<h2 id="pppoe">PPPoE</h2>
<p>Getting our PPPoE credentials was the easy part. I found <a href="https://n8henrie.com/2015/01/how-to-find-your-centurylink-ppp-password-on-a-zyxel-c1000z-modem/">this post</a> detailing a couple ways to get that information. In short:</p>
<ol>
<li>Enable telnet on the ZyXEL device</li>
<li>Open a standard shell with <code class="highlighter-rouge">sh</code>, then <code class="highlighter-rouge">/usr/bin/pidstat -l -C pppd</code> to see the username and base64-encoded password provided to the pppd process.</li>
<li>Decode the password: <code class="highlighter-rouge">echo "encoded_password" | base64 --decode</code></li>
</ol>
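<p>If the machine you are working from doesn’t have a <code class="highlighter-rouge">base64</code> command handy, the decoding step can also be done with a few lines of Python. This is just a sketch: <code class="highlighter-rouge">decode_ppp_password</code> is a name I made up, the sample value in the comment is a placeholder rather than a real credential, and it assumes the decoded password is plain UTF-8 text.</p>

```python
import base64


def decode_ppp_password(encoded):
    """Decode the base64 string shown in the pppd arguments.

    Assumes the decoded password is UTF-8 text.
    """
    return base64.b64decode(encoded.strip()).decode("utf-8")


# Placeholder value for illustration only, not a real credential:
# decode_ppp_password("aHVudGVyMg==")
```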
<p>I saved these credentials for later.</p>
<h2 id="our-hardware">Our hardware</h2>
<p>We have a TP-Link TL-WR841N router (note that this is the same as the WR-841ND – “D” means “detachable antennas”). This router has a built-in switch that supports VLAN functionality, and I figured getting it to play nice with the ONT should be fairly easy. I flashed the router with <a href="https://wiki.openwrt.org/toh/tp-link/tl-wr841nd">OpenWRT</a> and started setting things up. First I edited the WAN interface’s settings to use the PPPoE credentials I harvested from the ZyXEL device, and then I went to configure the VLAN tagging. I won’t go into too much detail here since there are many other blog posts that do a fantastic job of explaining this process.</p>
<p>I was a little confused when I got to configuring the switch, since my default configuration didn’t seem to match what most other people were seeing in OpenWRT’s web interface. What I later found out is that the TL-WR841N <em>does</em> support VLAN tagging, but only on the 4 LAN ports; the WAN port is connected directly to <code class="highlighter-rouge">eth1</code> – no switch. I figured I was out of luck and almost went to purchase a switch to place between the router and the ONT, but I was curious if there was any reason the TL-WR841N’s dedicated WAN port <em>had</em> to be the one connected to the WAN. Some digging revealed that although it’s not particularly well-documented, you are free to use any port as the WAN port with OpenWRT. It just requires that you set up different VLANs. If I could use one of the LAN ports as my WAN port, I could use the switching functionality to tag all the packets between the router and the ONT with VLAN ID 201.</p>
<h2 id="the-setup">The setup</h2>
<p><strong>Note:</strong> In my router, port 0 is the CPU port, and ports 1-4 correspond to the labeled ports 4-1 respectively (i.e. all the LAN ports are in reverse order). Make sure you read the OpenWRT documentation for your specific hardware to figure out which ports are which.</p>
<p>It’s likely possible to do this in the GUI, but I’m more comfortable editing config files, so I edited <code class="highlighter-rouge">/etc/config/network</code> to look like this:</p>
<script src="https://gist.github.com/89b3130efed48286587dde4054f07da6.js"> </script>
<p>There are a couple of important bits here. <code class="highlighter-rouge">eth0</code> is all the LAN ports (which are attached to the switch that enables the VLAN functionality). <code class="highlighter-rouge">eth1</code> is the WAN port, connected directly to the CPU, which we do not use in this configuration. You can ignore the lines containing <code class="highlighter-rouge">orig</code> and any other cruft. The stuff you need to change is this:</p>
<ul>
<li>Line 11: <code class="highlighter-rouge">eth0.1</code> means that the LAN should be VLAN 1 on eth0</li>
<li>Line 23: <code class="highlighter-rouge">eth0.201</code> means that the WAN interface should be VLAN 201 on eth0</li>
<li>Lines 31-34: ensure switching and VLANs are enabled on your device</li>
<li>Lines 36-39: create a VLAN with ID 1, containing the CPU port (tagged) and LAN ports 1, 2, and 3 (untagged)</li>
<li>Lines 41-44: create a VLAN with ID 201, containing the CPU port (tagged) and the LAN port 4 (tagged) – as mentioned above, on my router, LAN port 4 in the software corresponds to LAN port 1 on the hardware</li>
</ul>
<p>We want to always tag port 0, since the CPU should see traffic from separate VLANs as separate pseudo-interfaces (<code class="highlighter-rouge">eth0.1</code> and <code class="highlighter-rouge">eth0.201</code>). We also tag port 4 because we want traffic sent to/from the ONT to be tagged. We leave ports 1-3 untagged, meaning “treat traffic on these ports as implicitly part of VLAN 1, but do not tag the ethernet frames.” I don’t have much networking background so it took me a while to get my head around this. I found <a href="https://wiki.openwrt.org/doc/uci/network/switch">OpenWRT’s switch documentation</a> helpful.</p>
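<p>To make the tagged/untagged distinction concrete, here is a small Python sketch that splits a port list into tagged and untagged ports, where a trailing <code class="highlighter-rouge">t</code> marks a tagged port. The string <code class="highlighter-rouge">0t 4t</code> is the one used for VLAN 201 in this post; <code class="highlighter-rouge">0t 1 2 3</code> is my reading of the VLAN 1 description above, and <code class="highlighter-rouge">parse_ports</code> is a made-up helper for illustration, not part of OpenWRT.</p>

```python
def parse_ports(spec):
    """Split a port list such as '0t 1 2 3' into (tagged, untagged) port numbers.

    A trailing 't' marks a port as tagged; a bare number is untagged.
    """
    tagged, untagged = [], []
    for token in spec.split():
        if token.endswith("t"):
            tagged.append(int(token[:-1]))
        else:
            untagged.append(int(token))
    return tagged, untagged


# VLAN membership as described in the post (the VLAN 1 string is an assumption)
vlans = {
    1: "0t 1 2 3",   # CPU port tagged, software ports 1-3 untagged
    201: "0t 4t",    # CPU port tagged, software port 4 (toward the ONT) tagged
}

for vid, spec in vlans.items():
    tagged, untagged = parse_ports(spec)
    print(f"VLAN {vid}: tagged={tagged} untagged={untagged}")
```

<p>Port 0 shows up as tagged in both VLANs, which is what lets the CPU tell the two apart as <code class="highlighter-rouge">eth0.1</code> and <code class="highlighter-rouge">eth0.201</code>.</p>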
<h2 id="still-not-working">Still not working?</h2>
<p>At this point, I reloaded my network configuration on the TL-WR841N with a <code class="highlighter-rouge">/etc/init.d/network reload</code>, but I still wasn’t able to access the internet. I believe this is due to a bug in OpenWRT’s <code class="highlighter-rouge">swconfig</code>, as my second VLAN was showing up with an incorrect VLAN ID and without tagging on port 4. I fixed this by adding a few lines to <code class="highlighter-rouge">/etc/rc.local</code>, but this should not be necessary. I’m currently trying to figure out if <code class="highlighter-rouge">swconfig</code> is actually broken (so I can submit a pull request), or if it’s just user error on my part. Either way, adding the following lines to <code class="highlighter-rouge">/etc/rc.local</code> finally got the switch to match its intended configuration:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>swconfig dev eth0 vlan 0 set vid 201
swconfig dev eth0 vlan 0 set ports '0t 4t'
ifconfig eth0 down
sleep 1
ifconfig eth0 up
</code></pre></div></div>
<p>Once I figure out why OpenWRT isn’t correctly respecting <code class="highlighter-rouge">/etc/config/network</code>, I’ll update this post.</p>
<h2 id="success">Success!</h2>
<p>With all of the above done, and the ethernet cable from the ONT plugged into LAN port 1 on my router (referred to as port 4 in the software), the router was successfully able to talk to the ONT and log in via PPPoE. Now we have the TP-Link TL-WR841N connected directly to the ONT, and we no longer have to rent the ZyXEL C1100Z router from CenturyLink. We are only on a 40 megabit connection, so our existing hardware was more than sufficient, but if you have a gigabit connection it’s possible that your hardware will have trouble keeping up, so YMMV.</p>
<p>All in all I’m glad we switched away from Comcast, and having fiber directly connected to our house is pretty cool. That said, it feels like CenturyLink has been pretty deceptive. Calling the device they provide a “modem” is misleading, since it doesn’t actually directly interface with the fiber – it really does feel like a cash grab targeted at people who don’t know better. Additionally, we are on a 40/5 connection, but during signup we were advertised upload speeds of up to 20 megabits/s for a little extra per month. After getting rid of the rented modem we decided to upgrade our upload speed, but on further investigation, 5 megabits/s is apparently the maximum possible upload in our area, which is really disappointing for fiber. I’m not sure if CenturyLink plans to upgrade the infrastructure in our neighborhood any time soon; the reps I talked to weren’t able to give me any information, even though I was shown the option to get 20 megabits/s upload during the signup process. Hopefully these are just some growing pains for CenturyLink’s GPON fiber offerings, but I know others in Seattle have experienced similar issues with under-delivery by CenturyLink.</p>
<p>Lastly, I have read reports of people getting CenturyLink to just disable VLAN tagging on the ONT, which would make it a lot easier to connect your own hardware. From what I can tell, this is something they used to do, but no longer offer. I called in four or five times to try to figure out if this was possible, but couldn’t get them to make the change. I wouldn’t be surprised if someone still managed to convince them to do it though – the information I received seemed to vary a lot between representatives.</p>CenturyLink’s FTTH (fiber to the home) service recently became available in our neighborhood in Seattle. After having issues with Comcast for quite a while, it was a welcome change.Lecture clicker synthesizer control2016-11-15T00:00:00+00:002016-11-15T00:00:00+00:00https://nickmooney.com/lecture-clicker-synthesizer<iframe width="560" height="315" src="https://www.youtube.com/embed/u6zVi9VvFng" frameborder="0" allowfullscreen=""></iframe>
<h2 id="whats-a-clicker">What’s a clicker?</h2>
<p>A while back, I took a Biology class that required the use of “clickers” – little RF remotes that allow lecturers to get real-time feedback from their students.</p>
<p><img src="/images/responsecard_rf.png" alt="The Turning Technologies ResponseCard RF LCD" /></p>
<p>Most students in classes that require clickers will buy a clicker for $50 or so, use it for a quarter, and then forget about it deep in their backpacks. The technology is proprietary, so while the <a href="https://fccid.io/R4WRCRF03">documents filed with the FCC</a> are informative on a surface level, the application was also granted long-term confidentiality and the inner workings of the device are not public.</p>
<h2 id="former-work">Former work</h2>
<p>Searching for info online led me to the previous work of <a href="https://travisgoodspeed.blogspot.com/2010/07/reversing-rf-clicker.html">Travis Goodspeed</a> and <a href="http://www.taylorkillian.com/2012/11/turning-point-clicker-emulation-with.html">Taylor Killian</a>, who have both written quite a bit about the clickers. In short, the clickers use a widely-available chip called the <a href="https://www.nordicsemi.com/eng/Products/2.4GHz-RF/nRF24L01">nRF24L01</a> – the chip is cheap enough that I was unable to easily get my hands on fewer than 10 of them.</p>
<p>There is also tons of information out there about the nRF24L01. I relied heavily on the <a href="https://arduino-info.wikispaces.com/Nrf24L01-2.4GHz-HowTo">Arduino-Info nRF24L01 page</a> for information about hardware, pinouts, and libraries. I ended up also buying the “base modules” (basically 3.3V power conditioners) and wiring a base module + nRF24L01 breakout to an Arduino Nano clone that I had from some previous projects.</p>
<h2 id="my-setup">My setup</h2>
<p><img src="/images/nrf24l01_cabled.jpg" alt="nRF24L01 + base module" /></p>
<p>I connected up the base module to my Arduino Nano exactly as described on the <a href="https://arduino-info.wikispaces.com/Nrf24L01-2.4GHz-HowTo">Arduino-Info nRF24L01 page</a> and got to work with the fantastic <a href="https://tmrh20.github.io/RF24/">TMRh20 RF24 library</a>. The only “gotcha” I ran into was setting the address of the reading pipe. Packets sent by the clickers look like this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TT TT TT     SS SS SS     DD     CC CC
target MAC   source MAC   data   CRC
</code></pre></div></div>
<p>The “right thing” to do when intercepting clicker responses is to open a reading pipe associated with the target (base station) MAC. This way, the data you receive will be the source (clicker) MAC, the data byte, and the CRC. I had accidentally opened a reading pipe associated with the clicker’s address. That works, since the chip will listen for that address and send everything after it to the Arduino over SPI, but you will <em>only</em> get the data and CRC, i.e. you will only be able to listen to a single clicker at a time. Associating the reading pipe with the base station’s MAC (0x123456) ensures that you get responses from all clickers within range, along with the associated clicker MAC addresses.</p>
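<p>As a rough illustration of that layout, here is a Python sketch that splits the six bytes following the base station address into the clicker MAC, the data byte, and the CRC. The function and field names are my own for this example, and the bytes are made up rather than captured from a real clicker. The CRC is only extracted, not checked, since the CRC parameters aren’t covered in this post.</p>

```python
from collections import namedtuple

# Field names are illustrative, not taken from any library
ClickerResponse = namedtuple("ClickerResponse", ["source_mac", "data", "crc"])


def parse_payload(payload):
    """Split the bytes that follow the base-station address.

    Layout, per the packet diagram: 3-byte source MAC, 1 data byte, 2-byte CRC.
    """
    if len(payload) != 6:
        raise ValueError("expected 6 bytes, got %d" % len(payload))
    return ClickerResponse(
        source_mac=payload[0:3].hex(),
        data=payload[3],
        crc=payload[4:6].hex(),
    )


# Made-up example bytes, not a captured packet
example = bytes([0xAA, 0xBB, 0xCC, 0x31, 0x12, 0x34])
response = parse_payload(example)
print(response.source_mac, hex(response.data), response.crc)
```

<p>If the reading pipe were opened on a clicker’s address instead, only the last three bytes (data and CRC) would arrive, which is why the source MAC is lost in that case.</p>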
<h2 id="code">Code</h2>
<p>I wrote an Arduino sketch that sets everything up and then listens for clicker input. The Arduino sends input from the clicker to the computer (or other device) over serial. The format is very verbose – I just threw it together as something human-readable for debugging and didn’t go back to make it quieter. It would be easy to modify the sketch to just send out binary data.</p>
<p>I also wrote a barebones Python script to turn clicker input into MIDI output. Hooking this up to a KORG Volca FM, you get what you see in the video at the top of the post! Now you know what to do with the clicker you thought you’d never use again.</p>
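<p>The clicker-to-MIDI idea boils down to mapping each button to a note and emitting standard three-byte MIDI messages. The sketch below is not the script from the repository; the scale, the button numbering, and the helper names are all assumptions chosen for illustration. Only the status bytes (0x90 for note-on and 0x80 for note-off on channel 1) come from the MIDI standard.</p>

```python
NOTE_ON = 0x90   # MIDI note-on status byte, channel 1
NOTE_OFF = 0x80  # MIDI note-off status byte, channel 1

# Hypothetical mapping: button positions 0-9 onto a C major scale from middle C (note 60)
SCALE = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76]


def note_on(button, velocity=100):
    """Build a note-on message [status, note, velocity] for a button position."""
    return [NOTE_ON, SCALE[button % len(SCALE)], velocity]


def note_off(button):
    """Build the matching note-off message for a button position."""
    return [NOTE_OFF, SCALE[button % len(SCALE)], 0]


# Pressing button 0 and then button 4 would produce these messages
for button in (0, 4):
    print(note_on(button), note_off(button))
```

<p>With the python-rtmidi package, lists like these can be passed to <code class="highlighter-rouge">MidiOut.send_message</code> once an output port is open; whether that matches the exact <code class="highlighter-rouge">rtmidi</code> package the script uses is something to check against <code class="highlighter-rouge">requirements.txt</code>.</p>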
<p>You can access the code at my <a href="https://github.com/nickmooney/turning-clicker">GitHub repository</a>. Feel free to do whatever you want with it – I likely won’t be investing time into improving it much since I’m mainly focused on school and work at the moment. Pull requests are welcome! Some ideas I had for improvement are:</p>
<ul>
<li>Port the whole thing to Arduino (i.e. hook the Arduino up to a MIDI shield, get rid of the laptop as middleman)</li>
<li>Add actually-interesting MIDI control capabilities (changing octaves, velocity)</li>
<li>Make the serial comms format nicer</li>
<li>Add bidirectional communication</li>
<li>Allow channel changing etc. on the nRF24L01 without reflashing the Arduino</li>
</ul>
<p>Note that you’ll have to install the <code class="highlighter-rouge">RF24</code> and <code class="highlighter-rouge">FastCRC</code> Arduino libraries to get the sketch to compile, as well as the <code class="highlighter-rouge">pySerial</code> and <code class="highlighter-rouge">rtmidi</code> libraries for running the Python script (you can do the latter with a <code class="highlighter-rouge">pip install -r requirements.txt</code>).</p>