<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>BooTell</title>
  
  <subtitle>Compute and Communicate</subtitle>
  <link href="https://bootell.net/atom.xml" rel="self"/>
  
  <link href="https://bootell.net/"/>
  <updated>2022-09-11T14:40:51.208Z</updated>
  <id>https://bootell.net/</id>
  
  <author>
    <name>Brooks Bao</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>Compiling Dictionaries for macOS Dictionary.app</title>
    <link href="https://bootell.net/2022/08/23/macOs-Dictionary-Compile/"/>
    <id>https://bootell.net/2022/08/23/macOs-Dictionary-Compile/</id>
    <published>2022-08-23T16:15:00.000Z</published>
    <updated>2022-09-11T14:40:51.208Z</updated>
    
    <content type="html"><![CDATA[<p>Compared with other dictionary apps, the Dictionary.app that ships with macOS supports Force Touch lookup on the trackpad, which is very convenient. The built-in dictionaries are adequate but few in number, so dictionaries in other formats can be converted into a format Dictionary.app supports to enrich the collection.<br>Setting up the environment and compiling are fairly easy; the real work is tuning each dictionary so that it displays well. The whole process of building a dictionary is described below.</p><span id="more"></span><h3 id="环境部署"><a href="#环境部署" class="headerlink" title="环境部署"></a>Environment Setup</h3><p>Official documentation: <a href="https://github.com/ilius/pyglossary/blob/master/doc/apple.md">pyglossary</a></p><h5 id="1-Python环境搭建"><a href="#1-Python环境搭建" class="headerlink" title="1. Python环境搭建"></a>1. Python Environment</h5><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Clone the project</span></span><br><span class="line">git clone https://github.com/ilius/pyglossary.git</span><br><span class="line">cd pyglossary</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Set up the Python environment</span></span><br><span class="line">python3 -m venv venv</span><br><span class="line">source venv/bin/activate</span><br><span class="line">pip3 install lxml beautifulsoup4 html5lib</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Install these if you see the error <span class="string">&quot;LZO compression support is not available&quot;</span></span></span><br><span class="line">brew install lzo</span><br><span class="line">pip3 install python-lzo</span><br></pre></td></tr></table></figure><h5 id="2-Apple-词典工具安装"><a href="#2-Apple-词典工具安装" class="headerlink" title="2. 
Apple 词典工具安装"></a>2. Installing Apple's Dictionary Tools</h5><p>Download: <a href="http://developer.apple.com/downloads">Additional Tools for Xcode</a><br>Download the “Additional Tools for Xcode” package that matches your Xcode version.<br>Extract <code>Utilities/Dictionary Development Kit</code> from it and place it under <code>/Applications/Utilities</code>.</p><h5 id="3-音频转换需要安装speex"><a href="#3-音频转换需要安装speex" class="headerlink" title="3. 音频转换需要安装speex"></a>3. Installing speex for Audio Conversion</h5><p>Installing speex directly with brew resulted in the error “speexdec: command not found” because that package does not include the executables, so speex has to be built from source; see <a href="https://github.com/ilius/pyglossary/issues/184">issues&#x2F;184</a>.<br>Source download: <a href="https://www.speex.org/downloads/">speex.org</a></p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">On the error <span class="string">&quot;autoreconf: error: aclocal failed with exit status: 2&quot;</span>, install this dependency</span></span><br><span class="line">brew install automake</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">On the error <span class="string">&quot;make[4]: Nothing to be done for `all&#x27;.&quot;</span>, install this dependency</span></span><br><span class="line">brew install pkg-config</span><br><span class="line"></span><br><span class="line">./configure</span><br><span class="line">make</span><br><span class="line">make install</span><br></pre></td></tr></table></figure><p>In testing, the spx files did not need to be converted: simply renaming the extension to <code>.mp3</code> also worked.</p><h3 id="词典选择"><a href="#词典选择" class="headerlink" title="词典选择"></a>Choosing Dictionaries</h3><p>The following dictionaries have plentiful, well-organized resources:</p><ul><li>Oxford Advanced Learner's English-Chinese Dictionary (4th &#x2F; 8th &#x2F; 9th &#x2F; 10th edition)</li><li>Longman Dictionary of Contemporary English (5th edition)</li><li>Collins COBUILD English-Chinese Dictionary</li><li>Merriam-Webster's Advanced Learner's English-Chinese Dictionary</li></ul><p>Commonly used sites for obtaining resources:</p><ul><li><a 
href="https://freemdict.com/">https://freemdict.com/</a></li><li><a href="https://forum.freemdict.com/c/12-category/12">https://forum.freemdict.com/c/12-category/12</a></li><li><a href="https://www.pdawiki.com/forum/forum-4-1.html">https://www.pdawiki.com/forum/forum-4-1.html</a></li></ul><h3 id="转换过程"><a href="#转换过程" class="headerlink" title="转换过程"></a>Conversion Process</h3><h5 id="1-资源转换"><a href="#1-资源转换" class="headerlink" title="1. 资源转换"></a>1. Converting Resources</h5><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Move the resource files (use $HOME, since ~ is not expanded inside quotes)</span></span><br><span class="line">mkdir dist</span><br><span class="line">mv &quot;$HOME/Downloads/LDOCE 5++ V2.15&quot; ./dist/ldoce5</span><br><span class="line">cd dist/ldoce5</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Convert to an XML file</span></span><br><span class="line">python3 ../../main.py --write-format=AppleDict &quot;LDOCE5++ V 2-15.mdx&quot; LDOCE5++</span><br><span class="line"></span><br><span class="line">cd LDOCE5++</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Convert the audio</span></span><br><span class="line">find OtherResources -name &quot;*.spx&quot; -execdir sh -c &#x27;spx=&#123;&#125;;speexdec $spx  
$&#123;spx%.*&#125;.wav&#x27; \;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Rewrite resource paths</span></span><br><span class="line">sed -i &quot;&quot; &#x27;s:src=&quot;/:src=&quot;:g&#x27; LDOCE5++.xml</span><br><span class="line">sed -i &quot;&quot; &#x27;s|sound://\([/_a-zA-Z0-9]*\).spx|\1.wav|g&#x27; LDOCE5++.xml</span><br></pre></td></tr></table></figure><p>Replace the contents of <code>objects/LDOCE5++.dictionary/Contents/DefaultStyle.css</code> with the contents of <code>LM5style.css</code>.</p><h5 id="2-优化"><a href="#2-优化" class="headerlink" title="2. 优化"></a>2. Tuning</h5><p>Some tuning is needed for the dictionary to display well in Dictionary.app. For example, the command below replaces every link whose text is a single space with a 🔈 icon, so that those links (presumably the pronunciation links) become visible.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sed -i &quot;&quot; &quot;s|&gt; &lt;\/a&gt;|&gt;🔈&lt;/a&gt;|g&quot; LDOCE5++.xml</span><br></pre></td></tr></table></figure><h5 id="3-编译与安装"><a href="#3-编译与安装" class="headerlink" title="3. 编译与安装"></a>3. Building and Installing</h5><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">make</span><br><span class="line">make install</span><br></pre></td></tr></table></figure><h3 id="示例"><a href="#示例" class="headerlink" title="示例"></a>Examples</h3><p>Since Dictionary.app already includes the Oxford dictionary, Longman and Collins are converted here as supplements.</p><h5 id="1-LDOCE-5-V2-15"><a href="#1-LDOCE-5-V2-15" class="headerlink" title="1. 
LDOCE 5++ V2.15"></a>1. LDOCE 5++ V2.15</h5><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Convert to an XML file</span></span><br><span class="line">python3 ../../main.py --write-format=AppleDict &quot;LDOCE5++ V 2-15.mdx&quot; LDOCE5</span><br><span class="line"></span><br><span class="line">cd LDOCE5</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Convert the audio by simply renaming the extension to mp3</span></span><br><span class="line">find OtherResources -name &quot;*.spx&quot; -execdir sh -c &#x27;mv &quot;$1&quot; &quot;$&#123;1%.*&#125;.mp3&quot;&#x27; _ &#123;&#125; \;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Rewrite resource paths</span></span><br><span class="line">sed -i &quot;&quot; &#x27;s:src=&quot;/:src=&quot;:g&#x27; LDOCE5.xml</span><br><span class="line">sed -i &quot;&quot; &#x27;s|sound://\([/_a-zA-Z0-9]*\).spx|\1.mp3|g&#x27; LDOCE5.xml</span><br></pre></td></tr></table></figure><p>Edit <code>LDOCE5.plist</code>:<br>CFBundleDisplayName: 朗文当代高级词典（第五版）<br>CFBundleName: 朗文当代</p><p>Copy the contents of <code>LM5style.css</code> into <code>LDOCE5.css</code>, then modify it as follows:</p><figure class="highlight css"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Show or hide sections */</span></span><br><span class="line"><span class="selector-class">.Sense</span> <span class="selector-class">.corpus</span> <span class="selector-class">.title</span> &#123;</span><br><span class="line">    <span class="comment">/* display: none; */</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="selector-class">.LDOCEVERSION_new</span> <span class="selector-class">.BoxPanel</span> &#123;</span><br><span class="line">    <span class="comment">/* display: none; */</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="selector-class">.dictionary_intro</span> &#123;</span><br><span class="line">    <span class="attribute">display</span>: none;</span><br><span class="line">    <span class="comment">/*background-color: #314089;*/</span></span><br><span class="line">    <span class="comment">/*color: #fff;*/</span></span><br><span class="line">    <span class="attribute">padding-left</span>: <span class="number">10px</span>;</span><br><span class="line">    <span class="attribute">margin</span>: <span 
class="number">5px</span> <span class="number">0</span> <span class="number">10px</span> -<span class="number">7px</span>;</span><br><span class="line">&#125;</span><br><span class="line"><span class="selector-class">.wordfams</span> <span class="selector-class">.LDOCE5pp_sensefold</span> &#123;</span><br><span class="line">    <span class="attribute">display</span>: none;</span><br><span class="line">&#125;</span><br><span class="line"><span class="selector-class">.topics_container</span> &#123;</span><br><span class="line">    <span class="attribute">display</span>: none;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Pronunciation icon fonts */</span></span><br><span class="line"><span class="keyword">@font-face</span> &#123;</span><br><span class="line">    <span class="attribute">font-family</span>: <span class="string">&#x27;lm5pp_icomoon&#x27;</span>;</span><br><span class="line">    <span class="attribute">src</span>: <span class="built_in">url</span>(<span class="string">&quot;lm5pp_icomoon.ttf&quot;</span>);</span><br><span class="line">    <span class="attribute">font-weight</span>: normal;</span><br><span class="line">    <span class="attribute">font-style</span>: normal;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">@font-face</span> &#123;</span><br><span class="line">    <span class="attribute">font-family</span>: <span class="string">&#x27;cjkextent&#x27;</span>;</span><br><span class="line">    <span class="attribute">src</span>: <span class="built_in">url</span>(<span class="string">&quot;cjkextent.ttf&quot;</span>);</span><br><span class="line">    <span class="attribute">font-weight</span>: normal;</span><br><span class="line">    <span class="attribute">font-style</span>: normal;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><h5 id="2-CollinsCOBUILDOverhaul-v2-30"><a href="#2-CollinsCOBUILDOverhaul-v2-30" 
class="headerlink" title="2. CollinsCOBUILDOverhaul v2.30"></a>2. CollinsCOBUILDOverhaul v2.30</h5><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Convert to an XML file</span></span><br><span class="line">python3 ../../main.py --write-format=AppleDict &quot;CollinsCOBUILDOverhaul V 2-30.mdx&quot; COLLINS</span><br><span class="line"></span><br><span class="line">cd COLLINS</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Rewrite resource paths</span></span><br><span class="line">sed -i &quot;&quot; &#x27;s:src=&quot;/:src=&quot;:g&#x27; COLLINS.xml</span><br></pre></td></tr></table></figure><p>Edit <code>COLLINS.plist</code>:<br>CFBundleDisplayName: 柯林斯英汉双解词典<br>CFBundleName: 柯林斯双解</p><p>The bundled CSS does not render well, so it is replaced here with the stylesheet from the “Collins perfect restoration” (柯林斯完美复原版) edition. Copy its contents into <code>COLLINS.css</code>, then modify it as follows:</p><figure class="highlight css"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Expand hidden items */</span></span><br><span class="line"><span class="selector-class">.hidden</span> &#123;</span><br><span class="line">    <span class="comment">/* 
display: none */</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Hide word usage trends */</span></span><br><span class="line"><span class="selector-class">.trend</span><span class="selector-class">.folded</span> &#123;</span><br><span class="line">    <span class="attribute">display</span>: none;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Pronunciation icon font */</span></span><br><span class="line"><span class="keyword">@font-face</span> &#123;</span><br><span class="line">    <span class="attribute">font-family</span>: <span class="string">&#x27;icomoon&#x27;</span>;</span><br><span class="line">    <span class="attribute">src</span>: <span class="built_in">url</span>(<span class="string">&quot;icomoon.ttf&quot;</span>);</span><br><span class="line">    <span class="attribute">font-weight</span>: normal;</span><br><span class="line">    <span class="attribute">font-style</span>: normal;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>References</h3><blockquote><p><a href="https://github.com/ilius/pyglossary/blob/master/doc/apple.md">https://github.com/ilius/pyglossary/blob/master/doc/apple.md</a><br><a href="https://kaihao.io/2018/mdict-to-macos-dictionary/">https://kaihao.io/2018/mdict-to-macos-dictionary/</a><br><a href="https://10382.github.io/post/Mdict%E8%BD%ACMac%E8%AF%8D%E5%85%B8%E5%B0%8F%E8%AE%B0/">https://10382.github.io/post/Mdict转Mac词典小记/</a><br><a href="https://www.zhihu.com/question/20428599">https://www.zhihu.com/question/20428599</a><br><a href="https://www.pdawiki.com/forum/thread-42822-1-1.html">https://www.pdawiki.com/forum/thread-42822-1-1.html</a><br><a href="https://www.pdawiki.com/forum/thread-13014-1-1.html">https://www.pdawiki.com/forum/thread-13014-1-1.html</a><br><a href="https://placeless.net/blog/macos-dictionaries">https://placeless.net/blog/macos-dictionaries</a><br><a 
href="http://qunwang6.github.io/blog/OSXDictionary/">http://qunwang6.github.io/blog/OSXDictionary/</a><br><a href="https://www.readern.com/convert-mdict-to-macos-dictionary.html">https://www.readern.com/convert-mdict-to-macos-dictionary.html</a></p></blockquote>]]></content>
    
    
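    <summary type="html">&lt;p&gt;Compared with other dictionary apps, the Dictionary.app that ships with macOS supports Force Touch lookup on the trackpad, which is very convenient. The built-in dictionaries are adequate but few in number, so dictionaries in other formats can be converted into a format Dictionary.app supports to enrich the collection.&lt;br&gt;Setting up the environment and compiling are fairly easy; the real work is tuning each dictionary so that it displays well. The whole process of building a dictionary is described below.&lt;/p&gt;</summary>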
    
    
    
    
    <category term="macOS" scheme="https://bootell.net/tags/macOS/"/>
    
  </entry>
  
  <entry>
    <title>gRPC Client and Server in Practice with PHP</title>
    <link href="https://bootell.net/2020/08/13/PHP-Integrate-gRPC/"/>
    <id>https://bootell.net/2020/08/13/PHP-Integrate-gRPC/</id>
    <published>2020-08-13T03:14:00.000Z</published>
    <updated>2020-08-13T03:14:00.000Z</updated>
    
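    <content type="html"><![CDATA[<p>There are many options for calls between PHP services, such as HTTP APIs or an RPC framework like <a href="https://github.com/laruence/yar">yar</a>. With future multi-language extensibility and generality in mind, gRPC was chosen.</p><span id="more"></span><h3 id="gRPC通信"><a href="#gRPC通信" class="headerlink" title="gRPC通信"></a>gRPC Communication</h3><p>gRPC is a modern, open-source, high-performance RPC framework that can run in any environment. It lets a client call methods of a server application on a different machine as if they were methods of a local object.</p><p>Compared with the previous approach of sending JSON over HTTP, it has the following advantages:</p><ul><li>Code is generated from <code>*.proto</code> files, so formats are more strictly defined; procedure calls simplify usage and are not constrained by the semantics of HTTP resource methods.</li><li>Protobuf serializes messages into a binary format, giving more efficient data exchange in terms of marshalling speed and code size.</li><li>Transport runs over HTTP&#x2F;2, which brings bidirectional streaming, flow control, and header compression; this reduces resource usage and thereby shortens response times between applications and services.</li></ul><p>gRPC supports four kinds of method:</p><ul><li>Unary: the client sends a single request to the server and receives a single response.</li><li>Server streaming: the client sends a single request to the server and gets back a stream from which it reads a sequence of messages.</li><li>Client streaming: the client writes a sequence of messages to a provided stream and sends them to the server.</li><li>Bidirectional streaming: both sides send a sequence of messages, each over its own read-write stream.</li></ul><p>When PHP runs under php-fpm, only unary RPCs can be implemented conveniently for now.</p><h3 id="gRPC环境配置"><a href="#gRPC环境配置" class="headerlink" title="gRPC环境配置"></a>gRPC Environment Setup</h3><p>To add gRPC support to PHP, see: <a href="https://grpc.io/docs/languages/php/quickstart/">Quick Start - grpc.io</a></p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Install the libraries via pecl or composer; the pecl extensions perform better</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">gRPC extension</span></span><br><span class="line">pecl install grpc</span><br><span class="line">composer require 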
grpc/grpc</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Protobuf extension</span></span><br><span class="line">pecl install protobuf</span><br><span class="line">composer require google/protobuf</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Download the protoc compiler</span></span><br><span class="line">curl --fail -L -O https://github.com/protocolbuffers/protobuf/releases/download/v3.12.4/protoc-3.12.4-linux-x86_64.zip</span><br><span class="line">unzip protoc-3.12.4-linux-x86_64.zip &#x27;bin/protoc&#x27; -d /usr/local</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Build the protoc plugin that generates PHP code</span></span><br><span class="line">apt-get install automake libtool</span><br><span class="line">git clone -b v1.31.0 https://github.com/grpc/grpc</span><br><span class="line">cd grpc &amp;&amp; git submodule update --init</span><br><span class="line">make grpc_php_plugin</span><br><span class="line">mv bins/opt/grpc_php_plugin /usr/local/bin</span><br></pre></td></tr></table></figure><h3 id="gRPC-使用"><a href="#gRPC-使用" class="headerlink" title="gRPC 使用"></a>Using gRPC</h3><p>Write the <code>Hello.proto</code> file:</p><figure class="highlight protobuf"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span 
class="line">syntax = <span class="string">&quot;proto3&quot;</span>;</span><br><span class="line">  </span><br><span class="line"><span class="keyword">package</span> test;</span><br><span class="line">  </span><br><span class="line"><span class="comment">// The greeting service definition.</span></span><br><span class="line"><span class="keyword">service </span><span class="title class_">Hello</span> &#123;</span><br><span class="line">  <span class="comment">// Sends a greeting</span></span><br><span class="line">  <span class="function"><span class="keyword">rpc</span> SayHello (HelloRequest) <span class="keyword">returns</span> (HelloResponse) </span>&#123;&#125;</span><br><span class="line">&#125;</span><br><span class="line">  </span><br><span class="line"><span class="comment">// The request message containing the user&#x27;s name.</span></span><br><span class="line"><span class="keyword">message </span><span class="title class_">HelloRequest</span> &#123;</span><br><span class="line">  <span class="type">string</span> name = <span class="number">1</span>;</span><br><span class="line">&#125;</span><br><span class="line">  </span><br><span class="line"><span class="comment">// The response message containing the greetings</span></span><br><span class="line"><span class="keyword">message </span><span class="title class_">HelloResponse</span> &#123;</span><br><span class="line">  <span class="type">string</span> message = <span class="number">1</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Generate the PHP files:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">protoc -I ./protos_dir --php_out=./php_dir --grpc_out=./GPBMetadata_dir --plugin=protoc-gen-grpc=$(which grpc_php_plugin) $file</span><br></pre></td></tr></table></figure><p>Because the namespaces of the generated files start at the root, autoload rules must be added to <code>composer.json</code>, followed by running <code>composer 
dump-autoload</code></p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">&quot;autoload&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;psr-4&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;GPBMetadata\\&quot;</span><span class="punctuation">:</span> <span class="string">&quot;GPBMetadata_dir/GPBMetadata&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;Test\\&quot;</span><span class="punctuation">:</span> <span class="string">&quot;php_dir/Test&quot;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>For NGINX's gRPC support, see: <a href="https://www.nginx.com/blog/nginx-1-13-10-grpc/">gRPC Support with NGINX</a></p><figure class="highlight nginx"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="section">server</span> &#123;</span><br><span class="line">    <span class="attribute">listen</span> <span class="number">8000</span> http2;</span><br><span class="line">    <span class="attribute">server_name</span> grpc.localhost;</span><br><span class="line">    <span class="attribute">index</span> index.php;</span><br><span class="line">    <span class="attribute">root</span> /var/www/grpc;</span><br><span class="line"></span><br><span class="line">    <span class="attribute">add_trailer</span> grpc-status <span class="variable">$sent_http_grpc_status</span>;</span><br><span class="line">    <span class="attribute">add_trailer</span> grpc-message <span class="variable">$sent_http_grpc_message</span>;</span><br><span class="line"></span><br><span class="line">    <span class="section">location</span> / &#123;</span><br><span class="line">        <span class="attribute">try_files</span> <span class="variable">$uri</span> <span class="variable">$uri</span>/ /index.php<span class="variable">$is_args</span><span class="variable">$args</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="section">location</span> <span class="regexp">~ \.php$</span> &#123;</span><br><span class="line">        <span class="attribute">include</span> fastcgi_params;</span><br><span class="line">        <span class="attribute">fastcgi_param</span> SCRIPT_FILENAME <span class="variable">$document_root</span><span class="variable">$fastcgi_script_name</span>;</span><br><span class="line">        <span class="attribute">fastcgi_pass</span> php-fpm:<span class="number">9000</span>;</span><br><span class="line">        <span class="attribute">try_files</span> <span class="variable">$uri</span> =<span class="number">404</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>A complete example implemented with the yii2 framework: <a 
href="https://github.com/bootell/yii2-grpc-demo">bootell&#x2F;yii2-grpc-demo</a></p>]]></content>
    
    
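    <summary type="html">&lt;p&gt;There are many options for calls between PHP services, such as HTTP APIs or an RPC framework like &lt;a href=&quot;https://github.com/laruence/yar&quot;&gt;yar&lt;/a&gt;. With future multi-language extensibility and generality in mind, gRPC was chosen.&lt;/p&gt;</summary>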
    
    
    
    
    <category term="PHP" scheme="https://bootell.net/tags/PHP/"/>
    
  </entry>
  
  <entry>
    <title>SSH Configuration and Usage</title>
    <link href="https://bootell.net/2020/07/18/SSH-Config-and-Usage/"/>
    <id>https://bootell.net/2020/07/18/SSH-Config-and-Usage/</id>
    <published>2020-07-18T02:28:00.000Z</published>
    <updated>2020-07-18T02:28:00.000Z</updated>
    
    <content type="html"><![CDATA[<h3 id="免密登录"><a href="#免密登录" class="headerlink" title="免密登录"></a>Passwordless Login</h3><p>Generate a local public/private key pair:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ssh-keygen -t rsa -b 4096 -C &quot;youremail@example.com&quot;</span><br></pre></td></tr></table></figure><p>The generated files are placed in <code>~/.ssh</code>: <code>id_rsa</code> is the private key and <code>id_rsa.pub</code> is the public key. Append the public key to <code>~/.ssh/authorized_keys</code> of the corresponding user on the remote machine.</p><span id="more"></span><h3 id="远程服务器配置"><a href="#远程服务器配置" class="headerlink" title="远程服务器配置"></a>Remote Server Configuration</h3><p>To keep the remote server secure, create a non-root user:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">adduser custom_user</span><br></pre></td></tr></table></figure><p>Edit the server's SSH daemon configuration file <code>/etc/ssh/sshd_config</code>:</p><figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Change the login port</span></span><br><span class="line">Port 2210</span><br><span class="line"><span class="comment"># Disallow root login</span></span><br><span class="line">PermitRootLogin no</span><br><span class="line"><span class="comment"># Disable remote password login</span></span><br><span class="line">PasswordAuthentication no</span><br><span class="line">UsePAM no</span><br></pre></td></tr></table></figure><p>Restart the SSH service:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo service ssh restart</span><br></pre></td></tr></table></figure><h3 id="本地SSH配置"><a href="#本地SSH配置" class="headerlink" 
title="本地SSH配置"></a>Local SSH Configuration</h3><p>The ssh client has two configuration files: the system-wide <code>/etc/ssh/ssh_config</code> and the per-user <code>~/.ssh/config</code>. The settings below work in either file.</p><h5 id="保持连接"><a href="#保持连接" class="headerlink" title="保持连接"></a>Keeping Connections Alive</h5><p>An SSH connection that sits idle for a long time gets dropped. The client can be configured to send keep-alive packets to the server at a fixed interval to keep the connection open.</p><figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Send a keep-alive packet every 30 seconds</span></span><br><span class="line">ServerAliveInterval 30</span><br><span class="line"><span class="comment"># Disconnect after 10 keep-alives go unanswered</span></span><br><span class="line">ServerAliveCountMax 10</span><br></pre></td></tr></table></figure><h5 id="使用多个key"><a href="#使用多个key" class="headerlink" title="使用多个key"></a>Using Multiple Keys</h5><p>Different keys can be assigned to different servers, or several keys can be listed for ssh to try in turn.</p><figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Key file for a single server</span></span><br><span class="line">Host example.com</span><br><span class="line">    IdentityFile ~/.ssh/id_rsa_other</span><br><span class="line">    </span><br><span class="line"><span class="comment"># Multiple keys for all hosts</span></span><br><span class="line">Host *</span><br><span class="line">    IdentityFile ~/.ssh/id_rsa_1</span><br><span class="line">    IdentityFile ~/.ssh/id_rsa_2</span><br></pre></td></tr></table></figure><h5 id="共享连接"><a href="#共享连接" class="headerlink" title="共享连接"></a>Connection Sharing</h5><p>When several sessions to the same server are open at once, they can reuse a single connection instead of establishing a new one each time, which speeds up connecting.</p><figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Host *</span><br><span class="line">    ControlMaster auto</span><br><span class="line">    ControlPath /tmp/ssh-connection-%h-%p-%r</span><br></pre></td></tr></table></figure><p>After a session ends, the master connection can stay open in the background for a while so that the next connection is faster. This is especially useful for git pulls and pushes, since every operation reuses the existing connection.</p><figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Host *</span><br><span class="line">    ControlPersist 4h</span><br></pre></td></tr></table></figure><h3 id="端口转发"><a href="#端口转发" class="headerlink" title="端口转发"></a>Port Forwarding</h3><p><img src="/images/ssh_forward.png" alt="ssh-forward.drawio"></p><h5 id="本地端口转发"><a href="#本地端口转发" class="headerlink" title="本地端口转发"></a>Local Port Forwarding</h5><p>Requests sent to a local port are forwarded, through an intermediate server, to a port on the target host.</p><p>For example, in scenario ①, server3 can connect to server1 but cannot reach server2, while server1 and server2 can talk to each other. Set up forwarding on server3 so that a local port is forwarded to server2 through the intermediate server server1.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">ssh -L [bind_address:]local_port:target_host:target_port intermediate_server</span><br><span class="line">ssh -g -L [localhost:]8080:server2:80 root@server1</span><br></pre></td></tr></table></figure><p>The <code>-g</code> option allows remote hosts to connect to the forwarded port; without it, only server3 itself can use the forward, via localhost:port.</p><h5 id="远程端口转发"><a href="#远程端口转发" class="headerlink" title="远程端口转发"></a>Remote Port Forwarding</h5><p>Requests sent to a port on the remote host are forwarded to the target port.</p><p>For example, in scenario ②, server1 can connect to both server2 and server3, but server3 cannot reach server2. Set up forwarding on server1 so that a port bound on server3 is tunneled through server1 to server2's port.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">ssh -R [bind_address:]bind_port:target_host:target_port user@host</span><br><span class="line">ssh -R [localhost:]8080:server2:80 root@server3</span><br></pre></td></tr></table></figure><h5 
id="动态端口转发"><a href="#动态端口转发" class="headerlink" title="动态端口转发"></a>Dynamic Port Forwarding</h5><p>Requests sent to a local port are forwarded onward, with the destination address and port determined by each request.</p><p>Once forwarding is enabled, ssh runs a SOCKS proxy locally; point the client's proxy setting at it to use it.</p><p>For example, in scenario ①, if server3 needs to reach services on several ports of server2, set up dynamic forwarding on server3 and then set the proxy of each client that needs forwarding to <code>127.0.0.1:8080</code>.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">ssh -D [bind_address:]bind_port user@host</span><br><span class="line">ssh -D [localhost:]8080 root@server1</span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;h3 id=&quot;免密登录&quot;&gt;&lt;a href=&quot;#免密登录&quot; class=&quot;headerlink&quot; title=&quot;免密登录&quot;&gt;&lt;/a&gt;Passwordless Login&lt;/h3&gt;&lt;p&gt;Generate a local public/private key pair:&lt;/p&gt;
&lt;figure class=&quot;highlight shell&quot;&gt;&lt;table&gt;&lt;tr&gt;&lt;td class=&quot;gutter&quot;&gt;&lt;pre&gt;&lt;span class=&quot;line&quot;&gt;1&lt;/span&gt;&lt;br&gt;&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;code&quot;&gt;&lt;pre&gt;&lt;span class=&quot;line&quot;&gt;ssh-keygen -t rsa -b 4096 -C &amp;quot;youremail@example.com&amp;quot;&lt;/span&gt;&lt;br&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/figure&gt;

&lt;p&gt;The generated files are placed in &lt;code&gt;~/.ssh&lt;/code&gt;: &lt;code&gt;id_rsa&lt;/code&gt; is the private key and &lt;code&gt;id_rsa.pub&lt;/code&gt; is the public key. Append the public key to &lt;code&gt;~/.ssh/authorized_keys&lt;/code&gt; of the corresponding user on the remote machine.&lt;/p&gt;</summary>
    
    
    
    
    <category term="Linux" scheme="https://bootell.net/tags/Linux/"/>
    
  </entry>
  
  <entry>
    <title>Automatically Building a Hexo Blog and Deploying It to GitHub Pages</title>
    <link href="https://bootell.net/2019/10/03/Build-Hexo-Using-Github-Actions/"/>
    <id>https://bootell.net/2019/10/03/Build-Hexo-Using-Github-Actions/</id>
    <published>2019-10-03T12:00:00.000Z</published>
    <updated>2022-09-11T14:40:51.208Z</updated>
    
    <content type="html"><![CDATA[<p>I had been using <a href="https://travis-ci.org/">Travis CI</a> to build and deploy Hexo automatically. GitHub recently launched its own continuous integration service, <a href="https://github.com/features/actions">GitHub Actions</a>, so I switched to it for easier management. Both approaches are recorded here.</p><span id="more"></span><h3 id="Hexo-配置"><a href="#Hexo-配置" class="headerlink" title="Hexo 配置"></a>Hexo Configuration</h3><p>Track the theme with a git submodule: <code>git submodule add https://github.com/hexojs/hexo-theme-landscape themes/landscape</code>;<br>Push any commit to the remote master branch first to initialize it. Do not push the source files to the remote master branch; keep them on a separate branch.</p><h3 id="使用-Github-Actions"><a href="#使用-Github-Actions" class="headerlink" title="使用 Github Actions"></a>Using GitHub Actions</h3><p>At the time of writing, <a href="https://github.com/features/actions">GitHub Actions</a> requires applying for access; once approved, an Actions tab appears in the repo.<br>To use it, create <code>.github/workflows/deploy.yaml</code> in the repository root, and remember to adjust the git settings.</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="attr">name:</span> <span class="string">Build</span> <span class="string">and</span> <span class="string">Deploy</span></span><br><span class="line"></span><br><span class="line"><span class="attr">on:</span> [<span class="string">push</span>]</span><br><span class="line"></span><br><span class="line"><span class="attr">jobs:</span></span><br><span class="line">  <span class="attr">build:</span></span><br><span class="line">    <span class="attr">runs-on:</span> <span class="string">ubuntu-latest</span></span><br><span class="line">    <span class="attr">steps:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">uses:</span> <span class="string">actions/checkout@master</span></span><br><span class="line">      <span class="attr">with:</span></span><br><span class="line">        <span class="attr">ref:</span> <span class="string">hexo</span></span><br><span class="line">        <span class="attr">submodules:</span> <span class="literal">true</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Use</span> <span class="string">Node.js</span></span><br><span class="line">      <span class="attr">uses:</span> <span class="string">actions/setup-node@v1</span></span><br><span class="line">      <span class="attr">with:</span></span><br><span class="line">        <span class="attr">node-version:</span> <span class="string">&#x27;10.x&#x27;</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Install</span> <span class="string">dependencies</span></span><br><span class="line">      <span class="attr">run:</span> <span class="string">npm</span> <span class="string">install</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Get</span> <span class="string">old</span> <span class="string">version</span></span><br><span 
class="line">      <span class="attr">run:</span> <span class="string">|</span></span><br><span class="line"><span class="string">        npx hexo clean</span></span><br><span class="line"><span class="string">        git clone &quot;https://github.com/$&#123;GITHUB_REPOSITORY&#125;&quot; -b master public</span></span><br><span class="line"><span class="string"></span>    <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Generate</span> <span class="string">pages</span></span><br><span class="line">      <span class="attr">run:</span> <span class="string">npx</span> <span class="string">hexo</span> <span class="string">generate</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Deploy</span> <span class="string">to</span> <span class="string">GitHub</span> <span class="string">Page</span></span><br><span class="line">      <span class="attr">env:</span></span><br><span class="line">        <span class="attr">GITHUB_TOKEN:</span> <span class="string">$&#123;&#123;</span> <span class="string">secrets.GITHUB_TOKEN</span> <span class="string">&#125;&#125;</span></span><br><span class="line">      <span class="attr">run:</span> <span class="string">|</span></span><br><span class="line"><span class="string">        cd $GITHUB_WORKSPACE/public</span></span><br><span class="line"><span class="string">        git config user.name $GITHUB_ACTOR</span></span><br><span class="line"><span class="string">        git add .</span></span><br><span class="line"><span class="string">        git commit -m &quot;github actions auto build&quot;</span></span><br><span class="line"><span class="string">        git push https://$&#123;GITHUB_ACTOR&#125;:$&#123;GITHUB_TOKEN&#125;@github.com/$&#123;GITHUB_REPOSITORY&#125;.git master:master</span></span><br></pre></td></tr></table></figure><h3 id="使用-Travis-CI"><a href="#使用-Travis-CI" class="headerlink" title="使用 Travis CI"></a>Using Travis 
CI</h3><p>First create a <a href="https://github.com/settings/tokens">GitHub Token</a>, used for the final push to <code>GitHub Pages</code>. Then log in to <a href="https://travis-ci.org/">Travis CI</a>, authorize it, and add a variable <code>GITHUB_TOKEN</code> in the settings whose value is the token created above.<br>Add a <code>.travis.yml</code> file to the repository root:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">language:</span> <span class="string">node_js</span></span><br><span class="line"></span><br><span class="line"><span class="attr">node_js:</span> <span class="string">stable</span></span><br><span class="line"></span><br><span class="line"><span class="attr">cache:</span></span><br><span class="line">  <span class="attr">directories:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">node_modules</span></span><br><span class="line"></span><br><span class="line"><span class="attr">install:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">npm</span> <span class="string">install</span></span><br><span class="line"></span><br><span class="line"><span class="attr">script:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">hexo</span> <span 
class="string">clean</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">git</span> <span class="string">clone</span> <span class="string">https://github.com/$&#123;TRAVIS_REPO_SLUG&#125;.git</span> <span class="string">-b</span> <span class="string">master</span> <span class="string">public/</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">hexo</span> <span class="string">generate</span></span><br><span class="line"></span><br><span class="line"><span class="attr">after_script:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">cd</span> <span class="string">./public</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">git</span> <span class="string">checkout</span> <span class="string">master</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">git</span> <span class="string">config</span> <span class="string">user.name</span> <span class="string">&quot;Travis CI&quot;</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">git</span> <span class="string">add</span> <span class="string">.</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">git</span> <span class="string">commit</span> <span class="string">-m</span> <span class="string">&quot;travis-ci auto build&quot;</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">git</span> <span class="string">push</span> <span class="string">&quot;https://$&#123;GITHUB_TOKEN&#125;@github.com/$&#123;TRAVIS_REPO_SLUG&#125;.git&quot;</span> <span class="string">master:master</span></span><br></pre></td></tr></table></figure>]]></content>
    
    
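    <summary type="html">&lt;p&gt;I had been using &lt;a href=&quot;https://travis-ci.org/&quot;&gt;Travis CI&lt;/a&gt; to build and deploy Hexo automatically. GitHub recently launched its own continuous integration service, &lt;a href=&quot;https://github.com/features/actions&quot;&gt;GitHub Actions&lt;/a&gt;, so I switched to it for easier management. Both approaches are recorded here.&lt;/p&gt;</summary>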
    
    
    
    
    <category term="Blog" scheme="https://bootell.net/tags/Blog/"/>
    
  </entry>
  
  <entry>
    <title>Scrapy Crawler Development</title>
    <link href="https://bootell.net/2019/08/06/Scrapy-Development/"/>
    <id>https://bootell.net/2019/08/06/Scrapy-Development/</id>
    <published>2019-08-06T02:00:00.000Z</published>
    <updated>2022-09-11T14:40:51.208Z</updated>
    
    <content type="html"><![CDATA[<h3 id="Installation"><a href="#Installation" class="headerlink" title="Installation"></a>Installation</h3><p>As of this writing, Scrapy supports Python 2.7 and 3.4+. Follow the <a href="https://docs.scrapy.org/en/latest/intro/install.html">official documentation</a> for the installation steps.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Install</span></span><br><span class="line">pip3 install Scrapy</span><br><span class="line"><span class="comment"># Create a project</span></span><br><span class="line">scrapy startproject crawler</span><br></pre></td></tr></table></figure><p>Once the project has been created, the directory structure looks like this:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">crawler/</span><br><span class="line">    scrapy.cfg            # used by scrapyd-client when deploying the project</span><br><span class="line">    crawler/</span><br><span class="line">        __init__.py</span><br><span class="line">        items.py          # structure of the scraped results</span><br><span class="line">        middlewares.py    # project middlewares</span><br><span class="line">        pipelines.py      # project pipelines</span><br><span class="line">        settings.py       # project settings</span><br><span class="line">        spiders/          # directory holding the spiders</span><br><span class="line">            __init__.py</span><br></pre></td></tr></table></figure><p>Roughly, Scrapy runs through the following flow:</p><p><img src="/images/scrapy_flow.png" alt="Scrapy 流程"></p><ol><li>The Engine gets <code>start_urls</code> from the Spider</li><li>The Engine sends <code>start_urls</code> to the Scheduler and asks for the next 
Request to crawl</li><li>The Scheduler returns the next Request to crawl to the Engine</li><li>The Engine runs the Request through every Middleware's <code>process_request()</code> and then sends it to the Downloader</li><li>After downloading, the Downloader passes the result through every Middleware's <code>process_response()</code> and returns it to the Engine</li><li>The Engine sends the response to the Spider for processing, first running the Middleware's <code>process_spider_input()</code></li><li>The Spider passes its results through the Middleware's <code>process_spider_output()</code> and returns them to the Engine</li><li>The Engine sends the processed data to the Pipeline, passes the processed Requests to the Scheduler, and asks for the next Request</li></ol><p>The settings that usually need changing in <code>settings.py</code> are:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Whether to obey robots.txt</span></span><br><span class="line">ROBOTSTXT_OBEY = <span class="literal">False</span></span><br><span class="line"><span class="comment"># Number of concurrent requests</span></span><br><span class="line">CONCURRENT_REQUESTS = <span class="number">16</span></span><br><span class="line"><span class="comment"># Download delay; the actual delay falls between 0.5*DOWNLOAD_DELAY and 1.5*DOWNLOAD_DELAY</span></span><br><span class="line">DOWNLOAD_DELAY = <span class="number">1</span></span><br><span class="line"><span class="comment"># Per-domain / per-IP concurrent request limits</span></span><br><span class="line">CONCURRENT_REQUESTS_PER_DOMAIN = <span class="number">16</span></span><br><span class="line">CONCURRENT_REQUESTS_PER_IP = <span class="number">16</span></span><br><span class="line"><span class="comment"># Whether to enable cookies</span></span><br><span class="line">COOKIES_ENABLED = <span 
class="literal">False</span></span><br><span class="line"><span class="comment"># Downloader middlewares; the number sets the priority</span></span><br><span class="line">DOWNLOADER_MIDDLEWARES = &#123;&#125;</span><br><span class="line"><span class="comment"># Item pipelines</span></span><br><span class="line">ITEM_PIPELINES = &#123;&#125;</span><br></pre></td></tr></table></figure><span id="more"></span><h3 id="Spider"><a href="#Spider" class="headerlink" title="Spider"></a>Spider</h3><p>The framework ships with several generic spiders:</p><ul><li><code>scrapy.spiders.Spider</code> the most basic spider</li><li><code>scrapy.spiders.CrawlSpider</code> the most commonly used spider; it crawls a whole site according to rules</li><li><code>scrapy.spiders.XMLFeedSpider</code></li><li><code>scrapy.spiders.CSVFeedSpider</code></li><li><code>scrapy.spiders.SitemapSpider</code></li></ul><p>Besides what it inherits from <code>Spider</code>, <code>CrawlSpider</code> provides additional attributes and methods:</p><ul><li><code>rules</code>: a list of <code>Rule</code> objects that defines how the site is crawled. Each rule specifies the action to take for a particular kind of link.</li><li><code>parse_start_url()</code>: called when the response to a <code>start_urls</code> request comes back</li></ul><p>A <code>Rule</code> object filters for relevant links, specifies how they are handled, and decides whether to keep following them:</p><ul><li><code>link_extractor</code> extracts links matching the given patterns from crawled pages and generates new Requests; patterns include <code>allow</code>, <code>deny</code>, and so on. See the <a href="https://docs.scrapy.org/en/latest/topics/link-extractors.html#topics-link-extractors">official documentation</a> for details</li><li><code>callback</code> the method that processes pages whose links match</li><li><code>cb_kwargs</code> keyword arguments passed to the callback</li><li><code>follow</code> whether to keep following links found on the page; defaults to <code>True</code> when <code>callback</code> is <code>None</code>, and to <code>False</code> otherwise</li><li><code>process_links</code> filters the extracted links</li><li><code>process_request</code> processes the Request generated for each extracted link</li></ul><p>A complete example, <code>crawler/spiders/ithome.py</code>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> scrapy</span><br><span class="line"><span class="keyword">from</span> scrapy.spiders <span class="keyword">import</span> CrawlSpider, Rule</span><br><span class="line"><span class="keyword">from</span> scrapy.linkextractors <span class="keyword">import</span> LinkExtractor</span><br><span class="line"><span class="keyword">from</span> ..items <span class="keyword">import</span> CrawlerItem</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">Ithome</span>(<span class="title class_ inherited__">CrawlSpider</span>):</span><br><span class="line">    <span class="comment"># Spider name</span></span><br><span class="line">    name = <span class="string">&quot;ithome&quot;</span></span><br><span class="line">    <span class="comment"># Domains the spider is allowed to crawl</span></span><br><span class="line">    allowed_domains = [</span><br><span class="line">        <span class="string">&#x27;ithome.com&#x27;</span>,</span><br><span class="line">    ]</span><br><span class="line">    <span class="comment"># Starting URLs</span></span><br><span class="line">    start_urls = [<span class="string">&#x27;https://www.ithome.com&#x27;</span>]</span><br><span 
class="line"></span><br><span class="line">    <span class="comment"># Crawling rules</span></span><br><span class="line">    rules = (</span><br><span class="line">        Rule(LinkExtractor(allow=(<span class="string">r&#x27;[0-9]+\.htm&#x27;</span>)), callback=<span class="string">&#x27;parse_article&#x27;</span>, follow=<span class="literal">True</span>),</span><br><span class="line">        Rule(LinkExtractor(allow=(<span class="string">r&#x27;.*\.htm&#x27;</span>))),</span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">parse_article</span>(<span class="params">self, response</span>):</span><br><span class="line">        self.logger.info(<span class="string">&#x27;Parsing url: %s&#x27;</span>, response.url)</span><br><span class="line">        item = CrawlerItem()</span><br><span class="line">        item[<span class="string">&#x27;url&#x27;</span>] = response.url</span><br><span class="line">        item[<span class="string">&#x27;title&#x27;</span>] = response.css(<span class="string">&#x27;.post_title h1::text&#x27;</span>).get()</span><br><span class="line">        content = response.css(<span class="string">&#x27;.post_content p ::text&#x27;</span>).getall()</span><br><span class="line">        item[<span class="string">&#x27;content&#x27;</span>] = <span class="string">&#x27;&#x27;</span>.join(content)</span><br><span class="line">        <span class="keyword">return</span> item</span><br></pre></td></tr></table></figure><h3 id="Items"><a href="#Items" class="headerlink" title="Items"></a>Items</h3><p>Items define the structure of the scraped results.<br>A complete example, <code>crawler/items.py</code>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> scrapy</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">CrawlerItem</span>(scrapy.Item):</span><br><span class="line">    url = scrapy.Field()</span><br><span class="line">    title = scrapy.Field()</span><br><span class="line">    content = scrapy.Field()</span><br><span class="line">    comments = scrapy.Field()</span><br></pre></td></tr></table></figure><h3 id="Middlewares"><a href="#Middlewares" class="headerlink" title="Middlewares"></a>Middlewares</h3><p>Middlewares hook in before and after stages such as downloading and spider processing.<br>Here they are mainly used to get around anti-crawler measures. The most basic measures usually rely on the browser's UA, the client's IP, and dynamically loaded JS, so each is handled in turn below.</p><ul><li><p>Random request UA middleware<br>Requires <code>pip3 install fake-useragent</code>, which fetches real browser user agents from <code>useragentstring.com</code> and <code>w3schools.com</code> and caches them locally.<br>My first attempt picked randomly from a list of UAs found online, but those UAs were already filtered by anti-crawler systems and could not get past them.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> fake_useragent <span class="keyword">import</span> UserAgent</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">UserAgentMiddleware</span>(<span class="title class_ inherited__">object</span>):</span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__init__</span>(<span class="params">self</span>):</span><br><span class="line">        self.ua = UserAgent()</span><br><span class="line"></span><br><span 
class="line">    <span class="keyword">def</span> <span class="title function_">process_request</span>(<span class="params">self, request, spider</span>):</span><br><span class="line">        request.headers.setdefault(<span class="string">&#x27;User-Agent&#x27;</span>, self.ua.random)</span><br></pre></td></tr></table></figure></li><li><p>Random proxy middleware</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> redis</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">ProxyMiddleware</span>(<span class="title class_ inherited__">object</span>):</span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__init__</span>(<span class="params">self, from_url, proxy_pool_key</span>):</span><br><span class="line">        self.redis = redis.from_url(from_url)</span><br><span class="line">        self.proxy_pool_key = proxy_pool_key</span><br><span class="line"></span><br><span class="line"><span class="meta">    @classmethod</span></span><br><span class="line">    <span class="keyword">def</span> <span class="title 
function_">from_crawler</span>(<span class="params">cls, crawler</span>):</span><br><span class="line">        <span class="keyword">return</span> cls(</span><br><span class="line">            from_url=crawler.settings.get(<span class="string">&#x27;REDIS_URL&#x27;</span>),</span><br><span class="line">            proxy_pool_key=crawler.settings.get(<span class="string">&#x27;PROXY_POOL_KEY&#x27;</span>),</span><br><span class="line">        )</span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">process_request</span>(<span class="params">self, request, spider</span>):</span><br><span class="line">        proxy = self.redis.srandmember(self.proxy_pool_key)</span><br><span class="line">        <span class="keyword">if</span> proxy:</span><br><span class="line">            proxy = proxy.decode()</span><br><span class="line">            spider.logger.info(<span class="string">&#x27;Using proxy: %s&#x27;</span>, proxy)</span><br><span class="line">            request.meta[<span class="string">&#x27;proxy&#x27;</span>] = proxy</span><br></pre></td></tr></table></figure></li></ul><blockquote><h4 id="代理配置"><a href="#代理配置" class="headerlink" title="代理配置"></a>Proxy Configuration</h4><p>When the server has multiple IPs, <a href="https://wiki.squid-cache.org/">squid</a> can run an HTTP proxy server in which each proxy port sends traffic out through a different IP address.</p><p>Install it with <code>apt-get install squid</code>, then edit the configuration file <code>/etc/squid/squid.conf</code>:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"># Define the ACLs, one per listening port</span><br><span class="line">acl port3128 localport 3128</span><br><span class="line">acl port3129 localport 3129</span><br><span class="line">acl port3130 localport 3130</span><br><span class="line">acl port3131 localport 3131</span><br><span class="line">acl port3132 localport 3132</span><br><span class="line">acl port3133 localport 3133</span><br><span class="line"></span><br><span class="line"># Access control: which clients may connect</span><br><span class="line">http_access allow all</span><br><span class="line"></span><br><span class="line"># Ports to listen on</span><br><span class="line">http_port 3128</span><br><span class="line">http_port 3129</span><br><span class="line">http_port 3130</span><br><span class="line">http_port 3131</span><br><span class="line">http_port 3132</span><br><span class="line">http_port 3133</span><br><span class="line"></span><br><span class="line"># Outgoing address for each port</span><br><span class="line">tcp_outgoing_address 172.16.0.106 port3128</span><br><span class="line">tcp_outgoing_address 172.16.0.107 port3129</span><br><span class="line">tcp_outgoing_address 172.16.0.108 port3130</span><br><span class="line">tcp_outgoing_address 172.16.0.109 port3131</span><br><span class="line">tcp_outgoing_address 172.16.0.110 port3132</span><br><span class="line">tcp_outgoing_address 172.16.0.248 port3133</span><br></pre></td></tr></table></figure></blockquote><ul><li>CloudFlare anti-crawler protection, whose main technique is generating a local cookie via JS.<br>It can be handled with <a href="https://github.com/clemfromspace/scrapy-cloudflare-middleware">scrapy_cloudflare_middleware</a>; install it directly with <code>pip3 install 
scrapy_cloudflare_middleware</code></li></ul><p>The enabled Middlewares must be registered in the <code>settings.py</code> configuration file.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Cookies must be enabled to get past the CloudFlare anti-bot check</span></span><br><span class="line">COOKIES_ENABLED = <span class="literal">True</span></span><br><span class="line"></span><br><span class="line">DOWNLOADER_MIDDLEWARES = &#123;</span><br><span class="line">    <span class="comment"># Disable the UA Middleware that ships with Scrapy</span></span><br><span class="line">    <span class="string">&#x27;scrapy.downloadermiddlewares.useragent.UserAgentMiddleware&#x27;</span>: <span class="literal">None</span>,</span><br><span class="line">    <span class="string">&#x27;crawler.middlewares.UserAgentMiddleware&#x27;</span>: <span class="number">400</span>,</span><br><span class="line">    <span class="string">&#x27;crawler.middlewares.ProxyMiddleware&#x27;</span>: <span class="number">543</span>,</span><br><span class="line">    <span class="string">&#x27;scrapy_cloudflare_middleware.middlewares.CloudFlareMiddleware&#x27;</span>: <span class="number">560</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="Piplines"><a href="#Piplines" class="headerlink" title="Piplines"></a>Pipelines</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> pymongo</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">MongoPipeline</span>(<span class="title class_ inherited__">object</span>):</span><br><span class="line">    collection_name = <span class="string">&#x27;scrapy_items&#x27;</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__init__</span>(<span class="params">self, mongo_uri, mongo_db</span>):</span><br><span class="line">        self.mongo_uri = mongo_uri</span><br><span class="line">        self.mongo_db = mongo_db</span><br><span class="line"></span><br><span class="line"><span class="meta">    @classmethod</span></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">from_crawler</span>(<span class="params">cls, crawler</span>):</span><br><span class="line">        <span class="keyword">return</span> cls(</span><br><span class="line">            mongo_uri=crawler.settings.get(<span class="string">&#x27;MONGO_URI&#x27;</span>),</span><br><span class="line">            mongo_db=crawler.settings.get(<span class="string">&#x27;MONGO_DATABASE&#x27;</span>, <span 
class="string">&#x27;items&#x27;</span>)</span><br><span class="line">        )</span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">open_spider</span>(<span class="params">self, spider</span>):</span><br><span class="line">        self.client = pymongo.MongoClient(self.mongo_uri)</span><br><span class="line">        self.db = self.client[self.mongo_db]</span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">close_spider</span>(<span class="params">self, spider</span>):</span><br><span class="line">        self.client.close()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">process_item</span>(<span class="params">self, item, spider</span>):</span><br><span class="line">        self.db[self.collection_name].insert_one(<span class="built_in">dict</span>(item))</span><br><span class="line">        <span class="keyword">return</span> item</span><br></pre></td></tr></table></figure><p>在 <code>settings.py</code> 配置中增加</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">ITEM_PIPELINES = &#123;</span><br><span class="line">    <span class="string">&#x27;crawler.pipelines.MongoPipeline&#x27;</span>: <span class="number">300</span>,</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">MONGO_URI=<span class="string">&#x27;mongodb://127.0.0.1:27017&#x27;</span></span><br><span class="line">MONGO_DATABASE = <span class="string">&#x27;items&#x27;</span></span><br></pre></td></tr></table></figure><h3 id="Distributed-crawling"><a href="#Distributed-crawling" 
class="headerlink" title="Distributed crawling"></a>Distributed crawling</h3><p>The Scrapy Scheduler and Duplication Filter store their state locally, so they cannot scale horizontally. <a href="https://github.com/rmax/scrapy-redis">Scrapy-Redis</a> can hold this data instead, which makes the crawler easy to scale out and deploy across several machines.</p><p>Install it with <code>pip3 install Scrapy-Redis</code>, then update the settings:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">SCHEDULER = <span class="string">&quot;scrapy_redis.scheduler.Scheduler&quot;</span></span><br><span class="line">DUPEFILTER_CLASS = <span class="string">&quot;scrapy_redis.dupefilter.RFPDupeFilter&quot;</span></span><br><span class="line">SCHEDULER_PERSIST = <span class="literal">True</span></span><br><span class="line"></span><br><span class="line">REDIS_URL = <span class="string">&#x27;redis://:password@127.0.0.1:6379/0&#x27;</span></span><br></pre></td></tr></table></figure><p>Change the Spider so that it inherits from <code>scrapy_redis.spiders.RedisCrawlSpider</code>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> scrapy_redis.spiders <span class="keyword">import</span> RedisCrawlSpider</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">Ithome</span>(<span class="title class_ inherited__">RedisCrawlSpider</span>):</span><br><span class="line">    <span class="comment"># start_urls = [&#x27;https://www.ithome.com&#x27;]</span></span><br></pre></td></tr></table></figure><p>Because tasks are read from redis, the <code>start_urls</code> must be pushed into redis directly: <code>redis-cli lpush
ithome:start_urls https://www.ithome.com</code></p><h3 id="Deploy"><a href="#Deploy" class="headerlink" title="Deploy"></a>Deploy</h3><p>To run a single spider, use the <code>scrapy crawl spider-name</code> command directly and press <code>Ctrl+C</code> to stop it.</p><p>On a server, where several spiders need to run, the <code>scrapyd</code> service can run and monitor them.<br>First install it with <code>pip3 install scrapyd</code>, then create the configuration file <code>scrapyd.conf</code> in the project directory as follows:</p><figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="section">[scrapyd]</span></span><br><span class="line"><span class="attr">eggs_dir</span>    = eggs</span><br><span class="line"><span class="attr">logs_dir</span>    = logs</span><br><span class="line">items_dir   =</span><br><span class="line"><span class="comment"># Maximum number of job logs to keep</span></span><br><span class="line"><span class="attr">jobs_to_keep</span> = <span class="number">5</span></span><br><span class="line"><span class="attr">dbs_dir</span>     = dbs</span><br><span
class="line"><span class="comment"># 同时执行的最大爬虫数量，设为0时，最大执行数量为 max_proc_per_cpu * cpu 核心数</span></span><br><span class="line"><span class="attr">max_proc</span>    = <span class="number">0</span></span><br><span class="line"><span class="comment"># 每个 CPU 最大同时执行爬虫数量</span></span><br><span class="line"><span class="attr">max_proc_per_cpu</span> = <span class="number">4</span></span><br><span class="line"><span class="comment"># 爬虫历史最大保存数量</span></span><br><span class="line"><span class="attr">finished_to_keep</span> = <span class="number">100</span></span><br><span class="line"><span class="attr">poll_interval</span> = <span class="number">5.0</span></span><br><span class="line"><span class="attr">bind_address</span> = <span class="number">127.0</span>.<span class="number">0.1</span></span><br><span class="line"><span class="attr">http_port</span>   = <span class="number">6800</span></span><br><span class="line"><span class="attr">debug</span>       = <span class="literal">off</span></span><br><span class="line"><span class="attr">runner</span>      = scrapyd.runner</span><br><span class="line"><span class="attr">application</span> = scrapyd.app.application</span><br><span class="line"><span class="attr">launcher</span>    = scrapyd.launcher.Launcher</span><br><span class="line"><span class="attr">webroot</span>     = scrapyd.website.Root</span><br><span class="line"><span class="section">[services]</span></span><br><span class="line"><span class="attr">schedule.json</span>     = scrapyd.webservice.Schedule</span><br><span class="line"><span class="attr">cancel.json</span>       = scrapyd.webservice.Cancel</span><br><span class="line"><span class="attr">addversion.json</span>   = scrapyd.webservice.AddVersion</span><br><span class="line"><span class="attr">listprojects.json</span> = scrapyd.webservice.ListProjects</span><br><span class="line"><span class="attr">listversions.json</span> = scrapyd.webservice.ListVersions</span><br><span class="line"><span 
class="attr">listspiders.json</span>  = scrapyd.webservice.ListSpiders</span><br><span class="line"><span class="attr">delproject.json</span>   = scrapyd.webservice.DeleteProject</span><br><span class="line"><span class="attr">delversion.json</span>   = scrapyd.webservice.DeleteVersion</span><br><span class="line"><span class="attr">listjobs.json</span>     = scrapyd.webservice.ListJobs</span><br><span class="line"><span class="attr">daemonstatus.json</span> = scrapyd.webservice.DaemonStatus</span><br></pre></td></tr></table></figure><p>scrapyd 提供了 API 接口用来控制和监控爬虫</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 列出可运行的所有爬虫</span></span><br><span class="line">curl http://localhost:6800/listspiders.json\?project\=default</span><br><span class="line"><span class="comment"># 执行爬虫</span></span><br><span class="line">curl http://localhost:6800/schedule.json -d project=default -d spider=spider-name </span><br><span class="line"><span class="comment"># 列出所有任务</span></span><br><span class="line">curl http://localhost:6800/listjobs.json?project=default</span><br><span class="line"><span class="comment"># 取消爬虫</span></span><br><span class="line">curl http://localhost:6800/cancel.json -d project=default -d job=&#123;job-id&#125;</span><br></pre></td></tr></table></figure><p>另外 scrapyd 还提供了一个 web 界面方便查看，由于 scrapyd 本身没有提供授权机制，可以使用 nginx 反向代理并设置 Basic Auth。创建 nginx 配置文件 <code>/etc/nginx/sites-enabled/scrapyd</code> </p><figure class="highlight nginx"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="section">server</span> &#123;</span><br><span class="line">    <span class="attribute">listen</span> <span class="number">6801</span>;</span><br><span class="line">    <span class="section">location</span> / &#123;</span><br><span class="line">        <span class="attribute">proxy_pass</span> http://127.0.0.1:6800/;</span><br><span class="line">        <span class="attribute">auth_basic</span> <span class="string">&quot;Restricted&quot;</span>;</span><br><span class="line">        <span class="attribute">auth_basic_user_file</span> /etc/nginx/conf.d/.htpasswd;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Generate the password file (the <code>-b</code> flag lets <code>htpasswd</code> take the password from the command line):</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">htpasswd -cb /etc/nginx/conf.d/.htpasswd username password</span><br></pre></td></tr></table></figure><h3 id="Speed-optimization"><a href="#Speed-optimization" class="headerlink" title="Speed optimization"></a>Speed optimization</h3><p>After deploying to the server with the default configuration, the server load was very low and crawling was slow. A few simple configuration changes speed the crawler up:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">CONCURRENT_REQUESTS = <span class="number">100</span></span><br><span class="line"></span><br><span class="line">DOWNLOAD_DELAY = <span class="number">0</span></span><br><span class="line"></span><br><span class="line">CONCURRENT_REQUESTS_PER_DOMAIN = <span class="number">100</span></span><br><span class="line">CONCURRENT_REQUESTS_PER_IP = 
<span class="number">100</span></span><br><span class="line"></span><br><span class="line">REACTOR_THREADPOOL_MAXSIZE = <span class="number">20</span></span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;h3 id=&quot;Installation&quot;&gt;&lt;a href=&quot;#Installation&quot; class=&quot;headerlink&quot; title=&quot;Installation&quot;&gt;&lt;/a&gt;Installation&lt;/h3&gt;&lt;p&gt;Scrapy supports Python 2.7 and 3.4+. Follow the &lt;a href=&quot;https://docs.scrapy.org/en/latest/intro/install.html&quot;&gt;official documentation&lt;/a&gt; to install it.&lt;/p&gt;
&lt;figure class=&quot;highlight bash&quot;&gt;&lt;table&gt;&lt;tr&gt;&lt;td class=&quot;gutter&quot;&gt;&lt;pre&gt;&lt;span class=&quot;line&quot;&gt;1&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;2&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;3&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;4&lt;/span&gt;&lt;br&gt;&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;code&quot;&gt;&lt;pre&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 安装&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;pip3 install Scrapy&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 创建项目&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;scrapy startproject crawler&lt;/span&gt;&lt;br&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/figure&gt;


&lt;p&gt;Once that is done, the project directory looks like this:&lt;/p&gt;
&lt;figure class=&quot;highlight plaintext&quot;&gt;&lt;table&gt;&lt;tr&gt;&lt;td class=&quot;gutter&quot;&gt;&lt;pre&gt;&lt;span class=&quot;line&quot;&gt;1&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;2&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;3&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;4&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;5&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;6&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;7&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;8&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;9&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;10&lt;/span&gt;&lt;br&gt;&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;code&quot;&gt;&lt;pre&gt;&lt;span class=&quot;line&quot;&gt;crawler/&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;    scrapy.cfg            # scrapyd-client 项目部署时使用&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;    crawler/&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;        __init__.py&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;        items.py          # 爬取结果结构化&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;        middlewares.py    # 项目中间件&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;        pipelines.py      # 项目管道&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;        settings.py       # 项目设置&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;        spiders/          # 爬虫所在目录&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;            __init__.py&lt;/span&gt;&lt;br&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/figure&gt;


&lt;p&gt;At run time, Scrapy roughly follows this flow:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/scrapy_flow.png&quot; alt=&quot;Scrapy 流程&quot;&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The Engine gets the &lt;code&gt;start_urls&lt;/code&gt; from the Spider.&lt;/li&gt;
&lt;li&gt;The Engine sends the &lt;code&gt;start_urls&lt;/code&gt; to the Scheduler and asks it for the next Request to crawl.&lt;/li&gt;
&lt;li&gt;The Scheduler returns the next Request to crawl to the Engine.&lt;/li&gt;
&lt;li&gt;The Engine runs the Request through every Middleware's &lt;code&gt;process_request()&lt;/code&gt; and then sends it to the Downloader.&lt;/li&gt;
&lt;li&gt;After the Downloader fetches the content, the response runs through every Middleware's &lt;code&gt;process_response()&lt;/code&gt; and is returned to the Engine.&lt;/li&gt;
&lt;li&gt;The Engine sends the response to the Spider for processing; before that, the Middleware's &lt;code&gt;process_spider_input()&lt;/code&gt; runs.&lt;/li&gt;
&lt;li&gt;The Spider's results pass through the Middleware's &lt;code&gt;process_spider_output()&lt;/code&gt; and are returned to the Engine.&lt;/li&gt;
&lt;li&gt;The Engine sends the processed data to the Pipeline, hands the processed Requests to the Scheduler, and asks for the next Request.&lt;/li&gt;
&lt;/ol&gt;
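&lt;p&gt;Steps 4 and 5 can be pictured with a small toy model in plain Python. This is only a sketch and not Scrapy's actual engine code: the dict-based request and response, &lt;code&gt;fake_download&lt;/code&gt;, &lt;code&gt;run_once&lt;/code&gt; and both middleware classes are hypothetical names made up for illustration. It assumes the documented ordering, where &lt;code&gt;process_request()&lt;/code&gt; hooks run in priority order and &lt;code&gt;process_response()&lt;/code&gt; hooks run in the reverse order.&lt;/p&gt;

```python
# Toy model of steps 4 and 5 above. This is NOT Scrapy's real engine code;
# every class and function name here is hypothetical.


class AddHeaderMiddleware:
    """Sets a header on each outgoing request."""

    def process_request(self, request):
        request["headers"]["User-Agent"] = "demo-agent"
        return request

    def process_response(self, request, response):
        return response


class TagResponseMiddleware:
    """Records on the response that it passed through this middleware."""

    def process_request(self, request):
        return request

    def process_response(self, request, response):
        response["seen_by"].append("TagResponseMiddleware")
        return response


def fake_download(request):
    """Stands in for the Downloader; no network access happens."""
    return {"url": request["url"], "status": 200, "seen_by": []}


def run_once(url, middlewares):
    request = {"url": url, "headers": {}}
    # Step 4: every middleware sees the request before it is downloaded.
    for mw in middlewares:
        request = mw.process_request(request)
    response = fake_download(request)
    # Step 5: every middleware sees the response, in reverse order.
    for mw in reversed(middlewares):
        response = mw.process_response(request, response)
    return request, response


request, response = run_once(
    "https://www.ithome.com",
    [AddHeaderMiddleware(), TagResponseMiddleware()],
)
print(request["headers"], response["status"], response["seen_by"])
```

&lt;p&gt;In a real Scrapy project the hooks also receive the spider as an argument, and &lt;code&gt;process_request()&lt;/code&gt; normally returns &lt;code&gt;None&lt;/code&gt; to let processing continue; the toy version returns the request only to keep the loop short.&lt;/p&gt;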
&lt;p&gt;The settings most often changed in the &lt;code&gt;settings.py&lt;/code&gt; configuration file are:&lt;/p&gt;
&lt;figure class=&quot;highlight python&quot;&gt;&lt;table&gt;&lt;tr&gt;&lt;td class=&quot;gutter&quot;&gt;&lt;pre&gt;&lt;span class=&quot;line&quot;&gt;1&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;2&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;3&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;4&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;5&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;6&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;7&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;8&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;9&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;10&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;11&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;12&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;13&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;14&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;15&lt;/span&gt;&lt;br&gt;&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;code&quot;&gt;&lt;pre&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 是否遵循 robots 协议&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;ROBOTSTXT_OBEY = &lt;span class=&quot;literal&quot;&gt;False&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 并发请求数&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;CONCURRENT_REQUESTS = &lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 下载间隔，实际范围在 0.5*DOWNLOAD_DELAY 到 1.5*DOWNLOAD_DELAY 之间&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;DOWNLOAD_DELAY = &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 域名/IP 
并发请求限制&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;CONCURRENT_REQUESTS_PER_DOMAIN = &lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;CONCURRENT_REQUESTS_PER_IP = &lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 启用 Cookies&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;COOKIES_ENABLED = &lt;span class=&quot;literal&quot;&gt;False&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 下载中间件，后面数字表示优先级&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;DOWNLOADER_MIDDLEWARES = &amp;#123;&amp;#125;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# 管道&lt;/span&gt;&lt;/span&gt;&lt;br&gt;&lt;span class=&quot;line&quot;&gt;ITEM_PIPELINES = &amp;#123;&amp;#125;&lt;/span&gt;&lt;br&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/figure&gt;</summary>
    
    
    
    
    <category term="爬虫" scheme="https://bootell.net/tags/%E7%88%AC%E8%99%AB/"/>
    
  </entry>
  
  <entry>
    <title>Automated LEDE Builds for the K3</title>
    <link href="https://bootell.net/2019/02/28/LEDE-K3-Build-with-GitLab-CI/"/>
    <id>https://bootell.net/2019/02/28/LEDE-K3-Build-with-GitLab-CI/</id>
    <published>2019-02-28T02:18:00.000Z</published>
    <updated>2019-10-10T02:08:00.000Z</updated>
    
    <content type="html"><![CDATA[<p>Building the LEDE&#x2F;OpenWrt firmware locally took nearly 3 hours, and downloading the dependencies was slow because of network problems, so I looked at free CI services to build the firmware instead. The current options are GitHub Actions and GitLab CI.</p><span id="more"></span><h3 id="编译配置"><a href="#编译配置" class="headerlink" title="编译配置"></a>Build configuration</h3><h5 id="生成编译配置文件"><a href="#生成编译配置文件" class="headerlink" title="生成编译配置文件"></a>Generating the build configuration file</h5><p>The build uses the <a href="https://github.com/coolsnowwolf/lede">coolsnowwolf&#x2F;lede</a> project. First, configure the firmware to be built:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Clone the project</span></span><br><span class="line">git <span class="built_in">clone</span> https://github.com/coolsnowwolf/lede.git</span><br><span class="line"></span><br><span class="line"><span class="built_in">cd</span> lede</span><br><span class="line"></span><br><span class="line"><span class="comment"># Update and install the package feeds</span></span><br><span class="line">./scripts/feeds update -a</span><br><span class="line">./scripts/feeds install -a</span><br><span class="line"></span><br><span class="line"><span class="comment"># Choose options and generate the configuration file</span></span><br><span class="line">make menuconfig</span><br></pre></td></tr></table></figure><h5 id="固化原固件配置"><a href="#固化原固件配置" class="headerlink" title="固化原固件配置"></a>Baking in the existing firmware configuration</h5><p>Add the files you want to keep from the existing firmware to the <code>/files</code> directory of the build tree, and the previous firmware's configuration is preserved.<br>Common configuration locations are shown below:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span
class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">/</span><br><span class="line">└── etc</span><br><span class="line">    ├── config</span><br><span class="line">    │   ├── system    hostname, time zone, etc.</span><br><span class="line">    │   ├── network   network</span><br><span class="line">    │   ├── dhcp</span><br><span class="line">    │   └── ddns</span><br><span class="line">    └── lib/lua/luci/view/admin_status/index.htm  home page style</span><br></pre></td></tr></table></figure><h3 id="自动化编译"><a href="#自动化编译" class="headerlink" title="自动化编译"></a>Automated builds</h3><p>An automated build needs its configuration prepared first: save the configuration file generated by <code>make menuconfig</code> as <code>defconfig</code> and commit it to the remote repository with everything else.</p><h5 id="GitHub-Actions"><a href="#GitHub-Actions" class="headerlink" title="GitHub Actions"></a>GitHub Actions</h5><p><a href="https://github.com/features/actions">GitHub Actions</a> gives free accounts 2,000 minutes per month, and a single run can last up to 6 hours, which is plenty. It is currently in beta and access has to be requested.<br>The build needs a workflow file at <code>.github/workflows/build.yaml</code>:</p><figure class="highlight yml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">name:</span> <span
class="string">Build</span></span><br><span class="line"></span><br><span class="line"><span class="attr">on:</span> [<span class="string">push</span>]</span><br><span class="line"></span><br><span class="line"><span class="attr">jobs:</span></span><br><span class="line">  <span class="attr">k3:</span></span><br><span class="line">    <span class="attr">runs-on:</span> <span class="string">ubuntu-16.04</span></span><br><span class="line">    <span class="attr">steps:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">uses:</span> <span class="string">actions/checkout@master</span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Set</span> <span class="string">Environment</span></span><br><span class="line">        <span class="attr">run:</span> <span class="string">|</span></span><br><span class="line"><span class="string">          sudo apt-get -yqq update</span></span><br><span class="line"><span class="string">          sudo apt-get -yqq install build-essential asciidoc binutils bzip2 gawk gettext git libncurses5-dev libz-dev patch unzip zlib1g-dev lib32gcc1 libc6-dev-i386 subversion flex uglifyjs git-core gcc-multilib p7zip p7zip-full msmtp libssl-dev texinfo libglib2.0-dev xmlto qemu-utils upx libelf-dev autoconf automake libtool autopoint</span></span><br><span class="line"><span class="string"></span>      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Prepare</span> <span class="string">Build</span></span><br><span class="line">        <span class="attr">run:</span> <span class="string">|</span></span><br><span class="line"><span class="string">          ./scripts/feeds update -a</span></span><br><span class="line"><span class="string">          ./scripts/feeds install -a</span></span><br><span class="line"><span class="string">          cp k3config .config</span></span><br><span class="line"><span class="string"> 
         make defconfig</span></span><br><span class="line"><span class="string">          sed -i &#x27;s|^TARGET_|# TARGET_|g; s|# TARGET_DEVICES += phicomm-k3|TARGET_DEVICES += phicomm-k3|&#x27; target/linux/bcm53xx/image/Makefile</span></span><br><span class="line"><span class="string"></span>      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Build</span> <span class="string">Image</span></span><br><span class="line">        <span class="attr">run:</span> <span class="string">make</span> <span class="string">-j$(nproc)</span> <span class="string">V=sc</span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Upload</span> <span class="string">Image</span></span><br><span class="line">        <span class="attr">uses:</span> <span class="string">actions/upload-artifact@v1</span></span><br><span class="line">        <span class="attr">with:</span></span><br><span class="line">          <span class="attr">name:</span> <span class="string">lede-k3-$&#123;&#123;</span> <span class="string">github.sha</span> <span class="string">&#125;&#125;</span></span><br><span class="line">          <span class="attr">path:</span> <span class="string">bin</span></span><br></pre></td></tr></table></figure><h5 id="GitLab-CI"><a href="#GitLab-CI" class="headerlink" title="GitLab CI"></a>GitLab CI</h5><p>GitLab CI 的 Shared Runners 每个 Job 的时间限制为3小时，每月有2000分钟的免费编译时长，足够日常编译自己的固件了。<br>编译需要在 GitLab 后台 Setting&#x2F;CI&#x2F;General 中设置 <code>Timeout</code> 时间为 <code>3h</code>，并添加配置文件 <code>.gitlab-ci.yml</code> 如下：</p><figure class="highlight yml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">image:</span> <span class="string">tossp/lede:latest</span></span><br><span class="line"></span><br><span class="line"><span class="attr">stages:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">build</span></span><br><span class="line">  <span class="bullet">-</span> <span class="string">deploy</span></span><br><span class="line"></span><br><span class="line"><span class="attr">make:</span></span><br><span class="line">  <span class="attr">stage:</span> <span class="string">build</span></span><br><span class="line">  <span class="attr">variables:</span></span><br><span class="line">    <span class="attr">GIT_SUBMODULE_STRATEGY:</span> <span class="string">recursive</span></span><br><span class="line">    <span class="attr">FORCE_UNSAFE_CONFIGURE:</span> <span class="string">&#x27;1&#x27;</span></span><br><span class="line">  <span class="attr">cache:</span></span><br><span class="line">    <span class="attr">key:</span> <span class="string">&quot;$CI_JOB_NAME-$CI_COMMIT_REF_SLUG&quot;</span></span><br><span class="line">    <span class="attr">paths:</span></span><br><span 
class="line">      <span class="bullet">-</span> <span class="string">dl/</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">feeds/</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">staging_dir/</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">build_dir/</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">key-*</span></span><br><span class="line">  <span class="attr">artifacts:</span></span><br><span class="line">      <span class="attr">name:</span> <span class="string">&quot;$&#123;CI_JOB_STAGE&#125;_$&#123;CI_JOB_NAME&#125;_$&#123;CI_COMMIT_REF_NAME&#125;&quot;</span></span><br><span class="line">      <span class="attr">when:</span> <span class="string">always</span></span><br><span class="line">      <span class="attr">paths:</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">bin/</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">build.log</span></span><br><span class="line">  <span class="attr">script:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">./scripts/feeds</span> <span class="string">update</span> <span class="string">-a</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">./scripts/feeds</span> <span class="string">install</span> <span class="string">-a</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">cp</span> <span class="string">defconfig</span> <span class="string">.config</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">make</span> <span class="string">defconfig</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">sed</span> <span class="string">-i</span> <span 
class="string">&#x27;s|^TARGET_|# TARGET_|g; s|# TARGET_DEVICES += phicomm-k3|TARGET_DEVICES += phicomm-k3|&#x27;</span> <span class="string">target/linux/bcm53xx/image/Makefile</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">make</span> <span class="string">-j$(nproc)</span> <span class="string">V=sc</span> <span class="string">&gt;</span> <span class="string">./build.log</span> <span class="number">2</span><span class="string">&gt;&amp;1</span></span><br><span class="line">  <span class="attr">only:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">ci</span></span><br></pre></td></tr></table></figure><h3 id="其他问题"><a href="#其他问题" class="headerlink" title="其他问题"></a>Other Issues</h3><ul><li><p>Upgrading when the existing firmware is LEDE<br>Copy the firmware to <code>/tmp/</code> on the router, then flash it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">mtd -r write /tmp/openwrt-bcm53xx-phicomm-k3-squashfs.trx firmware</span><br></pre></td></tr></table></figure></li><li><p>Enabling the hidden feature (ssr)</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">echo</span> 0xDEADBEEF &gt; /etc/config/google_fu_mode</span><br></pre></td></tr></table></figure></li><li><p>Wireless driver<br>The K3's wireless driver can be swapped by replacing the file directly.<br>Driver download: <a href="https://github.com/Hill-98/phicommk3-firmware">https://github.com/Hill-98/phicommk3-firmware</a><br>Driver location in the source tree: <code>/package/lean/k3-brcmfmac4366c-firmware/files/lib/firmware/brcm/brcmfmac4366c-pcie.bin</code><br>Driver location in the firmware: <code>/lib/firmware/brcm/brcmfmac4366c-pcie.bin</code></p></li><li><p>mwan3 ipset<br>The ipset feature of mwan3 makes it possible to match traffic by domain name:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="built_in">echo</span> <span class="string">&#x27;ipset create baidu hash:ip family inet&#x27;</span> &gt;&gt; /etc/rc.local</span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;conf-dir=/etc/dnsmasq.d&#x27;</span> &gt;&gt; /etc/dnsmasq.conf</span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;ipset=/baidu.com/baidu&#x27;</span> &gt;&gt; /etc/dnsmasq.d/baidu.conf</span><br></pre></td></tr></table></figure></li></ul><h3 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>References</h3><p><a href="https://www.right.com.cn/forum/thread-257677-1-1.html">https://www.right.com.cn/forum/thread-257677-1-1.html</a><br><a href="https://www.right.com.cn/forum/thread-419328-1-1.html">https://www.right.com.cn/forum/thread-419328-1-1.html</a></p>]]></content>
    
    
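    <summary type="html">&lt;p&gt;Building LEDE&amp;#x2F;OpenWrt firmware locally once took nearly 3 hours, and downloading the dependencies was slow because of network problems. Free CI services can build the firmware instead; the current options are GitHub Actions and GitLab CI.&lt;/p&gt;</summary>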
    
    
    
    
    <category term="路由器" scheme="https://bootell.net/tags/%E8%B7%AF%E7%94%B1%E5%99%A8/"/>
    
  </entry>
  
  <entry>
    <title>Implementing Automatic Subscriptions with PayPal</title>
    <link href="https://bootell.net/2018/08/30/Paypay-Subscriptions-Integrate/"/>
    <id>https://bootell.net/2018/08/30/Paypay-Subscriptions-Integrate/</id>
    <published>2018-08-30T08:30:00.000Z</published>
    <updated>2022-09-11T14:40:51.208Z</updated>
    
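    <content type="html"><![CDATA[<p>The official documentation splits automatic renewal into five steps: <a href="https://developer.paypal.com/docs/subscriptions/integrate/integrate-steps/">Integrate Subscriptions</a>. In real development you also need to handle payment results and manage subscriptions:</p><ol><li>Create the plan in advance and activate it;</li><li>The user creates a subscription and is redirected to the PayPal site to approve it;</li><li>After approval, the user is redirected back to your site, which executes the subscription;</li><li>Obtain the user's bills, either by receiving the notification for each charge or by actively querying the payment result;</li><li>Handle notifications such as the user cancelling the subscription.</li></ol><span id="more"></span><h3 id="使用-Palpal-SDK"><a href="#使用-Palpal-SDK" class="headerlink" title="使用 Palpal SDK"></a>Using the PayPal SDK</h3><figure class="highlight php"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">composer <span class="keyword">require</span> paypal/rest-api-sdk-php</span><br></pre></td></tr></table></figure><p>The official site provides complete <a href="https://paypal.github.io/PayPal-PHP-SDK/sample/#billing">Samples</a>;</p><p>The <a href="https://www.sandbox.paypal.com/">PayPal Sandbox</a> makes debugging convenient.</p><h3 id="创建订阅计划并激活"><a href="#创建订阅计划并激活" class="headerlink" title="创建订阅计划并激活"></a>Creating and Activating a Billing Plan</h3><ul><li>A billing plan is the equivalent of a product, so each price point of each product needs its own plan. The plan can, however, be overridden per user when the agreement is created;</li><li>When a payment of type <code>TRIAL</code> is defined, a <code>REGULAR</code> payment must also exist. <code>TRIAL</code> cannot decide by itself whether conditions such as being a new user are met, so first-time discounts for new users must be implemented in your own business code;</li><li>When a user's billing agreement is created, <strong>its start time must be at least 24 hours after the current time</strong>, so recurring billing cannot charge immediately; the earliest charge is 24 hours away. Most businesses need the first charge right away, which can be done by setting the first charge amount with <code>setSetupFee</code> on <code>MerchantPreferences</code>;</li><li>The PayPal SDK raises the error <code>&quot;NotifyUrl&quot; value is NULL</code>. This is a PayPal server-side error that has not been fixed officially; for a workaround see this <a href="https://github.com/paypal/PayPal-PHP-SDK/pull/1152/files">pull request</a>.</li></ul><h3 id="创建订阅"><a href="#创建订阅" class="headerlink" title="创建订阅"></a>Creating a Subscription</h3><ul><li><p>A user can create multiple billing agreements against the same billing plan. After creation, the user is redirected to the PayPal site to approve the agreement;</p></li><li><p>Because the agreement start time <code>start_date</code> can be no earlier than 24 hours from now, this value effectively sets the time of the second charge. For monthly billing, set <code>start_date</code> to one month from now and use the <code>setSetupFee</code> amount as the first charge;</p></li><li><p>Right after the subscription is created, <code>Agreement.id</code> has not been generated yet. At this point, extract the <code>token</code>
from the approval link so that the created subscription can be matched with the agreement information returned when the user is redirected back after approval.</p><figure class="highlight php"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="variable">$link</span> = <span class="variable">$agreement</span>-&gt;<span class="title function_ invoke__">getApprovalLink</span>();</span><br><span class="line"><span class="title function_ invoke__">parse_str</span>(<span class="title function_ invoke__">parse_url</span>(<span class="variable">$link</span>, PHP_URL_QUERY), <span class="variable">$params</span>);</span><br><span class="line"><span class="variable">$token</span> = <span class="variable">$params</span>[<span class="string">&#x27;token&#x27;</span>];</span><br></pre></td></tr></table></figure></li></ul><h3 id="执行订阅"><a href="#执行订阅" class="headerlink" title="执行订阅"></a>Executing the Subscription</h3><ul><li>The same user can subscribe to the same billing plan more than once, so, where needed, cancel that user's previous agreement manually when executing the new one;</li><li>Actual charges are delayed: each recurring charge runs a few hours later than the time shown in <code>AgreementDetail.next</code>. To keep service continuous, billing can be scheduled one day early.</li></ul><h3 id="支付结果接收与查询"><a href="#支付结果接收与查询" class="headerlink" title="支付结果接收与查询"></a>Receiving and Querying Payment Results</h3><ul><li><code>webhook</code> notifications can be configured under <code>My Apps -&gt; REST API apps -&gt; WEBHOOKS</code>. Each time a recurring charge succeeds, PayPal sends a <code>PAYMENT.SALE.COMPLETED</code> event notification; its <code>billing_agreement_id</code> field can be matched against the subscriptions already created to find the agreement that the payment belongs to.</li><li>Every <code>AgreementDetail</code> returns the next billing time in its <code>next</code> parameter. Once that time has passed, all transactions of the agreement can be queried with the <code>Agreement::searchTransactions</code> method. Note that PayPal's actual charge time is usually delayed, so the query needs to be retried several times.</li></ul><h3 id="用户订阅取消与删除等"><a href="#用户订阅取消与删除等" class="headerlink" title="用户订阅取消与删除等"></a>Subscription Cancellation, Deletion, and Related Events</h3><ul><li>Cancelling a subscription sends a <code>BILLING.SUBSCRIPTION.CANCELLED</code> notification through the <code>webhook</code>, and suspending a subscription sends a <code>BILLING.SUBSCRIPTION.SUSPENDED</code> notification;</li><li>Deleting a plan does not automatically delete the agreements based on it, so before deleting a plan, manually cancel every agreement that subscribes to it.</li></ul><h3 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>References</h3><p><a 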
href="https://developer.paypal.com/docs/subscriptions/">https://developer.paypal.com/docs/subscriptions/</a></p><p><a href="https://paypal.github.io/PayPal-PHP-SDK/sample/">https://paypal.github.io/PayPal-PHP-SDK/sample/</a></p><p><a href="https://www.cnblogs.com/pheye/p/6603126.html">https://www.cnblogs.com/pheye/p/6603126.html</a></p>]]></content>
    
    
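    <summary type="html">&lt;p&gt;The official documentation splits automatic renewal into five steps: &lt;a href=&quot;https://developer.paypal.com/docs/subscriptions/integrate/integrate-steps/&quot;&gt;Integrate Subscriptions&lt;/a&gt;. In real development you also need to handle payment results and manage subscriptions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create the plan in advance and activate it;&lt;/li&gt;
&lt;li&gt;The user creates a subscription and is redirected to the PayPal site to approve it;&lt;/li&gt;
&lt;li&gt;After approval, the user is redirected back to your site, which executes the subscription;&lt;/li&gt;
&lt;li&gt;Obtain the user's bills, either by receiving the notification for each charge or by actively querying the payment result;&lt;/li&gt;
&lt;li&gt;Handle notifications such as the user cancelling the subscription.&lt;/li&gt;
&lt;/ol&gt;</summary>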
    
    
    
    
    <category term="PHP" scheme="https://bootell.net/tags/PHP/"/>
    
  </entry>
  
  <entry>
    <title>Hello World</title>
    <link href="https://bootell.net/2016/02/01/Hello-World/"/>
    <id>https://bootell.net/2016/02/01/Hello-World/</id>
    <published>2016-02-01T04:00:00.000Z</published>
    <updated>2022-09-11T14:40:51.208Z</updated>
    
    <content type="html"><![CDATA[<p>2016: from campus to society, from taking in knowledge to speaking up.</p><p>Recording what I accumulate, here: boot to tell.</p><span id="more"></span>]]></content>
    
    
    <summary type="html">&lt;p&gt;2016: from campus to society, from taking in knowledge to speaking up.&lt;/p&gt;
&lt;p&gt;Recording what I accumulate, here: boot to tell.&lt;/p&gt;</summary>
    
    
    
    
  </entry>
  
</feed>
