Commit

control for oov based errors with w2v vectorization
Nathaniel Imel authored and Nathaniel Imel committed Nov 12, 2023
1 parent a6961ad commit 7442fb8
Showing 30 changed files with 7,423 additions and 2,970 deletions.
35 changes: 35 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,35 @@
name: website

# build the documentation whenever there are new commits on main
on:
  push:
    branches:
      - main
    # Alternative: only build for tags.
    # tags:
    #   - '*'

# security: restrict permissions for CI jobs.
permissions:
  contents: read

jobs:
  # Build the documentation and upload the static HTML files as an artifact.
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      # ADJUST THIS: install all dependencies (including pdoc)
      - run: pip install -e .
      - run: pip install pdoc
      # ADJUST THIS: build your documentation into docs/.
      # We use a custom build script for pdoc itself, ideally you just run `pdoc -o docs/ ...` here.
      - run: pdoc src/sciterra -d google --math -o ./docs

      - uses: actions/upload-pages-artifact@v1
        with:
          path: docs/
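
Reviewer note: to check the documentation build before pushing to main, the workflow's pdoc step can be reproduced locally. The following is only a minimal sketch, assuming pdoc is already installed in the active environment and the script runs from the repository root. The helper names `pdoc_command` and `build_docs` are hypothetical and are not part of sciterra or pdoc.

```python
import subprocess
import sys
from pathlib import Path


def pdoc_command(package_dir: str = "src/sciterra", out_dir: str = "docs") -> list[str]:
    """Return the argument list for the same pdoc call the workflow makes.

    Hypothetical helper: it mirrors `pdoc src/sciterra -d google --math -o ./docs`,
    but goes through `python -m pdoc` so the current interpreter's pdoc is used.
    """
    return [
        sys.executable, "-m", "pdoc",
        package_dir,      # package to document
        "-d", "google",   # docstring format, as in the workflow
        "--math",         # enable math rendering, as in the workflow
        "-o", out_dir,    # write static HTML here instead of starting a server
    ]


def build_docs(package_dir: str = "src/sciterra", out_dir: str = "docs") -> Path:
    """Run pdoc and return the output directory; raises if pdoc exits non-zero."""
    subprocess.run(pdoc_command(package_dir, out_dir), check=True)
    return Path(out_dir)
```

Calling `build_docs()` from the repository root should write the HTML into `docs/`, which is the directory the workflow uploads as the Pages artifact.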
20 changes: 13 additions & 7 deletions docs/sciterra.html
@@ -79,20 +79,26 @@ <h1 class="modulename">

<p><a href="https://github.com/nathimel/sciterra/actions/workflows/test.yml"><img src="https://github.com/nathimel/sciterra/actions/workflows/test.yml/badge.svg" alt="build" /></a></p>

<p>Software library to support data-driven analyses of scientific literature</p>
<p>Software library to support data-driven analyses of scientific literature.</p>

<p>Inspired heavily by Zach Hafen's <a href="https://github.com/zhafen/cc">cc</a> library.</p>
<p>This library is a reimplementation of Zach Hafen's <a href="https://github.com/zhafen/cc">cc</a> library.</p>
</div>

<input id="mod-sciterra-view-source" class="view-source-toggle-state" type="checkbox" aria-hidden="true" tabindex="-1">

<label class="view-source-button" for="mod-sciterra-view-source"><span>View Source</span></label>

<div class="pdoc-code codehilite"><pre><span></span><span id="L-1"><a href="#L-1"><span class="linenos">1</span></a><span class="sd">&quot;&quot;&quot;</span>
</span><span id="L-2"><a href="#L-2"><span class="linenos">2</span></a><span class="sd">.. include:: ../../README.md</span>
</span><span id="L-3"><a href="#L-3"><span class="linenos">3</span></a><span class="sd">&quot;&quot;&quot;</span>
</span><span id="L-4"><a href="#L-4"><span class="linenos">4</span></a>
</span><span id="L-5"><a href="#L-5"><span class="linenos">5</span></a><span class="n">__docformat__</span> <span class="o">=</span> <span class="s2">&quot;google&quot;</span>
<div class="pdoc-code codehilite"><pre><span></span><span id="L-1"><a href="#L-1"><span class="linenos"> 1</span></a><span class="sd">&quot;&quot;&quot;</span>
</span><span id="L-2"><a href="#L-2"><span class="linenos"> 2</span></a><span class="sd">.. include:: ../../README.md</span>
</span><span id="L-3"><a href="#L-3"><span class="linenos"> 3</span></a><span class="sd">&quot;&quot;&quot;</span>
</span><span id="L-4"><a href="#L-4"><span class="linenos"> 4</span></a>
</span><span id="L-5"><a href="#L-5"><span class="linenos"> 5</span></a><span class="n">__docformat__</span> <span class="o">=</span> <span class="s2">&quot;google&quot;</span>
</span><span id="L-6"><a href="#L-6"><span class="linenos"> 6</span></a>
</span><span id="L-7"><a href="#L-7"><span class="linenos"> 7</span></a><span class="kn">from</span> <span class="nn">.mapping.atlas</span> <span class="kn">import</span> <span class="n">Atlas</span>
</span><span id="L-8"><a href="#L-8"><span class="linenos"> 8</span></a><span class="kn">from</span> <span class="nn">.mapping.cartography</span> <span class="kn">import</span> <span class="n">Cartographer</span>
</span><span id="L-9"><a href="#L-9"><span class="linenos"> 9</span></a><span class="kn">from</span> <span class="nn">.mapping.publication</span> <span class="kn">import</span> <span class="p">(</span>
</span><span id="L-10"><a href="#L-10"><span class="linenos">10</span></a> <span class="n">Publication</span><span class="p">,</span>
</span><span id="L-11"><a href="#L-11"><span class="linenos">11</span></a><span class="p">)</span> <span class="c1"># publication should probably be moved out of mapping.</span>
</span></pre></div>


16 changes: 13 additions & 3 deletions docs/sciterra/librarians.html
@@ -71,9 +71,19 @@ <h1 class="modulename">
<a href="./../sciterra.html">sciterra</a><wbr>.librarians </h1>





<input id="mod-librarians-view-source" class="view-source-toggle-state" type="checkbox" aria-hidden="true" tabindex="-1">

<label class="view-source-button" for="mod-librarians-view-source"><span>View Source</span></label>

<div class="pdoc-code codehilite"><pre><span></span><span id="L-1"><a href="#L-1"><span class="linenos">1</span></a><span class="kn">from</span> <span class="nn">.librarian</span> <span class="kn">import</span> <span class="n">Librarian</span>
</span><span id="L-2"><a href="#L-2"><span class="linenos">2</span></a><span class="kn">from</span> <span class="nn">.adslibrarian</span> <span class="kn">import</span> <span class="n">ADSLibrarian</span>
</span><span id="L-3"><a href="#L-3"><span class="linenos">3</span></a><span class="kn">from</span> <span class="nn">.s2librarian</span> <span class="kn">import</span> <span class="n">SemanticScholarLibrarian</span>
</span><span id="L-4"><a href="#L-4"><span class="linenos">4</span></a>
</span><span id="L-5"><a href="#L-5"><span class="linenos">5</span></a><span class="sd">&quot;&quot;&quot;Why is there not an ArxivLibrarian? For now, we are restricting to APIs that allow us to traverse literature graphs, and arxiv does not have one. While there is a useful pip-installable package for querying the arxiv api for papers, https://pypi.org/project/arxiv/, the returned object does not have information on references and citations. However, it may still be possible to obtain a large sample of publications with abstracts and submission dates (though no citation counts), because the arxiv API&#39;s limit for a single query is 300,000 results.</span>
</span><span id="L-6"><a href="#L-6"><span class="linenos">6</span></a><span class="sd">&quot;&quot;&quot;</span>
</span></pre></div>


</section>
</main>
<script>