Deploying to gh-pages from @ dad2176 🚀
jgbradley1 committed Nov 27, 2024
1 parent b32f885 commit dfb59de
Showing 9 changed files with 1,920 additions and 1,917 deletions.
42 changes: 19 additions & 23 deletions cli/index.html
@@ -1406,19 +1406,17 @@ <h2 id="index">index</h2>
<a id="__codelineno-3-7" name="__codelineno-3-7" href="#__codelineno-3-7"></a> --resume TEXT Resume a given indexing run
<a id="__codelineno-3-8" name="__codelineno-3-8" href="#__codelineno-3-8"></a> --reporter [rich|print|none] The progress reporter to use. [default:
<a id="__codelineno-3-9" name="__codelineno-3-9" href="#__codelineno-3-9"></a> rich]
-<a id="__codelineno-3-10" name="__codelineno-3-10" href="#__codelineno-3-10"></a> --emit TEXT The data formats to emit, comma-separated.
-<a id="__codelineno-3-11" name="__codelineno-3-11" href="#__codelineno-3-11"></a> [default: parquet]
-<a id="__codelineno-3-12" name="__codelineno-3-12" href="#__codelineno-3-12"></a> --dry-run / --no-dry-run Run the indexing pipeline without executing
-<a id="__codelineno-3-13" name="__codelineno-3-13" href="#__codelineno-3-13"></a> any steps to inspect and validate the
-<a id="__codelineno-3-14" name="__codelineno-3-14" href="#__codelineno-3-14"></a> configuration. [default: no-dry-run]
-<a id="__codelineno-3-15" name="__codelineno-3-15" href="#__codelineno-3-15"></a> --cache / --no-cache Use LLM cache. [default: cache]
-<a id="__codelineno-3-16" name="__codelineno-3-16" href="#__codelineno-3-16"></a> --skip-validation / --no-skip-validation
-<a id="__codelineno-3-17" name="__codelineno-3-17" href="#__codelineno-3-17"></a> Skip any preflight validation. Useful when
-<a id="__codelineno-3-18" name="__codelineno-3-18" href="#__codelineno-3-18"></a> running no LLM steps. [default: no-skip-
-<a id="__codelineno-3-19" name="__codelineno-3-19" href="#__codelineno-3-19"></a> validation]
-<a id="__codelineno-3-20" name="__codelineno-3-20" href="#__codelineno-3-20"></a> --output PATH Indexing pipeline output directory.
-<a id="__codelineno-3-21" name="__codelineno-3-21" href="#__codelineno-3-21"></a> Overrides storage.base_dir in the
-<a id="__codelineno-3-22" name="__codelineno-3-22" href="#__codelineno-3-22"></a> configuration file.
+<a id="__codelineno-3-10" name="__codelineno-3-10" href="#__codelineno-3-10"></a> --dry-run / --no-dry-run Run the indexing pipeline without executing
+<a id="__codelineno-3-11" name="__codelineno-3-11" href="#__codelineno-3-11"></a> any steps to inspect and validate the
+<a id="__codelineno-3-12" name="__codelineno-3-12" href="#__codelineno-3-12"></a> configuration. [default: no-dry-run]
+<a id="__codelineno-3-13" name="__codelineno-3-13" href="#__codelineno-3-13"></a> --cache / --no-cache Use LLM cache. [default: cache]
+<a id="__codelineno-3-14" name="__codelineno-3-14" href="#__codelineno-3-14"></a> --skip-validation / --no-skip-validation
+<a id="__codelineno-3-15" name="__codelineno-3-15" href="#__codelineno-3-15"></a> Skip any preflight validation. Useful when
+<a id="__codelineno-3-16" name="__codelineno-3-16" href="#__codelineno-3-16"></a> running no LLM steps. [default: no-skip-
+<a id="__codelineno-3-17" name="__codelineno-3-17" href="#__codelineno-3-17"></a> validation]
+<a id="__codelineno-3-18" name="__codelineno-3-18" href="#__codelineno-3-18"></a> --output PATH Indexing pipeline output directory.
+<a id="__codelineno-3-19" name="__codelineno-3-19" href="#__codelineno-3-19"></a> Overrides storage.base_dir in the
+<a id="__codelineno-3-20" name="__codelineno-3-20" href="#__codelineno-3-20"></a> configuration file.
</code></pre></div>
<h2 id="init">init</h2>
<p>Generate a default configuration file.</p>
@@ -1512,16 +1510,14 @@ <h2 id="update">update</h2>
<a id="__codelineno-11-6" name="__codelineno-11-6" href="#__codelineno-11-6"></a> profiling [default: no-memprofile]
<a id="__codelineno-11-7" name="__codelineno-11-7" href="#__codelineno-11-7"></a> --reporter [rich|print|none] The progress reporter to use. [default:
<a id="__codelineno-11-8" name="__codelineno-11-8" href="#__codelineno-11-8"></a> rich]
-<a id="__codelineno-11-9" name="__codelineno-11-9" href="#__codelineno-11-9"></a> --emit TEXT The data formats to emit, comma-separated.
-<a id="__codelineno-11-10" name="__codelineno-11-10" href="#__codelineno-11-10"></a> [default: parquet]
-<a id="__codelineno-11-11" name="__codelineno-11-11" href="#__codelineno-11-11"></a> --cache / --no-cache Use LLM cache. [default: cache]
-<a id="__codelineno-11-12" name="__codelineno-11-12" href="#__codelineno-11-12"></a> --skip-validation / --no-skip-validation
-<a id="__codelineno-11-13" name="__codelineno-11-13" href="#__codelineno-11-13"></a> Skip any preflight validation. Useful when
-<a id="__codelineno-11-14" name="__codelineno-11-14" href="#__codelineno-11-14"></a> running no LLM steps. [default: no-skip-
-<a id="__codelineno-11-15" name="__codelineno-11-15" href="#__codelineno-11-15"></a> validation]
-<a id="__codelineno-11-16" name="__codelineno-11-16" href="#__codelineno-11-16"></a> --output PATH Indexing pipeline output directory.
-<a id="__codelineno-11-17" name="__codelineno-11-17" href="#__codelineno-11-17"></a> Overrides storage.base_dir in the
-<a id="__codelineno-11-18" name="__codelineno-11-18" href="#__codelineno-11-18"></a> configuration file.
+<a id="__codelineno-11-9" name="__codelineno-11-9" href="#__codelineno-11-9"></a> --cache / --no-cache Use LLM cache. [default: cache]
+<a id="__codelineno-11-10" name="__codelineno-11-10" href="#__codelineno-11-10"></a> --skip-validation / --no-skip-validation
+<a id="__codelineno-11-11" name="__codelineno-11-11" href="#__codelineno-11-11"></a> Skip any preflight validation. Useful when
+<a id="__codelineno-11-12" name="__codelineno-11-12" href="#__codelineno-11-12"></a> running no LLM steps. [default: no-skip-
+<a id="__codelineno-11-13" name="__codelineno-11-13" href="#__codelineno-11-13"></a> validation]
+<a id="__codelineno-11-14" name="__codelineno-11-14" href="#__codelineno-11-14"></a> --output PATH Indexing pipeline output directory.
+<a id="__codelineno-11-15" name="__codelineno-11-15" href="#__codelineno-11-15"></a> Overrides storage.base_dir in the
+<a id="__codelineno-11-16" name="__codelineno-11-16" href="#__codelineno-11-16"></a> configuration file.
</code></pre></div>
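
Note for callers: the two hunks in cli/index.html drop `--emit` from the help text of both `index` and `update`. The diff only shows the generated help output, so it is an inference that the option is no longer accepted at all; if that holds, wrapper scripts that still pass it would need to stop doing so. A minimal Python sketch of stripping the option from an argument list might look like this. `strip_option` is a hypothetical helper for illustration, not part of GraphRAG.

```python
from typing import List, Sequence


def strip_option(argv: Sequence[str], option: str = "--emit") -> List[str]:
    """Return argv without `option`, in `--opt value` or `--opt=value` form.

    Hypothetical helper: assumes the option always takes exactly one value.
    """
    cleaned: List[str] = []
    skip_next = False
    for arg in argv:
        if skip_next:
            # This token is the value that followed the removed option.
            skip_next = False
            continue
        if arg == option:
            # Drop the option and remember to drop its value too.
            skip_next = True
            continue
        if arg.startswith(option + "="):
            # Drop the combined `--opt=value` form in one step.
            continue
        cleaned.append(arg)
    return cleaned


if __name__ == "__main__":
    old_args = ["index", "--emit", "parquet", "--output", "out"]
    print(strip_option(old_args))
```

The helper leaves every other argument, including `--output`, in its original order.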


4 changes: 2 additions & 2 deletions config/env_vars/index.html
@@ -1754,7 +1754,7 @@

<h1 id="default-configuration-mode-using-env-vars">Default Configuration Mode (using Env Vars)</h1>
<h2 id="text-embeddings-customization">Text-Embeddings Customization</h2>
-<p>By default, the GraphRAG indexer will only emit embeddings required for our query methods. However, the model has embeddings defined for all plaintext fields, and these can be generated by setting the <code>GRAPHRAG_EMBEDDING_TARGET</code> environment variable to <code>all</code>.</p>
+<p>By default, the GraphRAG indexer will only export embeddings required for our query methods. However, the model has embeddings defined for all plaintext fields, and these can be generated by setting the <code>GRAPHRAG_EMBEDDING_TARGET</code> environment variable to <code>all</code>.</p>
<p>If the embedding target is <code>all</code>, and you want to only embed a subset of these fields, you may specify which embeddings to skip using the <code>GRAPHRAG_EMBEDDING_SKIP</code> argument described below.</p>
<h3 id="embedded-fields">Embedded Fields</h3>
<ul>
@@ -2440,7 +2440,7 @@ <h2 id="prompting-overrides">Prompting Overrides</h2>
</tbody>
</table>
<h2 id="storage">Storage</h2>
-<p>This section controls the storage mechanism used by the pipeline used for emitting output tables.</p>
+<p>This section controls the storage mechanism used by the pipeline used for exporting output tables.</p>
<table>
<thead>
<tr>
14 changes: 7 additions & 7 deletions config/yaml/index.html
@@ -1439,7 +1439,7 @@ <h4 id="fields_2">Fields</h4>
<li><code>async_mode</code> (see Async Mode top-level config)</li>
<li><code>batch_size</code> <strong>int</strong> - The maximum batch size to use.</li>
<li><code>batch_max_tokens</code> <strong>int</strong> - The maximum batch # of tokens.</li>
-<li><code>target</code> <strong>required|all|none</strong> - Determines which set of embeddings to emit.</li>
+<li><code>target</code> <strong>required|all|none</strong> - Determines which set of embeddings to export.</li>
<li><code>skip</code> <strong>list[str]</strong> - Which embeddings to skip. Only useful if target=all to customize the list.</li>
<li><code>vector_store</code> <strong>dict</strong> - The vector store to use. Configured for lancedb by default.<ul>
<li><code>type</code> <strong>str</strong> - <code>lancedb</code> or <code>azure_ai_search</code>. Default=<code>lancedb</code></li>
@@ -1566,7 +1566,7 @@ <h4 id="fields_12">Fields</h4>
<h3 id="cluster_graph">cluster_graph</h3>
<h4 id="fields_13">Fields</h4>
<ul>
-<li><code>max_cluster_size</code> <strong>int</strong> - The maximum cluster size to emit.</li>
+<li><code>max_cluster_size</code> <strong>int</strong> - The maximum cluster size to export.</li>
<li><code>strategy</code> <strong>dict</strong> - Fully override the cluster_graph strategy.</li>
</ul>
<h3 id="embed_graph">embed_graph</h3>
@@ -1588,11 +1588,11 @@ <h4 id="fields_15">Fields</h4>
<h3 id="snapshots">snapshots</h3>
<h4 id="fields_16">Fields</h4>
<ul>
-<li><code>embeddings</code> <strong>bool</strong> - Emit embeddings snapshots to parquet.</li>
-<li><code>graphml</code> <strong>bool</strong> - Emit graph snapshots to GraphML.</li>
-<li><code>raw_entities</code> <strong>bool</strong> - Emit raw entity snapshots to JSON.</li>
-<li><code>top_level_nodes</code> <strong>bool</strong> - Emit top-level-node snapshots to JSON.</li>
-<li><code>transient</code> <strong>bool</strong> - Emit transient workflow tables snapshots to parquet.</li>
+<li><code>embeddings</code> <strong>bool</strong> - Export embeddings snapshots to parquet.</li>
+<li><code>graphml</code> <strong>bool</strong> - Export graph snapshots to GraphML.</li>
+<li><code>raw_entities</code> <strong>bool</strong> - Export raw entity snapshots to JSON.</li>
+<li><code>top_level_nodes</code> <strong>bool</strong> - Export top-level-node snapshots to JSON.</li>
+<li><code>transient</code> <strong>bool</strong> - Export transient workflow tables snapshots to parquet.</li>
</ul>
<h3 id="encoding_model">encoding_model</h3>
<p><strong>str</strong> - The text encoding model to use. Default=<code>cl100k_base</code>.</p>
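
For reference, the `target` and `skip` fields touched in config/yaml/index.html can be read as a simple selection rule: `none` exports nothing, `required` exports only the embeddings the query methods need, and `all` exports every defined embedding minus the `skip` list. The sketch below is one way to express that rule in Python, under stated assumptions: `select_embeddings` and the field names are hypothetical placeholders, and ignoring `skip` unless `target` is `all` is an interpretation of "Only useful if target=all", not confirmed behavior.

```python
from typing import Iterable, List


def select_embeddings(
    target: str,
    skip: Iterable[str],
    all_fields: Iterable[str],
    required_fields: Iterable[str],
) -> List[str]:
    """Pick which embedding fields to export for a given target setting.

    Hypothetical helper; the real indexer may resolve this differently.
    """
    if target == "none":
        return []
    if target == "required":
        # Assumption: `skip` has no effect unless target is "all".
        return list(required_fields)
    if target == "all":
        skipped = set(skip)
        # Preserve the order of all_fields while dropping skipped names.
        return [name for name in all_fields if name not in skipped]
    raise ValueError(f"unknown embedding target: {target!r}")


if __name__ == "__main__":
    # Placeholder field names, not the actual GraphRAG embedding names.
    every_field = ["field_a", "field_b", "field_c"]
    needed = ["field_a"]
    print(select_embeddings("all", ["field_b"], every_field, needed))
```

Raising on an unrecognized target keeps a typo in the configuration from silently exporting nothing.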