docs: add README to each example

ikawaha · Apr 13, 2024 · d188364
1 parent 9b43806
commit d188364

Showing 4 changed files with 60 additions and 9 deletions.
diff --git a/_examples/db_search/README.md b/_examples/db_search/README.md
@@ -55,7 +55,7 @@ It demonstrates how to tokenize Japanese text using Kagome, which is a common re
 By using SQLite with FTS4, it efficiently manages and searches through a large amount of text data, making it suitable for applications like:
 
 1. **Search Engines:** You can use this code as a basis for building a search engine that indexes and searches Japanese text content.
-2. **Document Management Systems:**	This code can be integrated into a document management system to enable full-text search capabilities for Japanese documents.
+2. **Document Management Systems:** This code can be integrated into a document management system to enable full-text search capabilities for Japanese documents.
 3. **Content Recommendation Systems:** When you have a large collection of Japanese content, you can use this code to implement content recommendation systems based on user queries.
 4. **Chatbots and NLP:**  If you're building chatbots or natural language processing (NLP) systems for Japanese language, this code can assist in text analysis and search within the chatbot's knowledge base.
 

diff --git a/_examples/tokenize/README.md b/_examples/tokenize/README.md
@@ -0,0 +1,26 @@
+# Example of tokenizing/analyzing Japanese text
+
+This example demonstrates how to analyze (tokenize) a sentence and get the part-of-speech (POS) of each word using Kagome.
+
+- Target text data is as follows:
+
+```text
+すもももももももものうち
+```
+
+- Example output:
+
+```shellsession
+$ cd /path/to/kagome/_examples/tokenize
+$ go run .
+---tokenize---
+すもも  名詞,一般,*,*,*,*,すもも,スモモ,スモモ
+も      助詞,係助詞,*,*,*,*,も,モ,モ
+もも    名詞,一般,*,*,*,*,もも,モモ,モモ
+も      助詞,係助詞,*,*,*,*,も,モ,モ
+もも    名詞,一般,*,*,*,*,もも,モモ,モモ
+の      助詞,連体化,*,*,*,*,の,ノ,ノ
+うち    名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
+```
+
+> __Note__ that tokenization varies depending on the dictionary used. In this example we use the IPA dictionary.
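Reviewer note (not part of the commit): each output line in the README above pairs a surface form with a comma-separated feature string. The sketch below shows one way to read such a line, assuming the usual nine-field IPA dictionary layout (POS, three POS subcategories, inflection type, inflection form, base form, reading, pronunciation). It uses only the standard library and does not call Kagome; `ipaFeatures` and `parseLine` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// ipaFeatures names the nine fields of an IPA-dictionary feature string.
// The type and its field names are hypothetical, chosen for this sketch.
type ipaFeatures struct {
	POS           string   // part of speech, e.g. 名詞
	SubPOS        []string // three POS subcategories; "*" means unset
	InflType      string   // inflection type
	InflForm      string   // inflection form
	BaseForm      string   // dictionary (base) form
	Reading       string   // katakana reading
	Pronunciation string   // katakana pronunciation
}

// parseLine splits one "surface<whitespace>features" line of the example output.
// It assumes exactly nine features and returns an error otherwise.
func parseLine(line string) (string, ipaFeatures, error) {
	var f ipaFeatures
	parts := strings.Fields(line)
	if len(parts) != 2 {
		return "", f, fmt.Errorf("want 2 columns, got %d", len(parts))
	}
	cols := strings.Split(parts[1], ",")
	if len(cols) != 9 {
		return "", f, fmt.Errorf("want 9 features, got %d", len(cols))
	}
	f = ipaFeatures{
		POS:           cols[0],
		SubPOS:        cols[1:4],
		InflType:      cols[4],
		InflForm:      cols[5],
		BaseForm:      cols[6],
		Reading:       cols[7],
		Pronunciation: cols[8],
	}
	return parts[0], f, nil
}

func main() {
	surface, f, err := parseLine("すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: POS=%s base=%s reading=%s\n", surface, f.POS, f.BaseForm, f.Reading)
}
```

The strict nine-field check is a deliberate simplification: other dictionaries, and possibly unknown words, may produce a different number of features, so real code would likely need to relax it.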
diff --git a/_examples/tokenize/main.go b/_examples/tokenize/main.go
@@ -22,12 +22,12 @@ func main() {
 	}
 
 	// Output:
-	//---tokenize---
-	//すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
-	//も	助詞,係助詞,*,*,*,*,も,モ,モ
-	//もも	名詞,一般,*,*,*,*,もも,モモ,モモ
-	//も	助詞,係助詞,*,*,*,*,も,モ,モ
-	//もも	名詞,一般,*,*,*,*,もも,モモ,モモ
-	//の	助詞,連体化,*,*,*,*,の,ノ,ノ
-	//うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
+	// ---tokenize---
+	// すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
+	// も	助詞,係助詞,*,*,*,*,も,モ,モ
+	// もも	名詞,一般,*,*,*,*,もも,モモ,モモ
+	// も	助詞,係助詞,*,*,*,*,も,モ,モ
+	// もも	名詞,一般,*,*,*,*,もも,モモ,モモ
+	// の	助詞,連体化,*,*,*,*,の,ノ,ノ
+	// うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
 }
diff --git a/_examples/wakati/README.md b/_examples/wakati/README.md
@@ -0,0 +1,25 @@
+# Wakati Example with Kagome
+
+## Segmenting Japanese text into words with Kagome
+
+In this example, we demonstrate how to segment Japanese text into words using Kagome.
+
+- Target text data is as follows:
+
+```text
+すもももももももものうち
+```
+
+- Example output:
+
+```shellsession
+$ cd /path/to/kagome/_examples/wakati
+$ go run .
+----wakati---
+すもも/も/もも/も/もも/の/うち
+```
+
+> __Note__ that segmentation varies depending on the dictionary used.
+> This example uses the IPA dictionary; for search purposes, the Uni dictionary is recommended.
+>
+> - [What is a Kagome dictionary?](https://github.com/ikawaha/kagome/wiki/About-the-dictionary#what-is-a-kagome-dictionary) | Wiki | kagome @ GitHub
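
Reviewer note (not part of the commit): the wakati output above is the list of segments joined with `/`. The sketch below reproduces that formatting from a hard-coded segment list and checks that, for this sample text, the segments concatenate back to the input. It uses only the standard library and does not call Kagome; `formatWakati` and `coversText` are hypothetical helper names, and the segment list is copied from the example output rather than computed.

```go
package main

import (
	"fmt"
	"strings"
)

// formatWakati joins segments with "/" to mirror the example output line.
func formatWakati(segments []string) string {
	return strings.Join(segments, "/")
}

// coversText reports whether the segments, concatenated in order,
// reproduce the original text exactly.
func coversText(text string, segments []string) bool {
	return strings.Join(segments, "") == text
}

func main() {
	text := "すもももももももものうち"
	// Segments copied from the README's example output (IPA dictionary).
	segments := []string{"すもも", "も", "もも", "も", "もも", "の", "うち"}

	fmt.Println(formatWakati(segments))
	fmt.Println("covers input:", coversText(text, segments))
}
```

The coverage check is only asserted for this sample; input containing whitespace or other characters a segmenter might skip would not necessarily round-trip.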