
Is there an example of combining jieba segmentation with Lucene? #4

Open
13653415686 opened this issue Jan 8, 2020 · 2 comments

@13653415686

segmenter.Cut
segmenter.CutForSearch
segmenter.Cut2

When should each of these segmentation methods be used? The segmentation results themselves look fine.

With pangu segmentation I previously built the boosted field text like this:

foreach (WordInfo word in words)
{
    if (word == null)
    {
        continue;
    }
    // append "word^boost" pairs, boosting by pangu's word rank
    result.AppendFormat("{0}^{1}.0 ", word.Word, (int)Math.Pow(3, word.Rank));
}

Now that I am using jieba segmentation, how do I get the equivalent?

@SilentCC
Owner

Hello,

var segments = segmenter.Cut("我来到北京清华大学", cutAll: true);
Console.WriteLine("[Full mode]: {0}", string.Join("/ ", segments));

segments = segmenter.Cut("我来到北京清华大学");  // precise mode (the default)
Console.WriteLine("[Precise mode]: {0}", string.Join("/ ", segments));

segments = segmenter.Cut("他来到了网易杭研大厦");  // precise mode by default; the HMM model also recognizes new words
Console.WriteLine("[New word recognition]: {0}", string.Join("/ ", segments));

segments = segmenter.CutForSearch("小明硕士毕业于中国科学院计算所,后在日本京都大学深造"); // search engine mode
Console.WriteLine("[Search engine mode]: {0}", string.Join("/ ", segments));

Cut2 is a method I added.

segmenter.Cut2 builds on Cut and additionally returns the start position of each word.
It returns IEnumerable<WordInfo>.

WordInfo

public class WordInfo
{
    public WordInfo(string value, int position)
    {
        this.value = value;
        this.position = position;
    }

    // the segmented word
    public string value { get; set; }

    // the word's start position in the original text
    public int position { get; set; }
}
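
For reference, here is a minimal sketch (not from this repository) of how the pangu-style "word^boost" string from the question could be rebuilt on top of Cut2. It assumes jieba.NET's JiebaSegmenter and the WordInfo shape shown above; since this WordInfo carries no rank, a constant boost of 1.0 stands in for pangu's rank-based boost.

using System.Text;
using JiebaNet.Segmenter;   // assumed namespace of the jieba.NET segmenter

public static class BoostedTermBuilder
{
    // Rebuilds the "word^boost " string the question produced with pangu,
    // but from segmenter.Cut2. WordInfo carries only value/position, so a
    // constant boost of 1.0 is used here as a placeholder.
    public static string Build(JiebaSegmenter segmenter, string text)
    {
        var result = new StringBuilder();
        foreach (var word in segmenter.Cut2(text))   // IEnumerable<WordInfo>
        {
            if (word == null || string.IsNullOrEmpty(word.value))
            {
                continue;
            }
            result.AppendFormat("{0}^{1}.0 ", word.value, 1);
            // word.position (the start offset) is also available here,
            // e.g. for highlighting or for Lucene offset attributes.
        }
        return result.ToString();
    }
}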

@SilentCC
Owner

https://www.cnblogs.com/dacc123/p/8431369.html
This blog post explains it in detail.
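
To make the Lucene side of the question concrete, below is a minimal sketch of wrapping the segmenter in a Lucene.NET Tokenizer so it can be used for indexing. It assumes Lucene.Net 4.8.x and jieba.NET's JiebaSegmenter with the JiebaNet.Segmenter namespace; none of the class names below are this repository's own API.

using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;
using JiebaNet.Segmenter;   // assumed namespace of the jieba.NET segmenter

public sealed class JiebaTokenizer : Tokenizer
{
    private readonly JiebaSegmenter _segmenter = new JiebaSegmenter();
    private IEnumerator<string> _terms;
    private int _offset;

    private readonly ICharTermAttribute _termAtt;
    private readonly IOffsetAttribute _offsetAtt;

    public JiebaTokenizer(TextReader input) : base(input)
    {
        _termAtt = AddAttribute<ICharTermAttribute>();
        _offsetAtt = AddAttribute<IOffsetAttribute>();
    }

    public override bool IncrementToken()
    {
        ClearAttributes();
        if (_terms == null || !_terms.MoveNext())
        {
            return false;
        }
        var term = _terms.Current;
        _termAtt.SetEmpty().Append(term);
        // Offsets here are only approximate because CutForSearch can emit
        // overlapping sub-words; Cut2's WordInfo.position could give exact ones.
        _offsetAtt.SetOffset(_offset, _offset + term.Length);
        _offset += term.Length;
        return true;
    }

    public override void Reset()
    {
        base.Reset();
        _offset = 0;
        // Read the whole field and segment it in search-engine mode.
        var text = m_input.ReadToEnd();
        _terms = _segmenter.CutForSearch(text).GetEnumerator();
    }
}

An Analyzer would then return this tokenizer from its CreateComponents override; the blog post above covers the full setup.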
