Extract meaningful information from unstructured text with Ruby.
Using a variety of APIs (Yahoo Term Extractor and Alchemy are currently supported), semantic_extraction can automatically return a collection of keywords for an arbitrary block of text. If you use Alchemy, it can also return named entities.
A brief walkthrough:
$ require 'rubygems' $ require 'semantic_extraction' $ SemanticExtraction.alchemy_api_key = 'YOUR_API_KEY_HERE' $ SemanticExtraction.find_keywords("http://chrisvannoy.com/2010/03/10/introducing_semantic_extraction/") $ ["Knight News Challenge", "Yahoo Term Extractor", "obscure gem", "soon-to-be twitter employee", "handle serving gems", "API providers", "Alchemy API", "unstructured text", "earliest steps", "Rails 3-compatible version", "structured data", "early stage", "death threats", "github", "final aside", "awesome piece", "default choice", "HTTP communication", "Indianapolis Star", "Feel free"]
-
Add support for OpenCalais
-
Flesh out the rest of the Alchemy API
-
Tests, tests and more tests.
-
Fork the project.
-
Make your feature addition or bug fix.
-
Add tests for it. This is important so I don’t break it in a future version unintentionally.
-
Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
-
Send me a pull request. Bonus points for topic branches.
Copyright © 2010 Chris Vannoy. See LICENSE for details.