With Anthelion, Yahoo has released its web crawler for structured data under an open source license. The software works as a plug-in for Apache Nutch.
Open source web crawler: Yahoo Anthelion searches the semantic web
Semantic annotations, for example those written in RDFa, make web content understandable to machines. With Anthelion, Yahoo has now released a web crawler that is meant to read exactly this data. The software works as a plug-in for Apache Nutch and was released under the free Apache 2.0 license.
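To make "understandable to machines" a little more concrete, here is a minimal sketch of how a program could read RDFa-style annotations from a page. It is not Anthelion's parser and does not follow the full RDFa processing rules. The sample page, the `RdfaPropertyParser` class and the `extract_properties` function are assumptions made up for illustration, and only Python's standard library is used.

```python
from html.parser import HTMLParser

# Hypothetical sample page: a movie described with RDFa attributes
# (vocab / typeof / property) using the schema.org vocabulary.
SAMPLE_HTML = """
<div vocab="http://schema.org/" typeof="Movie">
  <h1 property="name">Example Movie</h1>
  <span property="director">Jane Doe</span>
  <meta property="datePublished" content="2015-01-01">
</div>
"""


class RdfaPropertyParser(HTMLParser):
    """Collects (property, value) pairs from RDFa 'property' attributes.

    Deliberately simplified: it ignores nesting, prefixes and most of the
    RDFa processing rules.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name still waiting for its text

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        prop = attr.get("property")
        if prop is None:
            return
        if "content" in attr:
            # The value is given explicitly in the 'content' attribute.
            self.pairs.append((prop, attr["content"]))
        else:
            # The value is the element's text, picked up in handle_data.
            self._pending = prop

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.pairs.append((self._pending, text))
            self._pending = None


def extract_properties(html):
    """Return all (property, value) pairs found in the given HTML string."""
    parser = RdfaPropertyParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


if __name__ == "__main__":
    for prop, value in extract_properties(SAMPLE_HTML):
        print(f"{prop}: {value}")
```

A crawler for structured data is interested in exactly such pairs: the visible text stays the same for human readers, while the attributes tell a program that the page describes a movie with a name, a director and a release date.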
Anthelion is a crawler for the semantic web. (Graphic: Yahoo)
Anyone interested in how the crawler works should read the paper "Focused Crawling for Structured Data" by employees of Yahoo Labs and Robert Meusel of the University of Mannheim. The source code of the software can be found on Anthelion's GitHub project page.
Anthelion: Applications for the web crawler
Anthelion was designed to search the web for matching data as effectively as possible. As one possible application, Yahoo cites the search for web pages that provide information about movies. The highlight: thanks to online learning algorithms, Anthelion is said to be extremely effective at finding further related websites.
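The article only says that Anthelion relies on online learning algorithms and does not describe them. To give a rough idea of what learning during a crawl can look like, the following sketch scores unvisited URLs with a small logistic regression over URL tokens and updates it after every fetched page. This is an illustrative assumption, not Anthelion's actual method. `OnlineUrlScorer`, `crawl` and the `fetch` callback are hypothetical names.

```python
import heapq
import math
import re
from collections import defaultdict


def url_tokens(url):
    """Split a URL into lowercase alphanumeric tokens used as features."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


class OnlineUrlScorer:
    """Logistic regression over URL tokens, updated one example at a time."""

    def __init__(self, learning_rate=0.5):
        self.weights = defaultdict(float)
        self.learning_rate = learning_rate

    def score(self, url):
        """Estimated probability that the page contains structured data."""
        z = sum(self.weights[t] for t in set(url_tokens(url)))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, url, has_structured_data):
        """One gradient step on the log loss for a single observed page."""
        error = (1.0 if has_structured_data else 0.0) - self.score(url)
        for t in set(url_tokens(url)):
            self.weights[t] += self.learning_rate * error


def crawl(seeds, fetch, budget):
    """Visit up to `budget` pages, always taking the best-scored URL next.

    `fetch(url)` is a hypothetical callback that returns a tuple
    (has_structured_data, outgoing_links).
    """
    scorer = OnlineUrlScorer()
    # heapq is a min-heap, so scores are negated to pop the highest first.
    frontier = [(-0.5, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < budget:
        _, url = heapq.heappop(frontier)
        has_data, links = fetch(url)
        visited.append((url, has_data))
        scorer.update(url, has_data)  # learn from the page just fetched
        for link in links:
            if link not in seen:
                seen.add(link)
                # Simplification: a link is scored once, when discovered,
                # and is not re-scored as the model changes later.
                heapq.heappush(frontier, (-scorer.score(link), link))
    return visited, scorer
```

The point of the sketch is the feedback loop: every fetched page tells the crawler whether its guess was right, and that signal immediately influences which links it follows next. For the movie example, URLs that resemble pages already found to carry movie data move to the front of the queue.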