hjalle/Recluse

Recluse

Recluse is a simple web crawler for .NET Core.

Usage from a console app:

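using System;
using Microsoft.Extensions.DependencyInjection;
// Also import the Recluse namespaces that define RecluseCrawler, CrawlTask, and ICrawlHandler.

// Register a crawl handler and the crawler with the DI container.
IServiceCollection services = new ServiceCollection();
services.AddSingleton<ICrawlHandler, LogCrawlHandler>();
services.AddRecluseCrawler();

var serviceProvider = services.BuildServiceProvider();
var crawler = serviceProvider.GetService<RecluseCrawler>();

// Request a crawl of a single page; the returned task completes with the result.
var task = crawler.CrawlAsync(new CrawlTask(new Uri("http://news.ycombinator.com")));
crawler.Start();

// Block until the crawl finishes, then print the response and the links found.
task.Wait();
var obj = task.Result;
Console.WriteLine($"{obj.Uri} - {obj.StatusCode} - {obj.Headers}");
foreach (var item in obj.Links)
{
    Console.WriteLine($"{item.Uri} - {item.LinkText} - {item.LinkType}");
}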
