
Examples for ResultScan

Mirko Pagliai edited this page May 8, 2020 · 4 revisions

We assume that we have finished a scan:

$LinkScanner = new LinkScanner();
$LinkScanner->setConfig([
	'externalLinks' => true,
	'fullBaseUrl' => 'http://google.com',
	'maxDepth' => 2,
]);
$LinkScanner->scan();

I want to get the first result of the scan:

debug($LinkScanner->ResultScan->first());

Output:

########## DEBUG ##########
object(LinkScanner\ScanEntity) {
    'code' => (int) 200,
    'external' => false,
    'location' => '',
    'type' => 'text/html; charset=UTF-8',
    'url' => 'http://google.com',
    'referer' => null
}
###########################

Now I want to get an array of all the scanned external URLs:

$externalUrls = $LinkScanner->ResultScan
    ->match(['external' => true])
    ->extract('url')
    ->toList();
debug($externalUrls);

Output:

########## DEBUG ##########
[
    (int) 0 => 'http://maps.google.it/maps?hl=it&tab=wl',
    (int) 1 => 'https://play.google.com/?hl=it&tab=w8',
    (int) 2 => 'http://youtube.com/?gl=IT&tab=w1',
    (int) 3 => 'http://news.google.it/nwshp?hl=it&tab=wn',
    (int) 4 => 'https://mail.google.com/mail/?tab=wm',
    (int) 5 => 'https://drive.google.com/?tab=wo',
    (int) 6 => 'https://accounts.google.com/ServiceLogin?hl=it&passive=true&continue=http://www.google.it/',
    (int) 7 => 'https://plus.google.com/101652436578946786044'
]
###########################
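
Once the external URLs are a plain list, ordinary PHP functions can summarize them. The sketch below groups such a list by host name. Note that `groupUrlsByHost()` is a hypothetical helper written for this page, not part of the plugin, and the sample list is copied from the output above.

```php
<?php
/**
 * Groups a plain list of URLs by host name.
 * Hypothetical helper, not part of the LinkScanner plugin.
 *
 * @param string[] $urls
 * @return array<string, string[]>
 */
function groupUrlsByHost(array $urls): array
{
    $groups = [];
    foreach ($urls as $url) {
        // parse_url() yields null (or false) when no host can be found
        $host = parse_url($url, PHP_URL_HOST) ?: '(unknown)';
        $groups[$host][] = $url;
    }
    // Sort by host name so the summary is stable
    ksort($groups);

    return $groups;
}

// Sample copied from the output above
$sampleUrls = [
    'http://maps.google.it/maps?hl=it&tab=wl',
    'https://play.google.com/?hl=it&tab=w8',
    'https://mail.google.com/mail/?tab=wm',
    'https://accounts.google.com/ServiceLogin?hl=it&passive=true&continue=http://www.google.it/',
];

foreach (groupUrlsByHost($sampleUrls) as $host => $urls) {
    print $host . ': ' . count($urls) . PHP_EOL;
}
```

In a real scan you would pass `$externalUrls` from the example above instead of the sample array.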

As already explained, a ScanEntity instance can call all the methods of the Cake\Http\Client\Response class.
Now I want to keep only the results that are neither "ok" nor a redirect, that is, the errors:

$errorResults = $LinkScanner->ResultScan->filter(function($scanEntity) {
    return !$scanEntity->isOk() && !$scanEntity->isRedirect();
});

print PHP_EOL;
print 'There are ' . $errorResults->count() . ' results that are errors:';
print PHP_EOL;
foreach ($errorResults->toList() as $result) {
    print $result->url . PHP_EOL;
}

Output:

There are 3 results that are errors:
http://google.it/language_tools?hl=it&authuser=0
http://google.it/setprefdomain?prefdom=US&sig=K_04_j5dfz8-tySoeO0InK1SnshTA%3D
http://google.it/images/branding/googlelogo/1x/googlelogo_white_background_color_272x92dp.png
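
To make the "neither ok nor a redirect" rule concrete, here is a minimal plain-PHP sketch of the same classification based only on status codes. It is an assumption-laden approximation: `classifyStatus()` is a hypothetical helper, the two code lists are my reading of what `isOk()` and `isRedirect()` accept (the real `isRedirect()` may also look at the `Location` header), and the rows are made-up stand-ins rather than real scan results.

```php
<?php
/**
 * Classifies a status code as 'ok', 'redirect' or 'error'.
 * Hypothetical helper; the code lists are an assumption that only
 * approximates Response::isOk() and Response::isRedirect().
 */
function classifyStatus(int $code): string
{
    if (in_array($code, [200, 201, 202, 203, 204], true)) {
        return 'ok';
    }
    if (in_array($code, [301, 302, 303, 307, 308], true)) {
        return 'redirect';
    }

    return 'error';
}

// Stand-in rows with made-up URLs and codes, not real scan results
$results = [
    ['url' => 'http://example.com/', 'code' => 200],
    ['url' => 'http://example.com/old', 'code' => 301],
    ['url' => 'http://example.com/missing', 'code' => 404],
];

// Keep only the rows that are neither ok nor a redirect
$errors = array_filter($results, function (array $row): bool {
    return classifyStatus($row['code']) === 'error';
});

foreach ($errors as $row) {
    print $row['code'] . ' ' . $row['url'] . PHP_EOL;
}
```

With real results, prefer calling `isOk()` and `isRedirect()` on each ScanEntity as shown above, since they reflect the actual response.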

Finally, a complete example:

$LinkScanner = new LinkScanner();
$LinkScanner->setConfig([
	'externalLinks' => true,
	'fullBaseUrl' => 'http://google.it',
	'maxDepth' => 2,
]);
$LinkScanner->Client->setConfig('timeout', 3);
$LinkScanner->scan();

print PHP_EOL;
print 'Fullbase: ' . $LinkScanner->getConfig('fullBaseUrl') . PHP_EOL;
print 'Hostname: ' . $LinkScanner->hostname . PHP_EOL;
// Time is assumed here to be Cake\I18n\Time, imported at the top of the file
print 'Start time: ' . (new Time($LinkScanner->startTime))->nice() . PHP_EOL;
print 'End time: ' . (new Time($LinkScanner->endTime))->nice() . PHP_EOL;

print PHP_EOL;
print 'Total scanned URLs: ' . $LinkScanner->ResultScan->count();

// Matches the external URLs
$externalUrls = $LinkScanner->ResultScan->match(['external' => true]);
print PHP_EOL;
print 'There are ' . $externalUrls->count() . ' external URLs.' . PHP_EOL;

// Keeps only the results that are neither "ok" nor a redirect,
// that is, the errors
$errorResults = $LinkScanner->ResultScan->filter(function ($scanEntity) {
    return !$scanEntity->isOk() && !$scanEntity->isRedirect();
});
print 'There are ' . $errorResults->count() . ' results that are errors:' . PHP_EOL;
foreach ($errorResults->toList() as $result) {
    print $result->url . PHP_EOL;
}

Output:

Fullbase: http://google.it
Hostname: google.it
Start time: Feb 28, 2019, 11:00 AM
End time: Feb 28, 2019, 11:00 AM

Total scanned URLs: 21
There are 7 external URLs.
There are 3 results that are errors:
http://google.it/language_tools?hl=it&authuser=0
http://google.it/setprefdomain?prefdom=US&sig=K_F2LhwTc3cjmZg_KC6kLN8WlQEmo%3D
http://google.it/images/branding/googlelogo/1x/googlelogo_white_background_color_272x92dp.png