Class Anemone::Page
In: lib/anemone/page.rb
Parent: Object

Attributes

body  [R]  The raw HTTP response body of the page
code  [RW]  Integer response code of the page
data  [RW]  OpenStruct for user-stored data
depth  [RW]  Depth of this page from the root of the crawl. This is not necessarily the shortest path; use PageStore#shortest_paths! to find that value.
error  [R]  Exception object, if one was raised during HTTP#fetch_page
headers  [R]  Headers of the HTTP response
redirect_to  [R]  URL of the page this one redirected to, if any
referer  [RW]  URL of the page that brought us to this page
response_time  [RW]  Response time of the request for this page in milliseconds
url  [R]  The URL of the page
visited  [RW]  Boolean indicating whether or not this page has been visited in PageStore#shortest_paths!
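
Since data is described above as an OpenStruct for user-stored data, arbitrary fields can be attached to it during a crawl. The following is only a sketch of that idea: it uses a standalone OpenStruct in place of a real page's data attribute, and the helper annotate and the fields title and tags are hypothetical names chosen for illustration.

```ruby
require 'ostruct'

# Hypothetical helper: stores crawl notes on an OpenStruct that stands in
# for page.data. No real Anemone::Page is constructed here.
def annotate(data, title, tags)
  data.title = title   # fields are created on first assignment
  data.tags  = tags
  data
end

data = annotate(OpenStruct.new, 'Example page', %w[docs ruby])
puts data.title
puts data.tags.join(', ')
```

Fields that were never assigned read back as nil, so callers can probe for optional data without raising.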

Public Class methods

Public Instance methods

The content-type returned by the HTTP request for this page

Array of cookies received with this page as WEBrick::Cookie objects.

Delete the Nokogiri document and response body to conserve memory

Nokogiri document for the HTML body

Was the page successfully fetched? true if the page was fetched with no error, false otherwise.

Returns true if the page is an HTML document, returns false otherwise.
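
One plausible way to make the HTML check above is to inspect the content-type of the response. The sketch below assumes that text/html and application/xhtml+xml count as HTML; the helper html_content_type? is a hypothetical name and is not taken from the library.

```ruby
# Hypothetical helper, not the library's own code: decides whether a
# Content-Type header value denotes an HTML document.
def html_content_type?(content_type)
  return false if content_type.nil?

  # Drop parameters such as "; charset=utf-8" and normalize case.
  media_type = content_type.split(';').first.to_s.strip.downcase
  ['text/html', 'application/xhtml+xml'].include?(media_type)
end

puts html_content_type?('text/html; charset=utf-8')
puts html_content_type?('image/png')
```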

Returns true if uri is in the same domain as the page, returns false otherwise
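
A same-domain test like the one above can be approximated by comparing hosts. This sketch assumes "same domain" means an exact host match, so a subdomain such as www.example.com is treated as different from example.com; the helper same_host? is a hypothetical name.

```ruby
require 'uri'

# Hypothetical helper: true when both URLs have exactly the same host.
# This is an assumption about what "same domain" means, not a confirmed
# description of the library's behavior.
def same_host?(page_url, other_url)
  URI(page_url).host == URI(other_url).host
end

puts same_host?('http://example.com/a', 'http://example.com/b/c')
puts same_host?('http://example.com/a', 'http://www.example.com/')
```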

Array of distinct A tag HREFs from the page

Returns true if the page was not found (returned 404 code), returns false otherwise.

Returns true if the page is an HTTP redirect, returns false otherwise.
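
Both the not-found check and the redirect check can be expressed in terms of the integer response code. The sketch below assumes any 3xx status counts as a redirect, which may be broader than what the library accepts; redirect_code? and not_found_code? are hypothetical helper names.

```ruby
# Hypothetical helpers that classify an integer HTTP status code.

# Assumption: every status in the 3xx range is treated as a redirect.
def redirect_code?(code)
  (300..399).cover?(code)
end

# 404 is the only status treated as "not found", as the text states.
def not_found_code?(code)
  code == 404
end

puts redirect_code?(301)
puts not_found_code?(404)
```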

Converts relative URL link into an absolute URL based on the location of the page
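
Resolving a relative link against the page location can be sketched with the standard library's URI.join. The helper absolutize is a hypothetical name, and stripping the fragment and defaulting an empty path to "/" are assumptions made for this example rather than documented behavior.

```ruby
require 'uri'

# Hypothetical helper: resolves link relative to page_url.
def absolutize(page_url, link)
  return nil if link.nil?

  # Assumption: fragments are irrelevant to crawling, so drop "#...".
  absolute = URI.join(page_url, link.sub(/#.*\z/, ''))

  # Assumption: give a bare host an explicit root path.
  absolute.path = '/' if absolute.path.empty?
  absolute
end

puts absolutize('http://example.com/docs/index.html', 'guide.html')
puts absolutize('http://example.com/docs/index.html', '../about.html#team')
```

The result is a URI object, so callers can read its host or path directly or call to_s for the string form.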