Grunt plugin for fetching URLs and saving the result as local files.
Getting Started
This plugin requires Grunt ~0.4.2.
If you haven't used Grunt before, be sure to check out the Getting Started guide, as it explains how to create a Gruntfile as well as install and use Grunt plugins. Once you're familiar with that process, you may install this plugin with this command:
npm install grunt-fetch-pages --save-dev
Once the plugin has been installed, it may be enabled inside your Gruntfile with this line of JavaScript:
grunt.loadNpmTasks('grunt-fetch-pages');
The "fetchpages" task
Overview
In your project's Gruntfile, add a section named fetchpages to the data object passed into grunt.initConfig().
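grunt.initConfig({
  fetchpages: {
    options: {
      // Task-specific options go here.
    },
    your_target: {
      // Target-specific file lists and/or options go here.
    }
  }
});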
Options
baseURL
Type: String
Base URL for fetching remote pages via the Grunt "files" feature. Can be omitted when only the urls feature is used (see the urls option).
destinationFolder
Type: String
Required: yes
Local destination folder for fetched remote URLs. This option is mandatory.
urls
Type: Array
Default: []
An optional list of remote URLs to fetch. Required properties per element:
url: full remote URL to fetch
localFile: local file name for the fetched page (the destination folder is defined by the destinationFolder option)
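As an illustration of the element shape described above, a minimal sketch of a Gruntfile that only uses the urls option. The target name "dist", the localhost URLs, and the file names are placeholders, and the sketch assumes the options sit under the target's options key, as is conventional for Grunt tasks.

```javascript
// Hypothetical Gruntfile.js: URLs, file names and the "dist" target are placeholders.
const urlsOnlyOptions = {
  // Mandatory: folder that receives the fetched pages.
  destinationFolder: 'dist',
  // Each element needs both "url" and "localFile".
  urls: [
    { url: 'http://localhost:3000/', localFile: 'index.html' },
    { url: 'http://localhost:3000/about', localFile: 'about.html' }
  ]
};

function gruntfile(grunt) {
  grunt.initConfig({
    fetchpages: {
      dist: { options: urlsOnlyOptions }
    }
  });
  grunt.loadNpmTasks('grunt-fetch-pages');
}

module.exports = gruntfile;
```

With this configuration, baseURL is left out because no pages are requested through the "files" feature.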
followLinks
Type: Boolean
Default: true
Also fetch sub pages referenced via links (<a href="">). Links within sub pages are not followed at this time.
ignoreSelector
Type: String
Default: [rel="nofollow"]
Selector for ignoring certain links when following links (see the followLinks option). The default value matches links with the "rel" attribute set to "nofollow": <a href="" rel="nofollow">.
The selector is applied as $('a:not(ignoreSelector)'), e.g. $('a:not([rel="nofollow"])').
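The composition described above can be sketched as plain string handling. The helper name buildLinkSelector and the class name .fetch-ignore are hypothetical and only illustrate how a custom ignoreSelector value would end up inside the :not() clause; the plugin's internal code may differ.

```javascript
// Hypothetical helper: wraps an ignoreSelector value in a:not(...),
// mirroring the pattern $('a:not(ignoreSelector)') from the option description.
function buildLinkSelector(ignoreSelector) {
  return 'a:not(' + ignoreSelector + ')';
}

// Default value of the ignoreSelector option.
console.log(buildLinkSelector('[rel="nofollow"]')); // a:not([rel="nofollow"])

// A custom value, here a placeholder CSS class.
console.log(buildLinkSelector('.fetch-ignore')); // a:not(.fetch-ignore)
```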
cleanHTML
Type: Boolean
Default: false
Clean fetched pages via the htmlclean node module, removing unneeded whitespace, line breaks, comments, etc.
fetchBaseURL
Type: Boolean
Default: true
The baseURL itself is not fetched when this option is set to false.
Usage Examples
Simple example, fetch base URL and follow links:
grunt;
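A minimal sketch of such a configuration, using only the options documented above. The target name "dist", the localhost URL, and the folder name are placeholders, and placing the options under the target's options key is an assumption based on common Grunt conventions.

```javascript
// Hypothetical Gruntfile.js for the simple case.
const simpleOptions = {
  // Placeholder base URL of the site to fetch.
  baseURL: 'http://localhost:3000/',
  // Mandatory: local folder for the fetched pages.
  destinationFolder: 'dist'
  // followLinks is not set here: its documented default is true.
};

function simpleGruntfile(grunt) {
  grunt.initConfig({
    fetchpages: {
      dist: { options: simpleOptions }
    }
  });
  grunt.loadNpmTasks('grunt-fetch-pages');
}

module.exports = simpleGruntfile;
```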
Full example with all feasible options set:
grunt;
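A sketch that sets every option from the Options section explicitly. All values (URLs, folder and file names, the .fetch-ignore class, the target name "dist") are placeholders, and any "files" entries used together with baseURL are left out because their exact shape is not described here.

```javascript
// Hypothetical Gruntfile.js with every documented option set.
const fullOptions = {
  baseURL: 'http://localhost:3000/',   // placeholder base URL
  destinationFolder: 'dist',           // mandatory destination folder
  urls: [
    // Extra page fetched in addition to the base URL (placeholder values).
    { url: 'http://localhost:3000/app/', localFile: 'app.html' }
  ],
  followLinks: true,                   // also fetch pages linked via <a href="">
  ignoreSelector: '.fetch-ignore',     // links matching this selector are skipped
  cleanHTML: true,                     // run fetched pages through htmlclean
  fetchBaseURL: false                  // skip fetching the baseURL itself
};

function fullGruntfile(grunt) {
  grunt.initConfig({
    fetchpages: {
      dist: { options: fullOptions }
    }
  });
  grunt.loadNpmTasks('grunt-fetch-pages');
}

module.exports = fullGruntfile;
```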
Contributing
Take care to maintain the existing coding style. Add unit tests for any new or changed functionality. Do not submit code that did not pass the default grunt task for linting and testing.
Credits
Thanks to SinnerSchrader for support.
License
The MIT License (MIT)
Copyright (c) 2013-2016 Oliver Hellebusch
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.