# scrape-it

A Node.js scraper for humans.
## Installation

```sh
$ npm i --save scrape-it
```
## Example

```js
const scrapeIt = require("scrape-it");

// Promise interface
scrapeIt("http://ionicabizau.net", {
    title: ".header h1"
  , desc: ".header h2"
  , avatar: {
        selector: ".header img"
      , attr: "src"
    }
}).then(page => {
    console.log(page);
});

// Callback interface
scrapeIt("http://ionicabizau.net", {

    // Fetch the articles
    articles: {
        listItem: ".article"
      , data: {

            // Get the article date and convert it into a Date object
            createdAt: {
                selector: ".date"
              , convert: x => new Date(x)
            }

            // Get the title
          , title: "a.article-title"

            // Nested list
          , tags: {
                listItem: ".tags > span"
            }

            // Get the content
          , content: {
                selector: ".article-content"
              , how: "html"
            }
        }
    }

    // Fetch the blog pages
  , pages: {
        listItem: "li.page"
      , name: "pages"
      , data: {
            title: "a"
          , url: {
                selector: "a"
              , attr: "href"
            }
        }
    }

    // Fetch some other data from the page
  , title: ".header h1"
  , desc: ".header h2"
  , avatar: {
        selector: ".header img"
      , attr: "src"
    }
}, (err, page) => {
    console.log(err || page);
});
// { articles:
//    [ { createdAt: Mon Mar 14 2016 00:00:00 GMT+0200 (EET),
//        title: 'Pi Day, Raspberry Pi and Command Line',
//        tags: [Object],
//        content: '<p>Everyone knows (or should know)...a" alt=""></p>\n' },
//      { createdAt: Thu Feb 18 2016 00:00:00 GMT+0200 (EET),
//        title: 'How I ported Memory Blocks to modern web',
//        tags: [Object],
//        content: '<p>Playing computer games is a lot of fun. ...' },
//      { createdAt: Mon Nov 02 2015 00:00:00 GMT+0200 (EET),
//        title: 'How to convert JSON to Markdown using json2md',
//        tags: [Object],
//        content: '<p>I love and ...' } ],
//   pages:
//    [ { title: 'Blog', url: '/' },
//      { title: 'About', url: '/about' },
//      { title: 'FAQ', url: '/faq' },
//      { title: 'Training', url: '/training' },
//      { title: 'Contact', url: '/contact' } ],
//   title: 'Ionică Bizău',
//   desc: 'Web Developer, Linux geek and Musician',
//   avatar: '/images/logo.png' }
```
## Documentation

### `scrapeIt(url, opts, cb)`

A scraping module for humans.
- `url`: The page url or request options.
- `opts`: The options passed to the `scrapeHTML` method.
- `cb`: The callback function.
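Since `url` can also be a request options object, a call like the following is possible. This is a minimal sketch only: the `url` and `headers` option names below are assumptions about the underlying HTTP client and may need to be adjusted to whatever client your version of scrape-it uses.

```js
const scrapeIt = require("scrape-it");

// Pass request options instead of a plain URL string.
// NOTE: the `url` and `headers` field names are assumptions about the
// underlying HTTP client's options, not part of the documented API above.
scrapeIt({
    url: "http://ionicabizau.net"
  , headers: { "User-Agent": "my-scraper/1.0" }
}, {
    title: ".header h1"
}).then(page => {
    console.log(page.title);
});
```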
### `scrapeIt.scrapeHTML($, opts)`

Scrapes the data in the provided element.

- `$`: The input element.
- **Object** `opts`: An object containing the scraping information.

If you want to scrape a list, you have to use the `listItem` selector:
- `listItem` (String): The list item selector.
- `data` (Object): The fields to include in the list objects:
   - `<fieldName>` (Object|String): The selector or an object containing:
      - `selector` (String): The selector.
      - `convert` (Function): An optional function to change the value.
      - `how` (Function|String): A function or function name to access the value.
      - `attr` (String): If provided, the value will be taken based on the attribute name.
      - `trim` (Boolean): If `false`, the value will not be trimmed (default: `true`).
      - `eq` (Number): If provided, it will select the *nth* element.
      - `listItem` (Object): An object, keeping the recursive schema of the `listItem` object. This can be used to create nested lists.

```js
{
    articles: {
        listItem: ".article"
      , data: {
            createdAt: {
                selector: ".date"
              , convert: x => new Date(x)
            }
          , title: "a.article-title"
          , tags: {
                listItem: ".tags > span"
            }
          , content: {
                selector: ".article-content"
              , how: "html"
            }
        }
    }
}
```
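The scalar options compose in the same way. Here is a minimal sketch combining `attr`, `eq`, and `trim`; the selectors and field names are made up for illustration, and `eq` is assumed to be zero-based, matching Cheerio's `.eq()`.

```js
{
    // Take the href attribute of the first link in the header
    firstLink: {
        selector: ".header a"
      , attr: "href"
      , eq: 0
    }
    // Keep the raw (untrimmed) text of the second paragraph
  , secondParagraph: {
        selector: "p"
      , eq: 1
      , trim: false
    }
}
```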
If you want to collect specific data from the page, just use the same schema used for the `data` field.

```js
{
    title: ".header h1"
  , desc: ".header h2"
  , avatar: {
        selector: ".header img"
      , attr: "src"
    }
}
```
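For reference, here is a minimal sketch of calling `scrapeIt.scrapeHTML` directly on a Cheerio-loaded document, without any HTTP request. It assumes `$` is a Cheerio instance and that the scraped data is returned synchronously; the HTML snippet is made up for illustration.

```js
const cheerio = require("cheerio");
const scrapeIt = require("scrape-it");

// Load some HTML into Cheerio and scrape it directly
const $ = cheerio.load(`
  <div class="header">
    <h1>Ionică Bizău</h1>
    <h2>Web Developer, Linux geek and Musician</h2>
  </div>
`);

const page = scrapeIt.scrapeHTML($, {
    title: ".header h1"
  , desc: ".header h2"
});

console.log(page);
// => { title: 'Ionică Bizău', desc: 'Web Developer, Linux geek and Musician' }
```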
## How to contribute

Have an idea? Found a bug? See how to contribute.
## Where is this library used?

If you are using this library in one of your projects, add it to this list.
 
- `ui-studentsearch` (by Rakha Kanz Kautsar): API for majapahit.cs.ui.ac.id/studentsearch

## License

MIT © Ionică Bizău