Scraper Management

I. Introduction

Scrapers can run without your oversight. However, when a scraper is first set up, or later on if the original calendar changes, the scraper may need changes. You can ask for these by "Requesting a Fix."

You'll want to check how the scraper is reading information before you request the fix. 

Some scenarios when the scraper may need a fix:
  • you find yourself spending too much time editing event information in the pending queue
  • no new events have been pulled into the pending queue for a particular venue after an extended period of time
  • incorrect event info often gets through the queue and onto your site
  • the scraper reads "Yes" under the "Broken?" column in the scraper overview page

II. Finding the scraper

Navigate to the page of the scraper in question. There are two ways to get there:

1) Searching for it by venue name in the MAIN > SCRAPERS tab: http://admin.dostuffmedia.com/scrapers



2) Clicking on the scraper name under "Scraped By" in the right sidebar of an event page



III. Viewing past scrapes

In the right sidebar of the scraper page, click "View past scrapes."



This will take you to a list of the most recent times this scraper has pulled in information from the source's website.
A quick way to check for accuracy is to look at whether the "Succeeded" column reads "Yes" or "No."



To get the specific details of the scrape, click on the "json" link in the row of the date you'd like to check.
* A useful plugin to download is JSON View for Chrome or Firefox *

Open the JSON and compare it to the actual calendar page being scraped. This is the most accurate way to determine whether the event data we are pulling in differs from the event data listed on the source's website.
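
For scrapes with many events, comparing by eye can be slow. Below is a minimal sketch of one way to flag incomplete events in a saved copy of the scrape JSON. It assumes, hypothetically, that the file holds a top-level list of event objects with "title", "date", and "venue" keys. The real field names and layout may differ, so adjust them to match what the "json" link actually shows.

```python
import json

# Hypothetical field names; replace with the keys seen in the real scrape JSON.
REQUIRED_FIELDS = ("title", "date", "venue")


def find_incomplete_events(events, required=REQUIRED_FIELDS):
    """Return (index, missing_fields) for each event lacking a required field."""
    problems = []
    for index, event in enumerate(events):
        # A field counts as missing if it is absent, null, or blank.
        missing = [f for f in required if not str(event.get(f) or "").strip()]
        if missing:
            problems.append((index, missing))
    return problems


def load_events(path):
    """Load a scrape saved from the "json" link; assumes a top-level list."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Made-up sample data standing in for a saved scrape.
    sample = [
        {"title": "Open Mic Night", "date": "2015-03-01", "venue": "Example Hall"},
        {"title": "", "date": "2015-03-02", "venue": "Example Hall"},
        {"title": "Trivia", "venue": "Example Hall"},
    ]
    for index, missing in find_incomplete_events(sample):
        print(f"event {index}: missing {', '.join(missing)}")
```

Any events flagged this way are good candidates to check against the source calendar first, and to cite as the EXAMPLE if you end up requesting a fix.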

IV. Requesting a Scraper Fix

If you conclude that the scraper is broken or needs to be edited to be more accurate, you can request a fix for it.

On the scraper page, click "Edit this scraper" in the right sidebar.



Then, towards the bottom of the page, check "Needs Fix."



Within the "Problem" field:
  1. Describe the issue with an EXAMPLE in the form of a link and a highlighted screenshot
  2. Explain the needed fix and provide an EXAMPLE (ideally, a corrected version of the example from step 1)



Then click "Update". 

If the scraper was fixed, the updated results should show up in your pending queue within 1 week. If the scripter has any hangups or questions, they will respond in the "Problem" field.