A comparative analysis of data extraction and data discovery

Looked at on a fairly simplified level, screen scraping involves two phases: data discovery and data extraction. Data discovery deals with navigating a web site to arrive at the pages that contain the data you need. This step can be as simple as requesting a single URL. Anyone who uses computers extensively will find it familiar, since it is the same request a web browser makes when it loads a page.
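To make the simplest form of data discovery concrete, here is a minimal sketch using only the Python standard library. The function name `fetch_page`, the timeout value, and the UTF-8 fallback are assumptions chosen for illustration, not part of any particular scraping tool.

```python
import urllib.request


def fetch_page(url, timeout=10):
    """Request a single URL and return its body as text.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the charset the server declares, falling back to UTF-8
        # (an assumption) when none is given.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")
```

Calling `fetch_page` with the address of the page you are after returns its HTML as a string, ready for the extraction phase. Real sites often need more than this (logins, cookies, following links), which is where the discovery step grows beyond one request.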

In the data extraction phase, you have already arrived at the page containing the data you are interested in, and now you need to pull it out of the HTML. Traditionally, this involves creating a series of regular expressions that match the pieces of the page you want. Regular expressions can get a bit complex, so most screen scraping applications hide these details from the user, even though they may use regular expressions behind the scenes.
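The regular expression approach can be sketched as follows. The sample HTML, the class names `name` and `price`, and the function `extract_items` are all made up for illustration; a real page would need a pattern written against its own markup.

```python
import re

# Hypothetical page fragment standing in for HTML you have already fetched.
SAMPLE_HTML = """
<ul>
  <li class="item"><span class="name">Widget</span> <span class="price">$4.99</span></li>
  <li class="item"><span class="name">Gadget</span> <span class="price">$12.50</span></li>
</ul>
"""

# One pattern per record: a name span followed by a price span.
ITEM_PATTERN = re.compile(
    r'<span class="name">(?P<name>[^<]+)</span>\s*'
    r'<span class="price">\$(?P<price>\d+\.\d{2})</span>'
)


def extract_items(html):
    """Return a list of (name, price) tuples found in the HTML."""
    return [
        (match.group("name").strip(), float(match.group("price")))
        for match in ITEM_PATTERN.finditer(html)
    ]
```

Patterns like this are brittle: a small change in the page markup can stop them matching, which is one reason scraping tools tend to wrap them in a friendlier interface.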

There is arguably a third phase, which is often ignored: what you are going to do with the data once you have extracted it. In the case of a live web site, you might even scrape the information and display it in the user’s web browser in real time. When you are shopping around for a screen scraping tool, you need to make sure that it gives you the flexibility you need to work with the data once it has been extracted. The data discovery step, by contrast, is often simple and does not need much information to set up. A commercial screen scraping tool can be an incredible time saver here.
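As one example of working with data after extraction, the sketch below turns scraped rows into CSV text. CSV is an assumption here, chosen only because it is a common output format; the function name `rows_to_csv` and the column names are likewise hypothetical.

```python
import csv
import io


def rows_to_csv(rows, header=("name", "price")):
    """Serialize extracted rows to CSV text (illustrative helper)."""
    buffer = io.StringIO()
    # Fix the line terminator so the output is the same on every platform.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

The returned string can be written to a file, loaded into a spreadsheet, or passed on to another program. The same rows could just as easily be stored in a database or rendered into a web page.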