
Nov 23

As a frequent traveler, I often stay at Marriott hotels, and I noticed that they recently upgraded and redesigned their website.

Why is this interesting in a blog about Web Data Services?

Marriott has just done what so many other companies are doing these days – they are modernizing their website with a user-friendly, dynamic AJAX-based interface to enhance user experience.

While AJAX is helpful for creating interactive web applications like a hotel reservation system, it is very bad news for business users who depend on collecting web data with homegrown scripts and primitive web scrapers to power business applications like Market Intelligence, Financial Research, and Buzz Analytics.

But don’t despair. Kapow Technologies just released version 7.1 of Kapow Web Data Server, which includes support for even the most sophisticated AJAX toolkits, including Google Web Toolkit.

That said, I couldn’t write this blog post without testing it out myself.

So I powered up RoboMaker 7.1, typed in www.marriott.com, and ran a simple search for hotels near San Francisco airport.

Then a dynamic map (powered by Microsoft Bing) appeared, showing the locations of the 10 nearest Marriott hotels. I wanted to create a loop over the 10 hotels on the map, so I simply clicked on hotel number 1, clicked the insert loop command, and in a few minutes I had created a Kapow robot that could extract hotels directly from a highly dynamic, AJAX-based map.

Tell me about any other product on the planet that can do this in 2 minutes!

Check out the picture below, and be sure to think about Kapow Web Data Server when your current Web Data Extraction tools break in the world of modern AJAX-powered web sites.

Marriott Map

By: Stefan Andreasen, Founder and CTO

Jul 13

Scraping comes from “Screen Scraping”, a term used for a set of products that turn old “Green Screen” mainframe applications into web services by “wrapping” the screen protocol. Screen Scrapers connect to the fields of a 32×80 character terminal and read pixels, text and numbers to fill in forms and in turn wrap the application into a programmatic interface or web service. Examples of such products are IBM Rational HATS and Attachmate EXTRA.
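To make the idea of reading fields off a character terminal concrete, here is a minimal sketch in Python. It is not how HATS or EXTRA work internally. The screen contents, the field names, and the row and column positions are all made up for illustration. The only detail taken from the text is the 32×80 grid.

```python
ROWS, COLS = 32, 80  # terminal size mentioned in the text

# Hypothetical field map: name -> (row, column, length), all 0-based.
FIELDS = {
    "name": (2, 10, 20),
    "account": (2, 40, 8),
    "balance": (4, 10, 12),
}


def blank_screen():
    """Return an empty screen buffer as a grid of single characters."""
    return [[" "] * COLS for _ in range(ROWS)]


def put(screen, row, col, text):
    """Write text into the buffer, standing in for the host application."""
    screen[row][col:col + len(text)] = list(text)


def scrape_screen(screen, fields):
    """Read each field from its fixed position and strip the padding."""
    result = {}
    for name, (row, col, length) in fields.items():
        result[name] = "".join(screen[row][col:col + length]).strip()
    return result


def demo_screen():
    """Build a made-up customer inquiry screen for the example."""
    screen = blank_screen()
    put(screen, 0, 0, "CUSTOMER INQUIRY")
    put(screen, 2, 4, "NAME:")
    put(screen, 2, 10, "JANE DOE")
    put(screen, 2, 34, "ACCT:")
    put(screen, 2, 40, "00012345")
    put(screen, 4, 0, "BALANCE:")
    put(screen, 4, 10, "1,250.75")
    return screen


if __name__ == "__main__":
    print(scrape_screen(demo_screen(), FIELDS))
```

Because every field lives at a known row and column, the wrapper is simple but only as stable as the screen layout it was written against.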

Web Scraping is conceptually identical to Screen Scraping as it “wraps” a human interface into a programmatic interface, but instead of “wrapping” a character based mainframe protocol, it “wraps” a Web site or Web application and turns it into an API.

It sounds similar, but technically and in its use cases it is quite different.

Web Scraping does not represent all approaches to wrapping Web applications into APIs. It is limited to traditional methods that use scripting languages like Perl or Python to extract data from static HTML with regular expressions. This method of extracting data from web sites has been used for years, but it has been running into two growing challenges: it is fragile when the underlying web application changes and, more importantly, it simply does not work with today’s dynamic AJAX-powered web sites.
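Both challenges are easy to reproduce with a few lines of Python. The sketch below is only an illustration: the three HTML snippets, the class names, and the `/static/app.js` path are invented, and no real site is contacted. It shows a regular expression that works on a static page, then returns nothing after a small markup change, and again returns nothing when the page is an empty shell that a script fills in later.

```python
import re

# Pattern tied to one exact piece of markup (hypothetical class name).
PRICE_RE = re.compile(r'<span class="price">\$([0-9]+\.[0-9]{2})</span>')

# 1. Static HTML: the data is in the page source.
STATIC_PAGE = (
    '<div class="hotel"><h2>Example Airport Hotel</h2>'
    '<span class="price">$189.00</span></div>'
)

# 2. Same data after a redesign: class renamed, attribute added.
REDESIGNED_PAGE = (
    '<div class="hotel"><h2>Example Airport Hotel</h2>'
    '<span class="rate" data-currency="USD">189.00</span></div>'
)

# 3. AJAX-style page: the source is an empty container plus a script
#    that would fetch and render the results in a browser.
AJAX_PAGE = '<div id="results"></div><script src="/static/app.js"></script>'


def scrape_prices(html):
    """Return every price the regular expression finds in the HTML text."""
    return [float(price) for price in PRICE_RE.findall(html)]


if __name__ == "__main__":
    for label, page in [
        ("static", STATIC_PAGE),
        ("redesigned", REDESIGNED_PAGE),
        ("ajax", AJAX_PAGE),
    ]:
        print(label, scrape_prices(page))
```

The script never raises an error in the second and third cases. It just quietly returns an empty list, which is part of what makes this style of scraper hard to maintain.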

If you are a Perl programmer, I encourage you to build a simple “web scraper”. Go to Gmail.com and create a Perl script that can log in and read the content of your inbox. You will quickly find out that it is nearly impossible.

Let me introduce the Kapow Web Data Server. It takes over where fragile “Web Scraping” scripts fail, delivering a point-and-click interface to turn a website like gmail.com into a sharable REST or SOAP service, in the cloud or on-premises, virtually in minutes. Web data access has never been easier or more resilient.

Web Scraping represents a business concept with growing value in today’s networked world. However, Web Data Serving has taken over to deliver a far more productive and robust alternative to traditional Web Scraping technologies.

I will be continuing with more blog posts on this topic, and as always, I’d love to hear your comments.

By: Stefan Andreasen, CTO


The Kapow Katalyst Blog is a collection of insights, perspectives, and thought leadership around Application Integration.

Comments, Feedback, Contact Us:

blog at kapowsoftware.com
