
What does Web Data mean to you? Who owns the data?
Jul 30

Often without realizing it, more and more companies rely on Web Data (any data you can see in a web browser) as a critical foundation for making business decisions.

Ron’s post on Web Data reminded me of this interesting blog post, “More data usually beats better algorithms”, written by Anand Rajaraman, co-founder of Kosmix and also Consulting Assistant Professor of Data Mining at Stanford University.

The blog post describes how Anand’s students competed for the $1 million Netflix Prize, a competition open to the public.

Netflix provides a huge data set of past customer movie ratings, and the challenge is to use this data to build an algorithm that predicts which movies people will want to watch better than the one Netflix already has.

Anand’s students attacked this challenge, and in his post he highlights two very different approaches. Team A focused on developing a sophisticated algorithm. Team B used a simple algorithm and focused more on the data, pulling in additional movie data from IMDb (the Internet Movie Database).

Which team performed better?

Team B, who focused more on the data, got to the top of the Netflix Prize leaderboard.

Anand’s point? “…adding more, independent data usually beats out designing ever-better algorithms to analyze an existing data set. I’m often surprised that many people in business, and even in academia, don’t realize this.” Just adding one extra set of data can improve the quality of your decision-making several times over.
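To make the idea concrete, here is a toy sketch in Python. It is not the students' method and it does not use Netflix or IMDb data: the ratings are synthetic, and the `movie_genre` table is a hypothetical stand-in for an external data source. Both predictors use the same simple algorithm (an average of past ratings); the only difference is that the second one also sees the extra genre data.

```python
import random
from collections import defaultdict
from math import sqrt

random.seed(0)
N_USERS, N_MOVIES = 40, 60
GENRES = ["drama", "comedy", "horror"]

# Hypothetical external data set: one genre label per movie.
movie_genre = {m: random.choice(GENRES) for m in range(N_MOVIES)}

# Synthetic assumption: each user has a hidden taste for each genre.
taste = {(u, g): random.uniform(1.0, 5.0)
         for u in range(N_USERS) for g in GENRES}

def make_ratings(n):
    """Generate n synthetic (user, movie, rating) rows."""
    rows = []
    for _ in range(n):
        u, m = random.randrange(N_USERS), random.randrange(N_MOVIES)
        rating = taste[(u, movie_genre[m])] + random.gauss(0, 0.3)
        rows.append((u, m, rating))
    return rows

train, test = make_ratings(4000), make_ratings(1000)

def fit_means(rows, key):
    """The 'simple algorithm': average rating per group."""
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        cell = sums[key(row)]
        cell[0] += row[2]
        cell[1] += 1
    return {k: total / count for k, (total, count) in sums.items()}

def rmse(rows, means, key, fallback):
    """Root-mean-square error of the group-mean predictor."""
    errs = [(r[2] - means.get(key(r), fallback)) ** 2 for r in rows]
    return sqrt(sum(errs) / len(errs))

global_mean = sum(r[2] for r in train) / len(train)

def by_user(row):
    return row[0]                        # ratings data only

def by_user_genre(row):
    return (row[0], movie_genre[row[1]])  # ratings + external genre data

rmse_ratings_only = rmse(test, fit_means(train, by_user), by_user, global_mean)
rmse_with_genre = rmse(test, fit_means(train, by_user_genre),
                       by_user_genre, global_mean)

print("RMSE, ratings only:      ", round(rmse_ratings_only, 3))
print("RMSE, ratings plus genre:", round(rmse_with_genre, 3))
```

Because the synthetic ratings are driven by genre taste, the predictor that can see the genre table has the lower error. Real data will rarely be this clean, but the mechanism is the same: the extra data set carries signal that no amount of tuning on the ratings alone can recover.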

The key is not choosing between a better algorithm and better data, but improving the outcome of your decision-making by adding more data, namely Web Data. Think about the impact on your business if you could add high-value Web Data to your Market Intelligence, Pricing Intelligence, Financial Intelligence or any other Business Intelligence product.

Many companies already have knowledge workers who cut and paste Web Data into their BI tools or use simple Web Scraping tools like Velocityscape, Connotate, QL2 or Mozenda (which are limited by their inability to handle dynamic web content generated with AJAX or JavaScript). To get the most out of your Business Intelligence projects, you’ll want a full Web Data Services product like the Kapow Web Data Server.

Unleash the real power of Web Data to make better business decisions.

Check it out and let me hear your comments.

By: Stefan Andreasen
