
Nov 05

To say that pulling data from various internal and external sources is time-consuming is a masterpiece of understatement. Cutting and pasting, homegrown scripts, and applications that record a user’s actions can’t keep up with the pace of business. And over time, demand will grow not only for the quantity of information but also for its quality.

A lot of information is accessible via public websites, but more is often hidden behind firewalls and web portals that require login credentials and the ability to navigate the site in order to extract the data. Valuable information is also embedded in PDFs, images, and graphics.


From start-ups to enterprise organizations, across industries such as finance, transportation, retail, and healthcare, acquiring external data is critical. Whether you want to stay in compliance, move ahead of the competition, or reach new markets, it all requires constant monitoring of web data. Data is extracted, transformed, and migrated into various reports, and becomes the foundation on which business decisions are based.

So a web-scraping tool or a homegrown web-scraping approach seems like a good option, since it looks like a quick and inexpensive way to harvest the data you require. Or is it?

Now comes the uneasy feeling in the back of your mind. Can my homegrown web-scraping approach or a web-scraping tool acquire the correct information I need? How do I know the data I received is accurate and formatted correctly? And what if management wants different reporting data? How is that handled?

The short answer: You don’t know.

The right answer begins with an evaluation of your specific data requirements and business needs.

  1. How does web scraping acquire the data?

While product demonstrations can present an initial set of data with colorful dashboards full of charts and reports, you are better off asking for a technology demonstration that relates to your specific data collection needs. Write up a list of actual websites you gather data from. Your list should include various types of sites, such as those built with HTML5, Flash, JavaScript, and AJAX. Be sure to include websites behind firewalls and sites that publish PDFs. The more scalable, reliable, and fast the web data extraction process is across various external websites, the better.

  2. What does the data look like?

You have received some data using a web-scraping tool, but now you spend all your time trying to transform it. You notice formatting and quality issues. If the extracted data is not accurately transformed and put into a usable format, such as Microsoft Excel, .csv files, or XML, it becomes unusable by applications that have specific integration requirements, and you have lost half the value of your investment. Extracting and automatically correcting specialized data such as dates, currencies, calculations, and conditional expressions, as well as removing duplicate data, are all important considerations.
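As a rough illustration of the kind of clean-up described above, here is a minimal Python sketch that normalizes dates and currency values, removes duplicate rows, and writes the result as CSV. The sample rows, field names, and accepted date formats are assumptions made up for this example; they do not come from any particular scraping tool.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Hypothetical raw rows, roughly as a scraper might return them.
RAW_ROWS = [
    {"product": "Widget", "price": "$1,299.00", "updated": "11/05/2014"},
    {"product": "Widget", "price": "$1,299.00", "updated": "11/05/2014"},  # duplicate
    {"product": "Gadget", "price": "USD 45.5", "updated": "2014-11-04"},
]

# Assumed input formats; a real source may need more.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_date(text):
    """Return an ISO 8601 date string, trying each assumed format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def normalize_price(text):
    """Drop currency symbols and thousands separators, keep two decimals."""
    digits = "".join(ch for ch in text if ch.isdigit() or ch == ".")
    return str(Decimal(digits).quantize(Decimal("0.01")))


def clean(rows):
    """Normalize every row and drop exact duplicates, preserving order."""
    seen, cleaned = set(), []
    for row in rows:
        record = (
            row["product"].strip(),
            normalize_price(row["price"]),
            normalize_date(row["updated"]),
        )
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


def to_csv(records):
    """Serialize cleaned records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["product", "price", "updated"])
    writer.writerows(records)
    return buffer.getvalue()


CLEANED = clean(RAW_ROWS)
```

Calling `to_csv(CLEANED)` yields text that a spreadsheet or downstream application can load directly. In practice the list of date formats and currency rules tends to grow with every new source, which is part of why this step deserves attention during an evaluation.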

  3. How difficult is it to make changes?

What happens if a website changes, or if you need to monitor and extract data from new websites? Many web-scraping tools tend to fail when websites change, which then requires resources, and in some cases a developer, to fix the problem. Unless you have a developer in house to make these fixes, this will add time and expense, and the problem only grows as you monitor and extract data from hundreds or even hundreds of thousands of websites. If scalability is important to you, be sure to ask how the technology solution monitors and handles changes to a website, especially if you want to expand beyond your immediate data collection needs.
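One simple way to notice that a website has changed before bad data reaches a report is to check that the page elements an extraction depends on are still present. The sketch below uses only Python's standard library; the element ids and sample pages are hypothetical, and commercial tools may detect changes quite differently.

```python
from html.parser import HTMLParser

# Hypothetical element ids that the extraction depends on.
REQUIRED_IDS = {"price", "stock-status"}


class IdCollector(HTMLParser):
    """Collect every id attribute that appears in a page."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def missing_ids(html, required=REQUIRED_IDS):
    """Return the required ids that are no longer present in the page."""
    collector = IdCollector()
    collector.feed(html)
    return sorted(set(required) - collector.ids)


# Made-up pages: the second one has renamed the price element.
OLD_PAGE = '<div><span id="price">$10</span><span id="stock-status">In stock</span></div>'
NEW_PAGE = '<div><span id="cost">$10</span><span id="stock-status">In stock</span></div>'
```

Here `missing_ids(NEW_PAGE)` reports that the `price` element has disappeared, which could trigger an alert instead of silently producing empty or wrong values.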

Extracting and transforming web data is more than just purchasing any web-scraping tool. Think about the data you are collecting and how it’s tied to your business. In all likelihood, there’s a strong set of business drivers for collecting the data, and taking shortcuts will only compromise your business goals. And you should never feel uneasy about the information you are collecting.

Look beyond the data that’s being extracted, and think about what you are doing with it in the context of serving your customers, creating a competitive advantage, or streamlining processes that rely on data from websites, portals, and online verification services.


Sep 17

Customer insights to best practices 

Last week I spoke with John, who leads a web automation team at a Fortune 500 professional staffing company that has been a Kapow customer for more than 5 years. The company primarily uses Kapow for Customer Relationship Management (CRM) and Human Resources (HR) activities that involve transforming, synchronizing, and delivering information between its Vignette, SharePoint® and Salesforce® applications.

Like almost every enterprise organization, its business teams use Microsoft Excel extensively for data sharing and reporting, and collaborate via Microsoft SharePoint.

As John explains, “Microsoft Excel® is used throughout the organization to capture data within business teams, reporting, or simply for exchanging data, all within SharePoint.”

John elaborates, “Microsoft has really enhanced Excel, which is seen with the improvements in data visualization between Excel 2010 and 2013 versions. Microsoft is also integrating Excel with SharePoint 2013 so you can surface live Excel data directly in web parts in SharePoint. This is the path we are taking and with the new Excel edit feature in Kapow 9.4 we expect to quadruple the use of Kapow over the next year to support it.”


I must admit this is very exciting, and it is great to see the same excitement from customers who see the value in automating activities that involve large amounts of data and the use of Excel.

When you use Kofax Kapow to dynamically update live internal and external data in an Excel spreadsheet, this information can in turn be surfaced in SharePoint, making your entire SharePoint platform a collaborative, real-time decision-making platform. Data is unlocked from any data source you can think of, including cloud apps, enterprise apps, web portals, emails, Active Directory and of course SharePoint itself.

Today the company updates all its Excel data repositories and Excel reports manually, which is not only tedious but also unavoidably introduces human errors that could become critical to the business.

Some of the data they capture is coming from other departments. Just managing who has done what is a nightmare. That’s why capabilities like Kapow’s advanced logging are so important when it comes to having a full audit trail.

In one example, John explains how they currently receive a separate email when a person joins a training course, which is sent through a SharePoint workflow but then requires a business user to manually key this information into Excel. All this is automated with Kapow 9.4.

John expects that automating manual Excel-driven work will expand the company’s need for Kapow into more departments, including HR, payroll and employee development. Finally, bringing Excel and Kapow together in SharePoint will drive adoption across data delivery, data visualization, and data collaboration.

Do you have any insights regarding Excel? We would like to hear from you.

Stay tuned for my next customer interview about using Kapow for Excel automation.

Stefan Andreasen

Corporate Evangelist, Kapow & Information Integration

Aug 26

Millions of organizations put up with the inefficiencies and risks associated with running critical parts of their business on spreadsheets, with the vast majority using Microsoft Excel® as their preferred tool. Spreadsheet software isn’t designed to be used in the way most companies use it today. Spreadsheets are handy for ad-hoc analysis, reporting, data exchange, prototyping and other common tasks. In a corporate setting, however, the repetitive manual tasks needed to acquire and integrate information from internal and external data sources into spreadsheets can lead to costly errors. In addition, spreadsheets are difficult to audit and clumsy to work with in collaborative, repetitive business processes such as budgeting, sales and operational planning, partner data exchange and cash management.

Ventana Research’s comprehensive report “Spreadsheets in Today’s Enterprise – Making Intelligent Use of a Core Technology” provides detailed insight into the use of Excel in the typical corporation. Excel is the de facto format for reports, data exchange and financial models. According to a study performed by Ventana Research in 2012, 72% of the participants said that their most important spreadsheets are ones that are shared with others [1].

A typical Excel-based process involves opening a pre-formatted Excel template, complete with multiple worksheets, pre-built macros, tables and graphics, and then editing and assembling data from a multitude of sources into this template to create the delivery document. Input data can come from systems such as email servers; business applications such as CRM, HR or ERP; bank portals; business partner portals such as those of financial partners, supply-chain partners and logistics partners; public government websites; and finally internal monitoring applications from departments such as IT, Marketing, Procurement, etc.

These Excel reports are then delivered to stakeholders in departments such as Finance, Sales and IT, or externally to business partners, through email, FTP upload or portal upload.

Rather than getting rid of spreadsheets, which for most companies would be nearly impossible, there is a modern way to cost-effectively automate the acquisition of the data entered into them. It preserves the familiarity and ease of use of Excel while adding greater accuracy, easier collaboration and the elimination of tedious manual processes.

FIGURE 1: Manual Excel-based process flow.

Innovative products such as Kofax Kapow allow the business user to define the flow of their complete Excel process, with integration directly to all the information sources and destinations. It does not take much longer to create a solution than to perform the work once manually, and it can then be repeated over and over again without human errors. Kofax Kapow also delivers a full audit log of everything that happened and alerts selected people if anything goes wrong.

The value is not only in the automation of the repetitive manual process, but also in increased business revenue from:

  1. Elimination of human errors.
  2. Near real-time result/delivery for quicker decisions or improved service levels.
  3. Running the process at speeds that would be impossible for a human.

FIGURE 2: Efficient workflow of automated Excel process with Kofax Kapow.


Next steps

When I discuss this topic with industry leaders, I typically recommend a number of steps to discover the use of Excel within an enterprise to understand the potential for Excel Automation. These steps include:

  1. Interview business managers in departments who use Excel.
  2. Estimate the amount of human time spent on manual repetitive work.
  3. Estimate the business value from elimination of human errors.
  4. Estimate the business value of freeing employees to make better business decisions.
  5. Think about improving your business by including more data sources or increasing the frequency you acquire data.

From these simple steps you can determine the ROI.
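As a back-of-the-envelope illustration, the estimates from steps 2 and 3 can be combined into a simple first-year ROI figure. The formula below is one common way to express ROI, and every number in the example call is a hypothetical placeholder, not a benchmark or a customer result. The softer benefits from steps 4 and 5 are left out because they are harder to quantify.

```python
def annual_roi(hours_per_week, hourly_cost, error_cost_per_year,
               automation_cost_per_year, weeks_per_year=48):
    """Return ROI as a fraction: (benefit - cost) / cost.

    benefit = labor no longer spent on manual work (step 2)
              + cost of human errors avoided (step 3)
    """
    labor_savings = hours_per_week * weeks_per_year * hourly_cost
    benefit = labor_savings + error_cost_per_year
    return (benefit - automation_cost_per_year) / automation_cost_per_year


# Hypothetical inputs: 10 hours a week of manual Excel work at 50 per hour,
# 6,000 a year lost to errors, and 20,000 a year spent on automation.
EXAMPLE_ROI = annual_roi(
    hours_per_week=10,
    hourly_cost=50.0,
    error_cost_per_year=6000.0,
    automation_cost_per_year=20000.0,
)
```

With these made-up inputs the labor savings are 24,000 and the total benefit is 30,000, so the result is 0.5, a 50% return. Replacing the placeholders with figures gathered from your own interviews gives a first rough answer.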

For most companies Excel Automation is a no-brainer.

A future blog post will go through real-life customer examples, so stay tuned.

Comments are welcome at stefan.andreasen@kofax.com

  1. Ventana Research, Spreadsheets in Today’s Enterprise, January 2013.


Apr 21

Many factors have contributed to SharePoint’s longevity and success, and David Roe pointed out a few of them in his recent CMS Wire article entitled “SharePoint: A Formidable Enterprise Collaboration Platform.” The article, which summarizes a Radicati Group report on SharePoint, mentions that SharePoint’s ecosystem has been a key contributor to its continued success, and I agree completely. SharePoint functionality is also important, of course, and Microsoft has invested heavily to add social and mobile capabilities throughout SharePoint 2013. But business value doesn’t come from a box: it comes from applying technology like SharePoint, and Kapow Enterprise, to the pressing needs that challenge your business. As part of the SharePoint ecosystem, Kapow improves many of our SharePoint customers’ content processes, from capture to creation to enterprise search, just as we do for all the CMS products we support.

If you have any questions about content migration, give us a call and we can help decide whether we’re right for you. You can get the full Kapow Content Migration story from our white paper on the topic. Attending SPTechCon April 22-25 in San Francisco? Bring your requirements by booth 220 at the exhibit hall. 






Authored by: Carol Kimura, Director, Field Marketing at Kapow Software – a Kofax Company

Apr 08

On March 26 I presented at bpmNEXT 2014, an annual event for leaders in the business process management (BPM) industry, analysts, industry influencers and various vendors. There were nearly one hundred attendees from more than 10 countries. This was one of those events where you come back with great new contacts and a ton of inspiration.

Following welcoming remarks by Nathaniel Palmer and Bruce Silver, who are among the biggest thought leaders in the industry and the team behind the creation and expansion of the BPM.com community, we jumped right into the 25 presentations, all of which delivered cutting-edge, innovative BPM demos.

The event was very well orchestrated and organized by Nathaniel and Bruce. At the end of the three-day conference, I can say it was definitely one of the best events I’ve ever attended. I was both honored and proud when my presentation, “Automation of Manual Process with Synthetic APIs”, was voted Best in Show by the attendees. Later this month, you’ll be able to watch all of the bpmNEXT presentations at www.bpmnext.com

So how do Synthetic APIs help most business processes?

BPM is all about using a workflow engine from one of many vendors to describe, manage, monitor and improve efficiency of business processes. This can be any process, but most companies normally invest in BPM around critical fundamental processes that drive major parts of their business.

Unfortunately, BPM does not help much in automating the individual sub-tasks of the processes it manages. This is especially true for the ever-increasing number of web-centric processes and processes involving web portals, because those portals more often than not do not provide a full set of APIs that reflects the functionality of the portal itself. This is where the Synthetic API technology enabled by Kapow Software comes in.

Synthetic APIs include business rules, data transformations and interactions with multiple applications and data sources, and can be deployed as REST services, SOAP services or mini-apps (Kapow Kapplets™) at the click of a button. They are easily built with the intuitive, live-data-driven workflow design environment of Kapow Enterprise. This makes it a breeze to automate all those tedious, repeatable sub-processes involving web portals, documents (like Excel), business applications (like ERP) and file systems (local or FTP). In fact, it’s so easy with Kapow Enterprise that Kapow customers implement hundreds of automations per year, releasing important knowledge workers from repeatable, manual, data-driven work to focus on more relevant and gratifying work that adds substantially to top-line results.

Many of the more than 250 Kapow customers experience such a huge business benefit and competitive advantage with the Kapow Enterprise platform that they ask us not to mention their names under any circumstances. For more details on Synthetic APIs, check out the Synthetic API on-demand webinar. Comments are also welcome at sandreasen@kapowsoftware.com.

Authored by: Stefan Andreasen, Corporate Evangelist, Data Integration, Kapow Software – a Kofax Company

Apr 07

Just as Oracle users began to gather for COLLABORATE 14, the IOUG-run conference, I was speaking with one of our long-time Oracle customers and the topic turned to her journey with Kapow.

She began, as many of our customers have, in the middle of a major content migration project—several hundred thousand pages—that had begun to slip almost on the very first day. After meeting with Kapow at COLLABORATE she saw the potential and about a week after bringing Kapow Enterprise in-house she was convinced.

What struck me was just how difficult it was for her—a seasoned content management expert—to believe that it was worth automating a content migration. “In the past I’d used specialists to develop migration scripts but we didn’t find very much reuse,” she told me, and went on to say “Despite what experts advocate, I’ve always had to transform content as part of my projects and scripting isn’t well-suited to that. So I took automation out of my migration toolbox until I found you at COLLABORATE a few years back.”

Since then they have placed Kapow Enterprise at the center of their Information Management function, using it to create content from databases, to integrate support documents from outside their firewall, and even to load richer metadata into the index of their search solution. So whether you need to publish data from an Oracle database or migrate content into WebCenter, give us a call and we can help assess whether we’re right for you. You can get the full Kapow Content Migration story from our white paper on the topic.

And if you’re attending COLLABORATE 14 in Las Vegas, come hear Stephen Moore speak on “Automating Web and Document Migrations” Wednesday, April 9 from 2-3pm. He’s on Level 3, San Polo 3501A (Session 908). And don’t forget to stop by booth 1643 at the Exhibitor’s Showcase for a 1-on-1 discussion of your requirements. 





Authored by: Carol Kimura, Director, Field Marketing at Kapow Software – a Kofax Company

Mar 24

Kapow is once again pleased to be part of the Adobe Summit in Salt Lake City on March 24-28; it’s not our first time here and I expect it won’t be our last. Adobe Experience Manager, built on Adobe CQ, is a powerful tool for web marketers and our success in the Adobe community is a reflection of that.

However, I find too often that high-visibility content projects will struggle, or even become high-visibility failures, no matter what the target CMS is.

At this very moment, two of my colleagues at large organizations are in the middle of large content migrations, and both are trying to work through the content freeze that’s considered a “best practice” in content migration. One hasn’t been able to update customer-facing documents for months, leading to mis-quoted sales and even a few customer defections when inconsistent price lists were circulated by email during the freeze period.

But all of that was unnecessary. Thanks to today’s technology, lengthy content freezes aren’t needed. We have a white paper that explains why, or, even better, if you’re attending the Adobe Digital Marketing Summit, stop by our booth #811 and say hello.





Authored by: Carol Kimura, Director, Field Marketing at Kapow Software – a Kofax Company


Nov 11

I’m proud to announce that Kapow Enterprise 9.3 comes with an integrated WebKit-based browser. This means that when designing or running data integration flows (aka robots) in Kapow Katalyst™ against web-based systems or applications, you can take advantage of the impressive HTML5 compatibility and JavaScript performance of WebKit.

For those of you who are not familiar with WebKit, it is the common core between Safari and Chrome (up until the most recent versions of Chrome that are now running on a WebKit fork known as “Blink”). According to StatCounter, the web traffic analysis tool, WebKit is the most widespread web browser engine in use on the internet – ahead of both the IE and Firefox engines.


This means that integration flows based on WebKit have a very high likelihood of being compatible with the websites in use around the world. For old legacy systems that you wouldn’t expect to work in Chrome or Safari, we recommend that you continue to use our classic browser engine that is IE-compatible. Having both engines in our product gives you the maximum flexibility in integrating with both cutting-edge web applications as well as those legacy applications that still hold important information and functionality, but are no longer updated to support modern browsers.

Making browsers that were created for human interaction controllable by an integration flow isn’t the easiest thing in the world. It often requires a lot of scripting and trial and error, and the result can be hard for others to read and maintain.

But we’ve taken our knowledge of how browsers work, including algorithms that determine when the time is right to take the “next step” (clicking a link or entering data into a form), and wrapped the WebKit engine in this logic, making it easy for you as a user to build integration flows using point-and-click development. These flows are quick to create and easy to maintain over time, providing stable Synthetic APIs so data can be rapidly integrated from applications and data sources that do not have APIs.
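To give a feel for what deciding “when the time is right” involves in hand-written automation, here is a generic polling helper in Python. It is only an illustrative sketch of the kind of scripting that point-and-click tools aim to spare you; it is not Kapow’s algorithm, and the function and parameter names are invented for this example.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    Returns True if the condition was met in time, False otherwise.
    `clock` and `sleep` are injectable so the helper can be tested
    without real delays.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in for a page that becomes ready on the third check.
_page_states = iter([False, False, True])
PAGE_BECAME_READY = wait_until(lambda: next(_page_states), interval=0)
```

A hand-written flow would call something like this before every click or form entry, with a condition specific to each page, which is exactly the sort of detail that makes scripted browser automation hard to read and maintain.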

Authored by: Anne-Sofie Nielsen, Vice President of R&D, Kapow Software – A Kofax Company

Nov 11

Today is an important milestone in our mission to make big data more accessible, actionable and affordable to support data-driven decision making. The variety of big data continues to grow exponentially as new data sources and types are becoming available. Blending internal data sets with external data and making it available for business consumers to explore is opening a whole new world of possibilities for innovation and business growth. With Kapow Enterprise 9.3, launched today at Strata Big Data conference in London, we are making more data sources available and supporting interactive data exploration, allowing users to understand the data and act on it very quickly.

Our objective with this new release was to allow organizations to experiment with data: to explore widely used sources and combine new sources, both internal and external, to discover insights that were never available before. At the core is the ability to quickly test current hypotheses, create new ones and rapidly course-correct in between, all based on relevant data that pricing analysts, marketing professionals, sales executives and other knowledge workers can quickly access and explore.

We’ve enhanced our lightweight data applications, Kapow Kapplets™, to present data from any number of sources in richly visual pages that can include interactive tables and graphs. Users can explore data by dynamically sorting and filtering it so they can focus on the subsets that are more relevant or interesting. But to make this really valuable, in Kapow Enterprise 9.3 users can take action on the data right away by selecting one or multiple rows of data as input for an action executed by another data integration flow.


For example, the Kapplet shown to the right presents a table with sales leads collected from various external sources. The Kapplet user can select one, multiple or all of the rows and submit them to a CRM system (e.g. Salesforce.com) to create new lead records. Once the rows are submitted, the Kapplet presents the URL of the new Salesforce.com lead record for each row.

To make it easier to quickly build and distribute Kapplets across the organization, we’ve built user management capabilities into the platform, so organizations can start empowering their business users across departments right away without relying on LDAP or Active Directory implementations.

We continue to enhance the data acquisition capabilities of our platform with the ability to rapidly create Synthetic APIs to integrate sources without APIs as well as enhance the utilization of existing traditional APIs with new capabilities to natively interact with web services.

There are numerous ways that global business leaders can harness the power of the new Kapow Enterprise 9.3 big data integration platform, for example monitoring competitor pricing, increasing marketing ROI or generating more sales leads. Contact us today to get started right away!

Authored by: Hila Segal, Director, Product Marketing at Kapow Software – a Kofax Company

Nov 06

We’re surrounded by data. It is data produced in both private and public sectors. It is data generated by individuals or machines. The variety of Big Data is large and growing rapidly across both internal and external sources, and it offers immense opportunities for those who take advantage of it.

Behind the firewall, Big Data is stored in databases, file systems, Hadoop frameworks, documents, archives and legacy on-premise applications for ERP, CRM, content management and more. Externally, outside the firewall, it is available in cloud-based applications (like Salesforce, Marketo and Ariba), partner and supplier portals, the public web (such as government or competitor websites), social networks and many more. External data on economic indicators, public finance, healthcare, regulatory compliance and more also plays a central role in many Big Data use cases. Every day, consumers are creating Big Data across Facebook, Twitter, LinkedIn, blogs, reviews and forums, offering insights into their preferences and behaviors.

To add to this variety, each data source has different characteristics. Data can be structured, like machine data or sensor data, well organized in fields within a record or file or it can be unstructured, like social media data with no pre-defined data model. It can also be somewhere in the middle between structured and unstructured.  Some data sources will have APIs and others will not, requiring alternative methods in order to access and extract data.

The bottom line is that there is really no one “killer” data source. It is the unique combination of sources that you tailor to address your business needs that will be the “killer” data for your organization. The ever-expanding pool of data sources and types is opening a whole new world of possibilities. Start taking advantage of it now: experiment with new sources, augment traditional sources, iterate and find the data gems that matter to your business!


Author: Hila Segal, Director of Product Marketing

The Kapow Katalyst Blog is…

... a collection of insights, perspectives, and thought leadership around Application Integration.

Comments, Feedback, Contact Us:

blog at kapowsoftware.com
