Pingar: turning unstructured data into knowledge

August 23, 2011 | Robert Scoble

Large enterprises, and even smaller organizations in document-intensive industries, often have millions of documents stored on their servers. Finding meaningful relationships among those documents, or gleaning any value from them, can seem impossible. Pingar has developed technology to make sense of all this information and allow its owners to put unstructured data to good use.

“What we’re trying to do is provide technologies that enable those enterprises to begin to understand what content sits within their data sets,” explains Peter Wren-Hilton, CEO and Founder of Pingar. “Typically the entity extraction and content analysis components that we have developed really are designed to enable enterprises to be able to identify relationships between documents [and] begin to, for instance, generate automatic meta data. If redaction is a key point, then our entity extraction components allow companies to redact documents through algorithms rather than through the black marker pen. So we’ve got a range of components that collectively enable enterprises to make more sense from the millions of documents that they have stored away.”
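To make the idea of redacting "through algorithms rather than through the black marker pen" concrete, here is a minimal sketch in Python. It is not Pingar's API or method: it assumes the entities have already been extracted by some other step and are passed in as plain strings, and the function name `redact` and the `[REDACTED]` mask are hypothetical choices made for illustration.

```python
import re


def redact(text, entities, mask="[REDACTED]"):
    """Replace every occurrence of each entity string in text with a mask.

    `entities` is assumed to come from a separate extraction step;
    this sketch only performs the masking.
    """
    # Drop empty strings and duplicates, then match longer entities first
    # so that "Acme Corp Ltd" is masked whole rather than as "Acme Corp" + " Ltd".
    ordered = sorted({e for e in entities if e}, key=len, reverse=True)
    if not ordered:
        return text
    # re.escape keeps punctuation inside entity names from acting as regex syntax.
    pattern = re.compile("|".join(re.escape(e) for e in ordered), re.IGNORECASE)
    return pattern.sub(mask, text)


if __name__ == "__main__":
    sample = "Jane Doe met John at Acme Corp."
    print(redact(sample, ["Jane Doe", "Acme Corp"]))
```

A real redaction pipeline would also have to handle name variants, document formats such as PDF, and an audit trail, none of which this sketch attempts.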

Generating the meta data is a key difference between Pingar and other enterprise solutions on the market. Most enterprise search engines rely on meta data to help them identify documents; however, users often find a way around entering the meta data when the document is created. Pingar removes the need for people to tag documents by hand and replaces that task with algorithms. This is just one of many tasks that Pingar automates.

“It’s not just the extraction,” says Wren-Hilton, “it’s what you can then do with that extracted entity. We then move from entity extraction to content analysis, and with content analysis, we’ve got redaction, sanitization and summarization. So we’re able to take a 40-page .pdf and create a six or seven paragraph executive summary on the fly simply through content analysis.”

The technology is platform agnostic, so it works with any document management system, and it was released as an API to allow developers to access it.

“We were going to go to market at the end of last year, and we actually did a fairly significant pivot when we realized that the amount of technology that we had would make it far better to release it as an API. In March of this year, we released an API with 18 specific components that developers are able to access. There’s the standard, free developer sandbox account, so they can start building applications.”

Currently, Pingar is available in English and Chinese language versions; however, the company has plans to release French, German, Spanish and Arabic versions of the API over the next 6 to 9 months. Other innovations are sure to follow as well.

“Although we’re commercializing the product,” explains Wren-Hilton, “there’s still a strong focus on research, and I think one of the areas that will create the most excitement for those people interested in enterprise is the ability to start developing custom taxonomies. A company will be able to build a custom taxonomy using some of our technology. So rather than having to use a digital librarian to physically build a taxonomy, our entity extraction tools will identify the most commonly used terms and phrases and build a taxonomy on the fly.”
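As a rough illustration of "identify the most commonly used terms and phrases", the sketch below counts single words and two-word phrases across a set of documents in Python. It is an assumption-laden toy, not Pingar's technology: the function name `candidate_terms`, the tiny stopword list, and the choice of plain frequency counting are all hypothetical, and the output is only a list of candidate terms that a person or a further step would still need to arrange into a taxonomy.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list; a real one would be much larger.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on", "with"}


def candidate_terms(documents, top_n=10):
    """Return the top_n most frequent words and two-word phrases with their counts."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z]+", doc.lower())
        # Count single words, skipping stopwords.
        counts.update(t for t in tokens if t not in STOPWORDS)
        # Count adjacent word pairs in which neither word is a stopword.
        counts.update(
            f"{a} {b}"
            for a, b in zip(tokens, tokens[1:])
            if a not in STOPWORDS and b not in STOPWORDS
        )
    return counts.most_common(top_n)


if __name__ == "__main__":
    docs = [
        "The supply contract covers the supply chain.",
        "A supply contract is signed.",
    ]
    for term, count in candidate_terms(docs, top_n=5):
        print(count, term)
```

Frequency alone says nothing about how terms relate to one another, so this covers only the first step of the process described in the quote.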

More info:

Pingar web site:
Pingar blog:
Pingar profile on CrunchBase:
Pingar on Twitter:

{ 1 comment }

David Glen August 24, 2011 at 4:42 pm

Watched this from beginning to end – very cool – good summary of Pingar. I suspect this is the start of a major wave for Pingar. Wish I had access to millions of documents so that I could justify the investment :-)
