The examples on our front page are but a small sample of the opportunities presented by the application of Natural Language Technology. NLP makes use of several techniques, which are incorporated in our modular core processing engine technology. This flexibility enables us to continue to develop new solutions based on innovative combinations of techniques.
In a ‘tolerant’ and associative approach, documents are compared to determine the amount of overlap. Even though the content is not the same, these documents can be matched by using sentence deconstruction, synonym identification, comparison of related terms and expressions, alternative word-extensions, etc.
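As a minimal sketch of this kind of tolerant comparison (not the actual engine), the example below maps words to a canonical synonym before measuring overlap. The `SYNONYMS` table and the function names are hypothetical, and Jaccard overlap is only one possible measure.

```python
import re

# Hypothetical synonym table: maps word variants to one canonical term.
SYNONYMS = {"car": "automobile", "autos": "automobile",
            "purchase": "buy", "bought": "buy"}

def normalize(text):
    """Lowercase, tokenize, and map each token to its canonical synonym."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {SYNONYMS.get(token, token) for token in tokens}

def overlap(doc_a, doc_b):
    """Jaccard overlap of the normalized term sets, between 0.0 and 1.0."""
    a, b = normalize(doc_a), normalize(doc_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Different wording, largely the same content.
print(overlap("I bought a car", "I buy an automobile"))
```

Without the synonym step the two example sentences would share only one word; with it, three of the five distinct terms match.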
Search terms (used in a search engine) can be analyzed and, if desired, extended with synonyms, alternative word-extensions and related terms. Typing and spelling mistakes can be corrected automatically.
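The extension of search terms can be illustrated with a small sketch. The `RELATED` lookup table and `expand_query` are assumptions made for this example; a production system would draw on a much larger lexical resource.

```python
# Hypothetical lookup of synonyms, word-extensions and related terms.
RELATED = {
    "ticket": ["tickets", "admission"],
    "concert": ["performance", "show"],
}

def expand_query(query):
    """Return the original terms plus their related terms, without duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + RELATED.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded

print(expand_query("concert ticket"))
```

The expanded list can then be passed to the search engine as an OR-query, so that documents using a related term are still found.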
Manually defined patterns in text content are recognized by our automated processing, which then identifies and extracts the parameters. The patterns can be defined in broad enough terms that multiple phrasing approaches will be recognized: “I would like to order NN tickets for the XX performance by YY on ZZ date”.
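One way to picture such a pattern is as a regular expression with named groups for NN, XX, YY and ZZ. The sketch below is an illustration under that assumption; the pattern, the group names and `extract_order` are hypothetical and do not describe the actual pattern syntax.

```python
import re

# Hypothetical pattern: several openings and verbs are accepted, and the
# four parameters are captured as named groups.
ORDER_PATTERN = re.compile(
    r"(?:i would like to|i want to|can i|please)\s+(?:order|book|buy)\s+"
    r"(?P<count>\d+)\s+tickets?\s+for\s+the\s+"
    r"(?P<performance>.+?)\s+performance\s+by\s+"
    r"(?P<artist>.+?)\s+on\s+(?P<date>.+?)\.?$",
    re.IGNORECASE,
)

def extract_order(sentence):
    """Return the extracted parameters as a dict, or None if nothing matches."""
    match = ORDER_PATTERN.search(sentence.strip())
    return match.groupdict() if match else None

print(extract_order(
    "I would like to order 3 tickets for the evening performance "
    "by the Amsterdam Quartet on 12 May."
))
print(extract_order("Can I book 2 tickets for the late performance by Jane Doe on Friday"))
```

Because the opening phrase and the verb are alternatives, both example sentences yield the same four parameters even though they are phrased differently.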
The complete structure of a sentence, as well as the role of each item in the sentence, can be determined automatically.
As part of parsing, a sentence is split into logically coherent components: [The aggressive dog] of [our neighbor] attacked [our cat].
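The bracketed example can be reproduced with a toy chunker. This sketch assumes the words have already been given part-of-speech tags by an earlier step (the tags below are supplied by hand) and simply groups consecutive determiners, adjectives and nouns; the tag set and `noun_chunks` are illustrative assumptions.

```python
# Tags that may appear inside a noun chunk in this toy scheme.
NP_TAGS = {"DET", "ADJ", "NOUN"}

def noun_chunks(tagged):
    """Group consecutive determiner/adjective/noun tokens into chunks."""
    chunks, current = [], []
    for word, tag in tagged:
        if tag in NP_TAGS:
            current.append(word)
        elif current:
            # A token outside the chunk closes the chunk collected so far.
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

# Hand-tagged version of the example sentence.
tagged_sentence = [
    ("The", "DET"), ("aggressive", "ADJ"), ("dog", "NOUN"), ("of", "ADP"),
    ("our", "DET"), ("neighbor", "NOUN"), ("attacked", "VERB"),
    ("our", "DET"), ("cat", "NOUN"),
]
print(noun_chunks(tagged_sentence))
```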
Keyword extraction, key phrase recognition
Continuing from chunking, this process identifies the relative significance of words and phrases. The main and supporting topics of the content can be determined in this way.
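As a rough illustration of ranking terms by significance, the sketch below counts how often each non-trivial word occurs. Plain frequency is a stand-in chosen for brevity, and the `STOPWORDS` list and `keywords` function are assumptions of this example.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real list would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "was", "it", "for", "on"}

def keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(token for token in tokens if token not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]

review = ("The hotel was quiet. The hotel breakfast was good and "
          "the breakfast room was clean. Hotel staff helped.")
print(keywords(review, top_n=2))
```

The highest-ranked terms suggest the main topic, and the terms that follow suggest supporting topics.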
Correction of spelling mistakes
Generate suggestions to correct spelling mistakes.
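A minimal sketch of suggestion generation, using `difflib.get_close_matches` from the Python standard library against a known vocabulary. The `VOCABULARY` list, the `suggest` function and the cutoff value are assumptions made for the example.

```python
import difflib

# Hypothetical vocabulary of correctly spelled words.
VOCABULARY = ["ticket", "performance", "invoice", "mattress", "neighbor"]

def suggest(word, max_suggestions=3):
    """Return vocabulary words that closely resemble the given word."""
    return difflib.get_close_matches(
        word.lower(), VOCABULARY, n=max_suggestions, cutoff=0.7
    )

print(suggest("matress"))
print(suggest("performence"))
```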
Grouping identifies central text items or concepts in the body of the content and highlights or gathers them closer together in order to speed up understanding of complex content.
The De-doubler can be used very effectively to review content with different phrasing and from different sources to determine whether the central content is actually the same. It is suitable for identifying press articles that are based on originals, and for identifying paraphrasing (to avoid copyright issues) or fraud.
Automated solution offering in response to customer questions
Automatically determine which FAQ answer matches a customer’s question. Additionally, pattern matching and sentence de-construction can be used to extract relevant parameters from the question posed. These parameters can be highlighted in the selected FAQ response, or used to build a dynamically generated response.
Independently manages a chat-conversation with a website visitor in a self-service context, for instance through the use of available FAQ responses. This approach is also suitable for use in transaction-oriented settings, such as ordering tickets, financial calculations, etc. Pattern matching and sentence de-construction are used to extract the most relevant topics in the communication by the website visitor.
On the basis of content read by a user, we build a profile representing areas of interest of the individual. This profile is depicted in a word-cloud of relevant terms and can be used to match disparate content (documents, advertising, etc.) to a user’s interest profile, through the use of fuzzy matching.
Alternatively, this type of user profiling can also be used to (pre-)sort various documents (for instance search results) based on relevance to a user’s profile.
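Profile building and relevance sorting can be sketched as follows. The example represents a profile as term counts and uses cosine similarity as a simple stand-in for the fuzzy matching mentioned above; all function names are hypothetical.

```python
import math
import re
from collections import Counter

def term_counts(text):
    """Count the lowercase word occurrences in a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def build_profile(read_documents):
    """Accumulate term counts over everything the user has read."""
    profile = Counter()
    for document in read_documents:
        profile.update(term_counts(document))
    return profile

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank_for_user(profile, candidates):
    """Sort candidate documents by similarity to the profile, best first."""
    return sorted(candidates,
                  key=lambda doc: cosine(profile, term_counts(doc)),
                  reverse=True)

profile = build_profile(["football match results", "football transfer news"])
print(rank_for_user(profile, ["stock market report", "football news today"]))
```

The most frequent terms in `profile` are also the natural input for a word-cloud display.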
Like a monkey sitting on the editor’s shoulder, we can assist by highlighting relevant passages. Various applications could be imagined: summarizing large documents by retaining only the highly relevant passages, applying anonymization to sensitive passages, redacting content, etc.
Search engines like Google and Bing rate search engine results lower if the exact same content appears elsewhere. This is a problem for e-commerce shops, which typically simply copy the manufacturer’s product description. CARP can be engaged to interactively suggest paraphrasing options that can be copied into the new product description with a single button click.
Editors of news summary publications typically need to summarize many articles manually before they can integrate these into a newspaper-size page layout. CARP can produce content summaries highlighting the most relevant or poignant sentences in a document with a user-controlled degree of summarization. These summaries significantly speed up the work of the editor.
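A user-controlled degree of summarization can be illustrated with a simple extractive sketch: each sentence is scored by the average frequency of its words, and the best-scoring fraction is kept in the original order. The scoring rule, the `STOPWORDS` list and the `ratio` parameter are assumptions of this example, not a description of how CARP scores sentences.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "is", "and", "to", "in"}

def content_words(sentence):
    """Lowercase words of a sentence, excluding stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]

def summarize(text, ratio=0.5):
    """Keep roughly `ratio` of the sentences, chosen by average word frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    keep = max(1, round(len(sentences) * ratio))
    best = sorted(range(len(sentences)),
                  key=lambda i: score(sentences[i]), reverse=True)[:keep]
    # Restore document order so the summary reads naturally.
    return [sentences[i] for i in sorted(best)]

article = ("The council approved the budget. The budget funds new schools. "
           "Rain is expected tomorrow. Schools receive most of the budget.")
print(summarize(article, ratio=0.5))
```

Lowering `ratio` produces a shorter summary; raising it retains more of the original text.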
Classifier training interface tools
CARP incorporates an interactive interface for training the classifier module. Multiple training documents can be uploaded simultaneously. If no category label has yet been assigned to the topics and keywords identified, our software suggests one. Words that fit the label are marked inside the content to provide visual feedback as to the accuracy of the operation. The user has full control to fine-tune the category labels inside this training interface, to ensure that the process is done correctly with 100% confidence in the operational setting.
Similar to determining the difficulty level of a piece of content, CARP can be used to monitor the use of phrasing, sentence constructs that are more complicated than desired, etc. in order to maintain a consistent ‘corporate style’ of communication with the stakeholders.
CARP can be engaged to monitor public and private communication to identify words, phrases or topics that need to be flagged, for instance personal or societal threats on social media and forums. It can also be used effectively to allocate invoices to the correct account on the basis of the description of the order or item. Signaling can also be used to identify the frequency of topics in collections of documents, for instance the mention of (specific) complications in medical reports for a specific disease or clinical surgery category.
Clustering with CARP can be used to organize collections of documents based on their identified content. Once clusters are identified, CARP can generate appropriate keywords that are representative of the category of documents. Relevance-based file systems can be generated in this manner, by processing the remaining documents and identifying to which category (or categories) each document belongs, based on matching of identified keywords. New documents are automatically added to the category once processed.
Social Media comment analysis
Through the use of ‘chunking’ and ‘keyphrase recognition’, CARP identifies aspects such as “mattress was too soft”. Subsequent processing of all content enables aggregating similar phrases to determine how often the issue is mentioned in a given period, for a given location, etc. These results can be used to calibrate the use of ‘stars’ to indicate ‘quality’ ratings, for instance. This approach to processing unstructured content can result in much more accurate insights into the various aspects involved, especially if presented visually. It becomes immediately apparent which aspects (translated into ‘problems’) are most significant. Similar techniques can be used to process questions with open answer format in surveys.
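The aggregation step can be sketched as follows, assuming the key phrases have already been extracted from each comment together with a location and a month. The `ASPECTS` mapping, the record layout and the sample data are invented for illustration.

```python
from collections import Counter

# Hypothetical mapping from extracted key phrases to a canonical aspect.
ASPECTS = {
    "mattress was too soft": "soft mattress",
    "bed too soft": "soft mattress",
    "noisy room": "noise",
    "room was loud": "noise",
}

def aggregate(records):
    """Count canonical aspects per location and month.

    Each record is a (phrase, location, month) tuple; phrases that are not
    in the mapping are ignored.
    """
    counts = Counter()
    for phrase, location, month in records:
        aspect = ASPECTS.get(phrase.lower())
        if aspect:
            counts[(aspect, location, month)] += 1
    return counts

# Made-up sample records.
records = [
    ("Mattress was too soft", "Utrecht", "2024-05"),
    ("bed too soft", "Utrecht", "2024-05"),
    ("room was loud", "Utrecht", "2024-05"),
    ("noisy room", "Leiden", "2024-06"),
]
for key, count in aggregate(records).most_common():
    print(key, count)
```

Sorting the counts from high to low shows at a glance which aspects are mentioned most often for a given location and period.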
The Search Wizard helps the user phrase a question in such a way that it can be interpreted by the IT-system. While a question is being typed, CARP automatically generates search questions on the basis of the descriptive text. The computer can then respond by formulating synonyms and related search terms and extending the query with these to generate a more complete and relevant search.
Using pattern recognition, CARP identifies and extracts specific information from the text for reporting purposes.