{"id":232,"date":"2012-04-02T19:10:36","date_gmt":"2012-04-02T19:10:36","guid":{"rendered":"http:\/\/www.redpointms.com\/blog\/?p=232"},"modified":"2012-04-06T17:45:30","modified_gmt":"2012-04-06T17:45:30","slug":"text-analytics-and-business-intelligence","status":"publish","type":"post","link":"http:\/\/www.rplead.com\/blog\/ecm\/text-analytics-and-business-intelligence\/","title":{"rendered":"Text analytics and business intelligence"},"content":{"rendered":"<p><a href=\"http:\/\/www.redpointms.com\/blog\/wp-content\/uploads\/2012\/04\/Research.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignleft  wp-image-237\" title=\"Research\" src=\"http:\/\/www.redpointms.com\/blog\/wp-content\/uploads\/2012\/04\/Research-300x214.jpg\" alt=\"Research\" width=\"180\" height=\"128\" srcset=\"http:\/\/www.rplead.com\/blog\/wp-content\/uploads\/2012\/04\/Research-300x214.jpg 300w, http:\/\/www.rplead.com\/blog\/wp-content\/uploads\/2012\/04\/Research.jpg 600w\" sizes=\"(max-width: 180px) 100vw, 180px\" \/><\/a>Text analytics has been gaining popularity recently. For years it was perceived as the stepchild of business intelligence. I recently saw research results indicating that most organizations that implemented business intelligence were still waiting to realize their ROI. I think the problem is that BI, in its current narrow definition of dealing primarily with structured data, gives only partial answers to business questions. After all, only 15 to 20% of the information that organizations deal with is structured. Interestingly, the concept of business intelligence was first introduced in the IBM Journal in the 1950s by Hans Peter Luhn in his article \u201cA Business Intelligence System\u201d. He defined it as \u201cautomatic method to provide current awareness services to scientists and engineers\u201d and \u201cinterrelationships of presented facts in such way as to guide action towards desired goal\u201d. 
Luhn did not refer selectively to structured data; as a matter of fact, part of his life was devoted to solving the information retrieval and storage problems faced by libraries and document and records centers. Even for IBM in the 1950s, computerized methods were still at a very early stage. Over the years, however, as computers became part of business life, data analysis took the path of least resistance: the exploration of data that is structured and therefore fairly straightforward to compare, categorize, and mine for trends; data to which one could apply mathematical models. Thus, over time, structured data analysis became almost synonymous with business intelligence. Text analytics survived in business domains such as market research and pharma. Recently, however, text analytics has been experiencing a renaissance, and there are several reasons for this. One is national security: governments are spending billions of dollars on developing analytical tools that let them search \u2018big data\u2019 in the shortest possible time to identify threats. Another is that a lot of organizations have realized that they need to listen more to their customers; hence disciplines within market research such as customer experience management, enterprise feedback management, and voice of the customer in CRM are booming. A further factor accelerating the rise of text analytics is the change in the way we communicate, brought about by the latest social technologies and \u2018big data\u2019. The concept of \u2018big data\u2019 is somewhat misleading. After all, storage cost and size are not the problem: a lot of companies selling cloud services offer a few gigabytes here and there for free. The issue is not so much the size of the data as the extent and degree of its \u2018unstructuredness\u2019. 
To make sense of the stored information and make use of it, organizations need methods, tools, and processes to digest and analyze the data. In the last sentence I made a purposeful distinction between data and information: the former is a set of raw facts, while the latter is data put in context, creating specific meaning for the user. The speed of change in the way we communicate already makes the term \u2018text analytics\u2019 old-fashioned. We are now talking about the analysis of all types of unstructured data: not only text, but also voice messages, videos, drawings, pictures, and other rich media.<\/p>\n<p>So what is text analytics about? It is simply a set of techniques and models for turning text into data that can be analyzed further, as in traditional business intelligence, allowing organizations to respond to business problems. By generating semantics, text analytics provides a link between search and traditional business intelligence, turning data retrieval into an information delivery mechanism. The process discovers and presents the uncovered facts, business rules, and relationships. Several analytical methods are employed in this process, using statistical, linguistic, and structural techniques. 
Here are a few examples:<\/p>\n<ul>\n<li>Named entity recognition \u2013 identifying names of people, organizations, locations, symbols and so on in textual sources<\/li>\n<li>Similarity detection and disambiguation \u2013 using contextual clues to distinguish, for example, that the word \u201cbass\u201d refers to the fish and not to the instrument<\/li>\n<li>Pattern-based recognition \u2013 employing regular expressions, for example to identify and standardize phone numbers, email addresses, postal codes and so on<\/li>\n<li>Concept recognition \u2013 clustering data entities around defined ideas<\/li>\n<li>Relationship recognition \u2013 finding associations between data entities<\/li>\n<li>Co-reference recognition \u2013 resolving multiple terms that refer to the same object, which can be quite complex \u2013 in the example below the pronoun \u201cHe\u201d refers to a different person in each sentence:<\/li>\n<ul>\n<li>Paul gave money to Stephen. He had nothing left.<\/li>\n<li>Paul gave money to Stephen. He was rich.<\/li>\n<\/ul>\n<li>Sentiment techniques \u2013 subjective analysis to discover attitude from the source data \u2013 opinion, mood, emotion, sentiment<\/li>\n<li>Quantitative analysis \u2013 extracting semantic or grammatical relationships between words to find meaning<\/li>\n<\/ul>\n<p>As we can see, extracting information from unstructured data can be immensely difficult, although it is not an impossible task. It does, however, require considerable commitment from the organization to implement. If done properly, it can help address a lot of the problems that enterprises face today. These problems relate to perceived information overload, poor information governance, and low-quality metadata that leads to poor findability and knowledge. This in turn impacts organizational productivity. As for information external to organizations, the \u2018big data\u2019 question of how to monetize social media will drive new technological and business solutions. 
This seems to be one of the areas that will experience substantial growth. Using only \u2018transactional\u2019 business intelligence based on structured information is insufficient for organizations to get the full picture. The solution is rather \u2018integrated business intelligence\u2019, which combines structured and unstructured data to provide answers to business questions.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Text analytics has been gaining popularity recently. For years it was perceived as the stepchild of business intelligence. I recently saw research results indicating that most organizations that implemented business intelligence were still waiting &hellip;<\/p>\n<p class=\"read-more\"><a href=\"http:\/\/www.rplead.com\/blog\/ecm\/text-analytics-and-business-intelligence\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[4,5,43,14],"tags":[28,40,9,41,8,12,23,29,30],"jetpack_featured_media_url":"","_links":{"self":[{"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/posts\/232"}],"collection":[{"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/comments?post=232"}],"version-history":[{"count":9,"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/posts\/232\/revisions"}],"predecessor-version":[{"id":244,"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/posts\/232\/revisions\/244"}],"wp:attachment":[{"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/media?parent=232"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/categories?post=232"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.rplead.com\/blog\/wp-json\/wp\/v2\/tags?post=232"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}