
Strategic Analysis – BlackBerry

Recently I created a strategic analysis of BlackBerry, the struggling Canadian mobile device maker and service provider. Here is some of the information I collected that might help bring the organization back to profitability.

Strategic Analysis – BlackBerry

by Andre Kaminski

 

Executive Summary

Overview

The target audience of this document is BlackBerry’s senior management. The objective is to assess the organization’s current competitive position and to provide recommendations for the future.

BlackBerry, founded in 1984, was once a leader in wireless innovation and the company that defined the term “smartphone”. After a period of exceptional revenue growth, from $85M (2000) to $20B (2008), it has been in steady decline for the last three years, with revenues of $11B in 2013. Net income over the same period dropped from a $3B profit (2011) to a $0.6B loss (2013) (see figure 1).

Several analytical tools were used in this analysis. Five Forces and Environmental Analysis were used to examine the current industry structure; each gives a different perspective on competitors and market segments, helping to predict future moves. SWOT and VRIN were used to analyze the organization’s current capabilities and competitive position. Other tools, such as Competitive Lifecycle and Capability Analysis diagrams, were not found suitable given the nature of the business and the purpose of this document.

Industry Analysis

The current state of the smartphone industry can be described as a ‘Red Ocean’ (see figure 2), with fierce competition. Although entry barriers are high, a large number of competitors has already saturated the market (over 150 vendors, according to an IDC report). The industry is in its early maturity phase, with well-established competitors, strong brands and excess capacity. In November 2013, IDC found that smartphone shipments grew 40% while average pricing declined by 12%. Over the last few years, more and more organizations have allowed employees to bring their own devices to work (BYOD), which has changed the market landscape: there is no longer a clear differentiation between the enterprise and consumer market segments, but rather an overlap. Another strategic group with fierce competition is the middleware area – software that connects disparate mobile applications, programs and systems (Fig. 5).

To keep costs down, many competitors buy standard components, often from the same suppliers, which increases their vulnerability to any change in those relationships or to new alliances. Most vendors can imitate new features relatively quickly, and there is currently little differentiation among the devices on offer. Apple’s introduction of the iPhone in 2007 is a good example: today almost all vendors, including BlackBerry, offer touch screens. The risk of substitution is high, as voice and data services can be used on several types of devices, such as tablets, smartphones or laptops. Given the continuing difficult economic situation, end users are price sensitive, leading to high cross-price elasticity. Since handsets are sold through resellers who add data services on top of voice, the relatively small number of resellers in a given geographic market gives them strong bargaining power when negotiating pricing. Another important aspect of the industry is the need to complement hardware devices with a software offering: without a large number of applications, it is almost impossible to sell products to the consumer segment. This has led to a consolidation of operating systems – Google’s Android and Microsoft’s Windows Phone OS are used on several vendors’ devices across the market, while only two run on proprietary handsets: Apple’s iOS and BlackBerry’s QNX. The ability to attract good software development companies is directly linked to the number of devices sold, which currently positions BlackBerry unfavorably (see figures 3 and 4).

The industry is very dynamic and constantly changing – clearly an example of Schumpeterian rents, where timing and adoption are critical to success. BlackBerry lost its dominant position when it failed to recognize the threat from Apple’s iPhone in 2007. Its inability to respond quickly – delivery of the new generation of devices (the Z10) missed deadlines by as much as 18 months – only deepened the crisis. The key factors driving the industry over the next few years will be security and privacy protection (driven by the NSA scandal), data consumption and social networking shifting to mobile devices due to changing demographics, and feature-rich operating systems supporting a large number of applications (figure 6).


Team to Extreme

A week ago, I had the privilege of being one of the speakers at the Project World/Business Analyst World 2013 conference in Vancouver. It was an advanced topic targeting mainly project managers, but others who lead teams of any type hopefully benefited as well. During my presentation I introduced a formula for team performance optimization and talked about the factors that influence performance and what can be done to address them. Here are the slides from my presentation.

 

Decision time – SharePoint – On-Premises, Hosted or Office 365?

In the middle of 2011, Microsoft introduced Office 365 – the way organizations will use the MS Office suite in the future, or at least so Microsoft hopes. Competition in cloud-based office productivity tools is getting quite stiff, and Microsoft’s new flagship product is key to its survival. Office 365, depending on the license selected, includes the MS Office Plus suite, SharePoint, Outlook and Lync (an instant communication solution). With significantly reduced needs for maintenance, infrastructure and security, one would expect it to be an ideal solution, particularly for small and medium organizations.

Not so, according to Richard Harbridge. Richard did an excellent job of painstakingly collecting data and developing a model that compares the costs of implementing the SharePoint portion of the suite across on-premises, hosted and Office 365 installations. The model’s results suggest that user volume and licensing heavily influence Office 365’s competitiveness; in some cases Office 365 is outright more expensive than an on-premises installation, even though its implementation and maintenance costs are lower. The Office 365 business model clearly targets small businesses with lower user counts.

The research looked at several cost categories, spread over 1 to 5 years:

  • Professional services
  • Infrastructure or Hosting
  • Administration team
  • Licensing

For comparison, several typical configurations were taken into account:

  • Single server (web server and MS SQL server on a single virtual machine)
  • Small farm standard (two servers: separate web and MS SQL server installations)
  • Small farm with high availability (2 web servers and 2 MS SQL servers)
  • Medium farm (1 web server, 1 application/indexing server, 1 MS SQL server)
  • Medium farm with high availability (2 web servers, 2 application/indexing servers, 2 SQL servers)

As the model’s results show, the number of users (which translates into licensing costs) plays a significant role in the cost structure, favoring organizations with 100 or fewer employees. This makes me think that Microsoft is purposely trying to keep medium and larger organizations away, due to the limitations of its current infrastructure and support. If so, Microsoft will soon have to address this, as Google, with Google Docs, is well on the way to taking over the market. Overall, cloud versions of organizational productivity software seem to be the right direction, and we should see more and more vendors entering this space.
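
To make the shape of such a comparison concrete, below is a minimal sketch of this kind of cost model in Python. All dollar figures are invented placeholders, not Richard’s actual data; the point is only that subscription costs scale linearly with user count while on-premises costs are largely fixed:

    # Hypothetical cost model contrasting Office 365 with an on-premises farm.
    # Every figure below is an invented placeholder for illustration only.

    def office365_cost(users, years, per_user_per_month=8.0):
        """Subscription cost scales linearly with the number of users."""
        return users * per_user_per_month * 12 * years

    def on_premises_cost(users, years,
                         professional_services=25000,  # one-time setup
                         infrastructure=15000,         # servers, virtualization
                         admin_per_year=30000,         # part-time administrator
                         license_base=7000,            # server licensing
                         cal_per_user=95):             # client access license
        """Mostly fixed costs, plus per-user client access licenses."""
        return (professional_services + infrastructure + license_base
                + admin_per_year * years + cal_per_user * users)

    for users in (50, 100, 500, 1000):
        o365 = office365_cost(users, years=3)
        onprem = on_premises_cost(users, years=3)
        print(f"{users:>5} users over 3 years: "
              f"Office 365 ${o365:,.0f} vs on-premises ${onprem:,.0f}")

With these made-up numbers the subscription wins at small user counts and the on-premises farm catches up somewhere around a thousand users, which mirrors the pattern in Richard’s data.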

To see Richard’s research, visit his blog: http://www.rharbridge.com/?p=818

 

Big Data, Data Warehousing and Data Mining

Michael Koploy of Software Advice recently posed a question about plain definitions of some basic Business Intelligence concepts: Big Data, Data Warehousing and Data Mining. Although the question seems simple, it is thought provoking given the changes BI has been going through over the last year or two. New developments in this area force us to look at these concepts again. Here is my view on the three topics:

Big Data

Simple definition:

The concept of big data is not new, although it has gained popularity in recent years. It describes all the data available to an organization, structured and unstructured. It is characterized by large volume, variety and velocity, which makes it challenging to analyze. Until recently, organizations tended to limit the amount of information, putting brakes and structure on it through governance and architecture. Too much information was considered a bad thing, given systems’ limited capacity and limited capabilities for processing it.

How it is changing:

The old saying ‘garbage in, garbage out’ is no longer entirely true. Organizations have realized that among the garbage there may be a lot of valuable information that could be monetized – directly or indirectly – and used not only to generate revenue but also to gain competitive advantage. The value of information might not be correctly estimated at the time of its creation or during its initial intended use. Value is often defined by context – to paraphrase, “value is in the eye of the beholder” – and it is also time variant. Traditional BI dealt primarily with structured data, as it was easier to work with and produced quick results. The rest was mostly ignored or treated as a necessary evil. The problem, however, is that unstructured data constitutes around 80 to 85% of the data within an organization, or floating out there on the web, and it can be related in one way or another to the business. Social networks like Facebook and Twitter, blogs, discussions, memos, emails and so on are all sources of potentially useful information. Winners are separated from losers by the ability to see value where others do not, and the ability to use it.

 

Data Warehousing

Simple definition

Traditionally, data warehousing is the process of consolidating and aggregating information from various sources within the organization for historical analysis and reporting. The outputs of the analysis are used for operational, tactical or strategic planning. Before the data can be used for these purposes, however, it has to go through a process of cleanup, standardization, normalization, integration and so on. Once stored in the Data Warehouse, it can be aggregated and correlated to answer typical business questions.
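
To make the cleanup-and-aggregate pipeline concrete, here is a minimal sketch in Python with pandas. The two source extracts, the column names and the values are all hypothetical:

    import pandas as pd

    # Toy ETL pass: order extracts from two hypothetical source systems are
    # cleaned and standardized before loading into a warehouse fact table.
    crm = pd.DataFrame({"cust": ["Acme ", "beta corp"],
                        "amount": ["1,200", "300"],
                        "date": ["2013-01-15", "2013-02-20"]})
    erp = pd.DataFrame({"cust": ["ACME", "Gamma Ltd"],
                        "amount": ["450", "980"],
                        "date": ["2013-03-02", "2013-03-18"]})

    def standardize(df):
        df = df.copy()
        df["cust"] = df["cust"].str.strip().str.title()                # cleanup
        df["amount"] = df["amount"].str.replace(",", "").astype(float)
        df["date"] = pd.to_datetime(df["date"])                        # normalize
        return df

    facts = pd.concat([standardize(crm), standardize(erp)], ignore_index=True)

    # A typical warehouse-style question: revenue per customer per quarter.
    print(facts.groupby(["cust", facts["date"].dt.to_period("Q")])["amount"].sum())

Note how “Acme ” and “ACME” become one customer only after standardization; skipping that step would silently split the aggregates.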

How it is changing

Once data is in the Data Warehouse it becomes relatively non-volatile and time variant, representing the subject-oriented historical value of the data. Herein lies the problem in the new world: the process of standardizing and structuring the data often strips away its most valuable part – the intrinsic relationships between data points, which might not be visible at the time the structuring rules are established. Data Warehouses are usually created with specific goals, and those goals can change relatively quickly. Adjusting a Data Warehouse to fit new goals can be as painful as turning a large ship in a narrow fjord. In the light of Big Data, the whole concept will have to be reevaluated.

 

Data Mining

Simple Definition

In short, it is the discovery of the true meaning of data in large datasets that integrate structured and unstructured data. These datasets might come from data warehouses or from any other data source. Data mining helps answer specific business questions that might be unique and might not have predefined processing paths.

How it is changing

Data mining builds on the available data and is thus closely related to the two terms discussed above. Since those terms are changing, so is the data mining concept. Organizations need to employ innovative techniques – statistical tools, semantic analysis, neural networks, artificial intelligence and so on – to extract information from a combination of structured and unstructured data in order to gain knowledge. This single step is what separates the wheat from the chaff, winners from losers – it is the ‘holy grail’ of Business Intelligence.
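
As one small example of such a technique, the sketch below clusters free-text support tickets with TF-IDF and k-means using scikit-learn. The ticket texts are invented purely for illustration:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    # Unstructured input: hypothetical free-text support tickets.
    tickets = [
        "phone battery drains overnight",
        "cannot sync email after the update",
        "battery swells and overheats",
        "email attachments fail to download",
    ]

    # Turn text into numeric features, then group similar tickets together.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(tickets)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

    for label, text in sorted(zip(labels, tickets)):
        print(label, text)   # battery issues and email issues form two clusters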

Majority prefers ‘big data’ on premises rather than in the cloud

According to a recent AIIM survey, ‘big data’ adoption is set to double to 17% over the next 12 months, and penetration is expected to increase further to about 60% within the next 3 years. The survey confirms an old truth – the need for a holistic view of the data: over 61% of respondents would like to see integrated information coming from both structured and unstructured sources. Classification of unstructured data seems to be an ongoing problem, with over 70% of organizations finding it easier to find information on the web than on their own internal networks. Although search techniques and tools have improved over the years, adoption of new technologies appears quite slow. Another factor playing a large role here is poor data governance. With regard to analysis of the data, the requirements do not seem very sophisticated, indicating that organizations still struggle with a strategy for using ‘big data’ effectively. Most respondents would be satisfied simply with basic pattern analysis, keyword correlation, incident prediction and fraud prevention. This seems to be confirmed by the lack of an answer to an important question: when asked about a ‘killer application’ for their business area, over 88% of respondents said it would make a big difference to their business, but when asked what it would be, the majority declined to answer.

Another interesting fact from the report is that most respondents seem to confuse search with data analytics. Although the two overlap, the former is about returning results that match selection criteria, while the latter is about processing data to answer specific business questions.

Lastly, there is not-so-good news for cloud vendors: over 88% of respondents would prefer on-premises big data storage and analysis over SaaS solutions. This seems to be related to the perception of poor data protection in externally hosted applications (although only 64% of respondents explicitly stated this). The majority consider business insights to be the organization’s intellectual property. Cloud providers will have to work harder to convince the market, as the data security question will continue to be the primary barrier to cloud adoption.

SharePoint and Information Security

An interesting survey on SharePoint security was recently published by Cryptzone. The results are evidence of the need for, and importance of, information management governance and proper, upfront design of information systems. It appears that in most organizations the responsibility for assigning access rights to SharePoint documents still belongs to IT administrators, as indicated by 69% of respondents. At least this segment of users knew who was in charge, in contrast to the 22% who did not even know who managed access. The problem with ceding responsibility for content protection entirely to IT is that IT’s primary focus is the maintenance and configuration of the technical infrastructure, with limited knowledge and understanding of the content and its specific protection needs. IT cannot and should not make decisions about how a particular type of information should be protected and who should have access to it.

So who should be responsible for making such decisions? The answer seems intuitive – the business – but 43% of respondents said they do not trust document authors to control who can read their documents. This would indicate that most users have a low level of awareness and understanding of security needs. That seems to be confirmed by another set of responses indicating that over 45% of users have copied sensitive or confidential information to unprotected USB memory sticks or emails. 55% of these respondents claimed the reason was the need to send information to users without access to SharePoint, with a further 43% needing it for working at home. Over 30% of users were more concerned with getting the work done than with security, and another 47% did not even think about security or did not care.

One of the contributing factors leading to taking documents out of SharePoint’s control was the need to share them with third parties – over 56% of respondents said their organizations did not have external portals to support collaboration outside the organization.

The bottom line is that this exposes organizations to risks, including legal risks and intellectual property theft. The proper solution is to give the matter some thought before SharePoint is rolled out, answering questions about how information will flow across the organization, how it will be accessed, how users will be segmented by their needs, and how it will be protected. This should lead to the development of information management governance that clearly describes roles and responsibilities across the organization and the ways information should be distributed and protected. Lastly, and most importantly, make users aware of security needs by training them on the policies and periodically reinforcing that knowledge.

Business Process Management key to successful implementation of information management

Business processes are an integral part of information management. In an organizational context they can be compared to the cardiovascular system of a living organism, with information as the blood and processes as the structure of veins and valves. As with the organism, inefficient circulation leads to poor performance and an inability to compete, which in the end can be fatal. Business processes can be defined as a set of related, structured activities and discrete tasks that move and enhance business information to achieve specific goals and objectives. They can be divided into three groups:

  • Management processes – governing the operations of the organization, often called ‘corporate governance’
  • Operational processes – the core business activities that generate value and revenue, such as manufacturing, purchasing, sales or marketing
  • Supporting processes – auxiliary activities that support the core operational processes, for example HR, accounting, information technology and support

Processes exhibit certain common characteristics:

  • Definition – a clearly defined scope, inputs and outputs
  • Sequencing – the activities can be sequenced and prioritized for execution
  • Benefactor – there is a specified recipient of the process outcome
  • Value – value is added while transforming or carrying the data
  • Inclusion – they exist in the context of the organization
  • Cross-functionality – a process often spans multiple functions within the organization

There are two concepts related to process management: Business Process Management (BPM) and Business Process Reengineering (BPR). Although both deal with process control and the flow of information, and share many characteristics, there is a significant difference between the two. Business Process Management is an ongoing initiative with a set of operational activities to capture, define, monitor and gradually improve processes for organizational benefit. BPM is often implemented bottom-up and introduces gentler change to the organization. Business Process Reengineering, on the other hand, is more project oriented, with a clearly defined end state and timeline, redesigning processes and transforming the organization. It is often implemented top-down and requires much stronger organizational change management on many fronts. BPR initiatives can create a lot of apprehension among workers, both because of the change in work habits they introduce and because a key success measure is often a reduction of the workforce.

Formalization, standardization and automation of business processes can bring several benefits to an organization:

  • Better utilization of organization’s workforce
  • Improved process speed
  • Reduction in number of errors
  • Cost reduction
  • Risk reduction
  • Improved customer service
  • Duplicate work reduction
  • Improved visibility of the processes and their efficiencies

Implementing formal business processes might require resolving several issues:

  • Staff resistance to change – new processes impact the way work is currently done and can raise fears of exposing inefficiencies, leading to workforce reductions or transfers to other departments
  • Implementation time is often lengthy, due to the need to discover, document and adjust hidden, informal processes
  • ‘Butterfly effect’ – small inaccuracies in identifying sub-processes can translate into larger problems down the value chain
  • Difficulty finding skilled resources to deliver
  • Insufficient funding – most organizations face budgetary constraints, while business process changes often require substantial commitments of time and money
  • Lack of management support – formalization and automation of business processes might not be at the top of management’s priority list.

Process automation can be categorized by the complexity of the implementation, and organizations can select one or more categories depending on their needs:

1. Routing

Routing is the simplest implementation of business process automation, addressing the ad-hoc needs of end users. Routes usually move information linearly from person to person, without integrating with the applications that generate or consume the information. They are often employed to notify users of waiting tasks and to monitor completion status. Users need to open and process the tasks manually. This type of solution gives limited ability, if any, to implement rules associated with the process.

2. Workflow

Workflow is a more sophisticated implementation of business process automation. Among other things, it allows processes to run not only serially but also in parallel, saving time and improving productivity. Processes can also have complex sets of rules, exceptions and conditions defined. Often a graphical user interface allows easy customization of the workflow. A useful feature of workflows is the ‘role’ concept, which allows tasks to be assigned to roles rather than to specific people. When a user is unavailable, a rule can assign the task to another person belonging to the same role. Completion of a task can trigger the next step in the process chain.
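
Here is a minimal sketch of the role concept, assuming a toy in-memory workflow engine; the role names, membership and rules are all invented:

    # Role-based task assignment: tasks go to roles, not named individuals.
    ROLES = {
        "reviewer": ["carol"],
        "approver": ["alice", "bob"],
    }
    UNAVAILABLE = {"alice"}   # e.g., on vacation; a rule reassigns her tasks

    def assign(task, role):
        """Route a task to the first available member of a role."""
        for person in ROLES[role]:
            if person not in UNAVAILABLE:
                return {"task": task, "assignee": person, "role": role}
        raise RuntimeError(f"no one available in role {role!r}")

    def on_complete(task):
        """Completing one task triggers the next step in the process chain."""
        next_step = {"review invoice": "approve invoice"}.get(task["task"])
        return assign(next_step, "approver") if next_step else None

    step1 = assign("review invoice", "reviewer")   # carol gets the review
    step2 = on_complete(step1)                     # bob approves; alice is away
    print(step1)
    print(step2)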

3. Business Process Management

Business Process Management extends this concept further, to the whole enterprise, crossing platforms, applications and repositories. It addresses the complexities of cross-departmental processes and allows for their standardization. Implementing this level of automation requires identification of core practices and detailed analysis of business rules and triggers. Flowcharting and process modeling are two techniques used for this purpose. Flowcharts are graphical representations of sequences of steps and decision branches. They are excellent tools for providing an implementation blueprint, and can also serve as communication and change management instruments. Process models, on the other hand, are more elaborate tools, adding intelligence, dependencies and levels to the process tasks. Simulation functionality allows potential bottlenecks, inefficiencies and loops to be identified and resolved. Integration and operational monitoring of the processes supports continuous improvement. Since BPM implementation is much more complex than the other two categories, it requires careful planning, change management and funding.
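
As a toy illustration of what simulation can reveal, the sketch below pushes requests through two sequential process steps and reports where the waiting time accumulates. The arrival rate and service times are invented:

    import random

    random.seed(1)
    arrivals = [i * 4.0 for i in range(200)]   # one request every 4 minutes

    def simulate(arrivals, mean_a=3.0, mean_b=5.0):
        """Two steps in sequence; step B is deliberately too slow."""
        free_a = free_b = wait_a = wait_b = 0.0
        for t in arrivals:
            start_a = max(t, free_a)                       # queue for step A
            wait_a += start_a - t
            free_a = start_a + random.expovariate(1 / mean_a)
            start_b = max(free_a, free_b)                  # queue for step B
            wait_b += start_b - free_a
            free_b = start_b + random.expovariate(1 / mean_b)
        n = len(arrivals)
        return wait_a / n, wait_b / n

    wa, wb = simulate(arrivals)
    print(f"average queue wait: step A {wa:.1f} min, step B {wb:.1f} min")
    # Step B's wait dwarfs step A's: that is the bottleneck to fix first.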

Master Data Management and Governance

Microsoft SharePoint 2007 and then 2010 triggered rapid adoption of collaboration and document management systems. Soon, many organizations painfully realized the importance of Information Governance. Without it, implementations quickly became digital landfills, replacing shared-drive problems rather than solving them. Departments often started building their own sites, with their own branding, cumbersome and unmanageable security structures, their own metadata, and poor or entirely missing taxonomies, leading to a mess in which users couldn’t find anything. Even worse, duplication of documents led to confusion, business decisions based on outdated data, exponentially increasing storage and backup costs, and deteriorating system performance. Worst of all, since information was not purged, or was purged randomly, organizations were exposed to e-Discovery-related legal risks and litigation costs.

To address these problems, organizations needed to develop a set of aligned governance constructs within an overall Information Governance Framework. Among those constructs are Information Security Governance, Information Architecture, Data Quality, Records and Retention, and Master and Reference Data, to mention just a few. I think the latter plays a very significant role and should be addressed early to get information under control.

So how can Master Data Management be defined? It is a set of processes, tools and organizational structures in which business and IT work together to address issues like uniformity, accuracy, stewardship, consistency and accountability of the organization’s data. This makes the data authoritative, secure, reliable and sustainable. But not all data should get the same level of attention. Master data is the ‘key’ data gathered and used by multiple departments in the operation of the business – for example, data about customers, products, employees or materials. Master data must contain the most accurate and authoritative data available, and serve as the single source of truth across the organization. Many organizations, however, find it difficult to secure the necessary funding and support from senior management, because the return on investment is difficult to measure.

Earlier this year, Gartner published predictions related to Master Data Management governance and its impact on organizations by the end of 2016:

  • Only 33% of organizations that initiate MDM will be able to demonstrate its value. The difficulty is that such an initiative must take a complete approach and be an ongoing process rather than a one-off, isolated project. That requires consensus among senior executives, which is often quite challenging to obtain.

  • Spending on information governance must increase fivefold to be successful – and, as per the point above, it needs to include the other disciplines within the Information Management Governance Framework, such as quality management, lifecycle and retention, and privacy and security. This will mean larger teams focused on governance, and higher costs.

  • 20% of CIOs in regulated industries will lose their jobs for failing to implement information governance. IM governance is the construct that allows an organization to comply with regulations, and the primary responsibility for it lies with the CIO and Legal Counsel. Breaches of information security, leaks of confidential information and violations of privacy will cause reputational and financial damage to those organizations.

The good news is that many organizations already recognize these risks: according to Gartner, spending on MDM increased 21% last year.

 

Office 365 offers entry point to the Cloud but with limitations

Office 365 is making steady progress in capturing the small and medium business market segments with its software-as-a-service office suite, and especially with the cloud-based version of SharePoint 2010. Adoption in larger enterprises is much slower, however. For many organizations, Office 365 is an excellent entry point into cloud services, allowing reduction of operational costs and physical storage requirements, and more optimal use of support resources. This all translates into a lower total cost of ownership, in addition to eliminating more intangible headaches and risks such as software updates and upgrades. However, quite a few organizations still have concerns about security, reliability, ownership of data and privacy, or simply do not know what to do with existing on-site installations and investments. Honestly speaking, with regard to security and reliability, cloud services are usually better in those areas than in-house operations at most organizations. Cloud companies like Amazon, Microsoft or Rackspace have whole teams dedicated to these subjects, monitoring servers 24/7. Ownership of the data should not be an issue either, since the data is not shared even in a multitenant environment (Microsoft offers two models – multitenant and dedicated; the latter might be an option for those who are especially protective of their information). Deciding what to do with existing SharePoint installations, and privacy, are valid concerns. In some jurisdictions (Canada is one of them, and so is the European Union), passing information that includes personal data of users or clients across borders is illegal. Recently Microsoft announced a cloud solution that secures and limits the boundaries of information transfer specifically to address government requirements, but so far this is limited to the US. Also, the SharePoint offering that is part of the Office 365 suite does not provide all the features of an on-site installation. Some of them:

  • Lack of FAST search solution
  • Lack of integration with Microsoft Information Rights Management
  • Lack of ability to index external databases from SharePoint search
  • Lack of Performance Point Services
  • Lack of support for external lists

So, for organizations that need more sophisticated configurations, this might not be the best option – at least for now.

There is, however, another possibility: companies that really want to move into the cloud could try a hybrid solution. Assuming such organizations have a good information architecture and defined business processes, they could partition data and processes so that critical information is handled by in-house installations while the rest is stored and processed in the cloud. Integrating the data might require building mash-up portals for end users, so it would take some good thinking before implementation, and solid governance in place. It is important, however, to understand the limitations of such a solution – for example, federated search across cloud and on-premises data will not work. The key success factor for such an implementation is a solid understanding of the business requirements and alignment with the organization’s overall long-term goals. There are quite a few benefits that cloud solutions bring, and Microsoft is working on closing some of the gaps.
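
A minimal sketch of the partitioning idea follows; the classification labels and endpoint URLs are invented, and a real implementation would hook into the organization’s information architecture and governance rules:

    # Hypothetical content router for a hybrid deployment: documents carrying
    # sensitive labels stay on-premises, everything else goes to the cloud.
    SENSITIVE_LABELS = {"personal-data", "financial", "legal-hold"}

    def choose_repository(doc):
        """Return the repository where a document should be stored."""
        if SENSITIVE_LABELS & set(doc.get("labels", [])):
            return "https://sharepoint.intranet.example.com"   # on-premises
        return "https://example.sharepoint.com"                # Office 365

    docs = [
        {"name": "employee_review.docx", "labels": ["personal-data"]},
        {"name": "marketing_flyer.pptx", "labels": ["public"]},
    ]
    for doc in docs:
        print(doc["name"], "->", choose_repository(doc))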

Text analytics and business intelligence

Text analytics has been getting more popular recently. Over the years it was perceived as a stepchild of business intelligence. I recently saw the results of a study indicating that most organizations that implemented business intelligence were still waiting to realize their ROI. I think the problem is that BI, in its current narrow definition of dealing primarily with structured data, gives only partial answers to business questions. After all, only 15 to 20% of the information organizations deal with is structured.

Interestingly, the concept of business intelligence was first introduced in the IBM Journal in the 1950s by Hans Peter Luhn in his article “A Business Intelligence System”. He defined it as an “automatic method to provide current awareness services to scientists and engineers” and the “interrelationships of presented facts in such way as to guide action towards desired goal”. Luhn did not refer selectively to structured data; in fact, part of his life was devoted to solving the information retrieval and storage problems faced by libraries and document and records centers. Even at IBM, in the 1950s computerized methods were still at a very early stage. Over the years, however, as computers became part of business life, data analysis took the path of least resistance – the exploration of structured data, which by its nature is fairly straightforward to compare, categorize and trend; data to which mathematical models can be applied. Thus, over time, structured data analysis became almost synonymous with business intelligence. Text analytics survived only in business domains such as market research or pharma.

Recently, however, text analytics has been experiencing a renaissance, and there are several reasons for this. One is national security: governments are spending billions of dollars on analytical tools that let them search ‘big data’ in the shortest possible time to identify threats. Another is that many organizations have realized they need to listen more to their customers – hence, in market research, disciplines like customer experience management, enterprise feedback management and voice of the customer in CRM are booming. A further driver is the change in the way we communicate, brought about by the latest social technologies and ‘big data’. The concept of ‘big data’ is somewhat misleading – storage cost and size are not the problem; many companies selling cloud services offer a few gigabytes here and there for free. The issue is not so much the size of the data as the size and degree of its ‘unstructuredness’. To make sense of the information stored, and to make use of it, organizations need methods, tools and processes to digest and analyze the data. In the last sentence I made a purposeful distinction between data and information: the former is a set of raw facts, while information is data put in context, creating specific meaning for the user. The speed of change in the way we communicate already makes the term ‘text analytics’ old-fashioned. We now talk about analysis of all types of unstructured data – not only text, but also voice messages, videos, drawings, pictures and other rich media.

So what is text analytics about? It is simply a set of techniques and models for turning text into data that can be analyzed further, as in traditional business intelligence, allowing organizations to respond to business problems. By generating semantics, text analytics provides the link between search and traditional business intelligence, turning data retrieval into an information delivery mechanism. The process discovers and presents the uncovered facts, business rules and relationships. Several analytical methods are employed, using statistical, linguistic and structural techniques. Here are a few examples:

  • Named entity recognition – identifying the names of people, organizations, locations, symbols and so on in textual sources
  • Similarity detection and disambiguation – using contextual clues to distinguish, for example, whether the word “bass” refers to the fish or the instrument
  • Pattern-based recognition – employing regular expressions, for example to identify and standardize phone numbers, emails or postal codes (see the sketch after this list)
  • Concept recognition – clustering data entities around defined ideas
  • Relationship recognition – finding associations between data entities
  • Co-reference recognition – multiple terms referring to the same object, which can be quite complex; in the examples below the pronoun refers to two different people:
    • Paul gave money to Stephen. He had nothing left.
    • Paul gave money to Stephen. He was rich.
  • Sentiment techniques – subjective analysis to discover the attitude in the source data: opinion, mood, emotion, sentiment
  • Quantitative analysis – extracting semantic or grammatical relationships between words to find meaning
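
As a small illustration of the pattern-based technique above, here is a sketch that spots and standardizes North American phone numbers with a deliberately simplified regular expression (not production-grade):

    import re

    # Simplified pattern: optional parentheses around the area code, then
    # three and four digits separated by dots, dashes or spaces.
    PHONE = re.compile(r"\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})")

    text = "Call 604.555.0199 or (416) 555-0142 before Friday."

    for match in PHONE.finditer(text):
        area, prefix, line = match.groups()
        print(match.group(0), "->", f"+1-{area}-{prefix}-{line}")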

As we can see, the difficulty of extracting information from unstructured data can be immense, although it is not an impossible task. It does, however, require a lot of commitment from the organization. Done properly, it can help address many of the problems enterprises face today: perceived information overload, poor information governance, and low-quality metadata that leads to poor findability and knowledge gaps, all of which hurt organizational productivity. As for information external to the organization – the ‘big data’ question of how to monetize social media will drive new technological and business solutions, and this seems to be one of the areas that will experience substantial growth. Using only ‘transactional’ business intelligence based on structured information is insufficient for organizations to get the full picture. The solution is rather an ‘integrated business intelligence’ that combines structured and unstructured data in answering business questions.