Monthly Archives: January 2012

Three things that annoy me in SharePoint

No doubt about it, SharePoint is a good tool when it comes to document management and collaboration. However, there are a couple of problems that still keep this product from being great. For example, when it comes to implementing taxonomy and search, there are at least three things that require workarounds.

1. Cannot delete custom content types.

Once you have created a content type, that’s it, you are done: you won’t be able to delete it. Sure, there is a link in Site Settings to delete the content type; the only problem is that SharePoint will not allow you to do it. Instead, you get a message that the content type is in use, even if you have made sure that it is no longer linked to anything. There are some blog posts showing how to work around this problem, but all of them require running action queries directly against the MS SQL content database. Obviously this can be done, but it is not really feasible for a production environment in most organizations. To avoid this issue, implementation teams need to make sure that the taxonomy is tight on paper, and then test it with a pilot before the production implementation.

2. Drop-off library works only with Document type items.

The drop-off library is a great concept: it lets you build a set of rules that automatically move documents to the corresponding libraries, based on their content type. Unfortunately, this works only with the Document type, or with your own custom types inherited from Document. So if your customers would like to use it for images or audio files, they will have to move the files manually to their target locations. This can become confusing: for one type they can use the drop-off library, for the others they cannot. So, when planning the implementation, consider this while aligning the end-user processes, and if you still decide to benefit from this functionality, make sure that the change management team gives it enough attention.
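The routing behaviour described above can be modelled in a few lines of Python. This is only a conceptual sketch, not the SharePoint API: the type hierarchy, the type names, the target library paths and the `route` function are all hypothetical, chosen to illustrate that only types inheriting from Document get routed.

```python
from typing import Optional

# Hypothetical content type hierarchy (child -> parent); names are illustrative only.
PARENT_TYPE = {
    "Item": None,
    "Document": "Item",
    "Invoice": "Document",
    "Contract": "Document",
    "Task": "Item",
}

# Hypothetical routing rules: content type -> target library.
ROUTING_RULES = {
    "Invoice": "Finance/Invoices",
    "Contract": "Legal/Contracts",
}


def inherits_from_document(content_type: str) -> bool:
    """Walk up the hierarchy and report whether Document is an ancestor (or the type itself)."""
    current: Optional[str] = content_type
    while current is not None:
        if current == "Document":
            return True
        current = PARENT_TYPE.get(current)
    return False


def route(content_type: str) -> Optional[str]:
    """Return the target library, or None when the item must be moved manually."""
    if not inherits_from_document(content_type):
        return None  # not a Document-derived type: no automatic routing
    return ROUTING_RULES.get(content_type)  # None if no rule matches
```

In this model, `route("Invoice")` yields a target library, while `route("Task")` yields `None`, which is exactly the inconsistency end users would have to learn to live with.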

3. Lack of native support for indexing of PDF files.

PDF has become the standard when a user wants to make a document portable, lightweight and read-only. Unfortunately, the SharePoint 2010 indexing service currently does not support this file type out of the box. There are a couple of add-ons that can be installed, but they vary in performance, quality and cost. I believe this is such an important feature that it should be part of the out-of-the-box installation.

These are small things, but they make life more difficult. Hopefully SharePoint 2012 will address them.

Classification or Search?

A couple of days ago, there was an interesting post by Michael Schrage in which he questioned the need for information classification in today’s (mostly electronic) world. I often hear the same opinion from people who rely primarily on MS Outlook for storing and searching their documents. Apart from the fact that it rubs IT administrators and records managers the wrong way, there is some merit in his way of thinking. People usually get what they want: the information can be found easily and is easily accessible.

But why is it like this, and is it applicable to all documents? First of all, we live in a world where information governance lies somewhere on a continuum between total ‘anarchy’, where all documents live unorganized in one place, and ‘tyranny’, where every document is classified and tracked from the moment it is created. One end of the spectrum could be considered the place for free-spirited, right-brain people, the other for left-brain bureaucrats, or ‘Type As’ as Schrage describes them. But reality lies somewhere in between; each of us leans to a smaller or larger degree toward one end of the spectrum or the other. My personal belief is that, for individuals as well as for organizations, to be really productive and creative we need to balance on the edge between chaos and tyranny. To Schrage’s point, people quite often waste their time classifying information that does not have to be classified. But then why do we classify in the first place? There are a couple of objectives. The first one is the most obvious: to find information easily, and this is what Schrage is referring to.

Not long ago, when documents existed only in physical form, people invented classification to locate and find information. A good example is the Dewey Decimal Classification system used in libraries. First you locate a book based on its class and subject; once you have found it, you use its index to find information within it. Electronic documents pushed the limits of such systems further, giving us new capabilities and opportunities to search.

In the case of my personal accounts with MS Outlook or Twitter, Schrage is right. The value of classifying my emails for the purpose of search is low. Outlook is pretty good and flexible, allowing me to locate the information I need fairly quickly. But why is it like this? Primarily because MS Outlook captures all the metadata describing the context of an email automatically, with me spending no time on it. Sender address, date sent, date received, subject, and content are all searchable. Additionally, the email threads functionality makes it easier to dig deeper into messages when needed. This works so well because I am intimately familiar with my emails, and can easily recollect the information and associate it with its context. But it is not going to be the same if I inherit a mailbox from someone else. Although search might help with narrowing the results, I will need more to figure out what a message is about, and whether it corresponds to what I am looking for. So, as per Schrage’s point, this does work for my personal productivity, but it will not help in an organization where I have to collaborate.

So, although I agree that classification is not needed here, and as a matter of fact could even be restrictive, the key to success is the metadata describing the content. In the case of Outlook, as I already mentioned, some of it is captured automatically. In other cases, however, the metadata needs to be added to keep the context with the content. This could be done manually, but that is what most people perceive as a ‘waste’ activity. It could be done automatically, and to some degree that is possible, as with MS Office documents. However, there will still be some metadata that only the author can decide, as it corresponds to his or her intentions. Additionally, the metadata itself may need its own classification or hierarchy to be meaningful.

So search and findability are one objective of classification. Another one, especially important in the case of organizations, is records classification. Records should be kept for the periods of time prescribed in retention schedules, which are usually based on document type classification. So here classification is not going to disappear.
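To make the dependency between retention and classification concrete, here is a minimal Python sketch. The record types, the retention periods and the `disposal_date` function are hypothetical and purely illustrative; a real retention schedule would come from the organization’s legal and records management teams.

```python
from datetime import date

# Hypothetical retention schedule: record type -> number of years to keep.
# The values are made up for illustration only.
RETENTION_YEARS = {
    "Invoice": 7,
    "Contract": 10,
    "Meeting minutes": 3,
}


def disposal_date(record_type: str, closed_on: date) -> date:
    """Return the earliest date on which a record of this type may be disposed of.

    Raises KeyError for an unclassified record type: without a classification
    there is no retention period to apply.
    """
    years = RETENTION_YEARS[record_type]
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # 29 February mapped onto a non-leap year: fall back to 28 February.
        return closed_on.replace(year=closed_on.year + years, day=28)
```

A record whose type is not in the schedule cannot be given a disposal date at all, which is why search alone does not replace classification for records.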

In summary, I agree that the importance of classification will diminish as technology evolves. Automatic classification will definitely help, but it is not there yet. As artificial intelligence tools become more truly ‘intelligent’ and the capability of systems to analyze the content of data increases, the need for manual classification will shrink. But the real purpose behind the scenes will remain: the accuracy and completeness of the metadata. Tools like Google Search or SharePoint 2010 with the FAST search engine are on the right track to narrow the search scope and to mine the results. The ability to use enterprise keywords, together with good search analytics, will help with findability. The need for classification will not disappear, but it will become of limited importance to most users.

Legal, statutory and regulatory foundation for Information Management programs

Any successful implementation of an information management solution requires establishing a proper IM framework. Such a framework helps with forming governance, setting priorities and defining constraints, and gives overall direction to any future information programs.

The foundation of such a framework is based on existing legal, statutory and regulatory requirements. Establishing this basis, especially in larger organizations, is not an easy task and requires the involvement of several parties. I made an attempt to capture some of the laws, standards and regulations used in the US and in Canada. This list is far from exhaustive; every organization, depending on its type of business, will have to establish its own baseline, which will include specific industry regulations.

United States:

Law, Statute, Regulation | Short Description
Sarbanes-Oxley (SOX) 404 and 409 – Corporate and Auditing Accountability and Responsibility Act | SOX deals with monitoring the creation and management of financial records, as well as disclosing information about changes in the financial condition or operations of the organization. It primarily affects publicly traded companies, including accounting and security firms, auditors and brokers.
Health Insurance Portability and Accountability Act (HIPAA) | HIPAA refers to the protection of individually identifiable health information. It requires organizations handling such personal information to notify patients about their privacy policies. Organizations affected by this act include health plans and health care providers.
Children’s Online Privacy Protection Act (COPPA) | COPPA requires that online content providers working with audiences that include children use reasonable procedures to ensure that the child’s parent is included in the process.
Department of Defense 5015.2 (DoD 5015.2) | DoD 5015.2 identifies requirements, based on operational, legal and legislative needs, that vendors of records management solutions must fulfill. It affects software vendors of electronic document and records management systems. Several government offices in the US require compliance with this standard, and some other, larger organizations implementing information management systems often use it during the selection process. For this reason, the standard is often used outside of the US as well.
Securities Exchange Act (SEC Rules 17a-3 and 17a-4) | These SEC rules outline requirements for data retention, classification, and accessibility for organizations involved in trading financial securities.
Gramm-Leach-Bliley Act | The act regulates the handling and sharing of personal information, and the disclosure of privacy policies to consumers. It primarily affects financial services organizations.
IRS Rev. Proc. 97-22 | This guideline includes directives for taxpayers on maintaining financial books and records using software applications.
Electronic Signatures in Global and National Commerce Act (ESIGN) | This act regulates the use of electronic records and signatures in commercial transactions.
Fair and Accurate Credit Transactions Act (FACTA) | It allows consumers to request and obtain a free credit report every 12 months. It also contains provisions to reduce identity theft and to ensure secure disposal of consumer information. Financial institutions are mainly affected by this act.
Fair Credit Reporting Act (FCRA) | FCRA regulates the collection, distribution, and use of consumer information, including credit information. It affects consumer credit reporting organizations.
Freedom of Information Act (FOIA) | It guarantees full or partial access to previously unreleased information and documents controlled by the US government.
Government Paperwork Elimination Act (GPEA) | This act requires federal agencies, where practicable, to use electronic forms, filing and signatures to conduct official business.
Occupational Safety and Health Act (OSHA) | OSHA governs occupational health and safety in the private sector and federal government.
Uniform Electronic Transactions Act (UETA) | The purpose of this act is to harmonize the differing state laws on the retention of paper records and the validity of electronic signatures. It supports the validity of electronic contracts.



Canada:

Law, Statute, Regulation | Short Description
Personal Information Protection and Electronic Documents Act (PIPEDA) | It governs how private companies collect, use and disclose personal information in the course of conducting business.
Secure Electronic Signature Regulations (SOR/2005-30) | These regulations stipulate how digital signatures are created and verified. They are related to the Canada Evidence Act, which deals with the integrity and validity of electronic documents.
Access to Information Act | Regulates full or partial access to previously unreleased information and documents controlled by the Canadian government.
Privacy Act | This act stipulates how the federal government must deal with personal information.
Limitations Act | The Limitations Act defines the period of time during which legal proceedings may be initiated, and thus influences the definition of retention periods.
Ontario Bill 198 | It provides regulations for securities issued in the province of Ontario. It roughly corresponds to Sarbanes-Oxley in the US.
Microfilm and Electronic Images as Documentary Evidence Standard | This standard deals with microfilming and electronic image capture. It also describes the process of establishing a program that helps ensure document integrity, reliability and authenticity.
Electronic Records as Documentary Evidence Standard | This standard provides provisions to ensure that electronic information is trustworthy, reliable and authentic.


It is important to remember that the process of establishing such a baseline requires deep involvement of the legal department and several business subject matter experts. Since laws and regulations change from time to time, the organization should appoint a steward responsible for maintaining the framework, and establish a governance model describing what to do when such laws or regulations change.