Manage Document Metadata
Managing Metadata
Some types of metadata have been used for years to identify, classify and manage documents in the legal environment. Yet even as electronic document exchange grows rapidly, and with it the awareness that most documents and files include hidden data, firm-wide understanding of metadata management as a real security concern still lags.
Risks of Metadata
At best, unintentional disclosure of confidential information can be awkward; at worst, it can raise the specter of malpractice. Common practices that expose metadata include:
Using duplicate-and-revise (Save As) to create new documents. When Microsoft Office documents are re-purposed, the original author information, document properties, document variables and last print date usually stay with the document. Hidden text is often forgotten and carried over. Most authors are not aware that much of this metadata can be seen by looking at the document properties or by opening the document using a text editor or metadata viewer.
Applying track changes as a collaboration tool. When a document has been reviewed using track changes, the marked edits remain with the document, even if they are not visible on screen, until those changes have been accepted or rejected. The track changes feature can be turned off and the markup can be hidden, but neither step eliminates the markings. Turning the display back on will reveal any revisions that have not been resolved and incorporated into the document.
Inserting comments to add a private note or annotation. As with track changes, comments created in Microsoft Office applications remain with a file unless deleted. Once comments are inserted, the comment display may be turned off. Any recipient of a document containing comments that are merely hidden can redisplay them easily. This may reveal confidential or potentially embarrassing information never intended to be viewed by anyone outside of the originating company.
Adding “identifier metadata” to your documents. Certain kinds of metadata can reveal the originator of a document because the information is unique to the user, the firm or both. Identifier metadata includes uniquely named styles, bookmarks, hidden document variables, and built-in and custom document properties. Identifier metadata, though not necessarily high risk, should be managed if the originator needs to remain anonymous or if the document creation strategy might be revealed by the metadata trail.
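To make these risks concrete, the following is a minimal sketch of how a reviewer might check a document for several of the items described above. It rests on assumptions not stated in this article: that the file is in the newer .docx format (a ZIP package of XML parts) rather than the older binary .doc format, and that the XML uses the namespace prefixes Word conventionally writes. The function name summarize_docx_metadata is hypothetical, and the pattern matching is a rough heuristic, not a substitute for a dedicated metadata viewer.

```python
import re
import zipfile

# Parts of a .docx package that commonly carry the metadata discussed above.
CORE_PROPS = "docProps/core.xml"      # author, last modified by, last printed
CUSTOM_PROPS = "docProps/custom.xml"  # custom document properties
MAIN_BODY = "word/document.xml"       # text, tracked changes, bookmarks
COMMENTS = "word/comments.xml"        # reviewer comments


def _read_part(package, name):
    """Return the text of one package part, or an empty string if it is absent."""
    try:
        return package.read(name).decode("utf-8", errors="replace")
    except KeyError:
        return ""


def _first(pattern, text):
    """Return the first captured group of a regex match, or None."""
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else None


def summarize_docx_metadata(path_or_file):
    """Report a few kinds of metadata found in a .docx file (heuristic)."""
    with zipfile.ZipFile(path_or_file) as package:
        core = _read_part(package, CORE_PROPS)
        custom = _read_part(package, CUSTOM_PROPS)
        body = _read_part(package, MAIN_BODY)
        comments = _read_part(package, COMMENTS)

    return {
        "author": _first(r"<dc:creator>(.*?)</dc:creator>", core),
        "last_modified_by": _first(
            r"<cp:lastModifiedBy>(.*?)</cp:lastModifiedBy>", core
        ),
        "last_printed": _first(r"<cp:lastPrinted>(.*?)</cp:lastPrinted>", core),
        # Unresolved tracked changes are stored as w:ins / w:del elements.
        "tracked_insertions": len(re.findall(r"<w:ins\b", body)),
        "tracked_deletions": len(re.findall(r"<w:del\b", body)),
        # Counts w:vanish markers; an approximation of hidden text runs.
        "hidden_text_markers": len(re.findall(r"<w:vanish\b", body)),
        "bookmarks": len(re.findall(r"<w:bookmarkStart\b", body)),
        "comments": len(re.findall(r"<w:comment\b", comments)),
        "custom_properties": re.findall(
            r'<property\b[^>]*\bname="([^"]*)"', custom
        ),
    }
```

Calling summarize_docx_metadata on a file path returns a dictionary that can be reviewed before the document leaves the firm. Nonzero counts for tracked changes or comments indicate content that hiding the display alone has not removed.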
Key Strategies for Metadata Control
As the legal community becomes increasingly aware of the damage that unintentional disclosure of document information can cause, the need to establish metadata control strategies and parameters becomes evident. Key strategies include:
Educating your firm about metadata concerns.
Attorneys and support staff who prepare documents should be made aware of which software features may embed metadata (e.g., track changes, comments, document properties), as well as the ramifications of using them. Much of the metadata inherited through the practice of re-purposing documents can be eliminated simply by starting from templates that contain minimal metadata.
Controlling and managing metadata with third-party metadata scrubbing and management software.
Microsoft provides a basic metadata removal tool for Microsoft Word. More powerful third-party applications not only scrub metadata but also allow firms to manage it at a very detailed level.
Establishing a firm-wide metadata scrubbing and management policy.
Implementing metadata-related policies and procedures can eliminate the need for individual users to decide what metadata gets scrubbed, resulting in a more efficient and standardized scrubbing process. Key users, especially attorneys, should be involved in any decisions about what is automatically removed and what is optional.
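One way to picture such a policy is as a simple table that records, for each metadata type, whether removal is automatic or left to the user. The sketch below is illustrative only: the names Rule, FIRM_POLICY and items_to_scrub are hypothetical, and the assignments shown are placeholders for whatever the firm's key users actually decide.

```python
from enum import Enum


class Rule(Enum):
    ALWAYS_REMOVE = "always_remove"  # scrubbed automatically, no user decision
    OPTIONAL = "optional"            # user may choose per document
    KEEP = "keep"                    # never scrubbed


# Hypothetical policy. The categories echo the metadata types discussed in
# this article; the rule assigned to each one is purely an example.
FIRM_POLICY = {
    "tracked_changes": Rule.ALWAYS_REMOVE,
    "comments": Rule.ALWAYS_REMOVE,
    "hidden_text": Rule.ALWAYS_REMOVE,
    "author_properties": Rule.OPTIONAL,
    "custom_properties": Rule.OPTIONAL,
    "bookmarks": Rule.KEEP,
}


def items_to_scrub(policy, user_choices=()):
    """Return the sorted metadata types to remove for one document.

    `user_choices` lists OPTIONAL types the user asked to remove.
    Requests for KEEP or unknown types are ignored.
    """
    requested = set(user_choices)
    selected = [
        name
        for name, rule in policy.items()
        if rule is Rule.ALWAYS_REMOVE
        or (rule is Rule.OPTIONAL and name in requested)
    ]
    return sorted(selected)
```

With a table like this, the automatic removals happen regardless of who sends the document, and users only make a decision about the optional items.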
Conclusion
Data about the data can be as important as the data itself, and in some cases even more so. Seriously consider availing yourself of these strategies.