Business leaders want to extract more value and data out of growing volumes of content. This habit looks at how to intelligently mine, enrich, and use content to benefit your organization.

Leverage the Data in Content

Unstructured Content as an Untapped Resource

Every organization has two types of content:



Structured content (loosely termed “data”) found in ERP, CRM, line-of-business applications, and other core systems. 



Unstructured content (loosely termed “documents”) that exists outside core systems and contains information critical to business operations and decision-making. Examples include Word documents (proposals, contracts, project plans, service reports, etc.), PowerPoint presentations, Excel spreadsheets, PDF files, CAD drawings, e-mails, images, videos, and more.



Consider the following when building solutions to extract and amplify the value of unstructured content.


  • 80% to 90% of business information is in unstructured content. The volume and potential value of the information in unstructured content is huge—bigger, even, than structured content. Unlocking that value can deliver a substantial payback.


  • The challenge with unstructured content is exactly that: It’s unstructured. Unstructured content can be harder to retrieve, categorize, and analyze than structured content. But with the right tools, you can use these information sources efficiently and effectively.


  • Unstructured content often contains structured data (customer name, claim number, contract value, etc.). Using this data to give structure to unstructured content is the key to extracting value. And by acting on this data, you can proactively deliver content to improve productivity and processes.


  • Unstructured content is an important, and largely untapped, source of business intelligence. You can enhance strategic decision-making by combining the information and insights mined from unstructured content with the data stored in CRM, ERP, and other business systems.
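As a minimal sketch of how structured data can be pulled out of unstructured text, the example below uses regular expressions to find a customer name, claim number, and contract value in a short service report. The field labels, patterns, and sample text are assumptions made for illustration; real documents would need patterns (or an extraction tool) tuned to their actual layout.

```python
import re

# Hypothetical field patterns, chosen for this illustration only.
FIELD_PATTERNS = {
    "customer_name": re.compile(r"Customer:\s*(.+)"),
    "claim_number": re.compile(r"Claim\s*(?:No\.|Number):\s*([A-Z]{2}-\d{6})"),
    "contract_value": re.compile(r"Contract value:\s*\$([\d,]+(?:\.\d{2})?)"),
}


def extract_fields(text: str) -> dict:
    """Return the structured fields found in a block of free text."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1).strip()
    return fields


# Made-up sample document.
sample = """Service report
Customer: Acme Corp
Claim Number: CL-004217
Contract value: $125,000.00
The technician replaced the pump and tested the line."""

metadata = extract_fields(sample)
print(metadata)
```

The resulting dictionary is the kind of structure that can then be attached to the file as metadata and acted on by search, rules, or analytics.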


Why and How to Enrich Content with Metadata

Metadata is the key to unlocking and amplifying the value of unstructured content. Metadata allows you to effectively control, organize, track, and secure unstructured content as it moves through business processes, between systems, and across its lifecycle. 


Today, metadata extends well beyond the familiar file type, author, and date created to include custom metadata that gives unstructured content more meaning and context. For example, you might tag a sales agreement with customer name and close date, or attach policy number and status to the files associated with an insurance claim. 


Contextual metadata enables connections between pieces of content so related information can be retrieved and leveraged. It also allows you to link content with business processes and to ensure that all relevant information is available at the right time to support fast, effective decision-making.
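To make the idea concrete, here is a small sketch of content items that carry custom metadata, and of how a shared metadata value connects related files. The class name, property names, and sample values are hypothetical and do not represent any particular platform's content model.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A file plus its basic and custom metadata (illustrative model only)."""
    name: str
    file_type: str
    metadata: dict = field(default_factory=dict)


def group_by(items, key):
    """Group item names by the value of one metadata key, skipping items that lack it."""
    groups = defaultdict(list)
    for item in items:
        if key in item.metadata:
            groups[item.metadata[key]].append(item.name)
    return dict(groups)


# Made-up items mirroring the sales agreement and insurance claim examples.
items = [
    ContentItem("agreement.docx", "docx",
                {"customer_name": "Acme Corp", "close_date": "2018-03-31"}),
    ContentItem("claim-form.pdf", "pdf",
                {"policy_number": "P-1001", "status": "open"}),
    ContentItem("damage-photo.jpg", "jpg",
                {"policy_number": "P-1001", "status": "open"}),
]

related = group_by(items, "policy_number")
print(related)  # {'P-1001': ['claim-form.pdf', 'damage-photo.jpg']}
```

Because both claim files share a policy number, they can be retrieved together even though they are different file types stored in different places.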


Add Metadata Automatically
So why isn’t all this unstructured content already categorized with metadata? Common challenges include a rigid, hierarchical information architecture that’s complex to manage and hard to change, and manual processes that are time consuming and error prone. Asking busy knowledge workers to tag files with metadata almost always fails.


Automation is the only practical way to enrich content with consistent, accurate, high-quality metadata at enterprise scale. Technologies for automatically categorizing content with metadata include:


  • Content services platforms offer extraction tools and repository rules to automate metadata generation; the most flexible systems allow you to create custom metadata extractors. 
  • Intelligent capture solutions can classify and extract actionable data from files uploaded from mobile devices, scanning solutions, e-mail, and other inbound channels. 
  • Auto-classification engines are good for categorizing and tagging large volumes of unmanaged content, especially as part of a content migration process. 
  • Artificial intelligence (AI) / machine learning services can mine and tag content with new types of richer, more meaningful metadata—the summary of a document, for example, or the age and emotion of someone in a photograph.

5 Ways Enriched Content Can Lead to Business Transformation

Here are some of the most important ways you can benefit the business by extracting more data and value from your content.



  1. Improve the Findability of Information
    Employees spend nearly two hours a day searching for and gathering information, according to a McKinsey report. By enriching content with entities, summaries, and other metadata, you make it easier for users to find the information they need. You save people time and enable faster, more informed decision-making.

    Enhance the search experience with more relevant, comprehensive search results, and help users zero in on the right content with metadata-driven search filters.

    Help users discover content and facilitate knowledge-sharing by surfacing similar documents—those hidden gems that bring new information and insights to light. 
  2. Strengthen Information Governance
    All organizations must contend with an increasing number of regulatory requirements. Tagging content with a rich set of metadata gives you much greater control over its governance, use, and access—wherever that content lives. You can also reduce the unidentified risk in growing pools of dark data. 

    Satisfy regulatory obligations by identifying and tagging files with sensitive, compliance-related data (like personally identifiable information) so they can be properly secured and managed. 

    Enhance content security by using metadata-based access permissions to safeguard content as it moves through business processes and between enterprise systems and access points.
  3. Optimize Business Processes
    Embed content deeply into organizational operations. Metadata and configurable business rules allow you to control how content is routed through a business process. You can streamline processes across the enterprise by automating content delivery and initiating (or skipping) tasks based on the information contained in an invoice, contract, application, or any other piece of digital content.

    Expedite review-and-approval use cases (contract administration, loan origination, invoice processing, etc.) with fewer manual hand-offs and more automated decisions and straight-through processing.

    Facilitate case management use cases (insurance claims, fraud or legal investigations, FOIA/FOI requests, medical cases, etc.) by making it easier to organize, route, retrieve, and retain all content related to a specific case.
  4. Enhance the Customer Experience 
    You can transform the customer experience by integrating content services, process services, and cloud AI services. Acting on the knowledge gleaned from text analytics creates opportunities to understand, engage, and serve today’s digital-first customers in new, more compelling ways.

    The accompanying demo showcases how you can use sentiment analysis to identify dissatisfied customers, and then automatically route their feedback to a service agent for follow-up. The digital experience is driven by an intelligent, seamless information flow that brings together technologies from Alfresco, Amazon (AWS/Alexa), Twilio (speech-to-text conversion), and DeCooda (sentiment analysis).


  5. Use Content as a New Data Source for Analytics
    Natural language applications can rapidly “read” thousands of complex documents (financial reports, contracts, proposals, strategic plans, etc.) to extract and structure textual information for better analysis and processing. Mining large stores of content for insights, patterns, and relationships benefits a wide variety of use cases, including fraud detection, legal review, contract management, preventive maintenance, and medical and scientific research.
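The routing ideas in the list above (skipping approval tasks, sending negative feedback to a service agent, securing sensitive files) can be sketched as a few metadata-driven business rules. This is an illustration under stated assumptions, not a description of any product's rule engine: the rule names, metadata keys, action labels, and the 1,000 threshold are all hypothetical.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    description: str
    condition: Callable[[dict], bool]  # inspects a content item's metadata
    action: str                        # label of the step to run next


# Hypothetical rules; keys, thresholds, and actions are made up for illustration.
RULES = [
    Rule("Negative customer feedback goes to a service agent",
         lambda m: m.get("type") == "feedback" and m.get("sentiment") == "negative",
         "route_to_service_agent"),
    Rule("Small invoices skip manual approval",
         lambda m: m.get("type") == "invoice" and m.get("amount", 0) < 1000,
         "auto_approve"),
    Rule("Files containing PII are secured",
         lambda m: m.get("contains_pii", False),
         "restrict_access"),
]


def route(metadata: dict, default: str = "manual_review") -> str:
    """Return the action of the first matching rule, or the default action."""
    for rule in RULES:
        if rule.condition(metadata):
            return rule.action
    return default


print(route({"type": "invoice", "amount": 250}))
print(route({"type": "invoice", "amount": 5000}))
print(route({"type": "feedback", "sentiment": "negative"}))
```

Rules are evaluated in order and the first match wins, so the ordering of the list is itself a design decision; a production system would likely make priorities explicit.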
Use Case Snapshot: Detecting Fraud by Running AI Against 150 Million Documents
The regulatory organization that oversees U.S. brokerage firms uses sophisticated AI and natural language processing techniques to fight financial fraud. A joint customer of Alfresco and AWS, the organization runs internally developed text analytics against its cloud-based store of more than 150 million documents. Techniques like auto-classification, auto-summarization, and entity extraction help staff members detect and investigate potential cases of wrongdoing. According to the organization, “A document is just another way to store data, and more and more of our unstructured data are becoming structured through analytics.”
"By 2020, more than 50% of CIOs will have artificial intelligence as one of their top five investment priorities."


Enriching Content with Cloud Services

AI is a game-changer when it comes to unlocking the value of enterprise content. At the most basic level, AI and machine learning technologies can answer the question, “What’s in this content?” When applied at scale, they can uncover the insights, patterns, and relationships hidden in massive volumes of documents, images, videos, and other digital content.


Amazon Web Services (AWS), Google, and Microsoft all offer fully featured AI services that are accessed via APIs. They support a wide range of AI and machine learning techniques, including entity extraction, key phrase extraction, sentiment analysis, and more.




How to Integrate a Cloud AI Service with Alfresco
An open, extensible content services platform makes it easy to use cloud AI services to analyze and enrich your content. 


For content stored in a corporate data center, you simply write a metadata extractor that sends a file to the AI service, and then stores the resulting metadata—sentiment, summaries, entities, categories, relationships, etc.—in the repository alongside that file. You can even configure rules to trigger the analysis automatically, such as when a file is added to a folder. 
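The flow described above can be sketched in a few lines. Everything here is an assumption made for illustration: the `Repository` class is a tiny in-memory stand-in rather than the Alfresco API, and `stub_ai_service` replaces the real cloud call with a keyword check so the example runs offline.

```python
from typing import Callable


def stub_ai_service(text: str) -> dict:
    """Stand-in for a cloud AI call; a real extractor would send the file to the service's API."""
    negative_words = {"late", "broken", "refund", "unhappy"}
    words = {w.strip(".,!?").lower() for w in text.split()}
    sentiment = "negative" if words & negative_words else "neutral"
    return {"sentiment": sentiment, "word_count": len(text.split())}


class Repository:
    """Tiny in-memory repository: files and their metadata are stored side by side."""

    def __init__(self):
        self.files = {}     # path -> file text
        self.metadata = {}  # path -> metadata dict
        self.rules = {}     # folder -> extractor to run when a file is added there

    def add_rule(self, folder: str, extractor: Callable[[str], dict]) -> None:
        self.rules[folder] = extractor

    def add_file(self, folder: str, name: str, text: str) -> str:
        path = f"{folder}/{name}"
        self.files[path] = text
        self.metadata[path] = {}
        extractor = self.rules.get(folder)
        if extractor is not None:
            # The folder rule triggers the analysis automatically on upload.
            self.metadata[path].update(extractor(text))
        return path


repo = Repository()
repo.add_rule("/feedback", stub_ai_service)
path = repo.add_file("/feedback", "call-01.txt",
                     "The delivery was late and the box was broken.")
other = repo.add_file("/archive", "memo.txt", "Quarterly planning notes.")
print(repo.metadata[path])
print(repo.metadata[other])
```

Keeping the extractor behind a plain function boundary is what makes it straightforward to swap one AI service for another as technologies and business needs change.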


Mining unstructured content with AI technologies and tagging it with contextual AI insights opens up a world of possibilities as you pursue a digital transformation agenda. A content services platform with an open, extensible approach to leveraging third-party AI services gives you maximum flexibility as technologies advance and business needs change.


Can Your ECM System Keep Up?

Look for these capabilities to mine, enrich, and act on content with greater intelligence and effectiveness.


Enrich Content 


  • Flexible, future-proof content modeling with custom metadata that can be adapted to changing business needs
  • Auto-classification tools to automate content enrichment
  • Intelligent multi-channel information capture to classify and extract metadata from scanned documents, emails, and other files
  • Extensible metadata extractor framework to analyze and enrich content using cloud AI / machine learning services 
  • Cloud-native architecture with the elasticity and capacity to run large-scale text or image analytics against a cloud-based repository
  • Integrations to automate metadata mapping in line-of-business and productivity applications like SAP, Salesforce, and Microsoft Outlook 


Act on Enriched Content 


  • Advanced search and discovery capabilities, such as faceted search and smart folders, to capitalize on the findability of enriched content
  • Integrated process services, including configurable business rules and actions, to kick off and orchestrate content flows using metadata
  • Integrated governance services so that metadata can be leveraged for automated records management, eDiscovery, and defensible content disposal 
  • Dynamic access controls that are encapsulated in metadata so they can be enforced across systems, devices, and processes 
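As a rough illustration of the faceted search capability listed above, the sketch below counts documents per metadata value (the facets a user would see) and then narrows the result set with metadata filters. The document names and metadata keys are made up, and a real search engine would compute facets from an index rather than by scanning a list.

```python
from collections import Counter

# Hypothetical enriched documents; names and metadata values are illustrative.
documents = [
    {"name": "msa-acme.docx", "doc_type": "contract", "customer": "Acme Corp", "year": 2017},
    {"name": "sow-acme.docx", "doc_type": "contract", "customer": "Acme Corp", "year": 2018},
    {"name": "inv-0042.pdf", "doc_type": "invoice", "customer": "Acme Corp", "year": 2018},
    {"name": "msa-globex.docx", "doc_type": "contract", "customer": "Globex", "year": 2018},
]


def facet_counts(docs, facet):
    """Count how many documents carry each value of the given metadata key."""
    return dict(Counter(doc[facet] for doc in docs if facet in doc))


def apply_filters(docs, **filters):
    """Keep only the documents whose metadata matches every filter."""
    return [doc for doc in docs
            if all(doc.get(key) == value for key, value in filters.items())]


print(facet_counts(documents, "doc_type"))  # {'contract': 3, 'invoice': 1}
contracts_2018 = apply_filters(documents, doc_type="contract", year=2018)
print([doc["name"] for doc in contracts_2018])  # ['sow-acme.docx', 'msa-globex.docx']
```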


IDC: IDC Futurescape: WW Analytics, Cognitive/AI, and Big Data Predictions, doc #US41995816, Dec 2016

Economist: Special report The Data Deluge; cited in Alfresco whitepaper “Creating the Business Case and Realizing the Benefits of ECM in the Virtual Private Cloud”

HBR: A Survey of 3,000 Executives Reveals How Businesses Succeed with AI

Gartner: Report: How to Boost Artificial Intelligence with Content (and Vice Versa), published: 22 November 2017; Monica Basso, Karen A. Hobert, Michael Woodbridge
