By Kevin Shepherdson, Founder & CEO, Straits Interactive
"Collection of Data," the first trend in the "6Cs" framework for 2025, emphasises the strategic importance of harnessing internal knowledge. As small and medium-sized enterprises (SMEs) evolve beyond relying solely on general-purpose large language models (LLMs), the integration of proprietary knowledge bases powered by techniques like Retrieval-Augmented Generation (RAG) represents a major leap forward. This article explores how SMEs can responsibly and effectively unlock their internal data to thrive in an increasingly AI-driven workplace.
Retrieval-Augmented Generation (RAG) is transforming how organisations leverage internal data. RAG retrieves relevant passages from proprietary sources, such as internal documents and customer data, and supplies them to an LLM as context at query time, so SMEs can generate outputs that are highly contextualised and relevant to their specific needs. RAG has been extensively researched and promoted by organisations such as Hugging Face and Microsoft Research.
Example:
A professional services SME implemented a RAG-powered AI system to analyse client feedback and historical project reports. This allowed them to tailor proposals more effectively, resulting in a 25% increase in client acquisition.
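To make the retrieval step concrete, here is a minimal sketch in Python, not a production design. It assumes a simple bag-of-words cosine similarity in place of the embedding models a real RAG system would typically use, and it stops at building the prompt rather than calling any particular LLM. The function names and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents that share terms with the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Combine the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical internal snippets standing in for a real knowledge base.
DOCS = [
    "Client feedback 2023: customers asked for faster onboarding and clearer pricing.",
    "Project report: the audit engagement finished two weeks ahead of schedule.",
    "Office notice: the pantry will be closed on Friday.",
]

if __name__ == "__main__":
    print(build_prompt("What did clients say about pricing?", DOCS))
```

The resulting prompt would then be passed to whichever LLM the organisation uses. Only passages that share terms with the question reach the model, which is what keeps the answer grounded in internal data.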
With increased reliance on internal data, SMEs are becoming more aware of data privacy risks, such as the inadvertent inclusion of personally identifiable information (PII) in training datasets. Regulatory frameworks such as the GDPR and the PDPA emphasise the importance of privacy compliance.
Example:
A retail SME inadvertently included customer details, such as email addresses and phone numbers, in the training dataset for a customer service chatbot. This oversight led to privacy concerns, prompting the SME to implement stricter data governance measures.
Organisations will adopt stringent frameworks to ensure ethical data collection, processing, and storage. Robust governance, as highlighted by Deloitte, includes anonymising sensitive data and establishing clear policies for PII management.
Scenario:
A healthcare SME implements robust data governance measures to anonymise patient records while training its AI system for appointment scheduling, ensuring compliance with privacy regulations.
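As a rough illustration of what screening for PII can look like before data reaches an AI system, the sketch below masks email addresses and phone numbers in free text. It is an assumption-laden example: the two regular expressions are deliberately simple, will miss some formats and may over-match other digit sequences (dates, for instance), and masking alone does not amount to full anonymisation. The names `redact_pii`, `EMAIL_RE` and `PHONE_RE` are hypothetical.

```python
import re

# Simplistic illustrative patterns, not a complete PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# An optional "+", then at least eight digits, allowing spaces or hyphens between them.
PHONE_RE = re.compile(r"\+?\d[\d\s-]{6,}\d")


def redact_pii(text):
    """Replace email addresses and phone-like numbers with placeholder tags."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    record = "Contact jane.tan@example.com or +65 6123 4567 about order 42."
    print(redact_pii(record))
```

In practice a step like this would sit alongside manual review and clear PII policies rather than replace them.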
To overcome the limitations of internal data, SMEs will combine it with curated external datasets, enriching their AI systems for more balanced and accurate outputs. The World Economic Forum and McKinsey emphasise the value of combining internal knowledge with external sources.
Scenario:
A consulting SME supplements its internal policies and procedures with publicly available best practices, industry benchmarks, and tips. By incorporating these into its AI-powered training system, the SME ensures employees remain updated with the latest industry standards, improving client service delivery.
Leveraging internal knowledge bases allows SMEs to make faster and more informed decisions. For example:
1. Sales teams can use AI-powered insights to tailor pitches based on historical customer data.
2. Operations teams can analyse internal reports to optimise workflows.
Stringent data governance practices help SMEs:
1. Avoid legal penalties by complying with data privacy regulations like GDPR or PDPA.
2. Build trust with customers and stakeholders by demonstrating accountability.
Regular audits and diverse data sourcing ensure AI outputs are fair and representative, preventing discriminatory or inaccurate predictions.
SMEs often lack the advanced infrastructure needed to store, process, and secure internal data effectively.
Many SMEs struggle to implement and maintain robust data governance frameworks due to limited in-house expertise.
Combining internal and external datasets requires careful alignment to ensure consistency and compatibility.
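One small part of that alignment work can be sketched as follows: normalising a shared key so that internal and external records actually match, and surfacing the records that do not. This is a simplified example under stated assumptions; the field names (`sector`, `avg_deal`, `benchmark_deal`) and the sample rows are hypothetical, and real datasets usually need further checks on units, dates and definitions.

```python
def normalise_key(value):
    """Trim, lowercase and collapse whitespace so equivalent labels compare equal."""
    return " ".join(value.strip().lower().split())


def merge_on_key(internal, external, key):
    """Join internal rows with external rows that share the same normalised key.

    Returns (merged, unmatched): merged rows carry the extra external fields,
    while unmatched lists internal rows with no external counterpart.
    """
    ext_index = {normalise_key(row[key]): row for row in external}
    merged, unmatched = [], []
    for row in internal:
        match = ext_index.get(normalise_key(row[key]))
        if match is None:
            unmatched.append(row)
        else:
            extra = {k: v for k, v in match.items() if k != key}
            merged.append({**row, **extra})
    return merged, unmatched


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    internal = [
        {"sector": " Retail ", "avg_deal": 1200},
        {"sector": "Logistics", "avg_deal": 900},
    ]
    external = [{"sector": "retail", "benchmark_deal": 1500}]
    merged, unmatched = merge_on_key(internal, external, "sector")
    print(merged)
    print(unmatched)
```

Reporting the unmatched rows, rather than silently dropping them, makes gaps between the two sources visible before they affect AI outputs.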
Tip: Begin with one department’s data, such as sales or HR, to pilot AI applications.
Tip: Partner with data protection experts to audit and refine governance practices.
Tip: Start with publicly available datasets before investing in paid data sources.
Tip: Involve diverse teams in auditing to identify blind spots in data and AI outputs.
Tip: Use workshops or webinars to raise awareness about ethical data practices.
For SMEs, the ability to leverage internal knowledge represents a transformative opportunity to optimise operations, improve decision-making, and gain a competitive edge. However, success requires addressing key challenges such as data privacy risks and governance gaps.
By adopting robust data practices, integrating internal and external data, and empowering employees, SMEs can responsibly harness the power of their knowledge bases. As data becomes the foundation for AI-driven innovation, organisations that prioritise ethical and effective data collection will thrive in the competitive landscape of 2025 and beyond.
This foundational trend, "Collection of Data," sets the stage for the remaining "6Cs," highlighting the importance of internal knowledge as the cornerstone of generative AI’s transformative potential.
DPEX Network is a Community Initiative of Straits Interactive.
Copyright © Straits Interactive Pte Ltd. All Rights Reserved.