RBAC and RAG - Best Friends

Retrieval Augmented Generation (RAG) enhances the knowledge of large language models (LLMs) by providing additional context or information, improving response quality. Despite their impressive capabilities, LLMs have limitations, such as the inability to retain new information post-training and the tendency to produce incorrect answers on unfamiliar topics. To counter these limitations, proprietary, relevant, and updated data can be combined with prompts, thus grounding the LLM and leading to more accurate and user-friendly responses.

For details on RAG, check out our Search-Labs blog about RAG.
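
As a rough illustration of the grounding step described above, the sketch below combines retrieved passages with a user question into a single prompt. The function name, the prompt wording, and the sample passage are all assumptions made for this example; they are not taken from any particular library or from our documentation.

```python
def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages with the user's question into one prompt."""
    # Number each passage so the answer can be traced back to its source.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical passage standing in for a real search result.
prompt = build_grounded_prompt(
    "What is our work from home policy?",
    ["Employees may work remotely up to three days per week."],
)
print(prompt)
```

The passages would normally come from a search over your own data, which is exactly where access control matters: whatever is retrieved ends up in the prompt.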

Data Protection

Protecting private information should be a top priority for any company. Leaking sensitive information can lead to financial penalties, damage to reputation, loss of competitive advantage, and even harm to individuals when personal information is exposed. For these and many other reasons, it is essential to consider what data you are sending to an LLM.

Today, users have options when integrating an LLM into an application. The easiest way is to use a public LLM provider. While this removes all management concerns by simply connecting to an API, users must be mindful of how the data they send to one of these providers may be used. An LLM won’t retain information immediately, but that doesn’t mean prompts sent to the service can’t be recorded and used in future training iterations. With these public services, users should only send information they would be comfortable seeing used for future training, which could make that knowledge available to any other user.

Some services offer commercial plans, which can come with legal contracts prohibiting the LLM provider from retaining the data sent to its service or training on it. Hyperscalers provide options to deploy one of these generative LLMs to a customer’s tenant with the promise that customer data will be isolated. These options provide more comprehensive protection for data privacy than a public service. Still, users and companies must trust that the LLM provider will keep its promises.

Today's most secure way to integrate with an LLM is for a company to run one themselves. This should ensure no prompt information will be retained, and no data is sent externally without their knowledge. This added protection comes with added complexity and management responsibility. Companies must know how to deploy and scale an LLM. They must be able to monitor whether it continues to respond within the required response times. Running a model yourself doesn’t mean it will be more cost-effective, but it does move control back to the operator.

Regardless of the deployment type, grounding the model by augmenting the LLM’s knowledge remains equally important.

Types of data companies need to consider protecting

The following is not an exhaustive list, but it is worth pointing out a few types of data companies must consider when integrating with an LLM.

Internal vs. Public Data

As mentioned, every company has information that is not publicly accessible. Non-public data should stay non-public if the end application is to be used by internal employees. If the end users are external, care must be taken when deciding if the internal information is acceptable to be shared externally. Public data is already public, so sending it to an LLM should pose no additional risk.


Personally Identifiable Information (PII)

Personally Identifiable Information (PII) is the type of data that makes the news when a company loses control of it. This is generally information that is unique to an individual. While all PII should be protected, some items are more critical than others. Using a first name is often not an issue. But sending a first name, last name, and social security number to a public LLM isn’t a great idea.

Customer-Specific Data

Similar to PII in that it is unique to a customer, this data is often less sensitive. Examples include past order information, travel type preferences, and app settings.

Securing these and other data types is where role-based access control comes in.

RAG without RBAC

Let's start by looking at how a RAG app with only one user level of access works. If you are creating a chatbot that will run on a public website and answer documentation questions, it will likely be connected to only one dataset: your documentation. Having every user at the same access level in this setup is fine. Everyone can access the documentation, and anyone can see an answer.

But what if you are creating an internal chatbot with access to HR data where employees can ask questions instead? Some HR data should be available to everyone, but some will remain restricted to particular roles, such as Managers or HR staff.

Below, a user asks, "What is our work from home policy?"

[Figure: chatbot answer to the work from home policy question]

This is a general question any employee should have access to, so as long as they have access to the app, they can ask and get an answer. Note that we could implement RBAC here by requiring users to log in, but for simplicity, we will assume they can only access this chatbot when they are inside the work network.

Now, let’s assume only Managers should have access to information about employee compensation. What happens when a non-manager, such as an engineer, asks a question about compensation when RBAC is not implemented?

[Figure: chatbot answer to the engineer's compensation question, without RBAC]

The chatbot returns what looks like a helpful answer; however, remember, only Managers should have access to this detail!

Elastic RBAC features

Before we examine the answer to the compensation question with RBAC enabled, let’s briefly discuss Elasticsearch’s RBAC capabilities at a high level.

  • Cluster Level
    • The most general access level is the cluster level. Can a user or account log in to the cluster?
  • Index Level
    • Once you can log in, can the account read the data in the index, and can it write, modify, and delete the index?
  • Document Level
    • Within an index the account can read, which documents is it allowed to see?
  • Field Level
    • When a document is returned as part of a query, can the account view all the fields or only select fields?
  • Attribute
    • Use attributes to restrict access to documents in search queries and aggregations.

[Figure: A generalized representation of RBAC levels in Elasticsearch]
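
To make the index, document, and field levels more concrete, here is a minimal sketch of a role definition that uses all three. The body follows the shape accepted by the Elasticsearch role API (`PUT /_security/role/<name>`), but the role name `hr_limited`, the index name `hr-records`, the field names, and the filter query are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical role: the index, fields, and query are illustrative only.
hr_limited_role = {
    "indices": [
        {
            # Index level: read-only access to a single index.
            "names": ["hr-records"],
            "privileges": ["read"],
            # Document level: only documents matching this query are visible.
            "query": {"term": {"department": "engineering"}},
            # Field level: only these fields are returned to the role.
            "field_security": {"grant": ["employee_name", "job_title"]},
        }
    ]
}

# This would be the request body for: PUT /_security/role/hr_limited
print(json.dumps(hr_limited_role, indent=2))
```

With a role like this, a user could search the index but would only ever see the granted fields of the documents that match the filter.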

For a simple example demonstrating how RBAC affects the indices and documents users from different groups are allowed to query, check out the sample Jupyter notebook in the Search Labs Repo here.

Elasticsearch can integrate with external authentication systems such as Active Directory, LDAP, SAML, and more. External groups are mapped to internal Elasticsearch roles when integrated with these providers. This offers several advantages over managing data access at the application level. First, the mapping between external and internal roles only needs to be configured once. It only requires updates when new index patterns are created or when modifications to group access types are needed. Second, because access is managed at the group level within Elasticsearch, only group membership needs updating as users join or leave groups. Their access permissions automatically adjust to reflect their current group affiliations.

This second point is especially important. When access roles change, the permissions need to propagate to all systems in real time. RBAC ensures this change takes effect in one place rather than requiring access changes in multiple programs.
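
The idea of configuring the mapping once and letting group membership drive access can be modeled in a few lines. This is a simplified, in-memory sketch, not how Elasticsearch implements role mapping internally; the group DNs and role names are hypothetical.

```python
# Configured once: external directory group -> Elasticsearch role.
# Group DNs and role names are hypothetical.
GROUP_TO_ROLE = {
    "cn=employees,ou=groups,dc=example,dc=com": "hr_general_reader",
    "cn=managers,ou=groups,dc=example,dc=com": "hr_restricted_reader",
}


def resolve_roles(user_groups: set[str]) -> set[str]:
    """Return the roles a user holds, derived only from group membership."""
    return {role for group, role in GROUP_TO_ROLE.items() if group in user_groups}


engineer_groups = {"cn=employees,ou=groups,dc=example,dc=com"}
roles_before = resolve_roles(engineer_groups)

# The user becomes a manager: only directory membership changes,
# the mapping itself is untouched.
manager_groups = engineer_groups | {"cn=managers,ou=groups,dc=example,dc=com"}
roles_after = resolve_roles(manager_groups)

print(sorted(roles_before))
print(sorted(roles_after))
```

Nothing in the application changes when the user is promoted; the new role follows from the updated group membership.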

For a detailed look at how RBAC fits into a Search Center of Excellence, check out the excellent blog here.

Our documentation goes into more detail about each access level.


RAG with RBAC

Now that we understand the various access levels Elasticsearch can employ, let's return to our RAG examples. In the last example, our Slackbot gave our engineer a helpful answer to a compensation question. However, because this information should have been restricted to Managers, the bot should not have provided that detail!

We could configure restrictions on this data in many ways, but we will keep our RBAC example simple: the HR dataset has two access levels, and the data is split across two indices. One index will be hrdata-general, and one will be hrdata-restricted. Every employee will have access to hrdata-general, and only Managers will have access to hrdata-restricted. Users can query one or both indices based on the role mapping between the company’s LDAP settings and Elasticsearch users/roles.
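
One way this two-index setup could be expressed is with two read-only roles and two role mappings. The sketch below only builds the request bodies for the Elasticsearch role API (`PUT /_security/role/<name>`) and role mapping API (`PUT /_security/role_mapping/<name>`); it does not send them. The role names, mapping names, and LDAP group DNs are hypothetical, and it assumes Managers are also members of the general employees group.

```python
import json


def read_only_role(index_name: str) -> dict:
    """Body for PUT /_security/role/<role_name>: read access to one index."""
    return {"indices": [{"names": [index_name], "privileges": ["read"]}]}


def group_mapping(role_name: str, group_dn: str) -> dict:
    """Body for PUT /_security/role_mapping/<mapping_name>."""
    return {
        "roles": [role_name],
        "enabled": True,
        "rules": {"field": {"groups": group_dn}},
    }


# Role names, mapping names, and LDAP group DNs are hypothetical.
requests_to_send = {
    "/_security/role/hr_general_reader": read_only_role("hrdata-general"),
    "/_security/role/hr_restricted_reader": read_only_role("hrdata-restricted"),
    "/_security/role_mapping/all_employees": group_mapping(
        "hr_general_reader", "cn=employees,ou=groups,dc=example,dc=com"
    ),
    "/_security/role_mapping/managers": group_mapping(
        "hr_restricted_reader", "cn=managers,ou=groups,dc=example,dc=com"
    ),
}

for path, body in requests_to_send.items():
    print("PUT", path)
    print(json.dumps(body, indent=2))
```

Keeping one role per index makes the access model easy to audit: a user's visible data is the union of the indices granted by the roles their groups map to.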

When the Engineer asks about compensation again, this time with proper RBAC implemented, the chatbot does not provide the restricted information.

[Figure: chatbot response to the engineer's compensation question, with RBAC]

When a Manager logs into this chatbot and asks the same question, that user’s RBAC settings allow them access to the restricted HR dataset.

[Figure: chatbot answer to the Manager's compensation question]

This answer is correctly restricted to the Manager role, whereas without RBAC restriction, all employees could query and get responses from the restricted dataset.
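
The difference between the two users can be modeled with a small in-memory simulation. In a real deployment Elasticsearch enforces this itself when the search runs with the user's credentials; the sketch below only imitates that behavior so the effect on retrieval is easy to see. The role names, the sample documents, and the keyword matching are all made up for illustration.

```python
# In-memory stand-ins for the two indices; the documents are made up.
INDICES = {
    "hrdata-general": ["Work from home policy: remote work needs manager approval."],
    "hrdata-restricted": ["Compensation bands are reviewed once a year."],
}

# Hypothetical roles and the indices each one may read.
ROLE_INDICES = {
    "employee": {"hrdata-general"},
    "manager": {"hrdata-general", "hrdata-restricted"},
}


def retrieve(roles: set[str], keyword: str) -> list[str]:
    """Return matching passages, searching only indices the roles can read."""
    allowed = set().union(*(ROLE_INDICES.get(r, set()) for r in roles))
    return [
        doc
        for index in sorted(allowed)
        for doc in INDICES[index]
        if keyword.lower() in doc.lower()
    ]


engineer_hits = retrieve({"employee"}, "compensation")
manager_hits = retrieve({"manager"}, "compensation")

print(engineer_hits)
print(manager_hits)
```

Because the engineer's search never touches the restricted index, no restricted passage can reach the prompt, so the LLM has nothing sensitive to repeat.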

This example shows how index-level access helps secure which indices groups can access. However, as mentioned above, Elasticsearch provides many additional and more granular ways to secure your data. Look out for a follow-up to this blog, where we will discuss and provide code examples for some advanced RBAC configurations.


Conclusion

Integrating Retrieval Augmented Generation (RAG) with Role-Based Access Control (RBAC) in Elasticsearch offers a robust and secure solution for internal and external applications. RAG enhances the capabilities of large language models, while RBAC ensures that access to sensitive data is meticulously controlled, maintaining the integrity and confidentiality of your company's information. This combination is particularly critical in a production environment where data protection is paramount. As we've demonstrated, implementing RBAC in Elasticsearch is practical and straightforward, making it an ideal choice for any company looking to leverage the power of AI while ensuring data privacy.

We encourage you to explore this capability further and consider how it can be applied to your unique business needs. Remember, Search AI is not just about generating intelligent responses but also about protecting your valuable data assets.
