RAG.DB

Configuration

All environment variables for the RAG DB platform

RAG DB is configured entirely through environment variables. This reference lists every variable, grouped by subsystem.
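
As a rough illustration of how an application might read these variables, the sketch below separates required variables from optional ones with defaults. The `get_setting` helper and `MissingConfigError` are hypothetical names introduced for this example; they are not part of RAG DB.

```python
import os
from typing import Mapping, Optional


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def get_setting(
    name: str,
    default: Optional[str] = None,
    *,
    required: bool = False,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the trimmed value of `name`, the default, or raise if required.

    Hypothetical helper: `env` defaults to the process environment and can
    be replaced with a plain dict for testing.
    """
    value = env.get(name)
    if value is None or value.strip() == "":
        if required:
            raise MissingConfigError(
                f"Required environment variable {name} is not set"
            )
        return default
    return value.strip()


# Example with an in-memory mapping instead of the real environment.
sample_env = {"KEYVAULT_NAME": "kv-example"}
keyvault = get_setting("KEYVAULT_NAME", required=True, env=sample_env)
log_level = get_setting("LOG_LEVEL", "INFO", env=sample_env)
```

Passing the environment as a mapping keeps the helper easy to test without touching the real process environment.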


Core API Config

| Variable | Required | Default | Description |
|---|---|---|---|
| DEV | No | false | Enable development mode (verbose logging, relaxed auth) |
| LOG_LEVEL | No | INFO | Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| AZURE_TENANT_ID | Yes |  | Azure AD tenant ID for managed identity and service principal auth |
| AZURE_CLIENT_ID | Yes |  | Client ID of the user-assigned managed identity or service principal |
| AZURE_CLIENT_SECRET | No |  | Client secret (only needed for local development; managed identity is used in production) |
| KEYVAULT_NAME | Yes |  | Name of the Azure Key Vault storing application secrets |
| SAS_TOKEN_TTL_DAYS | No | 7 | Validity period in days for generated user delegation SAS tokens |
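
The typed values behind these variables can be sketched as follows. This is a minimal illustration, not the platform's actual parser: the function name `parse_core_settings`, the set of strings accepted as "true" for `DEV`, and the case-insensitive handling of `LOG_LEVEL` are all assumptions.

```python
from datetime import timedelta
from typing import Any, Dict, Mapping

VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def parse_core_settings(env: Mapping[str, str]) -> Dict[str, Any]:
    """Convert the optional core variables into typed values (hypothetical)."""
    # Assumption: these strings count as "true"; the real parser may differ.
    dev = env.get("DEV", "false").strip().lower() in {"1", "true", "yes"}

    # Assumption: the level is matched case-insensitively.
    log_level = env.get("LOG_LEVEL", "INFO").strip().upper()
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"Unsupported LOG_LEVEL: {log_level}")

    ttl_days = int(env.get("SAS_TOKEN_TTL_DAYS", "7"))
    if ttl_days <= 0:
        raise ValueError("SAS_TOKEN_TTL_DAYS must be a positive integer")

    return {
        "dev": dev,
        "log_level": log_level,
        "sas_token_ttl": timedelta(days=ttl_days),
    }
```

With an empty environment, the function falls back to the documented defaults: development mode off, `INFO` logging, and a seven-day SAS token lifetime.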

Cosmos DB

| Variable | Required | Default | Description |
|---|---|---|---|
| COSMOSDB_ENDPOINT | Yes |  | Cosmos DB account endpoint URL |
| COSMOSDB_ACCOUNT_NAME | Yes |  | Cosmos DB account name |
| COSMOSDB_INDEXES_DATABASE | Yes |  | Database containing index data containers |
| COSMOSDB_INDEXES_METADATA_DATABASE | Yes |  | Database containing metadata containers |
| COSMOSDB_INDEXES_METADATA_CONTAINER | Yes |  | Container for index metadata documents |
| COSMOSDB_INDEXES_FILES_CONTAINER | Yes |  | Container for file tracking documents |
| COSMOSDB_INDEXES_RUNS_CONTAINER | Yes |  | Container for indexing run records |
| COSMOSDB_INDEX_NOTIFICATIONS_CONTAINER | Yes |  | Container for index change notifications |
| COSMOSDB_PREFERRED_LOCATIONS | No |  | Comma-separated list of preferred Azure regions for geo-replication reads |
| COSMOSDB_CONSISTENCY_LEVEL | No | Session | Consistency level: Strong, BoundedStaleness, Session, ConsistentPrefix, Eventual |
| COSMOSDB_RETRY_TOTAL | No | 9 | Maximum number of retries for transient Cosmos DB errors |
| COSMOSDB_RETRY_BACKOFF_MAX | No | 300 | Maximum backoff time in seconds between retries |
| COSMOSDB_CAPACITY_MODE | No | Serverless | Capacity mode: Serverless or Provisioned |
| COSMOSDB_MAX_THROUGHPUT | No | 10000 | Maximum throughput in RU/s (only applies when COSMOSDB_CAPACITY_MODE is Provisioned) |
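
The optional Cosmos DB variables interact in a few ways: the preferred locations are a comma-separated list, the consistency level must be one of five names, and the throughput ceiling is only meaningful in Provisioned mode. A minimal sketch of that logic, assuming a hypothetical `parse_cosmos_options` function (how RAG DB itself validates these values is not documented here):

```python
from typing import Any, Dict, Mapping

CONSISTENCY_LEVELS = {
    "Strong", "BoundedStaleness", "Session", "ConsistentPrefix", "Eventual",
}
CAPACITY_MODES = {"Serverless", "Provisioned"}


def parse_cosmos_options(env: Mapping[str, str]) -> Dict[str, Any]:
    """Parse the optional Cosmos DB variables (hypothetical helper)."""
    level = env.get("COSMOSDB_CONSISTENCY_LEVEL", "Session")
    if level not in CONSISTENCY_LEVELS:
        raise ValueError(f"Unknown consistency level: {level}")

    mode = env.get("COSMOSDB_CAPACITY_MODE", "Serverless")
    if mode not in CAPACITY_MODES:
        raise ValueError(f"Unknown capacity mode: {mode}")

    # The throughput ceiling is ignored unless the mode is Provisioned.
    max_throughput = None
    if mode == "Provisioned":
        max_throughput = int(env.get("COSMOSDB_MAX_THROUGHPUT", "10000"))

    # Split the comma-separated region list, dropping empty entries.
    raw_locations = env.get("COSMOSDB_PREFERRED_LOCATIONS", "")
    locations = [r.strip() for r in raw_locations.split(",") if r.strip()]

    return {
        "consistency_level": level,
        "capacity_mode": mode,
        "max_throughput": max_throughput,
        "preferred_locations": locations,
        "retry_total": int(env.get("COSMOSDB_RETRY_TOTAL", "9")),
        "retry_backoff_max": int(env.get("COSMOSDB_RETRY_BACKOFF_MAX", "300")),
    }
```

Returning `None` for the throughput in Serverless mode makes it explicit that a value set in the environment has no effect there.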

Service Bus & Dapr

| Variable | Required | Default | Description |
|---|---|---|---|
| SERVICEBUS_NAMESPACE | Yes |  | Azure Service Bus namespace (fully qualified: <name>.servicebus.windows.net) |
| SERVICEBUS_SUBSCRIPTION_ID | Yes |  | Azure subscription ID containing the Service Bus resource |
| SERVICEBUS_RESOURCE_GROUP | Yes |  | Resource group containing the Service Bus namespace |
| DAPR_EVENTS_DISPATCHER_COMPONENT_NAME | Yes |  | Dapr component name for the events dispatcher pub/sub binding |
| DAPR_EVENTS_DISPATCHER_QUEUE | No | sbq-events-dispatcher | Service Bus queue name for event dispatching |
| DAPR_INDEX_PROCESSOR_COMPONENT_NAME | Yes |  | Dapr component name for the index processor pub/sub binding |
| DAPR_SESSIONS_PER_INDEX | No |  | Number of concurrent Dapr sessions per index (controls parallelism) |
| DAPR_SESSION_IDLE_TIMEOUT_SEC | No | 360 | Idle timeout in seconds before a Dapr session is released |
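
Because `SERVICEBUS_NAMESPACE` must be fully qualified, a short name such as `contoso` is an easy mistake to make. The sketch below shows one way to catch it early. The function name is hypothetical, and the pattern is a deliberately loose check on the documented `<name>.servicebus.windows.net` form rather than a full implementation of Azure's naming rules.

```python
import re

# Loose check: a name made of letters, digits, and hyphens, followed by
# the suffix given in the table above.
_NAMESPACE_PATTERN = re.compile(
    r"[A-Za-z][A-Za-z0-9-]*\.servicebus\.windows\.net"
)


def validate_servicebus_namespace(value: str) -> str:
    """Return the trimmed namespace, or raise if it is not fully qualified."""
    candidate = value.strip()
    if not _NAMESPACE_PATTERN.fullmatch(candidate):
        raise ValueError(
            "SERVICEBUS_NAMESPACE must look like "
            "<name>.servicebus.windows.net, got: " + repr(candidate)
        )
    return candidate
```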

Azure OpenAI

| Variable | Required | Default | Description |
|---|---|---|---|
| AZURE_OPENAI_ENDPOINT | Yes |  | Azure OpenAI resource endpoint URL |
| AZURE_OPENAI_API_KEY | No |  | API key for Azure OpenAI (falls back to managed identity if not set) |
| AZURE_OPENAI_API_VERSION | No | 2024-07-01-preview | Azure OpenAI API version |
| EMBEDDING_DEPLOYMENT_NAME | Yes |  | Deployment name for the embedding model |
| OPENAI_DEPLOYMENT_NAME | Yes |  | Deployment name for the chat/completion model (used for query rewriting) |
| VECTOR_SIZE | No | 1536 | Dimension of embedding vectors |
| EMBEDDING_VECTOR_SIZE | No | 1536 | Alias for VECTOR_SIZE (either may be used) |
| AZURE_OPENAI_WHISPER_MODEL_NAME | No |  | Deployment name for the Whisper speech-to-text model |
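
Since `VECTOR_SIZE` and `EMBEDDING_VECTOR_SIZE` are aliases, it helps to be explicit about what happens when both are set. This reference does not state a precedence, so the sketch below makes its own assumption: either name is accepted, and conflicting values are rejected rather than silently resolved. `resolve_vector_size` is a hypothetical helper.

```python
from typing import Mapping


def resolve_vector_size(env: Mapping[str, str], default: int = 1536) -> int:
    """Resolve the embedding dimension from either alias.

    Assumption: conflicting values are an error. The platform's actual
    precedence between the two names is not documented here.
    """
    primary = env.get("VECTOR_SIZE")
    alias = env.get("EMBEDDING_VECTOR_SIZE")

    if primary and alias and int(primary) != int(alias):
        raise ValueError(
            "VECTOR_SIZE and EMBEDDING_VECTOR_SIZE are both set "
            "but disagree: " + primary + " vs " + alias
        )

    raw = primary or alias
    return int(raw) if raw else default
```

Whatever value is chosen, it needs to match the output dimension of the model behind `EMBEDDING_DEPLOYMENT_NAME`.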

Azure Cognitive Services

| Variable | Required | Default | Description |
|---|---|---|---|
| AZURE_DOCINTEL_ENDPOINT | No |  | Azure Document Intelligence endpoint URL |
| AZURE_DOCINTEL_KEY | No |  | API key for Azure Document Intelligence |
| AZURE_SPEECH_ENDPOINT | No |  | Azure Speech Services endpoint URL |
| AZURE_SPEECH_KEY | No |  | API key for Azure Speech Services |
| AZURE_SPEECH_REGION | No |  | Azure region for Speech Services (e.g., eastus) |

Index Processor Scaling

These variables control the container scaling and processing behavior of the index processor workers.

Container Resources

| Variable | Required | Default | Description |
|---|---|---|---|
| INDEX_PROCESSOR_CPU | No | 2 | CPU cores allocated per index processor replica |
| INDEX_PROCESSOR_MEMORY | No | 4Gi | Memory allocated per index processor replica |

Autoscaling (KEDA)

| Variable | Required | Default | Description |
|---|---|---|---|
| INDEX_PROCESSOR_MIN_REPLICAS | No | 0 | Minimum replica count (0 enables scale-to-zero) |
| INDEX_PROCESSOR_MAX_REPLICAS | No | 5 | Maximum replica count |
| INDEX_PROCESSOR_SCALE_OUT_MESSAGE_COUNT | No | 10 | Number of pending Service Bus messages that triggers a scale-out event |
| INDEX_PROCESSOR_POLLING_INTERVAL | No | 5 | Seconds between KEDA polling for queue depth |
| INDEX_PROCESSOR_COOLDOWN_PERIOD | No | 600 | Seconds to wait after last scale event before scaling down |
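
To build intuition for how these settings combine, the sketch below estimates a replica count from queue depth. It uses the proportional rule commonly applied by queue-based autoscalers (pending messages divided by the per-replica target, rounded up, then clamped to the minimum and maximum). This is an approximation for capacity planning, not a description of RAG DB's exact scaling behavior, and it ignores the polling interval and cooldown period. `estimate_replicas` is a hypothetical name.

```python
import math


def estimate_replicas(
    pending_messages: int,
    *,
    min_replicas: int = 0,
    max_replicas: int = 5,
    scale_out_message_count: int = 10,
) -> int:
    """Estimate the steady-state replica count for a given queue depth."""
    if pending_messages < 0:
        raise ValueError("pending_messages must be non-negative")
    if scale_out_message_count <= 0:
        raise ValueError("scale_out_message_count must be positive")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    # Proportional rule: one replica per `scale_out_message_count` messages.
    desired = math.ceil(pending_messages / scale_out_message_count)

    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# With the default settings from the table above:
for depth in (0, 1, 25, 1000):
    print(depth, "->", estimate_replicas(depth))
```

Under these assumptions and the default values, an empty queue allows scale-to-zero, 25 pending messages map to 3 replicas, and very deep queues are capped at 5.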

Processing Behavior

| Variable | Required | Default | Description |
|---|---|---|---|
| CHUNK_SIZE | No |  | Target size in characters for each text chunk |
| CHUNK_OVERLAP | No |  | Number of overlapping characters between consecutive chunks |
| FILE_INDEXING_MAX_RETRIES | No |  | Maximum retries for a single file before marking it as failed |
| EMBEDDING_BATCH_SIZE | No |  | Number of chunks sent per embedding API call |
| INDEXER_STALE_FILE_TIMEOUT_MINUTES | No | 15 | Minutes after which an in-progress file is considered stale and eligible for retry |
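
The relationship between `CHUNK_SIZE` and `CHUNK_OVERLAP` is easiest to see in code. The sketch below is a simple character-based sliding window, shown for illustration only: the index processor's actual chunking strategy may differ (for example, by respecting sentence or token boundaries), and the values used in the example are arbitrary because this reference lists no defaults for either variable.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters, so each new
    window starts `chunk_size - chunk_overlap` characters after the last.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Arbitrary illustrative values: size 4, overlap 1.
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# prints ['abcd', 'defg', 'ghij']
```

A larger overlap preserves more context across chunk boundaries at the cost of more chunks, and therefore more embedding calls per file.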
