An event-based autoscaler for your Azure Cosmos DB change feed consumer applications running inside a Kubernetes cluster.
The following diagram shows the components involved in scaling the application and the relationships between them.
- Monitored Container - The Azure Cosmos DB container that the application needs to monitor for new changes. A Cosmos DB container might contain several logical partitions, one for each distinct partition key value. Logical partitions that are stored on the same physical partition are grouped under the same partition range. For more information, see the partitioning overview documentation. In general, a container that does not hold a large amount of data has only one physical partition.
- Lease Container - Another Azure Cosmos DB container that keeps track of how the changes on the monitored container are being processed. It stores lease documents that record how far each change feed has been processed. The change feed design pattern supports multiple parallel listeners by keeping an independent feed for each partition range. The listener application instances acquire leases on these individual feeds before processing them, which ensures that a change is not processed by multiple application instances. The monitored and lease containers may be in the same Cosmos DB account or in different accounts.
- KEDA - KEDA runs as a separate service in the Kubernetes cluster. It enables autoscaling of applications based on internal and, primarily, external events. See the KEDA documentation to learn more.
- External Scaler - While KEDA ships with a set of built-in scalers, it also allows users to extend it through external scalers. In this scheme, KEDA queries a user-provided gRPC service to fetch the metrics of an event source and scales the applications accordingly. This is where 'KEDA external scaler for Azure Cosmos DB' plugs in. For information on how an external scaler can be implemented, see the KEDA external scaler concept documentation.
- Listener Application(s) - The application `Deployment` or `StatefulSet` that you would like to scale in and out using KEDA and the external scaler. For information on how to set up the change feed processor in your application to process changes in a Cosmos DB container, read the documentation on change feed processing.
- `ScaledObject` Spec - The specification contains information about the scale target (i.e. the application `Deployment` that needs to be scaled) and the trigger metadata. The external scaler reads information about the Cosmos DB lease container from the trigger metadata defined in the `ScaledObject` resource.
The external scaler calls Cosmos DB APIs to estimate the number of changes pending to be processed. More specifically, the scaler counts the partition ranges that still have changes remaining to be processed and requests KEDA to scale the application to that many replicas. For example, if 3 out of 5 partition ranges have pending changes, the scaler requests 3 replicas.
Note: The architectural diagram above shows KEDA, the external scaler and the target application in different Kubernetes namespaces. This is possible but not necessary. However, the `ScaledObject` and the application `Deployment` must reside in the same namespace.
⚠️ Caution: The Java SDK v2 client library uses a different naming convention for lease documents inside the lease container. This makes it incompatible with the .NET SDK v3, which the external scaler depends on to estimate the pending changes on change feeds. Hence, if you have a Java-based target consumer application, its lease documents will have incompatible IDs, and the external scaler will be unable to detect any pending changes remaining to be consumed. Consequently, it will scale your application down to `minReplicaCount` if that is defined in the `ScaledObject`, or to zero instances otherwise.
- Add and update Helm chart repo.

      helm repo add kedacore https://kedacore.github.io/charts
      helm repo update
- Install KEDA Helm chart (or follow one of the other installation methods in the KEDA documentation).

      helm install keda kedacore/keda --namespace keda --create-namespace
- Install Azure Cosmos DB external scaler Helm chart.

      helm install external-scaler-azure-cosmos-db kedacore/external-scaler-azure-cosmos-db --namespace keda --create-namespace
Create a `ScaledObject` resource that contains the information about your application (the scale target), the external scaler service, the Cosmos DB containers, and other scaling configuration values. Check the `ScaledObject` specification and the `External` trigger specification for information on the properties supported for `ScaledObject` and their allowed values.
You can use the file `deploy/deploy-scaledobject.yaml` as a template for creating the `ScaledObject`. The trigger metadata properties required to use the external scaler for Cosmos DB are described in the Trigger Specification section below.
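As a rough illustration of what such a resource can look like, here is a minimal sketch. It is not the contents of `deploy/deploy-scaledobject.yaml`. The resource name, namespace, database and container IDs, environment variable name, processor name and replica limits are all hypothetical placeholders. Only `apiVersion`, `kind`, `scaleTargetRef` and the replica-count and polling fields come from the standard KEDA `ScaledObject` schema.

```yaml
# Hypothetical example: every name and ID below is a placeholder.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cosmosdb-order-processor-scaledobject
  namespace: cosmosdb-order-processor    # must be the namespace of the Deployment
spec:
  pollingInterval: 20                    # seconds between queries to the scaler
  minReplicaCount: 0                     # lower bound used when no changes are pending
  maxReplicaCount: 10                    # upper bound on replicas
  scaleTargetRef:
    name: cosmosdb-order-processor       # Deployment to scale
  triggers:
    - type: external
      metadata:
        scalerAddress: external-scaler-azure-cosmos-db.keda:4050
        connectionFromEnv: COSMOS_CONNECTION
        databaseId: StoreDatabase
        containerId: OrderContainer
        leaseConnectionFromEnv: COSMOS_CONNECTION
        leaseDatabaseId: StoreDatabase
        leaseContainerId: OrderProcessorLeases
        processorName: OrderProcessor
```

In this sketch the monitored and lease containers share one account and database, so both connection properties point at the same environment variable. If they are in different accounts, use two variables.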
Note: If you are having trouble setting up the external scaler or the listener application, the step-by-step instructions for deploying the sample application might help.
The specification below describes the `trigger` metadata in the `ScaledObject` resource for using 'KEDA external scaler for Cosmos DB' to scale your application.

    triggers:
      - type: external
        metadata:
          scalerAddress: external-scaler-azure-cosmos-db.keda:4050            # Mandatory. Address of the external scaler service.
          connectionFromEnv: <env-variable-for-connection>                     # Mandatory. Environment variable for the connection string of Cosmos DB account with monitored container.
          databaseId: <database-id>                                            # Mandatory. ID of Cosmos DB database containing monitored container.
          containerId: <container-id>                                          # Mandatory. ID of monitored container.
          leaseConnectionFromEnv: <env-variable-for-lease-connection>          # Mandatory. Environment variable for the connection string of Cosmos DB account with lease container.
          leaseDatabaseId: <lease-database-id>                                 # Mandatory. ID of Cosmos DB database containing lease container.
          leaseContainerId: <lease-container-id>                               # Mandatory. ID of lease container.
          processorName: <processor-name>                                      # Mandatory. Name of change-feed processor used by listener application.

- `scalerAddress` - Address of the external scaler service, in the format `<scaler-name>.<scaler-namespace>:<port>`. If you installed the Azure Cosmos DB external scaler Helm chart in the `keda` namespace and did not specify custom values, the metadata value would be `external-scaler-azure-cosmos-db.keda:4050`.
- `connectionFromEnv` - Name of the environment variable on the scale target from which to read the connection string of the Cosmos DB account that contains the monitored container.
- `databaseId` - ID of the Cosmos DB database that contains the monitored container.
- `containerId` - ID of the monitored container.
- `leaseConnectionFromEnv` - Name of the environment variable on the scale target from which to read the connection string of the Cosmos DB account that contains the lease container. This can be the same as or different from the value of the `connectionFromEnv` metadata.
- `leaseDatabaseId` - ID of the Cosmos DB database that contains the lease container. This can be the same as or different from the value of the `databaseId` metadata.
- `leaseContainerId` - ID of the lease container containing the change feeds.
- `processorName` - Name of the change-feed processor used by the listener application. For more information, refer to the Implementing the change feed processor section.
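To make the two `...FromEnv` properties concrete, the fragment below sketches how the scale target might expose the connection string as an environment variable. It assumes the connection string is kept in a Kubernetes Secret and that KEDA resolves the variable from the container spec of the scale target. The Deployment name, image, Secret name, key and variable name are all hypothetical.

```yaml
# Hypothetical fragment of the listener application's Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cosmosdb-order-processor
  namespace: cosmosdb-order-processor
spec:
  selector:
    matchLabels:
      app: cosmosdb-order-processor
  template:
    metadata:
      labels:
        app: cosmosdb-order-processor
    spec:
      containers:
        - name: order-processor
          image: example.azurecr.io/order-processor:latest   # placeholder image
          env:
            # The variable name is what connectionFromEnv / leaseConnectionFromEnv refer to.
            - name: COSMOS_CONNECTION
              valueFrom:
                secretKeyRef:
                  name: cosmosdb-connection     # placeholder Secret name
                  key: connectionString         # placeholder key inside the Secret
```

With this fragment, `connectionFromEnv: COSMOS_CONNECTION` in the trigger metadata would name the variable and not the connection string itself.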
Note: Ideally, we would have created a `TriggerAuthentication` resource, which would have prevented us from adding the connection strings in plain text in the `ScaledObject` trigger metadata. However, this is not possible since, at the moment, triggers of `external` type do not support referencing a `TriggerAuthentication` resource (link).