Genesys Cloud - Developer Community!


  • 1.  Rate Limits for Data Actions within the Architect Flows

    Posted 04-12-2025 09:55
    Edited by Orhun Sahin 04-12-2025 17:00

    Hello Genesys Developers and Community Experts,

    I'm opening this discussion to seek confirmation on my understanding of how Genesys Cloud Architect flows interact with Platform API rate limits, particularly when using Data Actions.

    I believe understanding rate limits and how to manage them effectively is one of the most critical aspects for any developer building solutions on Genesys Cloud. While the documentation for direct Platform API usage via OAuth clients is quite clear regarding rate limits (e.g., the default 300 RPM per client credentials client), how Architect's internal mechanics, built-in actions, and Data Actions play into these limits feels less explicitly detailed – somewhat like a "black box" at times.

    Therefore, I wanted to outline my current understanding based on research and discussion, and kindly ask for confirmation from the experts here. I believe clarifying this could provide valuable insights for many other developers as well.

    Scenario Context:

    Imagine a typical Inbound Call Flow in Architect that involves:

    1. Several Genesys Cloud Data Actions calling the Platform API (using a Client Credentials Grant associated with a "Genesys Cloud Data Actions" integration).
    2. Several Web Services Data Actions calling external, third-party APIs.
    3. Various Built-in Architect Actions (e.g., Set Participant Data, Transfer to ACD, Play Audio).

    My Understanding Summary:

    Based on my current knowledge, here's how I believe these components interact with limits:

    • Genesys Cloud Data Actions & API Limits: API calls made by these Data Actions count directly against the specific rate limit (e.g., default 300 RPM (client.credentials.token.rate.per.minute)) of the individual OAuth Client configured in the associated "Genesys Cloud Data Actions" integration instance.
    • Web Services Data Actions & API Limits: These do not consume Genesys Cloud Platform API rate limits. However, they do count towards the separate concurrent operations limit (often around 50 concurrent actions per org) and obviously depend on the rate limits and capacity of the external API being called.
    • Built-in Architect Actions & API Limits: Actions native to Architect (like Set Participant Data, Transfer to ACD) execute as part of the internal platform engine. Even if they interact with platform services internally, they do not consume the RPM limit of the external OAuth Client used by the GC Data Actions. They are subject to internal platform resource management and flow complexity limits, but not the same external API RPM limits.
    • Concurrency Limits: Both GC Data Actions and Web Services Data Actions contribute to the concurrent operations limit (~50/org). The execution time of each action is critical for managing this.
    • Token Caching: Architect automatically caches the Client Credentials token obtained for a specific "Genesys Cloud Data Actions" integration instance, reusing it until expiry or invalidation.
    • Flow Complexity/Size Limits: Architect enforces separate limits on flow size and complexity (object/node count) to ensure performance and stability, distinct from API RPM limits.
    • Isolating Limits Between Flows: If multiple high-volume flows use GC Data Actions, the best practice to prevent them from exhausting a shared limit is to:
      1. Create separate OAuth Clients (Client Credentials).
      2. Create separate instances of the "Genesys Cloud Data Actions" integration.
      3. Configure each integration instance with its dedicated OAuth Client.
      4. Associate the Data Actions used in each flow with the corresponding integration instance.

    This ensures each flow (or group of flows) uses the independent rate limit of its dedicated OAuth Client; conversely, sharing a single OAuth Client across multiple flows means they all share that one client's RPM limit. By using multiple, separate OAuth Clients (each typically starting with its own default 300 RPM limit), my understanding is that an organization can achieve a higher aggregate throughput across different applications/flows, potentially approaching figures sometimes cited like 3000 RPM in total platform capacity (org.app.user.rate.per.minute; I know this is typically documented for the Authorization Code or Token Implicit grant, but John has previously mentioned the same 3000 limit for the client credentials grant in the DEV Forum). This would not be a single shared pool, but rather the sum of independent capacities, subject to overall platform health.
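    To make the arithmetic behind this isolation strategy concrete, here is a tiny budgeting helper. This is purely illustrative; the function name and the 300 RPM default come from the discussion above, it is not a Genesys API, and it assumes calls are spread evenly across clients:

```python
import math

def clients_needed(interactions_per_minute: int,
                   api_calls_per_interaction: int,
                   rpm_per_client: int = 300) -> int:
    """Estimate how many independent OAuth clients (each with its own
    per-client RPM limit) are needed to absorb a projected Data Action
    call volume, assuming an even spread of calls across clients."""
    total_rpm = interactions_per_minute * api_calls_per_interaction
    return math.ceil(total_rpm / rpm_per_client)

# 150 calls/min with 2 API calls each (300 RPM) fits in one client;
# 200 calls/min (400 RPM) would already need a second client.
```

    Note that this only estimates the sum of independent per-client capacities; it says nothing about any overall organizational ceiling.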

    Additional Best Practices Considered:

    Beyond direct API calls via Data Actions, these strategies seem beneficial for managing limits and efficiency:

    • Notifications API + Data Tables: To minimize direct Platform API polling from Architect (e.g., for agent states, queue stats), utilizing the Notifications API to subscribe to events and storing relevant, relatively static data in Genesys Cloud Data Tables seems highly recommended. Architect can then use the efficient Data Table Lookup action within the flow, which is presumed to avoid standard API rate limits or operate under much higher internal limits.
    • Notifications API + External DB + Web Services: For more dynamic or complex data scenarios, we can use the Notifications API to push real-time data to an external database (managed by us), and then have Architect query this database via Web Services Data Actions calling our own custom API. This shifts the API load to our own infrastructure and leverages the different limit paradigms of Web Services Data Actions (such as the 50-concurrent and ~2500 total executions per minute requests.volume.max limits), effectively bypassing the Platform API RPM limits for accessing that data from Architect.
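    As a sketch of the consumer side of this second pattern, a notification handler could upsert incoming events into a store keyed by topic, which our own API then serves to Architect. The topicName/eventBody envelope and the channel.metadata heartbeat topic follow the Notifications API message shape; the handler itself and the in-memory store standing in for the external database are my own assumptions:

```python
import json

def handle_notification(message: str, store: dict) -> None:
    """Upsert a Notifications API event into a local store keyed by topic,
    so Architect can later read the data via a Web Services Data Action
    against our own API instead of polling the Platform API."""
    event = json.loads(message)
    topic = event.get("topicName")
    if not topic or topic == "channel.metadata":  # skip channel heartbeats
        return
    store[topic] = event.get("eventBody", {})
```

    In production the dict would be replaced by the external database, and the handler would be attached to the WebSocket channel created via the Notifications API.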

    Request for Confirmation:

    Could you please confirm if my understanding outlined above is accurate? Specifically:

    1. Is the distinction correct – GC Data Actions consume the specific external OAuth Client's RPM limit, while built-in Architect actions do not?
    2. Is using separate OAuth Clients via separate "Genesys Cloud Data Actions" integration instances the standard and recommended method for isolating rate limits between different flows or logical applications using GC Data Actions within Architect?
    3. Regarding the isolation strategy: While each client has its own limit (e.g., 300 RPM), is it accurate to think that by using multiple clients, an organization can achieve significantly higher aggregate API throughput towards the Platform API, potentially scaling towards overall platform capacities sometimes cited (like 3000 RPM total), assuming the platform's overall health and the organization's specific entitlements permit?
    4. Are there any other significant nuances, undocumented behaviors, or best practices regarding how Architect actions (built-in or Data Actions) interact with rate limits, concurrency limits, or execution limits that developers should be particularly aware of?

    Additional General API Rate Limit Questions:

    Beyond the Architect context, we also had a few general questions about API rate limiting:

    1. CORS Preflight (OPTIONS) Requests: Do CORS preflight OPTIONS requests, sent by browsers before certain requests like PUT/PATCH, count against the standard API rate limits (e.g., 300 RPM)? Or are they typically excluded, allowing close to the full RPM limit for the actual PUT/PATCH requests themselves?
    2. Agent Desktop UI Actions: Do the Platform API requests generated internally by agent actions within the standard Genesys Cloud Desktop UI (e.g., changing status, answering calls) count against documented rate limits like org.app.user.rate.per.minute that apply to external user-context integrations? 
    3. Client Credentials Aggregate Throughput: Confirming the distinction: org.app.user.rate.per.minute applies per user for user-context grants, while Client Credentials limits are per client (~300 RPM default). Regarding the strategy of using multiple Client Credentials clients to increase potential aggregate throughput, can you validate the concept (potentially mentioned previously in DEV forum) that using, for example, 10 separate clients could allow an organization to approach an aggregate throughput of 3000 RPM, assuming proper load distribution and sufficient overall platform capacity? Is this 'sum of independent limits' concept generally accurate for scaling, acknowledging it's not a pooled limit?

    I genuinely believe that expert confirmation on these points would provide significant insight and clarification for many developers working with Architect and APIs. To respectfully draw attention to this request, I will tag a few community experts below, hoping they might lend their valuable perspective. @John Carnell @Jim Crespino @Jerome Saint-Marc @Jason Mathison @Tim Smith

    Finally, as a general summary and perhaps a crucial word of caution: while strategies like using multiple clients exist for managing load, it's vital to understand that there are likely overall organizational throughput limits (figures sometimes discussed in the context of aggregates like 3000 RPM across multiple clients, though the exact hard cap and how different traffic types combine might vary) acting as a fundamental ceiling for the entire organization. Our understanding is that neither individual users nor Client Credential applications can simply bypass these underlying platform capacities across all their OAuth clients combined. Therefore, attempting inappropriate workarounds or tricks to exceed documented per-client or per-user rate limits is strongly discouraged. Such actions risk hitting these organizational throttles unexpectedly, potentially leading to broader contact center disruptions, and could result in Genesys disabling OAuth clients exhibiting abusive behavior.

    For a great overview of the reasoning behind rate limits and platform behavior, John Carnell's DevDrop (Episode 6A: Rate-limiting and the Genesys Cloud Platform API) on YouTube is highly recommended. While that video provides a great foundation, perhaps a future DevDrop focusing specifically on detailed examples for the Architect and general API limit scenarios discussed here, with practical examples and strategies, would surely be super valuable for many developers! 

    Thank you in advance for sharing your expertise and clarifying these crucial points!

    #ratelimits


    #Architect
    #Integrations
    #PlatformAPI

    ------------------------------
    Orhun Sahin
    Software Development Engineer
    ------------------------------



  • 2.  RE: Rate Limits for Data Actions within the Architect Flows

    Posted 04-16-2025 07:43

    Continuing on the important topic of rate limits, I'd like to bring up another scenario and seek confirmation on a proposed mitigation strategy.

    We are developing a custom backend API that needs to be secured. Part of our security middleware involves validating the incoming Genesys Cloud user's Platform API token (sent in the Authorization header) with each request to our backend.

    Our plan is to use the HEAD /api/v2/tokens/me endpoint for this validation, as described in the documentation. A 200 OK response would indicate a valid token.

    However, we are aware that this endpoint is also subject to rate limits (org.app.user.rate.per.minute limit applicable to the user whose token is being validated). Making this API call for every single incoming request to our backend could quickly exhaust this limit, especially for active users.

    To address this, our proposed solution is to implement caching:

    1. When a request with a specific user token arrives at our backend for the first time (or after the cache expires), we call HEAD /api/v2/tokens/me using that user's token.
    2. If the token is validated successfully (receives a 200 OK), we store the token (or a derivative/hash of it) as a key in a cache (e.g., Redis) with a status indicating "valid" and set an expiration time.
    3. For subsequent requests to our backend carrying the same token, we first check our Redis cache.
    4. If the token exists in the cache and hasn't expired, we consider it validated and skip the API call to Genesys Cloud.
    5. If the token is not in the cache or the cache entry has expired, we proceed with step 1 (call HEAD /api/v2/tokens/me).

    This fixed, short TTL approach seems like the most practical way to balance minimizing API calls (to avoid rate limits) with minimizing the risk of incorrectly treating an expired token as valid, since we cannot determine the exact expiry time.
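    The steps above can be sketched as middleware logic. This is a minimal illustrative sketch, not a recommended Genesys implementation: the remote check (which would issue HEAD /api/v2/tokens/me and treat 200 OK as valid) is injected as a callable so the caching logic is testable offline, an in-memory dict stands in for Redis, and only "valid" results are cached so failures are always re-checked:

```python
import hashlib
import time

def make_token_validator(validate_remote, ttl_seconds=900, clock=time.monotonic):
    """Return an is_valid(token) function implementing cache-then-validate.

    validate_remote: callable(token) -> bool, e.g. a HEAD /api/v2/tokens/me
    call returning True on 200 OK. Valid results are cached under a SHA-256
    hash of the token (never the raw token) for a fixed, short TTL.
    """
    cache = {}  # sha256(token) -> expiry timestamp

    def is_valid(token: str) -> bool:
        key = hashlib.sha256(token.encode()).hexdigest()
        now = clock()
        expiry = cache.get(key)
        if expiry is not None and expiry > now:
            return True                      # cache hit: skip the API call
        if validate_remote(token):           # cache miss or expired: re-check
            cache[key] = now + ttl_seconds
            return True
        cache.pop(key, None)                 # never cache a negative result
        return False

    return is_valid
```

    With Redis, the dict lookup would become a GET and the insert a SETEX with the TTL, giving the same behavior across multiple backend instances.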

    Request for Confirmation:

    Does this approach – using HEAD /tokens/me for validation and caching the 'valid' result in Redis with a short, fixed TTL (like 15-30 minutes) – seem like the most reasonable and recommended practice for handling Genesys Cloud user tokens in backend security middleware, given the rate limit constraints and the inability to read token expiry locally?

    Thanks again for the valuable discussion!



    ------------------------------
    Orhun Sahin
    Software Development Engineer
    ------------------------------



  • 3.  RE: Rate Limits for Data Actions within the Architect Flows

    Posted 04-19-2025 17:20

    I also have a specific follow-up question regarding the scope of one particular user-context limit that wasn't entirely clear to us from the documentation.

    We are looking closely at the org.app.user.rate.per.minute = 3000 limit. The official definition states this applies 'per user, with a specific Authorization Code or Token Implicit OAuth client grant'.

    The phrase "with a specific... OAuth client grant" creates an ambiguity we'd appreciate clarification on:

    a) Does this mean the 3000 RPM limit applies independently to each unique combination of (User + Specific OAuth Client)? (For example: Could User 'John Doe' potentially make 2000 requests/min via 'Client App A' AND, in the same minute, make another 2000 requests/min via 'Client App B' (total 4000 reqs/min for John Doe) without violating this specific 3000 limit, assuming individual token limits of 300/min are also respected?)

    b) Or, does the definition imply an overall cap where the total number of requests made by 'John Doe' across all OAuth clients (Client App A + Client App B + ...) combined cannot exceed 3000 RPM in a minute?

    Could you please clarify which interpretation, (a) or (b), accurately reflects how the org.app.user.rate.per.minute = 3000 limit is enforced in practice?

    Understanding this specific scope is crucial for designing applications that interact with users across potentially multiple contexts or integrations.

    Thank you!



    ------------------------------
    Orhun Sahin
    Software Development Engineer
    ------------------------------



  • 4.  RE: Rate Limits for Data Actions within the Architect Flows

    Posted 04-21-2025 10:42

    I can answer a few of your questions from the Data Action perspective.

    • Genesys Cloud Data Actions & API Limits: API calls made by these Data Actions count directly against the specific rate limit (e.g., default 300 RPM (client.credentials.token.rate.per.minute)) of the individual OAuth Client configured in the associated "Genesys Cloud Data Actions" integration instance.

    Yes, executing Genesys Cloud Data Actions would consume the rate limit for the client credentials used by the integration.

    • Web Services Data Actions & API Limits: These do not consume Genesys Cloud Platform API rate limits. However, they do count towards the separate concurrent operations limit (often around 50 concurrent actions per org) and obviously depend on the rate limits and capacity of the external API being called.
    • Concurrency Limits: Both GC Data Actions and Web Services Data Actions contribute to the concurrent operations limit (~50/org). The execution time of each action is critical for managing this.

    Correct. The recent release of timeouts at the action level can help prevent concurrency rate limiting due to occasionally slow endpoints. I wrote a post about that here: https://community.genesys.com/discussion/data-action-timeout-support-releasing-this-week

    • Token Caching: Architect automatically caches the Client Credentials token obtained for a specific "Genesys Cloud Data Actions" integration instance, reusing it until expiry or invalidation.

    Basically correct, but the caching occurs in the Data Action service: https://help.mypurecloud.com/faqs/do-data-actions-request-a-new-authentication-token-for-every-request/

    • Isolating Limits Between Flows: If multiple high-volume flows use GC Data Actions, the best practice to prevent them from exhausting a shared limit is to:
      1. Create separate OAuth Clients (Client Credentials).
      2. Create separate instances of the "Genesys Cloud Data Actions" integration.
      3. Configure each integration instance with its dedicated OAuth Client.
      4. Associate the Data Actions used in each flow with the corresponding integration instance.

    This ensures each flow (or group of flows) uses the independent rate limit of its dedicated OAuth Client; sharing a single OAuth Client across multiple flows means they all share that one client's RPM limit. By using multiple, separate OAuth Clients (each typically starting with its own default 300 RPM limit), my understanding is that an organization can achieve a higher aggregate throughput across different applications/flows, potentially approaching figures sometimes cited like 3000 RPM in total platform capacity, though this is not a single shared pool but rather the sum of independent capacities, subject to overall platform health.

    I'm not aware of this being considered a best practice, but it is certainly an approach you could take. If you are concerned about hitting your platform limits, I would concentrate more on caching values than on trying to fully exploit the limits. There is an article that includes that concept here: https://developer.genesys.cloud/blog/2021-02-03-Caching-in-flows/ Further in your post you include other approaches to avoiding run-time lookups that seem reasonable to me.



    ------------------------------
    --Jason
    ------------------------------



  • 5.  RE: Rate Limits for Data Actions within the Architect Flows

    Posted 04-23-2025 16:03

    Hi Jason,

    Thank you so much for your detailed response and for confirming many of our assumptions! It's incredibly helpful for us and the wider community.

    Regarding your comment on using separate OAuth clients for isolating limits between flows: I understand and appreciate the clarification that this might not be considered an official "best practice" by Genesys. Your primary recommendation to concentrate on caching is well-taken, and indeed we are utilizing caching extensively in other areas based on the principles in the article you linked and other strategies mentioned in our post.

    However, I wanted to provide a bit more context on why the multiple-client approach was something we considered for a specific, somewhat constrained scenario. We are currently migrating a customer from PureConnect to Genesys Cloud, with a strong commitment made to minimize changes to their existing operational structures during the initial phase. One such structure involves looking up real-time data during the Inbound Call Flow. Unfortunately, due to specific business requirements tied to this legacy process, caching the data (even for a short period) is not a viable option in this particular instance – the data must be fetched live for each call from the Platform API.

    This necessity forces us to make at least two Platform API calls (via GC Data Actions) within the call flow for every voice interaction. With a single OAuth Client using the default 300 RPM limit, this effectively caps our voice channel throughput at approximately 150 calls per minute without hitting the limit (300 RPM / 2 requests per call). While we could implement throttling within the flow to stay under this limit, it would introduce potentially significant delays for callers waiting in the IVR, which is highly undesirable for the customer experience.
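    For completeness, if we did have to throttle inside or in front of the flow to stay under a client's 300 RPM limit, the classic shape is a token bucket. A minimal illustrative sketch (not a Genesys mechanism; the class, the injectable clock, and burst-equals-rate capacity are my own assumptions), which also shows why throttling translates directly into caller wait time once the bucket is empty:

```python
import time

class TokenBucket:
    """Minimal token-bucket throttle: allows bursts up to `rate_per_minute`
    permits, refilling continuously at rate_per_minute / 60 per second."""

    def __init__(self, rate_per_minute: float, clock=time.monotonic):
        self.capacity = rate_per_minute
        self.tokens = rate_per_minute          # start full
        self.refill_per_sec = rate_per_minute / 60.0
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        """Consume one permit if available; otherwise the caller must wait."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

    At 300 RPM the bucket refills only 5 permits per second, so once a burst drains it, each additional call has to wait roughly 200 ms for a permit, which is exactly the IVR delay we wanted to avoid.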

    For their operation size (around 120 agents) which is also multi-channel (with concurrent Inbound Email and Inbound SMS flows also running, potentially using other Data Actions), a hard cap of 150 calls/minute for just the voice channel felt technically challenging and potentially insufficient during peak times.

    This constraint led us to consider the option of using separate OAuth Clients (perhaps one for voice, one for email/SMS, or even multiple dedicated clients if the load demanded) purely as a way to distribute these unavoidable real-time API calls across different 300 RPM limits. The goal wasn't to exceed the ~3000 RPM aggregate cap discussed earlier (which I believe is a hard cap that cannot be exceeded across the organization, even with multiple Client Credentials OAuth Clients), but rather to handle the necessary per-interaction load without causing IVR delays, by keeping each logical channel or function within the individual limit of its own client.

    That being said, we absolutely agree that pursuing architectural workarounds for limits isn't the ideal first step. Our intended primary course of action is definitely to engage with our CSM and/or Genesys Cloud Customer Care to formally request an increase for a single, dedicated OAuth Client's rate limit for this integration. We understand it's possible to configure a single Client Credentials grant up to potentially 3000 RPM, and if approved, this would certainly be the cleanest and most preferred solution for addressing our specific scenario.

    Thanks again for your valuable insights, the confirmation on other points, and for pointing us towards the best practices and the appropriate channels for limit increase requests!



    ------------------------------
    Orhun Sahin
    Software Development Engineer
    ------------------------------



  • 6.  RE: Rate Limits for Data Actions within the Architect Flows

    Posted 04-24-2025 08:11

    Data Actions typically distribute outgoing requests across multiple authentication tokens, so I don't think that customers actually run into Genesys Cloud public API rate limiting at 300 requests per minute. I will try to do a little bit of testing around this later today to verify that.



    ------------------------------
    --Jason
    ------------------------------



  • 7.  RE: Rate Limits for Data Actions within the Architect Flows

    Posted 04-24-2025 08:30

    Hi Jason,

    Wow, thank you for that insight! That's great and not something I had considered.

    My understanding was that while Data Actions pool/cache tokens, the Genesys Cloud Data Action integration would primarily use one token until it expired or failed (e.g., with a 401), which made the 300 RPM per-token/per-client limit feel like the more immediate bottleneck in high-throughput scenarios. The idea that the integration might actively distribute load across the valid tokens available for a given OAuth Client, switching to an alternative token purely for load distribution rather than only after encountering a problem with the active one, is a very welcome possibility!

    If this distribution behavior is indeed how it functions, it would certainly alleviate some concerns about frequently hitting the strict 300 RPM barrier during bursts, even if the average rate is lower. 

    This leads to a clarifying question, though, to ensure our overall understanding is correct: Assuming this token distribution mechanism does exist within the Genesys Cloud Data Action integration for a given Client Credential grant, our understanding remains that the overall organizational cap (which we've discussed potentially being around 3000 RPM total for all Client Credentials requests combined across all clients) would still be enforced, correct?

    In other words, while this internal distribution might smooth out traffic and make hitting the individual 300 RPM limit less probable for momentary peaks, the platform would ultimately still throttle requests if the aggregate volume from all Client Credentials clients/tokens exceeds that organizational ceiling (e.g., ~3000 RPM)? Our assumption is that this organizational limit acts as the final backstop.

    We definitely look forward to hearing the results of your testing when you get a chance, as confirming this distribution behavior would be very valuable information for designing resilient integrations!

    Thanks again for sharing this potential mechanism and your willingness to investigate further!



    ------------------------------
    Orhun Sahin
    Software Development Engineer
    ------------------------------



  • 8.  RE: Rate Limits for Data Actions within the Architect Flows

    Posted 04-24-2025 13:13

    I verified that Data Actions do distribute Genesys Cloud Public API calls over multiple auth tokens. This allows for more headroom before hitting the per-token limit of 300 RPM. The exact limit isn't guaranteed, but I cannot remember ever having seen this be a problem in practice.

    I am not the person to answer your questions around overall organization limits. Our limits page lists a per-user limit of 3000 requests per minute (org.app.user.rate.per.minute):

    https://developer.genesys.cloud/organization/organization/limits#platform-api

    but I don't see any organization-wide limit listed there. Hopefully someone else can chime in on that.



    ------------------------------
    --Jason
    ------------------------------