AWS, SQS trigger to Lambda is automatically disabled when Lambda fails
We have several Lambda functions triggered by SQS queues. The functions perform intensive inserts into DynamoDB tables, and the tables use autoscaling for write capacity.
At peak load, a large number of messages reach the Lambdas, and they start failing with ProvisionedThroughputExceededException. DynamoDB needs minutes to scale up.
We expect that when a Lambda fails, its messages return to SQS and are processed again after the visibility timeout. That seems right, because by then DynamoDB has scaled up and should be able to handle the increased writes.
However, we see something strange. When the number of Lambda execution errors grows, the SQS trigger is automatically disabled. The Lambda stops executing, and messages accumulate in the queue.
Manually re-enabling the trigger causes even more failures, because DynamoDB has still not scaled up while the number of messages waiting in the queue has increased dramatically.
Only manually increasing the DynamoDB write capacity helps.
Why is the SQS trigger disabled? This behavior is not documented.
How can we prevent the trigger from being disabled?
In general, what is the recommended way to apply "backpressure" to limit how fast a Lambda polls messages from SQS?
amazon-web-services amazon-lambda
asked Apr 29 at 13:20 by gelin, edited Apr 30 at 16:41
Even more upsetting: the Lambda may fail not only because of errors in my code, but also simply because it is being redeployed or needs to scale up its concurrency.
– gelin
Apr 29 at 13:22
AWS support says the trigger was disabled because of incorrect Lambda permissions. In our case, the AWSLambdaVPCAccessExecutionRole was indeed missing. However, our Lambdas executed successfully in 99% of cases... I'm asking support where we can find out why the trigger was disabled...
– gelin
14 hours ago
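For anyone who wants to check for the permissions problem mentioned in the comment above, here is a minimal sketch. It assumes a boto3-style IAM client is passed in; `missing_managed_policies` is a hypothetical helper name, and the check only looks at managed policies attached directly to the role, so equivalent permissions granted through inline policies would not be detected.

```python
REQUIRED_POLICY = "AWSLambdaVPCAccessExecutionRole"


def missing_managed_policies(iam_client, role_name, required=(REQUIRED_POLICY,)):
    """Return the names in `required` that are not attached to the role.

    iam_client is expected to behave like boto3.client("iam").
    Only directly attached managed policies are inspected (an assumption of
    this sketch); inline policies are ignored.
    """
    attached = set()
    paginator = iam_client.get_paginator("list_attached_role_policies")
    for page in paginator.paginate(RoleName=role_name):
        for policy in page.get("AttachedPolicies", []):
            attached.add(policy["PolicyName"])
    return [name for name in required if name not in attached]
```

An empty list means every required policy name was found on the role. The role name itself has to be supplied by the caller, since the thread does not name it.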
1 Answer
I'm not sure why the Lambda stops working. I suspect the Lambda service notices that the function keeps failing and temporarily suspends it, but I can't confirm that.
You can try a number of workarounds:
- Use DynamoDB on-demand capacity. AWS says it scales instantly.
- Alternatively, if you stay on provisioned capacity and get a ProvisionedThroughputExceededException, don't abort the Lambda execution. Instead, re-insert the message into the SQS queue and exit successfully. That way the Lambda service won't see any failures, and no SQS messages will be lost either.
Something along these lines could help :)
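The second workaround could look roughly like the sketch below. It is an illustration under assumptions, not a tested production handler: `process_records`, `write_item` and `requeue` are hypothetical names, and the throttling check assumes the error behaves like a botocore `ClientError`, exposing the error code under `response["Error"]["Code"]`.

```python
import json

THROTTLE_CODE = "ProvisionedThroughputExceededException"


def is_throttled(exc):
    """True if exc carries the DynamoDB throttling error code."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == THROTTLE_CODE


def process_records(event, write_item, requeue, delay_seconds=60):
    """Write each SQS record; re-enqueue throttled ones instead of failing.

    write_item(body)             -- hypothetical callable doing the DynamoDB write
    requeue(raw_body, delay_sec) -- hypothetical callable sending the message
                                    back to the queue with a delivery delay
    Returns the number of records that were re-enqueued.
    """
    requeued = 0
    for record in event["Records"]:
        raw_body = record["body"]
        try:
            write_item(json.loads(raw_body))
        except Exception as exc:
            if not is_throttled(exc):
                raise  # unrelated errors still fail the invocation
            requeue(raw_body, delay_seconds)
            requeued += 1
    return requeued
```

In a real handler, `requeue` would typically wrap the SQS `send_message` call with its `DelaySeconds` parameter (SQS caps it at 900 seconds), which gives autoscaling some time to catch up. One caveat: a re-sent message is a new message, so its receive count starts over and a dead letter queue configured on the source queue will not catch messages that keep being throttled.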
answered Apr 29 at 13:43 by MLu
The second approach is actually what I did. Unfortunately, it doesn't work well if the Lambda multiplies writes. For example, in one of my cases each input message results in 12 writes to DynamoDB. If any of those writes fails, I need to repeat all 12, which increases the load on DynamoDB even more, and so on. I've added another queue for delayed write operations and another Lambda that applies those operations without the multiplication.
– gelin
Apr 30 at 4:59
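One possible reading of the delayed-write queue described in this comment is sketched below: every write is attempted once, and only the throttled ones are handed to a second queue as individual messages, so a retry never repeats the writes that already succeeded. The names `apply_writes`, `write_one` and `enqueue_retry` are hypothetical, as is the shape of a write operation; the commenter's actual implementation is not shown in the thread.

```python
import json

THROTTLE_CODE = "ProvisionedThroughputExceededException"


def apply_writes(writes, write_one, enqueue_retry):
    """Attempt every write once; defer only the throttled ones.

    writes        -- list of JSON-serializable write operations (assumed shape)
    write_one     -- hypothetical callable performing a single DynamoDB write
    enqueue_retry -- hypothetical callable taking a JSON string destined for
                     the delayed-write queue
    Returns the list of deferred operations.
    """
    deferred = []
    for op in writes:
        try:
            write_one(op)
        except Exception as exc:
            code = getattr(exc, "response", {}).get("Error", {}).get("Code")
            if code != THROTTLE_CODE:
                raise  # only throttling is treated as retryable here
            deferred.append(op)
    # One message per failed write, so the second Lambda applies each
    # operation on its own without fanning out again.
    for op in deferred:
        enqueue_retry(json.dumps(op))
    return deferred
```

This only works safely if each write is idempotent or independent of the others, which is an assumption the original comment does not confirm.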
Unfortunately, the Lambda may also fail, and does fail, because of issues in the Lambda service itself rather than in my code (I see failed executions but no exceptions in the logs). The SQS trigger may be disabled in this case too. I've worked around this (temporarily, I hope) by adding a Lambda that runs every 3 minutes, checks all SQS event sources in the account, and re-enables any that are disabled.
– gelin
Apr 30 at 5:05
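A watchdog like the one described in this comment might be sketched as follows. This is an assumption-laden illustration rather than the commenter's code: `reenable_sqs_triggers` is a hypothetical helper, the client is expected to behave like `boto3.client("lambda")`, and only mappings in the region of that client are inspected. The 3-minute schedule would be configured outside this code.

```python
def reenable_sqs_triggers(lambda_client):
    """Enable every disabled SQS event source mapping; return their UUIDs."""
    reenabled = []
    paginator = lambda_client.get_paginator("list_event_source_mappings")
    for page in paginator.paginate():
        for mapping in page.get("EventSourceMappings", []):
            # ARN layout: arn:partition:service:region:account:resource
            arn_parts = mapping.get("EventSourceArn", "").split(":")
            is_sqs = len(arn_parts) > 2 and arn_parts[2] == "sqs"
            if is_sqs and mapping.get("State") == "Disabled":
                lambda_client.update_event_source_mapping(
                    UUID=mapping["UUID"], Enabled=True
                )
                reenabled.append(mapping["UUID"])
    return reenabled


def handler(event, context):
    # Entry point for the scheduled invocation. boto3 is imported here so
    # the helper above can be exercised with a stub client.
    import boto3

    return reenable_sqs_triggers(boto3.client("lambda"))
```

Note that this re-enables triggers unconditionally, including any that were disabled on purpose, so a real deployment would probably want an allow-list of mappings.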