Most efficient way to batch delete S3 Files
I'd like to be able to batch delete thousands or tens of thousands of files at a time on S3. Each file would be anywhere from 1MB to 50MB. Naturally, I don't want the user (or my server) waiting while the files are being deleted. Hence, the questions:
- How does S3 handle file deletion, especially when deleting large numbers of files?
- Is there an efficient way to do this and make AWS do most of the work? By efficient, I mean making the fewest requests to S3, taking the least amount of time, and using the least amount of resources on my servers.
amazon-s3 batch-processing
asked Apr 2 '15 at 4:06
SudoKill
5 Answers
AWS supports bulk deletion of up to 1000 objects per request using the S3 REST API and its various wrappers. This method assumes you know the S3 object keys you want to remove (that is, it's not designed to handle something like a retention policy, files that are over a certain size, etc.).
The S3 REST API can specify up to 1000 files to be deleted in a single request, which is much quicker than making individual requests. Remember, each request is an HTTP (thus TCP) request, so each request carries overhead. You just need to know the objects' keys and create an HTTP request (or use a wrapper in your language of choice). AWS provides great information on this feature and its usage. Just choose the method you're most comfortable with!
I'm assuming your use case involves end users specifying a number of specific files to delete at once, rather than initiating a task such as "purge all objects that refer to picture files" or "purge all files older than a certain date" (which I believe is easy to configure separately in S3).
If so, you'll know the keys that you need to delete. It also means the user will want more real-time feedback about whether their file was deleted successfully or not. References to exact keys are supposed to be very quick, since S3 was designed to scale efficiently despite handling an extremely large amount of data.
If not, you can look into asynchronous API calls. You can read a bit about how they'd work in general from this blog post, or search for how to do it in the language of your choice. This would allow the deletion request to take up its own thread, and the rest of the code can execute without making a user wait. Or, you could offload the request to a queue . . . But both of these options needlessly complicate either your code (asynchronous code can be annoying) or your environment (you'd need a service/daemon/container/server to handle the queue). So I'd avoid this scenario if possible.
Edit: I don't have the reputation to post more than 2 links, but you can see Amazon's comments on request rate and performance here: http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html And the S3 FAQ notes that bulk deletion is the way to go if possible.
answered Apr 2 '15 at 19:27
Ed D'Azzo
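For illustration, a minimal sketch of one such bulk-delete request through the AWS CLI's s3api wrapper (the bucket name, key names, and the batch.json filename are placeholders, not from the question): first, a batch.json listing at most 1000 keys,
{ "Objects": [ { "Key": "path/to/file-0001.jpg" }, { "Key": "path/to/file-0002.jpg" } ], "Quiet": true }
then a single DeleteObjects call that removes the whole batch in one request:
aws s3api delete-objects --bucket MY_BUCKET_NAME --delete file://batch.json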
The excruciatingly slow option is s3 rm --recursive if you actually like waiting.
Running parallel s3 rm --recursive with differing --include patterns is slightly faster, but a lot of time is still spent waiting, as each process individually fetches the entire key list in order to perform the --include pattern matching locally.
Enter bulk deletion.
I found I was able to get the most speed by deleting 1000 keys at a time using aws s3api delete-objects.
Here's an example:
cat file-of-keys | xargs -P8 -n1000 bash -c 'aws s3api delete-objects --bucket MY_BUCKET_NAME --delete "Objects=[$(printf "Key=%s," "$@")],Quiet=true"' _
- The -P8 option on xargs controls the parallelism. It's eight in this case, meaning 8 instances of 1000 deletions at a time.
- The -n1000 option tells xargs to bundle 1000 keys for each aws s3api delete-objects call.
- Removing ,Quiet=true or changing it to false will spew out server responses.
- Note: There's an easily missed _ at the end of that command line. @VladNikiforov posted an excellent commentary of what it's for in the comments, so I'm going to just link to that.
But how do you get file-of-keys?
If you already have your list of keys, good for you. Job complete.
If not, here's one way I guess:
aws s3 ls "s3://MY_BUCKET_NAME/SOME_SUB_DIR" | sed -nre "s|[0-9-]+ [0-9:]+ +[0-9]+ |SOME_SUB_DIR|p" >file-of-keys
edited Oct 2 '18 at 1:27
answered Jun 22 '18 at 6:38
antak
Great approach, but I found that listing the keys was the bottleneck. This is much faster:
aws s3api list-objects --output text --bucket BUCKET --query 'Contents[].[Key]' | pv -l > BUCKET.keys
And then removing objects (one parallel process was sufficient, since going above that reaches the rate limits for object deletion):
tail -n+0 BUCKET.keys | pv -l | grep -v -e "'" | tr '\n' '\0' | xargs -0 -P1 -n1000 bash -c 'aws s3api delete-objects --bucket BUCKET --delete "Objects=[$(printf "Key=%q," "$@")],Quiet=true"' _
– SEK
Aug 13 '18 at 18:09
You probably should also have stressed the importance of the _ at the end :) I missed it, and then it took me quite a while to understand why the first element gets skipped. The point is that bash -c passes all arguments as positional parameters starting with $0, while "$@" only expands parameters starting with $1. So the underscore dummy is needed to fill the position of $0.
– Vlad Nikiforov
Oct 1 '18 at 12:42
@VladNikiforov Cheers, edited.
– antak
Oct 2 '18 at 1:30
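To see Vlad's point in isolation, a tiny sketch you can run in any shell (the words one two three are arbitrary placeholders):
bash -c 'printf "%s\n" "$@"' _ one two three   # prints one, two, three -- the _ fills $0
bash -c 'printf "%s\n" "$@"' one two three     # prints only two and three -- "one" was consumed as $0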
One problem I've found with this approach (either from antak or Vlad) is that it's not easily resumable if there's an error. If you are deleting a lot of keys (10M in my case) you may hit a network error or a throttling error that breaks this. So to improve this, I've used split -l 1000 to split my keys file into 1000-key batches. Now for each file I can issue the delete command and then delete the file. If anything goes wrong, I can continue.
– joelittlejohn
Apr 3 at 12:32
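A rough sketch of that resumable variant, reusing the delete command from the answer above (the bucket name and the batch- prefix are placeholders; assumes keys contain no whitespace):
split -l 1000 file-of-keys batch-
for f in batch-*; do
  # delete one 1000-key chunk; only discard the chunk file if the call succeeded,
  # so an interrupted run can simply be restarted
  xargs -n1000 bash -c 'aws s3api delete-objects --bucket MY_BUCKET_NAME --delete "Objects=[$(printf "Key=%s," "$@")],Quiet=true"' _ < "$f" && rm -- "$f"
done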
I was frustrated by the performance of the web console for this task. I found that the AWS CLI command does this well. For example:
aws s3 rm --recursive s3://my-bucket-name/huge-directory-full-of-files
For a large file hierarchy, this may take a considerable amount of time. You can set this running in a tmux or screen session and check back later.
answered Aug 9 '17 at 19:01
dannyman
It looks like the aws s3 rm --recursive command deletes files individually. Although faster than the web console, when deleting lots of files it could be much faster if it deleted in bulk.
– Brandon
Feb 22 '18 at 4:35
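For reference, one way to follow the tmux suggestion above and leave the recursive delete running unattended (the session name is a placeholder; the bucket path comes from the answer's example):
# start the long-running delete in a detached tmux session
tmux new-session -d -s s3-purge 'aws s3 rm --recursive s3://my-bucket-name/huge-directory-full-of-files'
# reattach later to check on progress
tmux attach -t s3-purge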
A neat trick is using lifecycle rules to handle the deletion for you. You can queue up a rule to delete the prefix or objects that you want, and Amazon will just take care of the deletion.
https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html
answered Apr 9 at 20:59
cam8001
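As a sketch of what such a rule can look like when applied from the CLI (the bucket name, prefix, and rule ID are placeholders; note that S3 applies expiration asynchronously, typically within about a day, so this trades immediacy for having AWS do all the work):
lifecycle.json:
{
  "Rules": [
    {
      "ID": "purge-old-uploads",
      "Filter": { "Prefix": "uploads-to-delete/" },
      "Status": "Enabled",
      "Expiration": { "Days": 1 }
    }
  ]
}
aws s3api put-bucket-lifecycle-configuration --bucket MY_BUCKET_NAME --lifecycle-configuration file://lifecycle.json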
Without knowing how you're managing the S3 buckets, this may or may not be particularly useful.
The AWS CLI tools have a "sync" command which can be particularly effective for ensuring S3 has the correct objects. If you, or your users, are managing S3 from a local filesystem, you may be able to save a ton of work determining which objects need to be deleted by using the CLI tools.
http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
answered Apr 2 '15 at 19:42
Bill B
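A minimal sketch of that approach (the local directory, bucket name, and prefix are placeholders): the --delete flag removes objects under the prefix that no longer exist locally, and --dryrun previews the changes first.
# preview which objects sync would delete to make S3 match the local directory
aws s3 sync ./local-dir s3://MY_BUCKET_NAME/some-prefix --delete --dryrun
# then run it for real
aws s3 sync ./local-dir s3://MY_BUCKET_NAME/some-prefix --delete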