Recover Deleted Files from Versioned S3 buckets. - Tool

In this article, we are going to see how to recover deleted files from an S3 bucket. This works only for buckets that have versioning enabled.

If versioning is not enabled on an S3 bucket, deleting an object is a permanent, irreversible change, so those files cannot be recovered.

Enabling versioning is one of the production recommendations, and we presume it is enabled on the S3 bucket from which you are trying to recover objects.

To understand why recovery is possible only in a versioned bucket, let's quickly learn how versioning works.

How Versioning Works on S3

If you enable versioning for a bucket, Amazon S3 automatically generates a unique version ID for each object that is stored in it.

For example, in one bucket you can have two objects with the same key (object name) but different version IDs, such as photo.gif (version 111111) and photo.gif (version 121212).

Each object has a version ID, whether or not S3 Versioning is enabled. If S3 Versioning is not enabled, Amazon S3 sets the value of the version ID to null. If you enable S3 Versioning, Amazon S3 assigns a version ID value for the object. This value distinguishes that object from other versions of the same key.

When you enable S3 Versioning on an existing bucket, objects that are already stored in the bucket are unchanged.

Their version IDs (null), contents and permissions remain the same. After you enable S3 Versioning, each object that is added to the bucket gets a version ID, which distinguishes it from other versions of the same key.
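To make the idea of "same key, different version IDs" concrete, here is a minimal Python sketch. It assumes input shaped like the pages returned by the boto3 `list_object_versions` call (a "Versions" list whose entries carry "Key", "VersionId" and "IsLatest"). The helper name `versions_by_key` and the sample data are our own illustration and are not part of any AWS API.

```python
from collections import defaultdict


def versions_by_key(pages):
    """Group version IDs by object key.

    `pages` is an iterable of dicts shaped like boto3
    `list_object_versions` responses.
    """
    grouped = defaultdict(list)
    for page in pages:
        for version in page.get("Versions", []):
            grouped[version["Key"]].append(version["VersionId"])
    return dict(grouped)


# Hand-written sample in the shape of one response page (hypothetical data)
sample_page = {
    "Versions": [
        {"Key": "photo.gif", "VersionId": "121212", "IsLatest": True},
        {"Key": "photo.gif", "VersionId": "111111", "IsLatest": False},
    ]
}

print(versions_by_key([sample_page]))
# {'photo.gif': ['121212', '111111']}
```

Against a real bucket, the pages could come from `boto3.client("s3").get_paginator("list_object_versions").paginate(Bucket="my-bucket")`, where the bucket name is a placeholder.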

Delete Marker or Soft Delete

When you delete an object in a versioned bucket without specifying the version of the object, S3 does not remove any data. It simply adds a placeholder called a delete marker as a new version of the object.

This delete marker has its own version ID and is added on top of the existing versions of the object.

Objects whose current version is a delete marker are considered soft deleted, and they can be recovered by removing the delete marker version on the top.

In the preceding image, you can see that the delete marker has been added on top of the existing versions. When you try to GET the S3 object photo.gif, S3 looks up the latest version of the object, which is the delete marker.

Because the latest version is a delete marker, S3 returns a 404 Not Found response, as if the object were not present.

Now, to recover the object, all we have to do is remove the delete marker version.
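The recovery step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the recover-s3 tool: the function names are ours, `pages` is assumed to be shaped like boto3 `list_object_versions` responses (with a "DeleteMarkers" list), and `client` is assumed to be a boto3 S3 client whose `delete_object` call is given the marker's `VersionId`.

```python
def latest_delete_markers(pages):
    """Yield (key, version_id) for each delete marker that is the current version."""
    for page in pages:
        for marker in page.get("DeleteMarkers", []):
            if marker["IsLatest"]:
                yield marker["Key"], marker["VersionId"]


def remove_delete_markers(client, bucket, pages, dry_run=True):
    """Remove current delete markers; with dry_run=True only report the keys."""
    touched = []
    for key, version_id in latest_delete_markers(pages):
        if not dry_run:
            # Deleting the marker *by version ID* makes the previous version current again
            client.delete_object(Bucket=bucket, Key=key, VersionId=version_id)
        touched.append(key)
    return touched


# Hand-written sample page (hypothetical data); no AWS call is made in dry-run mode
sample_page = {
    "Versions": [{"Key": "photo.gif", "VersionId": "111111", "IsLatest": False}],
    "DeleteMarkers": [
        {"Key": "photo.gif", "VersionId": "222222", "IsLatest": True},
        {"Key": "old.txt", "VersionId": "333333", "IsLatest": False},
    ],
}

print(remove_delete_markers(None, "my-bucket", [sample_page], dry_run=True))
# ['photo.gif']
```

For an actual run you would pass `boto3.client("s3")` as `client`, feed it the pages from the `list_object_versions` paginator, and set `dry_run=False`. Only markers with `IsLatest` set are touched, because an older marker buried under newer versions does not hide the object.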

How to permanently delete - in versioned S3 bucket

You can permanently delete an S3 object by specifying the version that you want to delete. Only the owner of an Amazon S3 bucket can permanently delete a version.

If your DELETE operation specifies the versionId, that object version is permanently deleted, and Amazon S3 doesn't insert a delete marker.

As shown in the preceding diagram, when you specify the version ID on the delete call, the object version with that ID is permanently deleted.
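The difference between the two kinds of delete is easier to see in a toy model. The class below is a small in-memory simulation written for this article. It is not S3 code and does not call AWS; the class name `VersionedKey` and its version IDs are made up for illustration.

```python
import itertools


class VersionedKey:
    """Toy in-memory model of one key in a versioned bucket (not real S3 code)."""

    def __init__(self):
        self._stack = []                 # newest entry last: (version_id, body or None)
        self._ids = itertools.count(1)

    def put(self, body):
        version_id = f"v{next(self._ids)}"
        self._stack.append((version_id, body))
        return version_id

    def delete(self, version_id=None):
        if version_id is None:
            # DELETE without a version ID: nothing is removed,
            # a delete marker (body None) becomes the current version
            marker_id = f"v{next(self._ids)}"
            self._stack.append((marker_id, None))
            return marker_id
        # DELETE with a version ID: that version (or marker) is removed for good
        self._stack = [entry for entry in self._stack if entry[0] != version_id]
        return version_id

    def get(self):
        """Return the current body, or None when the key reads as 'not found'."""
        if not self._stack:
            return None
        return self._stack[-1][1]


photo = VersionedKey()
v1 = photo.put(b"first upload")
marker = photo.delete()        # soft delete
print(photo.get())             # None
photo.delete(marker)           # remove the delete marker: object is back
print(photo.get())             # b'first upload'
photo.delete(v1)               # permanent delete of the only version
print(photo.get())             # None
```

The last call is the case that cannot be undone: once a specific version is deleted by ID, there is no marker to remove and nothing left to restore.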

By now you should have understood that only objects in versioned S3 buckets that were deleted with a delete marker can be recovered.

Let's move on to the next stage: how to recursively recover all the deleted files (those with a delete marker) under a prefix in an S3 bucket.

RecoverS3 - Open Source tool for efficient recovery

We have created this recover-s3 tool to help you efficiently recover deleted files from an S3 bucket.

But why do you need this tool? Can you not do it manually?

Of course you can. Go directly to the console, remove the delete marker version of the object you want to recover, and the S3 object is restored.

But when you want to do it recursively in a large bucket with millions of objects, this tool comes in handy.

We have added threading to speed up the process. As part of benchmarking, we have been able to recover 10,000 files in 8 minutes.

The speed of the script depends on the number of files that you are trying to recover, the network speed and the file size.
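To give an idea of how threading helps here, the following is a minimal sketch using Python's standard `concurrent.futures` module. It is not the tool's actual implementation. The function `run_in_threads`, its `max_workers` default and the `fake_remove` stand-in are assumptions made for illustration; in a real run, `remove_marker` would wrap a call such as `client.delete_object(Bucket=..., Key=key, VersionId=version_id)`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_threads(remove_marker, markers, max_workers=16):
    """Call remove_marker(key, version_id) for every marker using a thread pool.

    Returns (recovered_keys, failures), where failures maps key -> error text.
    """
    recovered, failures = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(remove_marker, key, version_id): key
            for key, version_id in markers
        }
        for future in as_completed(futures):
            key = futures[future]
            try:
                future.result()
                recovered.append(key)
            except Exception as exc:
                # One failed key should not stop the rest of the batch
                failures[key] = str(exc)
    return sorted(recovered), failures


def fake_remove(key, version_id):
    """Stand-in for the real delete call so the sketch runs offline."""
    if key == "broken.txt":
        raise RuntimeError("simulated failure")


recovered, failures = run_in_threads(
    fake_remove, [("a.txt", "1"), ("b.txt", "2"), ("broken.txt", "3")]
)
print(recovered)   # ['a.txt', 'b.txt']
print(failures)    # {'broken.txt': 'simulated failure'}
```

Each delete is a small network request, so most of the time is spent waiting on the network. That is why threads can help even in Python.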

As we already mentioned, this script recovers files that have been deleted from S3 by removing the delete marker from each S3 object.

We have released Version 2 of RecoverS3 with the following features:
1. Intuitive HTML report with sortable, searchable data
2. Dry-run feature to validate before the actual recovery
3. Validation of recoverable and non-recoverable files
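One plausible way to tell recoverable files from non-recoverable ones is sketched below. This is our assumption of what such a check could look like, not the tool's actual logic. A key counts as recoverable when its current version is a delete marker and at least one real version still exists underneath. The function name `classify` and the sample data are hypothetical; the input is shaped like boto3 `list_object_versions` pages.

```python
def classify(pages):
    """Split soft-deleted keys into (recoverable, non_recoverable) lists."""
    keys_with_versions = set()
    current_markers = set()
    # Collect across all pages first, since one key's entries may span pages
    for page in pages:
        for version in page.get("Versions", []):
            keys_with_versions.add(version["Key"])
        for marker in page.get("DeleteMarkers", []):
            if marker["IsLatest"]:
                current_markers.add(marker["Key"])
    recoverable = sorted(current_markers & keys_with_versions)
    non_recoverable = sorted(current_markers - keys_with_versions)
    return recoverable, non_recoverable


# Hand-written sample page (hypothetical data)
sample_page = {
    "Versions": [{"Key": "photo.gif", "VersionId": "111111", "IsLatest": False}],
    "DeleteMarkers": [
        {"Key": "photo.gif", "VersionId": "222222", "IsLatest": True},
        # Only a marker is left: every real version was deleted by ID
        {"Key": "gone.txt", "VersionId": "333333", "IsLatest": True},
    ],
}

print(classify([sample_page]))
# (['photo.gif'], ['gone.txt'])
```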

Here is a glimpse of the report that RecoverS3 version 2 creates. You can search and sort the report, and it is also generated with a dry run.

Now let's move on to the pre-requisites and installation of the recover-s3 tool.

Pre-requisites

Python 3
Boto3
pip
AWS Access Key and Secret Key
Permission to access the S3 bucket and to list and delete objects
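As a rough guide to the last item, the sketch below builds a minimal IAM policy document in Python. It is an assumption based on the S3 operations involved (listing object versions and deleting a delete marker by version ID), not a policy published with the tool, so check it against your own setup. The function name `recovery_policy` is ours, and the bucket and prefix are placeholders.

```python
import json


def recovery_policy(bucket, prefix=""):
    """Build a least-privilege policy sketch for listing versions and removing markers."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed to list object versions and delete markers in the bucket
                "Effect": "Allow",
                "Action": "s3:ListBucketVersions",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Needed to delete a specific version, which includes a delete marker
                "Effect": "Allow",
                "Action": "s3:DeleteObjectVersion",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


print(json.dumps(recovery_policy("my-bucket", "my-prefix/"), indent=2))
```

Keep in mind that `s3:DeleteObjectVersion` also allows permanent deletion of real object versions, so it is worth scoping the resource to the prefix you intend to recover.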

Installation

Clone the repository and install the dependencies.

$ git clone https://github.com/AKSarav/recover-s3.git
$ cd recover-s3
$ pip install -r requirements.txt

Usage

python recover-s3-files.py [-h] --bucket BUCKET --prefix PREFIX --region REGION

optional arguments:
  -h, --help       show this help message and exit
  --bucket BUCKET  S3 bucket name
  --prefix PREFIX  S3 prefix
  --region REGION  S3 region

Example

Here is an example recover-s3 command that you can use to recover files from a bucket named my-bucket:

python recover-s3-files.py --bucket my-bucket --prefix my-prefix --region us-east-1

You can use the --dry-run option to validate the recovery and to create the HTML report before going for the actual recovery:

python recover-s3-files.py --bucket my-bucket --prefix my-prefix --region us-east-1 --dry-run
