Scheduling Elastic Block Storage (EBS) Snapshots with AWS Lambda
Traditionally, scheduling snapshots of your Elastic Block Storage (EBS) volumes required the setup and maintenance of an EC2 instance or the use of a third-party service like Skeddly. Depending on cost or security concerns (having to grant a service like Skeddly access to your AWS account), this may not be an option. Additionally, storing your access keys on an EC2 instance may not be acceptable, even if you limit the IAM role to only allow the creation and deletion of snapshots.
Enter AWS Lambda. This service allows you to write a small application in Node.js, Java, or Python that is executed either on a schedule or in response to other events. In this article we will focus on creating a Python script that creates EBS snapshots once a day and deletes backups older than a week to keep storage costs in check.
Just a couple of notes before we begin:
- The Python code below is a compilation of two articles located here and here. A big shout-out to the original authors!
- Lambda is billed by the number of requests made to your application (the number of times your Lambda function is triggered) and the amount of time it runs, in milliseconds. Depending on how many volumes you are creating snapshots from, the cost could vary. In my scenario I am only creating snapshots for a single volume, and at this low frequency it does not cost me a single penny. See this article for more details on pricing.
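To make the pricing concrete, here is a rough back-of-the-envelope estimate. The two rates below are assumptions based on Lambda's published pricing at the time of writing ($0.20 per million requests and $0.00001667 per GB-second); check the current pricing page before relying on them.

```python
# Rough Lambda cost estimate for a once-daily snapshot function.
# Both rates are assumptions from the pricing page at the time of writing.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.00001667

def monthly_cost(invocations_per_day, duration_ms, memory_mb, days=30):
    requests = invocations_per_day * days
    request_cost = requests / 1000000.0 * PRICE_PER_MILLION_REQUESTS
    gb_seconds = requests * (duration_ms / 1000.0) * (memory_mb / 1024.0)
    return request_cost + gb_seconds * PRICE_PER_GB_SECOND

# One run per day for about 3 seconds at the minimum 128 MB memory size
# works out to a tiny fraction of a cent per month, before the free tier
# (which covers it entirely) is even applied.
print("$%.6f per month" % monthly_cost(1, 3000, 128))
```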
- The type of backup we are configuring here is considered “crash consistent”. If any data is being written to the EBS volume when the snapshot starts, there is a chance of corruption or lost data when restoring from that snapshot. The only way I know of to get an application-consistent backup would be to power down the instance, take the snapshot, then start the instance again. This can be done in these scripts, but it is outside the scope of this particular article.
With that out of the way, on to the configuration…
Create a New Identity and Access Management (IAM) Role in the AWS Console
- Log in to the AWS Console and go to Identity & Access Management
- Click Roles on the left navigation
- Click Create New Role
- Name the role (no spaces allowed) and click Next Step
- Click the Select button for AWS Lambda
- Do not attach any policies; just click Next Step
- Click Create Role
- The new role has been created and you are returned to the Roles list. Click the new role you just created; we need to add the new custom policy.
- Expand Inline Policies and click the “click here” link
- Choose Custom Policy and click the Select button
- Name the policy (you can just call it the same thing
you did in step 4)
- Paste in the following policy document:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["logs:*"],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot",
        "ec2:CreateTags",
        "ec2:ModifySnapshotAttribute",
        "ec2:ResetSnapshotAttribute"
      ],
      "Resource": ["*"]
    }
  ]
}
- Click Apply Policy. The new role has been created and is ready for use by Lambda.
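If you would rather script these console steps, the same role and inline policy can be created with boto3. This is a sketch under a few assumptions: the role name ebs-snapshot-lambda is a placeholder, and the calls require credentials that are allowed to manage IAM.

```python
import json

# Trust policy that lets the Lambda service assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Same permissions as the policy document pasted above: CloudWatch Logs,
# read-only EC2 describe calls, and the snapshot/tag actions.
SNAPSHOT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["logs:*"],
         "Resource": "arn:aws:logs:*:*:*"},
        {"Effect": "Allow", "Action": "ec2:Describe*", "Resource": "*"},
        {"Effect": "Allow",
         "Action": ["ec2:CreateSnapshot", "ec2:DeleteSnapshot",
                    "ec2:CreateTags", "ec2:ModifySnapshotAttribute",
                    "ec2:ResetSnapshotAttribute"],
         "Resource": ["*"]},
    ],
}

def create_snapshot_role(role_name="ebs-snapshot-lambda"):
    # boto3 is imported here so the policy documents above can be
    # inspected without AWS credentials configured.
    import boto3
    iam = boto3.client('iam')
    iam.create_role(RoleName=role_name,
                    AssumeRolePolicyDocument=json.dumps(TRUST_POLICY))
    iam.put_role_policy(RoleName=role_name, PolicyName=role_name,
                        PolicyDocument=json.dumps(SNAPSHOT_POLICY))
```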
Create a New Lambda Function to Create the Snapshots
- Go to the Lambda console and click Get Started. Choose to create a new Lambda function.
- Click Skip on the Select blueprint page
- Give your function a name and optionally a description
- Choose Python 2.7 for the Runtime
- Paste in the following code:
import boto3
import collections
import datetime

ec = boto3.client('ec2')

def lambda_handler(event, context):
    reservations = ec.describe_instances(
        Filters=[
            {'Name': 'tag-key', 'Values': ['backup', 'Backup']},
        ]
    ).get(
        'Reservations', []
    )

    instances = sum(
        [
            [i for i in r['Instances']]
            for r in reservations
        ], [])

    print "Found %d instances that need backing up" % len(instances)

    to_tag = collections.defaultdict(list)

    for instance in instances:
        try:
            retention_days = [
                int(t.get('Value')) for t in instance['Tags']
                if t['Key'] == 'Retention'][0]
        except IndexError:
            retention_days = 7

        for dev in instance['BlockDeviceMappings']:
            if dev.get('Ebs', None) is None:
                continue
            vol_id = dev['Ebs']['VolumeId']
            print "Found EBS volume %s on instance %s" % (
                vol_id, instance['InstanceId'])

            snap = ec.create_snapshot(
                VolumeId=vol_id,
            )

            to_tag[retention_days].append(snap['SnapshotId'])

            print "Retaining snapshot %s of volume %s from instance %s for %d days" % (
                snap['SnapshotId'],
                vol_id,
                instance['InstanceId'],
                retention_days,
            )

    for retention_days in to_tag.keys():
        delete_date = datetime.date.today() + datetime.timedelta(days=retention_days)
        delete_fmt = delete_date.strftime('%Y-%m-%d')
        print "Will delete %d snapshots on %s" % (len(to_tag[retention_days]), delete_fmt)
        ec.create_tags(
            Resources=to_tag[retention_days],
            Tags=[
                {'Key': 'DeleteOn', 'Value': delete_fmt},
            ]
        )
- Under the code editor section choose the new role you
created from the Role drop-down.
- Click Next
- Click Create function.
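One subtle piece of the code above is how per-instance retention is resolved: the list comprehension pulls the Retention tag's value, and the IndexError fallback supplies the seven-day default when no such tag exists. The same logic can be checked locally against hand-built instance records shaped like describe_instances() output:

```python
def resolve_retention(instance):
    # Mirrors the try/except block in the Lambda function above.
    try:
        return [int(t.get('Value')) for t in instance['Tags']
                if t['Key'] == 'Retention'][0]
    except IndexError:
        return 7  # default when no Retention tag is present

tagged = {'Tags': [{'Key': 'Backup', 'Value': 'Backup'},
                   {'Key': 'Retention', 'Value': '14'}]}
untagged = {'Tags': [{'Key': 'Backup', 'Value': 'Backup'}]}

print(resolve_retention(tagged))    # 14
print(resolve_retention(untagged))  # 7
```

Note that only IndexError is caught, so a non-numeric Retention value would raise a ValueError rather than fall back to the default.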
Go into the EC2 console and add a tag to any EC2 instances that will be included in the backup.
Simply create a new tag on any instance(s) you would like to include in the backup. Enter Backup for both the Key and the Value, and the script will create snapshots of any attached EBS volumes.
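The same tagging can be done with boto3 instead of the console. A sketch, with the instance ID as a placeholder and an optional Retention override thrown in:

```python
def backup_tags(retention_days=None):
    # The Backup tag the snapshot function filters on, plus an optional
    # Retention override (in days).
    tags = [{'Key': 'Backup', 'Value': 'Backup'}]
    if retention_days is not None:
        tags.append({'Key': 'Retention', 'Value': str(retention_days)})
    return tags

def tag_instance(instance_id, retention_days=None):
    # boto3 imported lazily so backup_tags() can be used without AWS access.
    import boto3
    ec2 = boto3.client('ec2')
    ec2.create_tags(Resources=[instance_id], Tags=backup_tags(retention_days))

# Example call (instance ID is hypothetical):
# tag_instance('i-0123456789abcdef0', retention_days=14)
```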
Test the Lambda Function
- Go back to the Lambda console and click the function
you created earlier.
- Click the Test button. Leave “Hello World” selected for the Sample event template and click Save and test. If you go back to the EC2 console and click Snapshots, you should see a snapshot in the process of being created.
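If you prefer to trigger a test run from a script rather than the console, the function can also be invoked with boto3. A sketch, assuming credentials are configured and the function name matches whatever you chose earlier:

```python
import json

def invoke_snapshot_function(function_name):
    # 'RequestResponse' waits for the function to finish; the payload can
    # be an empty event since the handler ignores its event argument.
    import boto3
    client = boto3.client('lambda')
    response = client.invoke(
        FunctionName=function_name,
        InvocationType='RequestResponse',
        Payload=json.dumps({}),
    )
    return response['StatusCode']  # 200 on a successful synchronous call
```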
Create Another Function to Delete Old Backups
Create another Lambda function like we did in the “Create a New Lambda Function to Create the Snapshots” section of this how-to, but use the following code for step 5 instead. Replace the “12345” in account_ids = ['12345'] with your actual AWS account number (found on the My Account page of the AWS console).
import boto3
import datetime

ec = boto3.client('ec2')

"""
This function looks at *all* snapshots that have a "DeleteOn" tag containing
the current day formatted as YYYY-MM-DD. This function should be run at
least daily.
"""

"""
To get your account id, run this snippet:

> import boto3
> iam = boto3.client('iam')
> print iam.get_user()['User']['Arn'].split(':')[4]
"""

account_ids = ['12345']

def lambda_handler(event, context):
    delete_on = datetime.date.today().strftime('%Y-%m-%d')
    filters = [
        {'Name': 'tag-key', 'Values': ['DeleteOn']},
        {'Name': 'tag-value', 'Values': [delete_on]},
    ]
    snapshot_response = ec.describe_snapshots(OwnerIds=account_ids, Filters=filters)
    for snap in snapshot_response['Snapshots']:
        print "Deleting snapshot %s" % snap['SnapshotId']
        ec.delete_snapshot(SnapshotId=snap['SnapshotId'])
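The deletion function keys entirely off the DeleteOn tag: it formats today's date and filters snapshots whose tag value matches. The filter construction is easy to verify locally by injecting a fixed date:

```python
import datetime

def deletion_filters(today=None):
    # Mirrors the filters built in lambda_handler above; 'today' is
    # injectable so the logic can be checked without relying on the clock.
    if today is None:
        today = datetime.date.today()
    delete_on = today.strftime('%Y-%m-%d')
    return [
        {'Name': 'tag-key', 'Values': ['DeleteOn']},
        {'Name': 'tag-value', 'Values': [delete_on]},
    ]

print(deletion_filters(datetime.date(2016, 1, 15))[1]['Values'])  # ['2016-01-15']
```

Because the tag value is written by the snapshot function as today-plus-retention in the same YYYY-MM-DD format, the two functions agree on exactly which day each snapshot becomes eligible for deletion.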
Adjusting Retention
Retention duration by default is seven days. You can change this by modifying the snapshot creation code in the “Create a New Lambda Function to Create the Snapshots” section of this how-to. The specific line you are looking for reads retention_days = 7.
Scheduling the Lambda Functions
- Lastly, we need to schedule our two new functions to run daily (or however often you would like). Select one of the functions and choose the Event sources tab.
- Click Add event source
- Choose Scheduled Event and give it a name
- Choose cron(0 7 * * ? *) for the Schedule expression. This example runs the function at 07:00 UTC every day; adjust it to run at a time of your choosing. See this link for more information on how to specify a cron expression.
- Click Submit. The function will now run on a schedule
you specify. Repeat these steps for the other function.
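For reference, the scheduled-event cron expression has six fields: minutes, hours, day-of-month, month, day-of-week, and year, so cron(0 7 * * ? *) fires at 07:00 UTC every day. A tiny helper (hypothetical, just for illustration) makes the field layout explicit:

```python
FIELDS = ['minutes', 'hours', 'day-of-month', 'month', 'day-of-week', 'year']

def parse_schedule(expression):
    # e.g. 'cron(0 7 * * ? *)' -> {'minutes': '0', 'hours': '7', ...}
    values = expression[len('cron('):-1].split()
    assert len(values) == len(FIELDS), "expected six cron fields"
    return dict(zip(FIELDS, values))

sched = parse_schedule('cron(0 7 * * ? *)')
print(sched['hours'])    # 7
print(sched['minutes'])  # 0
```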
That’s it. You should now have daily backups of your instance(s) based on the schedule you specified. If you would like more frequent backups, just change the first function you created to run more frequently.