Backup AWS CodeCommit
- Oleksii Grudev
- Oct 25, 2022
Problem statement
Having a full backup of all CodeCommit repositories in an account is always a good idea: it protects against accidental repository deletion, an attacker compromising the account, and similar incidents. Unfortunately, AWS Backup does not currently support CodeCommit natively. Amazon provides a solution that uses CodeBuild and EventBridge.
The downside of that solution is that a backup of a repository is triggered only when a commit is made to that particular repository. If the account contains rarely used repositories, a long time may pass before they are backed up. The solution below backs up all repositories in the account on a schedule.
Solution design
The solution uses the same components as the AWS solution (CodeBuild and EventBridge).

EventBridge triggers the CodeBuild project on a schedule. CodeBuild takes its source from a separate CodeCommit repository that holds the backup files and runs a script that goes through all CodeCommit repositories, clones each one with all of its branches, creates a compressed archive (tar.gz), and uploads it to an S3 bucket.
The “buildspec.yml” file:
version: 0.2
phases:
  install:
    commands:
      - apt-get update -y
      - apt-get install -y jq
  build:
    commands:
      - chmod +x backup_codecommit.sh
      - ./backup_codecommit.sh
The “backup_codecommit.sh” script:
#!/bin/bash
set -ex

# CodeCommitBackupsS3BucketPrefix, AwsRegion and AwsAccountId are passed in as CodeBuild environment variables
backup_s3_bucket_prefix="${CodeCommitBackupsS3BucketPrefix:-my-s3-bucket}"
# Region and Account ID
aws_region="${AwsRegion:-us-east-1}"
aws_account_id="${AwsAccountId:-00000000}"
# Bucket name follows the same pattern as BucketName in the CloudFormation template
backup_s3_bucket="${backup_s3_bucket_prefix}-${aws_account_id}-${aws_region}"

git config --global credential.helper '!aws codecommit credential-helper $@'
git config --global credential.UseHttpPath true

# Read all repository names, one per line, into an array
mapfile -t repos < <(aws codecommit list-repositories --region "${aws_region}" | jq -r '.repositories[].repositoryName')

for codecommitrepo in "${repos[@]}"
do
    echo "[===== Cloning repository: ${codecommitrepo} =====]"
    git clone --mirror "https://git-codecommit.${aws_region}.amazonaws.com/v1/repos/${codecommitrepo}" "${codecommitrepo}/.git"
    cd "${codecommitrepo}"
    # Turn the mirror into a regular repository so that branches can be checked out
    git config --bool core.bare false
    # List branch names only; the output of "git branch --all" also contains the "*" marker
    for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
        git checkout -f "${branch}"
    done
    cd ..
    dt=$(date -u '+%Y_%m_%d_%H_%M')
    zipfile="${codecommitrepo}_backup_${dt}_UTC.tar.gz"
    echo "Compressing repository: ${codecommitrepo} into file: ${zipfile} and uploading to S3 bucket: ${backup_s3_bucket}/${codecommitrepo}"
    tar -zcvf "${zipfile}" "${codecommitrepo}/"
    aws s3 cp "${zipfile}" "s3://${backup_s3_bucket}/${aws_account_id}/${aws_region}/${codecommitrepo}/${zipfile}" --content-type application/x-gzip --region "${aws_region}"
    rm "${zipfile}"
    rm -rf "${codecommitrepo}"
done
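The object key layout used by the upload step is easier to see in isolation. The sketch below is a hypothetical helper, `backup_s3_uri` (not part of the original script), that assembles the same destination URI from the bucket prefix, account ID, region, repository name and archive name. The account ID, repository name and timestamp are placeholder values.

```shell
#!/bin/bash
# Hypothetical helper (not part of backup_codecommit.sh): prints the S3 URI
# that the script uploads an archive to.
# Arguments: bucket prefix, account ID, region, repository name, archive name.
backup_s3_uri() {
  printf 's3://%s-%s-%s/%s/%s/%s/%s\n' \
    "$1" "$2" "$3" \
    "$2" "$3" "$4" "$5"
}

# Example with placeholder values
backup_s3_uri "amd-backup-codecommit-results" "111122223333" "us-east-1" \
  "my-repo" "my-repo_backup_2022_10_25_09_10_UTC.tar.gz"
# prints: s3://amd-backup-codecommit-results-111122223333-us-east-1/111122223333/us-east-1/my-repo/my-repo_backup_2022_10_25_09_10_UTC.tar.gz
```

Because the timestamp is part of the archive name, every run adds a new object under the repository's prefix instead of overwriting the previous backup.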
Full CloudFormation template:
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  RulePrefix:
    Type: "String"
    Default: "amd"
  CodeCommitBackupsS3BucketPrefix:
    Type: "String"
    Description: "S3 Bucket prefix for CodeCommit repository backups"
    Default: "amd-backup-codecommit-results"
  CodeCommitSourceRepoName:
    Type: "String"
    Description: "CodeCommit source repo name"
    Default: "amd-backup-codecommit-source"
  CodeCommitSourceRepoRegion:
    Type: "String"
    Description: "CodeCommit source repo region"
    Default: "us-east-1"
  BackupScriptsFile:
    Type: "String"
    Description: "Compressed file containing backup scripts and buildspec"
    Default: "codecommit_backup_scripts.zip"
  BackupSchedule:
    Type: "String"
    Description: "Backup schedule as a cron expression"
    Default: "cron(10 09 * * ? *)"
Resources:
  S3ResultBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub '${CodeCommitBackupsS3BucketPrefix}-${AWS::AccountId}-${AWS::Region}'
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: aws:kms
              KMSMasterKeyID: !Sub 'arn:aws:kms:${AWS::Region}:${AWS::AccountId}:alias/aws/s3'
      LifecycleConfiguration:
        Rules:
          - NoncurrentVersionExpirationInDays: 30
            Status: Enabled
      Tags:
        - Key: "backup"
          Value: "true"
      VersioningConfiguration:
        Status: Enabled
  CodeBuildProjectRole:
    Type: "AWS::IAM::Role"
    Properties:
      RoleName: !Sub '${RulePrefix}-CodeBuildProjectRoleForCodecommitBackup-${AWS::Region}'
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service:
                - "codebuild.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        - PolicyName: "codecommit-readonly"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: "Allow"
                Action:
                  - "codecommit:BatchGet*"
                  - "codecommit:Get*"
                  - "codecommit:Describe*"
                  - "codecommit:List*"
                  - "codecommit:GitPull"
                Resource: "*"
        - PolicyName: "logs"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: "Allow"
                Action:
                  - "logs:CreateLogGroup"
                  - "logs:CreateLogStream"
                  - "logs:PutLogEvents"
                Resource: "*"
        - PolicyName: "s3-backup"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: "Allow"
                Action:
                  - "s3:PutObject"
                Resource:
                  - !Sub "arn:aws:s3:::${CodeCommitBackupsS3BucketPrefix}-${AWS::AccountId}-${AWS::Region}/*"
  CodeBuildProject:
    Type: AWS::CodeBuild::Project
    Properties:
      Name: !Sub '${RulePrefix}-CodeCommitBackup-${AWS::Region}'
      Description: CodeBuild will back up all CodeCommit repositories in this region
      ServiceRole: !GetAtt CodeBuildProjectRole.Arn
      Artifacts:
        Type: NO_ARTIFACTS
      Environment:
        Type: LINUX_CONTAINER
        ComputeType: BUILD_GENERAL1_MEDIUM
        Image: aws/codebuild/python:3.5.2
        EnvironmentVariables:
          - Name: CodeCommitBackupsS3BucketPrefix
            Value: !Ref CodeCommitBackupsS3BucketPrefix
          - Name: AwsRegion
            Value: !Ref AWS::Region
          - Name: AwsAccountId
            Value: !Ref AWS::AccountId
      Source:
        Type: CODECOMMIT
        Location: !Join
          - ''
          - - 'https://git-codecommit.'
            - !Ref 'CodeCommitSourceRepoRegion'
            - '.amazonaws.com/v1/repos/'
            - !Ref 'CodeCommitSourceRepoName'
      TimeoutInMinutes: 60
  EventRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: events.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: events-codebuild
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - codebuild:StartBuild
                Resource: !GetAtt CodeBuildProject.Arn
      RoleName: !Sub '${RulePrefix}-event-role-backup-codebuild-${AWS::Region}'
  CodeCommitBackupScheduledRule:
    Type: "AWS::Events::Rule"
    Properties:
      Description: "Scheduled rule for CodeCommit backups"
      ScheduleExpression: !Ref BackupSchedule
      State: "ENABLED"
      Targets:
        - Arn: !GetAtt CodeBuildProject.Arn
          Id: !Sub '${RulePrefix}-CodeCommitBackup'
          RoleArn: !GetAtt EventRole.Arn
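The `BackupSchedule` parameter takes an EventBridge cron expression, which has six fields (minutes, hours, day-of-month, month, day-of-week, year) and is evaluated in UTC. The default `cron(10 09 * * ? *)` therefore runs the backup every day at 09:10 UTC. The following sketch uses a hypothetical helper, `describe_cron`, only to label the fields of such an expression. It does not validate the expression.

```shell
#!/bin/bash
# Hypothetical helper: label the six fields of an EventBridge cron expression.
describe_cron() {
  local body
  # Strip the surrounding "cron(" and ")"
  body="${1#"cron("}"
  body="${body%")"}"
  # Disable globbing so that "*" and "?" are not expanded as file name patterns
  set -f
  # Split the expression on whitespace into the positional parameters
  set -- ${body}
  set +f
  printf 'minutes=%s hours=%s day-of-month=%s month=%s day-of-week=%s year=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}

describe_cron 'cron(10 09 * * ? *)'
# prints: minutes=10 hours=09 day-of-month=* month=* day-of-week=? year=*
```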
The buildspec and the bash script are packaged in a zip archive, and the archive is placed in the source CodeCommit repository. The name of the repository and the name of the zip archive are set in the CloudFormation template parameters.
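Deploying the template could look roughly like the sketch below. It is only an illustration: the function name `deploy_backup_stack`, the template file name `codecommit-backup.yml` and the stack name `codecommit-backup` are assumptions, not names from the original solution. The parameter values shown are the template defaults.

```shell
#!/bin/bash
# Sketch only: deploy the backup stack from a local copy of the template.
# Function name, template file name and stack name are placeholders.
deploy_backup_stack() {
  local template_file="${1:-codecommit-backup.yml}"
  local stack_name="${2:-codecommit-backup}"
  # CAPABILITY_NAMED_IAM is required because the template sets explicit RoleName values.
  aws cloudformation deploy \
    --template-file "${template_file}" \
    --stack-name "${stack_name}" \
    --capabilities CAPABILITY_NAMED_IAM \
    --parameter-overrides \
      "CodeCommitSourceRepoName=amd-backup-codecommit-source" \
      "BackupSchedule=cron(10 09 * * ? *)"
}

# Usage (requires the AWS CLI and credentials for the target account):
#   deploy_backup_stack codecommit-backup.yml codecommit-backup
```

Once the stack exists, a backup can also be started without waiting for the schedule by running `aws codebuild start-build` with the project name. With the default parameters in `us-east-1`, that name is `amd-CodeCommitBackup-us-east-1`.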
Conclusion
With the automation above, all CodeCommit repositories in a given region are backed up on a schedule, without relying on commits being made to each repository as the AWS-provided solution does.