Mastering Emergency Access: How to Configure a Breakglass Fargate Docker Container for Root-Level EFS Access
Navigate through crisis: A comprehensive guide to seamlessly access your entire EFS Filesystem in emergency scenarios for investigative purposes.
Data is one of an organization's most valuable assets, so protecting your files and guaranteeing access to them in times of crisis is paramount. Amazon Elastic File System (EFS) offers organizations a scalable, cloud-native file storage solution that integrates with other AWS services. However, even with the most advanced systems, the risk of access disruptions due to technical failures or security breaches remains real. This brings us to the concept of a "breakglass" scenario: a situation where standard access protocols are ineffective or compromised, and an alternative method is needed to gain immediate, secure access to your data. Enter the breakglass EFS Fargate Docker container, a specialized emergency investigation tool that gives you quick access to the whole filesystem, including the parts that are normally locked down and exposed only as segregated views. This blog post covers the why and how of setting up such a container, so that when the unexpected occurs you are ready not just to react, but to inspect and remediate any issues affecting your vital data.
The Why: The Importance of a Breakglass Container in Crisis Management
Imagine this: Your primary access controls have failed, or a configuration mishap has left your EFS data unreachable through standard means. The clock is ticking, operational downtime is costing you, and the pressure is mounting. This is where a breakglass EFS Fargate Docker container comes into play. It's your emergency access point, a pre-configured, secure backdoor that allows you to bypass normal access protocols safely and access your raw data as it is directly stored on the filesystem. By spinning up this ephemeral container which has the root of your EFS filesystem mounted, you can quickly investigate and remediate any inconsistent data that is bringing down your application.
Advantages:
Immediate Access: When every second counts, having a container ready to deploy ensures you can access your data without the delay of configuring access from scratch. This can be the difference between adhering to your RTO (Recovery Time Objective) and still trying to gain access while it sails past you.
Secure Yet Flexible: Designed with the principle of least privilege in mind, it provides just enough access to manage the crisis without exposing your system to further risk. No task runs by default: only the task definition and a service scaled to zero exist, so a container is running only when it is truly needed.
Valuable Investigation Tool: This tool is not reserved for the worst-case scenario; it can also be used for purely investigative purposes.
The What: A Step-by-Step Guide to Building a Breakglass Container
To showcase this concept, I will deploy the EFS Breakglass container service within a minimally configured AWS VPC with an EFS filesystem. To apply this to your own ecosystem, you only need to copy the AWS ECS EFS Util Service. The Breakglass service has a desired count of 0, so there are no running containers; if access is needed, the count can be set to 1. Once a container boots and stabilizes, we can leverage AWS ECS Exec, which I already covered in a previous post here, to land in an interactive shell within the container with the root of the EFS mounted in /mnt/efs.
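The scale-up and scale-down cycle described above can be sketched with the AWS CLI. This is a minimal sketch rather than the tooling used in this post: the cluster and service names are placeholders supplied by the caller, and the DRY_RUN switch is an illustrative addition that prints the commands instead of running them.

```sh
#!/bin/sh
# Minimal sketch: toggle a breakglass ECS service between 0 and 1 tasks.
# Cluster and service names are placeholders supplied by the caller.

# Run a command, or only print it when DRY_RUN=1 (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Start one task and block until the service reports a steady state.
# $1 = cluster name, $2 = service name
breakglass_up() {
  run aws ecs update-service --cluster "$1" --service "$2" \
    --desired-count 1 --query service.desiredCount --output text
  run aws ecs wait services-stable --cluster "$1" --services "$2"
}

# Return the service to cold standby (no running tasks).
# $1 = cluster name, $2 = service name
breakglass_down() {
  run aws ecs update-service --cluster "$1" --service "$2" \
    --desired-count 0 --query service.desiredCount --output text
}
```

Setting DRY_RUN=1 before calling breakglass_up is a cheap way to review the exact commands before running them against a real cluster. Remember to call breakglass_down once the investigation is over so the service returns to zero tasks.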
The high-level architectural diagram of what we are going to build is as follows:
Without further ado, let’s start building it.
Prerequisites
Before we begin, make sure you have the following prerequisites installed and configured:
AWS CLI [install guide]
AWS CDK [install guide]
Node.js and npm [install guide]
Java [install guide]
Although the IaC will be written in AWS CDK using Java, the fastest and easiest way to bootstrap the project is by leveraging the cdk CLI directly from npm.
Step 1: Initialize the CDK App
First, we need to initialize a new CDK app in your preferred language; for this series, I will go with Java. Let’s open up a terminal and run the following commands to get started:
mkdir -p efs-escape-hatch
cd efs-escape-hatch
npx cdk init app --language=java
This will create a new CDK app in Java with the following structure:
Step 2: Define the VPC Stack
In this skeleton configuration, the VPC is simple and straightforward. We could use isolated subnets with no egress beyond the VPC Endpoints for ECR (to pull container images), S3 (ECR stores container image layers in AWS-owned S3 buckets) and SSM (ECS Exec relies on SSM to provide an interactive terminal into the Fargate container, more on that later). However, to keep this as simple as possible, we will use the generic configuration of public subnets plus private subnets with egress.
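If you do opt for isolated subnets, the endpoint list above maps onto concrete VPC endpoint service names. The following is a minimal sketch with a hypothetical helper name. It assumes ssmmessages is the Systems Manager endpoint that ECS Exec sessions go through, and it leaves out extras such as a CloudWatch Logs endpoint, which you would also need if your tasks use the awslogs driver.

```sh
#!/bin/sh
# Print the VPC endpoint service names an isolated-subnet Fargate task
# would need for ECR image pulls and ECS Exec.
# $1 = AWS region, e.g. eu-west-1
required_endpoints() {
  region="$1"
  # ecr.api and ecr.dkr: interface endpoints for ECR
  # s3: gateway endpoint, used to download image layers
  # ssmmessages: interface endpoint used by ECS Exec sessions
  for svc in ecr.api ecr.dkr s3 ssmmessages; do
    echo "com.amazonaws.${region}.${svc}"
  done
}
```

The output can be compared against what a VPC already has by listing the ServiceName values returned by aws ec2 describe-vpc-endpoints for that VPC.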
Step 3: Define the ECS Stack
As we aim to have a modular design, in which the ECS stack may very well be maintained by a completely different team, we will define the ECS cluster in its own dedicated stack, with ECS execute command enabled, in a straightforward manner as follows:
Step 4: Define the EFS Stack
As the purpose of this exercise is to access an EFS file system, we will create a simple one now. The only added complexity for the purpose of the demo is initializing the EFS file system with some folders. In this case, the EFS file system backs a MySQL / RabbitMQ cluster combo, where both clusters share the same EFS. We use AWS CDK Custom Resources to trigger a Lambda that mounts the EFS via an Access Point and runs a simple inline Node.js snippet to create the /mysql/data and /rabbitmq/data directories.
Step 5: Define the EFS Breakglass Stack
And now the crux of this article: the service that is in cold standby by default, with 0 active containers, waiting for a signal to be started and mount the EFS filesystem. Once it is running, we can connect to it via AWS ECS Exec and land with a shell directly in the container.
Step 6: Define the ECS Exec utility script
The final setup step is to write a lightweight shell script that lets us quickly leverage ECS Exec and land in an interactive shell inside the busybox container. The script is written in such a way that it can be reused to land in any container that respects the container naming convention, across any environment name, although for this demo we only have the DEMO environment.
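A minimal sketch of such a script is shown below. The naming convention in it (cluster ENV-Cluster, service ENV-SERVICE, container SERVICE) is a hypothetical stand-in and should be adjusted to whatever convention your own resources follow. It also assumes the AWS CLI and the Session Manager plugin are installed locally.

```sh
#!/bin/sh
# ecs-exec.sh (sketch): open an interactive shell in a running ECS container.
# Usage: ./ecs-exec.sh <ENV> <SERVICE>    e.g. ./ecs-exec.sh DEMO EFS-Util

# Hypothetical naming convention; adjust to match your own resources.
cluster_name()   { echo "$1-Cluster"; }   # $1 = environment
service_name()   { echo "$1-$2"; }        # $1 = environment, $2 = service
container_name() { echo "$2"; }           # $1 = environment, $2 = service

main() {
  cluster=$(cluster_name "$1")
  service=$(service_name "$1" "$2")
  container=$(container_name "$1" "$2")

  # Pick the first running task that belongs to the service.
  task_arn=$(aws ecs list-tasks --cluster "$cluster" \
    --service-name "$service" --desired-status RUNNING \
    --query 'taskArns[0]' --output text) || return 1

  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "No running task for $service; set its desired count to 1 first." >&2
    return 1
  fi

  # busybox ships /bin/sh rather than bash.
  aws ecs execute-command --cluster "$cluster" --task "$task_arn" \
    --container "$container" --interactive --command "/bin/sh"
}

# Only run when both arguments are supplied.
if [ "$#" -eq 2 ]; then
  main "$1" "$2"
else
  echo "usage: ecs-exec.sh <ENV> <SERVICE>" >&2
fi
```

Keeping the name derivation in three small functions means a change to the convention touches one place only, and the same script works unchanged for every environment that follows it.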
As we can see in the above script, we can reap many benefits from adhering to a standardized and predictable naming format across the resources on our entire estate.
Step 7: Defining the CDK entrypoint
The final step in the AWS CDK setup is to define the entry class for the entire application as follows:
Step 8: Deploying the stacks
We can deploy all of the above stacks in one go with the following command:
npx cdk deploy --all -c skipDependencies=false
Once the Breakglass service has been scaled to a desired count of 1 and its container is running, we can finally use ECS Exec and start browsing our EFS file system at will with the following:
./ecs-exec.sh DEMO EFS-Util
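Once inside the container, a quick overview of the top of the filesystem is a useful first step. The helper below is a minimal sketch with a hypothetical function name. It assumes a busybox-style shell and the /mnt/efs mount point used in this post, and it prints numeric owners because EFS permissions are plain POSIX UIDs and GIDs that the container is unlikely to have names for.

```sh
#!/bin/sh
# List directories up to two levels below the EFS root, showing mode and
# numeric owner/group for each one.
# $1 = root directory to inspect (defaults to /mnt/efs)
efs_overview() {
  root="${1:-/mnt/efs}"
  find "$root" -mindepth 1 -maxdepth 2 -type d | sort |
    while IFS= read -r dir; do
      ls -ldn "$dir"
    done
}
```

With the demo layout from Step 4, running efs_overview with no arguments should list the mysql and rabbitmq directories along with their data subdirectories, which makes an unexpected owner or mode easy to spot.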
Step 9: Cleanup
Now that we have completed this exercise, we can delete all the resources to ensure no nefarious bill hits us at the end of the month!
npx cdk destroy --all
Conclusion: A Stitch in Time
The adage "a stitch in time saves nine" holds particularly true in the realm of IT and data management. You're not just setting up a technical solution; you're laying the foundation for countering the unexpected.
In conclusion, the implementation of a breakglass EFS Fargate Docker container is not merely a technical exercise; it is a critical step towards ensuring operational resilience in the face of unforeseen challenges. By preparing for emergency access scenarios, organizations can significantly reduce the impact of access disruptions, safeguard their data integrity, and maintain business continuity with confidence.
A breakglass container helps you meet, or even beat, your RTO while staying calm and collected. In a world where the unexpected is the only certainty, such preparedness is not just beneficial; it's essential for business continuity and client confidence.
Let this guide serve as a reminder of the power of preparation and the critical role of emergency access solutions in building a robust, resilient IT infrastructure. I encourage you to take the necessary steps to implement this breakglass scenario solution within your organization. By doing so, you're not just protecting your data; you're safeguarding the future of your business.
Equip your organization with the tools and strategies to navigate these uncertainties with confidence. Operational resilience is not just about surviving the storm; it's about thriving in the aftermath. Prepare today to ensure your organization's resilience for tomorrow.
As always, you can find the full source for this project on my GitHub repository here.