Amazon AWS Archives - {5} Setfive - Talking to the World

Amazon Web Services recently published an E-Book on modern application development. In short, this guide explains the significance of digital transformation and how it can reinvent how your business delivers value. The main topics covered include: Digital Innovators, Characteristics of Modern Applications, Data Management & Computing in Modern Applications, and Security & Compliance. Below, I have summarized a few takeaways from each topic.

Digital Innovators

To be a digital innovator, you must work backwards to understand that innovation starts with your customers and listening to their wants and needs. AWS calls this process the “innovation flywheel.” The innovation flywheel consists of three steps: listen, experiment, iterate. After putting your customers first, it is essential to put technology at the center of your business. Some ways to do this are through digital marketplaces (two sided market that connects buyers and sellers,) direct-to-customer engagement, digital products as services, and insight services.

Characteristics of Modern Applications

Modern application development is a powerful approach to designing, building, and managing software in the cloud. Characteristics of Modern Applications align with digital innovation (see above.) Modern Applications require a culture of ownership, which also starts with the customers. To create this culture, companies should hire builders and support them with a belief system and let them build. It is important to trust in others skill sets and know where your boundaries lie. In terms of the architectural patterns of modern applications, most are micro-services. Micro-services have minimal function services, are deployed separately but interact together, each has its own datastore, is organized around business capabilities, the state is externalized, and provides a choice of technology for each micro-service.

Data Management & Computing in Modern Applications

Data management refers to purpose built databases that serve as decoupled data stores. Data management includes computing in modern applications. Computing with micro-services effect the way you package and run code, and compute in modern applications such as AWS Lambda. Release pipelines in AWS are standardized and automated. This means that they are no longer manual, there is continuous integration and continuous delivery. Also, there is a server-less operational model. These models are ideal for high-growth companies that want to innovate quickly because they don’t require server management, they provide flexible scaling, you pay for the value you need, and they automate high availability.

Security & Compliance

Security configuration and automation are needed. To ensure security and compliance, these practices are incorporated within the tooling. Some of this tooling includes code repositories, build-management programs, and deployment tools. Security and compliance are also applied to the release pipeline itself and the software being released through the pipeline. Lastly, DevOps and DevSecOps safeguard security and compliance. AWS defines DevOps as, “the combination of cultural philosophies, practices, and tools that increases an organization’s ability to deliver applications and services at high velocity.” Similarly, DevSecOps is described as “philosophy of integrating security practices within the DevOps process. DevSecOps involves creating a “Security as Code” culture with ongoing, flexible collaboration between release engineers and security teams.”

I hope that you found these summary useful. We will continue to try to summarize AWS content so that you don’t have to read it or navigate demo vids / webinars. Like what you read? Check out our blog post on why AWS is so cool: https://shout.setfive.com/2019/07/11/what-makes-the-aws-cloud-so-cool/.

We watched an AWS Cloud webinar, so, you don’t have to. (But if you want to, it’s attached below.) Here are are three aspects of the AWS cloud that make it both unique and useful, particularly for ISVs (Independent Software Vendors.)

Architectural Design helps ISV’s customers focus on which services are being used, understand how scalability is managed, and preview how an application looks in the cloud. The priority of the architectural design is to ensure:
- High performance / ability
- Dynamic scalability
- Adherence to AWS best practices
- Flexibility to automate app deployment – using CloudFormation templates, AWS Marketplace, and more
Security & Compliances allow ISV’s to provide their customers with risk assessment and mitigation methodologies to strengthen their security posture on AWS. These features supply customers with the documentation to explore how application deployment in AWS meets requirements and reduces a barrier to entry.
- Architectural security strategy
- Frameworks for compliance
- Policy and controls mapping
Cost Optimization assists ISV’s in offering more cost-effective services to their customers. This aspect persuades customers to adopt more solutions by ensuring that software properly utilizes AWS. With increased elasticity, and a minimized resource consumption, customers are facilitated in making decisions to move forward with an application.
- Minimizes idle resources
- Properly right-sizing infrastructure
- Helps avoid common architecture pitfalls

At the beginning of last year we tried something different and deployed an application on Amazon’s Elastic Container Service (ECS). This is in contrast to our normal approach of deploying applications directly on EC2 instances. So from a devops perspective using ECS involved two challenges, working with Docker containers and using ECS as a service. We’ll focus on ECS here since enough ink has been spilled about Docker.

For a bit of background, ECS is a managed services that allows you to run Docker containers on AWS cloud infrastructure (EC2s). Amazon extended the abstraction with “Fargate for ECS” which launches your Docker containers on AWS managed EC2s so you don’t have to manage or maintain any underlying hardware. With Fargate you define a “Task” consisting of a Docker image, # of vCPUs, and amount of RAM which AWS uses to launch a Docker container on a EC2 which you don’t have any access to. And then naturally if you need to provision additional capacity you can just tick the 1 to 2 and AWS will launch an additional container.

The Setup

The app we deployed on ECS is one we inherited from another team. The app is a consumer facing Vert.x (Java) web app that provides a set of API endpoints for content focussed consumer sites. Before taking over the app we had learned that it had some “unique” (aka bugs) scaling requirements which was one of the motivators to use ECS. On the AWS side, our setup consisted of a Fargate ECS cluster connected to an application load balancer (ALB) which handled SSL termination and health checks for the ECS Tasks. In addition, we connected CircleCI for continuous integration and continuous deployment. We’re usingecs-deploy to handle CD on ECS. ecs-deploy handles creating a new ECS task definition, bringing up a new container with that definition, and cycling out the old container if everything went well. So a year in here are some takeaways of using Fargate ECS.

Reinforces cattle, not pets

There’s a cloud devops mantra that you should treat your servers like a herd of cattle, not the family pets. The thinking being that, especially for horizontally scalable cloud servers, you want the servers to be easy to bring up and not “special” in any way. When using EC2s you can convince yourself that you’re adopting this philosophy but eventually something will leak through. Sure you use a configuration management tool but during an emergency someone will surely manually install something. And your developers aren’t supposed to use the local disk but at some point someone will rely on files in “/tmp” always being there.

In contrast, deploying on ECS reinforces thinking of individual servers as disposable because the state of your container is destroyed every time you launch a new task. So each time you deploy new code a new container will be launched without retaining any previous state. At the server level, this dynamic actually makes it impossible to make ad hoc server changes since it’s impossible to SSH to your ECS so any changes would have to present in your Dockerfile. Similarly at the app level since your disks don’t persist between deployments you’d quickly stop writing anything important just to disk.

Logs are…challenging

When using Fargate on ECS the only way to access output from your container is through a CloudWatch log group through the CloudWatch UI. At first look this is great, you can view your logs right in the AWS console UI without having to SSH into any servers! But as time goes on you’ll start to miss being able to see and manipulate logs in a regular terminal.

The first stumbling block is actually the UI itself. It’s not uncommon for the UI to be a couple of minutes delayed which ends up being a significant pain point when “shit is broken”. Related to the delays, it seems like sometimes logs are available in the ECS Task view before CloudWatch which ends up being confusing to members of the team as they debugged issues.

Additionally, although the UI has search and filtering capabilities they’re fairly limited and difficult to use. Compounding this, there frustratingly isn’t an easy way to download the log files locally. This makes it difficult to use common Linux command line tools to parse and analyze logs which you’d normally be able to do. It is possible to export your CloudWatch logs to S3 via the console and then download them locally but the process involves a lot of clicks. You could automate this via the API but it feels like something you shouldn’t have to build since for example the load balancer automatically delivers logs into S3.

You’ll (probably) still need an EC2

The ECS/container dream is that you’ll be able to run all of your apps on managed, abstract infrastructure where all you have to worry about is a Dockerfile. This might be true for some people but for typical organizations you’re probably going to need to run an EC2. The biggest pain points for us were running scheduled tasks (crontabs) and having the flexibility to run ad hoc commands inside AWS.

It is possible to run scheduled tasks on ECS but after experimenting with it we didn’t think it was a great fit. Instead, the approach we took was setting up Jenkins on an EC2 to run scheduled jobs which consisted of running some command in a Docker container. So ultimately our scheduled jobs shared Docker images with our ECS tasks since the images are hosted by AWS ECR. Because of this, the same CircleCI build process that updates the ECS task will also update the image that Jenkins runs so the presence of Jenkins is mostly transparent to developers.

Not having an “inside AWS” environment to run commands is one of the most limiting aspects of using ECS exclusively. At some point an engineer on every team is going to find themselves needing to run a database dump or analyze some log files both of which will simply be orders of magnitude faster if run within AWS vs. over the internet. We took a pretty typical approach here with an EC2 configured via Ansible in an autoscale group.

Is it worth it?

After a year with ECS+Fargate I think it’s definitely a good solution for a set of deployment scenarios. Specifically, if you’re dealing with running a dynamic set of web frontends that are easily containerized it’ll probably be a great fit. The task scaling is “one click” (or API call) as advertised and it feels much snappier than bringing up a whole EC2, even with a snapshotted AMI. One final dimension to evaluate is naturally cost. ECS is billed roughly at the same rate as a regular EC2 but if you’re leaving tasks underutilized you’ll be paying for capacity you aren’t using. As noted above, there are some operational pain points with ECS but on the whole I think it’s a good option to evaluate when using AWS.

When Amazon Web Services rolled out their version 4 signature we started seeing sporadic errors on a few projects when we created pre-authenticated link to S3 resources with a relative timestamp. Trying to track down the errors wasn’t easy. It seemed that it would occur rarely while executing the same exact code. Our code was simply to get a pre-authenticated URL that would expire in 7 days, the max duration V4 signatures are allowed to be valid. The error we’d get was “The expiration date of a signature version 4 presigned URL must be less than one week”. Weird, we kept passing in “7 days” as the expiration time. After the error occurred a couple of times over a few weeks I decided to look into it.

The code throwing the error was located right in the SignatureV4 class. The error is thrown when the end timestamp minus the start timestamp for the signature was greater than a week. Looking through the way the timestamps were generated it went something like this:

Generate the start timestamp as current time for the signature assuming one is not passed.
Do a few other quick things not related to this problem.
Do a check to insure that the end minus start timestamp is less than a week in seconds.

So a rough example with straight PHP could of the above steps for a ‘7 days’ expiration would be as follows:

Straight forward enough, right? the problem lies when a second “rolls” between generating the `$start` and the end timestamp check. For example, if you generate the `$start` at `2017-08-20 12:01:01.999999`. Let’s say this gets assigned the timestamp of `2017-08-20 12:01:01`. Then the check for the 7 weeks occurs at `2017-08-27 12:01:02.0000` it’ll throw an exception as duration between the start and end it’s actually now for 86,401 seconds total. It turns outs triggering this error is easier than you’d think. Run this script locally:

That will throw an exception within a few seconds of running most likely.

After I figured out the error, the next step was to submit a issue to make sure I’m not misunderstanding how the library should be used. The simplest fix for me was to generate the end expiration timestamp before generating the start timestamp. After I made the PR, Kevin S. from AWS pointed out that while this fixed the problem, the duration still wasn’t guaranteed to always be the same for the same relative time period. For example, if you created 1000 presigned URLs all with ‘+7 days’ as the valid period, some may be 86400 in duration others may be 86399. This isn’t a huge problem, but Kevin made a great point that we could solve the problem by locking the relative timestamp for the end based on the start timestamp. After adding that to the PR it was accepted. As of release 3.32.4 the fix is now included in the SDK.

On a project we were working on recently it appeared that we had data coming into our Extract, Transform, Load (ETL) processes which should have been filtered out. In this particular case the files which we imported only would exist at max up to 7 days and on any given day we’d have tens of thousands of files that would be created and imported. This presented a difficult problem to trace down if something inside our ETL had gone awry or if we were being fed bad data. Furthermore as the files always would be deleted after importing we didn’t keep where a data point was created from.

Instead of updating our ETL process to track where a specific piece of data originated from we wanted to basically ‘grep’ the files in S3. After looking around it doesn’t look like anyone has built a “Grep for S3”, so we built one. The reason we didn’t simply download the files locally and then process them one at a time is it’d take forever to transfer, then grep each one individual sequentially. Instead we wanted to do the search in parallel and not hold the entire files on the local disk.

With this we came up with our simple S3Grep java app (a pre-built jar is located in the releases) which will search all files in a specific bucket for a specific string. It currently supports both regex or non-regex search strings. You can specify how many threads you want it to use to process the files or it by default will try to use the same number of CPU’s on your machine. It utilizes the S3 Java adapter to read the files as a stream rather than a single transfer, than read from disk. Using the tool is very simple:

A the s3grep.properties file is a config file where you setup what you are searching for. An example:

For the most part this is self explanatory. The log level will default to INFO, however if you specify DEBUG it will output some more information such as what file’s it is currently checking. The logger_pattern parameter defaults to “%d{dd MMM yyyy HH:mm:ss} [%p] %m%n” and can be any pattern you want. For more information on the formatting visit the PatternLayout Documentation.

The default output format would look something like this:

If you want a little less verbose and more of just log lines you can update the logger_pattern to be just %m%n and end up with something similar to:

The format of the output is FILE:LINE_NUMBER:matching_string.

Anyways hope this helps you if you are trying to hunt down what file contains a text string in your S3 buckets. Let us know if you have any questions or if we can help!

Category: Amazon AWS

AWS Modern Application Development E-Book

What Makes the AWS Cloud so Cool!?

AWS: Looking back on a year with AWS Elastic Container Service

The Setup

Reinforces cattle, not pets

Logs are…challenging

You’ll (probably) still need an EC2

Is it worth it?

An Edge Case of Time in AWS PHP SDK

S3Grep – Searching S3 Files and Buckets