Structured Logging with Spring Boot and Amazon CloudWatch

The primary purpose of logging in applications is to debug and trace the root causes of unexpected behavior. We take various approaches to logging, from adding ad-hoc print statements to embedding sophisticated logging libraries in our code.

Irrespective of the approach we take, logs without a consistent structure and context information are difficult to search, which makes it hard to locate the root cause of a problem.

Amazon CloudWatch is a managed monitoring and logging service that can be used as centralized log storage. It can also run queries on structured logs to extract valuable information.

In this article, we will learn how to:

  • produce structured logs from an application, using a Spring Boot application as the example
  • ingest those structured logs into Amazon CloudWatch
  • run queries on the ingested structured logs to extract useful insights about the application

Example Code

This article is accompanied by a working code example on GitHub.

What is Structured Logging

Before going further, let us understand structured logging in a bit more detail.

Structured logging means writing logs in a consistent format so that they can be treated as data rather than text. Instead of logging a line of text, we log a structured object, most often as JSON.

The JSON object is composed of fields that can give contextual information about the log event, for example:

  • the application name
  • class or method name from where the log was produced
  • invoker of the method
  • date and time of the log event

The JSON object may also include the request and response payloads of API or method calls and, optionally, the stack trace in case of errors.

This structured format lets us search the logs by applying filter, sort, and limit operations on individual fields to gain useful insights about our application.

Here is an example of a structured log:

{
  "instant": {
    "epochSecond": 1682426514,
    "nanoOfSecond": 223252000
  },
  "thread": "http-nio-8080-exec-6",
  "level": "ERROR",
  "loggerName": "***.services.AccountService",
  "message": "Account not found:: 5678000",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.log4j.spi.AbstractLogger",
  "contextMap": {
    "accountNo": "5678000"
  },
  "threadId": 43,
  "threadPriority": 5,
  "appName": "AccountsProcessor",
  "version": "release1.0"
}

In this log, we can see several pieces of contextual information, such as the thread identifier, the event time as an epoch timestamp, and the application name, in addition to the log message: Account not found:: 5678000.
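The instant field in this log stores the event time as seconds since the Unix epoch plus a nanosecond part. As a minimal sketch of treating that field as data, the two values from the example can be combined into a readable UTC timestamp with the JDK's java.time.Instant. The class and method names below are hypothetical and not part of the example application:

```java
import java.time.Instant;

// Hypothetical helper, not part of the example application
class LogInstantSketch {

    // Combines the two parts of the "instant" field into a java.time.Instant
    static Instant toInstant(long epochSecond, long nanoOfSecond) {
        return Instant.ofEpochSecond(epochSecond, nanoOfSecond);
    }

    public static void main(String[] args) {
        // Values taken from the example log above
        Instant eventTime = toInstant(1682426514L, 223252000L);
        // Instant.toString() renders an ISO-8601 timestamp in UTC
        System.out.println(eventTime);
    }
}
```

For the example event, this resolves to 25 April 2023 at 12:41:54 UTC.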

Producing Structured Logs from a Spring Boot Application

We produce structured logs in applications most often by using logging libraries in different programming languages.

Let us use a Spring Boot application for generating structured logs. We can create the initial application setup of our Spring Boot application from the Spring Boot starter and open it in our favorite IDE.

We will use the Log4j 2 library to generate structured logs. The FileAppender snippet from our Log4j configuration file, log4j2.xml, looks like this:

<File name="FileAppender" 
    fileName="/home/ec2-user/accountprocessor/logs/accountprocessor-logging-dev.log">
    <JsonLayout 
        complete="false" 
        compact="true" 
        eventEol="true" 
        properties="true"  >
        <KeyValuePair 
            key="appName" 
            value="AccountsProcessor" />
        <KeyValuePair 
            key="version" 
            value="release1.0" />
        <KeyValuePair 
            key="accountNo" 
            value="${ctx:accountNo}"/>
    </JsonLayout>
</File>

In this FileAppender, we have used JsonLayout to generate the logs in JSON format. With compact="true" and eventEol="true", each log event is written as a single JSON object on its own line, and properties="true" includes the thread context map in the output. We have also added the fields appName, version, and accountNo to give useful context to the log events.

We have also added a sample API to the application to which we will send HTTP GET requests. On receiving these requests, our application will use the Log4j configuration to produce structured logs.

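import java.util.Optional;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.ThreadContext;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// AccountService and AccountDetail are the application's own classes
@RestController
@RequestMapping("/accounts")
public class AccountInquiryController {

    private static final Logger LOG =
        LogManager.getLogger(AccountInquiryController.class);

    private final AccountService accountService;

    public AccountInquiryController(
        final AccountService accountService) {
        this.accountService = accountService;
    }

    @GetMapping("/{accountNo}")
    public AccountDetail getAccountDetails(
        @PathVariable("accountNo") String accountNo) {

        // Add accountNo to the thread context so that every log event
        // written while handling this request carries it
        ThreadContext.put("accountNo", accountNo);
        try {
            LOG.info(
                "fetching account details for account {}", accountNo);

            Optional<AccountDetail> accountDetail =
                accountService.getAccount(accountNo);

            // Log response from the service class
            LOG.info("Details of account {}", accountDetail);
            return accountDetail.orElse(
                AccountDetail.builder().build());
        } finally {
            // Clear the context even if an exception is thrown,
            // because request threads are pooled and reused
            ThreadContext.clearAll();
        }
    }
}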

Here we have added two logger statements to print the HTTP request’s path parameter accountNo and the response from the service class.

We have also added the accountNo to the ThreadContext so that all the logs written in this thread of execution will contain the accountNo field. This will allow us to correlate and group requests by the accountNo field.
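The per-thread behavior can be illustrated without Log4j. The following is a simplified stand-in that assumes a thread-local map, similar in spirit to the way ThreadContext keeps its values per thread; the class RequestContextSketch and its methods are hypothetical. It shows why the context should be cleared in a finally block: request threads are usually pooled, so a value left behind would leak into the next request handled by the same thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-thread logging context; this is not Log4j code
class RequestContextSketch {

    // Each thread gets its own map, so values never cross threads
    private static final ThreadLocal<Map<String, String>> CONTEXT =
        ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static String get(String key) {
        return CONTEXT.get().get(key);
    }

    static void clearAll() {
        CONTEXT.get().clear();
    }

    // Simulates handling one request on the current thread
    static String handle(String accountNo) {
        put("accountNo", accountNo);
        try {
            // Any code called from here can read the context value
            return "accountNo=" + get("accountNo");
        } finally {
            // Runs on success and on failure, so nothing leaks to the next request
            clearAll();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("5678000"));
        // The context is empty again once the request has been handled
        System.out.println(get("accountNo"));
    }
}
```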

When we run this application and send some requests to the endpoint http://localhost:8080/accounts/5678888, we can see the logs in the console as well as in a file. In the next section, we will run this application in an Amazon EC2 instance and send the structured logs generated by the application to Amazon CloudWatch.
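To send such a request from code rather than from a browser, a small client based on the JDK's java.net.http package could look like the sketch below. It assumes the application is running locally on port 8080 as described above; the class name AccountRequestSketch is hypothetical and not part of the example application.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical test client, not part of the example application
class AccountRequestSketch {

    // Builds a GET request for the account endpoint of the locally running app
    static HttpRequest buildRequest(String accountNo) {
        return HttpRequest.newBuilder(
                URI.create("http://localhost:8080/accounts/" + accountNo))
            .GET()
            .build();
    }

    public static void main(String[] args)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        // Sends the request; this only succeeds while the application is running
        HttpResponse<String> response = client.send(
            buildRequest("5678888"), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```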

CloudWatch Logging Concepts: Log Events, Log Streams, and Log Groups

Before sending our logs to Amazon CloudWatch, let us understand how the logs are stored and organized in CloudWatch into Log Streams and Log Groups.

Log Event: A Log Event is an activity recorded by the application. It contains a timestamp and raw event message encoded in UTF-8.

Log Streams: A log stream is a sequence of log events emitted by an AWS service or a custom application. This is how a set of log streams looks in the AWS Management Console:

[Screenshot: log streams in the AWS Management Console]

This is a snapshot of a log stream containing a sequence of log events.

Log Groups: A log group is a set of log streams that share the same retention, monitoring, and access control settings. Each log stream belongs to one log group. A set of log groups in the AWS Management Console is shown here:

[Screenshot: log groups in the AWS Management Console]

We can control how long the logs are kept by configuring the retention setting of the log group.

We can also assign metric filters to log groups to extract metric observations from ingested log events and transform them into data points in a CloudWatch metric.
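To make the idea of a metric filter concrete, here is a minimal in-memory sketch. It does not use the CloudWatch API; the LogEvent record and the errorsPerMinute method are hypothetical names. It keeps only the events whose level is ERROR and counts them per minute, which is roughly what a metric filter does when it turns matching log events into data points of a metric:

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration; CloudWatch evaluates metric filters itself
class MetricFilterSketch {

    // A simplified log event with only the fields needed here
    record LogEvent(Instant timestamp, String level, String message) {}

    // Counts ERROR events per minute, ordered by time
    static Map<Instant, Long> errorsPerMinute(List<LogEvent> events) {
        return events.stream()
            .filter(e -> "ERROR".equals(e.level()))
            .collect(Collectors.groupingBy(
                e -> e.timestamp().truncatedTo(ChronoUnit.MINUTES),
                TreeMap::new,
                Collectors.counting()));
    }

    public static void main(String[] args) {
        List<LogEvent> events = List.of(
            new LogEvent(Instant.parse("2023-04-25T12:41:54Z"), "ERROR", "Account not found"),
            new LogEvent(Instant.parse("2023-04-25T12:41:58Z"), "INFO", "fetching account details"),
            new LogEvent(Instant.parse("2023-04-25T12:42:03Z"), "ERROR", "Account not found"));
        // Each map entry is one data point: minute -> number of errors
        errorsPerMinute(events).forEach(
            (minute, count) -> System.out.println(minute + " " + count));
    }
}
```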

With these concepts in place, we will now send the structured logs produced by our Spring Boot application to CloudWatch.

Sending the Logs to Amazon CloudWatch from Amazon EC2 Instance

We will next run the Spring Boot application in an EC2 instance and ship our application logs to CloudWatch. We use the unified CloudWatch agent to collect logs from Amazon EC2 instances and send them to CloudWatch.

Creating EC2 Instance and Configuring it to Run the Spring Boot Application

We can create the EC2 instance either from the AWS Management Console or with an Infrastructure as Code tool such as Terraform, CloudFormation, or CDK.

For running our example, Terraform scripts for creating the EC2 instance are included in the source code.

We also need to install OpenJDK, an open-source implementation of the Java Platform, to run our Spring Boot application. After the EC2 instance starts up, we can use the following script to install OpenJDK on the instance:

wget https://download.java.net/***openjdk-20.0.1_linux-x64_bin.tar.gz
tar xvf openjdk*
# Use an absolute path so that JAVA_HOME stays valid after changing directories
export JAVA_HOME="$PWD/jdk-20.0.1"
export PATH="$JAVA_HOME/bin:$PATH"

We also need to attach an IAM role with the AWS managed policy CloudWatchAgentServerPolicy to the EC2 instance. This policy allows the instance to write logs to Amazon CloudWatch.

Running the Spring Boot Application

After configuring the EC2 instance, we will transfer the Spring Boot application from our local machine to the EC2 instance created in the previous step using the SCP (Secure Copy) protocol:

scp -i tf-key-pair.pem ~/Downloads/accountProcessor/target/accountProcessor-0.0.1-SNAPSHOT.jar ec2-user@3.66.165.62:/home/ec2-user/

In this scp command, we are copying the Spring Boot application JAR file, accountProcessor-0.0.1-SNAPSHOT.jar, from our local machine to the EC2 instance. We will then run this JAR file with the command:

java -jar accountProcessor-0.0.1-SNAPSHOT.jar

After the application has started, we can see the application logs in the file accountprocessor-logging-dev.log, which is configured in the FileAppender in the Log4j configuration of our application.

In the next section, we will configure the CloudWatch agent to read this file and ship the log entries to Amazon CloudWatch.

Installing and Configuring the Unified CloudWatch Agent

The Unified CloudWatch agent is available as a package in Amazon Linux 2. Let us install the CloudWatch agent by running the yum command:

sudo yum install amazon-cloudwatch-agent    

Next, we need to create a configuration file for configuring the CloudWatch agent to collect specific log files from the EC2 instance and send them to CloudWatch.

The agent configuration file is a JSON file whose main sections are agent, metrics, and logs. Together they specify the metrics and logs that the agent needs to collect. The logs section specifies which log files are published to CloudWatch Logs.

Since our Spring Boot application writes its log file to the path /home/ec2-user/accountprocessor/logs/accountprocessor-logging-dev.log, we will configure this path in the logs section of our agent configuration file.

We can create the agent configuration file either with the agent configuration file wizard or manually from scratch.

Let us start the wizard with the following command:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

In our case we will specify the log file path in the wizard:

[Screenshots: starting the wizard and entering the log file path]

In this snapshot, we can see the file path and the name of the log group specified in the wizard.

The configuration file config.json generated by the wizard looks like this:

{
    "agent": {
        "run_as_user": "ec2-user"
    },
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/home/ec2-user/accountprocessor/logs/accountprocessor-logging-dev.log",
                        "log_group_name": "accountprocessor-logging-dev.log",
                        "log_stream_name": "{instance_id}",
                        "retention_in_days": -1
                    }
                ]
            }
        }
    }
}

We can further modify this file manually to add more file paths.

After configuring the CloudWatch agent, let us start it by running the command:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json

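In this command, -a fetch-config loads the configuration file given with -c, -m ec2 tells the agent that it is running on an EC2 instance, and -s starts (or restarts) the agent so that the configuration takes effect.

Once the agent is running, it starts sending the log events to Amazon CloudWatch.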

Viewing the Application Logs in Amazon CloudWatch

We can now view the logs from our Spring Boot application in the Amazon CloudWatch console:

[Screenshots: the log group, log stream, and log events in the CloudWatch console]

Although we are logging from a single application here, CloudWatch is commonly used as a log aggregator from multiple source applications or services. This allows us to see the logs from all sources in one place as a single and consistent flow of log events ordered by time.

Running Queries on Logs with CloudWatch Logs Insights

CloudWatch Logs Insights provides a user interface and a purpose-built query language to search through the ingested log data and pick out signals for monitoring our applications.

Here we are using CloudWatch Logs Insights to find the errors that occurred in our Spring Boot application in the last hour.

[Screenshot: the query in CloudWatch Logs Insights]

We have defined a query that filters on level = 'ERROR', sorts by timestamp, and limits the results to 20. When we run the query, we get the following results:

[Screenshot: the query results in CloudWatch Logs Insights]

In the query results, we can see 6 errors from our application in the last 1 hour. We can define appropriate thresholds on fields like the number of errors within a defined interval to take proactive mitigating actions.
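The exact query text appears only in the screenshot, but a query of roughly this shape would match the description: fields @timestamp, @message | filter level = "ERROR" | sort @timestamp desc | limit 20. The descending sort order is an assumption. The same filter, sort, and limit steps can be sketched in plain Java over in-memory events to show what the query computes; the LogEvent record and the latestErrors method are hypothetical names and do not call CloudWatch:

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory equivalent of the filter, sort, and limit query
class InsightsQuerySketch {

    // A simplified log event with only the fields used by the query
    record LogEvent(long epochSecond, String level, String message) {}

    // Keeps ERROR events, newest first (assumed order), capped at 'limit' results
    static List<LogEvent> latestErrors(List<LogEvent> events, int limit) {
        return events.stream()
            .filter(e -> "ERROR".equals(e.level()))
            .sorted(Comparator.comparingLong(LogEvent::epochSecond).reversed())
            .limit(limit)
            .toList();
    }

    public static void main(String[] args) {
        List<LogEvent> events = List.of(
            new LogEvent(1682426514L, "ERROR", "Account not found:: 5678000"),
            new LogEvent(1682426520L, "INFO", "fetching account details"),
            new LogEvent(1682426530L, "ERROR", "Account not found:: 5678001"));
        latestErrors(events, 20).forEach(System.out::println);
    }
}
```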

Conclusion

Here is a list of the major points for a quick reference:

  1. Amazon CloudWatch is a managed monitoring and logging service which is used as centralized log storage.
  2. Structured logging is a methodology to log information in a consistent format that allows logs to be treated as data rather than text.
  3. We produce structured logs in applications most often by using logging libraries in different programming languages.
  4. Logs are stored and organized in CloudWatch into log streams and log groups.
  5. A log stream is a sequence of log events emitted by an AWS service or a custom application.
  6. A log group is a set of log streams that share the same retention, monitoring, and access control settings.
  7. We use the unified CloudWatch agent to collect logs from Amazon EC2 instances and send them to CloudWatch.
  8. CloudWatch Logs Insights provides a user interface and a purpose-built query language to search through log data and pick out signals for monitoring our applications.

You can refer to all the source code used in the article on GitHub.

Written By:

Pratik Das

Software Engineer, Consultant and Architect with current expertise in Enterprise and Cloud Architecture, serverless technologies, Microservices, and Devops.
