<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Avangards Blog]]></title><description><![CDATA[Explore the Avangards Blog for practical insights, expert tips, and valuable lessons derived from real-world AWS, Azure, and DevOps consulting experiences.]]></description><link>https://blog.avangards.io</link><generator>RSS for Node</generator><lastBuildDate>Sat, 18 Apr 2026 10:32:59 GMT</lastBuildDate><atom:link href="https://blog.avangards.io/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Deploying LibreChat on Amazon ECS using Terraform]]></title><description><![CDATA[Introduction
Generative AI has fundamentally shifted how we approach work, from writing and coding to research and problem-solving. For the past year, I used ChatGPT Business almost daily at work to i]]></description><link>https://blog.avangards.io/deploying-librechat-on-amazon-ecs-using-terraform</link><guid isPermaLink="true">https://blog.avangards.io/deploying-librechat-on-amazon-ecs-using-terraform</guid><category><![CDATA[AWS]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Mon, 06 Apr 2026 01:49:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/61cd4fc3bf68083702212a26/0cdbe13b-cd3d-4b31-9f7b-a2610d9342ee.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Introduction</h2>
<p>Generative AI has fundamentally shifted how we approach work, from writing and coding to research and problem-solving. For the past year, I used ChatGPT Business almost daily at work to improve my writing and research. However, I noticed limitations like fabrication and confirmation bias, so I wanted to explore how other non-OpenAI models perform. Additionally, my organization is consolidating on Microsoft 365 Copilot, which doesn't match ChatGPT's capabilities for my needs. This led me to search for a self-hosted, ChatGPT-like platform with flexibility in model choices.</p>
<p>I also needed it to be web-based for team members to access. As an AWS advocate, I wanted to leverage a <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html">diverse set of foundational models</a> that Amazon Bedrock has to offer, and to host the platform using primarily AWS services. Based on my research, the three main options are <a href="https://www.librechat.ai/">LibreChat</a>, <a href="https://openwebui.com/">Open WebUI</a>, and <a href="https://anythingllm.com/">AnythingLLM</a>. Given that LibreChat is more feature-rich, customizable, and seemingly easier to deploy, I decided to give it a try and share my experience.</p>
<p>Without further ado, let's walk through the solution architecture and how it addresses my requirements.</p>
<h2>Architecture Overview</h2>
<p>The main design principle of the solution is to be cost-effective initially while allowing flexibility to scale in the future. Minimizing cost also means reducing operational overhead, not just service charges.</p>
<p>While cramming all components into an <a href="https://docs.aws.amazon.com/lightsail/latest/userguide/what-is-amazon-lightsail.html">Amazon Lightsail</a> instance is the cheapest option, it would need to be re-architected to scale horizontally. Deploying to an <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html">Amazon EC2</a> instance provides more flexibility, but it requires manual LibreChat installation and VM management. Ultimately, I decided to take a more modern approach and adopt a componentized architecture depicted in the following diagram:</p>
<img src="https://cdn.hashnode.com/uploads/covers/61cd4fc3bf68083702212a26/82e0e077-bf6c-4325-a8f3-0f515aba1143.png" alt="LibreChat AWS solution architecture" style="display:block;margin:0 auto" />

<p>This architecture uses the following technologies:</p>
<ul>
<li><p><a href="https://www.mongodb.com/products/platform/atlas-database">MongoDB Atlas</a> - The LibreChat database runs on MongoDB Atlas using the <a href="https://www.mongodb.com/products/platform/atlas-cloud-providers/aws/pricing">Free (M0) cluster tier</a>. It's technically free, runs in AWS, and is sufficient as a starter database engine as <a href="https://www.librechat.ai/docs/configuration/mongodb/mongodb_atlas">recommended by LibreChat</a>.</p>
</li>
<li><p><a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/AWS_Fargate.html">Amazon ECS with AWS Fargate</a> - The LibreChat application runs as a container with 512 CPU units (0.5 vCPU) and 1 GB of memory on the 64-bit ARM architecture, which is sufficient when not too many LibreChat features are enabled. Secrets are stored in <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html">AWS Systems Manager (SSM) Parameter Store</a> (free), and configurations are stored in an <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html">Amazon S3</a> bucket to avoid additional shared storage services.</p>
</li>
<li><p><a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html">Application Load Balancer (ALB)</a> - The public-facing endpoint. Although there is a running cost, its simpler TLS setup and its support for scaling and <a href="https://docs.aws.amazon.com/waf/latest/developerguide/waf-chapter.html">AWS WAF</a> integration make it worthwhile.</p>
</li>
<li><p><a href="https://fck-nat.dev/stable/">fck-nat</a> - Provides NAT gateway functionality using a pair of EC2 t4g.nano instances instead of AWS-managed NAT Gateways. This significantly reduces NAT gateway data transfer charges, making it a cost-effective option for modest traffic volumes.</p>
</li>
</ul>
<p>The monthly cost of this architecture should be about $50 USD in us-east-1 including moderate Bedrock model use. To prevent surprise bills, set up a budget in <a href="https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-managing-costs.html">AWS Budgets</a> and create a cost monitor in <a href="https://docs.aws.amazon.com/cost-management/latest/userguide/getting-started-ad.html">AWS Cost Anomaly Detection</a>.</p>
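<p>As a sketch, a simple monthly budget with an alert at 80% of the limit could be defined in Terraform as follows. The $75 limit and the email address are placeholders to adjust for your situation:</p>
<pre><code class="language-hcl"># Hypothetical guardrail: alert at 80% of a $75 USD monthly budget
resource "aws_budgets_budget" "librechat_monthly" {
  name         = "librechat-monthly"
  budget_type  = "COST"
  limit_amount = "75"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  notification {
    comparison_operator        = "GREATER_THAN"
    notification_type          = "ACTUAL"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    subscriber_email_addresses = ["you@example.com"] # placeholder
  }
}
</code></pre>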
<p>Now that we have a good understanding of the architecture, let's go through the LibreChat concepts and prerequisites before we look at the Terraform configuration.</p>
<h2>Deploying a MongoDB Atlas Database</h2>
<p>Since an official <a href="https://www.mongodb.com/products/integrations/hashicorp-terraform">Terraform MongoDB Atlas Provider</a> is available, let's use it to provision the database for LibreChat. If you have not already done so, sign up for a new account using the <strong>Get Started</strong> button on the <a href="https://www.mongodb.com/">MongoDB website</a>.</p>
<p>Once you've completed sign-up, log in to the MongoDB Atlas Console. MongoDB Atlas automatically creates an organization and a sample project named <strong>Project 0</strong>. Since we'll use Terraform to create a new project, feel free to delete this sample project. You may also edit the organization name in <strong>Organizational Settings</strong> as needed.</p>
<p>Next, create an API key at the organization level for the Terraform provider. In the MongoDB Atlas Console, go to the organization level view and select <strong>Identity &amp; Access</strong> &gt; <strong>Applications</strong> in the left menu. On the <strong>Application</strong> page, select the <strong>Service Accounts</strong> tab and click <strong>Create service account</strong>:</p>
<img src="https://cdn.hashnode.com/uploads/covers/61cd4fc3bf68083702212a26/f1ba149b-4815-4bf7-aeb7-bdb597441b6b.png" alt="Create a service account" style="display:block;margin:0 auto" />

<p>On the <strong>Create Service Account</strong> page, enter a name (for example, "terraform") and a description (for example, "Service account for Terraform"), keep the client secret expiration as recommended, select the <strong>Organization Project Creator</strong> permission, and click <strong>Create</strong>. Copy both the client ID and secret from the next page for use with Terraform:</p>
<img src="https://cdn.hashnode.com/uploads/covers/61cd4fc3bf68083702212a26/36fa0e30-f4d5-409a-ad67-9d1853917d0e.png" alt="Save the service account information" style="display:block;margin:0 auto" />

<p>We're now ready to write the Terraform configuration to:</p>
<ol>
<li><p>Create a new project</p>
</li>
<li><p>Create a new Free (<code>M0</code>) cluster with AWS as the backing provider</p>
</li>
<li><p>Create a database user for LibreChat</p>
</li>
<li><p>Add the NAT Gateway's public IP to the project IP Access List for security</p>
</li>
</ol>
<p>To avoid hardcoding credentials, set the service account credentials as environment variables before running Terraform using <code>MONGODB_ATLAS_CLIENT_ID</code> and <code>MONGODB_ATLAS_CLIENT_SECRET</code> as per the <a href="https://registry.terraform.io/providers/mongodb/mongodbatlas/latest/docs/guides/provider-configuration">provider configuration</a>.</p>
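<p>For reference, a minimal provider setup might look like the following. The version constraint is an assumption; check the provider registry for a release that supports service account authentication:</p>
<pre><code class="language-hcl">terraform {
  required_providers {
    mongodbatlas = {
      source  = "mongodb/mongodbatlas"
      version = "~&gt; 2.0" # assumption; pick a version that supports service accounts
    }
  }
}

# No credentials in configuration; the provider reads the
# MONGODB_ATLAS_CLIENT_ID and MONGODB_ATLAS_CLIENT_SECRET
# environment variables at plan/apply time.
provider "mongodbatlas" {}
</code></pre>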
<p>The following Terraform configuration provisions these MongoDB Atlas resources. Note that some attributes are provided as variables for flexibility, and the IP access list CIDR block refers to the NAT service elastic IPs (since egress traffic from LibreChat container goes through the fck-nat instances or NAT gateways):</p>
<pre><code class="language-hcl">resource "mongodbatlas_project" "librechat" {
  org_id = var.atlas_org_id
  name   = var.atlas_project_name
}

resource "mongodbatlas_advanced_cluster" "librechat" {
  project_id   = mongodbatlas_project.librechat.id
  name         = var.atlas_cluster_name
  cluster_type = "REPLICASET"

  replication_specs = [
    {
      region_configs = [
        {
          electable_specs = {
            instance_size = "M0"
          }
          provider_name         = "TENANT"
          backing_provider_name = "AWS"
          region_name           = var.atlas_region_name
          priority              = 7
        }
      ]
    }
  ]
}

resource "mongodbatlas_project_ip_access_list" "librechat" {
  for_each   = var.use_fck_nat ? aws_eip.fck_nat : aws_nat_gateway.zonal
  project_id = mongodbatlas_project.librechat.id
  cidr_block = "${each.value.public_ip}/32"
  comment    = "ECS egress CIDR for LibreChat"
}

resource "random_password" "atlas_db" {
  length = 16
}

resource "mongodbatlas_database_user" "librechat" {
  project_id         = mongodbatlas_project.librechat.id
  username           = var.atlas_db_username
  password           = random_password.atlas_db.result
  auth_database_name = "admin"

  roles {
    role_name     = "readWriteAnyDatabase"
    database_name = "admin"
  }
}
</code></pre>
<p>Although <a href="https://www.mongodb.com/docs/atlas/security/aws-iam-authentication/">using an IAM role to authenticate the Atlas database user</a> would be more secure, it unfortunately doesn't work with the off-the-shelf LibreChat Docker image because it lacks the <code>aws4</code> module required by the Mongoose library for AWS authentication. To avoid rebuilding a new container image, we'll stick with password authentication for now.</p>
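<p>To complete the picture, the connection string still needs to reach the container as the <code>MONGO_URI</code> secret. The sketch below shows one way to assemble it; the parameter name is hypothetical, the <code>connection_strings</code> attribute path may vary by provider version, and <code>urlencode()</code> may be needed if the generated password contains special characters:</p>
<pre><code class="language-hcl">locals {
  # Strip the scheme from the SRV connection string to get the host
  atlas_srv_host = trimprefix(
    mongodbatlas_advanced_cluster.librechat.connection_strings.standard_srv,
    "mongodb+srv://"
  )
}

resource "aws_ssm_parameter" "librechat_mongo_uri" {
  name  = "/librechat/MONGO_URI" # hypothetical parameter name
  type  = "SecureString"
  value = "mongodb+srv://${mongodbatlas_database_user.librechat.username}:${random_password.atlas_db.result}@${local.atlas_srv_host}/LibreChat"
}
</code></pre>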
<h2>Strategies for Managing LibreChat Configuration</h2>
<p>LibreChat uses two main sources of configuration: <a href="https://www.librechat.ai/docs/configuration/dotenv">environment variables</a> and <a href="https://www.librechat.ai/docs/configuration/librechat_yaml">LibreChat YAML</a>. Environment variables are typically provided via a <code>.env</code> file created from the example in LibreChat's <a href="https://github.com/danny-avila/librechat">GitHub repository</a>. Most LibreChat configuration is done using environment variables, and the LibreChat YAML file references these variables using the <code>${}</code> notation for values such as API keys.</p>
<p>Additionally, the LibreChat YAML file (<code>librechat.yaml</code>) is typically placed in the application folder or mounted as an override in a containerized environment. However, it can also be provided in other locations by specifying the configuration path using the <code>CONFIG_PATH</code> environment variable.</p>
<p>Running LibreChat as a container introduces some challenges:</p>
<ol>
<li><p><strong>Building custom images is inefficient</strong> - While it's possible to build a LibreChat container image with <code>.env</code> and <code>librechat.yaml</code> baked in, rebuilding the image for every configuration change is inefficient. It's ideal to use the <a href="https://hub.docker.com/r/librechat/librechat">official LibreChat image</a> from Docker Hub.</p>
</li>
<li><p><strong>Managing many environment variables is difficult</strong> - Setting environment variables directly without using <code>.env</code> is hard to manage due to the sheer number of variables to configure, even when omitting those irrelevant to your use case.</p>
</li>
<li><p><strong>Security concerns</strong> - Providing security-sensitive information as plain-text environment variables is not a security best practice.</p>
</li>
</ol>
<p>ECS provides features to address these concerns:</p>
<ol>
<li><p><strong>Environment variable files</strong> - ECS supports passing environment variables via an <a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/use-environment-file.html">environment variable file</a> stored in S3. Since LibreChat already uses this format, it's a perfect fit.</p>
</li>
<li><p><strong>Secrets management</strong> - <a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/secrets-envvar-secrets-manager.html">AWS Secrets Manager</a> or <a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/secrets-envvar-ssm-paramstore.html">AWS Systems Manager (SSM) Parameter Store</a> can securely pass sensitive data to containers as environment variables, avoiding hardcoded credentials.</p>
</li>
</ol>
<p>To keep costs low, we will define all sensitive data as SSM Parameter Store parameters of type <code>SecureString</code>, while keeping all other configuration in a <code>.env</code> file provided to the container via ECS.</p>
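<p>The <code>.env</code> file itself can be uploaded as an S3 object with Terraform so that ECS can reference it. A sketch, assuming an <code>aws_s3_bucket.librechat</code> resource exists:</p>
<pre><code class="language-hcl">resource "aws_s3_object" "librechat_dot_env" {
  bucket = aws_s3_bucket.librechat.id # assumed bucket resource
  key    = "config/.env"
  source = "${path.module}/librechat/${var.librechat_version}/.env"

  # Re-upload the object whenever the local file changes
  etag = filemd5("${path.module}/librechat/${var.librechat_version}/.env")
}
</code></pre>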
<p>Providing the LibreChat YAML file to the container is more complicated. While the standard approach uses persistent <a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_data_volumes.html">storage options</a> like Amazon EFS, that's cost-prohibitive for managing a single file. A better approach is using a sidecar container to write the file to a task-level shared volume before the main container starts.</p>
<p>For this, I implemented a sidecar container using <a href="https://gallery.ecr.aws/docker/library/busybox">busybox</a> to decode a base64-encoded <code>librechat.yaml</code> (provided as an environment variable) and write it to <code>/config/librechat.yaml</code> in the task-level shared volume. The LibreChat container references this path using the <code>CONFIG_PATH</code> environment variable. The resulting container definitions (defined in the ECS task definition) are shown below:</p>
<pre><code class="language-hcl">  container_definitions = jsonencode([
    {
      name      = "init-librechat-config"
      image     = "public.ecr.aws/docker/library/busybox:1.36"
      essential = false

      command = [
        "sh",
        "-lc",
        "mkdir -p /config &amp;&amp; printf '%s' \"$LIBRECHAT_YAML_B64\" | base64 -d &gt; /config/librechat.yaml"
      ]

      environment = [
        {
          name  = "LIBRECHAT_YAML_B64"
          value = filebase64("${path.module}/librechat/${var.librechat_version}/librechat.yaml")
        }
      ]

      mountPoints = [
        {
          sourceVolume  = "librechat-config"
          containerPath = "/config"
          readOnly      = false
        }
      ]

      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = aws_cloudwatch_log_group.librechat.name
          "awslogs-region"        = var.region
          "awslogs-stream-prefix" = "librechat"
        }
      }
    },
    {
      name      = "librechat"
      image     = "librechat/librechat:v0.8.4"
      essential = true

      dependsOn = [
        {
          containerName = "init-librechat-config"
          condition     = "SUCCESS"
        }
      ]

      portMappings = [
        {
          containerPort = 3080
          protocol      = "tcp"
        }
      ]

      secrets = [
        {
          name      = "CREDS_KEY"
          valueFrom = aws_ssm_parameter.librechat_creds_key.arn
        },
        {
          name      = "CREDS_IV"
          valueFrom = aws_ssm_parameter.librechat_creds_iv.arn
        },
        {
          name      = "JWT_SECRET"
          valueFrom = aws_ssm_parameter.librechat_jwt_secret.arn
        },
        {
          name      = "JWT_REFRESH_SECRET"
          valueFrom = aws_ssm_parameter.librechat_jwt_refresh_secret.arn
        },
        {
          name      = "MEILI_MASTER_KEY"
          valueFrom = aws_ssm_parameter.librechat_meili_master_key.arn
        },
        {
          name      = "MONGO_URI"
          valueFrom = aws_ssm_parameter.librechat_mongo_uri.arn
        }
      ]

      environment = [
        {
          name  = "CONFIG_PATH"
          value = "/config/librechat.yaml"
        }
      ]

      environmentFiles = [
        {
          value = aws_s3_object.librechat_dot_env.arn
          type  = "s3"
        }
      ]

      mountPoints = [
        {
          sourceVolume  = "librechat-config"
          containerPath = "/config"
          readOnly      = true
        }
      ]

      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = aws_cloudwatch_log_group.librechat.name
          "awslogs-region"        = var.region
          "awslogs-stream-prefix" = "librechat"
        }
      }
    }
  ])
</code></pre>
<p>Now that we have a strategy for managing LibreChat configuration, let's define the minimal set of environment variables and LibreChat YAML as a starting point.</p>
<h2>Preparing the Starter Environment File</h2>
<p>LibreChat provides starter files <code>.env.example</code> and <code>librechat.example.yaml</code> that we'll use as the basis for our configuration. Let's start with the environment file.</p>
<h3>Environment File (.env)</h3>
<p>Copy the <a href="https://github.com/danny-avila/LibreChat/blob/main/.env.example">.env.example</a> file to a local folder and comment out any sensitive values that will instead be provided via SSM Parameter Store, as listed below. Although individually defined environment variables take precedence over variables in environment files per the <a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/use-environment-file.html">Amazon ECS Developer Guide</a>, commenting them out avoids confusion.</p>
<p>In addition to <code>MONGO_URI</code> (the MongoDB connection string), LibreChat recommends <a href="https://www.librechat.ai/docs/remote/docker_linux">adjusting any "secret" values from their default value for added security</a>:</p>
<ul>
<li><p><code>CREDS_IV</code> - 16-byte Initialization Vector (IV) (32 characters in hex) for securely storing credentials</p>
</li>
<li><p><code>CREDS_KEY</code> - 32-byte key (64 characters in hex) for securely storing credentials</p>
</li>
<li><p><code>JWT_SECRET</code> - 32-byte key (64 characters in hex) as the JWT secret key</p>
</li>
<li><p><code>JWT_REFRESH_SECRET</code> - 32-byte key (64 characters in hex) as the JWT refresh secret key</p>
</li>
<li><p><code>MEILI_MASTER_KEY</code> - 16-byte key (32 characters in hex) as the MeiliSearch master key (required only if message and conversation search is enabled)</p>
</li>
</ul>
<p>LibreChat provides a <a href="https://www.librechat.ai/docs/toolkit/credentials-generator">Credentials Generator</a> to generate cryptographically secure random values for these secrets. Store them as SSM Parameter Store <code>SecureString</code> parameters or generate and store them directly using Terraform.</p>
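<p>If you prefer the Terraform route, one sketch uses the <code>random_bytes</code> resource (available in the random provider v3.5+), whose <code>hex</code> output matches the required format. The parameter name is a placeholder, and the same pattern applies to the other secrets:</p>
<pre><code class="language-hcl"># Generate a 32-byte key and store it as a SecureString parameter
resource "random_bytes" "creds_key" {
  length = 32
}

resource "aws_ssm_parameter" "librechat_creds_key" {
  name  = "/librechat/CREDS_KEY" # hypothetical parameter name
  type  = "SecureString"
  value = random_bytes.creds_key.hex # 64 hex characters
}
</code></pre>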
<p>Next, adjust these environment variables for proper and secure operation:</p>
<ul>
<li><p><code>HOST</code> - Set to <code>0.0.0.0</code> to listen on all network interfaces, allowing ALB access</p>
</li>
<li><p><code>CONSOLE_JSON</code> - Set to <code>true</code> to write logs to CloudWatch in JSON format for easier querying</p>
</li>
<li><p><code>ALLOW_REGISTRATION</code> - Set to <code>false</code> to disable self-registration (we will create users with the <a href="https://www.librechat.ai/docs/configuration/authentication#create-user-script">create user script</a> instead)</p>
</li>
<li><p><code>SEARCH</code> - Set to <code>false</code> to disable message and conversation search (we're only demonstrating a minimal setup)</p>
</li>
</ul>
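<p>Taken together, these adjustments amount to the following fragment of the <code>.env</code> file (all other settings keep their example defaults):</p>
<pre><code class="language-shell">HOST=0.0.0.0
CONSOLE_JSON=true
ALLOW_REGISTRATION=false
SEARCH=false
</code></pre>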
<p>For the AI provider, we'll enable AWS Bedrock. Set <code>ENDPOINTS</code> to <code>bedrock</code> (the <code>.env.example</code> enables all <a href="https://www.librechat.ai/docs/configuration/pre_configured_ai">pre-configured endpoints</a> except Bedrock). Then <a href="https://www.librechat.ai/docs/configuration/pre_configured_ai/bedrock">configure the Bedrock endpoint</a> by setting the following environment variables:</p>
<ul>
<li><p><code>BEDROCK_AWS_DEFAULT_REGION</code> - Set to your Bedrock region (e.g., us-east-1)</p>
</li>
<li><p><code>BEDROCK_AWS_MODELS</code> - Set to <code>us.amazon.nova-2-lite-v1:0</code> to use Amazon Nova 2 Lite as a starting point (referring to the US Nova 2 Lite system-defined <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles.html">inference profile</a>)</p>
</li>
<li><p><code>OPENAI_API_KEY</code>, <code>ANTHROPIC_API_KEY</code>, <code>GOOGLE_KEY</code>, <code>ASSISTANTS_API_KEY</code> - Comment these out to ensure that LibreChat does not try to call the OpenAI, Anthropic, Google, or OpenAI Assistants APIs even when their endpoints are disabled via <code>ENDPOINTS</code></p>
</li>
</ul>
<p>Note that we'll be using the <a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html">ECS task IAM role</a> to allow LibreChat to seamlessly call Bedrock APIs, so we don't need to set <code>BEDROCK_AWS_ACCESS_KEY_ID</code> nor <code>BEDROCK_AWS_SECRET_ACCESS_KEY</code>.</p>
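<p>For reference, the task role's Bedrock permissions can be sketched as an IAM policy document along these lines. The statement is intentionally broad for illustration; in practice you may want to scope <code>resources</code> to the specific model and inference profile ARNs you enable:</p>
<pre><code class="language-hcl">data "aws_iam_policy_document" "bedrock_invoke" {
  statement {
    sid = "InvokeBedrockModels"
    actions = [
      "bedrock:InvokeModel",
      "bedrock:InvokeModelWithResponseStream",
    ]
    resources = ["*"] # narrow to specific model/inference profile ARNs
  }
}
</code></pre>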
<h3>LibreChat YAML File (librechat.yaml)</h3>
<p>Copy the <a href="https://github.com/danny-avila/LibreChat/blob/main/librechat.example.yaml">librechat.example.yaml</a> file to a local folder. For this minimal setup, we won't need to define custom endpoints or configure advanced settings. However, to prevent custom endpoints like Groq and Mistral AI from appearing in the UI, comment out the <code>custom</code> key in the <a href="https://www.librechat.ai/docs/configuration/librechat_yaml/object_structure/config#endpoints">endpoints</a> block and set it to an empty array <code>[]</code>.</p>
<p>With all environment variables set in the environment file or SSM Parameter Store, and the librechat.yaml file prepared, we're ready to tie everything together with Terraform.</p>
<h2>Building the Terraform Configuration</h2>
<p>Since this blog post is getting quite long, let's focus on the key design elements of the Terraform configuration. You can find the complete Terraform configuration and source code in the <code>1_ecs_basic</code> directory in <a href="https://github.com/acwwat/terraform-aws-librechat-examples">this GitHub repository</a>. Here are the descriptions for each Terraform configuration file:</p>
<ul>
<li><p><code>atlas.tf</code> - Defines the MongoDB Atlas resources, as explained in the earlier section of this blog post.</p>
</li>
<li><p><code>vpc.tf</code> - Defines the VPC infrastructure for the solution. Key design elements include:</p>
<ul>
<li><p>The VPC design follows a three-tier architecture across two AZs. Although the database subnets are currently not in use, they can be utilized for AWS cache and database services in the future.</p>
</li>
<li><p>There is built-in support for either fck-nat (default) or NAT Gateways, depending on your preference.</p>
</li>
</ul>
</li>
<li><p><code>s3.tf</code> - Defines the S3 bucket that hosts the LibreChat files and the S3 object for the <code>.env</code> file. Key design elements include:</p>
<ul>
<li><p>A ready-to-use <code>.env</code> file is included in the <code>librechat/v0.8.4</code> folder. You can edit this file if using the same version, or upload a new <code>.env</code> file when you upgrade LibreChat (be sure to change the <code>librechat_version</code> variable).</p>
</li>
<li><p>This S3 bucket may be used in the future as the <a href="https://www.librechat.ai/docs/configuration/cdn/s3">LibreChat file storage backend</a>, hence the <code>.env</code> file is placed in the <code>config</code> subfolder for better separation.</p>
</li>
</ul>
</li>
<li><p><code>ecs.tf</code> - Defines all ECS and related resources to run LibreChat as an ECS service. Key design elements include:</p>
<ul>
<li><p>A ready-to-use <code>librechat.yaml</code> file is included in the <code>librechat/v0.8.4</code> folder. You can edit this file if using the same version, or upload a new <code>librechat.yaml</code> file when you upgrade LibreChat (be sure to change the <code>librechat_version</code> variable).</p>
</li>
<li><p>The LibreChat credentials are generated by first creating a random password using the <code>random_password</code> ephemeral resource, then defining an <code>aws_ssm_parameter</code> resource with the write-only value storing the hash of the password (SHA1 or SHA256). You can replace this logic if you prefer to manually store the generated credentials in SSM Parameter Store first.</p>
</li>
<li><p>The IAM policy for the ECS task role, <code>aws_iam_role_policy.ecs_task_librechat</code>, contains permissions to invoke Bedrock models and <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html">manage marketplace subscription of third-party models</a>.</p>
</li>
<li><p>Service auto-scaling is defined for future use, but for now, it is scaled to 1 for cost control.</p>
</li>
<li><p>The LibreChat container runs on TCP port 3080 by default.</p>
</li>
<li><p>CloudWatch logs are streamed to the <code>/ecs/librechat</code> log group.</p>
</li>
</ul>
</li>
<li><p><code>alb.tf</code> - Defines the ALB resources as the public-facing endpoint for LibreChat. Key design elements include:</p>
<ul>
<li><p>An HTTPS listener is defined with an HTTP-to-HTTPS redirect to ensure security. Consequently, you must import or create a TLS certificate in AWS Certificate Manager (ACM), then pass the certificate's ARN using the <code>alb_certificate_arn</code> variable.</p>
</li>
<li><p>HTTPS also requires a custom host name, so you must pass the DNS name using the <code>librechat_dns_name</code> variable.</p>
</li>
<li><p>The target group checks the ECS task health using LibreChat's <code>/health</code> endpoint.</p>
</li>
</ul>
</li>
</ul>
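<p>As an illustration of the health check, the target group could be defined along these lines (the VPC reference and resource names are assumptions):</p>
<pre><code class="language-hcl">resource "aws_lb_target_group" "librechat" {
  name        = "librechat"
  port        = 3080
  protocol    = "HTTP"
  target_type = "ip"            # required for Fargate tasks
  vpc_id      = aws_vpc.main.id # assumed VPC resource

  health_check {
    path    = "/health"
    matcher = "200"
  }
}
</code></pre>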
<h2>Deploying the Solution</h2>
<p>After cloning the GitHub repository, deploy the solution as follows:</p>
<ol>
<li><p>From the root of the cloned GitHub repository, navigate to <code>1_ecs_basic</code>.</p>
</li>
<li><p>Set the <code>MONGODB_ATLAS_CLIENT_ID</code> and <code>MONGODB_ATLAS_CLIENT_SECRET</code> environment variables to the Atlas service account credentials created in the earlier section.</p>
</li>
<li><p>Configure your AWS credentials using <code>aws configure</code> for IAM, or <code>aws configure sso</code> for IAM Identity Center. The profile name will be provided as a Terraform variable.</p>
</li>
<li><p>Copy <code>terraform.tfvars.example</code> as <code>terraform.tfvars</code> and update the variables to match your configuration.</p>
</li>
<li><p>Run <code>terraform init</code> and <code>terraform apply</code>.</p>
</li>
</ol>
<p>It is advisable to check the ECS service status in the AWS Management Console to ensure that the tasks run successfully. Failed tasks will restart perpetually because the service's desired count is 1, and overlooking this may lead to unexpected charges. Check the task logs in the CloudWatch log group <code>/ecs/librechat</code> for errors if needed.</p>
<p>Once the configuration is applied, create the CNAME record for the ALB DNS name. If you manage your domain/subdomain using a <a href="https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/AboutHZWorkingWith.html">public hosted zone</a> in Amazon Route 53, you can also create an <a href="https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resource-record-sets-choosing-alias-non-alias.html">alias record</a> pointing to the ALB. Lastly, go to the custom host name for LibreChat and ensure that the login page loads successfully.</p>
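<p>If you do use Route 53, the alias record can be sketched as follows (the hosted zone variable and the <code>aws_lb.librechat</code> resource name are assumptions):</p>
<pre><code class="language-hcl">resource "aws_route53_record" "librechat" {
  zone_id = var.hosted_zone_id # assumed variable
  name    = var.librechat_dns_name
  type    = "A"

  alias {
    name                   = aws_lb.librechat.dns_name
    zone_id                = aws_lb.librechat.zone_id
    evaluate_target_health = true
  }
}
</code></pre>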
<h2>Creating a User and Validating LibreChat</h2>
<p>Since self-registration is disabled, we need to use the <a href="https://www.librechat.ai/docs/configuration/authentication#create-user-script">create user script</a> to create the first user. The easiest way is to use <a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-exec-run.html">ECS Exec in the ECS console</a> to run the script inside the running <code>librechat</code> container. Here's a screenshot showing where to find the <strong>Connect</strong> button to open an interactive session:</p>
<img src="https://cdn.hashnode.com/uploads/covers/61cd4fc3bf68083702212a26/0ec9c3e0-e8ae-4768-9ac2-5a75936e5795.png" alt="Connect using ECS Exec" style="display:block;margin:0 auto" />

<p>Once CloudShell opens and connects to the container's shell, run the following command to start the user creation wizard:</p>
<pre><code class="language-shell">npm run create-user
</code></pre>
<p>Enter the user's information as prompted, and a user will be created in LibreChat's database. Here's an example of creating a user for John Doe:</p>
<img src="https://cdn.hashnode.com/uploads/covers/61cd4fc3bf68083702212a26/860d8517-efb2-45c0-b8c6-ee06f5179382.png" alt="Creating a new user using the create user script" style="display:block;margin:0 auto" />

<p>Now you're ready to log in. Open your LibreChat application URL and log in using the credentials you just created. Upon successful login, accept LibreChat's terms of service. You should see the Nova 2 Lite model already selected at the top, since it's the only configured model.</p>
<p>Let's test the setup with a simple prompt:</p>
<blockquote>
<p>Tell me a joke about AWS.</p>
</blockquote>
<p>If LibreChat responds with a joke, you've successfully completed the setup! Here's the joke I received, which honestly suggests that the Nova model could use some additional training with a better comedy dataset...</p>
<img src="https://cdn.hashnode.com/uploads/covers/61cd4fc3bf68083702212a26/70c7702b-a4da-457a-bf67-a5ecc0a66324.png" alt="Lame joke by Nova" style="display:block;margin:0 auto" />

<h2>Summary</h2>
<p>Congratulations, you now have your own LibreChat instance in AWS! You're now ready to start exploring and expanding its capabilities. While this setup gives you a functional chat interface, there's much more you can do to enhance its features. Here are some examples:</p>
<ul>
<li><p>Configure more <a href="https://www.librechat.ai/docs/configuration/pre_configured_ai/bedrock#configuring-models">Bedrock models</a> to unlock diverse capabilities and customization, balancing functionality, performance, and cost</p>
</li>
<li><p>Enable <a href="https://www.librechat.ai/docs/features/web_search">web search</a> to allow LibreChat to search the internet and retrieve relevant information, enhancing conversations</p>
</li>
<li><p>Build custom AI assistants using <a href="https://www.librechat.ai/docs/features/agents">AI Agents</a> and integrate with various built-in and MCP tools to elevate capabilities and user experience</p>
</li>
</ul>
<p>As your LibreChat usage grows, it's imperative to align your architecture with the <a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/welcome.html">AWS Well-Architected Framework</a>. Here are some examples to improve security and operational robustness:</p>
<ul>
<li><p>Streamline <a href="https://www.librechat.ai/docs/configuration/authentication">authentication</a> by integrating with an Identity Provider (IDP)</p>
</li>
<li><p>Enable <a href="https://www.librechat.ai/docs/configuration/redis">caching, session storage, and horizontal scaling</a> in LibreChat using Redis and compute auto-scaling</p>
</li>
<li><p>Scale file storage with an <a href="https://www.librechat.ai/docs/configuration/cdn/s3">Amazon S3 storage backend</a></p>
</li>
</ul>
<p>Given the vast possibilities, I will start a new blog series about LibreChat and how to best use AWS services and best practices to enable these capabilities. If you enjoyed this post, stay tuned for new content at the <a href="https://blog.avangards.io">Avangards Blog</a>. Thanks so much for reading, and I hope you have fun chatting with LibreChat!</p>
]]></content:encoded></item><item><title><![CDATA[AWS Control Tower Proactive Controls for Terraform: A Proof of Concept]]></title><description><![CDATA[Explore AWS Control Tower proactive controls, why they don’t work with Terraform, and a proof of concept that attempts to make them work.
Introduction
As a Terraform advocate and an AWS consultant who]]></description><link>https://blog.avangards.io/aws-control-tower-proactive-controls-for-terraform-a-proof-of-concept</link><guid isPermaLink="true">https://blog.avangards.io/aws-control-tower-proactive-controls-for-terraform-a-proof-of-concept</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Tue, 24 Feb 2026 05:45:14 GMT</pubDate><enclosure url="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/61cd4fc3bf68083702212a26/b29dd259-cca6-4912-baab-1ad9675e0430.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Explore AWS Control Tower proactive controls, why they don’t work with Terraform, and a proof of concept that attempts to make them work.</p>
<h2>Introduction</h2>
<p>As a Terraform advocate and an AWS consultant who builds many landing zones, AWS Control Tower has always been one of my favorite AWS services. Beyond its common use cases, such as account provisioning with <a href="https://docs.aws.amazon.com/controltower/latest/userguide/af-customization-page.html">Account Factory Customization (AFC)</a> and <a href="https://docs.aws.amazon.com/controltower/latest/userguide/taf-account-provisioning.html">Account Factory for Terraform (AFT)</a>, I am always on the lookout for opportunities to bring the two technologies closer together.</p>
<p>During landing zone design workshops, when walking customers through Control Tower controls, I often found myself unable to recommend proactive controls because many organizations prefer using Terraform over CloudFormation for infrastructure as code (IaC). To fully leverage everything Control Tower has to offer, wouldn’t it be nice if proactive controls worked with other IaC tools, including Terraform?</p>
<p>Through research, I learned that proactive controls are implemented as CloudFormation Hooks and can target resources created via the Cloud Control API. Having worked with the Terraform AWS Cloud Control (CC) Provider, I began to wonder whether proactive controls could evaluate Terraform resources created through this provider. This question became the experiment that is the subject of this blog post.</p>
<p>Let’s start with a quick explanation of what proactive controls are.</p>
<h2>What Are AWS Control Tower Proactive Controls?</h2>
<p><a href="https://docs.aws.amazon.com/controltower/latest/controlreference/proactive-controls.html">Proactive controls</a> are pre-built compliance rules that evaluate AWS resources before deployment via CloudFormation stack operations, preventing non-compliant resources from being created or updated. AWS Control Tower provides more than 200 controls covering a wide range of AWS services and compliance frameworks.</p>
<p>An example of a proactive control is <a href="https://docs.aws.amazon.com/controltower/latest/controlreference/ec2-rules.html#ct-ec2-pr-7-description">[CT.EC2.PR.7] Require an Amazon EBS volume resource to be encrypted at rest when defined by means of the AWS::EC2::Instance BlockDeviceMappings property or AWS::EC2::Volume resource type</a>. As the name suggests, this control prevents an EC2 instance from being created or updated if it specifies an unencrypted EBS volume.</p>
<p>For the full list of proactive controls, you can either view them on the <a href="https://docs.aws.amazon.com/controltower/latest/controlreference/control-details.html">Control Catalog</a> page in the AWS Control Tower console or refer to the <a href="https://docs.aws.amazon.com/controltower/latest/controlreference/proactive-controls.html">Proactive control</a> section in the AWS Control Tower Control Reference Guide.</p>
<h2>Testing Proactive Controls with Terraform (Unsuccessfully)</h2>
<p>Since proactive controls are implemented using <a href="https://docs.aws.amazon.com/cloudformation-cli/latest/hooks-userguide/what-is-cloudformation-hooks.html">CloudFormation Hooks</a>, I initially assumed they would evaluate all <a href="https://docs.aws.amazon.com/cloudformation-cli/latest/hooks-userguide/hooks-concepts.html#hook-terms-hook-target">Hook targets</a>, particularly resources supported by the <a href="https://docs.aws.amazon.com/cloudcontrolapi/latest/userguide/what-is-cloudcontrolapi.html">Cloud Control API</a>. Because the <a href="https://registry.terraform.io/providers/hashicorp/awscc/latest/docs">Terraform AWS Cloud Control (CC) Provider</a> is implemented using the Cloud Control API (as opposed to the standard AWS API used by the original Terraform AWS Provider), I expected proactive controls to apply there as well.</p>
<p>Although it is still uncommon for organizations to fully adopt the Terraform AWS CC Provider, I wanted to determine whether proactive controls could be used with Terraform to enforce compliance.</p>
<p>As a quick test, I wrote the following configuration, which uses the Terraform AWS CC Provider to create an EC2 instance with an unencrypted EBS volume, expecting it to fail:</p>
<pre><code class="language-hcl">variable "subnet_id" {
  type = string
}

data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-2023.*-x86_64"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# EC2 instance with unencrypted EBS volume
resource "awscc_ec2_instance" "this" {
  image_id      = data.aws_ami.al2023.id
  instance_type = "t3.micro"
  subnet_id     = var.subnet_id

  block_device_mappings = [
    {
      device_name = "/dev/xvda"
      ebs = {
        volume_size = 20
        volume_type = "gp3"
        encrypted   = false  # Explicitly unencrypted
        delete_on_termination = true
      }
    }
  ]

  tags = [
    {
      key   = "Name"
      value = "unencrypted-vol-test"
    }
  ]
}
</code></pre>
<p>However, <code>terraform apply</code> ran successfully and the EC2 instance was created:</p>
<pre><code class="language-plaintext">$ terraform apply
data.aws_ami.al2023: Reading...
data.aws_ami.al2023: Read complete after 0s [id=ami-0f3caa1cf4417e51b]

Terraform used the selected providers to generate the following execution plan.       
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # awscc_ec2_instance.this will be created
  + resource "awscc_ec2_instance" "this" {
      + additional_info                      = (known after apply)
      + affinity                             = (known after apply)
      + availability_zone                    = (known after apply)
      + block_device_mappings                = [
          + {
              + device_name  = "/dev/xvda"
              + ebs          = {
                  + delete_on_termination = true
                  + encrypted             = false
                  + iops                  = (known after apply)
                  + kms_key_id            = (known after apply)
                  + snapshot_id           = (known after apply)
                  + volume_size           = 20
                  + volume_type           = "gp3"
                }
              + no_device    = (known after apply)
              + virtual_name = (known after apply)
            },
        ]
      + cpu_options                          = (known after apply)
      + credit_specification                 = (known after apply)
      + disable_api_termination              = (known after apply)
      + ebs_optimized                        = (known after apply)
      + elastic_gpu_specifications           = (known after apply)
      + elastic_inference_accelerators       = (known after apply)
      + enclave_options                      = (known after apply)
      + hibernation_options                  = (known after apply)
      + host_id                              = (known after apply)
      + host_resource_group_arn              = (known after apply)
      + iam_instance_profile                 = (known after apply)
      + id                                   = (known after apply)
      + image_id                             = "ami-0f3caa1cf4417e51b"
      + instance_id                          = (known after apply)
      + instance_initiated_shutdown_behavior = (known after apply)
      + instance_type                        = "t3.micro"
      + ipv_6_address_count                  = (known after apply)
      + ipv_6_addresses                      = (known after apply)
      + kernel_id                            = (known after apply)
      + key_name                             = (known after apply)
      + launch_template                      = (known after apply)
      + license_specifications               = (known after apply)
      + metadata_options                     = (known after apply)
      + monitoring                           = (known after apply)
      + network_interfaces                   = (known after apply)
      + placement_group_name                 = (known after apply)
      + private_dns_name                     = (known after apply)
      + private_dns_name_options             = (known after apply)
      + private_ip                           = (known after apply)
      + private_ip_address                   = (known after apply)
      + propagate_tags_to_volume_on_creation = (known after apply)
      + public_dns_name                      = (known after apply)
      + public_ip                            = (known after apply)
      + ramdisk_id                           = (known after apply)
      + security_group_ids                   = (known after apply)
      + security_groups                      = (known after apply)
      + source_dest_check                    = (known after apply)
      + ssm_associations                     = (known after apply)
      + state                                = (known after apply)
      + subnet_id                            = "subnet-0a0bb7e920672c803"
      + tags                                 = [
          + {
              + key   = "Name"
              + value = "unencrypted-vol-test"
            },
        ]
      + tenancy                              = (known after apply)
      + user_data                            = (known after apply)
      + volumes                              = (known after apply)
      + vpc_id                               = (known after apply)
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

awscc_ec2_instance.this: Creating...
awscc_ec2_instance.this: Still creating... [00m10s elapsed]
awscc_ec2_instance.this: Creation complete after 16s [id=i-0963bfcf44274c8d9]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

$
</code></pre>
<p>So how did the EC2 instance manage to be created?</p>
<h2>Investigating Why Proactive Controls Don’t Work with Terraform</h2>
<p>To investigate the issue, I examined the Hook in the CloudFormation console. Looking at the Hook named <strong>AWS::ControlTower::Hook</strong>, I noticed that its <strong>Targets</strong> field was set to <code>None</code>.</p>
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/61cd4fc3bf68083702212a26/9d8a1a3d-e890-40b7-89df-622c2c11e1bb.png" alt="AWS::ControlTower::Hook listed as without a target" />

<p>This seemed odd, as I expected at least one target to be listed. Upon reviewing the Hook details, I observed that the Hook targets included only <strong>CloudFormation resources</strong>, not the <strong>Cloud Control API</strong>:</p>
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/61cd4fc3bf68083702212a26/878169e4-bcd3-45cb-b696-632880fb2d72.png" alt="AWS::ControlTower::Hook details show only CF resource as a target" />

<p>Assuming the Hook details reflect the actual configuration, this implies that proactive controls validate only CloudFormation resources, not resources provisioned through the Cloud Control API (and therefore not those created via the Terraform AWS CC Provider). To confirm this behavior, I opened an AWS Support case.</p>
<p>The AWS support engineer explained that proactive controls are implemented using a special Hook type called <strong>Controls (Managed Hooks)</strong>, which supports only CloudFormation resources as targets. To extend proactive controls to other targets, each control must be re-implemented as a custom Hook.</p>
<p>Although I submitted a feature request to the AWS Control Tower team to expand proactive controls to additional targets, I decided to proceed with a workaround, even if it required additional effort.</p>
<h2>Replicating Proactive Controls to Target the Cloud Control API with Lambda Hooks</h2>
<p>The AWS support engineer initially suggested re-implementing each proactive control using Lambda Hooks. While this approach would require significant effort, the engineer provided the following Python code for the <strong>CT.EC2.PR.7</strong> control as a starting point:</p>
<pre><code class="language-python">import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    '''
    CloudFormation Hook Handler for EBS Encryption Validation
    Validates that EC2 instances have encrypted EBS volumes
    '''

    logger.info(f"Received full event: {json.dumps(event, indent=2)}")

    # Extract request details from Cloud Control API event structure
    request_data = event.get('requestData', {})
    resource_type = request_data.get('targetName')
    target_model = request_data.get('targetModel', {})
    resource_properties = target_model.get('resourceProperties', {}) 

    logger.info(f"Extracted - Type: {resource_type}, Properties keys: {list(resource_properties.keys())}")

    # Only validate EC2 instances
    if not resource_type or resource_type != 'AWS::EC2::Instance':
        logger.info(f"Skipping validation - resource type '{resource_type}' is not EC2 Instance")
        return {
            'hookStatus': 'SUCCESS',
            'message': f'Resource type {resource_type} not applicable for this hook'
        }

    # Validation logic
    validation_result = validate_ebs_encryption(resource_properties)

    if validation_result['compliant']:
        return {
            'hookStatus': 'SUCCESS',
            'message': 'EC2 instance has encrypted EBS volumes'
        }
    else:
        return {
            'hookStatus': 'FAILED',
            'errorCode': 'NonCompliant',
            'message': validation_result['message']
        }

def validate_ebs_encryption(properties):
    '''
    Validates that all EBS volumes are encrypted
    '''

    # Check BlockDeviceMappings
    block_device_mappings = properties.get('BlockDeviceMappings', [])

    if not block_device_mappings:
        return {
            'compliant': False,
            'message': 'No BlockDeviceMappings specified. Ensure the AMI uses encrypted volumes or specify encrypted BlockDeviceMappings.'
        }

    # Validate each block device mapping
    for idx, mapping in enumerate(block_device_mappings):
        ebs = mapping.get('Ebs', {})

        if ebs:
            encrypted = ebs.get('Encrypted', False)

            if not encrypted:
                return {
                    'compliant': False,
                    'message': f'BlockDeviceMapping at index {idx} has an unencrypted EBS volume. Set Encrypted to true.'
                }

    return {
        'compliant': True,
        'message': 'All EBS volumes are encrypted'
    }
</code></pre>
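<p>Before deploying, you can sanity-check the validation logic locally by feeding the handler's extraction path a synthetic event. The payload below only mirrors the structure the handler parses (<code>requestData.targetModel.resourceProperties</code>); its field values are illustrative, not taken from a real invocation:</p>

```python
# Synthetic Cloud Control API hook event, shaped like the structure the
# handler above parses. Field values are illustrative only.
sample_event = {
    "requestData": {
        "targetName": "AWS::EC2::Instance",
        "targetModel": {
            "resourceProperties": {
                "BlockDeviceMappings": [
                    {
                        "DeviceName": "/dev/xvda",
                        "Ebs": {"VolumeSize": 20, "VolumeType": "gp3", "Encrypted": False},
                    }
                ]
            }
        },
    }
}

# Reproduce the handler's extraction and per-mapping encryption check
props = sample_event["requestData"]["targetModel"]["resourceProperties"]
non_compliant = [
    idx
    for idx, mapping in enumerate(props.get("BlockDeviceMappings", []))
    if mapping.get("Ebs") and not mapping["Ebs"].get("Encrypted", False)
]
print(non_compliant)  # → [0]: the unencrypted root volume fails the check
```

With an encrypted mapping, the list comes back empty and the Hook would return <code>SUCCESS</code>.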
<p>After creating the Lambda function with the provided code, I <a href="https://docs.aws.amazon.com/cloudformation-cli/latest/hooks-userguide/lambda-hooks-activate-hooks.html">created a Lambda Hook</a> as shown in the screenshots below. The key configuration details were:</p>
<ul>
<li><p><strong>Hook targets</strong> should include at least <code>Cloud Control API</code>. Since proactive controls already target CloudFormation resources, adding CloudFormation as a target here is unnecessary.</p>
</li>
<li><p><strong>Actions</strong> should include <code>Create</code> and <code>Update</code>.</p>
</li>
<li><p><strong>Hook mode</strong> should be set to <code>Fail</code>.</p>
</li>
<li><p><strong>Target resources</strong> should include <code>AWS::EC2::Instance</code> and <code>AWS::EC2::Volume</code>. These resource types are specified in the control details within the <strong>AWS::ControlTower::Hook</strong> configuration.</p>
</li>
</ul>
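<p>Not shown in the screenshots is the Hook execution role that CloudFormation assumes to invoke the Lambda function. As a sketch, its trust policy would allow the CloudFormation Hooks service principal; verify the details against the Lambda Hooks documentation for your environment:</p>

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "hooks.cloudformation.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```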
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/61cd4fc3bf68083702212a26/b831778b-59f7-4683-8828-94418156f2c4.png" alt="Create Hook with Lambda - step 1 details" />

<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/61cd4fc3bf68083702212a26/fd93733f-cb37-4f3a-8813-07a7d816bae8.png" alt="Create Hook with Lambda - step 2 details" />

<p>After the Lambda Hook was created, I reran <code>terraform apply</code>, and this time it failed as expected due to the Hook:</p>
<pre><code class="language-plaintext">awscc_ec2_instance.this: Creating...
╷
│ Error: AWS SDK Go Service Operation Incomplete
│
│   with awscc_ec2_instance.this,
│   on main.tf line 21, in resource "awscc_ec2_instance" "this":
│   21: resource "awscc_ec2_instance" "this" {
│
│ Waiting for Cloud Control API service CreateResource operation completion returned: 
│ waiter state transitioned to FAILED. StatusMessage:
│ 149a3eef-eaba-4459-8ea7-1707b1183e11. Hook failures: HookName:
│ Private::Lambda::CTEC2PR7, HookArn:
│ arn:aws:cloudformation:us-east-1:2**********1:type/hook/Private-Lambda-CTEC2PR7/00000001/aws-hooks/AWS-Hooks-LambdaHook/00000001.00000024,
│ HookVersion: 00000025, Time: 2026-02-23T21:06:45Z, HookMessage: BlockDeviceMapping  
│ at index 0 has an unencrypted EBS volume. Set Encrypted to true.
╵
</code></pre>
<h2>Even Better: Replicating Proactive Controls with Guard Hooks</h2>
<p>While researching further, I noticed that the rule specifications for all proactive controls are published under <a href="https://docs.aws.amazon.com/controltower/latest/controlreference/proactive-controls.html">Proactive controls</a> in the AWS Control Tower Controls Reference Guide. This makes replication significantly easier.</p>
<p>According to the documentation, proactive controls are implemented using Guard Hooks powered by <a href="https://docs.aws.amazon.com/cfn-guard/latest/ug/what-is-guard.html">AWS CloudFormation Guard</a>, a domain-specific language (DSL) for policy-as-code. For more context and instructions on developing Guard rules, refer to <a href="https://docs.aws.amazon.com/cfn-guard/latest/ug/writing-rules.html">Writing AWS CloudFormation Guard rules</a> in the AWS CloudFormation Guard User Guide.</p>
<p>For our purposes, the <a href="https://docs.aws.amazon.com/controltower/latest/controlreference/ec2-rules.html#ct-ec2-pr-7-description">CT.EC2.PR.7</a> control specification already contains everything needed to create the Hook. For instance, the rule specification is as follows:</p>
<pre><code class="language-plaintext">#####################################
##       Rule Specification        ##
#####################################
# 
# Rule Identifier:
#   ec2_encrypted_volumes_check
# 
# Description:
#   Checks whether standalone Amazon EC2 EBS volumes and new EC2 EBS volumes created through EC2 instance
#   Block Device Mappings are encrypted at rest.
# 
# Reports on:
#   AWS::EC2::Instance, AWS::EC2::Volume
# 
# Evaluates:
#   CloudFormation, CloudFormation hook
# 
# Rule Parameters:
#   None
# 
# Scenarios:
#   Scenario: 1
#     Given: The input document is an CloudFormation or CloudFormation hook document
#       And: The input document does not contain any Amazon EC2 volume resources
#      Then: SKIP
#   Scenario: 2
#     Given: The input document is an CloudFormation or CloudFormation hook document
#       And: The input document contains an EC2 instance resource
#       And: 'BlockDeviceMappings' has not been provided or has been provided as an empty list
#      Then: SKIP
#   Scenario: 3
#     Given: The input document is an CloudFormation or CloudFormation hook document
#       And: The input document contains an EC2 instance resource
#       And: 'BlockDeviceMappings' has been provided as a non-empty list
#       And: 'Ebs' has been provided in a 'BlockDeviceMappings' configuration
#       And: 'Encrypted' has not been provided in the 'Ebs' configuration
#      Then: FAIL
#   Scenario: 4
#     Given: The input document is an CloudFormation or CloudFormation hook document
#       And: The input document contains an EC2 instance resource
#       And: 'BlockDeviceMappings' has been provided as a non-empty list
#       And: 'Ebs' has been provided in a 'BlockDeviceMappings' configuration
#       And: 'Encrypted' has been provided in the 'Ebs' configuration and set to bool(false)
#      Then: FAIL
#   Scenario: 5
#     Given: The input document is an CloudFormation or CloudFormation hook document
#       And: The input document contains an EC2 volume resource
#       And: 'Encrypted' on the EC2 volume has not been provided
#      Then: FAIL
#   Scenario: 6
#     Given: The input document is an CloudFormation or CloudFormation hook document
#       And: The input document contains an EC2 volume resource
#       And: 'Encrypted' on the EC2 volume has been provided and is set to bool(false)
#      Then: FAIL
#   Scenario: 7
#     Given: The input document is an CloudFormation or CloudFormation hook document
#       And: The input document contains an EC2 instance resource
#       And: 'BlockDeviceMappings' has been provided as a non-empty list
#       And: 'Ebs' has been provided in a 'BlockDeviceMappings' configuration
#       And: 'Encrypted' has been provided in the 'Ebs' configuration and set to bool(true)
#      Then: PASS
#   Scenario: 8
#     Given: The input document is an CloudFormation or CloudFormation hook document
#       And: The input document contains an EC2 volume resource
#       And: 'Encrypted' on the EC2 volume has been provided and is set to bool(true)
#      Then: PASS

#
# Constants
#
let EC2_VOLUME_TYPE = "AWS::EC2::Volume"
let EC2_INSTANCE_TYPE = "AWS::EC2::Instance"
let INPUT_DOCUMENT = this

#
# Assignments
#
let ec2_volumes = Resources.*[ Type == %EC2_VOLUME_TYPE ]
let ec2_instances = Resources.*[ Type == %EC2_INSTANCE_TYPE ]

#
# Primary Rules
#
rule ec2_encrypted_volumes_check when is_cfn_template(%INPUT_DOCUMENT)
                                      %ec2_volumes not empty {
    check_volume(%ec2_volumes.Properties)
        &lt;&lt;
        [CT.EC2.PR.7]: Require that an Amazon EBS volume attached to an Amazon EC2 instance is encrypted at rest
        [FIX]: Set 'Encryption' to true on EC2 EBS Volumes.
        &gt;&gt;
}

rule ec2_encrypted_volumes_check when is_cfn_hook(%INPUT_DOCUMENT, %EC2_VOLUME_TYPE) {
    check_volume(%INPUT_DOCUMENT.%EC2_VOLUME_TYPE.resourceProperties)
        &lt;&lt;
        [CT.EC2.PR.7]: Require that an Amazon EBS volume attached to an Amazon EC2 instance is encrypted at rest
        [FIX]: Set 'Encryption' to true on EC2 EBS Volumes.
        &gt;&gt;
}

rule ec2_encrypted_volumes_check when is_cfn_template(%INPUT_DOCUMENT)
                                      %ec2_instances not empty {
    check_instance(%ec2_instances.Properties)
        &lt;&lt;
        [CT.EC2.PR.7]: Require that an Amazon EBS volume attached to an Amazon EC2 instance is encrypted at rest
        [FIX]: Set 'Encryption' to true on EC2 EBS Volumes.
        &gt;&gt;
}

rule ec2_encrypted_volumes_check when is_cfn_hook(%INPUT_DOCUMENT, %EC2_INSTANCE_TYPE) {
    check_instance(%INPUT_DOCUMENT.%EC2_INSTANCE_TYPE.resourceProperties)
        &lt;&lt;
        [CT.EC2.PR.7]: Require that an Amazon EBS volume attached to an Amazon EC2 instance is encrypted at rest
        [FIX]: Set 'Encryption' to true on EC2 EBS Volumes.
        &gt;&gt;
}

#
# Parameterized Rules
#

rule check_instance(ec2_instance) {
    %ec2_instance[
        filter_ec2_instance_block_device_mappings(this)
    ] {
        BlockDeviceMappings[
            Ebs exists
            Ebs is_struct
        ] {
            check_volume(Ebs)
        }
    }
}

rule check_volume(ec2_volume) {
    %ec2_volume {
        # Scenario 2
        Encrypted exists
        # Scenarios 3 and 4
        Encrypted == true
    }
}

rule filter_ec2_instance_block_device_mappings(ec2_instance) {
    %ec2_instance {
        BlockDeviceMappings exists
        BlockDeviceMappings is_list
        BlockDeviceMappings not empty
    }
}

#
# Utility Rules
#
rule is_cfn_template(doc) {
    %doc {
        AWSTemplateFormatVersion exists  or
        Resources exists
    }
}

rule is_cfn_hook(doc, RESOURCE_TYPE) {
    %doc.%RESOURCE_TYPE.resourceProperties exists
}
</code></pre>
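<p>Before creating any AWS resources, you can exercise this rule locally with the <code>cfn-guard</code> CLI, for example <code>cfn-guard validate --rules CT.EC2.PR.7.guard --data input.json</code>. A hook-shaped input document that should fail under scenario 4 might look like the following; the shape is inferred from the rule's <code>is_cfn_hook</code> helper, and the values are illustrative:</p>

```json
{
  "AWS::EC2::Instance": {
    "resourceProperties": {
      "BlockDeviceMappings": [
        {
          "DeviceName": "/dev/xvda",
          "Ebs": {
            "VolumeSize": 20,
            "VolumeType": "gp3",
            "Encrypted": false
          }
        }
      ]
    }
  }
}
```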
<p>To implement this, I first deleted the previous Lambda Hook and its associated IAM role and policy to avoid redundant checks. Then, following the <a href="https://docs.aws.amazon.com/cloudformation-cli/latest/hooks-userguide/guard-hooks-activate-hooks.html">instructions to activate a Guard Hook</a>, I created an S3 bucket, uploaded the Guard rule as <code>CT.EC2.PR.7.guard</code>, and created the Guard Hook as shown in the screenshots below. The key configuration details were:</p>
<ul>
<li><p><strong>Hook targets</strong> should include at least <code>Cloud Control API</code>. Since proactive controls already target CloudFormation resources, adding CloudFormation as a target here is unnecessary.</p>
</li>
<li><p><strong>Actions</strong> should include <code>Create</code> and <code>Update</code>.</p>
</li>
<li><p><strong>Hook mode</strong> should be set to <code>Fail</code>.</p>
</li>
<li><p><strong>Target resources</strong> should include <code>AWS::EC2::Instance</code> and <code>AWS::EC2::Volume</code>. These resource types are specified in the control details within the <strong>AWS::ControlTower::Hook</strong> configuration.</p>
</li>
</ul>
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/61cd4fc3bf68083702212a26/e7334173-3a0e-40c9-a1ed-f3b24bded19c.png" alt="Create a Hook with Guard - step 1 details" />

<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/61cd4fc3bf68083702212a26/8d99b27e-ddd5-475a-9363-f07859732770.png" alt="Create a Hook with Guard - step 2 details" />

<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/61cd4fc3bf68083702212a26/3adf1cd7-84d2-4ce1-aef3-b43907aafd22.png" alt="Create a Hook with Guard - step 3 details" />

<p>After the Guard Hook was created, rerunning <code>terraform apply</code> failed as expected:</p>
<pre><code class="language-plaintext">awscc_ec2_instance.this: Creating...
╷
│ Error: AWS SDK Go Service Operation Incomplete
│
│   with awscc_ec2_instance.this,
│   on main.tf line 21, in resource "awscc_ec2_instance" "this":
│   21: resource "awscc_ec2_instance" "this" {
│
│ Waiting for Cloud Control API service CreateResource operation completion returned: 
│ waiter state transitioned to FAILED. StatusMessage:
│ 0708b661-d689-4451-b05f-28c8429f3836. Hook failures: HookName:
│ Private::Guard::CTEC2PR7, HookArn:
│ arn:aws:cloudformation:us-east-1:2**********1:type/hook/Private-Guard-CTEC2PR7/00000002/aws-hooks/AWS-Hooks-GuardHook/00000001.00000071,
│ HookVersion: 00000072, Time: 2026-02-24T02:28:27Z, HookMessage: Template failed     
│ validation, the following rule(s) failed: ec2_encrypted_volumes_check.
╵
</code></pre>
<p>This confirms that Guard Hooks provide a clean and scalable way to extend proactive controls to Terraform via the Cloud Control API.</p>
<h2>Next Steps</h2>
<p>Now that we have a viable approach for replicating proactive controls using Guard Hooks, the next logical step is automation at scale.</p>
<p>I submitted another feature request to the Control Tower team to publish proactive control Guard rules in a GitHub repository, similar to the <a href="https://github.com/aws-cloudformation/aws-guard-rules-registry">AWS Guard Rules Registry</a>. After cross-checking the rules in that repository against the proactive control documentation, I found they differ.</p>
<p>As a workaround, I could develop a scraper to extract rule definitions directly from the documentation and publish them into my own GitHub repository.</p>
<p>From there, I could identify an appropriate trigger, such as a CloudTrail event for updates to the <strong>AWS::ControlTower::Hook</strong>, to invoke a Lambda function that automatically manages Guard Hook replication based on enabled proactive controls.</p>
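<p>As a starting point, such a trigger could be sketched as an EventBridge event pattern matching the relevant CloudTrail management events. The pattern below is a guess at the shape; the exact <code>eventName</code> values that correspond to proactive control changes would need to be confirmed against real CloudTrail logs:</p>

```json
{
  "source": ["aws.cloudformation"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["cloudformation.amazonaws.com"],
    "eventName": ["SetTypeConfiguration"]
  }
}
```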
<p>This could make for an interesting project, and perhaps a future blog post, demonstrating the capabilities of AI coding assistants such as <a href="https://kiro.dev/">Kiro</a>.</p>
<h2>Summary</h2>
<p>In this blog post, I explored whether AWS Control Tower proactive controls can apply to resources created with Terraform. After a failed attempt, I looked for a workaround, which ultimately took the form of replicating the CloudFormation Guard rules that power the proactive controls. Hopefully, AWS will eventually implement the feature request to extend proactive controls to cover the Cloud Control API as well. In the meantime, we now have a viable and potentially automatable approach to replicate them.</p>
<p>If you enjoyed this blog post and the topic it covers, be sure to check out the <a href="https://blog.avangards.io/">Avangards Blog</a> for more content. Thanks for reading!</p>
]]></content:encoded></item><item><title><![CDATA[5 Practical Tips for the Terraform Authoring and Operations Professional Exam]]></title><description><![CDATA[Introduction
Over the past year, I have worked extensively with Terraform and AWS in building turnkey infrastructure solutions and contributing to the Terraform AWS Provider as a HashiCorp Core Contributor. As a formal validation of my Terraform expe...]]></description><link>https://blog.avangards.io/5-practical-tips-for-the-terraform-authoring-and-operations-professional-exam</link><guid isPermaLink="true">https://blog.avangards.io/5-practical-tips-for-the-terraform-authoring-and-operations-professional-exam</guid><category><![CDATA[Terraform]]></category><category><![CDATA[AWS]]></category><category><![CDATA[Certification]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Mon, 12 Jan 2026 06:45:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768200262887/3cbfd41b-9969-4f66-85b3-9a66ed0f37e1.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Over the past year, I have worked extensively with Terraform and AWS in building turnkey infrastructure solutions and contributing to the <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws">Terraform AWS Provider</a> as a <a target="_blank" href="https://www.credly.com/org/hashicorp/badge/hashicorp-core-contributor-2025">HashiCorp Core Contributor</a>. As a formal validation of my Terraform expertise, I was motivated to take the Terraform Authoring and Operations Professional Exam. I recently passed the exam, and it was such a fun and unique experience that I decided to share my study tips with the community in this blog post to raise awareness and spark interest. Let’s first take a look at what this exam is all about.</p>
<h2 id="heading-about-the-terraform-authoring-and-operations-professional-exam">About the Terraform Authoring and Operations Professional Exam</h2>
<p>The <a target="_blank" href="https://developer.hashicorp.com/certifications/infrastructure-automation#terraform-authoring-and-operations-professional-details">Terraform Authoring and Operations Professional certification</a> is not an entry-level exam. It’s intended for engineers who already have hands-on experience using Terraform in production and maintaining infrastructure over time. Passing the exam demonstrates strong skills in writing Terraform modules and managing Terraform operations in an organizational setting. The objectives of the exam include:</p>
<ol>
<li><p><strong>Manage resource lifecycle</strong> - Covers the end-to-end Terraform workflow for creating, updating, destroying, and managing infrastructure state using core CLI commands.</p>
</li>
<li><p><strong>Develop and troubleshoot dynamic configuration</strong> - Focuses on writing flexible, reusable Terraform configuration using HCL features, functions, variables, and best practices for sensitive data.</p>
</li>
<li><p><strong>Develop collaborative Terraform workflows</strong> - Addresses how Terraform is used in team and automated environments, including versioning, remote state, automation, and data sharing.</p>
</li>
<li><p><strong>Create, maintain, and use Terraform modules</strong> - Examines how to design, consume, refactor, and version Terraform modules to enable reuse and maintainability.</p>
</li>
<li><p><strong>Configure and use Terraform providers</strong> - Covers provider architecture, configuration, authentication, versioning, and troubleshooting provider-related issues.</p>
</li>
<li><p><strong>Collaborate on infrastructure as code using HCP Terraform</strong> - Focuses on operating Terraform at scale with HCP Terraform, including runs, workspaces, credential management, and governance controls.</p>
</li>
</ol>
<p>The exam is primarily lab-based, with scenarios in which you modify configuration and provision and manage infrastructure in a virtual exam environment. It also includes a small multiple-choice section that validates your knowledge of HCP Terraform and related topics. This is an online, proctored exam that is four hours in length, with an optional 15-minute break. It costs $295 USD plus tax; however, it does include a free retake.</p>
<p>If the content and format of this exam intrigue you enough to take the plunge, here are some practical tips that may help with your preparation.</p>
<h2 id="heading-tip-1-use-the-official-prep-tutorial-to-guide-your-study">Tip 1: Use the Official Prep Tutorial to Guide Your Study</h2>
<p>The <a target="_blank" href="https://developer.hashicorp.com/terraform/tutorials/pro-cert">Terraform Professional certification exam prep guide</a> was my go-to resource for studying. Both the <a target="_blank" href="https://developer.hashicorp.com/terraform/tutorials/pro-cert/pro-study">learning path</a> and the <a target="_blank" href="https://developer.hashicorp.com/terraform/tutorials/pro-cert/pro-review">exam content list</a> provide an exhaustive and structured set of topics to master, including recommended tutorials. Based on these resources, I created my own study plan and a list of things to test in my lab environment. Much of my study involved writing Terraform configuration, running commands, and observing how the system behaves.</p>
<p>While there aren’t many third-party study materials available for the exam, I did come across the Terraform Authoring and Operations Professional Study Guide, which was mentioned in <a target="_blank" href="https://www.reddit.com/r/Terraform/comments/1gj1uul/skip_terraform_associate_003_cert_and_go_straight/">this Reddit thread</a> I found during my research. I read a free chapter of the book and found it to be quite well written, although I personally have enough Terraform and AWS experience that I could do without it.</p>
<h2 id="heading-tip-2-practical-experience-with-terraform-and-aws-is-a-must">Tip 2: Practical Experience with Terraform and AWS Is a Must</h2>
<p>Unlike the <a target="_blank" href="https://developer.hashicorp.com/certifications/infrastructure-automation#terraform-associate-(004)-details">HashiCorp Certified: Terraform Associate Exam</a> and many cloud provider exams, passing the Terraform Authoring and Operations Professional Exam requires extensive hands-on experience with Terraform. Learning from basic tutorials alone will not help much, and it is futile to try studying and memorizing your way to a passing grade for this exam.</p>
<p>Although the <a target="_blank" href="https://developer.hashicorp.com/certifications/infrastructure-automation#terraform-authoring-and-operations-professional-details">exam details</a> do not list AWS experience as a prerequisite, you must know how to deploy resources from core AWS services, such as Amazon EC2, Amazon VPC, and AWS IAM, to complete the lab-based scenarios efficiently. The list of <a target="_blank" href="https://developer.hashicorp.com/terraform/tutorials/pro-cert/pro-review#aws-resources-to-review">AWS resources to review</a> in the exam content outline is a good indicator of what you need to know and brush up on before the exam.</p>
<p>The required Terraform knowledge is also advanced, as you will need to employ dynamic configuration (think functions, modules, and meta-arguments) and state manipulation to complete the lab-based scenarios. Even in my professional services role with daily interaction with AWS and Terraform, I rarely had a need to work so extensively with states and advanced Terraform features. Fortunately, I explicitly practiced based on cues from Reddit and the study guide, which left me well prepared. Make sure you go through tutorials on these advanced topics and try them out in a sandbox environment.</p>
<p>Additionally, even if you work with Terraform regularly, your CLI experience likely amounts to running <code>terraform init</code>, <code>terraform plan</code>, and <code>terraform apply</code>. You should therefore review and experiment with the full list of CLI commands and their various options. I, for one, learned about an <em>experimental</em> option for a particular CLI command that proved helpful during the exam.</p>
<h2 id="heading-tip-3-know-your-way-around-the-exam-environment">Tip 3: Know Your Way Around the Exam Environment</h2>
<p>The exam is conducted within a <a target="_blank" href="https://guacamole.apache.org/">Guacamole</a>-powered Linux virtual desktop environment that is accessed via your web browser. The video walkthrough on the <a target="_blank" href="https://developer.hashicorp.com/terraform/tutorials/pro-cert/pro-orientation">exam orientation page</a> should give you an idea of what the exam environment looks like. The exam is primarily delivered through a local web page in the preinstalled <a target="_blank" href="https://www.firefox.com/">Mozilla Firefox</a> web browser. It provides the exam instructions and lab scenarios, a section for completing the multiple-choice questions, and links to whitelisted resources such as Terraform and provider documentation.</p>
<p>For the lab-based scenarios, the preinstalled <a target="_blank" href="https://code.visualstudio.com/">Visual Studio Code</a> is your main interface. The IDE has some useful extensions, such as the <a target="_blank" href="https://marketplace.visualstudio.com/items?itemName=HashiCorp.terraform">HashiCorp Terraform extension</a>, preinstalled, giving you access to features like <a target="_blank" href="https://marketplace.visualstudio.com/items?itemName=HashiCorp.terraform#intellisense-and-autocomplete">IntelliSense</a>. However, the IDE may not be fully configured with all the quality-of-life features that you’d expect (such as format on save), so you will either have to change the settings yourself or bear with the minor inconveniences. One thing I found to be extremely helpful is the terminal in VS Code. Even though it is not mentioned in the exam instructions, opening the terminal in VS Code provides you with a prompt to navigate to the directory of each lab scenario. This made it more efficient for me to open files and run commands, without having to browse and open files from the desktop. If you are not already using VS Code for Terraform development, be sure to familiarize yourself with it and the Terraform extension before the exam.</p>
<p>One notable annoyance with the exam environment is that if you use your mouse side buttons to navigate between documentation pages in Firefox, the shortcut is instead recognized by your main web browser and leads you out of the exam session. You’d then have to ask the proctor to let you back in. I have also seen other exam takers reporting similar issues with keyboard shortcuts. It was difficult for me to break this habit during the exam, so I accidentally exited the exam environment several times, much to the proctor’s dismay, which he made clear in the chat. After the exam, I reported the issue and was contacted by someone from IBM, so I hope this can be fixed soon.</p>
<p>Otherwise, it wasn’t too difficult to adapt to the copy-and-paste shortcuts, and overall latency was acceptable.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">ℹ</div>
<div data-node-type="callout-text">I noticed that <a target="_self" href="https://github.com/features/copilot">GitHub Copilot</a> was installed and seemingly usable in VS Code. Although tempting, I did not use it during the exam, as it would likely violate the exam terms.</div>
</div>

<h2 id="heading-tip-4-learn-to-navigate-terraform-and-aws-documentation-efficiently">Tip 4: Learn to Navigate Terraform and AWS Documentation Efficiently</h2>
<p>During the exam, you can access the <a target="_blank" href="https://developer.hashicorp.com/terraform/docs">Terraform documentation</a>, the <a target="_blank" href="https://registry.terraform.io/">Terraform Registry</a>, the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest">AWS provider documentation</a>, and some AWS documentation in Firefox. However, you will not be able to use a search engine, so you will need to rely on navigating the documentation itself.</p>
<p>You should know where to find CLI command references for available options, as well as configuration construct references such as functions and built-in resources. You can navigate to these topics from the main documentation page via <a target="_blank" href="https://developer.hashicorp.com/terraform/cli">Terraform CLI</a> and <a target="_blank" href="https://developer.hashicorp.com/terraform/language">Configuration Language</a> in the left-hand menu, respectively:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768190548965/fac1d1a4-97a6-4a1a-a7cd-996d7b53089f.png" alt="Where to find language and CLI references in the Terraform documentation" class="image--center mx-auto" /></p>
<p>As I am fairly proficient with Terraform and practiced before the exam, I only referred to the documentation a couple of times. Where it helped me most was in validating my answers to multiple-choice questions related to platform and enterprise features that I am not experienced with.</p>
<p>Navigating the AWS provider documentation should be straightforward, but you should know where to find information about <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs#argument-reference">provider configuration</a> and <a target="_blank" href="https://developer.hashicorp.com/terraform/tutorials/pro-cert/pro-review">in-scope resources for the exam</a>. As I work with AWS almost on a daily basis, I didn’t need to refer to AWS documentation at all. Overall, you should familiarize yourself with navigating the documentation without relying on search.</p>
<h2 id="heading-tip-5-manage-your-time-effectively-during-the-exam">Tip 5: Manage Your Time Effectively During the Exam</h2>
<p>The Terraform Authoring and Operations Professional Exam is <a target="_blank" href="https://developer.hashicorp.com/certifications/infrastructure-automation#terraform-authoring-and-operations-professional-details">4 hours in duration with an optional 15-minute break</a>, making it the longest IT exam I’ve taken to date. That said, you will likely need the entire allotted time to complete the exam. The majority of your time will be spent on lab-based scenarios that are aligned with the <a target="_blank" href="https://developer.hashicorp.com/terraform/tutorials/pro-cert/pro-review">exam objectives</a> by theme. Be sure to review each lab scenario description thoroughly and complete them “to spec”. The code doesn’t have to be pretty - it just needs to work correctly.</p>
<p>It may also be wise to time-box each scenario, perhaps to 45 minutes, so that you at least have an opportunity to attempt every question. I recall getting stuck on a very specific provider-related issue in the second scenario and spending about an hour on it before deciding to park it and move on. I was fortunately able to complete the other scenarios fairly quickly, which afforded me about 30 minutes to figure out the earlier scenario and double-check my answers to the multiple-choice questions. I completed the exam right at the time limit with confidence and received a passing notification an hour or two later.</p>
<h2 id="heading-summary">Summary</h2>
<p>The Terraform Authoring and Operations Professional Exam is a challenging, hands-on exam that really tests how well you know Terraform in real-world scenarios. In this blog post, I shared what the exam is like and the study strategies that worked for me - from focusing on hands-on practice and advanced Terraform features to getting comfortable with the exam environment and managing your time effectively.</p>
<p>If you’re thinking about taking the exam, I hope these tips help you prepare with more confidence. And if you found this useful, feel free to check out my other blog posts in the <a target="_blank" href="https://blog.avangards.io/">Avangards Blog</a>, where I share more lessons learned from working with Terraform, AWS, and infrastructure as code. Good luck and happy “Terraforming”!</p>
]]></content:encoded></item><item><title><![CDATA[Using Amazon Bedrock Knowledge Base Application Logs for Notifications]]></title><description><![CDATA[Introduction
In the earlier blog post Building a Data Ingestion Solution for Amazon Bedrock Knowledge Bases, we developed a data ingestion solution that includes job completion notifications with a status pull mechanism which wasn’t as efficient as i...]]></description><link>https://blog.avangards.io/using-amazon-bedrock-knowledge-base-application-logs-for-notifications</link><guid isPermaLink="true">https://blog.avangards.io/using-amazon-bedrock-knowledge-base-application-logs-for-notifications</guid><category><![CDATA[AWS]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Wed, 02 Apr 2025 02:57:24 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740971762186/b76af42a-1b4f-4dbd-b685-4e1f590ff6a6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the earlier blog post <a target="_blank" href="https://blog.avangards.io/building-a-data-ingestion-solution-for-amazon-bedrock-knowledge-bases">Building a Data Ingestion Solution for Amazon Bedrock Knowledge Bases</a>, we developed a data ingestion solution whose job completion notifications relied on a status pull mechanism, which wasn’t as efficient as it could be. Since then, we examined <a target="_blank" href="https://blog.avangards.io/enabling-logging-for-amazon-bedrock-knowledge-bases-using-terraform">Knowledge Bases logging</a>, which publishes ingestion job log events to CloudWatch Logs and opens up the opportunity for a better design: a status push mechanism based on subscription filters. In this blog post, we will examine how to update the original solution with the new design.</p>
<h2 id="heading-updated-design-overview">Updated Design Overview</h2>
<p>The overall design of the updated solution is depicted in the following diagram:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740971777923/65453b1d-e5b0-4669-9a94-2e5c9d65e14f.png" alt="Updated solution architecture" class="image--center mx-auto" /></p>
<p>The updated solution works as follows:</p>
<ol>
<li><p>Logging is configured for the Bedrock Knowledge Base to deliver logs to CloudWatch Logs. A subscription filter is created in the associated log group to filter ingestion job status change events that correspond to an end state and send log events to a Lambda function.</p>
</li>
<li><p>A Lambda function, triggered by an EventBridge schedule rule, periodically starts an ingestion (a.k.a. sync) job for each specified knowledge base and data source. Note that the SQS queue is removed as it is no longer necessary.</p>
</li>
<li><p>Another Lambda function serves as the destination for the subscription filter. For each event message that is extracted from the log events, the function uses the job ID information to get details about the ingestion job. A notification is sent to one of the two SNS topics depending on whether the job is successful or failed.</p>
</li>
</ol>
<h2 id="heading-updating-the-components">Updating the Components</h2>
<p>As the SQS queue is not required, the only change to the Lambda function that starts the ingestion job is a minor cleanup. The updated Lambda function code is as follows:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> boto3
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> botocore.exceptions <span class="hljs-keyword">import</span> ClientError

bedrock_agent = boto3.client(<span class="hljs-string">'bedrock-agent'</span>)
ssm = boto3.client(<span class="hljs-string">'ssm'</span>)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">lambda_handler</span>(<span class="hljs-params">event, context</span>):</span>
    <span class="hljs-keyword">try</span>:
        <span class="hljs-comment"># Retrieve the JSON config from Parameter Store</span>
        response = ssm.get_parameter(Name=<span class="hljs-string">'/start-kb-ingestion-jobs/config-json'</span>)
        config_json = response[<span class="hljs-string">'Parameter'</span>][<span class="hljs-string">'Value'</span>]
        config = json.loads(config_json)

        <span class="hljs-keyword">for</span> record <span class="hljs-keyword">in</span> config:
            knowledge_base_id = record.get(<span class="hljs-string">'knowledge_base_id'</span>)
            <span class="hljs-keyword">for</span> data_source_id <span class="hljs-keyword">in</span> record.get(<span class="hljs-string">'data_source_ids'</span>):
                <span class="hljs-comment"># Start the ingestion job</span>
                print(<span class="hljs-string">f'Starting ingestion job for data source <span class="hljs-subst">{data_source_id}</span> of knowledge base <span class="hljs-subst">{knowledge_base_id}</span>'</span>)
                response = bedrock_agent.start_ingestion_job(
                    knowledgeBaseId=knowledge_base_id,
                    dataSourceId=data_source_id
                )
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">200</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">'Success'</span>
        }
    <span class="hljs-keyword">except</span> ClientError <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">500</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">f'Client error: <span class="hljs-subst">{str(e)}</span>'</span>
        }
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">500</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">f'Unexpected error: <span class="hljs-subst">{str(e)}</span>'</span>
        }
</code></pre>
<p>Meanwhile, updating the component that checks ingestion job statuses is slightly more complex. First, we need to update the <code>check-kb-job-statuses</code> Lambda function to be a subscription filter target. As described in the <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html">Log group-level subscription filters</a> page of the CloudWatch Logs user guide, the log data received by the function is compressed, Base64-encoded, and batched. I easily found <a target="_blank" href="https://stackoverflow.com/questions/50295838/cloudwatch-logs-stream-to-lambda-python">this StackOverflow question</a>, whose first answer has the exact code we need.</p>
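<p>The unpacking step is easy to verify on its own. The following sketch builds a synthetic subscription filter payload locally to demonstrate the round trip; the sample message content is illustrative only:</p>

```python
import base64
import gzip
import json

def decode_awslogs_payload(event):
    """Decode the gzip-compressed, Base64-encoded batch that CloudWatch Logs
    delivers to a subscription filter's Lambda target."""
    zipped_data = base64.b64decode(event['awslogs']['data'])
    return json.loads(gzip.decompress(zipped_data))

# Build a synthetic payload locally to demonstrate the round trip
sample = {'logEvents': [{'id': '1', 'timestamp': 1740895462316,
                         'message': '{"level": "INFO"}'}]}
encoded = base64.b64encode(gzip.compress(json.dumps(sample).encode('utf-8')))
event = {'awslogs': {'data': encoded.decode('ascii')}}

for log_event in decode_awslogs_payload(event)['logEvents']:
    print(log_event['message'])  # → {"level": "INFO"}
```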
<p>Next, we need to know what a relevant log event looks like. The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-bases-logging.html#knowledge-bases-logging-example-logs">examples of knowledge base logs</a> in the AWS documentation provide the general format for an ingestion job event; however, it is preferable to look at an actual log event. Here’s one that captures a job completion event for a successful job:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"event_timestamp"</span>: <span class="hljs-number">1740895462316</span>,
    <span class="hljs-attr">"event"</span>: {
        <span class="hljs-attr">"ingestion_job_id"</span>: <span class="hljs-string">"W0V45LVZY6"</span>,
        <span class="hljs-attr">"data_source_id"</span>: <span class="hljs-string">"ATUWOVZJOD"</span>,
        <span class="hljs-attr">"ingestion_job_status"</span>: <span class="hljs-string">"COMPLETE"</span>,
        <span class="hljs-attr">"knowledge_base_arn"</span>: <span class="hljs-string">"arn:aws:bedrock:us-east-1:&lt;redacted&gt;:knowledge-base/R1K1UIZKKQ"</span>,
        <span class="hljs-attr">"resource_statistics"</span>: {
            <span class="hljs-attr">"number_of_resources_updated"</span>: <span class="hljs-number">366</span>,
            <span class="hljs-attr">"number_of_resources_ingested"</span>: <span class="hljs-number">0</span>,
            <span class="hljs-attr">"number_of_resources_deleted"</span>: <span class="hljs-number">0</span>,
            <span class="hljs-attr">"number_of_resources_with_metadata_updated"</span>: <span class="hljs-number">0</span>,
            <span class="hljs-attr">"number_of_resources_failed"</span>: <span class="hljs-number">15</span>
        }
    },
    <span class="hljs-attr">"event_version"</span>: <span class="hljs-string">"1.0"</span>,
    <span class="hljs-attr">"event_type"</span>: <span class="hljs-string">"StartIngestionJob.StatusChanged"</span>,
    <span class="hljs-attr">"level"</span>: <span class="hljs-string">"INFO"</span>
}
</code></pre>
<p>The Lambda function must extract the following details from the log event:</p>
<ol>
<li><p>The knowledge base ID, which can be extracted from the value of the <code>event.knowledge_base_arn</code> field, specifically the segment after the last <code>/</code>.</p>
</li>
<li><p>The data source ID, which is the value of the <code>event.data_source_id</code> field.</p>
</li>
<li><p>The ingestion job ID, which is the value of the <code>event.ingestion_job_id</code> field.</p>
</li>
</ol>
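<p>The extraction can be checked with a few lines of plain Python using the fields from the sample log event above (the account number in the ARN is a placeholder):</p>

```python
import json

# Message fields taken from the sample log event above; the account
# number in the ARN is a placeholder
message = json.loads("""{
    "event": {
        "ingestion_job_id": "W0V45LVZY6",
        "data_source_id": "ATUWOVZJOD",
        "ingestion_job_status": "COMPLETE",
        "knowledge_base_arn": "arn:aws:bedrock:us-east-1:123456789012:knowledge-base/R1K1UIZKKQ"
    },
    "event_type": "StartIngestionJob.StatusChanged"
}""")

# The knowledge base ID is the segment after the last '/' in the ARN
knowledge_base_id = message['event']['knowledge_base_arn'].split('/')[-1]
data_source_id = message['event']['data_source_id']
ingestion_job_id = message['event']['ingestion_job_id']

print(knowledge_base_id, data_source_id, ingestion_job_id)
# → R1K1UIZKKQ ATUWOVZJOD W0V45LVZY6
```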
<p>Although we are able to extract all of the information required for notification, the log event does not contain the verbose content ingestion failure details that we get from the response of the <code>GetIngestionJob</code> API action. Calling the API is slightly less efficient, but we will still do so for completeness. The resulting Lambda function should look like this:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> base64
<span class="hljs-keyword">import</span> boto3
<span class="hljs-keyword">import</span> gzip
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> botocore.exceptions <span class="hljs-keyword">import</span> ClientError

bedrock_agent = boto3.client(<span class="hljs-string">'bedrock-agent'</span>)
ssm = boto3.client(<span class="hljs-string">'ssm'</span>)
sns = boto3.client(<span class="hljs-string">'sns'</span>)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_ssm_parameter</span>(<span class="hljs-params">name</span>):</span>
    response = ssm.get_parameter(Name=name, WithDecryption=<span class="hljs-literal">True</span>)
    <span class="hljs-keyword">return</span> response[<span class="hljs-string">'Parameter'</span>][<span class="hljs-string">'Value'</span>]


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_ingestion_job</span>(<span class="hljs-params">knowledge_base_id, data_source_id, ingestion_job_id</span>):</span>
    response = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=ingestion_job_id
    )
    <span class="hljs-keyword">return</span> response[<span class="hljs-string">'ingestionJob'</span>]


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">lambda_handler</span>(<span class="hljs-params">event, context</span>):</span>
    <span class="hljs-keyword">try</span>:
        success_sns_topic_arn = get_ssm_parameter(<span class="hljs-string">'/check-kb-ingestion-job-statuses/success-sns-topic-arn'</span>)
        failure_sns_topic_arn = get_ssm_parameter(<span class="hljs-string">'/check-kb-ingestion-job-statuses/failure-sns-topic-arn'</span>)

        encoded_zipped_data = event[<span class="hljs-string">'awslogs'</span>][<span class="hljs-string">'data'</span>]
        zipped_data = base64.b64decode(encoded_zipped_data)
        data = json.loads(gzip.decompress(zipped_data))
        log_events = data[<span class="hljs-string">'logEvents'</span>]
        <span class="hljs-keyword">for</span> log_event <span class="hljs-keyword">in</span> log_events:
            message = json.loads(log_event[<span class="hljs-string">'message'</span>])
            knowledge_base_arn = message[<span class="hljs-string">'event'</span>][<span class="hljs-string">'knowledge_base_arn'</span>]
            knowledge_base_id = knowledge_base_arn.split(<span class="hljs-string">'/'</span>)[<span class="hljs-number">-1</span>]
            data_source_id = message[<span class="hljs-string">'event'</span>][<span class="hljs-string">'data_source_id'</span>]
            ingestion_job_id = message[<span class="hljs-string">'event'</span>][<span class="hljs-string">'ingestion_job_id'</span>]

            print(
                <span class="hljs-string">f'Checking ingestion job status for knowledge base <span class="hljs-subst">{knowledge_base_id}</span> data source <span class="hljs-subst">{data_source_id}</span> job <span class="hljs-subst">{ingestion_job_id}</span>'</span>)
            ingestion_job = get_ingestion_job(knowledge_base_id, data_source_id, ingestion_job_id)
            print(
                <span class="hljs-string">f'Ingestion job summary: \n\n<span class="hljs-subst">{json.dumps(ingestion_job, indent=<span class="hljs-number">2</span>, sort_keys=<span class="hljs-literal">True</span>, default=str)}</span>'</span>)
            job_status = ingestion_job[<span class="hljs-string">'status'</span>]
            <span class="hljs-keyword">if</span> job_status == <span class="hljs-string">'COMPLETE'</span>:
                sns.publish(
                    TopicArn=success_sns_topic_arn,
                    Subject=<span class="hljs-string">f'Ingestion job for knowledge base <span class="hljs-subst">{knowledge_base_id}</span> data source <span class="hljs-subst">{data_source_id}</span> job <span class="hljs-subst">{ingestion_job_id}</span> Completed'</span>,
                    Message=json.dumps(ingestion_job, indent=<span class="hljs-number">2</span>, sort_keys=<span class="hljs-literal">True</span>, default=str)
                )
            <span class="hljs-keyword">elif</span> job_status == <span class="hljs-string">'FAILED'</span>:
                sns.publish(
                    TopicArn=failure_sns_topic_arn,
                    Subject=<span class="hljs-string">f'Ingestion job for knowledge base <span class="hljs-subst">{knowledge_base_id}</span> data source <span class="hljs-subst">{data_source_id}</span> job <span class="hljs-subst">{ingestion_job_id}</span> FAILED'</span>,
                    Message=json.dumps(ingestion_job, indent=<span class="hljs-number">2</span>, sort_keys=<span class="hljs-literal">True</span>, default=str)
                )
            <span class="hljs-keyword">elif</span> job_status == <span class="hljs-string">'STOPPED'</span>:
                sns.publish(
                    TopicArn=failure_sns_topic_arn,
                    Subject=<span class="hljs-string">f'Ingestion job for knowledge base <span class="hljs-subst">{knowledge_base_id}</span> data source <span class="hljs-subst">{data_source_id}</span> job <span class="hljs-subst">{ingestion_job_id}</span> STOPPED'</span>,
                    Message=json.dumps(ingestion_job, indent=<span class="hljs-number">2</span>, sort_keys=<span class="hljs-literal">True</span>, default=str)
                )
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">200</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">'Success'</span>
        }
    <span class="hljs-keyword">except</span> ClientError <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">500</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">f'Client error: <span class="hljs-subst">{str(e)}</span>'</span>
        }
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">500</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">f'Unexpected error: <span class="hljs-subst">{str(e)}</span>'</span>
        }
</code></pre>
<p>Lastly, we need to create a subscription filter in the log group that acts as the log delivery destination of the knowledge base. Since we are only interested in log events for ingestion job completion, we need to define an appropriate subscription filter pattern. There are two fields which we need for this purpose:</p>
<ol>
<li><p>The <code>event_type</code> field with the value <code>StartIngestionJob.StatusChanged</code>.</p>
</li>
<li><p>The <code>event.ingestion_job_status</code> field with the value matching one of <code>COMPLETE</code>, <code>FAILED</code>, <code>CRAWLING_COMPLETED</code>, as described in the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-bases-logging.html#knowledge-bases-logging-example-logs">data ingestion job log example</a>.</p>
</li>
</ol>
<p>Based on some testing, a <code>CRAWLING_COMPLETED</code> event does not indicate that an ingestion job has fully completed, whereas a <code>COMPLETE</code> (and presumably <code>FAILED</code>) event is always sent upon job completion. So we can use <code>COMPLETE</code> and <code>FAILED</code> for the filter. Furthermore, stopping a job does not generate an event, and there is no status value for it. This seems like a miss on AWS’ part, so I’ll open an AWS support case for it. For now, we will still add <code>STOPPED</code> to the filter for the sake of completeness.</p>
<p>Referring to <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/FilterAndPatternSyntax.html#matching-terms-json-log-events">subscription filter pattern for JSON log events</a>, we can define our compound expression that checks for the event type and ingestion job status as follows:</p>
<pre><code class="lang-plaintext">{$.event_type = "StartIngestionJob.StatusChanged" &amp;&amp; ($.event.ingestion_job_status = "COMPLETE" || $.event.ingestion_job_status = "FAILED" || $.event.ingestion_job_status = "STOPPED")}
</code></pre>
<p>We can first test the pattern in the AWS Management Console's subscription filter creation dialog without creating the filter. Later, we will implement it using Terraform. Here is a screenshot of what the dialog looks like:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740976706072/a640d3ac-8188-4258-9d58-f5cd7e5ce18e.png" alt="Testing pattern in the subscription filter creation dialog" class="image--center mx-auto" /></p>
<p>In this example, the subscription filter is created in the log group, as evidenced by the standard naming pattern. Knowledge base logs are written to the log stream <code>bedrock/knowledgebaselogs</code>, so we need to select it. Using the <strong>Test pattern</strong> button, we can see one filtered entry in the test results among 50 log events. All log events were generated by a single ingestion job; the other events are either resource change events or unrelated status change events.</p>
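<p>For reference, CloudWatch Logs delivers matched events to a Lambda target as a gzip-compressed, base64-encoded payload under the <code>awslogs.data</code> key. The following is a minimal sketch of how a notification function might unpack it; the function name and return shape are illustrative, not the exact code from this solution:</p>

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Unpack the payload CloudWatch Logs sends to a subscription filter's
    Lambda target: base64-decode, gunzip, then parse the JSON envelope."""
    payload = base64.b64decode(event["awslogs"]["data"])
    body = json.loads(gzip.decompress(payload))
    # body["logEvents"] is a list of {"id", "timestamp", "message"} records;
    # each "message" here would be a knowledge base application log event.
    return [json.loads(e["message"]) for e in body["logEvents"]]
```

<p>Each decoded message can then be inspected for <code>event.ingestion_job_status</code> before publishing to SNS.</p>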
<h2 id="heading-updating-the-terraform-configuration">Updating the Terraform Configuration</h2>
<p>The following changes are required to the original solution’s Terraform configuration to support the new design:</p>
<ul>
<li><p>Remove the SQS queue, the associated IAM permissions, and the SSM parameters.</p>
</li>
<li><p>Update the Lambda permission for the <code>check-kb-ingestion-job-statuses</code> function to allow invocation from CloudWatch Logs via the log group to which the Bedrock knowledge base writes its application logs.</p>
</li>
</ul>
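<p>As a sketch, the updated Lambda permission could look like the following, where the function resource name and the <code>kb_app_log_group_name</code> variable are assumptions based on this solution’s naming conventions:</p>

```hcl
resource "aws_lambda_permission" "check_kb_ingestion_job_statuses" {
  statement_id  = "AllowExecutionFromCloudWatchLogs"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.check_kb_ingestion_job_statuses.function_name
  # CloudWatch Logs is the service principal that invokes the function
  principal     = "logs.amazonaws.com"
  # Restrict invocation to the knowledge base's application log group
  source_arn    = "arn:${local.partition}:logs:${local.region}:${local.account_id}:log-group:${var.kb_app_log_group_name}:*"
}
```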
<p>Lastly, we need a new resource for the subscription filter as follows:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_cloudwatch_log_subscription_filter"</span> <span class="hljs-string">"check_kb_ingestion_job_statuses"</span> {
  name            = <span class="hljs-string">"check-kb-ingestion-job-statuses"</span>
  log_group_name  = var.kb_app_log_group_name
  filter_pattern  = <span class="hljs-string">"{$.event_type = \"StartIngestionJob.StatusChanged\" &amp;&amp; ($.event.ingestion_job_status = \"COMPLETE\" || $.event.ingestion_job_status = \"FAILED\" || $.event.ingestion_job_status = \"STOPPED\")}"</span>
  destination_arn = aws_lambda_function.check_kb_ingestion_job_statuses.arn
  depends_on      = [aws_lambda_permission.check_kb_ingestion_job_statuses]
}
</code></pre>
<p>Note that the log group name is provided as a variable. It should follow the default format provided by AWS, which is <code>/aws/vendedlogs/bedrock/knowledge-base/APPLICATION_LOGS/&lt;KB_ID&gt;</code>, where <code>&lt;KB_ID&gt;</code> is the Bedrock knowledge base ID.</p>
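<p>A minimal sketch of the corresponding variable declaration could look like this (the variable name matches the usage above; the description is illustrative):</p>

```hcl
variable "kb_app_log_group_name" {
  description = "Name of the CloudWatch log group that receives the knowledge base's application logs"
  type        = string
}
```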
<h2 id="heading-deploying-and-testing-the-solution">Deploying and Testing the Solution</h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">You can find the complete Terraform configuration and source code in the <code>5_kb_data_ingestion_via_logs</code> directory in <a target="_self" href="https://github.com/acwwat/terraform-amazon-bedrock-agent-example">this GitHub repository</a>.</div>
</div>

<p>To deploy and test the solution, you need a knowledge base with at least one data source that has content to ingest either in an S3 bucket or a crawlable website. You can set this up in the Bedrock console using the vector database quick start options. Alternatively, deploy a sample knowledge base using the Terraform configuration from my blog post <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-knowledge-base-using-terraform">How To Manage an Amazon Bedrock Knowledge Base Using Terraform</a>. This configuration is also available in the same GitHub repository under the <code>2_knowledge_base</code> directory.</p>
<p>Additionally, you must also change the knowledge base’s logging configuration to deliver application logs to CloudWatch Logs. You can enable it either manually following the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-bases-logging.html">AWS documentation</a> or using the Terraform configuration from my previous blog post <a target="_blank" href="https://blog.avangards.io/enabling-logging-for-amazon-bedrock-knowledge-bases-using-terraform">Enabling Logging for Amazon Bedrock Knowledge Bases using Terraform</a>. This configuration is also available in the same GitHub repository under the <code>4_kb_logging</code> directory.</p>
<p>With the prerequisites in place, deploy the solution as follows:</p>
<ol>
<li><p>From the root of the cloned GitHub repository, navigate to <code>5_kb_data_ingestion_via_logs</code>.</p>
</li>
<li><p>Copy <code>terraform.tfvars.example</code> as <code>terraform.tfvars</code> and update the variables to match your configuration.</p>
<ul>
<li>By default, the <code>start-kb-ingestion-jobs</code> Lambda function runs daily at 0:00 UTC.</li>
</ul>
</li>
<li><p>Configure your AWS credentials.</p>
</li>
<li><p>Run <code>terraform init</code> and <code>terraform apply -var-file terraform.tfvars</code>.</p>
</li>
</ol>
<p>Once deployed, test the solution by subscribing your email address to the SNS topics <code>check-kb-ingestion-job-statuses-success</code> and <code>check-kb-ingestion-job-statuses-failure</code> so that you can receive email notifications. Confirm your subscriptions using the link in the verification emails.</p>
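<p>If you prefer to manage the test subscriptions in Terraform rather than the console, a minimal sketch could look like the following; the topic resource names and email address are assumptions for illustration:</p>

```hcl
resource "aws_sns_topic_subscription" "success_email" {
  topic_arn = aws_sns_topic.check_kb_ingestion_job_statuses_success.arn
  protocol  = "email"
  endpoint  = "you@example.com" # replace with your email address
}

resource "aws_sns_topic_subscription" "failure_email" {
  topic_arn = aws_sns_topic.check_kb_ingestion_job_statuses_failure.arn
  protocol  = "email"
  endpoint  = "you@example.com"
}
```

<p>Note that email subscriptions still require manual confirmation via the verification email; Terraform cannot confirm them on your behalf.</p>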
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1741063793761/bfc90696-b71f-45f8-9fe9-d311c029a49c.png" alt="Adding an email subscription to the SNS topics" class="image--center mx-auto" /></p>
<p>Next, manually invoke the <code>start-kb-ingestion-jobs</code> Lambda function in the Lambda console.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1741064302756/04867b33-a1e1-4c17-8945-d71db826e1d4.png" alt="Invoking the start-kb-ingestion-jobs Lambda function manually" class="image--center mx-auto" /></p>
<p>As the ingestion jobs run and complete, logs are written to CloudWatch Logs and pass through the subscription filter. The status change events should be filtered and sent to the Lambda function for notification, ultimately leading to the emails you’ll receive. Here’s an example:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1741065689669/ebb70362-6316-4a76-a04b-eb21a11d63f6.png" alt="Success email notification" class="image--center mx-auto" /></p>
<p>Once you've verified that the solution works, remove the test email subscriptions and replace them with subscriptions that better fit your needs. If you don’t plan to keep the knowledge base, delete it along with the vector store (for example, the OSS index) to avoid unnecessary costs.</p>
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, we improved the original Bedrock Knowledge Bases data ingestion solution with push-based notification using CloudWatch features. This is likely more efficient than a scheduled pull-based mechanism and lets us leverage a CloudWatch Logs subscription filter that targets a Lambda function.</p>
<p>That being said, the ideal solution would be an <a target="_blank" href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-rules.html">EventBridge rule</a> to react to native ingestion job events from Bedrock. The Bedrock service unfortunately does not publish such events today, but I’ve made a feature request via an AWS support case. Hopefully this will be supported soon and we can evolve our data ingestion solution further.</p>
<p>I hope you find this blog post helpful and engaging. Please feel free to check out my other blog posts in the <a target="_blank" href="https://blog.avangards.io/">Avangards Blog</a>. Take care and happy learning!</p>
]]></content:encoded></item><item><title><![CDATA[Enabling Logging for Amazon Bedrock Knowledge Bases using Terraform]]></title><description><![CDATA[Introduction
In the recent blog post Building a Data Ingestion Solution for Amazon Bedrock Knowledge Bases, we created a data ingestion solution that includes job completion notifications with a status pull mechanism. Not satisfied with how frequentl...]]></description><link>https://blog.avangards.io/enabling-logging-for-amazon-bedrock-knowledge-bases-using-terraform</link><guid isPermaLink="true">https://blog.avangards.io/enabling-logging-for-amazon-bedrock-knowledge-bases-using-terraform</guid><category><![CDATA[AWS]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Sun, 02 Mar 2025 20:10:07 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740868574957/2738276a-19a0-4c0f-8706-49aaae6cd152.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the recent blog post <a target="_blank" href="https://blog.avangards.io/building-a-data-ingestion-solution-for-amazon-bedrock-knowledge-bases">Building a Data Ingestion Solution for Amazon Bedrock Knowledge Bases</a>, we created a data ingestion solution that includes job completion notifications with a status pull mechanism. Not satisfied with how frequently the Lambda function must run to check job statuses, I looked into whether a push mechanism is available.</p>
<p>From my research, I found that <a target="_blank" href="https://aws.amazon.com/about-aws/whats-new/2024/06/knowledge-bases-amazon-bedrock-observability-logs/">Bedrock Knowledge Bases supports observability logs</a> and that it logs events related to content ingestion. With support for log delivery to CloudWatch Logs, it unlocks the possibility of using a subscription filter to push ingestion job completion log events. Consequently, I dedicated this blog post to reviewing this feature and determining how to enable it efficiently using Terraform.</p>
<p>With this context, let’s first look at how CloudWatch log delivery works in general and how it applies to Bedrock Knowledge Bases.</p>
<h2 id="heading-creating-a-delivery-source-for-the-knowledge-base">Creating a Delivery Source for the Knowledge Base</h2>
<p>Bedrock Knowledge Bases is one of the AWS services that uses the <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AWS-logs-and-resource-policy.html">log delivery feature in CloudWatch Logs</a> to write vended logs. This is a framework that provides a standard interface to configure logging, which typically involves a <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_DeliverySource.html">delivery source</a>, a <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_DeliveryDestination.html">delivery destination</a>, and a <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_Delivery.html">delivery</a> that enables logging by linking the two.</p>
<p>As per <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-bases-logging.html">Monitor knowledge bases using CloudWatch Logs</a>, Bedrock Knowledge Bases currently only support application logs. Thus, we can create the delivery source in Terraform using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_delivery_source"><code>aws_cloudwatch_log_delivery_source</code> resource</a> as follows:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_cloudwatch_log_delivery_source"</span> <span class="hljs-string">"kb_logs"</span> {
  count        = var.enable_kb_log_delivery_cloudwatch_logs || var.enable_kb_log_delivery_s3 || var.enable_kb_log_delivery_data_firehose ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  name         = <span class="hljs-string">"bedrock-kb-${var.kb_id}"</span>
  log_type     = <span class="hljs-string">"APPLICATION_LOGS"</span>
  resource_arn = <span class="hljs-string">"arn:${local.partition}:bedrock:${local.region}:${local.account_id}:knowledge-base/${var.kb_id}"</span>
}
</code></pre>
<p>Notice that there is a condition to create the resource only if at least one of the log delivery options is enabled via variables; these variables are also used in the destination-specific configurations explained in subsequent sections. This makes the configuration more generic and pluggable into the Terraform configuration that manages your Bedrock Agents and Knowledge Bases.</p>
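<p>As a sketch, the three toggle variables might be declared as follows, with only CloudWatch Logs delivery enabled by default; the descriptions and defaults are assumptions consistent with how the variables are used in this configuration:</p>

```hcl
variable "enable_kb_log_delivery_cloudwatch_logs" {
  description = "Whether to deliver knowledge base application logs to CloudWatch Logs"
  type        = bool
  default     = true
}

variable "enable_kb_log_delivery_s3" {
  description = "Whether to deliver knowledge base application logs to S3"
  type        = bool
  default     = false
}

variable "enable_kb_log_delivery_data_firehose" {
  description = "Whether to deliver knowledge base application logs to Data Firehose"
  type        = bool
  default     = false
}
```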
<h2 id="heading-sending-logs-to-cloudwatch-logs">Sending Logs to CloudWatch Logs</h2>
<p>To send logs to CloudWatch Logs, we need to create a log group and configure it as destination for delivery. Creating a log group is simple enough using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_group"><code>aws_cloudwatch_log_group</code> resource</a>. The log group name should follow the default format provided by AWS, which is <code>/aws/vendedlogs/bedrock/knowledge-base/APPLICATION_LOGS/&lt;KB_ID&gt;</code>, where <code>&lt;KB_ID&gt;</code> is the Bedrock knowledge base ID.</p>
<p>Using the log group for log delivery requires a log group resource policy. As per the <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AWS-logs-and-resource-policy.html#AWS-logs-infrastructure-V2-CloudWatchLogs">AWS documentation</a>, CloudWatch Logs can automatically add the appropriate policy if the log group does not have one and the user setting up the logging has sufficient permissions. Nevertheless, for the sake of completeness, we will manually create the resource policy as described in the aforementioned documentation using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_resource_policy"><code>aws_cloudwatch_log_resource_policy</code> resource</a>.</p>
<p>Lastly, we need to create a <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_DeliveryDestination.html">delivery destination</a> for it using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_delivery_destination"><code>aws_cloudwatch_log_delivery_destination</code> resource</a>, and then establish the delivery from the source (i.e., the knowledge base) to the destination (i.e., the log group) using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_delivery"><code>aws_cloudwatch_log_delivery</code> resource</a>. The resulting Terraform configuration should look like the following:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_cloudwatch_log_group"</span> <span class="hljs-string">"kb_logs"</span> {
  count = var.enable_kb_log_delivery_cloudwatch_logs ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  name  = <span class="hljs-string">"/aws/vendedlogs/bedrock/knowledge-base/APPLICATION_LOGS/${var.kb_id}"</span>
}

resource <span class="hljs-string">"aws_cloudwatch_log_resource_policy"</span> <span class="hljs-string">"kb_logs"</span> {
  count       = var.enable_kb_log_delivery_cloudwatch_logs ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  policy_name = <span class="hljs-string">"bedrock-kb-${var.kb_id}-policy"</span>
  policy_document = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Sid    = <span class="hljs-string">"AWSLogDeliveryWrite20150319"</span>
        Effect = <span class="hljs-string">"Allow"</span>
        Principal = {
          Service = [<span class="hljs-string">"delivery.logs.amazonaws.com"</span>]
        }
        Action = [
          <span class="hljs-string">"logs:CreateLogStream"</span>,
          <span class="hljs-string">"logs:PutLogEvents"</span>
        ]
        Resource = [<span class="hljs-string">"${aws_cloudwatch_log_group.kb_logs[0].arn}:log-stream:*"</span>]
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:SourceAccount"</span> = [<span class="hljs-string">"${local.account_id}"</span>]
          },
          ArnLike = {
            <span class="hljs-string">"aws:SourceArn"</span> = [<span class="hljs-string">"arn:${local.partition}:logs:${local.region}:${local.account_id}:*"</span>]
          }
        }
      }
    ]
  })
}

resource <span class="hljs-string">"aws_cloudwatch_log_delivery_destination"</span> <span class="hljs-string">"kb_logs_cloudwatch_logs"</span> {
  count = var.enable_kb_log_delivery_cloudwatch_logs ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  name  = <span class="hljs-string">"bedrock-kb-${var.kb_id}-cloudwatch-logs"</span>
  delivery_destination_configuration {
    destination_resource_arn = aws_cloudwatch_log_group.kb_logs[<span class="hljs-number">0</span>].arn
  }
  depends_on = [aws_cloudwatch_log_resource_policy.kb_logs]
}

resource <span class="hljs-string">"aws_cloudwatch_log_delivery"</span> <span class="hljs-string">"kb_logs_cloudwatch_logs"</span> {
  count                    = var.enable_kb_log_delivery_cloudwatch_logs ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  delivery_destination_arn = aws_cloudwatch_log_delivery_destination.kb_logs_cloudwatch_logs[<span class="hljs-number">0</span>].arn
  delivery_source_name     = aws_cloudwatch_log_delivery_source.kb_logs[<span class="hljs-number">0</span>].name
}
</code></pre>
<h2 id="heading-sending-logs-to-s3">Sending Logs to S3</h2>
<p>The process to enable S3 as a delivery destination follows a similar pattern. The first step is to create the S3 bucket using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket"><code>aws_s3_bucket</code> resource</a> with a bucket policy that provides the appropriate permissions for log delivery as described in the <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AWS-logs-and-resource-policy.html#AWS-logs-infrastructure-V2-S3">AWS documentation</a>. Note that if you are using SSE-KMS for server-side encryption, you’ll also need to add the appropriate permissions to the key policy for the CMK. For completeness, we also choose not to rely on CloudWatch Logs to set the bucket policy and instead use the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_policy"><code>aws_s3_bucket_policy</code> resource</a> to manage it.</p>
<p>We also need to create a <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_DeliveryDestination.html">delivery destination</a> for it using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_delivery_destination"><code>aws_cloudwatch_log_delivery_destination</code> resource</a>, then establish the delivery from the source (i.e. the knowledge base) to the destination (i.e. the S3 bucket) using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_delivery"><code>aws_cloudwatch_log_delivery</code> resource</a>. Note that updating multiple <code>aws_cloudwatch_log_delivery</code> resources in parallel will cause concurrency issues, so we must ensure that they are created sequentially using the <code>depends_on</code> meta-argument. In this case, the delivery resource for S3 depends on that of CloudWatch Logs.</p>
<p>The resulting Terraform configuration should look like the following:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_s3_bucket"</span> <span class="hljs-string">"kb_logs_s3"</span> {
  count         = var.enable_kb_log_delivery_s3 ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  bucket        = <span class="hljs-string">"bedrock-kb-logs-${lower(var.kb_id)}-${local.region_short}-${local.account_id}"</span>
  force_destroy = true
}

resource <span class="hljs-string">"aws_s3_bucket_policy"</span> <span class="hljs-string">"kb_logs_s3"</span> {
  count  = var.enable_kb_log_delivery_s3 ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  bucket = aws_s3_bucket.kb_logs_s3[<span class="hljs-number">0</span>].id
  policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Id      = <span class="hljs-string">"AWSLogDeliveryWrite20150319"</span>
    <span class="hljs-string">"Statement"</span> : [
      {
        Sid    = <span class="hljs-string">"AWSLogDeliveryWrite171157658"</span>
        Effect = <span class="hljs-string">"Allow"</span>
        Principal = {
          Service = <span class="hljs-string">"delivery.logs.amazonaws.com"</span>
        }
        Action   = <span class="hljs-string">"s3:PutObject"</span>
        Resource = <span class="hljs-string">"${aws_s3_bucket.kb_logs_s3[0].arn}/AWSLogs/${local.account_id}/bedrock/knowledgebases/*"</span>
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:SourceAccount"</span> = <span class="hljs-string">"${local.account_id}"</span>
            <span class="hljs-string">"s3:x-amz-acl"</span>      = <span class="hljs-string">"bucket-owner-full-control"</span>
          }
          ArnLike = {
            <span class="hljs-string">"aws:SourceArn"</span> = <span class="hljs-string">"${aws_cloudwatch_log_delivery_source.kb_logs[0].arn}"</span>
          }
        }
      }
    ]
  })
}

resource <span class="hljs-string">"aws_cloudwatch_log_delivery_destination"</span> <span class="hljs-string">"kb_logs_s3"</span> {
  count = var.enable_kb_log_delivery_s3 ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  name  = <span class="hljs-string">"bedrock-kb-${var.kb_id}-s3"</span>
  delivery_destination_configuration {
    destination_resource_arn = aws_s3_bucket.kb_logs_s3[<span class="hljs-number">0</span>].arn
  }
  depends_on = [aws_s3_bucket_policy.kb_logs_s3[<span class="hljs-number">0</span>]]
}

resource <span class="hljs-string">"aws_cloudwatch_log_delivery"</span> <span class="hljs-string">"kb_logs_s3"</span> {
  count                    = var.enable_kb_log_delivery_s3 ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  delivery_destination_arn = aws_cloudwatch_log_delivery_destination.kb_logs_s3[<span class="hljs-number">0</span>].arn
  delivery_source_name     = aws_cloudwatch_log_delivery_source.kb_logs[<span class="hljs-number">0</span>].name
  depends_on               = [aws_cloudwatch_log_delivery.kb_logs_cloudwatch_logs]
}
</code></pre>
<h2 id="heading-sending-logs-to-data-firehose">Sending Logs to Data Firehose</h2>
<p>Sending logs to Data Firehose is slightly more involved because of the Firehose delivery stream’s configuration. Since this blog post does not focus on the downstream destination at the Firehose level, we will use an S3 bucket with basic configuration. To set up a Firehose delivery stream, we first need to create an IAM role that the delivery stream uses to send data to its destination (that is, the S3 bucket). <a target="_blank" href="https://docs.aws.amazon.com/firehose/latest/dev/controlling-access.html">Controlling access with Amazon Data Firehose</a> provides IAM policy examples for different configurations, including one for an <a target="_blank" href="https://docs.aws.amazon.com/firehose/latest/dev/controlling-access.html#using-iam-s3">S3 destination</a>. To create the Firehose delivery stream, we use the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kinesis_firehose_delivery_stream"><code>aws_kinesis_firehose_delivery_stream</code> resource</a>. Note that the Firehose delivery stream must have the tag <code>LogDeliveryEnabled</code> set to <code>true</code>, since the service-linked role that CloudWatch Logs creates relies on this tag to write to Firehose delivery streams.</p>
<p>We also need to create a <a target="_blank" href="https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_DeliveryDestination.html">delivery destination</a> for it using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_delivery_destination"><code>aws_cloudwatch_log_delivery_destination</code> resource</a>, then establish the delivery from the source (i.e. the knowledge base) to the destination (i.e. the Firehose delivery stream) using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_delivery"><code>aws_cloudwatch_log_delivery</code> resource</a>. To ensure that the delivery resources are created sequentially and avoid concurrent modification issues, this delivery resource depends on that of S3.</p>
<p>The resulting Terraform configuration should look like the following:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_s3_bucket"</span> <span class="hljs-string">"kb_logs_data_firehose"</span> {
  count         = var.enable_kb_log_delivery_data_firehose ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  bucket        = <span class="hljs-string">"bedrock-kb-logs-data-firehose-${lower(var.kb_id)}-${local.region_short}-${local.account_id}"</span>
  force_destroy = true
}

resource <span class="hljs-string">"aws_iam_role"</span> <span class="hljs-string">"kb_logs_data_firehose"</span> {
  count = var.enable_kb_log_delivery_data_firehose ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  name  = <span class="hljs-string">"S3RoleForDataFirehose-bedrock-kb-logs-${var.kb_id}"</span>
  assume_role_policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action = <span class="hljs-string">"sts:AssumeRole"</span>
        Effect = <span class="hljs-string">"Allow"</span>
        Principal = {
          Service = <span class="hljs-string">"firehose.amazonaws.com"</span>
        }
        Condition = {
          StringEquals = {
            <span class="hljs-string">"sts:ExternalId"</span> = <span class="hljs-string">"${local.account_id}"</span>
          }
        }
      }
    ]
  })
}

resource <span class="hljs-string">"aws_iam_role_policy"</span> <span class="hljs-string">"kb_logs_data_firehose"</span> {
  count = var.enable_kb_log_delivery_data_firehose ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  name  = <span class="hljs-string">"S3PolicyForDataFirehose-bedrock-kb-logs-${var.kb_id}"</span>
  role  = aws_iam_role.kb_logs_data_firehose[<span class="hljs-number">0</span>].name
  policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action = [
          <span class="hljs-string">"s3:AbortMultipartUpload"</span>,
          <span class="hljs-string">"s3:GetBucketLocation"</span>,
          <span class="hljs-string">"s3:GetObject"</span>,
          <span class="hljs-string">"s3:ListBucket"</span>,
          <span class="hljs-string">"s3:ListBucketMultipartUploads"</span>,
          <span class="hljs-string">"s3:PutObject"</span>
        ]
        Effect = <span class="hljs-string">"Allow"</span>
        Resource = [
          aws_s3_bucket.kb_logs_data_firehose[<span class="hljs-number">0</span>].arn,
          <span class="hljs-string">"${aws_s3_bucket.kb_logs_data_firehose[0].arn}/*"</span>
        ]
      }
    ]
  })
}

resource <span class="hljs-string">"aws_kinesis_firehose_delivery_stream"</span> <span class="hljs-string">"kb_logs"</span> {
  count       = var.enable_kb_log_delivery_data_firehose ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  name        = <span class="hljs-string">"bedrock-kb-logs-${var.kb_id}"</span>
  destination = <span class="hljs-string">"extended_s3"</span>
  extended_s3_configuration {
    role_arn   = aws_iam_role.kb_logs_data_firehose[<span class="hljs-number">0</span>].arn
    bucket_arn = aws_s3_bucket.kb_logs_data_firehose[<span class="hljs-number">0</span>].arn
  }
  tags = {
    <span class="hljs-string">"LogDeliveryEnabled"</span> = <span class="hljs-string">"true"</span>
  }
  depends_on = [aws_iam_role_policy.kb_logs_data_firehose]
}

resource <span class="hljs-string">"aws_cloudwatch_log_delivery_destination"</span> <span class="hljs-string">"kb_logs_data_firehose"</span> {
  count = var.enable_kb_log_delivery_data_firehose ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  name  = <span class="hljs-string">"bedrock-kb-${var.kb_id}-data-firehose"</span>
  delivery_destination_configuration {
    destination_resource_arn = aws_kinesis_firehose_delivery_stream.kb_logs[<span class="hljs-number">0</span>].arn
  }
}

resource <span class="hljs-string">"aws_cloudwatch_log_delivery"</span> <span class="hljs-string">"kb_logs_data_firehose"</span> {
  count                    = var.enable_kb_log_delivery_data_firehose ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  delivery_destination_arn = aws_cloudwatch_log_delivery_destination.kb_logs_data_firehose[<span class="hljs-number">0</span>].arn
  delivery_source_name     = aws_cloudwatch_log_delivery_source.kb_logs[<span class="hljs-number">0</span>].name
  depends_on               = [aws_cloudwatch_log_delivery.kb_logs_s3]
}
</code></pre>
<h2 id="heading-testing-the-configuration">Testing the Configuration</h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">You can find the complete Terraform configuration and source code in the <code>4_kb_logging</code> directory in <a target="_self" href="https://github.com/acwwat/terraform-amazon-bedrock-agent-example">this GitHub repository</a>.</div>
</div>

<p>To deploy and test the configuration, you need a knowledge base with at least one data source that has content to ingest either in an S3 bucket or a crawlable website. You can set this up in the Bedrock console using the vector database quick start options. Alternatively, deploy a sample knowledge base using the Terraform configuration from my blog post <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-knowledge-base-using-terraform">How To Manage an Amazon Bedrock Knowledge Base Using Terraform</a>. This configuration is also available in the same GitHub repository under the <code>2_knowledge_base</code> directory.</p>
<p>With the prerequisites in place, deploy the solution as follows:</p>
<ol>
<li><p>From the root of the cloned GitHub repository, navigate to <code>4_kb_logging</code>.</p>
</li>
<li><p>Copy <code>terraform.tfvars.example</code> as <code>terraform.tfvars</code> and update the variables to match your configuration.</p>
<ul>
<li>All log delivery destinations are enabled in <code>terraform.tfvars.example</code>; by default, the variable definitions enable only delivery to CloudWatch Logs.</li>
</ul>
</li>
<li><p>Configure your AWS credentials.</p>
</li>
<li><p>Run <code>terraform init</code> and <code>terraform apply -var-file terraform.tfvars</code>.</p>
</li>
</ol>
<p>Once the configuration is applied, you can open the target knowledge base in the Amazon Bedrock Console and click <strong>Edit</strong> in the <strong>Knowledge Base overview</strong> section to review the logging configuration:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740894678351/6cd65656-5868-4de0-bb41-8cc2fbfaae2e.png" alt="Edit button in the knowledge base page" class="image--center mx-auto" /></p>
<p>Assuming all three log destinations are enabled, it should look something like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740894741488/e8e03f79-8e0f-4f44-a9ea-a89395047a29.png" alt="Complete log delivery configuration for the knowledge base" class="image--center mx-auto" /></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">While working on this blog post, I encountered an issue where the log deliveries section does not load and shows a spinner indefinitely if <a target="_self" href="https://aws.amazon.com/about-aws/whats-new/2025/01/aws-management-console-simultaneous-sign-in-multiple-accounts/">multi-session support</a> is enabled in the AWS Management Console. Disabling the feature will work around the problem. I have opened an AWS support case for this issue, which I hope will be fixed soon.</div>
</div>

<p>For good measure, we can perform a task with the knowledge base that generates application logs and verify that they are being delivered. At the time of writing, Bedrock Knowledge Bases only <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-bases-logging.html#knowledge-bases-logging-log-types">generate logs from ingestion job events</a>, so we can trigger a sync of a data source in the knowledge base. The log group should have logs similar to the following:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740895521221/bf4d5b27-e97a-47e6-b147-b5010ad07f82.png" alt="Logs in CloudWatch Logs" class="image--center mx-auto" /></p>
<p>Next, the S3 bucket should have logs similar to the following:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740895604201/8b8bce40-c171-4c41-9f82-428eb0b56361.png" alt="Logs in S3 bucket" class="image--center mx-auto" /></p>
<p>Lastly, the destination of the Firehose delivery stream, which in our case is another S3 bucket, should have logs similar to the following:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740895771881/96deb299-e4dc-4abf-a6c7-7be3073c6f22.png" alt="Logs in S3 bucket that is set as the Firehose data stream's destination" class="image--center mx-auto" /></p>
<p>If you don’t need the resources after testing, be sure to delete them to avoid unexpected costs.</p>
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, we examined how logging works for Amazon Bedrock Knowledge Bases, which uses the log delivery feature in CloudWatch Logs. We created and tested a Terraform configuration that demonstrates knowledge base log delivery to all three supported destinations: CloudWatch Logs, S3, and Data Firehose. With minimal changes, you can also repurpose this configuration for other AWS services that use the same log delivery mechanism.</p>
<p>At this point, we have the know-how to write ingestion logs to CloudWatch Logs, so we can update the <a target="_blank" href="https://blog.avangards.io/building-a-data-ingestion-solution-for-amazon-bedrock-knowledge-bases">data ingestion solution</a> I previously wrote about to improve how ingestion job notifications are triggered. Please stay tuned for my next blog post on this topic. Thanks for reading, as always, and be sure to check out the <a target="_blank" href="https://blog.avangards.io/">Avangards Blog</a> for more AWS and Terraform content.</p>
]]></content:encoded></item><item><title><![CDATA[Building a Data Ingestion Solution for Amazon Bedrock Knowledge Bases]]></title><description><![CDATA[Introduction
2025 started off busy, and only recently have I had the chance to catch up on all the new Amazon Bedrock features that were launched during re:Invent 2024. These new capabilities make it easier than ever to build a comprehensive RAG solu...]]></description><link>https://blog.avangards.io/building-a-data-ingestion-solution-for-amazon-bedrock-knowledge-bases</link><guid isPermaLink="true">https://blog.avangards.io/building-a-data-ingestion-solution-for-amazon-bedrock-knowledge-bases</guid><category><![CDATA[AWS]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Thu, 20 Feb 2025 06:05:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740114333086/5a2bb4d2-8ef5-45f1-89ae-7abf226c0869.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>2025 started off busy, and only recently have I had the chance to catch up on <a target="_blank" href="https://community.aws/content/2pnoldBeF6BFQi6sFw5JwANNwMk/amazon-bedrock-re-invent-2024-features-launch-summary?lang=en">all the new Amazon Bedrock features that were launched during re:Invent 2024</a>. These new capabilities make it easier than ever to build a comprehensive RAG solution using a low-code approach. As I explored these new features, I realized that most, if not all, are functional features. I feel that there isn’t a lot of guidance on the operational aspects of Bedrock services, so I decided to write more about this topic.</p>
<p>A key component of any RAG system is the data ingestion pipeline. Amazon Bedrock Knowledge Bases has built-in data ingestion that does the heavy lifting, and synchronizations can be triggered on demand. From an operational perspective, the end-to-end process should ideally be automated and aligned with either the update cadence of the data source or a designated maintenance window. This is an ideal use case for a Lambda-based automation solution, for which I have built a basic version. In this blog post, we’ll walk through its design and implementation.</p>
<h2 id="heading-design-overview">Design Overview</h2>
<p>The overall design of the solution is depicted in the following diagram:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740114195189/9c70f6b9-3097-4e81-854a-920a5469131f.png" alt="Solution architecture" class="image--center mx-auto" /></p>
<p>The solution works as follows:</p>
<ol>
<li><p>A Lambda function, triggered by an EventBridge schedule rule, periodically starts an ingestion (a.k.a. sync) job for each specified knowledge base and data source. The function also sends a message with job ID information to an SQS queue.</p>
</li>
<li><p>Another Lambda function, triggered by a separate EventBridge schedule rule, periodically fetches messages from the SQS queue. For each message, the function uses the job ID information to get details about the ingestion job. If the job has completed, the message is removed from the queue and a notification is sent to one of two SNS topics, depending on whether the job succeeded or failed.</p>
</li>
<li><p>(Optional) Additional downstream tasks can be performed by subscribing to the SNS topics.</p>
</li>
</ol>
<p>With the high-level architecture in mind, let’s now dive into the detailed design of each major component.</p>
<h2 id="heading-component-design-starting-ingestion-jobs">Component Design: Starting Ingestion Jobs</h2>
<p>Amazon Bedrock Knowledge Bases simplifies ingestion for data sources it natively supports such as <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/s3-data-source-connector.html">Amazon S3</a> and <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/webcrawl-data-source-connector.html">Web Crawler</a>. Its default parsing logic covers most cases, while its default chunking logic allows the selection of different strategies that could improve data retrieval quality.</p>
<p>For Amazon S3, while it is possible to start ingesting data as soon as the S3 bucket is updated (using <a target="_blank" href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html">S3 event notifications</a>, for instance), this is not recommended while the knowledge base is in use, especially during high-usage periods. For Web Crawler, detecting updates to a website - especially one you don't own - is often difficult. Instead, a more reliable approach is to schedule ingestion during a maintenance window (for example, after midnight). This can easily be implemented using an <a target="_blank" href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-create-rule-schedule.html">EventBridge rule that runs on a schedule</a>.</p>
<p>When using default settings, you simply need to <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/kb-data-source-sync-ingest.html">synchronize the data source</a> when the data source is updated. This is done programmatically via the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_StartIngestionJob.html">StartIngestionJob action</a> in the Agents for Amazon Bedrock API. This action is available as <a target="_blank" href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent/client/start_ingestion_job.html">the start_ingestion_job method</a> in Boto3, which is used in our Python-based Lambda function.</p>
<p>Since ingestion is asynchronous, you must check job statuses separately. In our event-based setup, each job’s details (knowledge base ID, data source ID, ingestion job ID) are passed to the other Lambda function that checks ingestion job statuses. To facilitate communication between the decoupled components, we can use an SNS topic, an SQS queue, or a DynamoDB table. Since long-running jobs require multiple status checks, an <a target="_blank" href="https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/standard-queues.html">SQS standard queue</a> is the best fit.</p>
<p>Lastly, we need to configure which knowledge bases, data sources, and SQS queues the Lambda function should manage. <a target="_blank" href="https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html">AWS Systems Manager Parameter Store</a> is an obvious choice. To structure the knowledge base and data source information, we’ll use a JSON list of objects, each containing a knowledge base ID and a list of data source IDs, for example:</p>
<pre><code class="lang-json">[
  {
    <span class="hljs-attr">"knowledge_base_id"</span>: <span class="hljs-string">"YO4R9AYHQZ"</span>,
    <span class="hljs-attr">"data_source_ids"</span>: [<span class="hljs-string">"5IHZ5YAIBY"</span>]
  }
]
</code></pre>
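<p>As an illustration (this helper is not part of the final Lambda function below), the structure flattens naturally into knowledge base and data source ID pairs when parsed with Python's <code>json</code> module:</p>

```python
import json

# Example config value as stored in Parameter Store (IDs are placeholders).
config_json = '[{"knowledge_base_id": "YO4R9AYHQZ", "data_source_ids": ["5IHZ5YAIBY"]}]'

def flatten_config(config_json):
    """Yield (knowledge_base_id, data_source_id) pairs from the JSON config."""
    for record in json.loads(config_json):
        for data_source_id in record.get("data_source_ids", []):
            yield record["knowledge_base_id"], data_source_id

print(list(flatten_config(config_json)))  # → [('YO4R9AYHQZ', '5IHZ5YAIBY')]
```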
<p>Now that we've outlined the design, let's look at the implementation of the Lambda function:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> boto3
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> botocore.exceptions <span class="hljs-keyword">import</span> ClientError

bedrock_agent = boto3.client(<span class="hljs-string">'bedrock-agent'</span>)
sqs = boto3.client(<span class="hljs-string">'sqs'</span>)
ssm = boto3.client(<span class="hljs-string">'ssm'</span>)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">lambda_handler</span>(<span class="hljs-params">event, context</span>):</span>
    <span class="hljs-keyword">try</span>:
        <span class="hljs-comment"># Retrieve the JSON config from Parameter Store</span>
        response = ssm.get_parameter(Name=<span class="hljs-string">'/start-kb-ingestion-jobs/config-json'</span>)
        config_json = response[<span class="hljs-string">'Parameter'</span>][<span class="hljs-string">'Value'</span>]
        config = json.loads(config_json)

        <span class="hljs-comment"># Retrieve the SQS queue URL once, outside the loop</span>
        response = ssm.get_parameter(Name=<span class="hljs-string">'/start-kb-ingestion-jobs/sqs-queue-url'</span>)
        sqs_queue_url = response[<span class="hljs-string">'Parameter'</span>][<span class="hljs-string">'Value'</span>]

        <span class="hljs-keyword">for</span> record <span class="hljs-keyword">in</span> config:
            knowledge_base_id = record.get(<span class="hljs-string">'knowledge_base_id'</span>)
            <span class="hljs-keyword">for</span> data_source_id <span class="hljs-keyword">in</span> record.get(<span class="hljs-string">'data_source_ids'</span>, []):
                <span class="hljs-comment"># Start the ingestion job</span>
                print(<span class="hljs-string">f'Starting ingestion job for data source <span class="hljs-subst">{data_source_id}</span> of knowledge base <span class="hljs-subst">{knowledge_base_id}</span>'</span>)
                response = bedrock_agent.start_ingestion_job(
                    knowledgeBaseId=knowledge_base_id,
                    dataSourceId=data_source_id
                )
                ingestion_job_id = response[<span class="hljs-string">'ingestionJob'</span>][<span class="hljs-string">'ingestionJobId'</span>]

                <span class="hljs-comment"># Send a message to the SQS queue with the job details</span>
                message = {
                    <span class="hljs-string">'knowledge_base_id'</span>: knowledge_base_id,
                    <span class="hljs-string">'data_source_id'</span>: data_source_id,
                    <span class="hljs-string">'ingestion_job_id'</span>: ingestion_job_id
                }
                sqs.send_message(
                    QueueUrl=sqs_queue_url,
                    MessageBody=json.dumps(message)
                )
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">200</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">'Success'</span>
        }
    <span class="hljs-keyword">except</span> ClientError <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">500</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">f'Client error: <span class="hljs-subst">{str(e)}</span>'</span>
        }
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">500</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">f'Unexpected error: <span class="hljs-subst">{str(e)}</span>'</span>
        }
</code></pre>
<h2 id="heading-component-design-checking-ingestion-job-statuses"><strong>Component Design: Checking Ingestion Job Statuses</strong></h2>
<p>The second piece of the puzzle is checking the status of any ingestion jobs initiated by the solution and sending notifications when a job completes, fails, or is stopped. Notifications should also include any warnings that highlight issues with ingesting specific documents, which could impact the quality of the RAG solution.</p>
<p>To retrieve ingestion job details, the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_GetIngestionJob.html">GetIngestionJob action</a> in the Agents for Amazon Bedrock API can be used. The Boto3 equivalent would be the <a target="_blank" href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent/client/get_ingestion_job.html">get_ingestion_job method</a>. The required parameters are included in each SQS message sent by the Lambda function that starts ingestion jobs as described earlier. The approach is to poll and process SQS messages on a schedule (for example, every five minutes).</p>
<p>SQS polling can be tricky, especially if you’re unfamiliar with it. <a target="_blank" href="https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-short-and-long-polling.html">Short and long polling</a> determine how long a <a target="_blank" href="https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ReceiveMessage.html">ReceiveMessage API request</a> (or its Boto3 equivalent, the <a target="_blank" href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sqs/client/receive_message.html">receive_message method</a>) waits for at least one message to show up in the queue. Here, long polling is not ideal since the best practice is to minimize Lambda function runtime when idle, and it is unnecessary given that the Lambda function runs on a schedule anyway. Additionally, SQS’s distributed nature affects how many messages the ReceiveMessage API returns: even with a higher <code>MaxNumberOfMessages</code> value, receiving multiple messages isn’t guaranteed on any given call. Consequently, you must call the API repeatedly until it returns an empty result.</p>
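<p>To make the drain-until-empty pattern concrete, here is a minimal sketch with the polling loop factored out. The stub class below stands in for Boto3's SQS client (mirroring the <code>receive_message</code> response shape) so the sketch runs without AWS credentials:</p>

```python
def drain_queue(sqs_client, queue_url, handler, batch_size=10):
    """Call receive_message until the queue returns no messages,
    passing each received message to handler."""
    while True:
        response = sqs_client.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=batch_size
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # empty result: stop polling for this scheduled run
        for message in messages:
            handler(message)

# Stub that mimics boto3's SQS client response shape for local testing.
class StubSQS:
    def __init__(self, bodies):
        self._bodies = list(bodies)

    def receive_message(self, QueueUrl, MaxNumberOfMessages):
        batch = self._bodies[:MaxNumberOfMessages]
        self._bodies = self._bodies[MaxNumberOfMessages:]
        return {"Messages": [{"Body": b} for b in batch]} if batch else {}

seen = []
drain_queue(StubSQS(["job-1", "job-2"]), "https://example/queue",
            lambda m: seen.append(m["Body"]))
print(seen)  # → ['job-1', 'job-2']
```

<p>Note that the real Lambda function below inlines this loop rather than using a helper.</p>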
<p>For notifications, the publish-subscribe pattern fits the scenario well. We’ll use two Amazon SNS topics: one for successful job completions and another for failures and cancellations. Administrators can handle subscriptions downstream as needed, whether via email or additional Lambda processing.</p>
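<p>The notification routing described above reduces to a small pure function; here is a sketch, with the topic ARNs passed in as parameters (placeholder ARN values are used in the example):</p>

```python
def pick_topic(job_status, success_topic_arn, failure_topic_arn):
    """Return the SNS topic ARN to notify for a terminal ingestion job status,
    or None if the job is still running and should be checked again later."""
    if job_status == "COMPLETE":
        return success_topic_arn
    if job_status in ("FAILED", "STOPPED"):
        return failure_topic_arn
    return None

print(pick_topic("COMPLETE", "arn:success", "arn:failure"))  # → arn:success
```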
<p>Finally, SSM Parameter Store will store configuration details, including the SQS queue URL and SNS topic ARNs, ensuring consistency. The resulting Lambda function could look like this:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> boto3
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> botocore.exceptions <span class="hljs-keyword">import</span> ClientError

bedrock_agent = boto3.client(<span class="hljs-string">'bedrock-agent'</span>)
ssm = boto3.client(<span class="hljs-string">'ssm'</span>)
sns = boto3.client(<span class="hljs-string">'sns'</span>)
sqs = boto3.client(<span class="hljs-string">'sqs'</span>)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_ssm_parameter</span>(<span class="hljs-params">name</span>):</span>
    response = ssm.get_parameter(Name=name, WithDecryption=<span class="hljs-literal">True</span>)
    <span class="hljs-keyword">return</span> response[<span class="hljs-string">'Parameter'</span>][<span class="hljs-string">'Value'</span>]


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_ingestion_job</span>(<span class="hljs-params">knowledge_base_id, data_source_id, ingestion_job_id</span>):</span>
    response = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=ingestion_job_id
    )
    <span class="hljs-keyword">return</span> response[<span class="hljs-string">'ingestionJob'</span>]


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">lambda_handler</span>(<span class="hljs-params">event, context</span>):</span>
    <span class="hljs-keyword">try</span>:
        sqs_queue_url = get_ssm_parameter(<span class="hljs-string">'/check-kb-ingestion-job-statuses/sqs-queue-url'</span>)
        success_sns_topic_arn = get_ssm_parameter(<span class="hljs-string">'/check-kb-ingestion-job-statuses/success-sns-topic-arn'</span>)
        failure_sns_topic_arn = get_ssm_parameter(<span class="hljs-string">'/check-kb-ingestion-job-statuses/failure-sns-topic-arn'</span>)

        response = sqs.receive_message(
            QueueUrl=sqs_queue_url,
            MaxNumberOfMessages=<span class="hljs-number">10</span>
        )
        <span class="hljs-keyword">while</span> <span class="hljs-string">'Messages'</span> <span class="hljs-keyword">in</span> response:
            messages = response[<span class="hljs-string">'Messages'</span>]
            <span class="hljs-keyword">for</span> message <span class="hljs-keyword">in</span> messages:
                body = json.loads(message[<span class="hljs-string">'Body'</span>])
                knowledge_base_id = body[<span class="hljs-string">'knowledge_base_id'</span>]
                data_source_id = body[<span class="hljs-string">'data_source_id'</span>]
                ingestion_job_id = body[<span class="hljs-string">'ingestion_job_id'</span>]

                print(
                    <span class="hljs-string">f'Checking ingestion job status for knowledge base <span class="hljs-subst">{knowledge_base_id}</span> data source <span class="hljs-subst">{data_source_id}</span> job <span class="hljs-subst">{ingestion_job_id}</span>'</span>)
                ingestion_job = get_ingestion_job(knowledge_base_id, data_source_id, ingestion_job_id)
                print(
                    <span class="hljs-string">f'Ingestion job summary: \n\n<span class="hljs-subst">{json.dumps(ingestion_job, indent=<span class="hljs-number">2</span>, sort_keys=<span class="hljs-literal">True</span>, default=str)}</span>'</span>)
                job_status = ingestion_job[<span class="hljs-string">'status'</span>]
                <span class="hljs-keyword">if</span> job_status == <span class="hljs-string">'COMPLETE'</span>:
                    sns.publish(
                        TopicArn=success_sns_topic_arn,
                        Subject=<span class="hljs-string">f'Ingestion job for knowledge base <span class="hljs-subst">{knowledge_base_id}</span> data source <span class="hljs-subst">{data_source_id}</span> job <span class="hljs-subst">{ingestion_job_id}</span> Completed'</span>,
                        Message=json.dumps(ingestion_job, indent=<span class="hljs-number">2</span>, sort_keys=<span class="hljs-literal">True</span>, default=str)
                    )
                <span class="hljs-keyword">elif</span> job_status == <span class="hljs-string">'FAILED'</span>:
                    sns.publish(
                        TopicArn=failure_sns_topic_arn,
                        Subject=<span class="hljs-string">f'Ingestion job for knowledge base <span class="hljs-subst">{knowledge_base_id}</span> data source <span class="hljs-subst">{data_source_id}</span> job <span class="hljs-subst">{ingestion_job_id}</span> FAILED'</span>,
                        Message=json.dumps(ingestion_job, indent=<span class="hljs-number">2</span>, sort_keys=<span class="hljs-literal">True</span>, default=str)
                    )
                <span class="hljs-keyword">elif</span> job_status == <span class="hljs-string">'STOPPED'</span>:
                    sns.publish(
                        TopicArn=failure_sns_topic_arn,
                        Subject=<span class="hljs-string">f'Ingestion job for knowledge base <span class="hljs-subst">{knowledge_base_id}</span> data source <span class="hljs-subst">{data_source_id}</span> job <span class="hljs-subst">{ingestion_job_id}</span> STOPPED'</span>,
                        Message=json.dumps(ingestion_job, indent=<span class="hljs-number">2</span>, sort_keys=<span class="hljs-literal">True</span>, default=str)
                    )

                <span class="hljs-keyword">if</span> job_status <span class="hljs-keyword">in</span> [<span class="hljs-string">'COMPLETE'</span>, <span class="hljs-string">'FAILED'</span>, <span class="hljs-string">'STOPPED'</span>]:
                    sqs.delete_message(
                        QueueUrl=sqs_queue_url,
                        ReceiptHandle=message[<span class="hljs-string">'ReceiptHandle'</span>]
                    )
            response = sqs.receive_message(
                QueueUrl=sqs_queue_url,
                MaxNumberOfMessages=<span class="hljs-number">10</span>
            )
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">200</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">'Success'</span>
        }
    <span class="hljs-keyword">except</span> ClientError <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">500</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">f'Client error: <span class="hljs-subst">{str(e)}</span>'</span>
        }
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">'statusCode'</span>: <span class="hljs-number">500</span>,
            <span class="hljs-string">'body'</span>: <span class="hljs-string">f'Unexpected error: <span class="hljs-subst">{str(e)}</span>'</span>
        }
</code></pre>
<h2 id="heading-terraform-configuration-design">Terraform Configuration Design</h2>
<p>No solution is complete without an easy way to deploy it. Terraform is an obvious choice given my advocacy for it, though any IaC tool or framework such as <a target="_blank" href="https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html">AWS SAM</a> would do. Since the event-based automation architecture is quite typical, I won’t go into too much detail on how the Terraform configuration is developed, but here is a snippet related to the first Lambda function:</p>
<pre><code class="lang-hcl"><span class="hljs-comment"># Data sources and local values omitted for brevity</span>

resource <span class="hljs-string">"aws_sqs_queue"</span> <span class="hljs-string">"check_kb_ingestion_job_statuses"</span> {
  name = <span class="hljs-string">"check-kb-ingestion-job-statuses"</span>
}

resource <span class="hljs-string">"aws_ssm_parameter"</span> <span class="hljs-string">"start_kb_ingestion_jobs_config_json"</span> {
  name  = <span class="hljs-string">"/start-kb-ingestion-jobs/config-json"</span>
  type  = <span class="hljs-string">"String"</span>
  value = jsonencode(var.start_kb_ingestion_jobs_config_json)
}

resource <span class="hljs-string">"aws_ssm_parameter"</span> <span class="hljs-string">"start_kb_ingestion_jobs_sqs_queue_url"</span> {
  name  = <span class="hljs-string">"/start-kb-ingestion-jobs/sqs-queue-url"</span>
  type  = <span class="hljs-string">"String"</span>
  value = aws_sqs_queue.check_kb_ingestion_job_statuses.id
}

resource <span class="hljs-string">"aws_iam_role"</span> <span class="hljs-string">"lambda_start_kb_ingestion_jobs"</span> {
  name = <span class="hljs-string">"FunctionExecutionRoleForLambda-start-kb-ingestion-jobs"</span>
  assume_role_policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action = <span class="hljs-string">"sts:AssumeRole"</span>
        Effect = <span class="hljs-string">"Allow"</span>
        Principal = {
          Service = <span class="hljs-string">"lambda.amazonaws.com"</span>
        }
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:SourceAccount"</span> = local.account_id
          }
        }
      }
    ]
  })
}

resource <span class="hljs-string">"aws_iam_role_policy_attachment"</span> <span class="hljs-string">"lambda_start_kb_ingestion_jobs_lambda_basic_execution"</span> {
  role       = aws_iam_role.lambda_start_kb_ingestion_jobs.name
  policy_arn = data.aws_iam_policy.lambda_basic_execution.arn
}

resource <span class="hljs-string">"aws_iam_role_policy"</span> <span class="hljs-string">"lambda_start_kb_ingestion_jobs"</span> {
  name = <span class="hljs-string">"FunctionExecutionRolePolicyForLambda-start-kb-ingestion-jobs"</span>
  role = aws_iam_role.lambda_start_kb_ingestion_jobs.name
  policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action = [
          <span class="hljs-string">"ssm:GetParameter"</span>,
          <span class="hljs-string">"ssm:GetParameters"</span>,
          <span class="hljs-string">"ssm:GetParametersByPath"</span>
        ]
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = <span class="hljs-string">"arn:${local.partition}:ssm:${local.region}:${local.account_id}:parameter/*"</span>
      },
      {
        Action   = <span class="hljs-string">"sqs:SendMessage"</span>
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = <span class="hljs-string">"arn:${local.partition}:sqs:${local.region}:${local.account_id}:*"</span>
      },
      {
        Action = [
          <span class="hljs-string">"bedrock:StartIngestionJob"</span>
        ]
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = <span class="hljs-string">"arn:${local.partition}:bedrock:${local.region}:${local.account_id}:knowledge-base/*"</span>
      }
    ]
  })
}

resource <span class="hljs-string">"aws_lambda_function"</span> <span class="hljs-string">"start_kb_ingestion_jobs"</span> {
  function_name = <span class="hljs-string">"start-kb-ingestion-jobs"</span>
  role          = aws_iam_role.lambda_start_kb_ingestion_jobs.arn
  description   = <span class="hljs-string">"Lambda function that starts ingestion jobs for Bedrock Knowledge Bases"</span>
  filename      = data.archive_file.start_kb_ingestion_jobs_zip.output_path
  handler       = <span class="hljs-string">"index.lambda_handler"</span>
  runtime       = <span class="hljs-string">"python3.13"</span>
  architectures = [<span class="hljs-string">"arm64"</span>]
  timeout       = <span class="hljs-number">60</span>
  <span class="hljs-comment"># source_code_hash is required to detect changes to Lambda code/zip</span>
  source_code_hash = data.archive_file.start_kb_ingestion_jobs_zip.output_base64sha256
}

resource <span class="hljs-string">"aws_cloudwatch_event_rule"</span> <span class="hljs-string">"start_kb_ingestion_jobs"</span> {
  name                = <span class="hljs-string">"lambda-start-kb-ingestion-jobs"</span>
  schedule_expression = var.start_kb_ingestion_jobs_schedule
}

resource <span class="hljs-string">"aws_cloudwatch_event_target"</span> <span class="hljs-string">"start_kb_ingestion_jobs"</span> {
  rule = aws_cloudwatch_event_rule.start_kb_ingestion_jobs.name
  arn  = aws_lambda_function.start_kb_ingestion_jobs.arn
}

resource <span class="hljs-string">"aws_lambda_permission"</span> <span class="hljs-string">"start_kb_ingestion_jobs"</span> {
  statement_id  = <span class="hljs-string">"AllowExecutionFromCloudWatch"</span>
  action        = <span class="hljs-string">"lambda:InvokeFunction"</span>
  function_name = aws_lambda_function.start_kb_ingestion_jobs.function_name
  principal     = <span class="hljs-string">"events.amazonaws.com"</span>
  source_arn    = aws_cloudwatch_event_rule.start_kb_ingestion_jobs.arn
}

<span class="hljs-comment"># Remaining resources omitted for brevity</span>
</code></pre>
<p>A few things worth noting:</p>
<ul>
<li><p>The IAM and resource policies follow the <a target="_blank" href="https://docs.aws.amazon.com/wellarchitected/latest/framework/sec_permissions_least_privileges.html">least privilege principle</a>.</p>
</li>
<li><p>The SSM parameters use a simple <a target="_blank" href="https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-paramstore-hierarchies.html">parameter hierarchy</a> based on the Lambda function name. If deploying multiple copies of the solution, consider using a variable to set the function name and related resource names.</p>
</li>
<li><p>The Lambda function uses the <a target="_blank" href="https://docs.aws.amazon.com/lambda/latest/dg/foundation-arch.html#foundation-arch-adv">arm64 architecture</a> for ~20% better cost efficiency per GB-second/month vs. x86_64, though the impact in this solution is negligible.</p>
</li>
<li><p>The default Lambda execution timeout of three seconds is insufficient. Increasing it to 60 seconds provides a good safety net.</p>
</li>
</ul>
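<p>To make the configuration above concrete, here is a minimal sketch of what a <code>start-kb-ingestion-jobs</code> handler can look like. This is an illustrative approximation, not the repository's actual code: the real function resolves its knowledge base IDs from the SSM parameter hierarchy, whereas this sketch assumes they arrive in the event payload.</p>
<pre><code class="lang-python">def lambda_handler(event, context):
    """Start an ingestion job for every data source of the given knowledge bases.

    Simplified sketch; the actual function reads its configuration from SSM.
    """
    import boto3  # deferred import so the sketch can be read without AWS access

    client = boto3.client("bedrock-agent")
    kb_ids = event.get("knowledge_base_ids", [])  # assumed input shape for this sketch
    started = []
    for kb_id in kb_ids:
        # Enumerate the knowledge base's data sources, then start one job per source
        for page in client.get_paginator("list_data_sources").paginate(knowledgeBaseId=kb_id):
            for ds in page["dataSourceSummaries"]:
                job = client.start_ingestion_job(
                    knowledgeBaseId=kb_id, dataSourceId=ds["dataSourceId"]
                )
                started.append(job["ingestionJob"]["ingestionJobId"])
    return {"startedJobIds": started}
</code></pre>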
<h2 id="heading-deploying-and-testing-the-solution">Deploying and Testing the Solution</h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">You can find the complete Terraform configuration and source code in the <code>3_kb_data_ingestion</code> directory in <a target="_self" href="https://github.com/acwwat/terraform-amazon-bedrock-agent-example">this GitHub repository</a>.</div>
</div>

<p>To deploy and test the solution, you need a knowledge base with at least one data source that has content to ingest either in an S3 bucket or a crawlable website. You can set this up in the Bedrock console using the vector database quick start options. Alternatively, deploy a sample knowledge base using the Terraform configuration from my blog post <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-knowledge-base-using-terraform">How To Manage an Amazon Bedrock Knowledge Base Using Terraform</a>. This configuration is also available in the same GitHub repository under the <code>2_knowledge_base</code> directory.</p>
<p>With the prerequisites in place, deploy the solution as follows:</p>
<ol>
<li><p>From the root of the cloned GitHub repository, navigate to <code>3_kb_data_ingestion</code>.</p>
</li>
<li><p>Copy <code>terraform.tfvars.example</code> as <code>terraform.tfvars</code> and update the variables to match your configuration.</p>
<ul>
<li>By default, the schedule for the <code>start-kb-ingestion-jobs</code> Lambda function is daily at 0:00 UTC, while the schedule for the <code>check_kb_ingestion_job_statuses</code> Lambda function is every five minutes.</li>
</ul>
</li>
<li><p>Configure your AWS credentials.</p>
</li>
<li><p>Run <code>terraform init</code> and <code>terraform apply -var-file terraform.tfvars</code>.</p>
</li>
</ol>
<p>Once deployed, test the solution by adding an email subscription for your address to the SNS topics <code>check-kb-ingestion-job-statuses-success</code> and <code>check-kb-ingestion-job-statuses-failure</code> so that you receive the notifications. Confirm the subscriptions using the links in the verification emails.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739935187612/a26b3199-2427-4317-ac84-5f7461d490d1.png" alt="Adding an email subscription to the SNS topics" class="image--center mx-auto" /></p>
<p>Next, manually invoke the <code>start-kb-ingestion-jobs</code> Lambda function in the Lambda console; the next scheduled <code>check_kb_ingestion_job_statuses</code> run, at most five minutes later, will pick up the new ingestion jobs.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739934854610/a78a6963-3611-4fa5-b7b5-d7ab3c40291f.png" alt="Invoking the start-kb-ingestion-jobs Lambda function manually" class="image--center mx-auto" /></p>
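<p>If you prefer to script this step instead of using the console, an asynchronous invocation via Boto3 looks roughly like the following (the default function name below matches this solution; adjust it if you customized the name):</p>
<pre><code class="lang-python">def invoke_start_ingestion(function_name="start-kb-ingestion-jobs"):
    """Asynchronously invoke the ingestion-starter Lambda function (sketch).

    boto3 is imported inside the function so the snippet loads without AWS set up.
    """
    import boto3

    response = boto3.client("lambda").invoke(
        FunctionName=function_name,
        InvocationType="Event",  # fire-and-forget, like the EventBridge schedule
    )
    return response["StatusCode"]  # the Lambda API returns 202 for async invocations
</code></pre>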
<p>After the next <code>check_kb_ingestion_job_statuses</code> invocation completes at the scheduled time, check the CloudWatch logs to confirm it ran successfully. You should also receive an email from SNS with its status. Here is an example:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739974970838/b12c9fbd-76e3-4264-894b-01189ab3605b.png" alt="Success email notification" class="image--center mx-auto" /></p>
<p>The message shows several warnings about files that couldn’t be ingested. In my case, I used an S3 bucket that contained <code>tar.gz</code> files, which are <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-ds.html">unsupported</a>. One file also exceeded the 50 MB limit and was skipped. With this information, you can remediate these issues to ensure clean data input to the knowledge base.</p>
<p>Once you've verified the solution works, remove the test email subscriptions and replace them with ones that better fit your needs. If you don’t plan to keep the knowledge base, delete it along with the vector store (for example, the OSS index) to avoid unnecessary costs.</p>
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, we examined the development of a Lambda-based data ingestion solution for Bedrock knowledge bases. The design follows a familiar event-based pattern, leveraging AWS APIs to perform the required tasks and Terraform for deployment. That said, while checking ingestion job statuses on a schedule works, it is not as efficient as a truly event-driven system, which Amazon Bedrock does not seem to support at the moment. Perhaps I can update the solution once more support for data ingestion job events is added in the future.</p>
<p>Another interesting aspect of writing this blog post is that I used <a target="_blank" href="https://github.com/features/copilot">GitHub Copilot</a> to develop both the Lambda functions and the Terraform configuration. While it didn’t produce ready-to-use code, it saved me significant time. I’ll share more about this experience in my next blog post.</p>
<p>In the meantime, please check out other helpful content on the <a target="_blank" href="https://blog.avangards.io/">Avangards Blog</a>. Thanks for reading!</p>
]]></content:encoded></item><item><title><![CDATA[Encrypting EBS Volumes of Amazon EC2 Instances Using Python]]></title><description><![CDATA[Introduction
Ensuring compliance with stringent security requirements often leads to unexpected challenges - here’s one I recently tackled. The account in question has hundreds of EC2 instances with EBS volumes that are encrypted with the KMS AWS man...]]></description><link>https://blog.avangards.io/encrypting-ebs-volumes-of-amazon-ec2-instances-using-python</link><guid isPermaLink="true">https://blog.avangards.io/encrypting-ebs-volumes-of-amazon-ec2-instances-using-python</guid><category><![CDATA[AWS]]></category><category><![CDATA[Security]]></category><category><![CDATA[ec2]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Mon, 20 Jan 2025 02:13:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1737399441774/84200cf4-4d74-4b38-b6e9-a07e831f2adb.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Ensuring compliance with stringent security requirements often leads to unexpected challenges - here’s one I recently tackled. The account in question has hundreds of EC2 instances with EBS volumes that are encrypted with the KMS AWS managed key <code>aws/ebs</code>. Due to tightened compliance requirements, the encryption key must be rotated every 90 days. The <a target="_blank" href="https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#aws-managed-cmk">default one-year rotation period of the AWS managed key</a> no longer suffices, thus all EC2 instance volumes must be re-encrypted with a customer managed key (CMK) that provides more control.</p>
<p>With a looming deadline, it was simply not feasible to manually re-encrypt all EBS volumes, so I set out to find a tool or script that could automate this daunting task before resorting to developing my own (even if it’s AI-generated). Luckily, I found a GitHub repository with a script that met 90% of my needs and only required a few enhancements to fit them all. I’d like to share my experience and the resulting script in this blog post in case it benefits fellow builders facing the same problem. Let’s first set the stage by examining the encryption workflow.</p>
<h2 id="heading-the-encryption-workflow">The Encryption Workflow</h2>
<p>To encrypt or re-encrypt an EBS volume that is attached to an EC2 instance, it is unfortunately not as simple as setting a KMS key ID on the volume. The process is a bit roundabout and involves the following steps:</p>
<ol>
<li><p>Shut down the EC2 instance.</p>
</li>
<li><p>Create a snapshot of the volume.</p>
</li>
<li><p>Create a new volume from the previously created snapshot, while enabling encryption with the new KMS key. Ensure that you select the same availability zone as the original volume and apply any volume settings.</p>
</li>
<li><p>Detach the original volume from the EC2 instance while making note of the device name to which the volume is attached.</p>
</li>
<li><p>Attach the new volume to the EC2 instance with the same device name as above.</p>
</li>
<li><p>Repeat steps 2 to 5 for any additional volumes that the EC2 instance has.</p>
</li>
<li><p>Start the EC2 instance and verify that it is running properly.</p>
</li>
<li><p>Delete the original volumes and the snapshots taken during this process as appropriate.</p>
</li>
</ol>
<p>There are other <a target="_blank" href="https://aws.amazon.com/blogs/compute/must-know-best-practices-for-amazon-ebs-encryption/">considerations and best practices for Amazon EBS encryption</a> with auto scaling groups, spot instances, and snapshot sharing, however they are not relevant for the basic scenario for this blog post.</p>
<p>As you can see, the workflow is quite involved and thus makes a great candidate for automation.</p>
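<p>The steps above can be sketched with Boto3 roughly as follows for a single attached volume. This is an outline only, not the full script discussed below; assumptions are noted in the comments:</p>
<pre><code class="lang-python">def reencrypt_volume(instance_id, volume_id, device_name, kms_key_id):
    """Outline of the re-encryption workflow for one attached volume.

    Assumes the instance is already stopped; omits error handling and the
    copying of volume settings such as type, IOPS, and tags.
    """
    import boto3  # deferred import so the sketch loads without AWS access

    ec2 = boto3.client("ec2")

    # Step 2: snapshot the original volume and wait for it to complete
    snapshot_id = ec2.create_snapshot(VolumeId=volume_id)["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # Step 3: create an encrypted volume from the snapshot in the same AZ
    az = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]["AvailabilityZone"]
    new_volume_id = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=az, Encrypted=True, KmsKeyId=kms_key_id
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])

    # Steps 4 and 5: swap the volumes using the same device name
    ec2.detach_volume(VolumeId=volume_id, InstanceId=instance_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=new_volume_id, InstanceId=instance_id, Device=device_name)
    return new_volume_id
</code></pre>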
<h2 id="heading-leveraging-an-existing-script-on-github">Leveraging an Existing Script on GitHub</h2>
<p>The need to encrypt or re-encrypt EBS volumes is not uncommon, so I figured that someone would have already developed tools and scripts for it. Indeed, a quick Google search yielded three possible options:</p>
<ol>
<li><p><a target="_blank" href="https://github.com/dwbelliston/aws_volume_encryption">dwbelliston/aws_volume_encryption</a> - a Python-based script developed by <a target="_blank" href="https://github.com/dwbelliston">Dustin Belliston</a> that orchestrates encryption of EBS volumes of an EC2 instance.</p>
</li>
<li><p><a target="_blank" href="https://github.com/jbrt/ec2cryptomatic">jbrt/ec2cryptomatic</a> - a Go-based tool developed by <a target="_blank" href="https://github.com/jbrt">Julien B.</a> that is very similar to the Python solution above, but with a few more quality-of-life features.</p>
</li>
<li><p><a target="_blank" href="https://github.com/aws-samples/aws-system-manager-automation-unencrypted-to-encrypted-resources">aws-samples/aws-system-manager-automation-unencrypted-to-encrypted-resources</a> - an AWS solution that automatically remediates unencrypted EBS and RDS resources using AWS Config and SSM Automation.</p>
</li>
</ol>
<p>Given Boto3 and Python are part of my preferred toolset, I decided to leverage the aws_volume_encryption solution as my starting point. The repository has a well-written <a target="_blank" href="https://github.com/dwbelliston/aws_volume_encryption/blob/master/README.md">README file</a> that provides usage instructions and detailed explanation on what each section of the code does. Be sure to check it out so you understand the general architecture and usage of the script.</p>
<h2 id="heading-enhancing-the-existing-script">Enhancing the Existing Script</h2>
<p>Although developed years ago, the original script remains fully functional, proving its reliability. That said, I have identified a few small enhancements that could improve the usability of the script:</p>
<ol>
<li><p>Defer to <a target="_blank" href="https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html">Boto3’s default credential search mechanism</a> instead of adding redundant options to the script.</p>
</li>
<li><p>Create an encrypted volume directly from an encrypted snapshot.</p>
</li>
<li><p>Add KMS key ID validation and skip encryption of volumes that are already encrypted with the provided KMS key. The KMS key ID can be <a target="_blank" href="https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateVolume.html">any of the four supported formats</a> by the AWS API.</p>
</li>
<li><p>Add an option to preserve the original volumes and add metadata tags (prefixed by <code>VolumeEncryptionMetadata:</code>) to them in case volume changes need to be reverted.</p>
</li>
<li><p>Improve console logging with timestamps and more details.</p>
</li>
</ol>
<p>These enhancements allow me to test and benchmark the script more effectively, and they provide me an extra level of assurance when working on production workloads.</p>
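<p>As an illustration of the third enhancement, the four accepted key identifier formats (key ID, key ARN, alias name, and alias ARN) can be approximated with regular expressions. The helper below is a hypothetical stand-in for demonstration, not the script's actual validation code:</p>
<pre><code class="lang-python">import re

# Hypothetical validator for the four KMS key identifier formats accepted by
# the CreateVolume API: key ID, key ARN, alias name, and alias ARN.
_UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
_PATTERNS = [
    re.compile(rf"^{_UUID}$"),                                            # key ID
    re.compile(rf"^arn:aws[\w-]*:kms:[a-z0-9-]+:\d{{12}}:key/{_UUID}$"),  # key ARN
    re.compile(r"^alias/[A-Za-z0-9/_-]+$"),                               # alias name
    re.compile(r"^arn:aws[\w-]*:kms:[a-z0-9-]+:\d{12}:alias/[A-Za-z0-9/_-]+$"),  # alias ARN
]


def is_valid_kms_key_id(key_id):
    """Return True if key_id matches any of the four supported formats."""
    return any(p.match(key_id) for p in _PATTERNS)
</code></pre>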
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">You can find the source code in my forked <a target="_self" href="https://github.com/acwwat/aws-volume-encryption">GitHub repository</a> that accompanies this blog post.</div>
</div>

<h2 id="heading-running-the-script">Running the Script</h2>
<p>To test the script, create a Windows EC2 instance with an unencrypted root volume and a data volume that is encrypted using the AWS managed key:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737332004260/834ad8da-1ed5-452f-91e8-3439c07b45e6.png" alt="Unencrypted root volume and a data volume that is encrypted with the AWS managed key" class="image--center mx-auto" /></p>
<p>Once the EC2 instance is running, connect to Windows, initialize the second volume as D: drive, and add a text file to help with validation later.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737399248759/91eab801-ecdb-4715-aed6-a85ab654bce1.png" alt="The data volume initialized as D: drive" class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737332229434/775aa010-49da-4a98-867c-b1d29d624711.png" alt="D:\hello.txt for later validation" class="image--center mx-auto" /></p>
<p>Then create a KMS CMK that will be used to encrypt the EBS volumes of the EC2 instance.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737332374480/b8649f27-8b0b-4efc-9408-bc58f0a8dd5f.png" alt="KMS customer managed key" class="image--center mx-auto" /></p>
<p>Lastly, clone the <a target="_blank" href="https://github.com/acwwat/aws-volume-encryption">GitHub repository</a> or download the code zip file, then follow the <a target="_blank" href="https://github.com/acwwat/aws-volume-encryption/blob/main/README.md">README file</a> to set up the prerequisites including <a target="_blank" href="https://www.python.org/downloads/">Python 3.x</a> and the <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html">AWS CLI</a>. Since the script defers to the typical credentials lookup sequence, you may use any of the <a target="_blank" href="https://docs.aws.amazon.com/cli/v1/userguide/cli-chap-configure.html">supported methods</a>. Typically, you either configure a profile with the AWS CLI and refer to it using the <code>AWS_PROFILE</code> environment variable, or you use the more verbose environment variables including <code>AWS_ACCESS_KEY_ID</code>, <code>AWS_SECRET_ACCESS_KEY</code>, and <code>AWS_SESSION_TOKEN</code>. In any case, ensure that you provide the target region in the profile or using the <code>AWS_DEFAULT_REGION</code> environment variable.</p>
<p>Run the <code>volume_encryption.py</code> script while providing the following arguments:</p>
<ul>
<li><p><code>-i</code> with the ID of the target EC2 instance</p>
</li>
<li><p><code>-k</code> with the KMS CMK ID or alias</p>
</li>
<li><p><code>-p</code> to preserve the original volume, which can be inspected and deleted after validating the EC2 instance with newly encrypted volumes</p>
</li>
</ul>
<p>Here is the command that is specific to my resources:</p>
<pre><code class="lang-bash">python volume_encryption.py -i i-095dc6901ff37f71d -k 73d13a4a-b0b0-4ced-b82b-d86a78c89df0 -p
</code></pre>
<p>How long the script takes largely depends on the volume sizes and their encryption state. For my test instance with a 30 GB unencrypted volume and an 8 GB encrypted volume, it took a bit over 6 minutes to complete. If I were to run the script again using another key, it would take more than 20 minutes to complete, presumably because re-encryption takes longer. In any case, the console logs include timestamps that indicate how long each step takes.</p>
<pre><code class="lang-bash">$ python volume_encryption.py -i i-095dc6901ff37f71d -k 73d13a4a-b0b0-4ced-b82b-d86a78c89df0 -p
[2025-01-19T13:51:38.781614-05:00] Checking instance i-095dc6901ff37f71d
[2025-01-19T13:51:39.624892-05:00] Stopping instance i-095dc6901ff37f71d
[2025-01-19T13:52:55.908023-05:00] Create snapshot of volume vol-0c79ec8bde159fc7b (xvdb)
[2025-01-19T13:53:57.048132-05:00] Create encrypted volume from snapshot snap-0fb3ecd0eaf583f82
[2025-01-19T13:53:57.684619-05:00] Detach volume vol-0c79ec8bde159fc7b (xvdb)
[2025-01-19T13:53:58.048124-05:00] Attach volume vol-0ddb7f75257290d19 (xvdb)
[2025-01-19T13:54:13.966946-05:00] Create snapshot of volume vol-054b9017ef1f8f25c (/dev/sda1)
[2025-01-19T13:55:14.751392-05:00] Create encrypted volume from snapshot snap-0dac8c77e6e777d1a
[2025-01-19T13:55:15.271831-05:00] Detach volume vol-054b9017ef1f8f25c (/dev/sda1)
[2025-01-19T13:55:15.615831-05:00] Attach volume vol-01140d9f4c666a763 (/dev/sda1)
[2025-01-19T13:55:31.975440-05:00] Start instance i-095dc6901ff37f71d
[2025-01-19T13:55:48.251798-05:00] Clean up resources
[2025-01-19T13:55:48.252797-05:00] Delete snapshot snap-0fb3ecd0eaf583f82
[2025-01-19T13:55:48.438101-05:00] Skipping deletion of original volume vol-0c79ec8bde159fc7b (xvdb)
[2025-01-19T13:55:48.438101-05:00] Delete snapshot snap-0dac8c77e6e777d1a
[2025-01-19T13:55:48.614087-05:00] Skipping deletion of original volume vol-054b9017ef1f8f25c (/dev/sda1)
[2025-01-19T13:55:48.615090-05:00] Encryption finished
$
</code></pre>
<p>Once the script completes, verify that the EBS volumes are encrypted with the new KMS key.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737332659454/fc948ba0-68cd-4bc3-86e6-10948c101f72.png" alt="Newly attached volumes that are encrypted with the CMK" class="image--center mx-auto" /></p>
<p>You should also see that the original volumes still exist and have some metadata tags added by the script for traceability. Note the original volume IDs from the console logs of the script (for example, <code>vol-054b9017ef1f8f25c</code> is the ID of the original root volume).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737333524230/5c5722b8-f013-4646-a64b-1a7466f7ad77.png" alt="Original volumes retained with added metadata tags" class="image--center mx-auto" /></p>
<p>Lastly, log in to the EC2 instance and ensure that Windows is working as intended and <code>D:\hello.txt</code> still exists.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737332742390/c5f1f64b-ac1d-417d-8b79-414b7aa9be20.png" alt="D:\hello.txt still accessible after volumes are encrypted" class="image--center mx-auto" /></p>
<p>However, you will notice that the EC2 instance seems slower than usual. This is because volumes restored from snapshots are not fully initialized, or pre-warmed: a block is loaded from the snapshot stored in S3 behind the scenes only when it is first accessed, which increases I/O latency. While this may not be a huge issue for common use cases, workloads with high disk I/O needs (such as running a database) may require you to <a target="_blank" href="https://docs.aws.amazon.com/ebs/latest/userguide/ebs-initialize.html">manually initialize the disks</a>.</p>
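<p>Manual initialization amounts to reading every block once so that it is pulled down from the snapshot; AWS documents <code>dd</code> and <code>fio</code> for this, but the idea is simple enough to express in a few lines of Python. This is a simplified sketch only; on Linux you would point it at the raw block device (for example, <code>/dev/xvdb</code>) and run it with sufficient privileges:</p>
<pre><code class="lang-python">def prewarm(device_path, block_size=1024 * 1024):
    """Read a device (or any file) end to end in fixed-size chunks, forcing
    every block to be fetched from the snapshot. Returns total bytes read."""
    total = 0
    with open(device_path, "rb") as dev:
        while True:
            chunk = dev.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total
</code></pre>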
<h2 id="heading-summary">Summary</h2>
<p>With this improved script, you can (re-)encrypt the EBS volumes of any EC2 instance with ease. If you are encrypting volumes for many instances, you can also write another script that reads a CSV file containing EC2 instance information and runs <code>volume_encryption.py</code> on multiple instances in parallel. AI tools like <a target="_blank" href="https://openai.com/index/chatgpt/">ChatGPT</a>, <a target="_blank" href="https://aws.amazon.com/q/developer/?trk=ff18f09a-090a-4af5-849f-9f9c7840819a&amp;sc_channel=ps&amp;ef_id=Cj0KCQiA4rK8BhD7ARIsAFe5LXI58Y3mapcJ6isDfR3oK88q9dTDIiAeChBeWrCJ2eYGhNphB92fiNcaAs9yEALw_wcB:G:s&amp;s_kwcid=AL!4422!3!698165427973!e!!g!!amazon%20q%20developer!21054971249!162057026815">Amazon Q Developer</a>, or <a target="_blank" href="https://github.com/features/copilot">GitHub Copilot</a> can easily create one for you, as I did for my own needs. I will leave this as an exercise for the audience.</p>
<p>As they say, prevention is better than cure. If your organization’s security policies require that EBS volumes be encrypted, consider using the <a target="_blank" href="https://docs.aws.amazon.com/ebs/latest/userguide/encryption-by-default.html">Amazon EBS encryption by default feature</a> to automatically encrypt any new EBS volumes.</p>
<p>This demonstrates how automation and generative AI empower DevOps engineers to tackle complex challenges efficiently. I hope you find this blog post informative and the script useful should you run into similar situations. If you like this article, please check out <a target="_blank" href="https://blog.avangards.io/">my other blog posts</a> for more helpful and intriguing content on AWS and DevOps. Thank you for reading and have a great one!</p>
]]></content:encoded></item><item><title><![CDATA[Guardrail Support for the Generic Bedrock Agent Test UI]]></title><description><![CDATA[Introduction
In the blog post Developing a Generic Streamlit UI to Test Amazon Bedrock Agents, I shared the design and source code of a basic yet functional UI for testing Bedrock agents. I’ve since added support for Knowledge Bases for Amazon Bedroc...]]></description><link>https://blog.avangards.io/guardrail-support-for-the-generic-bedrock-agent-test-ui</link><guid isPermaLink="true">https://blog.avangards.io/guardrail-support-for-the-generic-bedrock-agent-test-ui</guid><category><![CDATA[AWS]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Amazon Bedrock]]></category><category><![CDATA[AI]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Thu, 26 Sep 2024 03:06:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1727058412977/d4df6a32-48bc-4c27-ab67-7fb907727ba2.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the blog post <a target="_blank" href="https://blog.avangards.io/developing-a-generic-streamlit-ui-to-test-amazon-bedrock-agents">Developing a Generic Streamlit UI to Test Amazon Bedrock Agents</a>, I shared the design and <a target="_blank" href="https://github.com/acwwat/amazon-bedrock-agent-test-ui">source code</a> of a basic yet functional UI for testing Bedrock agents. I’ve since added <a target="_blank" href="https://blog.avangards.io/knowledge-base-support-for-the-generic-bedrock-agent-test-ui">support for Knowledge Bases for Amazon Bedrock</a> by displaying citations and their details to match the functionality in the Bedrock console.</p>
<p>Recently I’ve started experimenting with <a target="_blank" href="https://aws.amazon.com/bedrock/guardrails/">Guardrails for Amazon Bedrock</a>, a feature that enables the implementation of safeguards for your generative AI applications based on specific use cases and responsible AI policies. As part of the blog post <a target="_blank" href="https://blog.avangards.io/a-guide-to-effective-use-of-the-terraform-aws-cloud-control-provider">A Guide to Effective Use of the Terraform AWS Cloud Control Provider</a>, I created a simple history-themed Bedrock agent with a guardrail that filters violent content. As I test a guardrail-enabled agent in the Bedrock console, I see additional traces showing whether guardrails have intervened as they assess the input and output, and if so, what policies were triggered:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726987174511/5142caca-ff97-4ee3-8fc0-7e53afeac3b8.png" alt="Post-guardrail trace" class="image--center mx-auto" /></p>
<p>With a bit of work, I have added similar support to the generic test UI and I am happy to share the updates in the <a target="_blank" href="https://github.com/acwwat/amazon-bedrock-agent-test-ui">GitHub repository</a>.</p>
<h2 id="heading-design-overview">Design overview</h2>
<p>With the latest update, guardrail traces are now added to the pre-processing and post-processing trace sections in a manner similar to how they are displayed in the Bedrock console:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726987561302/7baf19b0-2f6d-41ee-884b-3fbb6d7f8345.png" alt="Guardrail trace details" class="image--center mx-auto" /></p>
<p>The <a target="_blank" href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent-runtime/client/invoke_agent.html">Boto3 <code>invoke_agent</code> method</a> provides a new <code>guardrailTrace</code> trace type that includes the assessment details from the guardrail. Distinguishing between pre- and post-guardrail traces took a bit of work, as they need to be tracked while events are streamed in sequence. The first <code>guardrailTrace</code> that shows up (naturally before any <code>preProcessingTrace</code>s) is the pre-guardrail trace, and any subsequent <code>guardrailTrace</code> (naturally after any <code>postProcessingTrace</code>s) is a post-guardrail trace. Each must then be displayed under the <strong>Pre-Processing</strong> or <strong>Post-Processing</strong> section as a single JSON object, unlike other traces, which are broken down into smaller JSON objects for readability.</p>
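<p>Reduced to its essence, the ordering logic can be expressed as a small standalone function. This is a simplified sketch with trace events reduced to type strings, not the repository's actual code:</p>
<pre><code class="lang-python">def classify_guardrail_traces(trace_types):
    """Label each guardrailTrace in a streamed event sequence as pre or post.

    The first guardrailTrace seen is the pre-guardrail assessment of the
    input; any later one is the post-guardrail assessment of the output.
    Other trace types pass through unchanged.
    """
    labeled, seen_guardrail = [], False
    for trace_type in trace_types:
        if trace_type == "guardrailTrace":
            labeled.append("preGuardrailTrace" if not seen_guardrail else "postGuardrailTrace")
            seen_guardrail = True
        else:
            labeled.append(trace_type)
    return labeled
</code></pre>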
<h2 id="heading-summary">Summary</h2>
<p>With the improvements to the generic test UI outlined in this post, you should now be able to test any Bedrock agents with an associated guardrail. I will be sure to incorporate support for new Agents for Amazon Bedrock features. If you find this blog post helpful, there is plenty more similar content at the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a>. Be sure to check them out!</p>
]]></content:encoded></item><item><title><![CDATA[A Guide to Effective Use of the Terraform AWS Cloud Control Provider]]></title><description><![CDATA[Introduction
The AWS Cloud Control (CC) Provider gained significant attention in May 2024 when it became generally available, three years after its initial launch. It promises to support new AWS features and services immediately due to its auto-gener...]]></description><link>https://blog.avangards.io/a-guide-to-effective-use-of-the-terraform-aws-cloud-control-provider</link><guid isPermaLink="true">https://blog.avangards.io/a-guide-to-effective-use-of-the-terraform-aws-cloud-control-provider</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[Amazon Bedrock]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Mon, 23 Sep 2024 02:34:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726976016784/f1d325bb-6f52-417a-8f30-e4a86313a050.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>The <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/awscc/latest/docs">AWS Cloud Control (CC) Provider</a> gained significant attention in May 2024 when it became <a target="_blank" href="https://www.hashicorp.com/blog/terraform-aws-cloud-control-api-provider-now-generally-available">generally available</a>, three years after its <a target="_blank" href="https://www.hashicorp.com/blog/announcing-terraform-aws-cloud-control-provider-tech-preview">initial launch</a>. It promises to support new AWS features and services immediately due to its auto-generated nature, which is especially beneficial for the rapidly evolving generative AI services like Amazon Bedrock.</p>
<p>But should you immediately switch to the AWS CC Provider and abandon the classic AWS Provider? Not necessarily. While the CC Provider brings speed and coverage of new services, it’s not without limitations. In this blog post, we’ll break down the strengths and weaknesses of both providers, highlighting when it makes sense to leverage the AWS CC Provider and where the classic AWS Provider still shines. A practical example will demonstrate how both can be best used together.</p>
<h2 id="heading-understanding-the-strengths-and-weaknesses-of-the-aws-cc-provider">Understanding the Strengths and Weaknesses of the AWS CC Provider</h2>
<p>The main selling point of the AWS CC Provider is that it provides support for new AWS services sooner than the classic AWS Provider. The <a target="_blank" href="https://aws.amazon.com/blogs/devops/quickly-adopt-new-aws-features-with-the-terraform-aws-cloud-control-provider/">GA announcement</a> showcases this speed by supporting the <a target="_blank" href="https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/what-is.html">Amazon Q Business</a> resources early on. In contrast, the <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws/issues/36464">enhancement request</a> that was opened against the AWS Provider in January 2024 is still pending, even though a pull request (PR) has already been submitted by a contributor for some time. It makes sense to leverage the resources from the AWS CC Provider to not delay your IaC automation effort.</p>
<p>Even though the AWS CC Provider covers new AWS services quickly, there’s still a lot of room for improvement when it comes to older services. According to <a target="_blank" href="https://github.com/aws-cloudformation/cloudformation-cli/issues/1039">this GitHub issue</a>, as of October 2023, the Cloud Control API supports only 859 resources, many of which are not supported in all AWS regions. According to <a target="_blank" href="https://docs.aws.amazon.com/cloudcontrolapi/latest/userguide/supported-resources.html">Resource types that support Cloud Control API</a>, as of July 2024 it supports 1,034 resources, which is encouraging to see. However, there is still a <a target="_blank" href="https://github.com/hashicorp/terraform-provider-awscc/issues/156">list of suppressed resources</a> that are not compatible with how the AWS CC Provider generates resources, bringing the actual supported number of resources to around 1,000. Compared to about 1,400 resources supported by the AWS Provider, that's only about 70% coverage, and even less if you consider resources that haven't been implemented in the AWS Provider.</p>
<p>While working with the AWS CC Provider, I noticed challenges with both documentation and quality assurance. The quality of descriptions for resources and attributes is somewhat inconsistent. For example, the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/awscc/latest/docs/resources/iam_group">documentation for the <code>awscc_iam_group</code> resource</a> is quite well written, while the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/awscc/latest/docs/resources/qbusiness_application">documentation for the <code>awscc_qbusiness_application</code> resource</a> is practically non-existent. Overall, it pales in comparison to the AWS API documentation which I often refer to when contributing to the AWS Provider. I am not sure why the CloudFormation schemas (from which the AWS CC Provider resources and documentation are generated) are so far apart from the AWS API documentation, but I hope AWS can reconcile the two sources at some point.</p>
<p>As for the functional quality of the AWS CC Provider, my experience unfortunately hasn't been great. While <a target="_blank" href="https://github.com/hashicorp/terraform-provider-awscc/pull/1822">adding examples to the Lightsail resources</a>, I ran into two major functional issues and two documentation issues that are caused upstream in the Cloud Control API. This led me to believe that there is insufficient quality assurance with the Cloud Control API, and the generated nature of the AWS CC Provider does not help catch these issues. The situation will hopefully improve over time, but for the time being I would prefer the AWS Provider for mission-critical use. Nevertheless, I must give credit to the AWS CC Provider maintainers for diligently reporting and working with AWS to resolve upstream issues in a timely manner. The turnaround time is much quicker than if I were to open AWS support cases myself.</p>
<h2 id="heading-also-knowing-the-merits-and-drawbacks-of-the-classic-aws-provider">Also Knowing the Merits and Drawbacks of the Classic AWS Provider</h2>
<p>The Terraform AWS Provider has been active for over ten years and <a target="_blank" href="https://www.hashicorp.com/blog/terraform-aws-provider-tops-3-billion-downloads">has recently surpassed three billion downloads</a>. The tremendous work that HashiCorp, AWS, and the community put into the provider over the years has led to high AWS service coverage. The provider boasts ample acceptance tests which result in a relatively high degree of quality. The user base also proactively reports less prevalent issues that are not caught by automated tests.</p>
<p>Since the AWS Provider code is not generated like the AWS CC Provider, the hand-crafted nature means that development is labor intensive and time consuming. Consequently, lower priority issues and features often take some time to be fixed. Even when a PR is submitted by a contributor, a maintainer from HashiCorp still needs to review, test, and merge it in between their other work such as those related to product roadmaps.</p>
<p>Meanwhile, the need for hands-on development also affords the flexibility to add custom logic to resources and data sources. As previously mentioned, the AWS API often does not fit perfectly into a CRUDL model due to actions that fall outside these operations. For instance, the Agents for <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/welcome.html">Amazon Bedrock API</a> has an <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_AssociateAgentKnowledgeBase.html">AssociateAgentKnowledgeBase action</a> that associates a knowledge base to an agent. Since it is not considered a resource in the AWS CC API, it is not mapped to a resource in the AWS CC Provider. However, an experienced developer for the AWS Provider is able to adapt this action into an "association", resulting in the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_agent_knowledge_base_association"><code>aws_bedrockagent_agent_knowledge_base_association</code> resource</a>.</p>
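<p>For illustration, here is a minimal sketch of how such an association might look in a Terraform configuration (the referenced agent, knowledge base, and description below are hypothetical examples, not from an actual deployment):</p>
<pre><code class="lang-hcl">resource "aws_bedrockagent_agent_knowledge_base_association" "example" {
  # Placeholder references to a hypothetical agent and knowledge base
  agent_id             = aws_bedrockagent_agent.example.agent_id
  knowledge_base_id    = aws_bedrockagent_knowledge_base.example.id
  description          = "Example knowledge base association"
  knowledge_base_state = "ENABLED"
}
</code></pre>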
<p>As another example, a Bedrock agent must be prepared using the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_PrepareAgent.html">PrepareAgent action</a> after it is updated. Since this action cannot be easily adapted to a resource, the logical approach is to call this API action when an <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_agent"><code>aws_bedrockagent_agent</code> resource</a> is created and updated, leading to the custom <code>prepare_agent</code> argument. Similar logic can be added to other Agents for Bedrock resources that indirectly modify an agent. This type of custom resource and logic is only possible in the AWS Provider today.</p>
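<p>As a sketch, the custom argument is set like any other attribute on the resource (the names, role, and model ID below are placeholders):</p>
<pre><code class="lang-hcl">resource "aws_bedrockagent_agent" "example" {
  agent_name              = "ExampleAgent"
  agent_resource_role_arn = aws_iam_role.example.arn
  foundation_model        = "anthropic.claude-3-haiku-20240307-v1:0"
  instruction             = "You are an assistant that answers general questions for demonstration purposes only."
  # When true (the default), the provider calls PrepareAgent after create and update
  prepare_agent           = true
}
</code></pre>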
<h2 id="heading-using-both-providers-in-a-complementary-manner">Using Both Providers in a Complementary Manner</h2>
<p>The good news is that Terraform is designed to work with multiple providers, so you can leverage both the AWS Provider and the AWS CC Provider for what they each excel at.</p>
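<p>To do so, declare both providers in the <code>required_providers</code> block of your configuration (the version constraints below are illustrative; pin them to what your project actually uses):</p>
<pre><code class="lang-hcl">terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "&gt;= 5.0"
    }
    awscc = {
      source  = "hashicorp/awscc"
      version = "&gt;= 1.0"
    }
  }
}
</code></pre>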
<p>Let’s look at a use case of adding a <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html">guardrail</a> to a Bedrock agent. Currently, the AWS Provider has the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrock_guardrail"><code>aws_bedrock_guardrail</code> resource</a>, but it does not yet have a <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws/issues/38853">resource to manage guardrail versions</a>. While the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_agent"><code>aws_bedrockagent_agent</code> resource</a> has been around for some time, it does not yet have the <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws/issues/39404">configuration to associate a guardrail</a>.</p>
<p>On the other hand, the AWS CC Provider has an <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/awscc/latest/docs/resources/bedrock_guardrail_version"><code>awscc_bedrock_guardrail_version</code> resource</a>, and the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/awscc/latest/docs/resources/bedrock_agent"><code>awscc_bedrock_agent</code> resource</a> supports the <code>guardrail_configuration</code> argument for associating a guardrail. Thus, we can strategically use the AWS CC Provider for the new features while using the AWS Provider for all other resources.</p>
<p>Here is the Terraform configuration for a simple Bedrock agent that answers questions about world history, but is guarded against providing information on violent historical events like what happened to Julius Caesar:</p>
<pre><code class="lang-hcl">data <span class="hljs-string">"aws_caller_identity"</span> <span class="hljs-string">"this"</span> {}
data <span class="hljs-string">"aws_partition"</span> <span class="hljs-string">"this"</span> {}
data <span class="hljs-string">"aws_region"</span> <span class="hljs-string">"this"</span> {}
locals {
  account_id = data.aws_caller_identity.this.account_id
  partition  = data.aws_partition.this.partition
  region     = data.aws_region.this.name
}

data <span class="hljs-string">"aws_bedrock_foundation_model"</span> <span class="hljs-string">"this"</span> {
  model_id = <span class="hljs-string">"anthropic.claude-3-haiku-20240307-v1:0"</span>
}

resource <span class="hljs-string">"aws_bedrock_guardrail"</span> <span class="hljs-string">"this"</span> {
  name                      = <span class="hljs-string">"MyGuardrail"</span>
  description               = <span class="hljs-string">"My guardrail"</span>
  blocked_input_messaging   = <span class="hljs-string">"Sorry, I cannot answer this question."</span>
  blocked_outputs_messaging = <span class="hljs-string">"Sorry, I cannot answer this question."</span>
  content_policy_config {
    filters_config {
      input_strength  = <span class="hljs-string">"HIGH"</span>
      output_strength = <span class="hljs-string">"HIGH"</span>
      type            = <span class="hljs-string">"VIOLENCE"</span>
    }
  }
}

resource <span class="hljs-string">"awscc_bedrock_guardrail_version"</span> <span class="hljs-string">"this"</span> {
  guardrail_identifier = aws_bedrock_guardrail.this.guardrail_id
  lifecycle {
    replace_triggered_by = [aws_bedrock_guardrail.this]
  }
}

resource <span class="hljs-string">"aws_iam_role"</span> <span class="hljs-string">"bedrock_agent_this"</span> {
  name = <span class="hljs-string">"AmazonBedrockExecutionRoleForAgents_MyAgent"</span>
  assume_role_policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action = <span class="hljs-string">"sts:AssumeRole"</span>
        Effect = <span class="hljs-string">"Allow"</span>
        Principal = {
          Service = <span class="hljs-string">"bedrock.amazonaws.com"</span>
        }
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:SourceAccount"</span> = local.account_id
          }
          ArnLike = {
            <span class="hljs-string">"aws:SourceArn"</span> = <span class="hljs-string">"arn:${local.partition}:bedrock:${local.region}:${local.account_id}:agent/*"</span>
          }
        }
      }
    ]
  })
}

resource <span class="hljs-string">"aws_iam_role_policy"</span> <span class="hljs-string">"bedrock_agent_this"</span> {
  name = <span class="hljs-string">"AmazonBedrockAgentBedrockFoundationModelPolicy_MyAgent"</span>
  role = aws_iam_role.bedrock_agent_this.name
  policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Sid      = <span class="hljs-string">"InvokeFoundationModel"</span>
        Action   = <span class="hljs-string">"bedrock:InvokeModel"</span>
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = data.aws_bedrock_foundation_model.this.model_arn
      },
      {
        Sid      = <span class="hljs-string">"ApplyGuardrail"</span>
        Action   = <span class="hljs-string">"bedrock:ApplyGuardrail"</span>
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = awscc_bedrock_guardrail_version.this.guardrail_arn
      }
    ]
  })
}

resource <span class="hljs-string">"awscc_bedrock_agent"</span> <span class="hljs-string">"this"</span> {
  agent_name              = <span class="hljs-string">"MyAgent"</span>
  agent_resource_role_arn = aws_iam_role.bedrock_agent_this.arn
  auto_prepare            = true
  description             = <span class="hljs-string">"My Agent"</span>
  foundation_model        = data.aws_bedrock_foundation_model.this.model_id
  guardrail_configuration = {
    guardrail_identifier = awscc_bedrock_guardrail_version.this.guardrail_arn
    guardrail_version    = awscc_bedrock_guardrail_version.this.version
  }
  instruction = <span class="hljs-string">"You are an assistant that provides information about world history. You are allowed to use general knowledge that you already possess to answer any history-related questions."</span>
}
</code></pre>
<p>As you can see, the Terraform configuration maintains the familiar usage of resources and data sources in the AWS Provider. A quick validation in the Amazon Bedrock console shows that the agent does indeed block the question about the violent event involving Julius Caesar, while it correctly answers another question about the history of wheels.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726973130721/b6544afb-7245-4140-a7f5-1e8f5fd1ee41.png" alt="Testing the agent with guardrail" class="image--center mx-auto" /></p>
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, we looked at the pros and cons of the AWS Provider and the AWS CC Provider. As it stands, there is still a long way to go before the AWS CC Provider has the quality and feature parity necessary to replace the AWS Provider, so both are here to stay for the foreseeable future.</p>
<p>If you're managing complex AWS infrastructure, now is the time to experiment with both providers. Use the AWS CC Provider for cutting-edge features, and rely on the AWS Provider for tried-and-true solutions. By blending both providers, you’ll have the best of both worlds in your Terraform configurations. You can follow <a target="_blank" href="https://developer.hashicorp.com/terraform/tutorials/aws/aws-cloud-control">this tutorial</a> or find more information <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/guides/using-aws-with-awscc-provider">here</a>.</p>
<p>For other tips and walkthroughs on AWS and Terraform, be sure to check out the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a>. Thanks for reading!</p>
]]></content:encoded></item><item><title><![CDATA[My Experience With the AWS Certified AI Practitioner (AI1-C01) Beta Exam]]></title><description><![CDATA[Introduction
AWS has recently revamped their certification lineup to align with the growing AI/ML trend. Among them is the AWS Certified AI Practitioner (AI1-C01) exam, which is currently in beta. Beta exams validate exam questions to finalize the co...]]></description><link>https://blog.avangards.io/my-experience-with-the-aws-certified-ai-practitioner-ai1-c01-beta-exam</link><guid isPermaLink="true">https://blog.avangards.io/my-experience-with-the-aws-certified-ai-practitioner-ai1-c01-beta-exam</guid><category><![CDATA[AWS]]></category><category><![CDATA[Certification]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Wed, 04 Sep 2024 06:00:21 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725424192471/b05ac858-0cab-4629-89a0-31508a510913.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>AWS has recently revamped their certification lineup to align with the growing AI/ML trend. Among them is the <a target="_blank" href="https://aws.amazon.com/certification/certified-ai-practitioner/">AWS Certified AI Practitioner (AI1-C01) exam</a>, which is currently in beta. Beta exams validate exam questions to finalize the content before wide release. As an early adopter, you get a discount on the exam fee in exchange for helping AWS test the exam. Given that this is an intriguing proposition and it aligns with my focus on AWS, I decided to write this exam and share my experience with the community.</p>
<h2 id="heading-a-bit-about-my-background">A Bit About My Background</h2>
<p>For context, I've been working exclusively with AWS for about three years and have completed a couple of professional/specialty-level exams, so I am quite familiar with AWS in general. My work also involves generative and conversational AI (although mostly on the Microsoft side), so I have knowledge of concepts such as LLMs and RAG.</p>
<p>I am interested in generative AI on AWS, so I've been experimenting on my own and have written a few blog posts about <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">Agents for Amazon Bedrock</a> and <a target="_blank" href="https://blog.avangards.io/adding-an-amazon-bedrock-knowledge-base-to-the-forex-rate-assistant">Knowledge Bases for Amazon Bedrock</a>. However, my knowledge of Amazon SageMaker is limited to understanding the ML workflow and some hands-on experience from <a target="_blank" href="https://workshops.aws/">AWS Workshops</a>.</p>
<p>Overall I'd say that I know a bit more about AI/ML than the average Joe, so I was able to expedite my study somewhat. As you read about my exam prep, consider your own knowledge and experience to adjust your approach.</p>
<h2 id="heading-how-i-studied-for-the-exam">How I Studied for the Exam</h2>
<p>In the past, I've always used <a target="_blank" href="https://www.pluralsight.com/cloud-guru">A Cloud Guru</a> to study for AWS and Azure exams. Recently I've been given an <a target="_blank" href="https://skillbuilder.aws/subscriptions">AWS Skill Builder subscription</a> by my company, so I've decided to use it as the primary source of study material.</p>
<p>The <a target="_blank" href="https://aws.amazon.com/certification/certified-ai-practitioner/">official AWS Certified AI Practitioner webpage</a> recommends a <a target="_blank" href="https://skillbuilder.aws/exam-prep/ai-practitioner">4-step exam prep plan</a> which I followed over two days. To test my knowledge prior to studying, I went through the <a target="_blank" href="https://explore.skillbuilder.aws/learn/course/external/view/elearning/19790/exam-prep-official-practice-question-set-aws-certified-ai-practitioner-aif-c01-english">official practice question set</a> and got 85% which boosted my confidence (but eventually turned out to be a trap). I then proceeded with the <a target="_blank" href="https://explore.skillbuilder.aws/learn/public/learning_plan/view/2194/enhanced-exam-prep-plan-aws-certified-ai-practitioner-aif-c01">enhanced exam prep course</a> that included bonus practice questions and flashcards over the standard (free) version.</p>
<p>The course itself is well-organized and touches upon all topics. The instructor spoke very slowly, so I watched the videos at 2x speed for a more reasonable pace. However, I found that certain concepts aren't explained very well, and the level of detail isn't representative of what the exam itself demands. I suppose the additional resources would have been good supplements, but I would have preferred a more in-depth course.</p>
<p>The bonus questions provided additional practice opportunities and the flashcards were helpful for memorizing key concepts. I transferred the flashcards into a Word document for offline review before the exam. That being said, they did not cover all important concepts and I had to supplement them with my own research on Google (such as <a target="_blank" href="https://towardsdatascience.com/types-of-machine-learning-algorithms-you-should-know-953a08248861">a list of ML algorithms</a>).</p>
<p>Following my typical study regimen, I sought more practice right before the exam. Aside from the practice questions from AWS, I quickly went through <a target="_blank" href="https://portal.tutorialsdojo.com/product/free-aws-certified-ai-practitioner-practice-exams-aif-c01-sampler/">Tutorials Dojo's free practice exam sampler</a>, given that they don't yet have a full practice exam package. (I later found out that Stephane Maarek does.)</p>
<h2 id="heading-how-the-exam-went">How the Exam Went</h2>
<p>I took the exam online from the comfort of my home on a quiet evening. The check-in process went smoothly and took about 10 minutes. The exam questions were more difficult than I expected for a foundational-level exam. On a scale of 1 to 10, I’d rate it a 4 in difficulty.</p>
<p>Although the exam guide lists case studies as a question type, I didn't get any, so your mileage may vary. There were many questions on machine learning concepts such as algorithms and performance metrics. Amazon Bedrock also featured prominently. There was also the expected mix of questions on Amazon SageMaker features, generative AI, and responsible AI. While I wouldn't say that the questions were very different from the <a target="_blank" href="https://explore.skillbuilder.aws/learn/course/external/view/elearning/19790/exam-prep-official-practice-question-set-aws-certified-ai-practitioner-aif-c01-english">official practice question set</a> or the bonus questions in the enhanced exam prep course, they seemed more in-depth and specific.</p>
<p>It took me about 75 minutes to complete the exam, with the first pass done in about an hour, leaving 27 of the 85 questions flagged for review. That is more than I usually flag in AWS exams (including the DOP exam), and even after the review I was only confident in my final answers to about half of those questions. I finished the exam feeling I would pass, though not with full confidence.</p>
<h2 id="heading-a-retrospective-on-my-approach">A Retrospective on My Approach</h2>
<p>I received the results by early morning and scored only 729, which admittedly is lower than I expected. Nonetheless, a pass is a pass. According to the report, I didn't do as well in the "guidelines for responsible AI" domain which was a bit surprising as I did well during practice. In any case, here is my reflection on my experience based on a typical retrospective framework.</p>
<h3 id="heading-what-went-well">What Went Well</h3>
<ul>
<li><p>Setting an exam date motivated me to be disciplined and keep to a set study schedule amid other priorities in life. The long weekend afforded me enough time to study, socialize, and do chores.</p>
</li>
<li><p>Following the official exam prep plan and the exam prep course helped structure my study. I've always studied by following a course, be it from A Cloud Guru or AWS Skill Builder, and it has proven to be a sound strategy.</p>
</li>
</ul>
<h3 id="heading-what-didnt-go-well">What Didn't Go Well</h3>
<ul>
<li><p>I was overly confident in my hands-on experience with services such as Amazon Bedrock, when there were numerous features and other services that I hadn't had enough exposure to. Consequently, I did not allocate time to watch tutorial videos or do hands-on labs to gain the necessary familiarity and "muscle memory".</p>
</li>
<li><p>I didn’t spend enough time on Step 2 of the <a target="_blank" href="https://skillbuilder.aws/exam-prep/ai-practitioner">4-step plan</a> to explore additional courses outside the AWS Skill Builder exam prep course. I've retroactively looked at some of the recommended courses and they would have improved my overall knowledge for this exam.</p>
</li>
</ul>
<h3 id="heading-what-could-be-improved">What Could Be Improved</h3>
<ul>
<li><p>Although I was able to get by with one full day of study, it would have been better to spread the study across multiple days for a better pace.</p>
</li>
<li><p>I would have done better with additional research on general AI/ML concepts such as algorithms and performance metrics, which the exam seemed to have more focus on.</p>
</li>
<li><p>Although the AWS Skill Builder exam prep enhanced course was decent, it was not fully adequate as the sole study material. Investing in additional courses and practice exams, <a target="_blank" href="https://www.reddit.com/r/AWSCertifications/comments/1efomya/here_is_my_new_aws_certified_ai_practitioner/">such as those from Stephane Maarek</a>, would probably have helped boost my score.</p>
</li>
</ul>
<h2 id="heading-summary">Summary</h2>
<p>I hope this blog post gives you a sense of what to expect from the AWS Certified AI Practitioner (AI1-C01) Beta Exam. It's certainly not an exam that you can wing, unless you are well-exposed to AI/ML or are already studying for other certifications such as <a target="_blank" href="https://aws.amazon.com/certification/certified-machine-learning-engineer-associate/">AWS Certified Machine Learning Engineer - Associate</a>. However with the right material and a few days of study, it is very much achievable.</p>
<p>Check out the <a target="_blank" href="https://blog.avangards.io/">Avangards Blog</a> for more articles on AWS, Terraform, and other topics. Best of luck with your studies, and I hope you’ll soon be certified!</p>
]]></content:encoded></item><item><title><![CDATA[How To Manage Amazon Inspector in AWS Organizations Using Terraform]]></title><description><![CDATA[Introduction
Over the past two months, I have published numerous blog posts on managing different AWS security services in AWS Organizations using Terraform. In this blog post, I will cover one remaining AWS service, AWS Inspector, for native vulnera...]]></description><link>https://blog.avangards.io/how-to-manage-amazon-inspector-in-aws-organizations-using-terraform</link><guid isPermaLink="true">https://blog.avangards.io/how-to-manage-amazon-inspector-in-aws-organizations-using-terraform</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Sun, 09 Jun 2024 07:02:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1717918956631/a0e70239-1223-4109-a28a-bcde238bbf70.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Over the past two months, I have published numerous <a target="_blank" href="https://blog.avangards.io/series/aws-sec-org-terraform">blog posts on managing different AWS security services in AWS Organizations using Terraform</a>. In this blog post, I will cover one remaining AWS service, Amazon Inspector, for native vulnerability management. The Terraform resources for Inspector are a bit quirky, so I will show some slightly more advanced techniques to keep the configuration neat and configurable. With that said, let's review the objective.</p>
<h2 id="heading-about-the-use-case">About the use case</h2>
<p><a target="_blank" href="https://docs.aws.amazon.com/inspector/latest/user/what-is-inspector.html">Amazon Inspector</a> is a vulnerability management service that continuously scans AWS workloads for software vulnerabilities and unintended network exposure. Supported compute services include Amazon EC2 instances, container images in Amazon ECR, and AWS Lambda functions.</p>
<p>Similar to other AWS security services, Inspector supports <a target="_blank" href="https://docs.aws.amazon.com/inspector/latest/user/managing-multiple-accounts.html">managing multiple accounts with AWS Organizations</a> via the delegated administrator feature. Once an account in the organization is designated as a delegated administrator, it can manage member accounts and view aggregated findings.</p>
<p>Since it is increasingly common to establish an AWS landing zone using <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/what-is-control-tower.html">AWS Control Tower</a>, we will use the <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/accounts.html">standard account structure</a> in a Control Tower landing zone to demonstrate how to configure Inspector in Terraform:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717901308047/ca83af30-83b3-41ad-8ae4-f6014b044a4f.png" alt="Control Tower standard OU and account structure" class="image--center mx-auto" /></p>
<p>The relevant accounts for our use case in the landing zone are:</p>
<ul>
<li><p>The <strong>Management</strong> account for the organization where AWS Organizations is configured. For details, refer to <a target="_blank" href="https://docs.aws.amazon.com/inspector/latest/user/managing-multiple-accounts.html">Managing multiple accounts in Amazon Inspector with Organizations</a>.</p>
</li>
<li><p>The <strong>Audit</strong> account where security and compliance services are typically centralized in a Control Tower landing zone.</p>
</li>
</ul>
<p>The objective is to delegate Inspector administrative duties from the <strong>Management</strong> account to the <strong>Audit</strong> account, after which all organization configurations are managed in the <strong>Audit</strong> account. Let's walk through how to do this using Terraform.</p>
<h2 id="heading-designating-an-inspector-administrator-account">Designating an Inspector administrator account</h2>
<p>The Inspector delegated administrator is configured in the <strong>Management</strong> account, so we need a provider associated with it in Terraform. To keep things simple, we will take a multi-provider approach by defining two providers, one for the <strong>Management</strong> account and another for the <strong>Audit</strong> account, using AWS CLI profiles as follows:</p>
<pre><code class="lang-hcl">provider <span class="hljs-string">"aws"</span> {
  alias   = <span class="hljs-string">"management"</span>
  <span class="hljs-comment"># Use "aws configure" to create the "management" profile with the Management account credentials</span>
  profile = <span class="hljs-string">"management"</span> 
}

provider <span class="hljs-string">"aws"</span> {
  alias   = <span class="hljs-string">"audit"</span>
  <span class="hljs-comment"># Use "aws configure" to create the "audit" profile with the Audit account credentials</span>
  profile = <span class="hljs-string">"audit"</span> 
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">Since Inspector is a regional service, you must apply this Terraform configuration on each region that you are using. Consider using the <code>region</code> argument in your provider definition and a variable to make your Terraform configuration rerunnable in other regions.</div>
</div>
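<p>For example, a <code>region</code> variable (a name introduced here for illustration) can parameterize both providers so the same configuration can be applied once per region:</p>
<pre><code class="lang-hcl">variable "region" {
  description = "The region in which to configure Amazon Inspector."
  type        = string
}

provider "aws" {
  alias   = "management"
  profile = "management"
  region  = var.region
}

provider "aws" {
  alias   = "audit"
  profile = "audit"
  region  = var.region
}
</code></pre>
<p>You would then run <code>terraform apply</code> with a different <code>region</code> value for each region, using a separate state or workspace per region.</p>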

<p>We can designate the delegated administrator using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/inspector2_delegated_admin_account"><code>aws_inspector2_delegated_admin_account</code> resource</a>. However, this does not enable Inspector in the delegated administrator account, so we also need to use the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/inspector2_enabler"><code>aws_inspector2_enabler</code> resource</a>. What I learned from testing the <code>aws_inspector2_enabler</code> resource is that you cannot provide both the delegated account and the member accounts in the <code>account_ids</code> argument, so we need a dedicated <code>aws_inspector2_enabler</code> resource for the <strong>Audit</strong> account. According to the resource source code, this is to address a legacy Inspector issue.</p>
<p>The resulting Terraform configuration should look like the following (pay special attention to the <code>provider</code> argument in each resource):</p>
<pre><code class="lang-hcl">data <span class="hljs-string">"aws_caller_identity"</span> <span class="hljs-string">"audit"</span> {
  provider = aws.audit
}

resource <span class="hljs-string">"aws_inspector2_enabler"</span> <span class="hljs-string">"audit"</span> {
  provider       = aws.audit
  account_ids    = [data.aws_caller_identity.audit.account_id]
  <span class="hljs-comment"># resource_types is required; adjust the scan types to your needs</span>
  resource_types = [<span class="hljs-string">"EC2"</span>, <span class="hljs-string">"ECR"</span>, <span class="hljs-string">"LAMBDA"</span>]
}

resource <span class="hljs-string">"aws_inspector2_delegated_admin_account"</span> <span class="hljs-string">"audit"</span> {
  provider   = aws.management
  account_id = data.aws_caller_identity.audit.account_id
  depends_on = [aws_inspector2_enabler.audit]
}
</code></pre>
<h2 id="heading-configuring-inspector-activation-for-new-member-accounts">Configuring Inspector activation for new member accounts</h2>
<p>To allow more control over which scan types are enabled, we can define the following variables and use them with the relevant resources:</p>
<pre><code class="lang-hcl"><span class="hljs-comment"># Variable definitions (variables.tf)</span>

variable <span class="hljs-string">"enable_ec2"</span> {
  description = <span class="hljs-string">"Whether Amazon EC2 scans should be enabled for both existing and new member accounts in the organization."</span>
  type        = bool
  default     = true
}

variable <span class="hljs-string">"enable_ecr"</span> {
  description = <span class="hljs-string">"Whether Amazon ECR scans should be enabled for both existing and new member accounts in the organization."</span>
  type        = bool
  default     = true
}

variable <span class="hljs-string">"enable_lambda"</span> {
  description = <span class="hljs-string">"Whether Lambda Function scans should be enabled for both existing and new member accounts in the organization."</span>
  type        = bool
  default     = true
}

variable <span class="hljs-string">"enable_lambda_code"</span> {
  description = <span class="hljs-string">"Whether Lambda code scans should be enabled for both existing and new member accounts in the organization."</span>
  type        = bool
  default     = true
}
</code></pre>
<p>In an organizational setup, Inspector can be enabled automatically for new member accounts. In Terraform, this can be configured using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/inspector2_organization_configuration"><code>aws_inspector2_organization_configuration</code> resource</a>. Leveraging the variables above, the resource can be defined as follows:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"aws_inspector2_organization_configuration"</span> <span class="hljs-string">"this"</span> {
  provider = aws.audit
  auto_enable {
    ec2         = var.enable_ec2
    ecr         = var.enable_ecr
    lambda      = var.enable_lambda
    lambda_code = var.enable_lambda_code &amp;&amp; var.enable_lambda
  }
  depends_on = [aws_inspector2_delegated_admin_account.audit]
}
</code></pre>
<p>Note that for AWS Lambda code scanning (<code>lambda_code</code>), AWS Lambda standard scanning (<code>lambda</code>) is a prerequisite, so we need to check both variables to enable it.</p>
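<p>If you want Terraform to reject an invalid combination up front, one possible guard is a cross-variable <code>validation</code> block. This is a sketch, and note that referencing another variable inside a validation condition requires Terraform 1.9 or later:</p>
<pre><code class="lang-hcl">variable "enable_lambda_code" {
  description = "Whether Lambda code scans should be enabled for both existing and new member accounts in the organization."
  type        = bool
  default     = true

  # Requires Terraform 1.9+ to reference var.enable_lambda here
  validation {
    condition     = !var.enable_lambda_code || var.enable_lambda
    error_message = "enable_lambda must be true when enable_lambda_code is true."
  }
}
</code></pre>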
<p>Now let's address the existing member accounts.</p>
<h2 id="heading-activating-scanning-for-existing-member-accounts">Activating scanning for existing member accounts</h2>
<p>Unlike GuardDuty, the Inspector organization configuration does not support auto-enablement for existing member accounts, so we need to separately manage the member accounts. The strategy is to get the list of <em>active</em> member accounts from the organization, which we can use with the Inspector Terraform resources, including the <code>aws_inspector2_enabler</code> resource. We can exclude the <strong>Audit</strong> account since that is managed separately. To get the list of member accounts in the organization, we can use the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/organizations_organization"><code>aws_organizations_organization</code> data source</a>.</p>
<p>Furthermore, the <code>aws_inspector2_enabler</code> resource's <code>resource_types</code> argument takes a list of strings that represent the scan types to enable. Since the variables we defined earlier are boolean variables, we need a bit of function magic to create the list of scans to enable based on the variables.</p>
<p>The Terraform configuration that addresses the above requirements can be defined as follows:</p>
<pre><code class="lang-dockerfile">data <span class="hljs-string">"aws_organizations_organization"</span> <span class="hljs-string">"this"</span> {
  provider = aws.management
}

locals {
  enabler_resource_types = compact([
    var.enable_ec2 ? <span class="hljs-string">"EC2"</span> : null,
    var.enable_ecr ? <span class="hljs-string">"ECR"</span> : null,
    var.enable_lambda ? <span class="hljs-string">"LAMBDA"</span> : null,
    var.enable_lambda_code &amp;&amp; var.enable_lambda ? <span class="hljs-string">"LAMBDA_CODE"</span> : null,
  ])

  member_account_ids = [for account in data.aws_organizations_organization.this.accounts : account.id if account.status == <span class="hljs-string">"ACTIVE"</span> &amp;&amp; account.id != data.aws_caller_identity.audit.account_id]
}
</code></pre>
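<p>To see how <code>compact()</code> builds the list, suppose only ECR scanning is disabled; the expression then evaluates as follows:</p>
<pre><code class="lang-hcl"># With enable_ec2 = true, enable_ecr = false,
#      enable_lambda = true, enable_lambda_code = true:
#
#   compact(["EC2", null, "LAMBDA", "LAMBDA_CODE"])
#   => ["EC2", "LAMBDA", "LAMBDA_CODE"]
</code></pre>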
<p>Member accounts are not automatically associated with the delegated administrator account, so they must first be associated using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/inspector2_member_association"><code>aws_inspector2_member_association</code> resource</a>.</p>
<p>Using the <a target="_blank" href="https://developer.hashicorp.com/terraform/language/meta-arguments/for_each"><code>for_each</code> meta-argument</a>, we can define a single resource to associate all member accounts with the previously defined <code>member_account_ids</code> local value:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_inspector2_member_association"</span> <span class="hljs-string">"members"</span> {
  provider   = aws.audit
  for_each   = toset(local.member_account_ids)
  account_id = each.key
  depends_on = [aws_inspector2_delegated_admin_account.audit]
}
</code></pre>
<p>Lastly, we can enable Inspector scans in the member accounts using the <code>aws_inspector2_enabler</code> resource. Although the <code>account_ids</code> argument can take the list of member accounts, it is more flexible to have one resource per account. Thus, using <code>for_each</code> and the local values, the resource can be defined as follows:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_inspector2_enabler"</span> <span class="hljs-string">"members"</span> {
  provider       = aws.audit
  for_each       = toset(local.member_account_ids)
  account_ids    = [each.key]
  resource_types = local.enabler_resource_types
  depends_on     = [aws_inspector2_member_association.members]
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">You can find the complete Terraform in the <a target="_blank" href="https://github.com/acwwat/terraform-amazon-inspector-organization-example">GitHub repository</a> that accompanies this blog post.</div>
</div>

<p>Now that the Terraform configuration is fully defined, you can apply it to establish the <strong>Audit</strong> account as the delegated administrator and centrally manage Inspector settings for both new and existing accounts.</p>
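<p>Note that the configuration assumes two AWS provider aliases, <code>aws.management</code> and <code>aws.audit</code>, targeting the management and Audit accounts respectively. A minimal sketch of what those aliases might look like is shown below; the region, account IDs, and role name are placeholders you would replace with your own:</p>
<pre><code class="lang-hcl">provider "aws" {
  alias  = "management"
  region = "us-east-1"
  assume_role {
    # Placeholder management account ID and role name
    role_arn = "arn:aws:iam::111111111111:role/TerraformExecutionRole"
  }
}

provider "aws" {
  alias  = "audit"
  region = "us-east-1"
  assume_role {
    # Placeholder Audit account ID and role name
    role_arn = "arn:aws:iam::222222222222:role/TerraformExecutionRole"
  }
}
</code></pre>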
<h2 id="heading-caveats-about-deactivating-inspector-in-member-accounts">Caveats about deactivating Inspector in member accounts</h2>
<p>Among the AWS security services, Inspector has the least sophisticated API for organizational management. The mix of auto-enablement for new member accounts and explicit enablement for existing ones complicates how member accounts are managed in Terraform, particularly if you are trying to disable Inspector via <code>terraform destroy</code>.</p>
<p>Consider the case where a new member account is added and auto-enablement is applied to it. If you run <code>terraform destroy</code> as-is, Terraform is not aware of the new member account, so it cannot deactivate Inspector there. You must manually deactivate Inspector in that account in each region where the configuration was applied.</p>
<p>Alternatively, you can first run <code>terraform apply</code> so that the <code>aws_inspector2_member_association</code> and <code>aws_inspector2_enabler</code> resource instances are created, then run <code>terraform destroy</code> to properly clean up. While this method works, you must keep track of when new member accounts are added so that you know when to run <code>terraform apply</code> to reconcile the Terraform resources with the updated organization.</p>
<p>In any case, be aware of this caveat and take one of the two approaches if you ever need to clean up Inspector resources.</p>
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, you learned how to manage Amazon Inspector in AWS Organizations using Terraform. With a delegated administrator, Inspector can be auto-enabled for new member accounts, while existing member accounts are dynamically associated and configured with the desired scan types. If you have also <a target="_blank" href="https://blog.avangards.io/how-to-manage-aws-security-hub-in-aws-organizations-using-terraform">configured AWS Security Hub to operate at the organization level</a>, you can manage Inspector findings across accounts and regions, thereby streamlining your security operations.</p>
<p>If you are interested in this type of content, be sure to read other posts on the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a>, where I share tips and deep dives on AWS, Terraform, and beyond. Thank you, and enjoy the rest of your day!</p>
]]></content:encoded></item><item><title><![CDATA[How To Manage an Amazon Bedrock Knowledge Base Using Terraform]]></title><description><![CDATA[Introduction
In the previous blog post, Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant, I explained how to create a Bedrock knowledge base and associate it with a Bedrock agent using the AWS Management Console, with a forex rate ...]]></description><link>https://blog.avangards.io/how-to-manage-an-amazon-bedrock-knowledge-base-using-terraform</link><guid isPermaLink="true">https://blog.avangards.io/how-to-manage-an-amazon-bedrock-knowledge-base-using-terraform</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Amazon Bedrock]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Sun, 02 Jun 2024 19:59:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1717918997715/b7c30ddd-f59f-4349-a0aa-38f1c04b810c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the previous blog post, <a target="_blank" href="https://blog.avangards.io/adding-an-amazon-bedrock-knowledge-base-to-the-forex-rate-assistant">Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant</a>, I explained how to create a Bedrock knowledge base and associate it with a Bedrock agent using the AWS Management Console, with a forex rate assistant as the use case example.</p>
<p>We also covered how to manage Bedrock agents with Terraform in another blog post, <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform">How To Manage an Amazon Bedrock Agent Using Terraform</a>. In this blog post, we will extend that setup to also manage knowledge bases in Terraform. To begin, we will first examine the relevant AWS resources in the AWS Management Console.</p>
<h2 id="heading-taking-inventory-of-the-required-resources">Taking inventory of the required resources</h2>
<p>Upon examining the knowledge base we previously built, we find that it comprises the following AWS resources:</p>
<ol>
<li><p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html">knowledge base</a> itself;</p>
</li>
<li><p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/kb-permissions.html">knowledge base service role</a> that provides the knowledge base access to Amazon Bedrock models, data sources in S3, and the vector index;</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716965216124/16b0370c-3767-4370-a008-284f9228e0c2.png" alt="The knowledge base and its service role" class="image--center mx-auto" /></p>
</li>
<li><p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html">OpenSearch Serverless policies, collection, and the vector index</a>;</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716965225539/12a5a1f8-5a7c-47ac-90b1-4d61748b7304.png" alt="The OpenSearch Serverless collection" class="image--center mx-auto" /></p>
</li>
<li><p>The S3 bucket that acts as the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-ds-manage.html">data source</a></p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717372925645/0f609755-c15c-45bb-a9cb-71cebad212ca.png" alt="The knowledge base data source" class="image--center mx-auto" /></p>
</li>
</ol>
<p>With this list of resources, along with those required by the agent to which the knowledge base will be attached, we can begin creating the Terraform configuration. Before diving into the setup, let's first take care of the prerequisites.</p>
<h2 id="heading-defining-variables-for-the-configuration">Defining variables for the configuration</h2>
<p>For better manageability, we define some variables in a <code>variables.tf</code> file that we will reference throughout the Terraform configuration:</p>
<pre><code class="lang-dockerfile">variable <span class="hljs-string">"kb_s3_bucket_name_prefix"</span> {
  description = <span class="hljs-string">"The name prefix of the S3 bucket for the data source of the knowledge base."</span>
  type        = string
  default     = <span class="hljs-string">"forex-kb"</span>
}

variable <span class="hljs-string">"kb_oss_collection_name"</span> {
  description = <span class="hljs-string">"The name of the OSS collection for the knowledge base."</span>
  type        = string
  default     = <span class="hljs-string">"bedrock-knowledge-base-forex-kb"</span>
}

variable <span class="hljs-string">"kb_model_id"</span> {
  description = <span class="hljs-string">"The ID of the foundational model used by the knowledge base."</span>
  type        = string
  default     = <span class="hljs-string">"amazon.titan-embed-text-v1"</span>
}

variable <span class="hljs-string">"kb_name"</span> {
  description = <span class="hljs-string">"The knowledge base name."</span>
  type        = string
  default     = <span class="hljs-string">"ForexKB"</span>
}
</code></pre>
<h2 id="heading-defining-the-s3-and-iam-resources">Defining the S3 and IAM resources</h2>
<p>The knowledge base requires a service role, which can be created using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role"><code>aws_iam_role</code> resource</a> as follows:</p>
<pre><code class="lang-dockerfile">data <span class="hljs-string">"aws_caller_identity"</span> <span class="hljs-string">"this"</span> {}
data <span class="hljs-string">"aws_partition"</span> <span class="hljs-string">"this"</span> {}
data <span class="hljs-string">"aws_region"</span> <span class="hljs-string">"this"</span> {}

locals {
  account_id            = data.aws_caller_identity.this.account_id
  partition             = data.aws_partition.this.partition
  region                = data.aws_region.this.name
  region_name_tokenized = split(<span class="hljs-string">"-"</span>, local.region)
  region_short          = <span class="hljs-string">"${substr(local.region_name_tokenized[0], 0, 2)}${substr(local.region_name_tokenized[1], 0, 1)}${local.region_name_tokenized[2]}"</span>
}

resource <span class="hljs-string">"aws_iam_role"</span> <span class="hljs-string">"bedrock_kb_forex_kb"</span> {
  name = <span class="hljs-string">"AmazonBedrockExecutionRoleForKnowledgeBase_${var.kb_name}"</span>
  assume_role_policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action = <span class="hljs-string">"sts:AssumeRole"</span>
        Effect = <span class="hljs-string">"Allow"</span>
        Principal = {
          Service = <span class="hljs-string">"bedrock.amazonaws.com"</span>
        }
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:SourceAccount"</span> = local.account_id
          }
          ArnLike = {
            <span class="hljs-string">"aws:SourceArn"</span> = <span class="hljs-string">"arn:${local.partition}:bedrock:${local.region}:${local.account_id}:knowledge-base/*"</span>
          }
        }
      }
    ]
  })
}
</code></pre>
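<p>The <code>region_short</code> local may look cryptic at first glance; as a worked example, for the <code>us-east-1</code> region it evaluates like this:</p>
<pre><code class="lang-hcl"># region = "us-east-1"
#   split("-", "us-east-1")  => ["us", "east", "1"]
#   substr("us", 0, 2)       => "us"
#   substr("east", 0, 1)     => "e"
#   tokens[2]                => "1"
#   region_short             => "use1"
</code></pre>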
<p>With the service role in place, we can now define its IAM policies. As we create the resources that the knowledge base service role needs to access, we will define the corresponding IAM policies using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy"><code>aws_iam_role_policy</code> resource</a>. First, we create the IAM policy that provides access to the embeddings model. Since the foundation model is referenced rather than created, we can use the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/bedrock_foundation_model"><code>aws_bedrock_foundation_model</code> data source</a> to obtain the ARN that we need:</p>
<pre><code class="lang-dockerfile">data <span class="hljs-string">"aws_bedrock_foundation_model"</span> <span class="hljs-string">"kb"</span> {
  model_id = var.kb_model_id
}

resource <span class="hljs-string">"aws_iam_role_policy"</span> <span class="hljs-string">"bedrock_kb_forex_kb_model"</span> {
  name = <span class="hljs-string">"AmazonBedrockFoundationModelPolicyForKnowledgeBase_${var.kb_name}"</span>
  role = aws_iam_role.bedrock_kb_forex_kb.name
  policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action   = <span class="hljs-string">"bedrock:InvokeModel"</span>
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = data.aws_bedrock_foundation_model.kb.model_arn
      }
    ]
  })
}
</code></pre>
<p>Next, we create the Amazon S3 bucket that acts as the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-ds.html">data source</a> for the knowledge base using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket"><code>aws_s3_bucket</code> resource</a>. To adhere to security best practices, we also enable S3-SSE using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_server_side_encryption_configuration"><code>aws_s3_bucket_server_side_encryption_configuration</code> resource</a> and bucket versioning with the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_versioning"><code>aws_s3_bucket_versioning</code> resource</a> as follows:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_s3_bucket"</span> <span class="hljs-string">"forex_kb"</span> {
  bucket        = <span class="hljs-string">"${var.kb_s3_bucket_name_prefix}-${local.region_short}-${local.account_id}"</span>
  force_destroy = true
}

resource <span class="hljs-string">"aws_s3_bucket_server_side_encryption_configuration"</span> <span class="hljs-string">"forex_kb"</span> {
  bucket = aws_s3_bucket.forex_kb.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = <span class="hljs-string">"AES256"</span>
    }
  }
}

resource <span class="hljs-string">"aws_s3_bucket_versioning"</span> <span class="hljs-string">"forex_kb"</span> {
  bucket = aws_s3_bucket.forex_kb.id
  versioning_configuration {
    status = <span class="hljs-string">"Enabled"</span>
  }
  depends_on = [aws_s3_bucket_server_side_encryption_configuration.forex_kb]
}
</code></pre>
<p>Now that the S3 bucket is available, we can create the IAM policy that gives the knowledge base service role access to files for indexing:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_iam_role_policy"</span> <span class="hljs-string">"bedrock_kb_forex_kb_s3"</span> {
  name = <span class="hljs-string">"AmazonBedrockS3PolicyForKnowledgeBase_${var.kb_name}"</span>
  role = aws_iam_role.bedrock_kb_forex_kb.name
  policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Sid      = <span class="hljs-string">"S3ListBucketStatement"</span>
        Action   = <span class="hljs-string">"s3:ListBucket"</span>
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = aws_s3_bucket.forex_kb.arn
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:PrincipalAccount"</span> = local.account_id
          }
      } },
      {
        Sid      = <span class="hljs-string">"S3GetObjectStatement"</span>
        Action   = <span class="hljs-string">"s3:GetObject"</span>
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = <span class="hljs-string">"${aws_s3_bucket.forex_kb.arn}/*"</span>
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:PrincipalAccount"</span> = local.account_id
          }
        }
      }
    ]
  })
}
</code></pre>
<h2 id="heading-defining-the-opensearch-serverless-policy-resources">Defining the OpenSearch Serverless policy resources</h2>
<p>The Bedrock console offers a quick create option that provisions an OpenSearch Serverless vector store on our behalf as the knowledge base is created. Since the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html">documentation</a> for creating the vector index in OpenSearch Serverless is a bit open-ended, we can refer to the resources created by the quick create option to supplement it.</p>
<p>First, we <a target="_blank" href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-vector-search.html#serverless-vector-permissions">configure permissions</a> by defining a <a target="_blank" href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-data-access.html">data access policy</a> for the vector search collection. The data access policy from the quick create option is defined as follows:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716966076039/56614b6b-4bc4-4769-a668-aed23471b5b3.png" alt="The OpenSearch Serverless data access policy" class="image--center mx-auto" /></p>
<p>This data access policy provides read and write permissions to the vector search collection and its indices to the knowledge base execution role and the creator of the policy.</p>
<p>Using the corresponding <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/opensearchserverless_access_policy"><code>aws_opensearchserverless_access_policy</code> resource</a>, we can define the policy as follows:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_opensearchserverless_access_policy"</span> <span class="hljs-string">"forex_kb"</span> {
  name = var.kb_oss_collection_name
  type = <span class="hljs-string">"data"</span>
  policy = jsonencode([
    {
      Rules = [
        {
          ResourceType = <span class="hljs-string">"index"</span>
          Resource = [
            <span class="hljs-string">"index/${var.kb_oss_collection_name}/*"</span>
          ]
          Permission = [
            <span class="hljs-string">"aoss:CreateIndex"</span>,
            <span class="hljs-string">"aoss:DeleteIndex"</span>,
            <span class="hljs-string">"aoss:DescribeIndex"</span>,
            <span class="hljs-string">"aoss:ReadDocument"</span>,
            <span class="hljs-string">"aoss:UpdateIndex"</span>,
            <span class="hljs-string">"aoss:WriteDocument"</span>
          ]
        },
        {
          ResourceType = <span class="hljs-string">"collection"</span>
          Resource = [
            <span class="hljs-string">"collection/${var.kb_oss_collection_name}"</span>
          ]
          Permission = [
            <span class="hljs-string">"aoss:CreateCollectionItems"</span>,
            <span class="hljs-string">"aoss:DescribeCollectionItems"</span>,
            <span class="hljs-string">"aoss:UpdateCollectionItems"</span>
          ]
        }
      ],
      Principal = [
        aws_iam_role.bedrock_kb_forex_kb.arn,
        data.aws_caller_identity.this.arn
      ]
    }
  ])
}
</code></pre>
<p>Note that <code>aoss:DeleteIndex</code> was added to the list because this is required for cleanup by Terraform via <code>terraform destroy</code>.</p>
<p>Next, we need an <a target="_blank" href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-encryption.html">encryption policy</a> that assigns an encryption key to a collection for data protection at rest. The encryption policy from the quick create option is defined as follows:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716966097683/a29bf2d0-7d1d-4569-bbf1-9e4cb36bd83c.png" alt="The OpenSearch Serverless encryption policy" class="image--center mx-auto" /></p>
<p>This encryption policy simply assigns an AWS-owned key to the vector search collection. Using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/opensearchserverless_security_policy"><code>aws_opensearchserverless_security_policy</code> resource</a> with an encryption type, we can define the policy as follows:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_opensearchserverless_security_policy"</span> <span class="hljs-string">"forex_kb_encryption"</span> {
  name = var.kb_oss_collection_name
  type = <span class="hljs-string">"encryption"</span>
  policy = jsonencode({
    Rules = [
      {
        Resource = [
          <span class="hljs-string">"collection/${var.kb_oss_collection_name}"</span>
        ]
        ResourceType = <span class="hljs-string">"collection"</span>
      }
    ],
    AWSOwnedKey = true
  })
}
</code></pre>
<p>Lastly, we need a <a target="_blank" href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-network.html">network policy</a> which defines whether a collection is accessible publicly or privately. The network policy from the quick create option is defined as follows:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716963295938/7b410c0a-8ff6-4333-83df-7d489e6959af.png" alt="The OpenSearch Serverless network policy" class="image--center mx-auto" /></p>
<p>his network policy allows public access to the vector search collection's API endpoint and dashboard over the internet. Using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/opensearchserverless_security_policy"><code>aws_opensearchserverless_security_policy</code> resource</a> with an network type, we can define the policy as follows:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_opensearchserverless_security_policy"</span> <span class="hljs-string">"forex_kb_network"</span> {
  name = var.kb_oss_collection_name
  type = <span class="hljs-string">"network"</span>
  policy = jsonencode([
    {
      Rules = [
        {
          ResourceType = <span class="hljs-string">"collection"</span>
          Resource = [
            <span class="hljs-string">"collection/${var.kb_oss_collection_name}"</span>
          ]
        },
        {
          ResourceType = <span class="hljs-string">"dashboard"</span>
          Resource = [
            <span class="hljs-string">"collection/${var.kb_oss_collection_name}"</span>
          ]
        }
      ]
      AllowFromPublic = true
    }
  ])
}
</code></pre>
<p>With the prerequisite policies in place, we can now create the vector search collection and the index.</p>
<h2 id="heading-defining-the-opensearch-serverless-collection-and-index-resources">Defining the OpenSearch Serverless collection and index resources</h2>
<p>Creating the collection in Terraform is straightforward using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/opensearchserverless_collection"><code>aws_opensearchserverless_collection</code> resource</a>:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_opensearchserverless_collection"</span> <span class="hljs-string">"forex_kb"</span> {
  name = var.kb_oss_collection_name
  type = <span class="hljs-string">"VECTORSEARCH"</span>
  depends_on = [
    aws_opensearchserverless_access_policy.forex_kb,
    aws_opensearchserverless_security_policy.forex_kb_encryption,
    aws_opensearchserverless_security_policy.forex_kb_network
  ]
}
</code></pre>
<p>The knowledge base service role also needs access to the collection, which we can grant using the <code>aws_iam_role_policy</code> resource, similar to before:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_iam_role_policy"</span> <span class="hljs-string">"bedrock_kb_forex_kb_oss"</span> {
  name = <span class="hljs-string">"AmazonBedrockOSSPolicyForKnowledgeBase_${var.kb_name}"</span>
  role = aws_iam_role.bedrock_kb_forex_kb.name
  policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action   = <span class="hljs-string">"aoss:APIAccessAll"</span>
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = aws_opensearchserverless_collection.forex_kb.arn
      }
    ]
  })
}
</code></pre>
<p>Creating the index in Terraform is, however, more complex, since it is not an AWS resource but an OpenSearch construct. Looking at CloudTrail events, there was no event corresponding to an AWS API call that creates the index. However, observing the network traffic in the Bedrock console did reveal a request to the OpenSearch collection's API endpoint to create the index. This is what we want to port to Terraform.</p>
<p>Luckily, there is an <a target="_blank" href="https://registry.terraform.io/providers/opensearch-project/opensearch/latest/docs">OpenSearch Provider</a> maintained by OpenSearch that we can use. To connect to the vector search collection, we provide the endpoint URL and credentials in the <code>provider</code> block. The provider has first-class support for AWS, so credentials can be provided implicitly similar to the Terraform AWS Provider. The resulting provider definition is as follows:</p>
<pre><code class="lang-dockerfile">provider <span class="hljs-string">"opensearch"</span> {
  url         = aws_opensearchserverless_collection.forex_kb.collection_endpoint
  <span class="hljs-keyword">healthcheck</span><span class="bash"> = <span class="hljs-literal">false</span></span>
}
</code></pre>
<p>Note that the <code>healthcheck</code> argument is set to <code>false</code> because the client health check does not really work with OpenSearch Serverless.</p>
<p>To get the index definition, we can examine the collection in the OpenSearch Service Console:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716963317103/c85c3d2c-f7e8-4e0b-82f0-16a04f1f87e9.png" alt="The OpenSearch Serverless index details" class="image--center mx-auto" /></p>
<p>We can create the index using the <a target="_blank" href="https://registry.terraform.io/providers/opensearch-project/opensearch/latest/docs/resources/index"><code>opensearch_index</code> resource</a> with the same specifications:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"opensearch_index"</span> <span class="hljs-string">"forex_kb"</span> {
  name                           = <span class="hljs-string">"bedrock-knowledge-base-default-index"</span>
  number_of_shards               = <span class="hljs-string">"2"</span>
  number_of_replicas             = <span class="hljs-string">"0"</span>
  index_knn                      = true
  index_knn_algo_param_ef_search = <span class="hljs-string">"512"</span>
  mappings                       = &lt;&lt;-EOF
    {
      <span class="hljs-string">"properties"</span>: {
        <span class="hljs-string">"bedrock-knowledge-base-default-vector"</span>: {
          <span class="hljs-string">"type"</span>: <span class="hljs-string">"knn_vector"</span>,
          <span class="hljs-string">"dimension"</span>: <span class="hljs-number">1536</span>,
          <span class="hljs-string">"method"</span>: {
            <span class="hljs-string">"name"</span>: <span class="hljs-string">"hnsw"</span>,
            <span class="hljs-string">"engine"</span>: <span class="hljs-string">"faiss"</span>,
            <span class="hljs-string">"parameters"</span>: {
              <span class="hljs-string">"m"</span>: <span class="hljs-number">16</span>,
              <span class="hljs-string">"ef_construction"</span>: <span class="hljs-number">512</span>
            },
            <span class="hljs-string">"space_type"</span>: <span class="hljs-string">"l2"</span>
          }
        },
        <span class="hljs-string">"AMAZON_BEDROCK_METADATA"</span>: {
          <span class="hljs-string">"type"</span>: <span class="hljs-string">"text"</span>,
          <span class="hljs-string">"index"</span>: <span class="hljs-string">"false"</span>
        },
        <span class="hljs-string">"AMAZON_BEDROCK_TEXT_CHUNK"</span>: {
          <span class="hljs-string">"type"</span>: <span class="hljs-string">"text"</span>,
          <span class="hljs-string">"index"</span>: <span class="hljs-string">"true"</span>
        }
      }
    }
  EOF
  force_destroy                  = true
  depends_on                     = [aws_opensearchserverless_collection.forex_kb]
}
</code></pre>
<p>Note that the dimension is set to 1536, which is the value required for the <strong>Titan G1 Embeddings - Text</strong> model.</p>
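<p>The embedding model itself is looked up later in the configuration through the <code>data.aws_bedrock_foundation_model.kb</code> data source. As a sketch, it could be defined along these lines; the <code>amazon.titan-embed-text-v1</code> model ID is my assumption based on the Titan Embeddings G1 - Text model, so adjust it to the embedding model you actually use:</p>
<pre><code class="lang-hcl"># Sketch: look up the embedding model for the knowledge base
# (the model_id value is an assumption; verify it in the Bedrock model catalog)
data "aws_bedrock_foundation_model" "kb" {
  model_id = "amazon.titan-embed-text-v1"
}
</code></pre>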
<p>Before we move on, you should know about an issue with the Terraform OpenSearch provider that caused me a lot of headaches. When I was testing the Terraform configuration, the <code>opensearch_index</code> resource kept failing because the provider seemingly could not authenticate against the collection's endpoint URL. After a long debugging session, I found a <a target="_blank" href="https://github.com/opensearch-project/terraform-provider-opensearch/issues/179">GitHub issue</a> in the Terraform OpenSearch Provider repository that mentions the same cryptic "EOF" error. The issue indicates that the bug is specific to OpenSearch Serverless and that an earlier provider version, v2.2.0, does not have the problem. Consequently, I was able to work around it by pinning the provider to that specific version:</p>
<pre><code class="lang-hcl">terraform {
  required_providers {
    aws = {
      source  = <span class="hljs-string">"hashicorp/aws"</span>
      version = <span class="hljs-string">"~&gt; 5.48"</span>
    }
    opensearch = {
      source  = <span class="hljs-string">"opensearch-project/opensearch"</span>
      version = <span class="hljs-string">"= 2.2.0"</span>
    }
  }
  required_version = <span class="hljs-string">"~&gt; 1.5"</span>
}
</code></pre>
<p>Hopefully letting you in on this tip will save you hours of troubleshooting.</p>
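<p>One related detail worth noting: the OpenSearch provider must also be pointed at the collection's endpoint. A minimal sketch might look like the following; the exact argument names (notably <code>healthcheck</code>) are assumptions to verify against the provider documentation for the version you pin:</p>
<pre><code class="lang-hcl"># Sketch: configure the OpenSearch provider against the serverless collection
# endpoint; the health check is disabled here because OpenSearch Serverless does
# not expose the cluster health API (argument names are assumptions to verify)
provider "opensearch" {
  url         = aws_opensearchserverless_collection.forex_kb.collection_endpoint
  healthcheck = false
}
</code></pre>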
<h2 id="heading-defining-the-knowledge-base-resource">Defining the knowledge base resource</h2>
<p>With all dependent resources in place, we can now proceed to create the knowledge base. However, there is the matter of <a target="_blank" href="https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_general.html#troubleshoot_general_eventual-consistency">eventual consistency with IAM resources</a> that we first need to address. Since Terraform creates resources in quick succession, there is a chance that the configuration of the knowledge base service role is not propagated across AWS endpoints before it is used by the knowledge base during its creation, resulting in temporary permission issues. What I observed during testing is that the permission error is usually related to the OpenSearch Serverless collection.</p>
<p>To mitigate this, we add a delay using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/time/latest/docs/resources/sleep"><code>time_sleep</code> resource</a> from the Time provider. The following configuration adds a 20-second delay after the IAM policy for the OpenSearch Serverless collection is created:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"time_sleep"</span> <span class="hljs-string">"aws_iam_role_policy_bedrock_kb_forex_kb_oss"</span> {
  create_duration = <span class="hljs-string">"20s"</span>
  depends_on      = [aws_iam_role_policy.bedrock_kb_forex_kb_oss]
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">If you still encounter permission issues when creating the knowledge base, try increasing the delay to 30 seconds.</div>
</div>

<p>Now we can create the knowledge base using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_knowledge_base"><code>aws_bedrockagent_knowledge_base</code> resource</a> as follows:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"aws_bedrockagent_knowledge_base"</span> <span class="hljs-string">"forex_kb"</span> {
  name     = var.kb_name
  role_arn = aws_iam_role.bedrock_kb_forex_kb.arn
  knowledge_base_configuration {
    vector_knowledge_base_configuration {
      embedding_model_arn = data.aws_bedrock_foundation_model.kb.model_arn
    }
    type = <span class="hljs-string">"VECTOR"</span>
  }
  storage_configuration {
    type = <span class="hljs-string">"OPENSEARCH_SERVERLESS"</span>
    opensearch_serverless_configuration {
      collection_arn    = aws_opensearchserverless_collection.forex_kb.arn
      vector_index_name = <span class="hljs-string">"bedrock-knowledge-base-default-index"</span>
      field_mapping {
        vector_field   = <span class="hljs-string">"bedrock-knowledge-base-default-vector"</span>
        text_field     = <span class="hljs-string">"AMAZON_BEDROCK_TEXT_CHUNK"</span>
        metadata_field = <span class="hljs-string">"AMAZON_BEDROCK_METADATA"</span>
      }
    }
  }
  depends_on = [
    aws_iam_role_policy.bedrock_kb_forex_kb_model,
    aws_iam_role_policy.bedrock_kb_forex_kb_s3,
    opensearch_index.forex_kb,
    time_sleep.aws_iam_role_policy_bedrock_kb_forex_kb_oss
  ]
}
</code></pre>
<p>Note that <code>time_sleep.aws_iam_role_policy_bedrock_kb_forex_kb_oss</code> is in the <code>depends_on</code> list; this is how the aforementioned delay is enforced before Terraform creates the knowledge base.</p>
<p>We also need to add the data source to the knowledge base using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_data_source"><code>aws_bedrockagent_data_source</code> resource</a> as follows:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"aws_bedrockagent_data_source"</span> <span class="hljs-string">"forex_kb"</span> {
  knowledge_base_id = aws_bedrockagent_knowledge_base.forex_kb.id
  name              = <span class="hljs-string">"${var.kb_name}DataSource"</span>
  data_source_configuration {
    type = <span class="hljs-string">"S3"</span>
    s3_configuration {
      bucket_arn = aws_s3_bucket.forex_kb.arn
    }
  }
}
</code></pre>
<p>Voila! We have created a stand-alone Bedrock knowledge base using Terraform! All that remains is to attach the knowledge base to an agent (the forex assistant in our case) to extend the solution.</p>
<h2 id="heading-integrating-the-knowledge-base-and-agent-resources">Integrating the knowledge base and agent resources</h2>
<p>For your convenience, you can use the Terraform configuration from the blog post <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform">How To Manage an Amazon Bedrock Agent Using Terraform</a> to create the rate assistant. It can be found in the <code>1_basic</code> directory in <a target="_blank" href="https://github.com/acwwat/terraform-amazon-bedrock-agent-example">this GitHub repository</a>.</p>
<p>Once you incorporate this Terraform configuration into the knowledge base configuration you’ve been developing, you can use the new <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_agent_knowledge_base_association"><code>aws_bedrockagent_agent_knowledge_base_association</code> resource</a> to associate the knowledge base with the agent:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"aws_bedrockagent_agent_knowledge_base_association"</span> <span class="hljs-string">"forex_kb"</span> {
  agent_id             = aws_bedrockagent_agent.forex_asst.id
  description          = file(<span class="hljs-string">"${path.module}/prompt_templates/kb_instruction.txt"</span>)
  knowledge_base_id    = aws_bedrockagent_knowledge_base.forex_kb.id
  knowledge_base_state = <span class="hljs-string">"ENABLED"</span>
}
</code></pre>
<p>For better organization, we will keep the knowledge base description in a text file called <code>kb_instruction.txt</code> in the <code>prompt_templates</code> folder. The file contains the following text:</p>
<pre><code class="lang-plaintext">Use this knowledge base to retrieve information on foreign currency exchange, such as the FX Global Code.
</code></pre>
<p>Lastly, we explained in the previous blog post that the agent must be prepared after changes are made. We used a <code>null_resource</code> to trigger the prepare action, so we will continue to use the same strategy for the knowledge base association by adding an explicit dependency:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"null_resource"</span> <span class="hljs-string">"forex_asst_prepare"</span> {
  triggers = {
    forex_api_state = sha256(jsonencode(aws_bedrockagent_agent_action_group.forex_api))
    forex_kb_state  = sha256(jsonencode(aws_bedrockagent_knowledge_base.forex_kb))
  }
  provisioner <span class="hljs-string">"local-exec"</span> {
    command = <span class="hljs-string">"aws bedrock-agent prepare-agent --agent-id ${aws_bedrockagent_agent.forex_asst.id}"</span>
  }
  depends_on = [
    aws_bedrockagent_agent.forex_asst,
    aws_bedrockagent_agent_action_group.forex_api,
    aws_bedrockagent_knowledge_base.forex_kb
  ]
}
</code></pre>
<h2 id="heading-testing-the-configuration">Testing the configuration</h2>
<p>Now, the moment of truth. We can apply the full Terraform configuration and make sure that it is working properly. My run took several minutes, with the majority of the time spent on creating the OpenSearch Serverless collection. Here is an excerpt of the output for reference:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716963346676/b1b8dd91-ae27-429d-a4ad-229889427cf0.png" alt="Excerpt of the Terraform apply output" class="image--center mx-auto" /></p>
<p>In the Bedrock console, we can see that the agent <strong>ForexAssistant</strong> is ready for testing. But we first need to upload the <a target="_blank" href="https://www.globalfxc.org/docs/fx_global.pdf">FX Global Code PDF file</a> to the S3 bucket and do a data source sync. For details on these steps, refer to the blog post <a target="_blank" href="https://blog.avangards.io/adding-an-amazon-bedrock-knowledge-base-to-the-forex-rate-assistant">Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant</a>.</p>
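<p>If you prefer to script the upload and sync instead of using the console, they can be sketched with the same <code>local-exec</code> pattern used elsewhere in this configuration. The resource name, local file path, and the <code>data_source_id</code> attribute below are assumptions to verify against the AWS CLI and provider documentation:</p>
<pre><code class="lang-hcl"># Sketch: upload the source document and start an ingestion job once the
# data source exists (names, paths, and attributes here are assumptions)
resource "null_resource" "forex_kb_sync" {
  # Upload the FX Global Code PDF to the data source bucket
  provisioner "local-exec" {
    command = "aws s3 cp fx_global.pdf s3://${aws_s3_bucket.forex_kb.id}/"
  }
  # Then trigger a data source sync (ingestion job)
  provisioner "local-exec" {
    command = "aws bedrock-agent start-ingestion-job --knowledge-base-id ${aws_bedrockagent_knowledge_base.forex_kb.id} --data-source-id ${aws_bedrockagent_data_source.forex_kb.data_source_id}"
  }
  depends_on = [aws_bedrockagent_data_source.forex_kb]
}
</code></pre>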
<p>Using the test chat interface, I asked:</p>
<blockquote>
<p>What is the FX Global Code?</p>
</blockquote>
<p>It responded with an explanation that contains citations, indicating that the information was obtained from the knowledge base.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716963600143/261d3b1a-2be1-4543-958a-160d6e67f6d2.png" alt="Agent performing knowledge base search" class="image--center mx-auto" /></p>
<p>For good measure, we will also ask the forex assistant for an exchange rate:</p>
<blockquote>
<p>What is the exchange rate from US Dollar to Canadian Dollar?</p>
</blockquote>
<p>It responded with the latest exchange rate as expected:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716963998760/d9485999-876e-4c36-9716-93d6c38bc7f9.png" alt="Agent fetching forex rate as expected" class="image--center mx-auto" /></p>
<p>And that's a wrap! Don't forget to run <code>terraform destroy</code> when you are done, since there is a running cost for the OpenSearch Serverless collection.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">For reference, I've dressed up the Terraform solution a bit and checked in the final artifacts to the <code>2_knowledge_base</code> directory in <a target="_blank" href="https://github.com/acwwat/terraform-amazon-bedrock-agent-example">this repository</a>. Feel free to check it out and use it as the basis for your Bedrock experimentation.</div>
</div>

<h2 id="heading-summary">Summary</h2>
<p>In this blog post, we developed the Terraform configuration for the knowledge base that enhances the forex rate assistant which we created interactively in the blog post <a target="_blank" href="https://blog.avangards.io/adding-an-amazon-bedrock-knowledge-base-to-the-forex-rate-assistant">Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant</a>. I hope the explanations on key points and solutions to various issues in this blog post help you fast-track your IaC development for Amazon Bedrock solutions.</p>
<p>I will continue to evaluate different features of Amazon Bedrock, such as <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html">Guardrails for Amazon Bedrock</a>, as well as ways to streamline the data ingestion process for knowledge bases. Stay tuned for more helpful content on this topic and many others on the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a>. Happy learning!</p>
]]></content:encoded></item><item><title><![CDATA[How To Manage IAM Access Analyzer in AWS Organizations Using Terraform]]></title><description><![CDATA[Introduction
Since I began covering the implementation of security controls in AWS, I have provided walkthroughs on configuring Amazon GuardDuty and AWS Security Hub in a centralized setup using Terraform. In this blog post, we will explore another s...]]></description><link>https://blog.avangards.io/how-to-manage-iam-access-analyzer-in-aws-organizations-using-terraform</link><guid isPermaLink="true">https://blog.avangards.io/how-to-manage-iam-access-analyzer-in-aws-organizations-using-terraform</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Mon, 27 May 2024 15:59:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1717919073012/bc3ded87-f31c-468d-b552-b0a66f6c1331.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Since I began covering the <a target="_blank" href="https://blog.avangards.io/series/aws-ssb-terraform">implementation of security controls in AWS</a>, I have provided walkthroughs on configuring <a target="_blank" href="https://blog.avangards.io/how-to-manage-amazon-guardduty-in-aws-organizations-using-terraform">Amazon GuardDuty</a> and <a target="_blank" href="https://blog.avangards.io/how-to-manage-aws-security-hub-in-aws-organizations-using-terraform">AWS Security Hub</a> in a centralized setup using Terraform. In this blog post, we will explore another security service: AWS IAM Access Analyzer. This service helps identify unintended external access or unused access within your organization. Setting up IAM Access Analyzer is simpler than the other services, so let's dive right in!</p>
<h2 id="heading-about-the-use-case">About the use case</h2>
<p><a target="_blank" href="https://docs.aws.amazon.com/IAM/latest/UserGuide/what-is-access-analyzer.html">AWS Identity and Access Management (IAM) Access Analyzer</a> is a feature of AWS IAM that identifies resources shared with external entities and detects unused access, enabling you to mitigate any unintended or obsolete permissions.</p>
<p>IAM Access Analyzer <a target="_blank" href="https://aws.amazon.com/blogs/aws/new-use-aws-iam-access-analyzer-in-aws-organizations/">can be used in AWS Organizations</a>, allowing analyzers that use the organization as the zone of trust to be managed by either the management account or a <a target="_blank" href="https://docs.aws.amazon.com/IAM/latest/UserGuide/access-analyzer-settings.html">delegated administrator account</a>. This enables the consolidation of findings, which can then be ingested by AWS Security Hub in a centralized setup.</p>
<p>Since it is increasingly common to establish an AWS landing zone using <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/what-is-control-tower.html">AWS Control Tower</a>, we will use the <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/accounts.html">standard account structure</a> in a Control Tower landing zone to demonstrate how to configure IAM Access Analyzer in Terraform:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714976563379/60a68c93-c440-4c20-979c-49a3f4d842f3.png" alt="Control Tower standard OU and account structure" class="image--center mx-auto" /></p>
<p>The relevant accounts for our use case in the landing zone are:</p>
<ol>
<li><p>The <strong>Management</strong> account for the organization where AWS Organizations is configured. For details, refer to <a target="_blank" href="https://docs.aws.amazon.com/IAM/latest/UserGuide/access-analyzer-settings.html">Settings for IAM Access Analyzer</a>.</p>
</li>
<li><p>The <strong>Audit</strong> account where security and compliance services are typically centralized in a Control Tower landing zone.</p>
</li>
</ol>
<p>The objective is to delegate IAM Access Analyzer administrative duties from the <strong>Management</strong> account to the <strong>Audit</strong> account, after which all organization configurations are managed in the <strong>Audit</strong> account. With that said, let's see how we can achieve this using Terraform!</p>
<h2 id="heading-designating-an-iam-access-analyzer-administrator-account">Designating an IAM Access Analyzer administrator account</h2>
<p>The IAM Access Analyzer delegated administrator is configured in the <strong>Management</strong> account, so we need a provider associated with it in Terraform. To simplify the setup, we will use a multi-provider approach by defining two providers: one for the <strong>Management</strong> account and another for the <strong>Audit</strong> account. We will use AWS CLI profiles as follows:</p>
<pre><code class="lang-hcl">provider <span class="hljs-string">"aws"</span> {
  alias   = <span class="hljs-string">"management"</span>
  <span class="hljs-comment"># Use "aws configure" to create the "management" profile with the Management account credentials</span>
  profile = <span class="hljs-string">"management"</span> 
}

provider <span class="hljs-string">"aws"</span> {
  alias   = <span class="hljs-string">"audit"</span>
  <span class="hljs-comment"># Use "aws configure" to create the "audit" profile with the Audit account credentials</span>
  profile = <span class="hljs-string">"audit"</span> 
}
</code></pre>
<p>Unlike other security services that have specific Terraform resources for designating a delegated administrator, this is done using the more general <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/organizations_delegated_administrator"><code>aws_organizations_delegated_administrator</code> resource</a> as follows:</p>
<pre><code class="lang-hcl">data <span class="hljs-string">"aws_caller_identity"</span> <span class="hljs-string">"audit"</span> {
  provider = aws.audit
}

resource <span class="hljs-string">"aws_organizations_delegated_administrator"</span> <span class="hljs-string">"this"</span> {
  provider          = aws.management
  account_id        = data.aws_caller_identity.audit.account_id
  service_principal = <span class="hljs-string">"access-analyzer.amazonaws.com"</span>
}
</code></pre>
<p>With the <strong>Audit</strong> account designated as the IAM Access Analyzer administrator, we can now create the analyzers for the organization.</p>
<h2 id="heading-creating-analyzers-with-organizational-zone-of-trust">Creating analyzers with organizational zone of trust</h2>
<p>As mentioned earlier, there are two types of analyzers: external access and unused access. To make the setup more configurable, we will add some variables and keep them in a separate file called <code>variables.tf</code>. To create the external access analyzer with the organization as the zone of trust, we can define the Terraform configuration as follows:</p>
<pre><code class="lang-hcl"><span class="hljs-comment"># Defined in variables.tf</span>

variable <span class="hljs-string">"org_external_access_analyzer_name"</span> {
  description = <span class="hljs-string">"The name of the organization external access analyzer."</span>
  type        = string
  default     = <span class="hljs-string">"OrgExternalAccessAnalyzer"</span>
}
</code></pre>
<pre><code class="lang-hcl"><span class="hljs-comment"># Defined in main.tf</span>

resource <span class="hljs-string">"aws_accessanalyzer_analyzer"</span> <span class="hljs-string">"org_external_access"</span> {
  provider      = aws.audit
  analyzer_name = var.org_external_access_analyzer_name
  type          = <span class="hljs-string">"ORGANIZATION"</span>
  depends_on    = [aws_organizations_delegated_administrator.this]
}
</code></pre>
<p>Since the unused access analyzer is a paid feature, we ought to make it optional. The Terraform configuration can be defined in the following manner:</p>
<pre><code class="lang-hcl"><span class="hljs-comment"># Defined in variables.tf</span>

variable <span class="hljs-string">"org_unused_access_analyzer_name"</span> {
  description = <span class="hljs-string">"The name of the organization unused access analyzer."</span>
  type        = string
  default     = <span class="hljs-string">"OrgUnusedAccessAnalyzer"</span>
}

variable <span class="hljs-string">"enable_unused_access"</span> {
  description = <span class="hljs-string">"Whether organizational unused access analysis should be enabled."</span>
  type        = bool
  default     = false
}

variable <span class="hljs-string">"unused_access_age"</span> {
  description = <span class="hljs-string">"The specified access age in days for which to generate findings for unused access."</span>
  type        = number
  default     = <span class="hljs-number">90</span>
}
</code></pre>
<pre><code class="lang-hcl">resource <span class="hljs-string">"aws_accessanalyzer_analyzer"</span> <span class="hljs-string">"org_unused_access"</span> {
  provider      = aws.audit
  count         = var.enable_unused_access ? <span class="hljs-number">1</span> : <span class="hljs-number">0</span>
  analyzer_name = var.org_unused_access_analyzer_name
  type          = <span class="hljs-string">"ORGANIZATION_UNUSED_ACCESS"</span>
  configuration {
    unused_access {
      unused_access_age = var.unused_access_age
    }
  }
  depends_on = [aws_organizations_delegated_administrator.this]
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">You can find the complete Terraform in the <a target="_blank" href="https://github.com/acwwat/terraform-aws-accessanalyzer-organization-example">GitHub repository</a> that accompanies this blog post.</div>
</div>

<p>With the complete Terraform configuration, you can now apply it with the appropriate variable values to establish the <strong>Audit</strong> account as the delegated administrator and create the analyzers with the organization as the zone of trust.</p>
<h2 id="heading-additional-considerations">Additional considerations</h2>
<p>IAM Access Analyzer is a regional service, so you must create an analyzer in each region. However, this requirement primarily applies to external access analysis, which examines the policies of regional resources such as S3 buckets and KMS keys. Since unused access analysis works with IAM users and roles, which are global resources, creating multiple unused access analyzers would only increase costs without adding value. Therefore, it is recommended to create one external access analyzer per region and only one unused access analyzer in the home region.</p>
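<p>In practice, covering an additional region amounts to adding a provider alias for that region and duplicating only the external access analyzer. A sketch, using a hypothetical second region of <code>us-west-2</code>:</p>
<pre><code class="lang-hcl"># Sketch: external access analyzer in an additional (hypothetical) region;
# the unused access analyzer is deliberately not duplicated
provider "aws" {
  alias   = "audit_us_west_2"
  profile = "audit"
  region  = "us-west-2"
}

resource "aws_accessanalyzer_analyzer" "org_external_access_us_west_2" {
  provider      = aws.audit_us_west_2
  analyzer_name = var.org_external_access_analyzer_name
  type          = "ORGANIZATION"
  depends_on    = [aws_organizations_delegated_administrator.this]
}
</code></pre>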
<p>Another consideration is that there are times when the organizational zone of trust is not desirable. For example, if you wish to have full segregation of member accounts because they represent different tenants, then you would actually want analyzers created in each member account with itself as the zone of trust. This unfortunately would have to be managed at a per-account level.</p>
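<p>For that scenario, each member account would get its own analyzer with the account itself as the zone of trust, applied using that account's own credentials. A minimal sketch, with a hypothetical analyzer name:</p>
<pre><code class="lang-hcl"># Sketch: account-scoped analyzer, created separately in each member account
resource "aws_accessanalyzer_analyzer" "member_external_access" {
  analyzer_name = "MemberExternalAccessAnalyzer"
  type          = "ACCOUNT"
}
</code></pre>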
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, you learned how to manage IAM Access Analyzer in AWS Organizations using Terraform by defining a delegated administrator and using analyzers with the organization as the zone of trust. If you have also <a target="_blank" href="https://blog.avangards.io/how-to-manage-aws-security-hub-in-aws-organizations-using-terraform">configured AWS Security Hub to operate at the organization level</a>, you can manage IAM Access Analyzer findings across accounts and regions, thereby streamlining your security operations.</p>
<p>I hope you find this blog post helpful. Be sure to keep an eye out for more how-to articles on configuring other AWS security services in Terraform, or learn about other topics like <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">generative AI</a>, on the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Knowledge Base Support for the Generic Bedrock Agent Test UI]]></title><description><![CDATA[Introduction
In the blog post Developing a Generic Streamlit UI to Test Amazon Bedrock Agents, I shared the design and source code of a basic yet functional UI for testing Bedrock agents. Since then, I've explored Knowledge Bases for Amazon Bedrock a...]]></description><link>https://blog.avangards.io/knowledge-base-support-for-the-generic-bedrock-agent-test-ui</link><guid isPermaLink="true">https://blog.avangards.io/knowledge-base-support-for-the-generic-bedrock-agent-test-ui</guid><category><![CDATA[AWS]]></category><category><![CDATA[AI]]></category><category><![CDATA[Amazon Bedrock]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Tue, 21 May 2024 19:37:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1716103044626/6b4ff5c2-c7fc-4ecb-afc3-139266a2eaa7.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the blog post <a target="_blank" href="https://blog.avangards.io/developing-a-generic-streamlit-ui-to-test-amazon-bedrock-agents">Developing a Generic Streamlit UI to Test Amazon Bedrock Agents</a>, I shared the design and <a target="_blank" href="https://github.com/acwwat/amazon-bedrock-agent-test-ui">source code</a> of a basic yet functional UI for testing Bedrock agents. Since then, I've explored Knowledge Bases for Amazon Bedrock and shared my insights in another blog post, <a target="_blank" href="https://blog.avangards.io/adding-an-amazon-bedrock-knowledge-base-to-the-forex-rate-assistant">Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant</a>. If you haven't checked it out yet, I highly recommend doing so.</p>
<p>The Bedrock console offers additional features for testing agents that are integrated with knowledge bases, including citations in the responses and trace information about the retrieved results from knowledge bases:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716101350345/6505b2d1-201a-4df5-9dc1-61de095eabd5.png" alt="Knowledge base citations and traces" class="image--center mx-auto" /></p>
<p>With a bit of work, I have added similar support to the generic test UI and I am happy to share the updates in the <a target="_blank" href="https://github.com/acwwat/amazon-bedrock-agent-test-ui">GitHub repository</a>.</p>
<h2 id="heading-design-overview">Design overview</h2>
<p>With the latest update, citations are now added to the response in a manner similar to how they are displayed in the Bedrock console:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716101361882/cae8d9e7-bd1d-4e8e-b110-8f46a371e207.png" alt="Citations in the response" class="image--center mx-auto" /></p>
<p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_InvokeAgent.html">Agent for Bedrock Runtime API</a> provides, for each citation, the start and end index of the text that references the source as well as the document location in S3. Through string manipulation, citation numbers are incorporated into the response text, and references are appended to the end of the text. Spacing is a bit finicky due to the use of <a target="_blank" href="https://docs.streamlit.io/develop/api-reference/text/st.markdown">markdown</a>, so some HTML markup is used.</p>
<p>Another feature is the inclusion of citation details in the <strong>Trace</strong> section of the left pane:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716102060663/ab2cbcf6-adfd-47a9-8ca3-7dd2a0e20574.png" alt="Citation details" class="image--center mx-auto" /></p>
<p>Each citation block provides the raw output from the API response and includes the information mentioned earlier, as well as the raw results retrieved by the knowledge base.</p>
<p>Additionally, the trace blocks from the original test UI can now provide more detailed information for knowledge base invocations. You will find all references retrieved from the knowledge base, which form a superset of the citations in the final response after the model processes them.</p>
<h2 id="heading-summary">Summary</h2>
<p>With the improvements to the generic test UI outlined in this post, you should now be able to test any Bedrock agents with attached knowledge bases. I hope you find this update helpful and look forward to further enhancements as I continue on my Amazon Bedrock journey.</p>
<p>Be sure to explore other posts on the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a> to learn more about generative AI in AWS, Terraform, and other technical topics. Have a great day!</p>
]]></content:encoded></item><item><title><![CDATA[Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant]]></title><description><![CDATA[Introduction
In our journey of experimenting with Amazon Bedrock up to this point, we have built a basic forex assistant as the basis for further enhancements to evaluate various Bedrock features and generative AI (gen AI) techniques. Our next step i...]]></description><link>https://blog.avangards.io/adding-an-amazon-bedrock-knowledge-base-to-the-forex-rate-assistant</link><guid isPermaLink="true">https://blog.avangards.io/adding-an-amazon-bedrock-knowledge-base-to-the-forex-rate-assistant</guid><category><![CDATA[AWS]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Amazon Bedrock]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Fri, 17 May 2024 05:37:45 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1715191582979/4025c465-91fb-426b-95ea-765bcfba36e6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In our journey of experimenting with Amazon Bedrock up to this point, we have built a <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">basic forex assistant</a> as the basis for further enhancements to evaluate various Bedrock features and generative AI (gen AI) techniques. Our next step is to integrate a knowledge base to the agent, so that it can provide information about foreign currency exchange in general.</p>
<p>In this blog post, we will define a representative RAG use case for the forex rate agent, build a forex knowledge base, and attach it to the agent. The accuracy and performance of a gen AI application are also essential, so we'll conduct some tests and discuss challenges associated with RAG workflows.</p>
<h2 id="heading-about-knowledge-bases-for-amazon-bedrock">About Knowledge Bases for Amazon Bedrock</h2>
<p><a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html">Knowledge Bases for Amazon Bedrock</a> is a service that provides a managed capability for implementing <a target="_blank" href="https://aws.amazon.com/what-is/retrieval-augmented-generation/">Retrieval Augmented Generation (RAG)</a> workflows. Knowledge bases can be integrated with Bedrock agents to seamlessly enable RAG functionality, or used as a component in custom AI-enabled applications through its API.</p>
<p>Knowledge Bases for Amazon Bedrock automates the ingestion of source documents, by generating embeddings with a foundation model, such as <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html">Amazon Titan</a> or <a target="_blank" href="https://aws.amazon.com/bedrock/cohere-command-embed/">Cohere Embed</a>, and storing them in a supported vector store as depicted in the following diagram:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715226225256/1a8e677f-0cd0-4733-bba0-ded832a99c86.png" alt="Bedrock knowledge base data pre-processing" class="image--center mx-auto" /></p>
<p>To keep things simple, the service provides a quick start option that provisions on your behalf an <a target="_blank" href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-overview.html">Amazon OpenSearch Serverless</a> vector database for its use.</p>
<p>Aside from <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-kb-add.html">native integration into a Bedrock agent</a>, the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Agents_for_Amazon_Bedrock_Runtime.html">Agents for Amazon Bedrock Runtime API</a> offers both the ability to perform raw text and semantic searches on knowledge bases, and the ability to retrieve results and generate a response with a foundation model. The latter allows the community to provide tighter integration in frameworks such as <a target="_blank" href="https://js.langchain.com/docs/integrations/retrievers/bedrock-knowledge-bases">LangChain</a> and <a target="_blank" href="https://docs.llamaindex.ai/en/latest/examples/retrievers/bedrock_retriever/">LlamaIndex</a> to simplify RAG scenarios. The runtime flow is shown in this diagram:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715226279547/ca6d1532-f6b9-4511-b7c6-6a6d922247b8.png" alt="Bedrock knowledge base runtime execution" class="image--center mx-auto" /></p>
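<p>For reference, the retrieve-and-generate runtime flow can be sketched in Python with boto3. This is a minimal illustration rather than production code; the knowledge base ID and model ARN in the usage comment are placeholders you would substitute with your own values:</p>
<pre><code class="lang-python">def build_rag_request(question, kb_id, model_arn):
    """Assemble the request payload for the RetrieveAndGenerate operation."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

def rag_query(client, question, kb_id, model_arn):
    """Run a retrieve-and-generate query and return the generated answer."""
    response = client.retrieve_and_generate(**build_rag_request(question, kb_id, model_arn))
    return response["output"]["text"]

# Example usage (requires AWS credentials and model access):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   print(rag_query(client, "What is the FX Global Code?", "MY_KB_ID", "MY_MODEL_ARN"))
</code></pre>
<p>The same <code>bedrock-agent-runtime</code> client also exposes a <code>retrieve</code> operation that returns raw source chunks without invoking a foundation model.</p>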
<h2 id="heading-enhancing-the-forex-rate-assistant-use-case">Enhancing the forex rate assistant use case</h2>
<p>In the blog post <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">Building a Basic Forex Rate Assistant Using Agents for Amazon Bedrock</a>, we created a basic forex assistant that helps users look up the latest forex rates. It would be helpful if the assistant could also answer other questions on the broader topic.</p>
<p>While information about the history of forex would be useful, the Claude models already possess such knowledge so it would not make a great use case for knowledge bases. As I searched for more specific subtopics, I found the <a target="_blank" href="https://www.globalfxc.org/fx_global_code.htm">FX Global Code</a>, a set of common guidelines developed by the <a target="_blank" href="https://www.globalfxc.org/overview.htm">Global Foreign Exchange Committee (GFXC)</a> which establishes universal principles to uphold integrity and ensure the effective operation of the wholesale FX market. The FX Global Code is conveniently available in <a target="_blank" href="https://www.globalfxc.org/docs/fx_global.pdf">PDF format</a>, which is perfect for ingestion by the knowledge base.</p>
<h2 id="heading-requesting-model-access-and-creating-the-s3-bucket-for-document-ingestion">Requesting model access and creating the S3 bucket for document ingestion</h2>
<p>Let's start with the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html">prerequisites</a> by <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html">requesting model access</a>. For the forex knowledge base, we will be using the <strong>Titan Embeddings G1 - Text</strong> model. You can review the model pricing information <a target="_blank" href="https://aws.amazon.com/bedrock/pricing/">here</a>.</p>
<p>Next, we need to create the S3 bucket from which the knowledge base will ingest source documents. We can quickly do so in the S3 Console with the following settings:</p>
<ul>
<li><p><strong>Bucket name:</strong> forex-kb-<em>&lt;region&gt;</em>-<em>&lt;account_id&gt;</em> (such as <code>forex-kb-use1-123456789012</code>)</p>
</li>
<li><p><strong>Block all public access:</strong> Checked (by default)</p>
</li>
<li><p><strong>Bucket versioning:</strong> Enable</p>
</li>
<li><p><strong>Default encryption:</strong> SSE-S3 (by default)</p>
</li>
</ul>
<p>Once the S3 bucket is created, download the <a target="_blank" href="https://www.globalfxc.org/docs/fx_global.pdf">FX Global Code PDF file</a> and upload it to the bucket:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715230440237/bf921a87-489c-4116-ac6b-0626aebb3666.png" alt="S3 bucket and source document for the knowledge base" class="image--center mx-auto" /></p>
<p>This is sufficient for our purpose. For more information on other supported document formats and adding metadata for the filtering feature, refer to the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-ds.html">Amazon Bedrock user guide</a>.</p>
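<p>If you prefer scripting to the console, the same bucket setup can be sketched in Python with boto3. The helper names and file path below are illustrative assumptions, and the sketch assumes the us-east-1 region (other regions require a <code>CreateBucketConfiguration</code>):</p>
<pre><code class="lang-python">def bucket_name(region_code, account_id):
    """Build the bucket name following the forex-kb-{region}-{account_id} convention."""
    return f"forex-kb-{region_code}-{account_id}"

def create_kb_bucket(s3, name, pdf_path):
    """Create the versioned ingestion bucket and upload the source PDF."""
    s3.create_bucket(Bucket=name)  # public access is blocked by default on new buckets
    s3.put_bucket_versioning(Bucket=name, VersioningConfiguration={"Status": "Enabled"})
    s3.upload_file(pdf_path, name, "fx_global.pdf")

# Example usage (requires AWS credentials):
#   import boto3
#   account_id = boto3.client("sts").get_caller_identity()["Account"]
#   create_kb_bucket(boto3.client("s3"), bucket_name("use1", account_id), "fx_global.pdf")
</code></pre>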
<h2 id="heading-creating-a-knowledge-base-along-with-a-vector-database">Creating a knowledge base along with a vector database</h2>
<p>Next, we can create the knowledge base in the Amazon Bedrock console following the steps below:</p>
<ol>
<li><p>Select <strong>Knowledge bases</strong> in the left menu.</p>
</li>
<li><p>On the <strong>Knowledge bases</strong> page, click <strong>Create knowledge base</strong>.</p>
</li>
<li><p>In <strong>Step 1</strong> of the <strong>Create knowledge base</strong> wizard, enter the following information and click <strong>Next</strong>:</p>
<ul>
<li><p><strong>Knowledge base name:</strong> ForexKB</p>
</li>
<li><p><strong>Knowledge base description:</strong> A knowledge base with information on foreign currency exchange.</p>
</li>
</ul>
</li>
</ol>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715231341541/7e920c96-c5af-4e7e-a828-f69a3fb8a38b.png" alt="Create knowledge base - step 1" class="image--center mx-auto" /></p>
<ol start="4">
<li><p>In <strong>Step 2</strong> of the wizard, enter the following information and click <strong>Next</strong>:</p>
<ul>
<li><p><strong>Data source name:</strong> ForexKBDataSource</p>
</li>
<li><p><strong>S3 URI:</strong> <em>Browse and select the S3 bucket that we created earlier</em></p>
</li>
</ul>
</li>
</ol>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715231662843/4805d9cf-357c-4839-b596-6a7fe27eae85.png" alt="Create knowledge base - step 2" class="image--center mx-auto" /></p>
<ol start="5">
<li><p>In <strong>Step 3</strong> of the wizard, enter the following information and click <strong>Next:</strong></p>
<ul>
<li><p><strong>Embeddings model:</strong> Titan Embeddings G1 - Text v1.2</p>
</li>
<li><p><strong>Vector database:</strong> Quick create a new vector store</p>
</li>
</ul>
</li>
</ol>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715231886589/85bfbf57-028b-41dc-b787-dc4fcdf0cee8.png" alt="Create knowledge base - step 3" class="image--center mx-auto" /></p>
<ol start="6">
<li>In <strong>Step 4</strong> of the wizard, click <strong>Create knowledge base</strong>.</li>
</ol>
<p>The knowledge base and the vector database, which is an Amazon OpenSearch Serverless collection, will take a few minutes to create. When they are ready, you'll be directed to the knowledge base page, where you will be prompted to sync the data source. To do so, scroll down to the <strong>Data source</strong> section, select the radio button beside the data source name, and click <strong>Sync</strong>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715232890078/b1203463-0662-4a5a-a774-e5cf32539351.png" alt="Sync data source" class="image--center mx-auto" /></p>
<p>It will take less than a minute to complete, since we only have a single moderately-sized PDF document. Now the knowledge base is ready for some validation.</p>
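<p>The sync can also be triggered programmatically via an ingestion job. Below is a hedged sketch using the boto3 <code>bedrock-agent</code> control-plane client; the helper name and polling interval are assumptions, and the IDs in the usage comment are placeholders:</p>
<pre><code class="lang-python">import time

def sync_data_source(client, kb_id, ds_id, poll_seconds=10):
    """Start an ingestion job and poll until it reaches a terminal status."""
    job = client.start_ingestion_job(knowledgeBaseId=kb_id, dataSourceId=ds_id)["ingestionJob"]
    while job["status"] not in ("COMPLETE", "FAILED"):
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job["status"]

# Example usage (requires AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent")  # control-plane client, not the runtime client
#   print(sync_data_source(client, "MY_KB_ID", "MY_DS_ID"))
</code></pre>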
<h2 id="heading-testing-the-knowledge-base">Testing the knowledge base</h2>
<p>A knowledge base must provide accurate results, so we need to validate it against information we know exists in the source documents. This can be done using the integrated test pane. To simulate an end-to-end test for the RAG scenario, configure the test environment in the Bedrock console as follows:</p>
<ol>
<li><p>Enable the <strong>Generate response</strong> option (which should already be enabled by default).</p>
</li>
<li><p>Click <strong>Select model</strong>. In the new dialog, select the <strong>Claude 3 Haiku</strong> model and click <strong>Apply</strong>.</p>
</li>
<li><p>Click on the button with the three sliders, which opens the <strong>Configuration</strong> page. This should expand the test pane so you have more screen real estate to work with.</p>
</li>
</ol>
<p>I've prepared a couple of questions after skimming through the FX Global Code PDF file. Let's start by asking a basic question:</p>
<blockquote>
<p>What is the FX Global Code?</p>
</blockquote>
<p>The knowledge base responded with an answer that's consistent with the text on page 3 of the document.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715234190388/3ff98276-05a6-4d87-b6c1-50c41443709a.png" alt="Search result for a basic question" class="image--center mx-auto" /></p>
<p>To see the underlying search results which the model used to generate the response, click <strong>Show source details</strong>. Similar to the agent trace, we can view the source chunks that are related to our question, and the associated raw text and metadata (which is mainly the citation information). Some source chunks refer to the table of contents, while others refer to the same passage from page 3 of the document.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715234841154/15889c33-fe5d-4496-8cbb-8acf9fd93a4e.png" alt="Source details with raw text and metadata" class="image--center mx-auto" /></p>
<p>Next, let's ask something more specific, namely about principle 15, which is on page 33 of the PDF file:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715235094470/8b2ae7b5-f7a6-41c6-858d-7d61a7426051.png" alt="Principle 15 from page 33 of the FX Global Code PDF file" class="image--center mx-auto" /></p>
<blockquote>
<p>What is principle 15 in the FX Global Code?</p>
</blockquote>
<p>Interestingly, the knowledge base doesn't seem to know the answer:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715235240518/2b70b2ff-1260-4d20-a706-b44599be3a03.png" alt="No result from question about principle 15 in the FX Global Code" class="image--center mx-auto" /></p>
<p>If I force the knowledge base to use <a target="_blank" href="https://aws.amazon.com/about-aws/whats-new/2024/03/knowledge-bases-amazon-bedrock-hybrid-search/">hybrid search</a>, which combines both semantic and text search for better responses, some source chunks were fetched, but they do not seem to include one with the text from page 33.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715235450852/2d70685f-3de1-4f97-94f6-d82024f42d60.png" alt="Search results with hybrid search" class="image--center mx-auto" /></p>
<p>Since there are exactly five results, I figured that the search might be limited by the maximum number of retrieved results. After increasing it to an arbitrary 20, the knowledge base finally returned a good response with the default search option:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715235800916/06729afc-1f07-471d-92bc-5f5d3590ae24.png" alt="Better response after setting maximum number of retrieved results" class="image--center mx-auto" /></p>
<p>This goes to show that just like agents, knowledge bases must be tested and fine-tuned extensively to improve accuracy. The embedding model as well as the underlying vector store may also play a part in the overall behavior of the knowledge base.</p>
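<p>The retrieval settings we just tuned in the test pane, namely the maximum number of retrieved results and the search type override, can also be supplied through the Retrieve API. A small Python sketch, where the helper name is an assumption and the knowledge base ID in the usage comment is a placeholder:</p>
<pre><code class="lang-python">def build_retrieval_config(num_results=20, search_type=None):
    """Build a retrieval configuration; search_type may be "HYBRID" or "SEMANTIC"."""
    vector_config = {"numberOfResults": num_results}
    if search_type is not None:
        vector_config["overrideSearchType"] = search_type
    return {"vectorSearchConfiguration": vector_config}

# Example usage (requires AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.retrieve(
#       knowledgeBaseId="MY_KB_ID",
#       retrievalQuery={"text": "What is principle 15 in the FX Global Code?"},
#       retrievalConfiguration=build_retrieval_config(num_results=20, search_type="HYBRID"),
#   )
</code></pre>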
<p>In any case, you've successfully created a knowledge base using Knowledge Bases for Amazon Bedrock, which can be integrated with your gen AI applications. To complete our experimentation, let's now integrate it with our forex rate assistant.</p>
<h2 id="heading-integrating-the-knowledge-base-to-the-forex-rate-assistant">Integrating the knowledge base to the forex rate assistant</h2>
<p>If you haven't already done so, please follow the blog post <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">Building a Basic Forex Rate Assistant Using Agents for Amazon Bedrock</a> to create the forex rate assistant manually, or use the Terraform configuration from the blog post <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform">How To Manage an Amazon Bedrock Agent Using Terraform</a> to deploy it.</p>
<p>Once the agent is ready, we can <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-kb-add.html">associate the knowledge base</a> with it using the steps below in the Bedrock Console:</p>
<ol>
<li><p>Select <strong>Agents</strong> in the left menu.</p>
</li>
<li><p>On the <strong>Agents</strong> page, click <strong>ForexAssistant</strong> to open it.</p>
</li>
<li><p>On the agent page, click <strong>Edit in Agent Builder</strong>.</p>
</li>
<li><p>On the <strong>Agent builder</strong> page, scroll down to the <strong>Knowledge bases</strong> section and click <strong>Add</strong>.</p>
</li>
<li><p>On the <strong>Add knowledge base</strong> page, enter the following information and click <strong>Add:</strong></p>
<ul>
<li><p><strong>Select knowledge base:</strong> ForexKB</p>
</li>
<li><p><strong>Knowledge base instructions for Agent:</strong> Use this knowledge base to retrieve information on foreign currency exchange, such as the FX Global Code.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715237350498/1da80df2-23be-4003-a7fb-df36cf38ec8f.png" alt="Add knowledge base" class="image--center mx-auto" /></p>
</li>
</ul>
</li>
<li><p>Click <strong>Save and exit</strong>.</p>
</li>
</ol>
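<p>The console steps above correspond to two control-plane calls, which can be sketched in Python with boto3 as follows; the helper name is an assumption and the IDs in the usage comment are placeholders:</p>
<pre><code class="lang-python">def associate_and_prepare(client, agent_id, kb_id, instructions):
    """Attach the knowledge base to the agent's draft version, then re-prepare the agent."""
    client.associate_agent_knowledge_base(
        agentId=agent_id,
        agentVersion="DRAFT",  # knowledge bases are attached to the working draft
        knowledgeBaseId=kb_id,
        description=instructions,  # surfaced as the knowledge base instructions for the agent
    )
    return client.prepare_agent(agentId=agent_id)["agentStatus"]

# Example usage (requires AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent")
#   associate_and_prepare(client, "MY_AGENT_ID", "MY_KB_ID",
#                         "Use this knowledge base to retrieve information on foreign currency exchange.")
</code></pre>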
<p>Once the knowledge base is added, prepare the agent and ask the same first question as before to validate the integration:</p>
<blockquote>
<p>What is the FX Global Code?</p>
</blockquote>
<p>The agent responded with a decent answer. In the trace, we can see that the agent invoked the knowledge base as part of its toolset and retrieved the results for its use.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716100093792/94ef6728-0053-40db-9fb8-bb6f94906956.png" alt="Agent performing knowledge base search" class="image--center mx-auto" /></p>
<p>We also want to ask the agent to fetch an exchange rate to ensure that the existing functionality is still working:</p>
<blockquote>
<p>What is the exchange rate from EUR to CAD?</p>
</blockquote>
<p>The agent responded with the rate fetched from the <code>ForexAPI</code> action group, which is what we expected.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715238290040/2a25c529-c3e4-4418-87d2-4850bf792574.png" alt="Agent fetching forex rate as expected" class="image--center mx-auto" /></p>
<p>However, we run into issues when asking the second question from before:</p>
<blockquote>
<p>What is principle 15 in the FX Global Code?</p>
</blockquote>
<p>The agent responded with an inferior answer since we did not adjust the maximum number of retrieval results for the knowledge base.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715238502777/a98038c9-5f81-4ad2-b747-21de69756f52.png" alt="Undesirable search results as before" class="image--center mx-auto" /></p>
<p>Unfortunately, there is no way that I know of to provide this knowledge base configuration through the agent, so we are stuck. At this point, there's nothing we can do other than open an AWS support case to inquire about the missing capability. That being said, another way to look at the problem is that the quality of the source documents could also affect the knowledge base's search accuracy, which brings us to the topic of common RAG challenges.</p>
<h2 id="heading-common-rag-challenges">Common RAG challenges</h2>
<p>Let's examine the source chunk from the correct answer for the "principle 15" question from our knowledge base test:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715269589983/69a4502c-3e0f-4216-b70b-4c11278ea65b.png" alt="Source chunk containing principle 15 in the FX Global Code" class="image--center mx-auto" /></p>
<p>This is also the text that was extracted from the PDF for embedding. Comparing it to the corresponding page in the PDF file, notice the following:</p>
<ol>
<li><p>The chunk text includes information such as headers and "breadcrumbs" that are not related to the main content.</p>
</li>
<li><p>The text does not capture the context of the elements in the passage, such as the principle title in the red box and the principle summary in italic.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715288471662/47cd7dce-1e10-4978-9c61-547f4cbaf125.png" alt="The format of the PDF page" class="image--center mx-auto" /></p>
<p>It's fair to think that undesirable artifacts and lack of structural context would impact search accuracy, performance, and ultimately cost. Consequently, it makes sense to perform some data pre-processing before passing the source documents to the RAG workflow. Third-party APIs and tools, such as <a target="_blank" href="https://github.com/run-llama/llama_parse">LlamaParse</a> and <a target="_blank" href="https://www.llamaindex.ai/blog/mastering-pdfs-extracting-sections-headings-paragraphs-and-tables-with-cutting-edge-parser-faea18870125">LayoutPDFReader</a>, can help with pre-processing PDF data; however, keep in mind that source documents may take any form and there is no one-size-fits-all solution. You may have to resort to developing custom processes for pre-processing and searching your unique data.</p>
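<p>As a trivial illustration of such pre-processing, the sketch below strips lines that recur across many pages, such as running headers and breadcrumbs, from already-extracted page text. The function name and threshold are assumptions, and real documents will likely require far more tailored logic:</p>
<pre><code class="lang-python">from collections import Counter

def strip_recurring_lines(pages, min_occurrences=3):
    """Drop non-blank lines that repeat across pages, such as running headers and breadcrumbs."""
    # Count each distinct line once per page so only cross-page repeats are flagged
    counts = Counter(line for page in pages for line in set(page.splitlines()))
    cleaned = []
    for page in pages:
        kept = [
            line for line in page.splitlines()
            if not (line.strip() and counts[line] >= min_occurrences)
        ]
        cleaned.append("\n".join(kept))
    return cleaned
</code></pre>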
<p>There are other <a target="_blank" href="https://datasciencedojo.com/blog/challenges-in-rag-based-llm-applications/">challenges in building RAG-based LLM applications</a> and proposed solutions which you should be aware of. However, some of them cannot be implemented in a managed solution such as Knowledge Bases for Amazon Bedrock, in which case you may need to build a custom solution yourself if you have a genuine need to address them. Such is the eternal quest to balance between effort and quality.</p>
<h2 id="heading-dont-forget-to-delete-the-opensearch-serverless-collection">Don't forget to delete the OpenSearch Serverless collection</h2>
<p>Be aware that Knowledge Bases for Amazon Bedrock does not delete the vector database for you. Since the OpenSearch Serverless collection consumes at least one OpenSearch Compute Unit (OCU) which is <a target="_blank" href="https://aws.amazon.com/opensearch-service/pricing/">charged by the hour</a>, you will incur a running cost for as long as the collection exists. Consequently, ensure that you manually <a target="_blank" href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-manage.html#serverless-delete">delete the collection</a> after you have deleted the knowledge base and other associated artifacts.</p>
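<p>A hedged Python sketch for cleaning up the collection with boto3; the helper name is an assumption, and the collection name in the usage comment is whatever the quick create flow generated for you:</p>
<pre><code class="lang-python">def delete_collection_by_name(client, name):
    """Look up an OpenSearch Serverless collection by name and delete it if found."""
    summaries = client.list_collections(collectionFilters={"name": name})["collectionSummaries"]
    if not summaries:
        return None
    client.delete_collection(id=summaries[0]["id"])
    return summaries[0]["id"]

# Example usage (requires AWS credentials):
#   import boto3
#   client = boto3.client("opensearchserverless")
#   delete_collection_by_name(client, "NAME_OF_QUICK_CREATED_COLLECTION")
</code></pre>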
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, we created a knowledge base using Knowledge Bases for Amazon Bedrock and integrated it into the <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">forex rate assistant</a> to allow it to answer questions about the FX Global Code. Through some testing of the solution, we encountered some common challenges for RAG solutions and discussed potential mitigation strategies. Some of these mitigations are not applicable to Bedrock knowledge bases since the service abstracts the implementation details, which highlights the potential need for a custom solution in more demanding scenarios.</p>
<p>My next step is to enhance the <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform">Terraform configuration for the forex rate assistant</a> to provision and integrate the knowledge base, and to enhance the <a target="_blank" href="https://blog.avangards.io/developing-a-generic-streamlit-ui-to-test-amazon-bedrock-agents">Streamlit test app</a> to display citations from knowledge base searches. Be sure to follow the <a target="_blank" href="https://blog.avangards.io/">Avangards Blog</a> as I continue my journey on building gen AI applications using Amazon Bedrock and other AWS services. Thanks for reading and stay curious!</p>
]]></content:encoded></item><item><title><![CDATA[How To Manage AWS Security Hub in AWS Organizations Using Terraform]]></title><description><![CDATA[Introduction
Earlier I've published the blog post How To Manage Amazon GuardDuty in AWS Organizations Using Terraform which is essential in establishing threat detection as part of a security baseline, such as the AWS Security Baseline which I covere...]]></description><link>https://blog.avangards.io/how-to-manage-aws-security-hub-in-aws-organizations-using-terraform</link><guid isPermaLink="true">https://blog.avangards.io/how-to-manage-aws-security-hub-in-aws-organizations-using-terraform</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Fri, 10 May 2024 05:27:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1717919103029/a2faa3ed-267b-4fff-8326-fc4f6affb019.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Earlier, I published the blog post <a target="_blank" href="https://blog.avangards.io/how-to-manage-amazon-guardduty-in-aws-organizations-using-terraform">How To Manage Amazon GuardDuty in AWS Organizations Using Terraform</a>, which is essential in establishing threat detection as part of a security baseline, such as the AWS Security Baseline which I covered extensively in the blog series <a target="_blank" href="https://blog.avangards.io/series/aws-ssb-terraform">How to implement the AWS Startup Security Baseline (SSB) using Terraform</a>.</p>
<p>Similarly, a good security baseline must include the means to manage the security posture, which is achieved using Security Hub in AWS. In this blog post, I will walk you through the steps to configure Security Hub with central configuration in Terraform.</p>
<h2 id="heading-about-the-use-case">About the use case</h2>
<p><a target="_blank" href="https://docs.aws.amazon.com/securityhub/latest/userguide/what-is-securityhub.html">AWS Security Hub</a> is a security service that helps you manage security posture by collecting security data from AWS and third-party sources, and enabling analysis and remediation of security issues that are found.</p>
<p>Late last year, <a target="_blank" href="https://aws.amazon.com/blogs/security/introducing-new-central-configuration-capabilities-in-aws-security-hub/">AWS introduced new central configuration capabilities in AWS Security Hub</a> in the form of Security Hub configuration policies (SHCPs). With SHCPs, we can customize many aspects of the Security Hub configuration which can be consistently applied to all members of the organization. This addresses many challenges with managing Security Hub across an organization which I experienced first hand last year. It was practically futile to build Security Hub enablement into <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/aft-overview.html">AWS Control Tower Account Factory for Terraform (AFT)</a>! As this is the new best practice, we'll be using this feature.</p>
<p>Since it is increasingly common to establish an AWS landing zone using <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/what-is-control-tower.html">AWS Control Tower</a>, we will use the <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/accounts.html">standard account structure</a> in a Control Tower landing zone to demonstrate how to configure Security Hub in Terraform:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1713944303699/07114921-91b8-492e-8acc-5ebd3b4f4b64.png" alt="Control Tower standard OU and account structure" class="image--center mx-auto" /></p>
<p>The relevant accounts for our use case in the landing zone are:</p>
<ol>
<li><p>The <strong>Management</strong> account for the organization where AWS Organizations is configured. For details, refer to <a target="_blank" href="https://docs.aws.amazon.com/securityhub/latest/userguide/designate-orgs-admin-account.html">Integrating Security Hub with AWS Organizations</a>.</p>
</li>
<li><p>The <strong>Audit</strong> account where security and compliance services are typically centralized in a Control Tower landing zone.</p>
</li>
</ol>
<p>The objective is to delegate Security Hub administrative duties from the <strong>Management</strong> account to the <strong>Audit</strong> account, after which all organization configurations are managed in the <strong>Audit</strong> account. With that said, let's see how we can achieve this using Terraform!</p>
<h2 id="heading-designating-a-security-hub-administrator-account">Designating a Security Hub administrator account</h2>
<p>The Security Hub delegated administrator is configured in the <strong>Management</strong> account, so we need a provider associated with it in Terraform. To keep things simple, we will take a multi-provider approach by defining two providers, one for the <strong>Management</strong> account and another for the <strong>Audit</strong> account, using AWS CLI profiles as follows:</p>
<pre><code class="lang-hcl">provider "aws" {
  alias   = "management"
  # Use "aws configure" to create the "management" profile with the Management account credentials
  profile = "management"
}

provider "aws" {
  alias   = "audit"
  # Use "aws configure" to create the "audit" profile with the Audit account credentials
  profile = "audit"
}
</code></pre>
<p>We can then use the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/securityhub_organization_admin_account"><code>aws_securityhub_organization_admin_account</code> resource</a> to set the delegated administrator. However, I noticed the following in the <strong>Audit</strong> account:</p>
<ul>
<li><p>After this resource is created, Security Hub will be enabled with the default standards (AWS Foundational Security Best Practices v1.0.0 and CIS AWS Foundations Benchmark v1.2.0).</p>
</li>
<li><p>When the resource is deleted, Security Hub remains enabled.</p>
</li>
</ul>
<p>These side effects are undesirable since ideally, we want full control over the lifecycle and configuration of Security Hub in Terraform. To address this issue, we will preemptively enable Security Hub in the <strong>Audit</strong> account using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/securityhub_account"><code>aws_securityhub_account</code> resource</a>. Later, we will also apply the same configuration policy that will be associated with the organization.</p>
<pre><code class="lang-hcl">data "aws_caller_identity" "audit" {
  provider = aws.audit
}

resource "aws_securityhub_account" "audit" {
  provider                 = aws.audit
  enable_default_standards = false
}

resource "aws_securityhub_organization_admin_account" "this" {
  provider         = aws.management
  admin_account_id = data.aws_caller_identity.audit.account_id
  depends_on       = [aws_securityhub_account.audit]
}
</code></pre>
<p>With the <strong>Audit</strong> account designated as the Security Hub administrator, we can now manage the organization configuration.</p>
<h2 id="heading-configuring-cross-region-aggregation">Configuring cross-region aggregation</h2>
<p>Security Hub provides a <a target="_blank" href="https://docs.aws.amazon.com/securityhub/latest/userguide/finding-aggregation.html">cross-region aggregation</a> feature that centralizes findings, finding updates, insights, control compliance statuses, and security scores from multiple regions into a single region. Being able to review all findings in one place is incredibly useful for security analysts. We can enable this feature for all regions using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/securityhub_finding_aggregator"><code>aws_securityhub_finding_aggregator</code> resource</a> in Terraform as follows:</p>
<pre><code class="lang-hcl">resource "aws_securityhub_finding_aggregator" "this" {
  provider     = aws.audit
  linking_mode = "ALL_REGIONS"
  depends_on   = [aws_securityhub_account.audit]
}
</code></pre>
<h2 id="heading-enabling-central-configuration">Enabling central configuration</h2>
<p>First, we need to apply the organization configuration to enable central configuration. Since the settings are defined in a configuration policy, we need to disable all settings that are related to local configuration. We will achieve this using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/securityhub_organization_configuration"><code>aws_securityhub_organization_configuration</code> resource</a>:</p>
<pre><code class="lang-hcl">resource "aws_securityhub_organization_configuration" "this" {
  provider              = aws.audit
  auto_enable           = false
  auto_enable_standards = "NONE"
  organization_configuration {
    configuration_type = "CENTRAL"
  }
  depends_on = [
    aws_securityhub_organization_admin_account.this,
    aws_securityhub_finding_aggregator.this
  ]
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">If you have enabled delegated administrator at some point prior to <a target="_blank" href="https://aws.amazon.com/about-aws/whats-new/2023/11/aws-security-hub-central-configuration/">November 2023 when the central configuration feature was released</a>, you may encounter a <code>DataUnavailableException</code> indicating that the organization data is still syncing when you create the organization configuration. To resolve this error, open an AWS support case to have them fix the data in the backend.</div>
</div>

<h2 id="heading-creating-and-associating-a-configuration-policy">Creating and associating a configuration policy</h2>
<p>With the organization configuration primed, we can now create and associate a configuration policy. This can be done with the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/securityhub_configuration_policy"><code>aws_securityhub_configuration_policy</code> resource</a> and the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/securityhub_configuration_policy_association"><code>aws_securityhub_configuration_policy_association</code> resource</a>.</p>
<p>For illustration, let's assume that we want to enable only the <a target="_blank" href="https://docs.aws.amazon.com/securityhub/latest/userguide/cis-aws-foundations-benchmark.html">CIS AWS Foundations Benchmark v1.4.0</a> standard across the organization. We also want to disable the control <a target="_blank" href="https://docs.aws.amazon.com/securityhub/latest/userguide/iam-controls.html#iam-6">[IAM.6] Hardware MFA should be enabled for the root user</a>.</p>
<p>The configuration policy can be defined in Terraform as follows:</p>
<pre><code class="lang-dockerfile">data <span class="hljs-string">"aws_region"</span> <span class="hljs-string">"audit"</span> {
  provider = aws.audit
}

data <span class="hljs-string">"aws_partition"</span> <span class="hljs-string">"audit"</span> {
  provider = aws.audit
}

resource <span class="hljs-string">"aws_securityhub_configuration_policy"</span> <span class="hljs-string">"this"</span> {
  provider    = aws.audit
  name        = <span class="hljs-string">"ExamplePolicy"</span>
  description = <span class="hljs-string">"This is an example SHCP."</span>
  configuration_policy {
    service_enabled       = true
    enabled_standard_arns = [<span class="hljs-string">"arn:${data.aws_partition.audit.partition}:securityhub:${data.aws_region.audit.name}::standards/cis-aws-foundations-benchmark/v/1.4.0"</span>]
    security_controls_configuration {
      disabled_control_identifiers = [<span class="hljs-string">"IAM.6"</span>]
    }
  }
  depends_on = [aws_securityhub_organization_configuration.this]
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">You can find the ARN format for the Security Hub standards <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/securityhub_standards_subscription">here</a>. Note that all standards are regional except for CIS AWS Foundations Benchmark v1.2.0.</div>
</div>

<p>Lastly, we will associate this configuration policy to the entire organization:</p>
<pre><code class="lang-dockerfile">data <span class="hljs-string">"aws_organizations_organization"</span> <span class="hljs-string">"this"</span> {
  provider = aws.management
}

resource <span class="hljs-string">"aws_securityhub_configuration_policy_association"</span> <span class="hljs-string">"org"</span> {
  provider = aws.audit
  target_id = data.aws_organizations_organization.this.roots[<span class="hljs-number">0</span>].id
  policy_id = aws_securityhub_configuration_policy.this.id
}
</code></pre>
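<p>As a side note, the association does not have to target the root. If you want a narrower rollout, the same resource can target a specific organizational unit or account instead. Here is a sketch with a hypothetical OU ID:</p>
<pre><code class="lang-dockerfile"># Example only - the OU ID is a placeholder, substitute one from your organization
resource "aws_securityhub_configuration_policy_association" "workloads_ou" {
  provider  = aws.audit
  target_id = "ou-abcd-12345678"
  policy_id = aws_securityhub_configuration_policy.this.id
}
</code></pre>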
<p>Before you apply the Terraform configuration, there is one more issue, which I found while cleaning up my environment, that should be addressed first.</p>
<h2 id="heading-addressing-a-state-related-issue-which-causes-policy-deletion-to-fail">Addressing a state-related issue which causes policy deletion to fail</h2>
<p>While cleaning up my environment, I encountered the following state-related error when attempting to destroy the <code>aws_securityhub_configuration_policy</code> resource:</p>
<pre><code class="lang-bash">aws_securityhub_configuration_policy_association.org: Destroying... [id=r-lzgl]
aws_securityhub_configuration_policy_association.org: Destruction complete after 2s
aws_securityhub_configuration_policy.this: Destroying... [id=f7bf343f-af38-4b1d-9116-73f43cfb5d61]
╷
│ Error: deleting Security Hub Configuration Policy (f7bf343f-af38-4b1d-9116-73f43cfb5d61): operation error SecurityHub: DeleteConfigurationPolicy, https response error StatusCode: 409, RequestID: 06f4448f-4133-412a-b89b-bda896f7fa08, ResourceConflictException: Policy f7bf343f-af38-4b1d-9116-73f43cfb5d61 is associated with one or more accounts or organizational units. You must disassociate the policy before you can delete it.
</code></pre>
<p>However, the first two lines of the output show that the configuration policy association was already destroyed before the attempt to destroy the policy.</p>
<p>After examining the Terraform resource code and the AWS API contract, I found that the <a target="_blank" href="https://docs.aws.amazon.com/securityhub/1.0/APIReference/API_StartConfigurationPolicyDisassociation.html"><code>StartConfigurationPolicyDisassociation</code> API action</a> does not report the disassociation status, nor is there another API action that can query the status. So this is not a Terraform AWS Provider bug per se and having the issue addressed upstream seems unlikely.</p>
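<p>For reference, the disassociation that Terraform performs corresponds to a CLI call along these lines, where both values are placeholders. A successful call returns an empty response body, which is why there is no status to poll:</p>
<pre><code class="lang-bash">aws securityhub start-configuration-policy-disassociation \
    --configuration-policy-identifier &lt;policy-id&gt; \
    --target RootId=&lt;root-id&gt;
</code></pre>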
<p>As a workaround, I turned to the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/time/latest/docs/resources/sleep"><code>time_sleep</code> resource</a> that can add a wait time for resource destruction. Through trial and error, I learned that 10 seconds is sufficient for the state to be updated. So we can update the Terraform configuration as follows:</p>
<pre><code class="lang-dockerfile"><span class="hljs-comment"># Some wait time is needed to account for state changes after the configuration policy is disassociated</span>
resource <span class="hljs-string">"time_sleep"</span> <span class="hljs-string">"aws_securityhub_configuration_policy_this"</span> {
  destroy_duration = <span class="hljs-string">"10s"</span>
  depends_on       = [aws_securityhub_configuration_policy.this]
}

resource <span class="hljs-string">"aws_securityhub_configuration_policy_association"</span> <span class="hljs-string">"org"</span> {
  provider   = aws.audit
  target_id  = data.aws_organizations_organization.this.roots[<span class="hljs-number">0</span>].id
  policy_id  = aws_securityhub_configuration_policy.this.id
  depends_on = [time_sleep.aws_securityhub_configuration_policy_this]
}
</code></pre>
<p>With this change, the full Terraform configuration can be destroyed successfully.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">You can find the complete Terraform in the <a target="_blank" href="https://github.com/acwwat/terraform-aws-securityhub-organization-example">GitHub repository</a> that accompanies this blog post.</div>
</div>

<p>With the complete Terraform configuration, you can now apply it to establish the <strong>Audit</strong> account as the delegated administrator and roll out the SHCP to all accounts and all regions (as per the finding aggregator settings).</p>
<h2 id="heading-caveats-about-disabling-security-hub-in-member-accounts">Caveats about disabling Security Hub in member accounts</h2>
<p>Due to the design of the Security Hub API and the Terraform resources, Security Hub will not be disabled in the member accounts when you run <code>terraform destroy</code>. Normally this wouldn't be a problem for a production landing zone. However, if you are only testing, this could lead to unexpected costs especially when left running in all accounts and all regions.</p>
<p>Since it would be tedious to disable Security Hub in each individual account, a smarter approach is to disable Security Hub using the SHCP itself. This can be done by changing the <code>aws_securityhub_configuration_policy.this</code> resource definition to the following:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_securityhub_configuration_policy"</span> <span class="hljs-string">"this"</span> {
  provider    = aws.audit
  name        = <span class="hljs-string">"ExamplePolicy"</span>
  description = <span class="hljs-string">"This is an example SHCP."</span>
  configuration_policy {
    service_enabled = false
  }
  depends_on = [aws_securityhub_organization_configuration.this]
}
</code></pre>
<p>After you re-apply the Terraform configuration, Security Hub should be disabled in all accounts and all regions. Then you can safely run <code>terraform destroy</code> to remove the remaining Security Hub resources and configuration.</p>
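<p>In other words, the teardown becomes a two-pass process:</p>
<pre><code class="lang-bash"># Pass 1: apply with service_enabled = false to disable Security Hub
# in all member accounts and regions via the SHCP
terraform apply
# Pass 2: remove the remaining Security Hub resources and configuration
terraform destroy
</code></pre>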
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, you learned how to implement central configuration to manage AWS Security Hub in AWS Organizations using Terraform. By consolidating the management of all accounts and all regions in an organization into a delegated administrator account, you now have a single pane of glass to review and manage your cloud security posture.</p>
<p>For more tips and walkthroughs on AWS, Terraform, and more, please check out the <a target="_blank" href="https://blog.avangards.io/">Avangards Blog</a>. Thanks for reading!</p>
]]></content:encoded></item><item><title><![CDATA[Developing a Generic Streamlit UI to Test Amazon Bedrock Agents]]></title><description><![CDATA[💡
2024-09-25: The UI has been updated with features to support guardrails that are associated with the agent. for details, refer to the blog post Guardrail Support for the Generic Bedrock Agent Test UI.



💡
2024-05-21: The UI has been updated with...]]></description><link>https://blog.avangards.io/developing-a-generic-streamlit-ui-to-test-amazon-bedrock-agents</link><guid isPermaLink="true">https://blog.avangards.io/developing-a-generic-streamlit-ui-to-test-amazon-bedrock-agents</guid><category><![CDATA[AWS]]></category><category><![CDATA[AI]]></category><category><![CDATA[Amazon Bedrock]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Sun, 05 May 2024 23:31:24 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1714709395187/6e8473e4-0513-4eba-acab-fc759f453290.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>2024-09-25: T</strong>he UI has been updated with features to support guardrails that are associated with the agent. for details, refer to the blog post <a target="_blank" href="https://blog.avangards.io/guardrail-support-for-the-generic-bedrock-agent-test-ui">Guardrail Support for the Generic Bedrock Agent Test UI</a>.</div>
</div>

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>2024-05-21: </strong>The UI has been updated with features to support knowledge bases that are attached to the agent. For details, refer to the blog post <a target="_blank" href="https://blog.avangards.io/knowledge-base-support-for-the-generic-bedrock-agent-test-ui">Knowledge Base Support for the Generic Bedrock Agent Test UI</a>.</div>
</div>

<h2 id="heading-introduction">Introduction</h2>
<p>In the earlier blog post <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">Building a Basic Forex Rate Assistant Using Agents for Amazon Bedrock</a>, I walked readers through the process of building and testing a Bedrock agent in the AWS Management Console. While the built-in test interface is great for validation as changes are made in the Bedrock console, it does not scale to other team members such as testers, who often don't have direct access to AWS.</p>
<p>Meanwhile, a developer workflow that does not require access to AWS Management Console may provide a better experience. As a developer, I appreciate having an integrated development environment (IDE) such as <a target="_blank" href="https://code.visualstudio.com/">Visual Studio Code</a> where I can code, deploy, and test in one place.</p>
<p>To address these two challenges, I decided to build a basic but functional UI for testing Bedrock agents. In this blog post, I share with readers the end product and some details about its design.</p>
<h2 id="heading-design-and-implementation-overview">Design and implementation overview</h2>
<p>The following is the list of requirements that I defined for the test UI:</p>
<ul>
<li><p>The design should be minimal but functional, since the focus is not on the UI but on being able to validate the business logic of the agents.</p>
</li>
<li><p>The solution must provide the same basic features as the Bedrock console, including trace output.</p>
</li>
<li><p>The solution must be adaptable to any Bedrock agent with minimal to no changes.</p>
</li>
<li><p>The solution must run both locally and as a shared webapp for different workflows.</p>
</li>
</ul>
<p>I decided to use <a target="_blank" href="https://streamlit.io/">Streamlit</a> to build the UI as it is a popular and fitting choice. Streamlit is an open-source Python library for building interactive web applications, especially AI and data applications. Since the application code is written entirely in Python, it is easy to learn and build with.</p>
<p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Agents_for_Amazon_Bedrock_Runtime.html">Agents for Amazon Bedrock Runtime API</a> can be used to interact with a Bedrock agent. Since the Streamlit app is developed in Python, we will naturally use the <a target="_blank" href="https://boto3.amazonaws.com/v1/documentation/api/latest/index.html">AWS SDK for Python (Boto3)</a> for the integration. The <a target="_blank" href="https://aws.amazon.com/developer/code-examples/">AWS SDK Code Examples</a> code library provides an <a target="_blank" href="https://docs.aws.amazon.com/code-library/latest/ug/bedrock-agent-runtime_example_bedrock-agent-runtime_InvokeAgent_section.html">example</a> of how to use the <a target="_blank" href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent-runtime/client/invoke_agent.html"><code>AgentsforBedrockRuntime.Client.invoke_agent</code> function</a> to call the Bedrock agent. The function documentation was essential for determining the response format and the information it contains.</p>
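<p>As a rough sketch (not the exact code in the repository), the integration boils down to calling <code>invoke_agent</code> and folding the streamed completion events back into response text and trace data:</p>
<pre><code class="lang-python">import boto3

def invoke_agent(agent_id, agent_alias_id, session_id, prompt):
    # Sketch only - bedrock_agent_runtime.py in the repository differs in details
    client = boto3.client("bedrock-agent-runtime")
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id,
        inputText=prompt,
        enableTrace=True,  # required to populate the trace pane in the sidebar
    )
    output_text, traces = "", []
    # The completion is an event stream: chunk events carry the response text,
    # trace events carry the agent's reasoning steps
    for event in response["completion"]:
        if "chunk" in event:
            output_text += event["chunk"]["bytes"].decode()
        elif "trace" in event:
            traces.append(event["trace"])
    return output_text, traces
</code></pre>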
<p>The UI design is rather minimal as you can see in the following screenshot:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714941183242/34a4370e-e04f-4dee-9acb-c01c91c2d56a.png" alt="Test UI design elements" class="image--center mx-auto" /></p>
<p>In the left <a target="_blank" href="https://docs.streamlit.io/develop/api-reference/layout/st.sidebar">sidebar</a>, I include elements related to troubleshooting such as a session reset button and trace information similar to the Bedrock console. The main pane is a simple chat interface with the <a target="_blank" href="https://docs.streamlit.io/develop/api-reference/chat/st.chat_message">messages</a> and the <a target="_blank" href="https://docs.streamlit.io/develop/api-reference/chat/st.chat_input">input</a>. I made the favicon and page title configurable using environment variables for a white label experience.</p>
<h2 id="heading-about-the-repository-structure">About the repository structure</h2>
<p>You can find the source code for the test UI in the <a target="_blank" href="https://github.com/acwwat/amazon-bedrock-agent-test-ui">acwwat/amazon-bedrock-agent-test-ui</a> GitHub repository. The repository structure follows a <a target="_blank" href="https://github.com/markdouthwaite/streamlit-project">standard structure</a> as recommended by Mark Douthwaite for Streamlit projects. For a detailed explanation of the structure, refer to the <a target="_blank" href="https://github.com/markdouthwaite/streamlit-project/blob/master/docs/template-info.md">getting started</a> documentation. The only tweak I made is that I put the backend integration code into <code>bedrock_agent_runtime.py</code> in a <code>services</code> directory.</p>
<pre><code class="lang-plaintext">├── services
│   ├── bedrock_agent_runtime.py
├── .gitignore
├── app.py
├── Dockerfile
├── LICENSE
├── README.md
├── requirements.txt
</code></pre>
<h2 id="heading-configuring-and-running-the-app-locally">Configuring and running the app locally</h2>
<p>To run the Streamlit app locally, you just need to have the <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html">AWS CLI</a> and <a target="_blank" href="https://www.python.org/downloads/">Python 3</a> installed. Then you can clone the <a target="_blank" href="https://github.com/acwwat/amazon-bedrock-agent-test-ui">acwwat/amazon-bedrock-agent-test-ui</a> GitHub repository and follow the steps below:</p>
<ol>
<li><p>Run the following command to install the dependencies:<br /> <code>pip install -r requirements.txt</code></p>
</li>
<li><p>Configure the environment variables for the AWS CLI and Boto3. You would typically <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html">configure the AWS CLI</a> to create a named profile, then set the <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html#cli-configure-files-using-profiles"><code>AWS_PROFILE</code> environment variable</a> to refer to it.</p>
</li>
<li><p>Set the following environment variables as appropriate:</p>
<ul>
<li><p><code>BEDROCK_AGENT_ID</code> - The ID of the Bedrock agent, which you can find in the Bedrock console or by running the <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/reference/bedrock-agent/list-agents.html"><code>aws bedrock-agent list-agents</code></a> command.</p>
</li>
<li><p><code>BEDROCK_AGENT_ALIAS_ID</code> - The ID of the agent alias, which you can find in the Bedrock console or by running the <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/reference/bedrock-agent/list-agent-aliases.html"><code>aws bedrock-agent list-agent-aliases</code></a> command. If this environment variable is not set, the default test alias ID <code>TSTALIASID</code> will be used.</p>
</li>
<li><p><code>BEDROCK_AGENT_TEST_UI_TITLE</code> - (Optional) The page title. If this environment variable is not set, the generic title in the above screenshot will be used.</p>
</li>
<li><p><code>BEDROCK_AGENT_TEST_UI_ICON</code> - (Optional) The favicon code, such as <code>:bar_chart:</code>. If this environment variable is not set, the generic icon in the above screenshot will be used.</p>
</li>
</ul>
</li>
<li><p>Run the following command to start the Streamlit app:<br /> <code>streamlit run app.py --server.port=8080 --server.address=localhost</code></p>
</li>
</ol>
<p>Once the app is started, you can access it in your web browser at <code>http://localhost:8080</code>.</p>
<p>As an example, here are the commands I run in bash inside VS Code to start the app for testing my forex rate agent (which you can learn how to build or <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform">deploy using my Terraform configuration</a>):</p>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> amazon-bedrock-agent-test-ui
pip install -r requirements.txt
<span class="hljs-comment"># Use a named profile created by the "aws configure sso" command</span>
<span class="hljs-built_in">export</span> AWS_PROFILE=AWSAdministratorAccess-&lt;redacted&gt;
<span class="hljs-built_in">export</span> BEDROCK_AGENT_ID=WENOOVMMEK
<span class="hljs-built_in">export</span> BEDROCK_AGENT_TEST_UI_TITLE=<span class="hljs-string">"Forex Rate Assistant"</span>
<span class="hljs-built_in">export</span> BEDROCK_AGENT_TEST_UI_ICON=<span class="hljs-string">":currency_exchange:"</span>
<span class="hljs-comment"># Log in via the browser when prompted</span>
aws sso login
streamlit run app.py --server.port=8080 --server.address=localhost
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714950341398/0e8c9ed4-b126-4c66-8bf4-cf8c2c410e08.png" alt="The test UI" class="image--center mx-auto" /></p>
<p>To stop the app, send an INT signal (Ctrl+C) in the prompt where you are running the <code>streamlit</code> command.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">On Windows, the <code>streamlit</code> command doesn't seem to end the process if you don't have the UI opened in the browser. If you run into this issue, simply go to <code>http://localhost:8080</code> in your browser, then hit Ctrl+C again in the prompt.</div>
</div>

<h2 id="heading-next-steps">Next steps</h2>
<p>While the Streamlit app serves my purpose as it is, there are a few missing features which I will continue to add over time:</p>
<ul>
<li><p>Support for the use of <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html">Knowledge Bases for Amazon Bedrock</a> in an agent, such as displaying citations</p>
</li>
<li><p>Support for <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-returncontrol.html">returning control to the agent developer</a></p>
</li>
<li><p>Ability to switch between agents and aliases within the app</p>
</li>
</ul>
<p>I also have not shown how to build and deploy the Streamlit app as a container in AWS, which I will perhaps demonstrate in another future blog post.</p>
<p>You are encouraged to fork or copy the repository and build upon the existing code to suit your needs.</p>
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, I provided an introduction to a generic Streamlit UI that I built to facilitate more efficient testing of agents built with the Agents for Amazon Bedrock service. You can clone the repository and follow the instructions to run it locally, and improve upon the baseline as you see fit.</p>
<p>I will be adding more features and fixing bugs over time, so be sure to check out the repository from time to time. Be sure to follow the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a> as I continue my journey with building generative AI applications using Amazon Bedrock.</p>
<p>Thanks for checking in!</p>
]]></content:encoded></item><item><title><![CDATA[How To Manage an Amazon Bedrock Agent Using Terraform]]></title><description><![CDATA[Introduction
In the previous blog post Building a Basic Forex Rate Assistant Using Agents for Amazon Bedrock, I demonstrated how to create a Bedrock agent in the AWS Management Console and outlined some ideas on improving the solution. Before further...]]></description><link>https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform</link><guid isPermaLink="true">https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Amazon Bedrock]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Wed, 01 May 2024 04:30:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1717919023179/e790a89a-ebda-40ef-860d-b470dac51926.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the previous blog post <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">Building a Basic Forex Rate Assistant Using Agents for Amazon Bedrock</a>, I demonstrated how to create a Bedrock agent in the AWS Management Console and outlined some ideas on improving the solution. Before further experimentation, it makes sense to automate the deployment of the solution to enable quicker updates as we go through trial and error in fine-tuning an agent.</p>
<p>In this blog post, we will automate the deployment of the basic forex rate assistant in Terraform using the resources that were recently released in <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws/releases/tag/v5.47.0">v5.47.0 of the Terraform AWS Provider</a>. Let's start by looking at the AWS resources in the AWS Management Console.</p>
<h2 id="heading-taking-inventory-of-the-required-resources">Taking inventory of the required resources</h2>
<p>By examining the agent we previously built, we see that it is composed of the following AWS resources:</p>
<ol>
<li><p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-create.html">agent</a> itself</p>
</li>
<li><p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-permissions.html">agent resource role</a> which is an IAM service role that provides the agent with access to other AWS services and resources</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714502768974/cb630838-b259-45cd-a206-4e272ef3c586.png" alt="The agent and its resource role" class="image--center mx-auto" /></p>
</li>
<li><p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-action-create.html">action group</a> that defines API actions that the agent can perform</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714499176516/d756d372-87f0-405c-a9a5-45a9b32e5720.png" alt="The action group" class="image--center mx-auto" /></p>
</li>
<li><p>The <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html">Lambda function</a> associated with the action group, which itself requires an <a target="_blank" href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-intro-execution-role.html">execution role</a> and a <a target="_blank" href="https://docs.aws.amazon.com/lambda/latest/dg/access-control-resource-based.html">resource policy</a> that <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-permissions.html">allows the agent to invoke the function</a></p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714502839985/b92ca848-f8a9-46e7-822e-ff6a823ca337.png" alt="The Lambda execution role and the resource policy" class="image--center mx-auto" /></p>
</li>
</ol>
<p>With the list of resources we need to provision, we can begin creating the Terraform configuration starting with the resources that the agent depends on.</p>
<h2 id="heading-defining-resources-for-the-iam-and-lambda-dependencies">Defining resources for the IAM and Lambda dependencies</h2>
<p>For the agent resource role, the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-permissions.html">documentation</a> already provides the trust policy and the permissions required. It also specifies that the prefix <code>AmazonBedrockExecutionRoleForAgents_</code> must be used for the role name.</p>
<p>The permissions require the foundation model's ARN, so we need at least the model ID, which in our case is <code>anthropic.claude-3-haiku-20240307-v1:0</code> for Claude 3 Haiku. For consistency, we will use the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/bedrock_foundation_model"><code>aws_bedrock_foundation_model</code> data source</a> to look up its ARN. Thus we can define the Terraform configuration for the agent resource role as follows using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role"><code>aws_iam_role</code> resource</a> and the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy"><code>aws_iam_role_policy</code> resource</a>:</p>
<pre><code class="lang-dockerfile"><span class="hljs-comment"># Use data sources to get common information about the environment</span>
data <span class="hljs-string">"aws_caller_identity"</span> <span class="hljs-string">"this"</span> {}
data <span class="hljs-string">"aws_partition"</span> <span class="hljs-string">"this"</span> {}
data <span class="hljs-string">"aws_region"</span> <span class="hljs-string">"this"</span> {}
locals {
  account_id = data.aws_caller_identity.this.account_id
  partition  = data.aws_partition.this.partition
  region     = data.aws_region.this.name
}

data <span class="hljs-string">"aws_bedrock_foundation_model"</span> <span class="hljs-string">"this"</span> {
  model_id = <span class="hljs-string">"anthropic.claude-3-haiku-20240307-v1:0"</span>
}
<span class="hljs-comment"># Agent resource role</span>
resource <span class="hljs-string">"aws_iam_role"</span> <span class="hljs-string">"bedrock_agent_forex_asst"</span> {
  name = <span class="hljs-string">"AmazonBedrockExecutionRoleForAgents_ForexAssistant"</span>
  assume_role_policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action = <span class="hljs-string">"sts:AssumeRole"</span>
        Effect = <span class="hljs-string">"Allow"</span>
        Principal = {
          Service = <span class="hljs-string">"bedrock.amazonaws.com"</span>
        }
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:SourceAccount"</span> = local.account_id
          }
          ArnLike = {
            <span class="hljs-string">"aws:SourceArn"</span> = <span class="hljs-string">"arn:${local.partition}:bedrock:${local.region}:${local.account_id}:agent/*"</span>
          }
        }
      }
    ]
  })
}

resource <span class="hljs-string">"aws_iam_role_policy"</span> <span class="hljs-string">"bedrock_agent_forex_asst"</span> {
  name = <span class="hljs-string">"AmazonBedrockAgentBedrockFoundationModelPolicy_ForexAssistant"</span>
  role = aws_iam_role.bedrock_agent_forex_asst.name
  policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action   = <span class="hljs-string">"bedrock:InvokeModel"</span>
        Effect   = <span class="hljs-string">"Allow"</span>
        Resource = data.aws_bedrock_foundation_model.this.model_arn
      }
    ]
  })
}
</code></pre>
<p>Next, we will define the Lambda execution role, which only needs the basic permissions to write logs to CloudWatch as granted by the AWS-managed IAM policy <code>AWSLambdaBasicExecutionRole</code>. The Terraform configuration for this IAM role can be defined as follows:</p>
<pre><code class="lang-dockerfile">data <span class="hljs-string">"aws_iam_policy"</span> <span class="hljs-string">"lambda_basic_execution"</span> {
  name = <span class="hljs-string">"AWSLambdaBasicExecutionRole"</span>
}

<span class="hljs-comment"># Action group Lambda execution role</span>
resource <span class="hljs-string">"aws_iam_role"</span> <span class="hljs-string">"lambda_forex_api"</span> {
  name = <span class="hljs-string">"FunctionExecutionRoleForLambda_ForexAPI"</span>
  assume_role_policy = jsonencode({
    Version = <span class="hljs-string">"2012-10-17"</span>
    Statement = [
      {
        Action = <span class="hljs-string">"sts:AssumeRole"</span>
        Effect = <span class="hljs-string">"Allow"</span>
        Principal = {
          Service = <span class="hljs-string">"lambda.amazonaws.com"</span>
        }
        Condition = {
          StringEquals = {
            <span class="hljs-string">"aws:SourceAccount"</span> = <span class="hljs-string">"${local.account_id}"</span>
          }
        }
      }
    ]
  })
  managed_policy_arns = [data.aws_iam_policy.lambda_basic_execution.arn]
}
</code></pre>
<p>We will then define the Terraform configuration for the Lambda function and its resource policy. Here is the source code for the Forex API Lambda function from the previous blog post for reference:</p>
<pre><code class="lang-dockerfile">import json
import urllib.parse <span class="hljs-comment"># urllib is available in Lambda runtime w/o needing a layer</span>
import urllib.request

def lambda_handler(event, context):
    agent = event[<span class="hljs-string">'agent'</span>]
    actionGroup = event[<span class="hljs-string">'actionGroup'</span>]
    apiPath = event[<span class="hljs-string">'apiPath'</span>]
    httpMethod =  event[<span class="hljs-string">'httpMethod'</span>]
    parameters = event.get(<span class="hljs-string">'parameters'</span>, [])
    requestBody = event.get(<span class="hljs-string">'requestBody'</span>, {})

    <span class="hljs-comment"># Read and process input parameters</span>
    code = None
    for parameter in parameters:
        if (parameter[<span class="hljs-string">"name"</span>] == <span class="hljs-string">"code"</span>):
            <span class="hljs-comment"># Just in case, convert to lowercase as expected by the API</span>
            code = parameter[<span class="hljs-string">"value"</span>].lower()

    <span class="hljs-comment"># Execute your business logic here. For more information, refer to: https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html</span>
    apiPathWithParam = apiPath
    <span class="hljs-comment"># Replace URI path parameters</span>
    if code is not None:
        apiPathWithParam = apiPathWithParam.replace(<span class="hljs-string">"{code}"</span>, urllib.parse.quote(code))

    <span class="hljs-comment"># <span class="hljs-doctag">TODO:</span> Use an environment variable or Parameter Store to set the URL</span>
    url = <span class="hljs-string">"https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1{apiPathWithParam}.min.json"</span>.format(apiPathWithParam = apiPathWithParam)

    <span class="hljs-comment"># Call the currency exchange rates API based on the provided path and wrap the response</span>
    apiResponse = urllib.request.urlopen(
        urllib.request.Request(
            url=url,
            headers={<span class="hljs-string">"Accept"</span>: <span class="hljs-string">"application/json"</span>},
            method=<span class="hljs-string">"GET"</span>
        )
    )
    responseBody =  {
        <span class="hljs-string">"application/json"</span>: {
            <span class="hljs-string">"body"</span>: apiResponse.read()
        }
    }

    action_response = {
        <span class="hljs-string">'actionGroup'</span>: actionGroup,
        <span class="hljs-string">'apiPath'</span>: apiPath,
        <span class="hljs-string">'httpMethod'</span>: httpMethod,
        <span class="hljs-string">'httpStatusCode'</span>: <span class="hljs-number">200</span>,
        <span class="hljs-string">'responseBody'</span>: responseBody

    }

    api_response = {<span class="hljs-string">'response'</span>: action_response, <span class="hljs-string">'messageVersion'</span>: event[<span class="hljs-string">'messageVersion'</span>]}
    print(<span class="hljs-string">"Response: {}"</span>.format(api_response))

    return api_response
</code></pre>
<p>We will save this source code to a file called <code>index.py</code> under the <code>lambda/forex_api</code> directory alongside the Terraform configuration. The file will be packaged as a zip file using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/archive/latest/docs/data-sources/file"><code>archive_file</code> data source</a> and passed as an argument to the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_function"><code>aws_lambda_function</code> resource</a>.</p>
<p>Here is the Terraform configuration for the Lambda function based on my battle-tested templates:</p>
<pre><code class="lang-hcl"><span class="hljs-comment"># Action group Lambda function</span>
data <span class="hljs-string">"archive_file"</span> <span class="hljs-string">"forex_api_zip"</span> {
  type             = <span class="hljs-string">"zip"</span>
  source_file      = <span class="hljs-string">"${path.module}/lambda/forex_api/index.py"</span>
  output_path      = <span class="hljs-string">"${path.module}/tmp/forex_api.zip"</span>
  output_file_mode = <span class="hljs-string">"0666"</span>
}

resource <span class="hljs-string">"aws_lambda_function"</span> <span class="hljs-string">"forex_api"</span> {
  function_name = <span class="hljs-string">"ForexAPI"</span>
  role          = aws_iam_role.lambda_forex_api.arn
  description   = <span class="hljs-string">"A Lambda function for the forex API action group"</span>
  filename      = data.archive_file.forex_api_zip.output_path
  handler       = <span class="hljs-string">"index.lambda_handler"</span>
  runtime       = <span class="hljs-string">"python3.12"</span>
  <span class="hljs-comment"># source_code_hash is required to detect changes to Lambda code/zip</span>
  source_code_hash = data.archive_file.forex_api_zip.output_base64sha256
}
</code></pre>
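<p>As an aside, the <code>source_code_hash</code> above is simply the base64 encoding of the raw SHA-256 digest of the zip file, which is how Terraform detects code changes. The following stdlib-only Python sketch (with a hypothetical helper name) reproduces the same computation:</p>
<pre><code class="lang-python">import base64
import hashlib

def base64sha256(path):
    # Equivalent of archive_file's output_base64sha256 attribute: the
    # base64 encoding of the raw SHA-256 digest of the file's contents
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).digest()
    return base64.b64encode(digest).decode("ascii")

# A SHA-256 digest is 32 bytes, so the result is always 44 characters
# ending in one "=" padding character
</code></pre>
<p>Whenever the zip contents change, this value changes, which in turn prompts Terraform to update the Lambda function code.</p>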
<p>Lastly, we will set the Lambda resource policy using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_permission"><code>aws_lambda_permission</code> resource</a> according to the specifications in the documentation:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"aws_lambda_permission"</span> <span class="hljs-string">"forex_api"</span> {
  action         = <span class="hljs-string">"lambda:InvokeFunction"</span>
  function_name  = aws_lambda_function.forex_api.function_name
  principal      = <span class="hljs-string">"bedrock.amazonaws.com"</span>
  source_account = local.account_id
  source_arn     = <span class="hljs-string">"arn:aws:bedrock:${local.region}:${local.account_id}:agent/*"</span>
}
</code></pre>
<h2 id="heading-defining-the-agent-and-action-group-resources">Defining the agent and action group resources</h2>
<p>With the dependencies out of the way, we can now define the Terraform resource for the agent with the new <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_agent"><code>aws_bedrockagent_agent</code> resource</a>, which is rather straightforward:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"aws_bedrockagent_agent"</span> <span class="hljs-string">"forex_asst"</span> {
  agent_name              = <span class="hljs-string">"ForexAssistant"</span>
  agent_resource_role_arn = aws_iam_role.bedrock_agent_forex_asst.arn
  description             = <span class="hljs-string">"An assistant that provides forex rate information."</span>
  foundation_model        = data.aws_bedrock_foundation_model.this.model_id
  instruction             = <span class="hljs-string">"You are an assistant that looks up today's currency exchange rates. A user may ask you what the currency exchange rate is for one currency to another. They may provide either the currency name or the three-letter currency code. If they give you a name, you may need to first look up the currency code by its name."</span>
}
</code></pre>
<p>The action group can be defined in the agent using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_agent_action_group"><code>aws_bedrockagent_agent_action_group</code> resource</a>. We will need the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-api-schema.html">OpenAPI schema</a> YAML file from the previous blog post, which is included below for reference:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">openapi:</span> <span class="hljs-number">3.0</span><span class="hljs-number">.0</span>
<span class="hljs-attr">info:</span>
  <span class="hljs-attr">title:</span> <span class="hljs-string">Currency</span> <span class="hljs-string">API</span>
  <span class="hljs-attr">description:</span> <span class="hljs-string">Provides</span> <span class="hljs-string">information</span> <span class="hljs-string">about</span> <span class="hljs-string">different</span> <span class="hljs-string">currencies.</span>
  <span class="hljs-attr">version:</span> <span class="hljs-number">1.0</span><span class="hljs-number">.0</span>
<span class="hljs-attr">servers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">url:</span> <span class="hljs-string">https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1</span>
<span class="hljs-attr">paths:</span>
  <span class="hljs-string">/currencies:</span>
    <span class="hljs-attr">get:</span>
      <span class="hljs-attr">description:</span> <span class="hljs-string">|
        List all available currencies
</span>      <span class="hljs-attr">responses:</span>
        <span class="hljs-attr">"200":</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">Successful</span> <span class="hljs-string">response</span>
          <span class="hljs-attr">content:</span>
            <span class="hljs-attr">application/json:</span>
              <span class="hljs-attr">schema:</span>
                <span class="hljs-attr">type:</span> <span class="hljs-string">object</span>
                <span class="hljs-attr">description:</span> <span class="hljs-string">|
                  A map where the key refers to the lowercase three-letter currency code and the value to the currency name in English.
</span>                <span class="hljs-attr">additionalProperties:</span>
                  <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
  <span class="hljs-string">/currencies/{code}:</span>
    <span class="hljs-attr">get:</span>
      <span class="hljs-attr">description:</span> <span class="hljs-string">|
        List the exchange rates of all available currencies with the currency specified by the given currency code in the URL path parameter as the base currency
</span>      <span class="hljs-attr">parameters:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">in:</span> <span class="hljs-string">path</span>
          <span class="hljs-attr">name:</span> <span class="hljs-string">code</span>
          <span class="hljs-attr">required:</span> <span class="hljs-literal">true</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">The</span> <span class="hljs-string">lowercase</span> <span class="hljs-string">three-letter</span> <span class="hljs-string">code</span> <span class="hljs-string">of</span> <span class="hljs-string">the</span> <span class="hljs-string">base</span> <span class="hljs-string">currency</span> <span class="hljs-string">for</span> <span class="hljs-string">which</span> <span class="hljs-string">to</span> <span class="hljs-string">fetch</span> <span class="hljs-string">exchange</span> <span class="hljs-string">rates</span>
          <span class="hljs-attr">schema:</span>
            <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
      <span class="hljs-attr">responses:</span>
        <span class="hljs-attr">"200":</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">Successful</span> <span class="hljs-string">response</span>
          <span class="hljs-attr">content:</span>
            <span class="hljs-attr">application/json:</span>
              <span class="hljs-attr">schema:</span>
                <span class="hljs-attr">type:</span> <span class="hljs-string">object</span>
                <span class="hljs-attr">description:</span> <span class="hljs-string">|
                  A map where the key refers to the three-letter currency code of the target currency and the value to the exchange rate to the target currency.
</span>                <span class="hljs-attr">additionalProperties:</span>
                  <span class="hljs-attr">type:</span> <span class="hljs-string">number</span>
                  <span class="hljs-attr">format:</span> <span class="hljs-string">float</span>
</code></pre>
<p>We will save the file as <code>schema.yaml</code> in the <code>lambda/forex_api</code> directory alongside the Lambda function source, since the two belong together. Because we are providing the OpenAPI schema in-line, the Terraform resource can be defined as follows:</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"aws_bedrockagent_agent_action_group"</span> <span class="hljs-string">"forex_api"</span> {
  action_group_name          = <span class="hljs-string">"ForexAPI"</span>
  agent_id                   = aws_bedrockagent_agent.forex_asst.id
  agent_version              = <span class="hljs-string">"DRAFT"</span>
  description                = <span class="hljs-string">"The currency exchange rates API"</span>
  skip_resource_in_use_check = true
  action_group_executor {
    lambda = aws_lambda_function.forex_api.arn
  }
  api_schema {
    payload = file(<span class="hljs-string">"${path.module}/lambda/forex_api/schema.yaml"</span>)
  }
}
</code></pre>
<h2 id="heading-testing-the-configuration">Testing the configuration</h2>
<p>Now that the full Terraform configuration is developed, we can apply it and verify that everything works correctly. For me, the apply took less than a minute to complete; here is the output for reference:</p>
<pre><code class="lang-bash">aws_iam_role.bedrock_agent_forex_asst: Creating...
aws_iam_role.lambda_forex_api: Creating...
aws_iam_role.bedrock_agent_forex_asst: Creation complete after 0s [id=AmazonBedrockExecutionRoleForAgents_ForexAssistant]
aws_iam_role_policy.bedrock_agent_forex_asst: Creating...
aws_bedrockagent_agent.forex_asst: Creating...
aws_iam_role.lambda_forex_api: Creation complete after 1s [id=FunctionExecutionRoleForLambda_ForexAPI]
aws_lambda_function.forex_api: Creating...
aws_iam_role_policy.bedrock_agent_forex_asst: Creation complete after 1s [id=AmazonBedrockExecutionRoleForAgents_ForexAssistant:AmazonBedrockAgentBedrockFoundationModelPolicy_ForexAssistant]
aws_bedrockagent_agent.forex_asst: Creation complete after 4s [id=LTR1P1OJUC]
aws_lambda_function.forex_api: Still creating... [10s elapsed]
aws_lambda_function.forex_api: Creation complete after 14s [id=ForexAPI]
aws_lambda_permission.forex_api: Creating...
aws_bedrockagent_agent_action_group.forex_api: Creating...
aws_lambda_permission.forex_api: Creation complete after 0s [id=terraform-20240430193700768300000002]
aws_bedrockagent_agent_action_group.forex_api: Creation complete after 0s [id=W1PDUUCT8P,LTR1P1OJUC,DRAFT]

Apply complete! Resources: 7 added, 0 changed, 0 destroyed.
</code></pre>
<p>In the Bedrock console, we can see that the agent <strong>ForexAssistant</strong> is ready for testing. Using the test chat interface, I asked:</p>
<blockquote>
<p>What is the exchange rate from US Dollar to Canadian Dollar?</p>
</blockquote>
<p>However, I got the following unexpected answer:</p>
<blockquote>
<p>I apologize, but I am unable to look up the current exchange rate between US Dollar and Canadian Dollar. There seems to be an issue with the function call format that I am unable to resolve. I cannot provide the exchange rate information you requested.</p>
</blockquote>
<p>Looking at the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/trace-events.html">trace</a>, it seems that the agent was not given the tool list and it tried to make up random functions to call, leading to errors:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714513771562/d1e4e340-c938-47ff-a635-1e7070d17574.png" alt="Trace showing the model's attempt to call an unknown function" class="image--center mx-auto" /></p>
<p>On closer look, this is because the agent has pending changes that require <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-test.html">preparation</a>, as indicated in the Bedrock console:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714512432513/d7ad084c-d3dd-49b6-9348-36715a0a6ff4.png" alt="Agent needed to be prepared" class="image--center mx-auto" /></p>
<p>This tells me that Terraform is not performing the preparation. In any case, once I click <strong>Prepare</strong> and ask the same question again in a new session, the agent responds with the currency exchange rate I asked for:</p>
<blockquote>
<p>The exchange rate from US Dollar (USD) to Canadian Dollar (CAD) is 1 USD = 1.36660199 CAD.</p>
</blockquote>
<p>This is also confirmed in the trace, which I will not show for brevity. Now we are one step away from an end-to-end IaC solution for the forex rate assistant, so let's try to address the issue.</p>
<h2 id="heading-workaround-for-agent-preparation-using-a-null-resource">Workaround for agent preparation using a null resource</h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>2024-05-23: </strong>As of Terraform AWS Provider <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws/releases/tag/v5.49.0">v5.49.0</a>, the <code>aws_bedrockagent_agent</code> resource has a <code>prepare_agent</code> argument (<code>true</code> by default) that controls whether the agent is prepared after the agent is created or updated. The Terraform configuration in the GitHub repository has been updated to account for this enhancement. However, the null resource is still required for action groups since <code>aws_bedrockagent_agent_action_group</code> still does not prepare the agent.</div>
</div>

<p>Looking at the Terraform AWS Provider documentation, I couldn't find any resource that supports preparation. As well, the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_agent"><code>aws_bedrockagent_agent</code> resource</a> and the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/bedrockagent_agent_action_group"><code>aws_bedrockagent_agent_action_group</code> resource</a> don't seem to have any argument that controls the preparation behavior. To be fair, the action is implemented as a separate API action called <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_PrepareAgent.html">PrepareAgent</a> in the Agents for Bedrock API, which does not directly fit into the resource concept in Terraform.</p>
<p>While I opened an <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws/issues/37162">issue</a> in the <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws">hashicorp/terraform-provider-aws GitHub repository</a>, one quick workaround I can think of is to use a <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource">null resource</a> with the <a target="_blank" href="https://developer.hashicorp.com/terraform/language/resources/provisioners/local-exec">local-exec provisioner</a> to run the equivalent AWS CLI command for the PrepareAgent API, which is the <a target="_blank" href="https://docs.aws.amazon.com/cli/latest/reference/bedrock-agent/prepare-agent.html"><code>aws bedrock-agent prepare-agent</code> command</a>.</p>
<p>Our objective is to trigger this null resource to be rerun (technically, replaced) every time there are changes to the agent, including changes to its action group. Preparing on every apply of the Terraform configuration would be inefficient, and if anything it is just one more moving part that can break. With that in mind, I devised the following resource, which serves the purpose well.</p>
<pre><code class="lang-hcl">resource <span class="hljs-string">"null_resource"</span> <span class="hljs-string">"forex_asst_prepare"</span> {
  triggers = {
    forex_asst_state = sha256(jsonencode(aws_bedrockagent_agent.forex_asst))
    forex_api_state  = sha256(jsonencode(aws_bedrockagent_agent_action_group.forex_api))
  }
  provisioner <span class="hljs-string">"local-exec"</span> {
    command = <span class="hljs-string">"aws bedrock-agent prepare-agent --agent-id ${aws_bedrockagent_agent.forex_asst.id}"</span>
  }
  depends_on = [
    aws_bedrockagent_agent.forex_asst,
    aws_bedrockagent_agent_action_group.forex_api
  ]
}
</code></pre>
<p>As you can see, I am using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource#triggers"><code>triggers</code> argument</a> in the null resource to control when the resource should be replaced. We target the two main sources of change, which are the agent and the action group. Since each trigger value must be a string, a good candidate is each resource's state, as long as it doesn't contain attributes that change every time Terraform is run. To keep the strings short, we derive the SHA-256 checksum of each resource's state JSON as the trigger values. The local-exec provisioner then calls the AWS CLI command with the agent ID from <code>aws_bedrockagent_agent.forex_asst</code>.</p>
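<p>To illustrate the mechanism outside of Terraform, here is a minimal Python sketch (stdlib only, hypothetical helper name) of how hashing a resource's JSON-encoded state yields a trigger value that is stable across runs yet sensitive to any attribute change:</p>
<pre><code class="lang-python">import hashlib
import json

def state_trigger(resource_state):
    # Approximates sha256(jsonencode(...)); Terraform computes this
    # natively from the resource state during planning
    encoded = json.dumps(resource_state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()

agent_state = {"id": "LTR1P1OJUC", "instruction": "You are an assistant..."}

# Unchanged state hashes to the same value, so no spurious replacement
assert state_trigger(agent_state) == state_trigger(agent_state)

# Any attribute change yields a new value, forcing the null resource
# (and hence the prepare-agent call) to rerun
changed = dict(agent_state, instruction="You are a helpful assistant...")
assert state_trigger(agent_state) != state_trigger(changed)
</code></pre>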
<p>With this change, we will run <code>terraform destroy</code> and then <code>terraform apply</code> to ensure a clean re-test. After Terraform completes successfully, we first check the agent in the Bedrock console to confirm that the <strong>Prepare</strong> button is no longer shown. As well, we ask our question again and this time receive the expected result:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714512611481/e6ee8bf5-fa79-4f05-bd91-82a800c93c9d.png" alt="Prepare button not visible and agent responds correctly" class="image--center mx-auto" /></p>
<p>So there you have it, a functional Terraform configuration to deploy a basic forex rate assistant implemented using Agents for Amazon Bedrock!</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">For reference, I've dressed up the Terraform solution with variables and such, and checked in the final artifacts to the <code>1_basic</code> directory in <a target="_blank" href="https://github.com/acwwat/terraform-amazon-bedrock-agent-example">this repository</a>. Feel free to check it out and use it as the basis for your Bedrock experimentation.</div>
</div>

<h2 id="heading-current-limitations-its-brand-new-after-all">Current limitations (it's brand new after all)</h2>
<p>Encountering issues with brand-new features is not unexpected, as we saw in this blog post with the Agents for Amazon Bedrock resources. I dove a bit deeper myself and found a few more issues, which I reported. I encourage you to <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws/issues?q=is%3Aissue+is%3Aopen+label%3Aservice%2Fbedrockagent">report any issues</a> that you see as you work more with the Terraform resources.</p>
<p>Meanwhile, a couple of resources related to <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html">Knowledge bases for Amazon Bedrock</a> are still under development. I plan to integrate knowledge bases into our forex rate assistant, so I will eagerly await the Terraform resources for the next step in my Bedrock journey.</p>
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, we developed the Terraform configuration for the basic forex rate assistant that we created interactively in the blog post <a target="_blank" href="https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock">Building a Basic Forex Rate Assistant Using Agents for Amazon Bedrock</a>. While we encountered some issues, we were able to work around them as the community continues to build out the features in the Terraform AWS Provider. For now, I will pivot to enhancing the forex rate agent to add new capabilities and to address some of its known shortcomings.</p>
<p>If you like this blog post, please be sure to check out other helpful articles on AWS, Terraform, and other DevOps topics in the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Building a Basic Forex Rate Assistant Using Agents for Amazon Bedrock]]></title><description><![CDATA[Introduction
With the prevalence of generative AI (gen AI), I've been keeping abreast on AWS' AI offerings for the past while. My journey started with Amazon Q Business, a fully managed service for building gen AI assistants. While the idea is great,...]]></description><link>https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock</link><guid isPermaLink="true">https://blog.avangards.io/building-a-basic-forex-rate-assistant-using-agents-for-amazon-bedrock</guid><category><![CDATA[AWS]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Amazon Bedrock]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Mon, 29 Apr 2024 17:09:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1714347382696/b3947cbf-108d-40d0-a78b-9cb44aab0ce8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>With the prevalence of generative AI (gen AI), I've been keeping abreast of AWS' AI offerings for the past while. My journey started with <a target="_blank" href="https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/what-is.html">Amazon Q Business</a>, a fully managed service for building gen AI assistants. While the idea is great, it seems too basic as it stands today and lacks the advanced features needed to improve the user experience in practice.</p>
<p>I then ventured into more advanced use cases using Amazon Bedrock and went through many workshops such as <a target="_blank" href="https://catalog.workshops.aws/building-with-amazon-bedrock/en-US">Building with Amazon Bedrock and LangChain</a>. The challenge I find is that these workshops still tend to be basic, and they don't answer my questions about complex use cases. I came to learn about agents while going through the LangChain literature, but developing a full workflow felt like a daunting task when my full-time job is DevOps, not software development. Everything seems either too simple to provide enough business value, or too complex and costly to build.</p>
<p>After attending a recent AWS PartnerCast webinar on building intelligent enterprise apps using gen AI on AWS, I learned about Agents for Amazon Bedrock and some recent new features added to the service. The service seems to be within the Goldilocks zone matching my current skill set, so I decided to dive head-first and learn all about it. I decided to build something realistic and figured that I should share my journey with folks in this blog post.</p>
<h2 id="heading-about-agents-for-amazon-bedrock">About Agents for Amazon Bedrock</h2>
<p><a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents.html">Agents for Amazon Bedrock</a> is a service that enables gen AI applications to execute multi-step tasks across company systems and data sources. It is effectively a managed service for agents and <a target="_blank" href="https://aws.amazon.com/what-is/retrieval-augmented-generation/">retrieval-augmented generation (RAG)</a>, which are common patterns to extend the capabilities of large language models (LLMs).</p>
<p>Agents for Amazon Bedrock assumes the complexity of orchestrating the interactions between different components in such workflows, which must otherwise be programmed into your gen AI application. While you can use frameworks such as <a target="_blank" href="https://python.langchain.com/docs/get_started/introduction/">LangChain</a> or <a target="_blank" href="https://docs.llamaindex.ai/en/stable/">LlamaIndex</a> to develop these workflows, Agents for Amazon Bedrock makes it much more efficient for common use cases. Agents can also integrate with <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html">knowledge bases</a> to enable RAG, as shown in the following diagram from the AWS documentation:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714349287135/d983ef96-3299-4fda-a013-36e0cb6f9f15.png" alt="The agent's process during runtime" class="image--center mx-auto" /></p>
<h2 id="heading-coming-up-with-a-basic-but-representative-use-case">Coming up with a basic but representative use case</h2>
<p>To help with brainstorming ideas for an agent, I decided on these principles:</p>
<ol>
<li><p>The idea must be practical and use real-life data.</p>
</li>
<li><p>Follow the <a target="_blank" href="https://en.wikipedia.org/wiki/KISS_principle">KISS principle</a>.</p>
</li>
</ol>
<p>For inspiration on what type of agent I should build, I turned to the <a target="_blank" href="https://github.com/public-apis/public-apis">Public APIs</a> GitHub repository, which has a curated list of free APIs. I narrowed my search to an API that does not require sign-up or an API key and returns useful information. I ultimately decided to use the <a target="_blank" href="https://github.com/fawazahmed0/exchange-api">Free Currency Exchange Rates API</a>, which seemed promising upon some basic testing.</p>
<p>Naturally, the idea steered towards a forex rate assistant that helps users look up rates from the API. The API supports lookups by date; however, to keep it simple, I decided to limit the lookup to the latest rates for now. This also leaves some room for enhancing the agent later.</p>
<h2 id="heading-requesting-for-model-access">Requesting model access</h2>
<p>Agents for Amazon Bedrock is a relatively new feature, so it is <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-supported.html">supported only in limited regions with limited model support</a>. At the time of writing this blog post, it is only supported in US East (N. Virginia) (<code>us-east-1</code> ) and US West (Oregon) (<code>us-west-2</code>) and only supports Anthropic models. We will use the <code>us-west-2</code> region for our evaluation.</p>
<p>You should also be aware of the <a target="_blank" href="https://aws.amazon.com/bedrock/pricing/">pricing</a> for different Anthropic models. With the recent addition of the <a target="_blank" href="https://www.anthropic.com/news/claude-3-family">Claude 3 model family</a>, Haiku emerges as highly competitive with great price-to-performance balance. Thus we will use Haiku as the model for our agent.</p>
<p>When you first use Amazon Bedrock, you must <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html">request access to the models</a>. This can be done on the <strong>Model access</strong> page in the Amazon Bedrock console, which can be opened from the left menu. On that page, you will see the list of base models by vendor and their access status, similar to the following:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714276017966/a094926c-2708-4db2-aafd-3ce544505dc1.png" alt="Model access page" class="image--center mx-auto" /></p>
<p>To request access, do the following:</p>
<ol>
<li><p>Click on the <strong>Manage model access</strong> button.</p>
</li>
<li><p>On the <strong>Request model access</strong> page, scroll down to the Anthropic models in the list.</p>
</li>
<li><p>If this is the first time you are requesting access to Anthropic models, you will be required to <a target="_blank" href="https://repost.aws/knowledge-center/bedrock-access-anthropic-model">submit use case details</a>. Click on the <strong>Submit use case details</strong> button to open the form, then fill it in as appropriate and click <strong>Submit</strong>.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714276542058/9e217b43-b983-48dc-9c46-f9a2b43eda30.png" alt="Submit use case details for Anthropic" class="image--center mx-auto" /></p>
</li>
<li><p>Check the box next to the models to which you wish to request access. Since we might compare different Anthropic models, let's check the box next to <strong>Anthropic</strong> to request access to all of them. Lastly, click <strong>Request model access</strong> at the end of the page.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714276817019/c57b60f9-2d5a-4871-b77f-0f828409e4f0.png" alt="Request Anthropic model access" class="image--center mx-auto" /></p>
</li>
</ol>
<p>The access status should now show "In progress" and the request will only take a few minutes to be approved if all goes well. Once available, the access status should change to "Access granted".</p>
<h2 id="heading-creating-the-openapi-schema-for-the-currency-exchange-api">Creating the OpenAPI schema for the currency exchange API</h2>
<p>In our agent, we will be using an <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-action-create.html">action group</a> that defines an action that the agent can help the user perform by calling APIs via a Lambda function. Consequently, the action group in our agent requires the following:</p>
<ol>
<li><p>An <a target="_blank" href="https://swagger.io/docs/specification/data-models/">OpenAPI schema</a> that provides the specifications of the API</p>
</li>
<li><p>A Lambda function to which the action group makes API requests</p>
</li>
</ol>
<p>That is to say, the Lambda function is effectively a "proxy" API that calls the actual APIs, which in our case is the free currency exchange rates API. Based on the <a target="_blank" href="https://github.com/fawazahmed0/exchange-api">API documentation</a>, we know the following:</p>
<ul>
<li><p>Since we will only support the latest exchange rate, the base URI for our API would be <code>https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1</code>.</p>
</li>
<li><p>We need to use the <code>/currencies.min.json</code> API, which gets the list of available currencies in minified JSON format. This helps minimize the number of tokens (and thus cost and limit) processed by the model.</p>
</li>
<li><p>We also need to use the <code>/currencies/{code}.min.json</code> API, which gets the currency exchange rates with <code>{code}</code> as the base currency.</p>
</li>
</ul>
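<p>For concreteness, the two endpoints can be sketched as a small URL builder. This is just an illustration of the URL layout described above; the helper names are my own, not part of the upstream API:</p>

```python
# Sketch of the two currency API endpoint URLs. The helper names are
# illustrative only; the base URL comes from the API documentation.
BASE_URL = "https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1"

def currencies_url() -> str:
    """URL that returns the map of currency codes to currency names."""
    return f"{BASE_URL}/currencies.min.json"

def rates_url(code: str) -> str:
    """URL that returns exchange rates with `code` as the base currency.

    The API expects lowercase codes, so we normalize here.
    """
    return f"{BASE_URL}/currencies/{code.lower()}.min.json"

print(rates_url("USD"))
# https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1/currencies/usd.min.json
```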
<p>Since this API does not provide an OpenAPI schema, we need to create it ourselves. I figured this might become a regular exercise if I start testing Bedrock agents with different APIs, so I looked for a tool that can generate OpenAPI schemas, such as those listed in <a target="_blank" href="https://openapi.tools/">OpenAPI.Tools</a>. One category of tools uses network traffic, often in the <a target="_blank" href="https://en.wikipedia.org/wiki/HAR_(file_format)">HAR format</a>, to generate the OpenAPI schema. I tried <a target="_blank" href="https://chromewebstore.google.com/detail/openapi-devtools/jelghndoknklgabjgaeppjhommkkmdii">OpenAPI DevTools</a>, a Chrome extension; however, it did not work for the currency exchange rates API.</p>
<p>After wrestling with it for a bit and eventually giving up, I turned instead to <a target="_blank" href="https://chat.openai.com/">ChatGPT</a> to see if it was smart enough for the task. On my free plan, I asked ChatGPT 3.5 the following:</p>
<blockquote>
<p>Can you generate the OpenAPI spec YAML from this API GET URL: https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1/currencies.min.json</p>
</blockquote>
<p>To my surprise, it did generate a somewhat decent API spec:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714279796393/3e4629fb-6e71-4234-83d5-3eea7e685854.png" alt="Using ChatGPT to generate the OpenAPI spec" class="image--center mx-auto" /></p>
<p>While it is not usable as-is because the URL is missing the <code>/v1</code> part and it lacks some descriptions, it has almost everything I need. However, it struck me as odd that the response has uppercase currency codes, which is NOT what the API returns. So I started a new ChatGPT session and asked the same question, only to get a very different spec:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714280327858/242e133e-531c-4e3c-9bda-ae500f379ede.png" alt="Second attempt to generate the API spec using ChatGPT" class="image--center mx-auto" /></p>
<p>At this point, I was certain that ChatGPT was not calling the API to generate the spec but relying on its knowledge to generate an answer. It was probably experiencing <a target="_blank" href="https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)">hallucination</a>, but the result is good enough as a starting point <strong>🤷</strong></p>
<p>I did the same for the other API and adjusted the spec using the <a target="_blank" href="https://editor.swagger.io/">Swagger Editor</a>. Specifically, I added detailed descriptions that should help the agent understand the API usage. The resulting OpenAPI YAML file is as follows:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">openapi:</span> <span class="hljs-number">3.0</span><span class="hljs-number">.0</span>
<span class="hljs-attr">info:</span>
  <span class="hljs-attr">title:</span> <span class="hljs-string">Currency</span> <span class="hljs-string">API</span>
  <span class="hljs-attr">description:</span> <span class="hljs-string">Provides</span> <span class="hljs-string">information</span> <span class="hljs-string">about</span> <span class="hljs-string">different</span> <span class="hljs-string">currencies.</span>
  <span class="hljs-attr">version:</span> <span class="hljs-number">1.0</span><span class="hljs-number">.0</span>
<span class="hljs-attr">servers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">url:</span> <span class="hljs-string">https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1</span>
<span class="hljs-attr">paths:</span>
  <span class="hljs-string">/currencies.min.json:</span>
    <span class="hljs-attr">get:</span>
      <span class="hljs-attr">description:</span> <span class="hljs-string">|
        List all available currencies
</span>      <span class="hljs-attr">responses:</span>
        <span class="hljs-attr">'200':</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">Successful</span> <span class="hljs-string">response</span>
          <span class="hljs-attr">content:</span>
            <span class="hljs-attr">application/json:</span>
              <span class="hljs-attr">schema:</span>
                <span class="hljs-attr">type:</span> <span class="hljs-string">object</span>
                <span class="hljs-attr">description:</span> <span class="hljs-string">|
                  A map where the key refers to the three-letter currency code and the value to the currency name in English.
</span>                <span class="hljs-attr">additionalProperties:</span>
                  <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
  <span class="hljs-string">/currencies/{code}.min.json:</span>
    <span class="hljs-attr">get:</span>
      <span class="hljs-attr">description:</span> <span class="hljs-string">|
        List the exchange rates of all available currencies with the currency specified by the given currency code in the URL path parameter as the base currency
</span>      <span class="hljs-attr">parameters:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">in:</span> <span class="hljs-string">path</span>
          <span class="hljs-attr">name:</span> <span class="hljs-string">code</span>
          <span class="hljs-attr">required:</span> <span class="hljs-literal">true</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">The</span> <span class="hljs-string">three-letter</span> <span class="hljs-string">code</span> <span class="hljs-string">of</span> <span class="hljs-string">the</span> <span class="hljs-string">base</span> <span class="hljs-string">currency</span> <span class="hljs-string">for</span> <span class="hljs-string">which</span> <span class="hljs-string">to</span> <span class="hljs-string">fetch</span> <span class="hljs-string">exchange</span> <span class="hljs-string">rates</span>
          <span class="hljs-attr">schema:</span>
            <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
      <span class="hljs-attr">responses:</span>
        <span class="hljs-attr">'200':</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">Successful</span> <span class="hljs-string">response</span>
          <span class="hljs-attr">content:</span>
            <span class="hljs-attr">application/json:</span>
              <span class="hljs-attr">schema:</span>
                <span class="hljs-attr">type:</span> <span class="hljs-string">object</span>
                <span class="hljs-attr">description:</span> <span class="hljs-string">|
                  A map where the key refers to the three-letter currency code of the target currency and the value to the exchange rate to the target currency.
</span>                <span class="hljs-attr">additionalProperties:</span>
                  <span class="hljs-attr">type:</span> <span class="hljs-string">number</span>
                  <span class="hljs-attr">format:</span> <span class="hljs-string">float</span>
</code></pre>
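<p>Before pasting the schema into Bedrock, a quick structural sanity check can catch copy-and-paste mistakes. The sketch below hand-translates the YAML above into a Python dict (so no YAML library is needed) and asserts the properties the action group will rely on; the dict is my own transcription, not tool output:</p>

```python
# Structural sanity check of the OpenAPI spec above, hand-translated from
# YAML into a Python dict for illustration (responses abbreviated).
spec = {
    "openapi": "3.0.0",
    "info": {"title": "Currency API", "version": "1.0.0"},
    "servers": [
        {"url": "https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1"}
    ],
    "paths": {
        "/currencies.min.json": {"get": {"responses": {"200": {}}}},
        "/currencies/{code}.min.json": {
            "get": {
                "parameters": [{"in": "path", "name": "code", "required": True}],
                "responses": {"200": {}},
            }
        },
    },
}

# Every path must define a GET operation with a 200 response.
for path, ops in spec["paths"].items():
    assert "get" in ops, f"{path} is missing a GET operation"
    assert "200" in ops["get"]["responses"], f"{path} is missing a 200 response"

# The parameterized path must declare its required `code` path parameter.
params = spec["paths"]["/currencies/{code}.min.json"]["get"]["parameters"]
assert any(p["name"] == "code" and p["in"] == "path" for p in params)
print("spec looks structurally sound")
```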
<h2 id="heading-creating-the-agent">Creating the agent</h2>
<p>Now let's <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-create.html">create the agent</a> in the Amazon Bedrock console following the steps below:</p>
<ol>
<li><p>Select <strong>Agents</strong> in the left menu.</p>
</li>
<li><p>On the <strong>Agents</strong> page, click <strong>Create Agent</strong>.</p>
</li>
<li><p>In the <strong>Create Agent</strong> dialog, enter the following information and click <strong>Create</strong>:</p>
<ul>
<li><p><strong>Name:</strong> ForexAssistant</p>
</li>
<li><p><strong>Description:</strong> An assistant that provides forex rate information.</p>
</li>
</ul>
</li>
</ol>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714405124007/cac1cd6d-172e-457f-878b-2bbdb8a710c4.png" alt="Create agent" class="image--center mx-auto" /></p>
<ol start="4">
<li><p>On the <strong>Agent builder</strong> page, enter the following information and click <strong>Save</strong>:</p>
<ul>
<li><p><strong>Agent resource role:</strong> Create and use a new service role</p>
</li>
<li><p><strong>Select model:</strong> Anthropic, Claude 3 Haiku</p>
</li>
<li><p><strong>Instructions for the Agent:</strong> You are an assistant that looks up today's currency exchange rates. A user may ask you what the currency exchange rate is for one currency to another. They may provide either the currency name or the three-letter currency code. If they give you a name, you may first need to first look up the currency code by its name.</p>
</li>
</ul>
</li>
</ol>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714405831221/48570c5f-1152-4b3b-ac08-c240fc8459eb.png" alt="Agent builder" class="image--center mx-auto" /></p>
<p>Note that I tried to provide concise instructions for the agent to help it reason up front. Depending on the test results, we might need to adjust them later with more <a target="_blank" href="https://en.wikipedia.org/wiki/Prompt_engineering">prompt engineering</a>.</p>
<h2 id="heading-creating-the-action-group">Creating the action group</h2>
<p>While still in the agent builder, we will <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-action-create.html">create the action group</a> that calls our APIs. Let's perform the following steps:</p>
<ol>
<li><p>In the <strong>Action groups</strong> section, click <strong>Add</strong>.</p>
</li>
<li><p>On the <strong>Create Action group</strong> page, enter the following information and click <strong>Create</strong>:</p>
<ul>
<li><p><strong>Enter Action group name:</strong> ForexAPI</p>
</li>
<li><p><strong>Description:</strong> The currency exchange rates API</p>
</li>
<li><p><strong>Action group type:</strong> Define with API schemas</p>
</li>
<li><p><strong>Action group invocation:</strong> Quick create a new Lambda function</p>
</li>
<li><p><strong>Action group schema:</strong> Define via in-line schema editor</p>
</li>
<li><p><strong>In-line OpenAPI schema:</strong> Copy and paste the OpenAPI YAML from the previous section</p>
</li>
</ul>
</li>
</ol>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714406565179/cd3000b2-08b1-43df-adb6-425c6b99d607.png" alt="Create action group" class="image--center mx-auto" /></p>
<p>After 15 seconds or so, you should receive a success message and be returned to the agent builder page. A dummy Lambda function should have been created, so our next step would be to add the logic to call the actual currency exchange rates API.</p>
<h2 id="heading-updating-the-lambda-function-to-call-the-api">Updating the Lambda function to call the API</h2>
<p>Let's go back into the action group page by clicking on the name of the action group (i.e. <strong>ForexAPI</strong>) in the list. On the edit page, click the <strong>View</strong> button near the <strong>Select Lambda function</strong> field, which should take you to the function page in the Lambda console.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714406885484/3f075ec0-9780-4a23-be51-cbd90a0c7214.png" alt="View Lambda function" class="image--center mx-auto" /></p>
<p>On the function page, you will see the code template that has been generated for you, which provides some basic processing of the input event and the response event.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714407083755/1c0e1dfc-26e1-409e-88f6-f242f44b44cf.png" alt="The Lambda function dummy code" class="image--center mx-auto" /></p>
<p>After examining the <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html#agents-lambda-input">input event format</a>, we can see that the attributes we need to use are:</p>
<ul>
<li><p><code>apiPath</code>, which should provide the path to the API as defined in the OpenAPI YAML (namely <code>/currencies.min.json</code> or <code>/currencies/{code}.min.json</code>).</p>
</li>
<li><p><code>httpMethod</code>, which should always be <code>get</code> in our case. We thus won't make use of this attribute directly in our example.</p>
</li>
<li><p><code>parameters</code>, which provides the <code>code</code> URI path parameter for the rate lookup API, expected to be a three-letter currency code.</p>
</li>
</ul>
<p>I will spare you the gory details of writing the Lambda function, so here is the code, with some implementation details provided in comments:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> urllib.parse <span class="hljs-comment"># urllib is available in Lambda runtime w/o needing a layer</span>
<span class="hljs-keyword">import</span> urllib.request

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">lambda_handler</span>(<span class="hljs-params">event, context</span>):</span>
    agent = event[<span class="hljs-string">'agent'</span>]
    actionGroup = event[<span class="hljs-string">'actionGroup'</span>]
    apiPath = event[<span class="hljs-string">'apiPath'</span>]
    httpMethod =  event[<span class="hljs-string">'httpMethod'</span>]
    parameters = event.get(<span class="hljs-string">'parameters'</span>, [])
    requestBody = event.get(<span class="hljs-string">'requestBody'</span>, {})

    <span class="hljs-comment"># Read and process input parameters</span>
    code = <span class="hljs-literal">None</span>
    <span class="hljs-keyword">for</span> parameter <span class="hljs-keyword">in</span> parameters:
        <span class="hljs-keyword">if</span> (parameter[<span class="hljs-string">"name"</span>] == <span class="hljs-string">"code"</span>):
            <span class="hljs-comment"># Just in case, convert to lowercase as expected by the API</span>
            code = parameter[<span class="hljs-string">"value"</span>].lower()

    <span class="hljs-comment"># Execute your business logic here. For more information, refer to: https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html</span>
    apiPathWithParam = apiPath
    <span class="hljs-comment"># Replace URI path parameters</span>
    <span class="hljs-keyword">if</span> code <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
        apiPathWithParam = apiPathWithParam.replace(<span class="hljs-string">"{code}"</span>, urllib.parse.quote(code))

    <span class="hljs-comment"># <span class="hljs-doctag">TODO:</span> Use an environment variable or Parameter Store to set the URL</span>
    url = <span class="hljs-string">"https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1{apiPathWithParam}"</span>.format(apiPathWithParam = apiPathWithParam)

    <span class="hljs-comment"># Call the currency exchange rates API based on the provided path and wrap the response</span>
    apiResponse = urllib.request.urlopen(
        urllib.request.Request(
            url=url,
            headers={<span class="hljs-string">"Accept"</span>: <span class="hljs-string">"application/json"</span>},
            method=<span class="hljs-string">"GET"</span>
        )
    )
    responseBody =  {
        <span class="hljs-string">"application/json"</span>: {
            <span class="hljs-string">"body"</span>: apiResponse.read()
        }
    }

    action_response = {
        <span class="hljs-string">'actionGroup'</span>: actionGroup,
        <span class="hljs-string">'apiPath'</span>: apiPath,
        <span class="hljs-string">'httpMethod'</span>: httpMethod,
        <span class="hljs-string">'httpStatusCode'</span>: <span class="hljs-number">200</span>,
        <span class="hljs-string">'responseBody'</span>: responseBody

    }

    api_response = {<span class="hljs-string">'response'</span>: action_response, <span class="hljs-string">'messageVersion'</span>: event[<span class="hljs-string">'messageVersion'</span>]}
    print(<span class="hljs-string">"Response: {}"</span>.format(api_response))

    <span class="hljs-keyword">return</span> api_response
</code></pre>
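<p>The response envelope is the part of this function that is easiest to get wrong. The following offline sketch isolates just the wrapping step so its shape can be checked without deploying anything; the helper name is my own, and the sample body is a stand-in (the real function passes the raw bytes from <code>apiResponse.read()</code>):</p>

```python
import json

def wrap_bedrock_response(action_group: str, api_path: str, http_method: str,
                          body: dict, message_version: str = "1.0") -> dict:
    """Wrap an API result in the envelope a Bedrock agent action group expects.

    Illustrative helper mirroring the handler above; `body` is serialized to a
    JSON string here, whereas the deployed code uses the raw HTTP response.
    """
    return {
        "messageVersion": message_version,
        "response": {
            "actionGroup": action_group,
            "apiPath": api_path,
            "httpMethod": http_method,
            "httpStatusCode": 200,
            "responseBody": {"application/json": {"body": json.dumps(body)}},
        },
    }

# Sample usage with a stand-in body:
resp = wrap_bedrock_response("ForexAPI", "/currencies.min.json", "get",
                             {"usd": "US Dollar"})
assert resp["response"]["httpStatusCode"] == 200
```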
<p>You can copy and paste this code into the editor and click <strong>Deploy</strong> to update it. At this point, we should test the Lambda function before returning to the Amazon Bedrock console. To do this, you can use the following event template to test the <code>/currencies.min.json</code> API (note that some irrelevant fields are omitted):</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"messageVersion"</span>: <span class="hljs-string">"1.0"</span>,
    <span class="hljs-attr">"agent"</span>: {
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"TBD"</span>,
        <span class="hljs-attr">"id"</span>: <span class="hljs-string">"TBD"</span>,
        <span class="hljs-attr">"alias"</span>: <span class="hljs-string">"TBD"</span>,
        <span class="hljs-attr">"version"</span>: <span class="hljs-string">"TBD"</span>
    },
    <span class="hljs-attr">"inputText"</span>: <span class="hljs-string">"TBD"</span>,
    <span class="hljs-attr">"sessionId"</span>: <span class="hljs-string">"TBD"</span>,
    <span class="hljs-attr">"actionGroup"</span>: <span class="hljs-string">"TBD"</span>,
    <span class="hljs-attr">"apiPath"</span>: <span class="hljs-string">"/currencies.min.json"</span>,
    <span class="hljs-attr">"httpMethod"</span>: <span class="hljs-string">"get"</span>
}
</code></pre>
<p>You should see the success response with the list of currencies:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714407177905/4f4b8169-e938-4784-bff4-23297cd804fb.png" alt="Testing the first API via the Lambda function" class="image--center mx-auto" /></p>
<p>You can then use the following event template to test the <code>/currencies/{code}.min.json</code> API:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"messageVersion"</span>: <span class="hljs-string">"1.0"</span>,
    <span class="hljs-attr">"agent"</span>: {
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"TBD"</span>,
        <span class="hljs-attr">"id"</span>: <span class="hljs-string">"TBD"</span>,
        <span class="hljs-attr">"alias"</span>: <span class="hljs-string">"TBD"</span>,
        <span class="hljs-attr">"version"</span>: <span class="hljs-string">"TBD"</span>
    },
    <span class="hljs-attr">"inputText"</span>: <span class="hljs-string">"TBD"</span>,
    <span class="hljs-attr">"sessionId"</span>: <span class="hljs-string">"TBD"</span>,
    <span class="hljs-attr">"actionGroup"</span>: <span class="hljs-string">"TBD"</span>,
    <span class="hljs-attr">"apiPath"</span>: <span class="hljs-string">"/currencies/{code}.min.json"</span>,
    <span class="hljs-attr">"httpMethod"</span>: <span class="hljs-string">"get"</span>,
    <span class="hljs-attr">"parameters"</span>: [
        {
            <span class="hljs-attr">"name"</span>: <span class="hljs-string">"code"</span>,
            <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
            <span class="hljs-attr">"value"</span>: <span class="hljs-string">"usd"</span>
        }
    ]
}
</code></pre>
<p>You should see the success response with the list of exchange rates from US dollar to other currencies:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714407285597/064bbd5b-426e-473a-a1ae-ee604e2f8dbc.png" alt="Testing the second API via the Lambda function" class="image--center mx-auto" /></p>
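<p>If you prefer to check the path handling without invoking Lambda at all, the relevant slice of the handler can be replayed locally against the test event above. This is a standalone reimplementation for illustration, not the deployed code:</p>

```python
import urllib.parse

# The parameterized test event from above, trimmed to the relevant fields.
event = {
    "apiPath": "/currencies/{code}.min.json",
    "httpMethod": "get",
    "parameters": [{"name": "code", "type": "string", "value": "usd"}],
}

# Extract the `code` parameter and lowercase it, as the handler does.
code = next((p["value"].lower() for p in event.get("parameters", [])
             if p["name"] == "code"), None)

# Substitute the URI path parameter into the API path.
api_path = event["apiPath"]
if code is not None:
    api_path = api_path.replace("{code}", urllib.parse.quote(code))

# Build the final URL against the currency API base URI.
url = f"https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1{api_path}"
print(url)
# https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1/currencies/usd.min.json
```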
<p>With the Lambda function verified, we can close the Lambda console and return to the Bedrock console to test the agent.</p>
<h2 id="heading-testing-the-agent">Testing the agent</h2>
<p>It is imperative that we test the agent thoroughly to ensure that it provides accurate answers. Back to the agent builder, we need to click on the <strong>Prepare</strong> button to prepare it, which is required <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-manage.html#agents-edit">whenever the agent is changed</a>. We can then test the agent using the built-in chat interface to the right of the console using the following prompt:</p>
<blockquote>
<p>What is the forex rate from US Dollar to Japanese Yen?</p>
</blockquote>
<p>Interestingly, I got the following response from the agent:</p>
<blockquote>
<p>Sorry, I do not have the capability to look up the current forex rate from US Dollar to Japanese Yen. I can only provide a list of available currencies, but cannot retrieve the specific exchange rate you requested.</p>
</blockquote>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">When I was validating the solution from scratch, the agent was able to return the correct answer. This could be caused by the model parameters that affect the variability of responses, among other things - the model is a bit of a black box after all! If you cannot reproduce this problem, try a few prompt sessions and ask the same question.</div>
</div>

<p>This seems to imply that the agent only knows of one API but not the other. So we need to troubleshoot the problem, which is where the ever-important <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/trace-events.html">trace feature</a> comes into play. The trace helps you follow the agent's reasoning that led it to the response it gives at that point in the conversation.</p>
<p>When we show the trace using the link below the agent response, we can see the traces for each orchestration step. There are four traces under the <strong>Orchestration and knowledge base</strong> tab:</p>
<ul>
<li><p>Trace step 1 indicates the agent's rationale of first getting the currency code from the list then calling the <code>/currencies/{code}.min.json</code> API to get the rate, which seems correct. It is also able to call the <code>/currencies.min.json</code> API to get the list of currencies to look up the code. So far so good.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714408247850/ad2e29bd-a5bf-48f9-abbd-d46b109ebab4.png" alt="Trace step 1" class="image--center mx-auto" /></p>
</li>
<li><p>Trace step 2 indicates that it was able to get the currency code for US Dollar as <code>USD</code>, though we are not sure why it is in uppercase. It also claims that <code>get::ForexAPI::/currencies/USD.min.json</code> is not a valid function, which is not true. The logic behind this decision is unclear.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714408475610/e75ccf56-7344-4d5e-89e2-f0643e0e3f88.png" alt="Trace step 2" class="image--center mx-auto" /></p>
</li>
<li><p>Trace step 3 indicates that it is calling the <code>/currencies.min.json</code> API again for whatever reason. Lastly, trace step 4 indicates that it cannot get the currency exchange rate and therefore gave up with the response we saw in the chat.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714288158065/2aa90f58-423d-4eaa-a487-08b7177ce00c.png" alt="Trace step 4" class="image--center mx-auto" /></p>
</li>
</ul>
<p>Since an LLM is for the most part a black box, we unfortunately are unlikely to get to the root cause. My only wild guess is that the <code>.min.json</code> suffix is throwing the model off because it doesn't resemble a typical RESTful API path, so perhaps we can adjust the API specifications to remove it.</p>
<h2 id="heading-adjusting-the-api-specs-and-re-testing">Adjusting the API specs and re-testing</h2>
<p>Let's make the adjustment in the OpenAPI YAML by stripping out the <code>.min.json</code> part from both API URLs:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">openapi:</span> <span class="hljs-number">3.0</span><span class="hljs-number">.0</span>
<span class="hljs-attr">info:</span>
  <span class="hljs-attr">title:</span> <span class="hljs-string">Currency</span> <span class="hljs-string">API</span>
  <span class="hljs-attr">description:</span> <span class="hljs-string">Provides</span> <span class="hljs-string">information</span> <span class="hljs-string">about</span> <span class="hljs-string">different</span> <span class="hljs-string">currencies.</span>
  <span class="hljs-attr">version:</span> <span class="hljs-number">1.0</span><span class="hljs-number">.0</span>
<span class="hljs-attr">servers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">url:</span> <span class="hljs-string">https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1</span>
<span class="hljs-attr">paths:</span>
  <span class="hljs-string">/currencies:</span>
    <span class="hljs-attr">get:</span>
      <span class="hljs-attr">description:</span> <span class="hljs-string">|
        List all available currencies
</span>      <span class="hljs-attr">responses:</span>
        <span class="hljs-attr">'200':</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">Successful</span> <span class="hljs-string">response</span>
          <span class="hljs-attr">content:</span>
            <span class="hljs-attr">application/json:</span>
              <span class="hljs-attr">schema:</span>
                <span class="hljs-attr">type:</span> <span class="hljs-string">object</span>
                <span class="hljs-attr">description:</span> <span class="hljs-string">|
                  A map where the key refers to the three-letter currency code and the value to the currency name in English.
</span>                <span class="hljs-attr">additionalProperties:</span>
                  <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
  <span class="hljs-string">/currencies/{code}:</span>
    <span class="hljs-attr">get:</span>
      <span class="hljs-attr">description:</span> <span class="hljs-string">|
        List the exchange rates of all available currencies with the currency specified by the given currency code in the URL path parameter as the base currency
</span>      <span class="hljs-attr">parameters:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">in:</span> <span class="hljs-string">path</span>
          <span class="hljs-attr">name:</span> <span class="hljs-string">code</span>
          <span class="hljs-attr">required:</span> <span class="hljs-literal">true</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">The</span> <span class="hljs-string">three-letter</span> <span class="hljs-string">code</span> <span class="hljs-string">of</span> <span class="hljs-string">the</span> <span class="hljs-string">base</span> <span class="hljs-string">currency</span> <span class="hljs-string">for</span> <span class="hljs-string">which</span> <span class="hljs-string">to</span> <span class="hljs-string">fetch</span> <span class="hljs-string">exchange</span> <span class="hljs-string">rates</span>
          <span class="hljs-attr">schema:</span>
            <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
      <span class="hljs-attr">responses:</span>
        <span class="hljs-attr">'200':</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">Successful</span> <span class="hljs-string">response</span>
          <span class="hljs-attr">content:</span>
            <span class="hljs-attr">application/json:</span>
              <span class="hljs-attr">schema:</span>
                <span class="hljs-attr">type:</span> <span class="hljs-string">object</span>
                <span class="hljs-attr">description:</span> <span class="hljs-string">|
                  A map where the key refers to the three-letter currency code of the target currency and the value to the exchange rate to the target currency.
</span>                <span class="hljs-attr">additionalProperties:</span>
                  <span class="hljs-attr">type:</span> <span class="hljs-string">number</span>
                  <span class="hljs-attr">format:</span> <span class="hljs-string">float</span>
</code></pre>
<p>This will cause the agent to pass the API path without the <code>.min.json</code> part to the Lambda function in the event, so we need to append it to the URL before calling the currency exchange rates API on line 27. The resulting Lambda code is thus:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> urllib.parse <span class="hljs-comment"># urllib is available in Lambda runtime w/o needing a layer</span>
<span class="hljs-keyword">import</span> urllib.request

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">lambda_handler</span>(<span class="hljs-params">event, context</span>):</span>
    agent = event[<span class="hljs-string">'agent'</span>]
    actionGroup = event[<span class="hljs-string">'actionGroup'</span>]
    apiPath = event[<span class="hljs-string">'apiPath'</span>]
    httpMethod =  event[<span class="hljs-string">'httpMethod'</span>]
    parameters = event.get(<span class="hljs-string">'parameters'</span>, [])
    requestBody = event.get(<span class="hljs-string">'requestBody'</span>, {})

    <span class="hljs-comment"># Read and process input parameters</span>
    code = <span class="hljs-literal">None</span>
    <span class="hljs-keyword">for</span> parameter <span class="hljs-keyword">in</span> parameters:
        <span class="hljs-keyword">if</span> (parameter[<span class="hljs-string">"name"</span>] == <span class="hljs-string">"code"</span>):
            <span class="hljs-comment"># Just in case, convert to lowercase as expected by the API</span>
            code = parameter[<span class="hljs-string">"value"</span>].lower()

    <span class="hljs-comment"># Execute your business logic here. For more information, refer to: https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html</span>
    apiPathWithParam = apiPath
    <span class="hljs-comment"># Replace URI path parameters</span>
    <span class="hljs-keyword">if</span> code <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
        apiPathWithParam = apiPathWithParam.replace(<span class="hljs-string">"{code}"</span>, urllib.parse.quote(code))

    <span class="hljs-comment"># <span class="hljs-doctag">TODO:</span> Use an environment variable or Parameter Store to set the URL</span>
    url = <span class="hljs-string">"https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1{apiPathWithParam}.min.json"</span>.format(apiPathWithParam = apiPathWithParam)

    <span class="hljs-comment"># Call the currency exchange rates API based on the provided path and wrap the response</span>
    apiResponse = urllib.request.urlopen(
        urllib.request.Request(
            url=url,
            headers={<span class="hljs-string">"Accept"</span>: <span class="hljs-string">"application/json"</span>},
            method=<span class="hljs-string">"GET"</span>
        )
    )
    responseBody = {
        <span class="hljs-string">"application/json"</span>: {
            <span class="hljs-comment"># Decode the bytes so that the response payload is JSON-serializable</span>
            <span class="hljs-string">"body"</span>: apiResponse.read().decode(<span class="hljs-string">"utf-8"</span>)
        }
    }

    action_response = {
        <span class="hljs-string">'actionGroup'</span>: actionGroup,
        <span class="hljs-string">'apiPath'</span>: apiPath,
        <span class="hljs-string">'httpMethod'</span>: httpMethod,
        <span class="hljs-string">'httpStatusCode'</span>: <span class="hljs-number">200</span>,
        <span class="hljs-string">'responseBody'</span>: responseBody

    }

    api_response = {<span class="hljs-string">'response'</span>: action_response, <span class="hljs-string">'messageVersion'</span>: event[<span class="hljs-string">'messageVersion'</span>]}
    print(<span class="hljs-string">"Response: {}"</span>.format(api_response))

    <span class="hljs-keyword">return</span> api_response
</code></pre>
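<p>Before preparing the agent again, you can sanity-check the path parameter handling locally by replicating the substitution logic against an abbreviated sample event (a quick sketch; no AWS resources are needed, and the event is trimmed down to the fields the handler actually reads):</p>
<pre><code class="lang-python">import urllib.parse

# Abbreviated sample of the event that the agent passes to the Lambda function
sample_event = {
    "apiPath": "/currencies/{code}",
    "parameters": [{"name": "code", "value": "USD"}],
}

code = None
for parameter in sample_event["parameters"]:
    if parameter["name"] == "code":
        # Convert to lowercase as expected by the API
        code = parameter["value"].lower()

api_path_with_param = sample_event["apiPath"]
if code is not None:
    api_path_with_param = api_path_with_param.replace("{code}", urllib.parse.quote(code))

url = "https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1{}.min.json".format(api_path_with_param)
print(url)
</code></pre>
<p>This prints the fully resolved URL, <code>https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1/currencies/usd.min.json</code>, confirming that both the <code>.min.json</code> suffix and the lowercased currency code are applied correctly.</p>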
<p>Once you have updated both, you can prepare and test the agent again. Interestingly, we now get a proper response:</p>
<blockquote>
<p>The current forex rate from US Dollar (USD) to Japanese Yen (JPY) is 1 USD = 158.3147925 JPY.</p>
</blockquote>
<p>To ensure that the number is from the API and not other sources, we can review the agent's rationale from the trace like before. In trace step 2, we can see the right rationale and the invocation of the <code>/currencies/{code}</code> API with <code>USD</code> as the code parameter (again, not sure why it's in uppercase) as expected:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714408762808/8d830334-f81f-4325-a9be-9e26cbbecd7c.png" alt="Trace step 2 from the retry" class="image--center mx-auto" /></p>
<p>Hurray, we have successfully built a basic forex rate assistant using Agents for Amazon Bedrock! Naturally, you should still test the agent extensively, since LLMs are sometimes unpredictable and may require adjustments.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">In a follow-up blog post <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform">How To Manage an Amazon Bedrock Agent Using Terraform</a>, I provide details on how to automate the deployment of this solution using Terraform. Please feel free to read it or refer directly to the configuration in <a target="_blank" href="https://github.com/acwwat/terraform-amazon-bedrock-agent-example">this repository</a>.</div>
</div>

<h2 id="heading-testing-reveals-more-limitations">Testing reveals more limitations</h2>
<p>For the sake of experimentation, let's see what happens when we ask the assistant to do the reverse conversion. We will continue the conversation in the same chat session and enter the following prompt:</p>
<blockquote>
<p>What about the other way around?</p>
</blockquote>
<p>The agent responded with the following:</p>
<blockquote>
<p>The current forex rate from Japanese Yen (JPY) to US Dollar (USD) is 1 JPY = 0.0063163 USD.</p>
</blockquote>
<p>However, a quick check shows that the number is a bit off. The response from <a target="_blank" href="https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1/currencies/jpy.json">https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1/currencies/jpy.json</a> (at the time of writing) shows 0.0063165291, which is also what I got from a calculator for 1 / 158.3147925. Again, we will need to review the trace to see what the agent is up to. It revealed that the agent performed an inverse calculation itself, but the result is incorrect for some reason:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714289516452/af5a8a81-71b9-4743-a485-55e615eb92bd.png" alt="Trace step 1 from the follow-up question" class="image--center mx-auto" /></p>
<p>My expectation is that the agent should perform another API lookup to get the right number. If the API were developed for a business and had a spread between the two exchange rates for profit, the agent would have returned the wrong information. Putting that aside, the calculation is simply wrong.</p>
<p>After doing some reading online, it seems that <a target="_blank" href="https://www.xda-developers.com/why-llms-are-bad-at-math/">LLMs in general are bad at math</a> because they are designed to predict words, not perform computations. So the exchange rate 0.0063163 might just be a prediction by Haiku based on the data it was trained on.</p>
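<p>The size of the discrepancy is easy to pin down with a few lines of Python: the true arithmetic inverse of the USD-to-JPY rate matches the API's published JPY-to-USD rate, but not the number the agent produced:</p>
<pre><code class="lang-python">import math

usd_to_jpy = 158.3147925    # USD-to-JPY rate the agent retrieved from the API
agent_inverse = 0.0063163   # JPY-to-USD rate the agent calculated itself
api_inverse = 0.0063165291  # JPY-to-USD rate published by the API

# The arithmetic inverse agrees with the API's published rate...
assert math.isclose(1 / usd_to_jpy, api_inverse, abs_tol=1e-9)
# ...but not with the value the agent produced
assert not math.isclose(1 / usd_to_jpy, agent_inverse, abs_tol=1e-9)
print("agent is off by", abs(1 / usd_to_jpy - agent_inverse))
</code></pre>
<p>The agent's value is off by roughly 0.00000023, small in absolute terms, but enough to matter when converting large amounts.</p>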
<h2 id="heading-additional-thoughts-and-summary">Additional thoughts and summary</h2>
<p>While we have built a functional forex rate assistant using Agents for Amazon Bedrock, it is certainly not production-grade, since it is neither fully accurate nor particularly fast. Improving accuracy is where the bulk of the effort in gen AI development lies. AWS recommends the following strategies, which developers should employ sequentially to improve their gen AI applications:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714334652345/9f3d5b38-8c3f-4302-8cf6-3079a76c51d3.png" alt="Approaches for improving quality of gen AI solutions" class="image--center mx-auto" /></p>
<p>For instance, my next iteration of improvement could start with adjusting the model <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html">inference parameters</a> and <a target="_blank" href="https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-engineering-guidelines.html">prompt engineering</a>, perhaps to ensure that it always calls the API instead of trying to do calculations itself. We also ought to look at why the LLM provides uppercase currency codes. Prompt engineering is admittedly more of an art than a science and will require many rounds of trial and error, so be prepared for that.</p>
<p>I hope you learned something new from this blog post and now have a better understanding of the features, potential, and limitations of Agents for Amazon Bedrock. We are only scratching the surface here, so you are encouraged to use this forex agent as a starting point for further improvements or to develop your own agent. You would also need to expose the agent to end users with a new frontend or an existing application. For me, the next step is to look into <a target="_blank" href="https://blog.avangards.io/how-to-manage-an-amazon-bedrock-agent-using-terraform">how to manage Bedrock agents using Terraform with the hot-off-the-press resources</a>.</p>
<p>If you enjoyed this blog post, please be sure to check out other content related to AWS and DevOps in the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a>. Thanks for your time and have fun with gen AI!</p>
]]></content:encoded></item><item><title><![CDATA[How To Manage Amazon GuardDuty in AWS Organizations Using Terraform]]></title><description><![CDATA[Introduction
Since I released the blog series How to implement the AWS Startup Security Baseline (SSB) using Terraform recently, I've received some feedback and questions on it. In particular, there were some questions around setting up GuardDuty in ...]]></description><link>https://blog.avangards.io/how-to-manage-amazon-guardduty-in-aws-organizations-using-terraform</link><guid isPermaLink="true">https://blog.avangards.io/how-to-manage-amazon-guardduty-in-aws-organizations-using-terraform</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Anthony Wat]]></dc:creator><pubDate>Tue, 23 Apr 2024 16:54:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1717919136735/0e33d727-ac68-4fff-bf7f-7328784a695a.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Since I released the blog series <a target="_blank" href="https://blog.avangards.io/series/aws-ssb-terraform">How to implement the AWS Startup Security Baseline (SSB) using Terraform</a> recently, I've received some feedback and questions on it. In particular, there were some questions about setting up GuardDuty in an organization using Terraform. Since the configuration involves multiple accounts and there are some quirks with the resources, I decided to write a separate blog post on how to properly implement it, with an explanation of each step.</p>
<h2 id="heading-about-the-use-case">About the use case</h2>
<p><a target="_blank" href="https://docs.aws.amazon.com/guardduty/latest/ug/what-is-guardduty.html">Amazon GuardDuty</a> is a managed threat detection service that continuously monitors AWS accounts and workloads for malicious or unauthorized activity using machine learning, anomaly detection, and integrated threat intelligence.</p>
<p>GuardDuty supports <a target="_blank" href="https://docs.aws.amazon.com/guardduty/latest/ug/guardduty_organizations.html">managing multiple accounts with AWS Organizations</a> via the delegated administrator feature, with which you designate an AWS account in the organization to centrally manage GuardDuty for all members. This is great for managing a multi-account landing zone by centralizing management of GuardDuty settings in a consistent manner.</p>
<p>Since it is increasingly common to establish an AWS landing zone using <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/what-is-control-tower.html">AWS Control Tower</a>, we will use the <a target="_blank" href="https://docs.aws.amazon.com/controltower/latest/userguide/accounts.html">standard account structure</a> in a Control Tower landing zone to demonstrate how to configure GuardDuty in Terraform:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1713858762060/1346a5a6-bdfc-426b-bf3c-562570b155b3.png" alt="Control Tower standard OU and account structure" class="image--center mx-auto" /></p>
<p>The relevant accounts for our use case in the landing zone are:</p>
<ol>
<li><p>The <strong>Management</strong> account for the organization where AWS Organizations is configured. For details, refer to <a target="_blank" href="https://docs.aws.amazon.com/organizations/latest/userguide/services-that-can-integrate-guardduty.html">Managing GuardDuty accounts with AWS Organizations</a>.</p>
</li>
<li><p>The <strong>Audit</strong> account where security and compliance services are typically centralized in a Control Tower landing zone.</p>
</li>
</ol>
<p>The objective is to delegate GuardDuty administrative duties from the <strong>Management</strong> account to the <strong>Audit</strong> account, after which all organization configurations are managed in the <strong>Audit</strong> account. With that said, let's see how we can achieve this using Terraform!</p>
<h2 id="heading-designating-a-guardduty-administrator-account">Designating a GuardDuty administrator account</h2>
<p>GuardDuty delegated administrator is configured in the <strong>Management</strong> account, so we need a provider associated with it in Terraform. To keep things simple, we will take a multi-provider approach by defining two providers, one for the <strong>Management</strong> account and another for the <strong>Audit</strong> account, using AWS CLI profiles as follows:</p>
<pre><code class="lang-dockerfile">provider <span class="hljs-string">"aws"</span> {
  alias   = <span class="hljs-string">"management"</span>
  <span class="hljs-comment"># Use "aws configure" to create the "management" profile with the Management account credentials</span>
  profile = <span class="hljs-string">"management"</span> 
}

provider <span class="hljs-string">"aws"</span> {
  alias   = <span class="hljs-string">"audit"</span>
  <span class="hljs-comment"># Use "aws configure" to create the "audit" profile with the Audit account credentials</span>
  profile = <span class="hljs-string">"audit"</span> 
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">Since GuardDuty is a regional service, you must apply this Terraform configuration on each region that you are using. Consider using the <code>region</code> argument in your provider definition and a variable to make your Terraform configuration rerunnable in other regions.</div>
</div>

<p>We can then use the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/guardduty_organization_admin_account"><code>aws_guardduty_organization_admin_account</code> resource</a> to set the delegated administrator. However, I noticed the following in the <strong>Audit</strong> account:</p>
<ul>
<li><p>After this resource is created, GuardDuty will be enabled with both the foundational data sources and all protection plans enabled.</p>
</li>
<li><p>When the resource is deleted, GuardDuty remains enabled.</p>
</li>
</ul>
<p>These side effects are not desirable since we would ideally want full control over the lifecycle and configuration of GuardDuty in Terraform. To address this issue, we will preemptively enable GuardDuty in the <strong>Audit</strong> account using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/guardduty_detector"><code>aws_guardduty_detector</code> resource</a>. We will also manage the protection plans using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/guardduty_detector_feature"><code>aws_guardduty_detector_feature</code> resource</a> in subsequent steps after we define the org-wide settings.</p>
<p>The resulting Terraform configuration should be defined as follows (pay special attention to the <code>provider</code> argument in each resource):</p>
<pre><code class="lang-dockerfile">data <span class="hljs-string">"aws_caller_identity"</span> <span class="hljs-string">"audit"</span> {
  provider = aws.audit
}

resource <span class="hljs-string">"aws_guardduty_detector"</span> <span class="hljs-string">"audit"</span> {
  provider = aws.audit
}

resource <span class="hljs-string">"aws_guardduty_organization_admin_account"</span> <span class="hljs-string">"this"</span> {
  provider         = aws.management
  admin_account_id = data.aws_caller_identity.audit.account_id
  depends_on       = [aws_guardduty_detector.audit]
}
</code></pre>
<p>With the <strong>Audit</strong> account designated as the GuardDuty administrator, we can now manage the organization configuration.</p>
<h2 id="heading-configuring-organization-auto-enable-preferences"><strong>Configuring organization auto-enable preferences</strong></h2>
<p>GuardDuty distinguishes the foundational data sources settings from the protection plans settings. The former is managed using the <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/guardduty_organization_configuration"><code>aws_guardduty_organization_configuration</code> resource</a>. In our case, we want to manage GuardDuty for all accounts (i.e. both new and existing accounts). The resulting Terraform configuration should thus look like the following:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_guardduty_organization_configuration"</span> <span class="hljs-string">"this"</span> {
  provider                         = aws.audit
  auto_enable_organization_members = <span class="hljs-string">"ALL"</span>
  detector_id                      = aws_guardduty_detector.audit.id
  depends_on                       = [aws_guardduty_organization_admin_account.this]
}
</code></pre>
<p>Next, let's manage the protection plan configuration. For illustration, let's assume that we want to enable only <a target="_blank" href="https://docs.aws.amazon.com/guardduty/latest/ug/guardduty-eks-audit-log-monitoring.html">EKS Audit Log Monitoring</a>. To ensure full configurability, we will define the settings for all protection plans using a variable:</p>
<pre><code class="lang-dockerfile"><span class="hljs-comment"># Terraform configuration (.tf)</span>

variable <span class="hljs-string">"guardduty_features"</span> {
  description = <span class="hljs-string">"An object map that defines the GuardDuty organization configuration."</span>
  type = map(object({
    auto_enable = string
    name        = string
    additional_configuration = optional(list(object({
      auto_enable = string
      name        = string
    })))
  }))
}
</code></pre>
<pre><code class="lang-dockerfile"><span class="hljs-comment"># Variable definition (.tfvars)</span>

guardduty_features = {
  s3 = {
    auto_enable = <span class="hljs-string">"NONE"</span>
    name        = <span class="hljs-string">"S3_DATA_EVENTS"</span>
  }
  eks = {
    auto_enable = <span class="hljs-string">"ALL"</span>
    name        = <span class="hljs-string">"EKS_AUDIT_LOGS"</span>
  }
  eks_runtime_monitoring = {
    <span class="hljs-comment"># EKS_RUNTIME_MONITORING is deprecated and should thus be explicitly disabled</span>
    auto_enable = <span class="hljs-string">"NONE"</span>
    name        = <span class="hljs-string">"EKS_RUNTIME_MONITORING"</span>
    additional_configuration = [
      {
        auto_enable = <span class="hljs-string">"NONE"</span>
        name        = <span class="hljs-string">"EKS_ADDON_MANAGEMENT"</span>
      },
    ]
  }
  runtime_monitoring = {
    auto_enable = <span class="hljs-string">"NONE"</span>
    name        = <span class="hljs-string">"RUNTIME_MONITORING"</span>
    additional_configuration = [
      {
        auto_enable = <span class="hljs-string">"NONE"</span>
        name        = <span class="hljs-string">"EKS_ADDON_MANAGEMENT"</span>
      },
      {
        auto_enable = <span class="hljs-string">"NONE"</span>
        name        = <span class="hljs-string">"ECS_FARGATE_AGENT_MANAGEMENT"</span>
      },
      {
        auto_enable = <span class="hljs-string">"NONE"</span>
        name        = <span class="hljs-string">"EC2_AGENT_MANAGEMENT"</span>
      }
    ]
  }
  malware = {
    auto_enable = <span class="hljs-string">"NONE"</span>
    name        = <span class="hljs-string">"EBS_MALWARE_PROTECTION"</span>
  }
  rds = {
    auto_enable = <span class="hljs-string">"NONE"</span>
    name        = <span class="hljs-string">"RDS_LOGIN_EVENTS"</span>
  }
  lambda = {
    auto_enable = <span class="hljs-string">"NONE"</span>
    name        = <span class="hljs-string">"LAMBDA_NETWORK_LOGS"</span>
  }
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">The <code>EKS_RUNTIME_MONITORING</code> feature has been superseded by the <code>RUNTIME_MONITORING</code> feature, but to avoid perpetual differences in Terraform configuration, we must set its enablement state to <code>NONE</code>.</div>
</div>

<p>We can then use this variable with the <a target="_blank" href="https://developer.hashicorp.com/terraform/language/meta-arguments/for_each"><code>for_each</code> meta-argument</a> with <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/guardduty_organization_configuration_feature">the <code>aws_guardduty_organization_configuration_feature</code> resource</a> as follows:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_guardduty_organization_configuration_feature"</span> <span class="hljs-string">"this"</span> {
  provider    = aws.audit
  for_each    = var.guardduty_features
  auto_enable = each.value.auto_enable
  detector_id = aws_guardduty_detector.audit.id
  name        = each.value.name
  dynamic <span class="hljs-string">"additional_configuration"</span> {
    for_each = try(each.value.additional_configuration, [])
    content {
      auto_enable = additional_configuration.value.auto_enable
      name        = additional_configuration.value.name
    }
  }
  depends_on = [aws_guardduty_organization_admin_account.this]
}
</code></pre>
<p>Lastly, we will circle back to recalibrating the protection plan settings for the <strong>Audit</strong> account itself. Let's piggyback on the same variable and use <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/guardduty_detector_feature">the <code>aws_guardduty_detector_feature</code> resource</a> to achieve this:</p>
<pre><code class="lang-dockerfile">resource <span class="hljs-string">"aws_guardduty_detector_feature"</span> <span class="hljs-string">"audit"</span> {
  provider    = aws.audit
  for_each    = var.guardduty_features
  detector_id = aws_guardduty_detector.audit.id
  name        = each.value.name
  status      = each.value.auto_enable == <span class="hljs-string">"NONE"</span> ? <span class="hljs-string">"DISABLED"</span> : <span class="hljs-string">"ENABLED"</span>
  dynamic <span class="hljs-string">"additional_configuration"</span> {
    for_each = try(each.value.additional_configuration, [])
    content {
      status = additional_configuration.value.auto_enable == <span class="hljs-string">"NONE"</span> ? <span class="hljs-string">"DISABLED"</span> : <span class="hljs-string">"ENABLED"</span>
      name   = additional_configuration.value.name
    }
  }
}
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">✅</div>
<div data-node-type="callout-text">You can find the complete Terraform in the <a target="_blank" href="https://github.com/acwwat/terraform-aws-guardduty-organization-example">GitHub repository</a> that accompanies this blog post.</div>
</div>

<p>With the complete Terraform configuration, you can now apply it to establish the <strong>Audit</strong> account as the delegated administrator and apply organization settings to all accounts in the target region. Note that it can <a target="_blank" href="https://docs.aws.amazon.com/guardduty/latest/APIReference/API_UpdateOrganizationConfiguration.html#guardduty-UpdateOrganizationConfiguration-request-autoEnableOrganizationMembers">take up to 24 hours</a> for GuardDuty to automatically enable monitoring in all member accounts. YMMV, but it took about 3 hours for me in the evening in the Eastern time zone.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">There is currently an <a target="_blank" href="https://github.com/hashicorp/terraform-provider-aws/issues/36400">issue</a> where the <code>additional_configuration</code> block order causes differences when applying the Terraform configuration without making any changes.</div>
</div>
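<p>Since enablement can take hours to propagate, a quick way to check progress is to list the member accounts and their relationship status from the <strong>Audit</strong> account. Below is a minimal boto3 sketch using the same <code>audit</code> profile as before (the <code>pending_members</code> helper name is my own):</p>
<pre><code class="lang-python">def pending_members(members):
    # Return the account IDs whose GuardDuty relationship is not yet "Enabled"
    return [m["AccountId"] for m in members if m["RelationshipStatus"] != "Enabled"]

def check_enablement(profile_name="audit"):
    import boto3  # deferred import so the helper above has no dependencies
    session = boto3.Session(profile_name=profile_name)
    guardduty = session.client("guardduty")
    detector_id = guardduty.list_detectors()["DetectorIds"][0]
    members = []
    for page in guardduty.get_paginator("list_members").paginate(DetectorId=detector_id):
        members.extend(page["Members"])
    return pending_members(members)
</code></pre>
<p>Running <code>check_enablement()</code> returns the accounts still waiting to be enabled; an empty list means the rollout is complete.</p>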

<h2 id="heading-caveats-about-suspending-guardduty-in-member-accounts">Caveats about suspending GuardDuty in member accounts</h2>
<p>Due to limitations of the GuardDuty Terraform resources, GuardDuty is unfortunately not automatically disabled when you run <code>terraform destroy</code>. Normally this wouldn't be a problem for a production landing zone. However, if you are only testing, it could lead to unexpected charges, especially since GuardDuty is a somewhat costly service.</p>
<p>As a workaround, I would recommend using the AWS CLI or AWS SDK to at least suspend GuardDuty for all members using the <a target="_blank" href="https://docs.aws.amazon.com/guardduty/latest/APIReference/API_StopMonitoringMembers.html"><code>StopMonitoringMembers</code> API</a>. For your convenience, you can use the following shell script to do so before running <code>terraform destroy</code>:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>

<span class="hljs-comment"># Note: Make sure that you set the AWS_PROFILE environment variable to "audit" before running the script</span>

<span class="hljs-comment"># Get the GuardDuty detector ID</span>
DETECTOR_ID=$(aws guardduty list-detectors --query DetectorIds[0] --output text)

<span class="hljs-comment"># Disable auto-enable organization members</span>
aws guardduty update-organization-configuration --detector-id <span class="hljs-variable">$DETECTOR_ID</span> --auto-enable-organization-members NONE

<span class="hljs-comment"># Loop through each member account and disable GuardDuty</span>
MEMBER_ACCOUNTS=$(aws guardduty list-members --detector-id <span class="hljs-variable">$DETECTOR_ID</span> --query Members[*].AccountId --output text)
<span class="hljs-keyword">for</span> MEMBER_ACCOUNT <span class="hljs-keyword">in</span> <span class="hljs-variable">$MEMBER_ACCOUNTS</span>
<span class="hljs-keyword">do</span>
  <span class="hljs-built_in">echo</span> <span class="hljs-string">"Suspending GuardDuty for account <span class="hljs-variable">$MEMBER_ACCOUNT</span>"</span>
  aws guardduty stop-monitoring-members --account-ids <span class="hljs-variable">$MEMBER_ACCOUNT</span> --detector-id <span class="hljs-variable">$DETECTOR_ID</span>
<span class="hljs-keyword">done</span>
</code></pre>
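<p>If you prefer Python over shell, the same workaround can be sketched with boto3 using the <code>audit</code> profile. This version also paginates through the member list and batches the <code>StopMonitoringMembers</code> calls, since the API accepts a limited number of account IDs per request (the <code>batch</code> helper and the 50-account chunk size are my own assumptions):</p>
<pre><code class="lang-python">def batch(items, size=50):
    # Split the member account list into chunks that fit within one API call
    return [items[i:i + size] for i in range(0, len(items), size)]

def suspend_members(profile_name="audit"):
    import boto3  # deferred import so the batching helper has no dependencies
    session = boto3.Session(profile_name=profile_name)
    guardduty = session.client("guardduty")
    detector_id = guardduty.list_detectors()["DetectorIds"][0]
    # Stop auto-enabling GuardDuty for organization members
    guardduty.update_organization_configuration(
        DetectorId=detector_id, AutoEnableOrganizationMembers="NONE"
    )
    # Collect all member account IDs, then suspend monitoring in batches
    account_ids = []
    for page in guardduty.get_paginator("list_members").paginate(DetectorId=detector_id):
        account_ids.extend(m["AccountId"] for m in page["Members"])
    for chunk in batch(account_ids):
        guardduty.stop_monitoring_members(DetectorId=detector_id, AccountIds=chunk)
</code></pre>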
<h2 id="heading-summary">Summary</h2>
<p>In this blog post, you learned how to manage Amazon GuardDuty in AWS Organizations using Terraform. While there are some caveats, this allows you to streamline the setup of a security baseline for your AWS landing zone. The centralized approach to detective security can help you ensure compliance and timely reaction to security incidents.</p>
<p>I hope you found this blog post helpful. If you are interested in this type of content, be sure to check out other blog posts in the <a target="_blank" href="https://blog.avangards.io">Avangards Blog</a>. Thank you and have a great one!</p>
]]></content:encoded></item></channel></rss>