GraphQL Load Testing with Artillery¶
This document outlines the requirements and steps for FireFly users to load test FireFly. It covers MCM creation and instructions for using FireFly-LoadGenArtillery to run the load test.
Load Testing Requirement¶
To prepare for load testing, you need to:
Add your test scenario to FireFly-LoadGenArtillery and verify that it is correct by running it at a low rate, such as 1 TPS.
Estimate the traffic projection for the Heron launch and record it in the FireFly Heron TPS/SLA document.
Create an MCM using TM-121647: Music FireFly Load Test Template. It should look like this reference MCM.
Get sign-off on the MCM from:
FireFly load test POC (@fkhalee or @sophap)
Panda team for token generation. Note: We request the token from Panda only once at the start of the load test. Since the token is valid for only one hour, please make sure the test completes within that time frame.
Owners of the downstream services that your GraphQL query invokes.
Prepare a list of user credentials (email + password) to use in the load test.
Execute the load test.
Load Testing Setup¶
FireFly provides a load testing tool, FireFly-LoadGenArtillery, which is built on the open-source tool Artillery.
FireFly-LoadGenArtillery¶
This package defines the Artillery configuration and setup for load testing FireFly queries and mutations.
Getting started¶
First, set up your machine using the following commands:
yarn auth
yarn install
Note: Make sure you are using Node.js 18 or later.
Package structure¶
config.yml¶
This file contains the FireFly endpoint and TPS configuration, which you can update to suit your requirements.
Scenario¶
This directory contains all the FireFly query scenarios that you can use in a load test.
credentials.csv¶
Update this file with the credentials of the Amazon Music accounts to use for the test run.
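Before a run, it can help to check that every row in the credentials file is complete. The following is a minimal Python sketch, assuming a layout of one `email,password` pair per row with no header row. That layout and the helper name `load_credentials` are assumptions for illustration, not something FireFly-LoadGenArtillery defines.

```python
import csv
import io


def load_credentials(csv_text):
    """Parse rows of email,password (assumed layout, no header row)."""
    credentials = []
    for line_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if not row:
            continue  # csv.reader yields an empty list for blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected exactly two fields")
        email, password = row[0].strip(), row[1]
        if "@" not in email or not password:
            raise ValueError(f"line {line_no}: missing or malformed email/password")
        credentials.append((email, password))
    return credentials


# Hypothetical sample data; real accounts come from your own test account pool.
sample = "user1@example.com,secret1\n\nuser2@example.com,secret2\n"
print(len(load_credentials(sample)))
```

Failing fast on a malformed row is cheaper than discovering it as authentication errors partway through a test.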
Test execution¶
Before running a load test, do the following:
Update the user credentials in credentials.csv to use for the load test.
Update the TPS and duration of the test in config.yml:
- duration: 1 # run for N seconds
  arrivalRate: 1 # TPS
Note: For a complete list of options for your use case, please refer to the Artillery documentation.
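Because the Panda token is valid for only one hour, the combined duration of all phases has to stay under that limit. The following is a rough Python sketch of that check, assuming phases are plain dictionaries with integer `duration` (seconds) and `arrivalRate` values like the fragment above. The constant and function names are illustrative and not part of the package.

```python
# Hypothetical helper: sanity-check Artillery phases against the one-hour token lifetime.
TOKEN_LIFETIME_S = 60 * 60  # the Panda token is valid for one hour


def total_duration_s(phases):
    """Sum the duration (in seconds) of every phase."""
    return sum(phase["duration"] for phase in phases)


def total_arrivals(phases):
    """Estimate how many virtual users the phases start in total."""
    return sum(phase["duration"] * phase.get("arrivalRate", 0) for phase in phases)


def fits_token_lifetime(phases, lifetime_s=TOKEN_LIFETIME_S):
    """Return True if all phases finish before the token expires."""
    return total_duration_s(phases) < lifetime_s


phases = [
    {"duration": 60, "arrivalRate": 1},    # warm-up at 1 TPS
    {"duration": 600, "arrivalRate": 50},  # sustained load
]
print(total_duration_s(phases), total_arrivals(phases), fits_token_lifetime(phases))
```

The sketch ignores ramping phases and setup time, so leave some headroom rather than planning for the full hour.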
Executing a load test from your machine¶
Navigate to the root of the project directory. For example:
cd <your workspace>/firefly-loadgenartillery
Run:
region=<region such as us-east-1, us-west-2, or eu-west-1> scenario=src/scenario/<your scenario directory>/<your test scenario.yml> yarn setup
For instance:
region=eu-west-1 scenario=src/scenario/podcast/PodcastEpisodeDetailQuery.yaml yarn setup
This step retrieves an authentication token from the Panda service, merges the test scenario with config.yml, and publishes all required files to the /generated directory.
Execute the Artillery command:
artillery run generated/generatedScenario.yml \
--output <report.json> \
--variables <variables_for_query>
Option explanation¶
--output: Optional. Persists a summary of the test results, including min, max, median, p50, p70, p90, etc., to the file specified in --output. You can generate an HTML report from this output file using artillery report report.json. The report includes visualizations of rates, latencies, etc.
--variables: Required. Specifies the variables needed to execute your request. For example, ArtistDetailQuery.yaml expects an artist ASIN in the id variable.
Documentation on the other Artillery variables can be found here.
Example¶
This is an example command to execute the FireFly artist detail query, which expects an artist ASIN:
artillery run generated/generatedScenario.yml \
--output ArtistDetailQuery-report.json \
--variables '{ "id": "Replace_With_Artist_ASIN" }'
Or use the command below if you want to see the network calls and responses:
DEBUG=http,http:response artillery run generated/generatedScenario.yml \
--output ArtistDetailQuery-report.json \
--variables '{ "id": "Replace_With_Artist_ASIN" }'
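The value passed to --variables is a JSON string, so quoting mistakes are easy to make when typing it by hand. One way to avoid them is to build the command from a dictionary, as in this Python sketch. The helper name `build_run_command` is hypothetical. The flags are the ones shown in the commands above.

```python
import json
import shlex


def build_run_command(variables, report_path=None, scenario="generated/generatedScenario.yml"):
    """Return the argv list for `artillery run` with JSON-encoded variables."""
    argv = ["artillery", "run", scenario]
    if report_path:
        argv += ["--output", report_path]
    argv += ["--variables", json.dumps(variables)]
    return argv


argv = build_run_command({"id": "Replace_With_Artist_ASIN"}, "ArtistDetailQuery-report.json")
print(shlex.join(argv))  # shell-quoted form that can be pasted into a terminal
```

The argv list can also be passed directly to `subprocess.run`, which avoids shell quoting altogether.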
Executing a load test using Fargate¶
Setup¶
This guide describes Artillery’s support for running highly-distributed serverless load tests on AWS Fargate.
To execute tests on AWS Fargate, the Artillery CLI uses the AWS SDK to create the resources needed to run your tests. To run a test on Fargate, follow the steps below:
Create an AWS account or use your existing one.
Create the following IAM policy and attach it to your IAM user.
Note that 123456789000 must be replaced with the ID of the AWS account you’ll be using.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CreateOrGetECSRole",
"Effect": "Allow",
"Action": ["iam:CreateRole", "iam:GetRole", "iam:AttachRolePolicy"],
"Resource": "arn:aws:iam::123456789000:role/artilleryio-ecs-worker-role"
},
{
"Sid": "CreateECSPolicy",
"Effect": "Allow",
"Action": ["iam:CreatePolicy"],
"Resource": "arn:aws:iam::123456789000:policy/artilleryio-ecs-worker-policy"
},
{
"Effect": "Allow",
"Action": ["iam:CreateServiceLinkedRole"],
"Resource": [
"arn:aws:iam::*:role/aws-service-role/ecs.amazonaws.com/AWSServiceRoleForECS*"
],
"Condition": {
"StringLike": {
"iam:AWSServiceName": "ecs.amazonaws.com"
}
}
},
{
"Effect": "Allow",
"Action": ["iam:PassRole"],
"Resource": ["arn:aws:iam::123456789000:role/artilleryio-ecs-worker-role"]
},
{
"Sid": "SQSPermissions",
"Effect": "Allow",
"Action": ["sqs:*"],
"Resource": "arn:aws:sqs:*:123456789000:artilleryio*"
},
{
"Sid": "SQSListQueues",
"Effect": "Allow",
"Action": ["sqs:ListQueues"],
"Resource": "*"
},
{
"Sid": "ECSPermissionsGeneral",
"Effect": "Allow",
"Action": [
"ecs:ListClusters",
"ecs:CreateCluster",
"ecs:RegisterTaskDefinition",
"ecs:DeregisterTaskDefinition"
],
"Resource": "*"
},
{
"Sid": "ECSPermissionsScopedToCluster",
"Effect": "Allow",
"Action": ["ecs:DescribeClusters", "ecs:ListContainerInstances"],
"Resource": "arn:aws:ecs:*:123456789000:cluster/*"
},
{
"Sid": "ECSPermissionsScopedWithCondition",
"Effect": "Allow",
"Action": [
"ecs:SubmitTaskStateChange",
"ecs:DescribeTasks",
"ecs:ListTasks",
"ecs:ListTaskDefinitions",
"ecs:DescribeTaskDefinition",
"ecs:StartTask",
"ecs:StopTask",
"ecs:RunTask"
],
"Condition": {
"ArnEquals": {
"ecs:cluster": "arn:aws:ecs:*:123456789000:cluster/*"
}
},
"Resource": "*"
},
{
"Sid": "S3Permissions",
"Effect": "Allow",
"Action": [
"s3:CreateBucket",
"s3:DeleteObject",
"s3:GetObject",
"s3:GetObjectAcl",
"s3:GetObjectTagging",
"s3:GetObjectVersion",
"s3:PutObject",
"s3:PutObjectAcl",
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:GetBucketLogging",
"s3:GetBucketPolicy",
"s3:GetBucketTagging",
"s3:PutBucketPolicy",
"s3:PutBucketTagging",
"s3:PutMetricsConfiguration",
"s3:GetLifecycleConfiguration",
"s3:PutLifecycleConfiguration"
],
"Resource": [
"arn:aws:s3:::artilleryio-test-data-*",
"arn:aws:s3:::artilleryio-test-data-*/*"
]
},
{
"Sid": "LogsPermissions",
"Effect": "Allow",
"Action": ["logs:PutRetentionPolicy"],
"Resource": [
"arn:aws:logs:*:123456789000:log-group:artilleryio-log-group/*"
]
},
{
"Effect": "Allow",
"Action": ["secretsmanager:GetSecretValue"],
"Resource": ["arn:aws:secretsmanager:*:123456789000:secret:artilleryio/*"]
},
{
"Effect": "Allow",
"Action": [
"ssm:PutParameter",
"ssm:GetParameter",
"ssm:GetParameters",
"ssm:DeleteParameter",
"ssm:DescribeParameters",
"ssm:GetParametersByPath"
],
"Resource": [
"arn:aws:ssm:us-east-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:us-east-2:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:us-west-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:us-west-2:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:ca-central-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:eu-west-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:eu-west-2:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:eu-west-3:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:eu-central-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:eu-north-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:ap-south-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:ap-east-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:ap-northeast-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:ap-northeast-2:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:ap-southeast-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:ap-southeast-2:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:me-south-1:123456789000:parameter/artilleryio/*",
"arn:aws:ssm:sa-east-1:123456789000:parameter/artilleryio/*"
]
},
{
"Effect": "Allow",
"Action": [
"ec2:DescribeRouteTables",
"ec2:DescribeVpcs",
"ec2:DescribeSubnets"
],
"Resource": ["*"]
}
]
}
Go to this IAM user -> Security Credentials -> Create access key. Note down the access key information.
In a terminal on your dev laptop, configure your AWS credentials using the access key information above. Execute aws configure and provide all the required information.
Execute the test.
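Replacing the placeholder account ID by hand across the Fargate policy is error-prone, since it appears in many ARNs. A small Python sketch of the substitution follows. The function name `render_policy` is an assumption, and the one-statement template stands in for the full policy text.

```python
import json

PLACEHOLDER_ACCOUNT_ID = "123456789000"


def render_policy(policy_text, account_id):
    """Swap the placeholder account ID and confirm the result is still valid JSON."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    rendered = policy_text.replace(PLACEHOLDER_ACCOUNT_ID, account_id)
    json.loads(rendered)  # raises json.JSONDecodeError if the policy is malformed
    return rendered


# Shortened stand-in for the policy; 111122223333 is a made-up example account ID.
template = '{"Resource": "arn:aws:sqs:*:123456789000:artilleryio*"}'
print(render_policy(template, "111122223333"))
```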
Command¶
artillery run-fargate generated/generatedScenario.yml --region us-east-1
Note: Please refer to the official Artillery run-fargate documentation for the complete list of options.
How it works¶
Artillery will create a number of AWS resources behind the scenes to be able to execute your tests. All resources created by Artillery are serverless and created on-demand. There are no long-running infrastructure components involved.
Executing a load test using AWS Lambda¶
Setup¶
This guide describes Artillery’s support for running highly-distributed serverless load tests on AWS Lambda.
To execute tests on AWS Lambda, the Artillery CLI uses the AWS SDK to create the resources needed to run your tests. To run a test on Lambda, follow the steps below:
Create an AWS account or use your existing one.
Create the following IAM policy and attach it to your IAM user.
Note that 123456789000 must be replaced with the ID of the AWS account you’ll be using. Remove the // comment lines before saving the policy, because IAM policy JSON does not allow comments.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CreateOrGetLambdaRole",
"Effect": "Allow",
"Action": [
"iam:CreateRole",
"iam:GetRole",
"iam:PassRole",
"iam:AttachRolePolicy"
],
"Resource": "arn:aws:iam::123456789000:role/artilleryio-default-lambda-role-*"
},
{
"Sid": "CreateLambdaPolicy",
"Effect": "Allow",
"Action": ["iam:CreatePolicy"],
"Resource": "arn:aws:iam::123456789000:policy/artilleryio-lambda-policy"
},
{
"Sid": "SQSPermissions",
"Effect": "Allow",
"Action": ["sqs:*"],
"Resource": "arn:aws:sqs:*:123456789000:artilleryio*"
},
{
// ListQueues cannot be scoped to individual resources
// https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonsqs.html#amazonsqs-queue
"Sid": "SQSListQueues",
"Effect": "Allow",
"Action": ["sqs:ListQueues"],
"Resource": "*"
},
{
"Sid": "LambdaPermissions",
"Effect": "Allow",
"Action": [
"lambda:InvokeFunction",
"lambda:CreateFunction",
"lambda:DeleteFunction",
"lambda:GetFunctionConfiguration"
],
"Resource": "arn:aws:lambda:*:123456789000:function:artilleryio-*"
},
{
"Sid": "EcrPullImagePermissions",
"Effect": "Allow",
"Action": [
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage"
],
"Resource": "*",
"Condition": {
"StringLike": {
"aws:sourceArn": "arn:aws:lambda:*:123456789000:function:artilleryio-*"
}
}
},
{
"Sid": "S3Permissions",
"Effect": "Allow",
"Action": [
"s3:CreateBucket",
"s3:DeleteObject",
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket",
"s3:GetLifecycleConfiguration",
"s3:PutLifecycleConfiguration"
],
"Resource": [
"arn:aws:s3:::artilleryio-test-data-*",
"arn:aws:s3:::artilleryio-test-data-*/*"
]
}
]
}
Go to this IAM user -> Security Credentials -> Create access key. Note down the access key information.
In a terminal on your dev laptop, configure your AWS credentials using the access key information above. Execute aws configure and provide all the required information.
Execute the test.
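The Lambda policy above contains // comment lines, which strict JSON parsers reject. A minimal Python sketch for stripping them before use is shown here. The helper name `strip_comment_lines` is hypothetical, and it removes only lines that consist entirely of a comment, so URLs containing `//` inside string values are left alone.

```python
import json


def strip_comment_lines(policy_text):
    """Drop whole-line // comments so the policy parses as strict JSON."""
    kept = [line for line in policy_text.splitlines() if not line.lstrip().startswith("//")]
    return "\n".join(kept)


# One statement from the policy, used here as a short stand-in for the full text.
snippet = """{
// ListQueues cannot be scoped to individual resources
"Sid": "SQSListQueues",
"Effect": "Allow",
"Action": ["sqs:ListQueues"],
"Resource": "*"
}"""
statement = json.loads(strip_comment_lines(snippet))
print(statement["Sid"])
```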
Command¶
artillery run-lambda generated/generatedScenario.yml --region us-east-1
Note: Please refer to the official Artillery run-lambda documentation for the complete list of options.
How it works¶
Artillery will create a number of AWS resources behind the scenes to be able to execute your tests. All resources created by Artillery are serverless and created on-demand. There are no long-running infrastructure components involved.
Limitations¶
AWS Lambda support is in preview. There are some limitations to what’s possible, and you may run into bugs. Please report any issues via GitHub issues on https://github.com/artilleryio/artillery/issues
Each AWS Lambda is limited to 15 minutes of running time, which means that the entire load test cannot run for longer than 15 minutes at the moment.
Once an AWS Lambda starts running, there is no way to stop it. Neither the AWS SDK, nor the AWS Console provide that ability. This means that once a load test starts, it will run to completion. Be mindful of this, and ramp up load on your applications gradually.
Test Termination¶
To back off during stress/load testing, you can stop or terminate the Lambda or Fargate containers at any time from the AWS account (but see the Lambda limitation above: an invocation that is already running cannot be stopped):
Navigate to the Fargate containers in the AWS Console.
Delete the containers to terminate the test immediately in case of issues.
Customer Account Pool For Load Test¶
Each experience owner can have different test parameters, where test scenarios need a pool of test accounts with variations. In some cases, downstream services will have requirements as well in terms of the minimum number of accounts required to simulate production traffic.
Load test query scenarios could be account agnostic (catalog queries) or have heavy customization logic based on the account parameters (recommendations, browse home) such as customer tier, locale, etc.
Experience owners can create a pool of test accounts based on their test requirements. There are a few ways to create test accounts in bulk. Load test POCs can evaluate the following test account tools to determine what works for their use-case.
Each service/tool has its pros and cons. For example, Kamino UI is meant to create permanent test accounts one at a time, whereas others may create temporary accounts in bulk. All the services above invoke Tipoca service under the hood and apply account decorators or subscriptions as needed.
Load test owners must determine the number of test accounts in their pool based on their query behavior and downstream service requirements. Use the tools above to create account pools and plug them into the load test tool based on the instructions above.
In theory, FireFly-LoadGenArtillery should be able to handle a few hundred accounts, and up to 1k subject to approval from Panda/MIS/Stratus, depending on your load test scenario.
Since FireFly has cross-region routing enabled, load test for a specific region must use accounts which belong to the region. Using an NA account in EU will result in the requests getting routed to NA.
Test Windows¶
Load tests must strictly follow these test windows:
FE: 10:30 - 15:00 PST
EU: 15:30 - 20:00 PST
NA: 21:30 - 02:00 PST (ends the following day)
Customer traffic in prod is at its lowest during these time periods. Service owners will not approve a load test during a peak traffic window, as it can negatively impact prod traffic through throttling, brownouts, or total blackouts.
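A scheduling script can guard against starting a test outside its window. The Python sketch below encodes the windows listed above and handles the NA window, which wraps past midnight. It assumes the caller supplies the current time already converted to PST. The names `TEST_WINDOWS` and `in_test_window` are illustrative.

```python
from datetime import time

# Windows from the list above, as (start, end) in PST; NA wraps past midnight.
TEST_WINDOWS = {
    "FE": (time(10, 30), time(15, 0)),
    "EU": (time(15, 30), time(20, 0)),
    "NA": (time(21, 30), time(2, 0)),
}


def in_test_window(region, now_pst):
    """Return True if now_pst (a datetime.time in PST) falls inside the region's window."""
    start, end = TEST_WINDOWS[region]
    if start <= end:
        return start <= now_pst <= end
    return now_pst >= start or now_pst <= end  # window spans midnight


print(in_test_window("NA", time(23, 15)), in_test_window("EU", time(9, 0)))
```

Since the test must also finish inside the window, a stricter check would compare the planned end time as well as the start time.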
Test Endpoints¶
To ensure that the traffic is routed to regional gateways accurately, use these regional gateway endpoints:
us-east-1.gql.music.amazon.dev
eu-west-1.gql.music.amazon.dev
us-west-2.gql.music.amazon.dev
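When a script picks the target from a region name, a lookup restricted to the three gateways above avoids silently sending traffic to an unlisted host. This Python sketch assumes the https scheme, as used for the target elsewhere in this document. The names `REGIONAL_GATEWAYS` and `regional_target` are hypothetical.

```python
# Regional gateway hosts as listed above.
REGIONAL_GATEWAYS = {
    "us-east-1": "us-east-1.gql.music.amazon.dev",
    "eu-west-1": "eu-west-1.gql.music.amazon.dev",
    "us-west-2": "us-west-2.gql.music.amazon.dev",
}


def regional_target(region):
    """Return the https target URL for a region, or raise if no gateway is listed."""
    try:
        return f"https://{REGIONAL_GATEWAYS[region]}"
    except KeyError:
        raise ValueError(f"no regional gateway listed for {region!r}") from None


print(regional_target("eu-west-1"))
```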
FAQ¶
What are the key differences and appropriate use cases for artillery run-lambda and artillery run-fargate in load testing?¶
Use artillery run-lambda for short serverless runs (up to 15 minutes) with moderate load.
Use artillery run-fargate for long-running, resource-intensive, or complex tests.
How do I disable cache lookup in FireFly during load testing?¶
To disable cache in FireFly for load test requests, add the header cache-control: no-cache to the request headers in the config.yml file as shown below:
config:
  target: 'https://gql.music.amazon.dev'
  phases:
    - duration: 1
      arrivalRate: 1
  processor: "util.js"
  defaults:
    headers:
      content-type: 'application/json'
      x-api-key: amzn1.application.be92e0e8c3344e7ea5f211e7f107547c
      x-amzn-test-call: true # https://sage.amazon.dev/posts/1323527?t=7
      cache-control: no-cache
  payload:
    - path: authTokens.csv
      fields:
        - 'token'
The supported options for cache-control header are:
no-cache: avoids checking the cache, but will store the result of any subsequent upstream service request
no-store: checks the cache, but does not modify it if an upstream service request is required
no-cache, no-store: does not check the cache, and does not modify it with any responses from upstream services
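The three options combine two independent switches: whether the cache is read and whether it is written. The Python sketch below restates that as a small lookup. It models the behavior described above and is not FireFly's implementation. Treating a missing header as "read and write the cache" is an assumption.

```python
def cache_behavior(cache_control=""):
    """Return (checks_cache, stores_result) for a cache-control header value."""
    directives = {part.strip().lower() for part in cache_control.split(",")}
    checks_cache = "no-cache" not in directives   # no-cache skips the lookup
    stores_result = "no-store" not in directives  # no-store skips the write
    return checks_cache, stores_result


for value in ("no-cache", "no-store", "no-cache, no-store"):
    print(value, cache_behavior(value))
```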
When should I execute the test on my local machine?¶
You can run the test on your local machine if the TPS is small, for example, fewer than a few hundred (each dev machine is different). Running large load tests on your local machine is not recommended due to limitations in networking and resources (CPU, memory, etc.).
When should I execute the test on Fargate?¶
If you need to run a larger load or if you encounter issues running the test on your local machine, you should execute the test on Fargate.
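The guidance in this FAQ can be summarized as a rough decision rule. The Python sketch below is one possible reading of it. The 15-minute Lambda limit comes from this document, but `LOCAL_MAX_TPS = 200` is only an assumed stand-in for "a few hundred" and should be tuned per machine. The document does not quantify "moderate load" for Lambda, so the sketch decides between Lambda and Fargate on duration alone.

```python
LAMBDA_MAX_DURATION_S = 15 * 60  # Lambda limit stated in this document
LOCAL_MAX_TPS = 200              # assumed stand-in for "a few hundred"; tune per machine


def choose_runner(duration_s, peak_tps):
    """Suggest `run`, `run-lambda`, or `run-fargate` for a planned test."""
    if peak_tps < LOCAL_MAX_TPS:
        return "run"          # small enough for a dev machine
    if duration_s <= LAMBDA_MAX_DURATION_S:
        return "run-lambda"   # short serverless run
    return "run-fargate"      # long-running or heavier test


print(choose_runner(300, 1), choose_runner(600, 500), choose_runner(1800, 500))
```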
