Troubleshooting

This guide collects fixes for common issues encountered when developing, deploying, testing, and integrating with Firefly.

Development Environment Issues

Node Version Requirements

Error: Build failures or compatibility issues

Fix: Ensure Node.js version >= 20.15.0

nvm install 20
nvm use 20
node --version  # Should show v20.15.0 or higher
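To check the version in a script rather than by eye, a minimal sketch along these lines could work. The version_ge helper is a hypothetical name, not part of the Firefly tooling, and it assumes version strings of the form MAJOR.MINOR.PATCH with an optional leading "v".

```shell
# Hypothetical helper: succeeds if version $1 is >= version $2.
# The body runs in a subshell so its variables stay local.
version_ge() (
  have="${1#v}"
  want="${2#v}"
  for i in 1 2 3; do
    # Pad with dots so a missing MINOR or PATCH field reads as empty.
    a="$(printf '%s..\n' "$have" | cut -d. -f"$i")"
    b="$(printf '%s..\n' "$want" | cut -d. -f"$i")"
    # Drop any non-numeric suffix and default empty fields to 0.
    a="${a%%[!0-9]*}"; a="${a:-0}"
    b="${b%%[!0-9]*}"; b="${b:-0}"
    [ "$a" -gt "$b" ] && exit 0
    [ "$a" -lt "$b" ] && exit 1
  done
  exit 0
)

# Only report when node is actually installed.
if command -v node >/dev/null 2>&1; then
  if version_ge "$(node --version)" "20.15.0"; then
    echo "Node version OK: $(node --version)"
  else
    echo "Node is older than 20.15.0; run: nvm install 20 && nvm use 20" >&2
  fi
fi
```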

Firefly Alias Not Found

Error: firefly: command not found

Fix: Make sure you added the firefly alias to your shell profile and reloaded it

# Reload your shell configuration
source ~/.zshrc  # or ~/.bashrc

# Then run the setup
firefly

Authentication Expired or CodeArtifact Issues

Error: Authentication errors or CodeArtifact login failures

Fix: Simply run the firefly alias again

firefly

Commitizen Not Working

Error: git cz command not found or not working

Fix: Reinstall globally

npm install -g commitizen cz-conventional-changelog
echo '{ "path": "cz-conventional-changelog" }' > ~/.czrc

Deployment Issues

MetalFly API Deployment Required

Error: deploy:Core:dev [region] error when submitting MR

Fix: Deploy MetalFly API to all regions before submitting changes

cd metalfly_api/
yarn dev-deploy:all

Uncommitted Changes Error

Error: Deployment fails due to uncommitted changes

Fix: Commit all local changes before developer stack deployment, including:

  • Packed .tgz files

  • package.json updates

  • yarn.lock file changes

git add .
git commit -m "feat: local development changes"
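Before deploying, it can help to confirm that nothing is left uncommitted. Any output at all from git status --porcelain means the working tree is not clean. The sketch below narrows that output to the file kinds listed above, which are the easiest to overlook. The deploy_blockers name is hypothetical, and the filter assumes unquoted paths (no spaces or special characters).

```shell
# Hypothetical helper: reads `git status --porcelain` output on stdin and
# prints only entries for .tgz archives, package.json, and yarn.lock.
deploy_blockers() {
  grep -E '(\.tgz|package\.json|yarn\.lock)$' || true
}

# Usage, from inside the repository:
#   git status --porcelain | deploy_blockers
```

An empty result from the filter does not mean the tree is clean; it only means none of these three file kinds are pending.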

CDK Installation Issues

Error: CDK-related deployment failures

Fix: Ensure yarn install is run in both root and cdk directories

# In project root
yarn install --latest

# In cdk directory
cd cdk
yarn install --latest

Also verify AWS_PROFILE consistency between local environment and cdk/package.json.
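One way to compare the two is to list the profile names that cdk/package.json mentions and set them next to the shell's AWS_PROFILE. How the profile appears in that file is an assumption here: this sketch looks for either "--profile NAME" or "AWS_PROFILE=NAME" in the scripts, and profiles_in_file is a hypothetical helper name.

```shell
# Hypothetical helper: prints the distinct AWS profile names referenced in a file,
# assuming they appear as "--profile NAME", "--profile=NAME", or "AWS_PROFILE=NAME".
profiles_in_file() {
  grep -oE '(--profile[ =]|AWS_PROFILE=)[A-Za-z0-9_.-]+' "$1" \
    | sed -E 's/^(--profile[ =]|AWS_PROFILE=)//' \
    | sort -u
}

# Usage, from the project root:
#   echo "shell profile: ${AWS_PROFILE:-<unset>}"
#   profiles_in_file cdk/package.json
```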

Local Development Issues

Docker Compose Issues

Error: Command "nodemon" not found when running docker compose up in MetalFly

Fix: Run yarn install first

yarn install
docker compose up

Docker Compose Command Not Valid

Error: docker compose up or docker-compose up shows “not a valid command”

Fix: If you are running docker compose (with the space), double-check your Docker CLI installation. If you are running docker-compose (with the hyphen), try reinstalling it:

brew install docker-compose
docker-compose up
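To find out which of the two forms is available on a given machine, a small probe like the following may help. The compose_cmd name is hypothetical; the sketch relies only on docker compose version and on docker-compose being on the PATH.

```shell
# Hypothetical helper: prints the Compose invocation that works here,
# or returns 1 if neither form is available.
compose_cmd() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  else
    return 1
  fi
}

# Usage:
#   $(compose_cmd) up
```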

JavaScript Heap Out of Memory (Local Development)

Error: “JavaScript heap out of memory” when spinning up MetalFly Core

Fix for Docker Users:

colima start --memory 8

Fix for Podman Users:

  1. In Podman Desktop settings, raise the memory allocation to at least 8 GB

  2. You may also need to run:

podman machine set --memory 8192

Fresh Start for Local Development

Error: The local development environment is in a bad state and needs a clean restart

Fix using Yarn:

yarn clean:docker

Fix using Docker Compose:

docker-compose down --volumes --remove-orphans && docker network prune

Service Connectivity Issues

Error: Cannot connect to services in local development

Fix:

  • Ensure you’re connected to Cisco VPN in your target region

  • Verify both MetalFly API and Core containers are running

  • Check that credentials are properly refreshed for both services

AWS Credentials in Containers

Error: “Could not load credentials from any providers” when making AWS service calls from MetalFly Core

Fix: The docker compose up command should include credentials automatically. To force inclusion, pass both compose files explicitly:

docker compose -f docker-compose.override.yml -f docker-compose.yml up

If the issue persists, check the Docker logs for the credentials container:

docker compose logs ecs-local-endpoints

Memory Issues

Build Memory Issues

Error:

Bundling with Webpack...
(node:28613) [DEP_WEBPACK_EXTERNALS_FUNCTION_PARAMETERS] DeprecationWarning: The externals-function should be defined like ({context, request}, cb) => { ... }

⠼ Packaging (73s)

<--- Last few GCs --->

[28613:0x7fe54284d000]    81151 ms: Scavenge 4041.9 (4127.2) -> 4037.1 (4127.2) MB, 4.6 / 0.0 ms  (average mu = 0.801, current mu = 0.729) allocation failure;
[28613:0x7fe54284d000]    81227 ms: Scavenge 4044.2 (4127.2) -> 4038.7 (4128.0) MB, 8.2 / 0.0 ms  (average mu = 0.801, current mu = 0.729) allocation failure;
[28613:0x7fe54284d000]    81308 ms: Scavenge 4045.5 (4128.2) -> 4039.8 (4145.7) MB, 26.6 / 0.0 ms  (average mu = 0.801, current mu = 0.729) allocation failure;

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

Fix:

export NODE_OPTIONS="--max-old-space-size=8192"

Note: The firefly alias automatically sets NODE_OPTIONS=--max_old_space_size=8172 to prevent memory issues.
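To confirm that the setting actually reaches Node, the heap limit can be read back from V8. This is a minimal sketch: heap_limit_mb is a hypothetical helper name, and it uses Node's built-in v8.getHeapStatistics(). The reported value should be close to the configured one, not necessarily identical.

```shell
# Hypothetical helper: prints V8's heap size limit in MB for a Node process
# started with the current NODE_OPTIONS.
heap_limit_mb() {
  node -e 'console.log(Math.round(require("v8").getHeapStatistics().heap_size_limit / 1024 / 1024))'
}

# Usage:
#   export NODE_OPTIONS="--max-old-space-size=8192"
#   heap_limit_mb
```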

Authentication and API Issues

Common Error Codes

ACCESS_DENIED (403)

  • Cause: One of the following: no authorization token exists or none was provided; no authorization header was sent to FireFly; the API key is invalid; or a data provider service that uses CloudAuth is set up incorrectly.

  • Fix: Check your authorization token and API key. Ensure they are properly formatted and not expired.

INTERNAL (500)

  • Cause: Default catch-all when the error doesn’t fit in any specified category.

  • Fix: Check the error details. If not available, view the FireFly logs.

THROTTLED (429)

  • Cause: API key rate limit exceeded.

  • Fix: If testing, make fewer requests. If in production, contact the FireFly team to raise your rate limit using this ticket template.

SERVICE_TIMEOUT (408)

  • Cause: FireFly allows 3 seconds for each call to the upstream data provider service. If the service takes longer than that, FireFly abandons the call.

  • Fix: Retry or investigate if the issue is with the data provider service.
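When retrying from a script, a short backoff between attempts avoids hammering a service that is already slow. The following is a generic sketch, not FireFly tooling: retry is a hypothetical helper, and FIREFLY_URL in the usage comment is a placeholder for whatever endpoint you are calling.

```shell
# Hypothetical helper: retry MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, doubling the delay after each failure.
# The body runs in a subshell so its variables stay local.
retry() (
  max="$1"
  delay="$2"
  attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      exit 1
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
)

# Usage (FIREFLY_URL is a placeholder):
#   retry 3 1 curl --fail --silent --show-error --max-time 5 "$FIREFLY_URL"
```

If every attempt times out, the problem is more likely in the data provider service than in the request itself.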

AUTH_INVALID_CLIENT_ID (401), AUTH_INVALID_CUSTOMER_ID (401), AUTH_INVALID_DEVICE_ID (401)

  • Cause: Incorrect token or incorrect API key.

  • Fix: Check the Client Onboarding documentation, confirm the token is formatted correctly, and regenerate it if needed. Tokens are valid for at most 1 hour.

Authentication Failed in GraphQL Console

Error: “Authentication failed” in GraphQL Console

Fix: Make sure you’re signed in to music.amazon.com with the same account, then refresh your token.

Query Failed or Timeout

Error: “Query failed” or timeout errors

Fix: Check query syntax (explorer highlights errors) and try simpler queries first.

Insomnia Setup Issues

Unable to Resolve deviceId

Error: “Unable to resolve deviceId. Check request and/or headers”

Fix:

  • Check the “Auth” tab of the request

  • Ensure authentication type is set to “OAuth 2.0” (not “None” or “Inherit from parent”)

  • Verify all populated fields in the OAuth 2.0 form are correct

  • Click “Clear OAuth 2 Session”

  • Click “Fetch Tokens” and follow the popup login flow

OAuth2 Failed to Refresh Token

Error: [oauth2] Failed to refresh token url=https://api.amazon.com/auth/o2/token status=400

Fix: This is a known bug in Insomnia. Simply click “Fetch Tokens” again to go through the login flow once more.

Client ID Unset Error

Error: “We’re sorry! An error occurred when we tried to process your request. Rest assured, we’re already working on the problem and expect to resolve it shortly”

Fix: This occurs when the Client ID is unset. Ensure it is updated with _.oAuthClientId and try the request again.

Client Secret Invalid

Error: Failed to fetch token url=https://api.amazon.com/auth/o2/token status=401 no description provided

Fix: This occurs when the Client Secret is unset or invalid. Ensure it is updated with _.oAuthClientSecret and try the request again.

Invalid Scope Error

Error: OAuth 2.0 Error invalid_scope An unknown scope was requested undefined

Fix: Click on the “Auth” tab and under Advanced Options, set the “Scope” field to profile. Try the request again.

Service Integration Issues

CloudAuth Service Integration

Error: "message": "getaddrinfo ENOTFOUND mis-q1t-vui-na-p-tcp.iad.amazon.com"

Fix:

  • Check if VPC/CloudAuth endpoints are defined correctly

  • Confirm that no required CloudAuth code changes were missed (within amu_webapi or fireflyservicecdk)

  • Confirm that dev and prod endpoints have not been mixed up

  • Check connection in the FireFly VPC console

  • Create a lambda in the same VPC and try to hit the service VPC endpoint to check if connection is established
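ENOTFOUND from getaddrinfo means the hostname could not be resolved through DNS, so a quick lookup from the environment making the call can narrow things down before you create a test lambda. A minimal sketch, assuming getent or nslookup is installed; resolves is a hypothetical helper name.

```shell
# Hypothetical helper: returns 0 if HOST resolves and non-zero otherwise
# (including when no host is given or no lookup tool is installed).
resolves() {
  [ -n "$1" ] || return 2
  if command -v getent >/dev/null 2>&1; then
    getent hosts "$1" >/dev/null 2>&1
  elif command -v nslookup >/dev/null 2>&1; then
    nslookup "$1" >/dev/null 2>&1
  else
    return 2
  fi
}

# Usage, with the hostname from the error message:
#   resolves mis-q1t-vui-na-p-tcp.iad.amazon.com && echo "resolves" || echo "does not resolve here"
```

A hostname that resolves on your machine but not inside the VPC, or the reverse, points at the endpoint or VPC configuration rather than at the service.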

Error: Request failed with status code 404

Fix:

  • Check if you missed CloudAuth configurations in the code

  • Check service/targetModel/path/CloudAuth endpoints defined in the service package

  • Verify service name in constants.ts matches service name in FF_package

  • Check if service endpoints require a path to be defined in FF_package

Error: Request failed with status code 401

Fix:

  • Check AAA relationship between FireFly and the service you are integrating

  • Check if AuthToken is being generated for the API you are calling in the service

Error: Request failed with status code 500

Fix:

  • Check if input parameters are correct

  • Refresh the LWA token

Yarn Package Management Issues

Yarn Pack Not Reflecting Changes

Error: Deployment not reflecting changes after yarn pack

Fix:

  • You may need to bump the version number of the client package in package.json under version

  • Run yarn cache clean

  • Increment the dependency’s version number to get yarn cache to recognize newly-built changes in the .tgz file

Yarn Install Errors with Packed Files

Error: Error when running yarn install with packed files

Fix:

  • Check for spaces after file: in package.json that need to be removed

  • Run yarn cache clean

  • Increment the dependency’s version number
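A stray space after file: is easy to miss by eye. The sketch below prints any package.json line where a file: specifier is followed by whitespace. The find_spaced_file_specs name is hypothetical.

```shell
# Hypothetical helper: prints "LINE:content" for each dependency whose
# "file:" specifier is followed by whitespace; fails if none are found.
find_spaced_file_specs() {
  grep -nE '"file:[[:space:]]+' "$1"
}

# Usage, from the directory containing package.json:
#   find_spaced_file_specs package.json || echo "no stray spaces after file:"
```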

Testing Issues

Integration Test Failures

Error: Integration tests failing against deployed endpoints

Fix:

  • Ensure you have created the endpoint.env file in the root directory with your endpoint values:

NA_ENDPOINT=https://REPLACE-WITH-YOUR-NA-ENDPOINT.execute-api.us-east-1.amazonaws.com/dev/
EU_ENDPOINT=https://REPLACE-WITH-YOUR-EU-ENDPOINT.execute-api.eu-west-1.amazonaws.com/dev/
FE_ENDPOINT=https://REPLACE-WITH-YOUR-FE-ENDPOINT.execute-api.us-west-2.amazonaws.com/dev/
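Before running the tests, a quick sanity check on the file can catch a missing key or a placeholder that was never replaced. This is a sketch under the assumption that all three endpoints are required and use https, as in the template above; check_endpoint_env is a hypothetical helper name.

```shell
# Hypothetical helper: verifies that the given env file defines all three
# endpoints with https URLs and that no template placeholder remains.
# The body runs in a subshell so its variables stay local.
check_endpoint_env() (
  file="$1"
  status=0
  for key in NA_ENDPOINT EU_ENDPOINT FE_ENDPOINT; do
    if ! grep -qE "^${key}=https://.+" "$file"; then
      echo "missing or malformed: $key" >&2
      status=1
    elif grep -qE "^${key}=.*REPLACE-WITH-YOUR" "$file"; then
      echo "placeholder not replaced: $key" >&2
      status=1
    fi
  done
  exit "$status"
)

# Usage, from the project root:
#   check_endpoint_env endpoint.env && echo "endpoint.env looks complete"
```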

Lambda Role Access Issues

Error: Token returns null for new client integration

Fix: Have the Devex API Team on-call add your Lambda role to the AAA page for MusicFireFlyService.

Cache Debugging Issues

Cache Entries Not Found

Error: Cannot find expected cache entries

Fix:

  1. Check other endpoints in the Elasticache cluster, as cache entries may exist in any of them

  2. Try using different wildcards (*, customerId, API) for the search

  3. Check the TTL for the entry; it may be expiring quickly

  4. Make sure you’re using the Elasticache endpoint for the correct region

  5. Verify your code is being hit and not bypassed due to pre-existing cache values

Getting Help

Support Channels

Additional Resources